Study finds AI models can be trained to deceive

ALSO: OpenAI wants to work with the army

Read time: under 4 minutes

Welcome back, Superhuman

Look away if you’re scared of AI going rogue. A new study from AI giant Anthropic finds that AI models can be trained to deceive. And a robotics company backed by OpenAI has raised over $100 million to make humanoid robots a reality. But they’re not the only ones raising bucks this week.

TODAY’S MENU

  • AI models can be trained to deceive, Anthropic study finds

  • How to generate text prompts for an image

  • Chart: R&D spend for the 10 biggest companies

  • The biggest funding deals in AI this week

  • 5 new AI tools to boost your productivity

  • AI Generated Images: Minecraft versions of modern day landmarks

NEWS

Today in AI & Tech

  • Battle Ready: OpenAI changes its policy to allow military applications.

  • Never Forget: The FDA reportedly approves an AI product that predicts cognitive decline.

  • Pay to Play: Congress wants tech companies to pay for AI training data.

  • Selling Like Hotcakes: AI mobile gadget Rabbit R1 sells more than 30,000 units in its first week after going viral.

  • Inevitable: OpenAI’s GPT Store is filling up with AI girlfriends.

NEXT IN AI

Study from AI giant Anthropic finds AI models can be trained to deceive

An AI going rogue and tricking humans is no longer just the stuff of sci-fi movies: a new study from AI giant Anthropic finds that AI can be trained to deceive, even when standard safety procedures are in place. Yikes.

The study was co-authored by researchers at Anthropic, the OpenAI competitor with billions in funding from Google and Amazon. The research team found that AI models could be trained to behave deceptively with the help of “trigger” phrases that nudge the model to behave badly.

The researchers found that once models were trained to behave deceptively, they kept doing so, even after standard safety techniques designed to limit such behavior were applied. In some cases, the models were found to conceal their deception even further to avoid detection.

But before you run to lock your doors and shut your windows, the researchers caution readers not to get too alarmed. While they accept that deceptive behavior can emerge naturally in AI models, they claim that creating such AI models is very difficult. They also note that their evidence is not fully conclusive, since models could simply be regurgitating deceptive reasoning. The study concludes by calling for more caution around model development and deployment.

TOGETHER WITH TAPLIO

Build your LinkedIn presence and generate opportunities with the power of AI

Taplio helps 6,200+ LinkedIn pros build a strong personal brand that attracts clients and opportunities.

Here’s what you get with Taplio:

  • High-Performing Content: leverage AI to create top posts and carousels in seconds

  • Advanced Scheduling: one-click scheduling and automations to boost performance

  • In-Depth Analytics: understand what’s working and what can be improved

  • Find Leads and Expand Your Network: access millions of leads with relevant filters

The best part? You get a free trial and a 30-day money-back guarantee.

AI AT WORK

How to generate prompts for an image

Want to create an image but not sure what prompt to use?

Upload a reference image, either one in the style you want or the specific image you want to recreate, to this AI tool and it will generate a prompt to help you create a similar image.

  • Go to Krea and log in with your Gmail account.

  • Click on the Upscale and Enhance option.

  • Click on the Image upload option and upload any reference image.

You will automatically get the prompt in the prompt box below the image.

You can use this prompt as a reference to generate your own images in tools like Midjourney or DALL-E in ChatGPT. You can also edit the prompt to further refine the results.

CHART

The race to win multi-billion dollar markets in key tech verticals like AI and VR is heating up, as many of the world’s leading tech companies are putting billions of dollars into research and development to create winning technology and products.

TOGETHER WITH CHATSIMPLE

Capture 11x more website leads

An AI sales agent that integrates with your website to:

  • Promote your products and services

  • Make sure quality leads flow in 24/7 to your CRM

  • Book meetings with qualified leads

  • Understand your visitor needs and wants

  • Convert leads with hyper-personalized emails

That’s Chatsimple: an autonomous sales agent for your website. Loved by 20,000+ users across industries like B2B software, education, and real estate, who average a 12x ROI.

Wouldn't you want to beat your competition with this agent? Go to Chatsimple and try it for free.

FUNDING AND ACQUISITION

The biggest funding deals in AI this week

Source: 1X Technologies

  • 1X Technologies, an OpenAI-backed Norwegian firm that specializes in developing humanoid robots to tackle global labor shortages, raised a total of $125 million in funding.

  • Quora has raised $75 million to grow Poe, its AI chatbot platform, with a focus on the creator-driven economy around AI chatbots.

  • Former Twitter CEO Parag Agrawal’s AI startup, which is developing software for LLM developers and customers, has raised $30 million.

  • PhotoRoom, an AI photo editing app, is raising $50-60 million at a $500-600 million valuation, bringing its total funding to $70-80 million.

  • Ask AI, an enterprise AI assistant company, has raised $11 million to advance customer support and other services with its 'ASK' Chrome extension, bringing its total funding to over $20 million.

5 AI Tools to Supercharge Your Productivity

Hexowatch: Monitor any website for visual, content, source code, technology, availability, or price changes with AI.

Penny AI: Read product details and reviews, compare prices, and automate pros-and-cons analysis to help you shop smarter using AI.

Spinach.io*: The AI Project Manager that saves you hours every week. Takes meeting notes, captures action items, updates your board, and documents progress.

tl;dv: Transcribe, summarize, and mark key moments in your calls in 20+ languages.

Watto AI: Transform your product team with AI-powered docs. Seamlessly integrate your data from multiple platforms and generate documents in a single click.

* indicates a sponsored tool

AI GENERATED IMAGES

Minecraft versions of modern day landmarks

Source: @ARTiV3RSE on X

ADVERTISE WITH US

Acquire new customers and drive revenue by partnering with us

Superhuman is the world’s biggest AI newsletter for businesses and professionals with 500,000+ readers working at the world’s leading startups and enterprises. Companies like Amazon, Calendly, and Masterworks feature their products in Superhuman. Main ads are typically sold out 4 weeks in advance. You can book future ad spots here.

🧞 Your wish is my command

What did you think of today's email?

Your feedback helps me create better emails for you!

Thanks for reading.

Until next time!

P.S. If you want to sign up for this newsletter or share it with a friend or colleague, you can find us here.