Anthropic's Claude adds a prompt playground to quickly improve your AI apps

Maxwell Zeff

July 9, 2024 at 8:11 PM·2 min read

Prompt engineering became a hot job last year in the AI industry, but it seems Anthropic is now developing tools to at least partially automate it.

Anthropic released several new features on Tuesday to help developers create more useful applications with the startup's language model, Claude, according to a company blog post. Developers can now use Claude 3.5 Sonnet to generate, test and evaluate prompts, using prompt engineering techniques to create better inputs and improve Claude's answers for specialized tasks.

Language models are pretty forgiving when you ask them to perform some tasks, but sometimes small changes to the wording of a prompt can lead to big improvements in the results. Normally you'd have to figure out that wording yourself, or hire a prompt engineer to do it, but this new feature offers quick feedback that could make finding improvements easier.

The features are housed within Anthropic Console under a new Evaluate tab. Console is the startup's test kitchen for developers, created to attract businesses looking to build products with Claude. One of the features, unveiled in May, is Anthropic's built-in prompt generator; this takes a short description of a task and constructs a much longer, fleshed-out prompt, utilizing Anthropic's own prompt engineering techniques. While Anthropic's tools may not replace prompt engineers altogether, the company said they would help new users and save time for experienced prompt engineers.

Within Evaluate, developers can test how effective their AI application's prompts are in a range of scenarios. Developers can upload real-world examples to a test suite or ask Claude to generate an array of test cases. Developers can then compare how effective various prompts are side by side, and rate sample answers on a five-point scale.

A prompt being fed generated data to find good and bad responses.

In an example from Anthropic's blog post, a developer identified that their application was giving answers that were too short across several test cases. The developer was able to tweak a line in their prompt to make the answers longer, and apply it simultaneously to all their test cases. That could save developers lots of time and effort, especially ones with little or no prompt engineering experience.
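To make that workflow concrete, here is a minimal sketch in Python of running two prompt variants against the same test suite and comparing their average scores. It is not Anthropic's Console or API: the test cases, the prompt templates, the `fake_model` stand-in and the `rate_length` grader are all hypothetical, chosen only so the example runs offline. In practice the model call would go to a real model and the five-point ratings would come from a person.

```python
from statistics import mean
from typing import Callable

# Hypothetical test suite: each case fills the {ticket} slot in a prompt template.
TEST_CASES = [
    {"ticket": "My order arrived damaged."},
    {"ticket": "How do I reset my password?"},
    {"ticket": "I was charged twice this month."},
]

# Two prompt variants; v2 adds one line asking for longer answers.
PROMPT_V1 = "Reply to this support ticket: {ticket}"
PROMPT_V2 = PROMPT_V1 + " Write at least three sentences."


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call (an assumption, so the sketch runs offline)."""
    if "at least three sentences" in prompt:
        return "Thanks for reaching out. We are looking into it. Expect an update soon."
    return "Noted."


def rate_length(answer: str) -> int:
    """Toy grader: map sentence count onto a 1-5 score."""
    sentences = [s for s in answer.split(".") if s.strip()]
    return max(1, min(5, len(sentences) + 1))


def evaluate(template: str, model: Callable[[str], str]) -> float:
    """Run one prompt template over every test case and average the scores."""
    scores = [rate_length(model(template.format(**case))) for case in TEST_CASES]
    return mean(scores)


if __name__ == "__main__":
    # Side-by-side comparison of the two variants on the same suite.
    for name, template in [("v1", PROMPT_V1), ("v2", PROMPT_V2)]:
        print(name, evaluate(template, fake_model))
```

Because every variant runs against the same cases, a one-line change to the prompt is re-scored across the whole suite at once, which is the time saving the blog post's example points to.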

In an interview at Google Cloud Next earlier this year, Anthropic CEO and co-founder Dario Amodei said prompt engineering was one of the most important things for widespread enterprise adoption of generative AI. "It sounds simple, but 30 minutes with a prompt engineer can often make an application work when it wasn't before," said Amodei.
