Smarter than the average bear

Anthropic’s Haiku 3.5 surprises experts with an “intelligence” price increase

Anthropic’s smallest AI model now beats its older flagship LLM, Opus, at some tasks.

Benj Edwards
The Claude 3 Haiku logo. Credit: Anthropic

On Monday, Anthropic launched the latest version of its smallest AI model, Claude 3.5 Haiku, in a way that marks a departure from typical AI model pricing trends—the new model costs four times more to run than its predecessor. The reason for the price increase is causing some pushback in the AI community: more smarts, according to Anthropic.

"During final testing, Haiku surpassed Claude 3 Opus, our previous flagship model, on many benchmarks—at a fraction of the cost," Anthropic wrote in a post on X. "As a result, we've increased pricing for Claude 3.5 Haiku to reflect its increase in intelligence."

"It's your budget model that's competing against other budget models, why would you make it less competitive," wrote one X user. "People wanting a 'too cheap to meter' solution will now look elsewhere."

On X, TakeOffAI developer Mckay Wrigley wrote, "As someone who loves your models and happily uses them daily, that last sentence [about raising the price of Haiku] is *not* going to go over well with people." In a follow-up post, Wrigley said he was not surprised by the price increase or the framing, but that saying it out loud might attract ire. "Just say it’s more expensive to run," he wrote.

The new Haiku model will cost users $1 per million input tokens and $5 per million output tokens, compared to 25 cents per million input tokens and $1.25 per million output tokens for the previous Claude 3 Haiku version. Claude 3 Opus, presumably more computationally expensive to run, still costs $15 per million input tokens and a whopping $75 per million output tokens.
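
For a rough sense of scale, here's a back-of-the-envelope sketch in Python comparing what a hypothetical workload would cost at each of those published rates. The token volumes are invented purely for illustration.

```python
# Back-of-the-envelope cost comparison using the per-million-token rates above.
# The workload size (token counts) is hypothetical, chosen only for illustration.

RATES = {  # (input $/1M tokens, output $/1M tokens)
    "Claude 3 Haiku": (0.25, 1.25),
    "Claude 3.5 Haiku": (1.00, 5.00),
    "Claude 3 Opus": (15.00, 75.00),
}

input_tokens = 10_000_000   # hypothetical monthly input volume
output_tokens = 2_000_000   # hypothetical monthly output volume

for model, (in_rate, out_rate) in RATES.items():
    cost = input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate
    print(f"{model}: ${cost:,.2f}")

# Claude 3 Haiku: $5.00
# Claude 3.5 Haiku: $20.00 (four times the old Haiku bill)
# Claude 3 Opus: $300.00
```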

Speaking of Opus, Claude 3.5 Opus is nowhere to be seen, as AI researcher Simon Willison noted to Ars Technica in an interview. "All references to 3.5 Opus have vanished without a trace, and the price of 3.5 Haiku was increased the day it was released," he said. "Claude 3.5 Haiku is significantly more expensive than both Gemini 1.5 Flash and GPT-4o mini—the excellent low-cost models from Anthropic's competitors."

Cheaper over time?

So far in the AI industry, newer versions of AI language models have typically cost the same as, or less than, their predecessors. Anthropic had initially indicated that Claude 3.5 Haiku would cost the same as the previous version before announcing the higher rates.

"I was expecting this to be a complete replacement for their existing Claude 3 Haiku model, in the same way that Claude 3.5 Sonnet eclipsed the existing Claude 3 Sonnet while maintaining the same pricing," Willison wrote on his blog. "Given that Anthropic claim that their new Haiku out-performs their older Claude 3 Opus, this price isn’t disappointing, but it’s a small surprise nonetheless."

Claude 3.5 Haiku arrives with some trade-offs. While the model produces longer text outputs and contains more recent training data, it cannot analyze images like its predecessor. Alex Albert, who leads developer relations at Anthropic, wrote on X that the earlier version, Claude 3 Haiku, will remain available for users who need image-processing capabilities and lower costs.

The new model is not yet available in the Claude.ai web interface or app. Instead, it runs on Anthropic's API and third-party platforms, including AWS Bedrock. Anthropic markets the model for tasks like coding suggestions, data extraction and labeling, and content moderation, though, like any LLM, it can easily make stuff up confidently.
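
Since the model is API-only for now, here's a minimal sketch of what a call might look like through Anthropic's official Python SDK. The model identifier below is an assumption based on Anthropic's usual naming convention, and the classification prompt is a made-up example of the kind of labeling task the company markets the model for; check the API documentation for exact details.

```python
# Minimal sketch: calling the new Haiku model via Anthropic's Messages API
# (pip install anthropic). The model ID is assumed, not confirmed by this article.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-haiku-20241022",  # assumed model identifier
    max_tokens=512,
    messages=[
        {
            "role": "user",
            "content": "Label this support ticket as billing, bug, or other: "
                       "'I was charged twice for my subscription this month.'",
        }
    ],
)

print(message.content[0].text)
```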

"Is it good enough to justify the extra spend? It's going to be difficult to figure that out," Willison told Ars. "Teams with robust automated evals against their use-cases will be in a good place to answer that question, but those remain rare."

Benj Edwards, Senior AI Reporter
Benj Edwards is Ars Technica's Senior AI Reporter and founded the site's dedicated AI beat in 2022. He's also a widely cited tech historian. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.