I asked Claude to generate a frontend and it made the same template. Same sans serif and serif fonts together. Same colors. Same typography. Same layout and animations even. It's wild how similar it is. No, not similar, it's the same damn thing.
> Gartner forecasts that large AI companies would need to earn cumulatively close to $7 trillion in AI-driven revenue through 2029, which is close to $2 trillion per year by the end of the period. In order to achieve "historic returns," the providers would need to earn nearly $8.2 trillion in the same period.
The numbers are made up political correctness anyway.
Everyone's agency is 100% captured by belief in Wall Street. Too few <50 have any meaningful labor skills to blink.
We'll continue to have consent manufactured via media platforms and in 3 years no one will bat an eye at these companies being worth $12 trillion as Altman and Musk climb two ladders holding a "mission accomplished" banner.
Yeah, it's nothing, and it's also not the cost that enterprises are paying. As the article states, the price is $20 per seat per month, PLUS per-token API usage. Enterprises are paying consumption billing, not fixed rate oversubscribed "all you can eat per seat."
I think it's fair to say they had achieved product-market fit when their revenues were growing deep triple digits month over month. What we're seeing now is that perhaps they have achieved profitability, or at least a more sustainable balance sheet.
"Tokens" don't have an intrinsic cost or value. Saying that I used $2,180.16 worth of tokens is like relying on the salesperson to convince me I'm getting a billion dollars worth of pots and pans for $19.99.
I think it's funny how we are throwing critical thinking out the window when it comes to evaluating biased sources of info.
I'm not sure what you're pushing back against here.
I spent $200. If I had been paying API pricing it would have been $2,180.16. The article is about how enterprise customers get charged API pricing, which means if I had been employed by one of those companies I would have cost them $2,180.16.
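For scale, the gap between the two prices in this comment can be worked out directly. This is a minimal sketch using only the figures quoted in the thread ($200 paid, $2,180.16 at API rates); the variable names are illustrative and nothing here is measured independently.

```python
from decimal import Decimal

# Figures quoted in this thread; the names are hypothetical.
subscription_paid = Decimal("200.00")   # what was actually spent
api_equivalent = Decimal("2180.16")     # the same usage at API list prices

# How many times more the same usage costs at API pricing.
multiple = api_equivalent / subscription_paid

# Share of the API-equivalent price that the subscription covered.
covered_share = subscription_paid / api_equivalent

print(f"API pricing is {multiple:.1f}x the subscription price")
print(f"The subscription covered {covered_share:.1%} of the API-equivalent cost")
```

The multiple says nothing about what the tokens are worth; it only compares two list prices for the same usage.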
Just because API pricing would've been $2180.16 doesn't mean that's the value of those tokens. For starters, you personally probably wouldn't have paid that. But also, sales price isn't value. This is like saying, oh, I saw this bar of gold somewhere for $10000 but got it here for $1000! So I got $10000 worth of gold for $1000! - no, the value of that gold is determined by its weight, which wasn't even mentioned.
We have no market convergence on tokens yet (and it'll differ between LLMs), so it's impossible to say what value you got for your $200.
> If I had been paying API pricing it would have been $2,180.16
The point being made above is that API pricing is calculated... somehow... seemingly arbitrarily. It is possibly untethered from infrastructure costs entirely, which would be the basis of any 'value'; that said, a cost basis assumes the labor theory of value, which isn't accurate either. So how do you accurately price these tokens at all (other than through price discovery, which is slow, messy and fuzzy)?
I love HackerNews. God, it's fantastic. Only on HackerNews can you find these deranged personalities who think the pricing model of a near-trillion dollar company is determined "seemingly arbitrarily".
Hi Simon, nice article. The parent there may be making the same assumption I am, that large enterprise _never_ pays sticker price.
Also, to just color in the picture here, as I haven't seen it mentioned elsewhere, there is a very large SaaS company at the moment that has given everyone unlimited tokens on Claude. And they have a dashboard showing who spends the most. So the "budget" went from about USD500 per person (split between Claude and Cursor) in Jan to... Well, a soft limit of USD100k... Per month... Per person.
People can still see the top line sticker price on their spend, but honestly I can't believe that the SaaS company is paying that full price when the invoice comes in.
That said, there are some finance reports which are probably dropping soon where we will find out!
I do know of moderate-size companies deploying OSS LLMs on their own GPU clusters, for ownership/security/maybe cost reasons. I'm somewhat surprised F500 companies are apparently just handing over all their data to the model providers.
Could be fantastic for small shops while it lasts. The big guys have to pay 10x for precious tokens.
Claude is so in demand at the moment that there aren't really volume discounts. Anthropic sets the terms and you either accept them or get lost; they have that much of a lead (mindshare/desirability-wise).
And "large" just means that AWS will assign an account manager to talk with you. I was at a start-up that spent $300k/year on AWS and that was enough to get special attention and discounts. Enterprise pricing is confusing.
API pricing drops DRAMATICALLY in enterprise agreements.
As with pretty much anything priced on volume/usage.
Enterprise deals are negotiated ad hoc; the listed pricing is simply a jumping-off point for the final negotiated discount.
If you're going to give 20,000 employees Claude Code you are not going to be spending $1B per year on Anthropic tokens as if you gave everyone an individual API key. Just as Anthropic isn't paying AWS SES $10,000,000 to send 1 email update to their massive user base when the next Claude version drops.
This isn't true at the moment, though. So far there hasn't been the negotiating power. What happens is you end up capping usage for employees at a fixed amount. I think eventually, prices will come down and there will be discounts, but for enterprise accounts at least of our size (<5000), we're paying almost 100% retail, which kind of sucks, because it's expensive, and pretty easy to burn $50 to $100+ in a day, if you're not careful. In fact we got pushed off the former plan to the token-utility one at the last contract negotiation.
It's going to be interesting to determine the metrics we give to engineers for deciding whether the spend on this is worth it: measuring PRs, lines of code committed, commits fully generated by agentic workflows, etc.
Tokens do have a clearly calculable intrinsic cost. There's the marginal cost of production (i.e. the inference cost) and the amortized R&D cost that goes into the model producing them.
Yes, value is hard to calculate, but luckily market pricing mechanisms exist exactly for this purpose. There isn't a better number to use than what people are willing to pay for them.
So he's saying that on an enterprise plan, he'd be spending $2,180.16. He's not paying that much, but enterprises are.
Lol. They obviously have intrinsic cost, the floor being the cost of electricity. It's hilarious how we are throwing critical thinking out the window when it comes to evaluating biased sources of info.
"[would have spent] $1,199 with Anthropic, $980 with OpenAI"
How many tokens is that, input/output-wise?
(a) I'm curious if you feel like you got $2000 worth of value out of them in the last month?
(b) I'm also curious if you would have gotten similar quality out of a slightly lower-cost provider of an open-weight model? (e.g. Kimi K2.6 and DeepSeek v4 Pro) and what the spend would have been for that.
I myself have managed to spend not quite $4 on OpenRouter and have felt it was very worth it; I just have much smaller, or more targeted requests I guess. (Lately, adding features to a static site generator in Python, or setting up log forwarding via a docker compose file)
Input tokens: 52,545,485
Output tokens: 5,767,253
Cache create tokens: 5,112,029
Cache read tokens: 1,475,069,465
Total tokens: 1,538,494,232
Total cost: $1,199.79
OpenAI Codex:
Input tokens: 52,598,013
Output tokens: 4,681,867
Reasoning output: 2,091,063
Cached input tokens: 1,153,844,864
Total tokens: 1,211,124,744
Total cost: $980.37
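The two usage summaries above can be reduced to a cached-token share and a blended price per million tokens. A rough sketch under stated assumptions: the first, unlabeled block is taken to be the Anthropic usage (it matches the $1,199 figure quoted earlier in the thread), the "Reasoning output" line is left out because the posted total already equals input + output + cached input, and the dictionary keys and `summarize` helper are hypothetical names.

```python
# Token counts and totals as posted above. Assumption: the first block is the
# Anthropic usage, since its total matches the $1,199 figure quoted earlier.
usage = {
    "Anthropic": {
        "input": 52_545_485,
        "output": 5_767_253,
        "cache_create": 5_112_029,
        "cache_read": 1_475_069_465,
        "total_cost_usd": 1199.79,
    },
    "OpenAI Codex": {
        "input": 52_598_013,
        "output": 4_681_867,
        "cache_read": 1_153_844_864,  # posted as "Cached input tokens"
        "total_cost_usd": 980.37,
    },
}


def summarize(stats):
    """Return (total tokens, cached share, blended USD per million tokens)."""
    tokens = {k: v for k, v in stats.items() if k != "total_cost_usd"}
    total = sum(tokens.values())
    cached_share = tokens["cache_read"] / total
    usd_per_million = stats["total_cost_usd"] / total * 1_000_000
    return total, cached_share, usd_per_million


for provider, stats in usage.items():
    total, cached_share, usd_per_million = summarize(stats)
    print(f"{provider}: {total:,} tokens, {cached_share:.1%} cache reads, "
          f"${usd_per_million:.2f} per million tokens blended")
```

The blended figure mixes cheap cache reads with expensive output tokens, so it is only a summary of this particular usage pattern, not a price list.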
I'm confident I got value out of OpenAI - I've been mainly on Codex for the last few weeks.
Not so sure I got that value from Claude, just because I've been using it a lot less and somehow the price came to about the same as OpenAI.
Given the code I've been able to build in the past month I genuinely do think I got value for the API price version, and (don't tell OpenAI or Anthropic) I think I'd have paid full price.
I've not spent nearly enough time with GLM-5.1 and co to compare, but I do know that the prompts I'm using with the agents are not prompts I would have expected to work just three months ago.
The costs are exorbitant and most software is not produced by companies with such a huge moat. Anthropic made a profit through their recent bait and switch pricing. There are zero useful insights online to indicate whether this might die due to commoditisation with good enough open models or fail the race to get more people subsidising unsustainable growth with other people's money. Who knows? In any case they don't seem to be able to drop usage costs, so the business model seems based on wishes.
Algorithms are also improving. I believe it's very unlikely for these two improvements together to not result in one to two orders of magnitude cheaper cost per "intelligence". Of course, that might just make use cases that are too expensive today viable and thereby increase usage further.
Does this analysis factor in potential caching of tokens on the server side? It seems that if they organize things well (as a model provider), they can save quite a lot on that. Looking at my Cursor statistics makes it clear that the token calculations are not at all trivial.
With DeepSeek and Xiaomi MiMo models slashing their prices 99%, I don't see a great future for OpenAI / Anthropic with regards to their $1T valuations. Maybe a $1T valuation will be the whole market, West + East.
They'll still have their dedicated enterprise customers. I think the Chinese providers will pull more of the single users who're paying their own way than those backed by a company budget. And it's a pretty good split as the demand becomes better distributed, resulting in better service (I'll never forget just how bad access to Claude became until they got access to Colossus) and less potential for lock-in (we really don't want there to be a duopoly, etc. on good AI).
So how do OpenAI and Anthropic plan to keep customers when GLM-5.1 is just as good and open source and a lot cheaper?
I don't see the business model working. My closest friend actually does automation software for large companies.
He does not use Claude or OpenAI at all. He primarily uses gpt 120b on Cerebras and GLM-5.1 for heavy thinking work.
And some other small models for various tasks. All open source.
And these systems are extremely useful for the businesses and are able to run fully automated pipelines that are very stable and fast.
We discuss this a lot, and we both think any business doing heavy agentic work on Claude and OpenAI just isn't aware of exactly how good and cheap open source has gotten in the last year.
So... once the legacy businesses and developers catch up, won't Claude and OpenAI be unable to recoup their costs?
Same. It's a nightmare from a Porter's Five Forces perspective.
There will be a ton of businesses competing in this space, and there will be something of a moat due to how capital intensive the business can be, but there will still basically be infinite competitors.
For coding you always want to go with the best model in the category, not something that would have been the best model a year ago, which is what GLM 5.1 is. And I'm saying that as a big fan of GLM, because I run a translation site where GLM is good enough for the price.
Most of the money right now is in coding. OpenAI and Anthropic just have to be 6 months ahead of SOTA open source models and they'll capture most of the enterprise and dev market.
Yes, I'm an engineer (20 years, mostly in the games/graphics industry) and only use it for code. I've been using GLM 5.1 a lot this week. I went in expecting another "decent" but not really "up to standard" open source model.
I highly doubt I'll ever use Claude again.
I think you are wrong about Claude being significantly better.
I've been mostly coding with GLM-5.1 as well and I agree with you. DeepSeek V4 Flash is another very good surprise. Incredibly cheap, fast and effective.
Cost for the value delivered. Like if you offered the current SOTA open source models at $0.1/M, I still think I'd be using Opus or 5.5 at $30/M. Or say GPT 5 which was released Aug 25, I don't think I'd use it for coding for even $0.1. I'd def find other uses for it(translations, agentic workflows, prompt guards etc), but for coding I don't think I'd ever completely switch to a SOTA open model
Unless ofc there was an actual speed difference. The only reason I'd be willing to go with a model a couple of percent worse than the current best is if the speed was at least 5x higher. Looking forward to Kimi K2.6 offered publicly by Cerebras.
For coding assistance, I have tried OpenCode with several large open models through OpenRouter. All were fairly bad compared to Claude Opus.
Could you provide some hints on how I should be holding these open models so that I might get more value out of them?
I agree with the common trope that open models lag behind by about a year, but something magical happened just around a year ago when the state of the art models became extremely useful. By this reasoning we're about to see open models perform well, but I'm afraid there is more to it than just waiting for another revolution around the sun.
Note, my application is coding assistance. Open models can be great for other purposes.
Great article. I know this upsets a lot of people who are used to thinking Anthropic/OpenAI are just lighting cash on fire, but they've cornered the market on enterprises who cannot walk away from these $200/month plans.
However the valuations are still far far away from actual sanity
> enterprise who cannot walk away from these $200/month plans
Any org with more than 150 users isn't on $200/month plans; it is forced into API pricing + $20/month/user.
For individuals and orgs small enough to get to use the subscription plans, that's all well and good until usage limits keep going down, or cost goes up. If you compare the usage you get on $200/month maxed out vs. what that would cost at API pricing, the $200/month plan is an absolute steal. I doubt it will last long.
It's good enough for personal stuff. It doesn't compare to the latest Opus I use at work. You can certainly argue I don't need Opus for work, but there is clearly a difference.
Also, at least with z.ai, GLM-5.1 is s l o w! After using Claude at work, I get really impatient with GLM-5.1 at home. When doing "true" vibe coding (i.e. not really examining the code), Opus is a ton faster (easily 5x).
But yeah, I'm not willing to personally pay for the frontier models. I won't even renew my annual Z.ai plan - it's become too expensive.
Hmm, I use opencode subscription, and glm seems just as fast from the tests I've tried to compare between the two. Tbh it mostly took Claude longer (mostly significantly longer) for the same tests.
Also, and I know you may not want to answer. But could you give me an idea of the type of thing you found glm to be worse with?
I think I've been fairly unbiased in testing a bunch of different development tasks. But am curious if maybe it performs well for some stuff and not others. So if you could share what you feel it's worse at.
Also, are you an experienced developer or less experienced?
I'll repeat something I wrote on an entirely separate HN submission.
When DeepSeek V4 Pro came out, I had been mostly coding with GLM-5.1 on a Z.ai coding plan.
I had a large analysis task on a relatively complex codebase. I decided to try the models out.
GLM-5.1 did acceptably but got a few things wrong (easily corrected) and took quite a while to get there.
Opus 4.6 burnt through the US$10 budget I had given it in about 10-15 min, without ever returning from the first prompt.
DeepSeek V4 returned a full analysis within 2-3 min, and I carried on all the way to implementing the feature I was after. Total cost less than US$1.00.
I now mostly alternate between GLM-5.1 and DeepSeek V4 Flash, with an occasional dip into V4 Pro for more complex analyses.
task i am working on right now at work is comparing two versions of apis and documenting responses in their outputs. i suspect a vast majority of work at enterprise is of similar complexity.
right now everyone is using latest and greatest to do dumb stuff like that. that would change fast if companies start caring about costs.
AI has become indispensable, but maybe not at any cost. My company just had a company-wide meeting to talk about how they're restricting who can use which models and instructing us to "be more responsible with company's tokens". And it's not a small company by any means.
If nothing else, this blog did give me the idea that I should split my $200 Claude Max plan into a $100 CC Max plan and a $100 Codex plan, especially because Claude is now offering 1.5x weekly limits, so the 5x usage is now more like 7.5x usage.
Love how everyone boasted about replacing all the software with ChatGPT and then we end up with coding agents, meaning software engineers are STILL important. The sell is the development tool. It's classic cloud. Where did all the ops people go? Many got subsumed by the cloud companies, YET every company still has DevOps people to manage cloud infrastructure. The layer of abstraction went up but we still need the people to write the glue code and understand the business. OK great, there's a new cash printer in the room. There's a new tool. Let's just start to ground the tooling in its newfound gravity, profitability and IPO market dynamics... Reality has set in. The hype cycle is about to explode... Do you remember ride hailing and just how much cash was burned on credits pre Uber IPO? Then remember the IPO itself? These companies are not the new Google. They are a layer on top. Google was still the most efficient cash printing machine in history beyond the US government and might still be. Will be interesting to see what the trillion dollar IPOs turn into. I'm going to say we see those prices get cut to a third in less than 5 years and scale back up over the next 15-20 years.
So it largely sounds like many more people will be able to write software - and will use AI to do it. Existing software engineers will continue to automate their tasks away like they always did, but perhaps at a faster rate.
The impact of AI in other fields seems to be muted.
I think it is applicable to a much wider range of knowledge work, but it's also harder to apply there.
Software development has the huge advantage that mistakes and hallucinations are very easy to spot: the software works or it doesn't.
Spotting errors in a research report or legal brief is a whole lot harder!
But... non-software professionals spend a huge amount of their time on tasks that can be safely automated: reformatting documents, extracting numbers from PDFs, all flavors of data entry.
Learning how to use a tool like Claude Cowork can take a big dent out of those.
How is the lack of bad news declaring a victory for AI? I am yet to see any company concretely publish analysis about the ROI from AI. Most companies as far as I know are still treating AI investment as sunk cost with no expectation of returns at the moment. We could very well see a world where companies heavily scale back investment.
I think the reasons for them going with API pricing will become abundantly clear when the S-1s become available. If they don't have a story covering how they can get revenue closer to expenses, then they're relying on the market to believe the pixie dust version of their profitability story, which I think people increasingly don't.
I wonder how a focus on per-token API profits will impact the incentives to improve token efficiency and drive down costs through optimized compute. I suppose as long as a few leading labs are competing, we'll see progress in this regard, but it's certainly less in their interest than it is with a flat subscription pricing model.
Who's to say those enterprises won't churn after XYZ comes out with a decent enough model that costs 10x less to use?
There's a whole bag of clever tricks you can play to juice short term results leading to an IPO that may not work longer term.
I'll believe they've found product-market fit when they have a product. Right now they're selling the infrastructure, in a highly subsidized and undifferentiated way (at least over a sufficiently long period of time of, say, a couple of years).
Companies are kool-aid drinking now due to hype, but given how much they're spending, if they don't see REAL, BIG wins from it soon, they're going to scale it back quickly and switch to Chinese models. Claude isn't worth the API cost for a lot of development work, and once companies have had time to collect and crunch data they'll see this.
> Somehow this fragment turned into headlines like Uber's COO says it's getting harder to justify the money spent on AI tokenmaxxing, because the market for stories about AI failures remains enormous.
I notice this all over the place. Many people hate AI and want it to fail, and they're willing to invent misinformation if it supports that idea.
They've got, ballpark, $5t to $10t to make back in the next 5 years, or the hardware buildouts will start getting written down.
This means we're going to need $1t+ per year in spending on tokens. 200m knowledge workers in the world, 30m developers. We're talking about a world where you need 5% of every knowledge worker's salary to go into tokens. 20% if you're a developer.
That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing. +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.
We're not there yet. This is still the upswing of the hype cycle, and unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.
The bottleneck has moved from producing a thing that works to knowing that the thing was the right thing to build. The more of the latter they can take on, the fewer knowledge workers are needed at all. So rather than 5% of every knowledge worker's salary going into tokens, 100% of the knowledge worker's total employment cost goes into tokens and you get a 20x productivity boost as a theoretical minimum across those tasks.
That's the game. There's a view you could take of this that this is just a growing of the pie: with those cost dynamics a lot more "small businesses" get a vast amount of leverage, so the overall economy grows without replacing the knowledge workers. I'm not sure I trust the MBA class to have that view.
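The arithmetic behind the 5% figure in this comment can be sketched as follows. This is an illustration, not the commenter's own calculation: the headcounts, percentages and $1t target come from the comment, while the implied average salary is derived here and is not stated anywhere in the thread. All variable names are hypothetical.

```python
# Figures taken from the comment above.
required_annual_spend = 1_000_000_000_000  # $1t per year on tokens
knowledge_workers = 200_000_000
developers = 30_000_000
salary_share_pct = 5       # % of a knowledge worker's salary going to tokens
developer_share_pct = 20   # % of a developer's salary going to tokens

# Token spend needed per knowledge worker if the target is spread evenly.
per_worker = required_annual_spend // knowledge_workers

# Average salary at which that spend equals the 5% share.
# Derived value, not stated in the thread.
implied_salary = per_worker * 100 // salary_share_pct

# Assumption: developers earn the same implied salary and spend 20% of it.
developer_spend = developers * implied_salary * developer_share_pct // 100

print(f"Spend per knowledge worker: ${per_worker:,} per year")
print(f"Implied average salary at {salary_share_pct}%: ${implied_salary:,}")
print(f"Developers alone at {developer_share_pct}%: ${developer_spend:,} per year")
```

Under these assumptions the 5% figure only closes the gap if the average knowledge worker earns about as much as a well-paid developer, which is worth keeping in mind when reading the replies about headcount.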
> We're talking about a world where you need 5% of every knowledge workers salary to go into tokens.
They are assuming ~10% global GDP growth instead of ~3%. You probably don't need the same %s if the pie grows a ton.
I'm highly skeptical we get that growth, but if you aren't, it makes it easier to digest.
I mean, this AI-productivity case backfires when we talk about GDP.
The more AI increases productivity, the fewer workers will be needed. This will heat up competition in the job market even more and bring salaries down.
Net effect of this productivity increase: less consumption by the masses, even though you may be producing more goods, and much more efficiently.
A third effect also comes into play: once all this starts to happen, common people, who are generally living paycheck to paycheck, will start to hesitate to make any long-term investment, housing included. That will indirectly end up impacting the financial and banking sector, which will then impact existing savings, bond yields and retirement funds, and the recession-like cycle starts.
This productivity increase only makes sense if it is capped at a very small number, like 20% max. Beyond that, who will these companies even be selling to?
Am I overthinking all this?
Somehow Uber and WeWork survived the same kind of grand projections that they never met.
The difference is that they had room to charge their customers more and pay their workers less. The AI industry doesn't have both sides to play at this point. Training and inference are getting more expensive, and if you take on the high prices now you're just floating yourself further downstream from profitability long term (which does not look viable for any of them currently).
uber sure....but how did wework survive? they are a smoldering husk of a failed company looted by its founder
The company's gone but the assets just got sold to other commercial real estate firms.
Uber was basically only ever software to help people use their own cars so a very small part of their valuation was physical stuff to upkeep, it was just deals and obligations they had.
Not sure how it shakes out for Anthropic and OpenAI. There's a lot of physical capacity that needs to be built out and can depreciate. But there's also a lot of network effects and dependencies being built in with enterprise users.
I don't think Uber was doing $1 trillion in infrastructure spend.
WeWork absolutely did not survive
somehow the invisible hand of the market is also blind af
You are making the assumption that the models are only used / paid for by 2.5% of the population (your knowledge worker figure). There will be new value created by these models which people are happy to pay for and which simply did not exist before. It is also naive to say that the hyperscalers are going to be expecting a return on this in 5 years; it will be entirely propped up by investments / IPOs, as has been the case for decades with any tech company trying to reach scale. The hyperscalers are currently spending ~$650B combined annually, which they have the cash for and can sell in future compute instantly.
YEPPP... and I'm kind of shocked at how many people can't do simple math.
Let's put it context. Google's annual revenue seems to be north of $400B. So if OpenAI suddenly had Google's revenue, it would still be insufficient to recover their investment.
and it's a ticking time bomb because $1T in servers, CPUs, GPUs and memory is going to be worth $200B in 5 years. You can say they can keep using what they've got. Sure. But they're also not going to stop spending on new hardware. And the competitor that comes along in 5 years and spends $1T doing the exact same thing is going to have a huge advantage.
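To put the depreciation claim above in numbers, here is a minimal sketch. It takes the commenter's figures ($1T of hardware worth $200B after 5 years) as given and assumes either a constant yearly percentage decline or a straight-line write-down; neither schedule comes from the thread, and the function names are illustrative.

```python
def implied_annual_decline(start_value: float, end_value: float, years: int) -> float:
    """Constant yearly fractional decline that takes start_value to end_value."""
    return 1.0 - (end_value / start_value) ** (1.0 / years)


def straight_line_writedown(start_value: float, end_value: float, years: int) -> float:
    """Equal amount written off each year under straight-line depreciation."""
    return (start_value - end_value) / years


# Figures from the comment, in billions of dollars.
START_B, END_B, YEARS = 1000.0, 200.0, 5

decline = implied_annual_decline(START_B, END_B, YEARS)
per_year = straight_line_writedown(START_B, END_B, YEARS)
print(f"constant-rate decline: {decline:.1%} per year")
print(f"straight-line write-down: ${per_year:.0f}B per year")
```

On those assumptions the hardware loses about 27.5% of its remaining value each year, or $160B a year on a straight-line basis.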
OpenAI at this point reminds me very much of the Russ Hanneman pre-money hype cycle.
There is also the EV (expected value) of developing AGI. Even if you personally believe the probability is low within the lifetime of either of these companies, the value would still be extraordinarily high, enough to forgive a $5T or so miscalculation here or there.
I don't think AGI was ever a serious endeavour. It was something the labs talked up when it gave them headlines and attention.
Now they're charging serious money (and facing public scrutiny about the externalities) they are much more muted about it. They would far rather talk about concrete ROI than a killer robot taking everyone's jobs
Source on 200 million knowledge workers worldwide? My understanding is that it's just above 1 billion. I don't think a billion subscriptions at $1000/yr is out of the question, but it might take a decade to get rolling.
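The headcount matters because it sets the spend needed per worker. A rough sketch, taking the $1T-per-year target from the top of the thread and the two headcounts under discussion (200 million and 1 billion); the function name is illustrative:

```python
def required_spend_per_worker(annual_target_usd: float, workers: float) -> float:
    """Average yearly token spend each worker must carry to hit the target."""
    return annual_target_usd / workers


ANNUAL_TARGET_USD = 1e12  # the "$1t+ per year" figure from the top of the thread

for workers in (200e6, 1e9):
    spend = required_spend_per_worker(ANNUAL_TARGET_USD, workers)
    print(f"{workers / 1e6:,.0f}M workers -> ${spend:,.0f} per worker per year")
```

With 200 million workers the target works out to $5,000 each per year; with 1 billion it drops to $1,000 each.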
A billion? Really? At 200M you're already including a lot of people that stretch the definition of knowledge worker.
> At 200M you're already including a lot of people that stretch the definition of knowledge worker.
How do you know this? I'm certainly open to recalibrating my numbers, which is why I asked for the source.
What's your source, because it looks wildly out of proportion compared to numbers we have now.
I googled "number of knowledge workers worldwide" and read the top results. If you read it as me being confident in a billion, I apologize; I'm just trying to get an accurate count. What numbers do you have now and where did you find them?
Here is a serious question.. Can we sell into the hype cycle and on the way down with this: https://safebots.ai/costs.html
I asked claude to generate a frontend and it made the same template. Same sans serif and serif fonts together. Same colors. Same typography. Same layout and animations even. It's wild how similar it is. No, not similar, it's the same damn thing.
I've seen the same dashboard for a dozen custom web applications now, including a couple I had it make for me.
It really does have a particular lane for each chore, and it's reproducible.
> $5t to $10t to make back in the next 5 years
Wait what? They spent 2 orders of magnitude less on hardware.
From the verge: https://archive.is/kU4Zg
> Gartner forecasts that large AI companies would need to earn cumulatively close to $7 trillion in AI-driven revenue through 2029, which is close to $2 trillion per year by the end of the period. In order to achieve "historic returns," the providers would need to earn nearly $8.2 trillion in the same period.
Those numbers don't even track even in the same sentence. If it is $2T/year by the end of 2029, it would be something < $6T cumulative in 3 years.
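The objection above is a bound: if yearly revenue never exceeds its end-of-period level, cumulative revenue cannot exceed that level times the number of years. A small sketch, assuming non-decreasing revenue that peaks at $2T per year. The window length is not stated in the quoted passage, so the 3-year window is the commenter's assumption and the 4-year row is shown only for comparison; the function names are illustrative.

```python
import math


def max_cumulative(peak_per_year: float, years: int) -> float:
    """Upper bound on cumulative revenue when no year exceeds peak_per_year."""
    return peak_per_year * years


def min_years_needed(cumulative_target: float, peak_per_year: float) -> int:
    """Fewest whole years needed to reach the target without exceeding the peak."""
    return math.ceil(cumulative_target / peak_per_year)


PEAK_T = 2.0        # "$2 trillion per year by the end of the period"
CUMULATIVE_T = 7.0  # "close to $7 trillion" cumulative

for years in (3, 4):
    print(f"{years} years -> at most ${max_cumulative(PEAK_T, years):.0f}T")
print(f"years needed for ${CUMULATIVE_T:.0f}T: at least {min_years_needed(CUMULATIVE_T, PEAK_T)}")
```

So the two quoted figures are only compatible if the forecast window covers at least four years.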
The numbers are made up political correctness anyway.
Everyone's agency is 100% captured by belief in Wall Street. Too few <50 have any meaningful labor skills to blink.
We'll continue to have consent manufactured via media platforms and in 3 years no one will bat an eye at these companies being worth $12 trillion as Altman and Musk climb two ladders holding a "mission accomplished" banner.
> unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.
Simple - you make them work 2x, 5x, or 10x more hours.
There are not enough hours to do that
$200 per month per seat is nothing.
A single 3D CAD license pack for the guys in our R&D group costs multiple thousands of dollars per seat, per month.
It's about time software seats get some love too.
AutoCAD is $175 per user per month [1].
[1] https://www.autodesk.com/products/autocad/buy
Yeah, it's nothing, and it's also not the cost that enterprises are paying. As the article states, the price is $20 per seat per month, PLUS per-token API usage. Enterprises are paying consumption billing, not fixed rate oversubscribed "all you can eat per seat."
CATIA licenses which are the most expensive I've seen are roughly $600/month per user. Where are you seeing "thousands of dollars per seat"?
How many guys is that? Every single white color worker is in the AI ICP.
white collar*, not color
What does ICP mean?
Insane Clown Posse, though given the context here probably Ideal Customer Profile.
I think it's fair to say they had achieved product-market fit when their revenues were growing deep triple digits month over month. What we're seeing now is that perhaps they have achieved profitability or at the least a more sustainable balance sheet.
> $2,180.16 worth of tokens for $200
"Tokens" don't have an intrinsic cost or value. Saying that I used $2,180.16 worth of tokens is like relying on the salesperson to convince me I'm getting a billion dollars worth of pots and pans for $19.99.
I think it's funny how we are throwing critical thinking out the window when it comes to evaluating biased sources of info.
I'm not sure what you're pushing back against here.
I spent $200. If I had been paying API pricing it would have been $2,180.16. The article is about how enterprise customers get charged API pricing, which means if I had been employed by one of those companies I would have cost them $2,180.16.
What am I missing?
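For scale, the ratio between the two figures in the comment above can be worked out directly. This only compares the two prices; it says nothing about what the tokens cost to serve or what they are worth.

```python
SUBSCRIPTION_USD = 200.00      # amount actually spent, per the thread
API_EQUIVALENT_USD = 2180.16   # same usage priced at API rates, per the thread

multiple = API_EQUIVALENT_USD / SUBSCRIPTION_USD
discount = 1 - SUBSCRIPTION_USD / API_EQUIVALENT_USD

print(f"API pricing is {multiple:.1f}x the subscription price")
print(f"the subscription is an effective {discount:.0%} discount on API pricing")
```

That is a multiple of roughly 10.9x, or about a 91% discount relative to API pricing.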
Just because API pricing would've been $2180.16 doesn't mean that's the value of those tokens. For starters, you personally probably wouldn't have paid that. But also, sales price isn't value. This is like saying, oh, I saw this bar of gold somewhere for $10000 but got it here for $1000! So I got $10000 worth of gold for $1000! - no, the value of that gold is determined by its weight, which wasn't even mentioned.
We have no market convergence on tokens yet (and it'll differ between LLMs), so it's impossible to say what value you got for your $200.
> If I had been paying API pricing it would have been $2,180.16
The point being made above is that API pricing is calculated... somehow... seemingly arbitrarily. Possibly untethered to the infrastructure costs entirely: which would be the basis of any 'value', however that holds the labor theory of value, which isn't accurate either. So how do you accurately price these tokens at all (other than through price-discovery: which is slow, messy and fuzzy)?
> So how do you accurately price these tokens at all
Like anything else in the economy: at the point where enough customers can pay you, and not enough will go to the cheaper competition.
I love HackerNews. God its fantastic. Only on HackerNews can you find these deranged personalities who think the pricing model of a near-trillion dollar company is determined "seemingly arbitrarily".
Large enterprises make deals and won't be paying $2,180.16 either. Just like with AWS.
That doesn't seem to be the case. From what I've seen enterprise deals get API pricing now. Have you seen evidence that's not true?
Hi Simon, nice article. The parent there may be making the same assumption I am, that large enterprise _never_ pays sticker price.
Also, to just color in the picture here, as I haven't seen it mentioned elsewhere, there is a very large SaaS company at the moment that has given everyone unlimited tokens on Claude. And they have a dashboard showing who spends the most. So the "budget" went from about USD500 per person (split between Claude and Cursor) in Jan to... Well, a soft limit of USD100k... Per month... Per person.
People can still see the top line sticker price on their spend, but honestly I can't believe that the SaaS is paying that full price when the invoice comes in.
That said, there are some finance reports which are probably dropping soon where we will find out!
I do know of moderate-size companies deploying OSS LLMs on their own GPU clusters, for ownership/security/maybe cost reasons. I'm somewhat surprised F500 companies are apparently just handing over all their data to the model providers.
Could be fantastic for small shops while it lasts. The big guys have to pay 10x for precious tokens.
Claude is so in demand at the moment that there aren't really volume discounts. Anthropic sets the terms and you either accept them or get lost; they have that much of a lead (mindshare/desirability wise).
And "large" just means that AWS will assign an account manager to talk with you. I was at a start-up who spent $300k/year on AWS and that was enough to get special attention and discounts. Enterprise pricing is confusing.
The point is that those are real prices real people are paying for real API usage. It's not made up.
your point is large players won't pay those prices at massive volume. ok
API pricing drops DRAMATICALLY in enterprise agreements.
As with pretty much anything priced on volume/usage.
Enterprise deals are negotiated ad-hoc, the listed pricing is simply a jumping off point for the final negotiated discount.
If you're going to give 20,000 employees Claude Code you are not going to be spending $1B per year on Anthropic tokens as if you gave everyone an individual API key. Just as Anthropic isn't paying AWS SES $10,000,000 to send 1 email update to their massive user base when the next Claude version drops.
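The scenario above implies a specific per-seat figure. A quick sketch using only the comment's own numbers (20,000 employees, $1B per year); the variable names are illustrative:

```python
EMPLOYEES = 20_000
ANNUAL_SPEND_USD = 1_000_000_000

per_seat_per_year = ANNUAL_SPEND_USD / EMPLOYEES
per_seat_per_month = per_seat_per_year / 12

print(f"${per_seat_per_year:,.0f} per seat per year")
print(f"${per_seat_per_month:,.0f} per seat per month")
```

That is $50,000 per seat per year, or a little under $4,200 per seat per month.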
This isn't true at the moment, though. So far there hasn't been the negotiating power. What happens is you end up capping usage for employees at a fixed amount. I think eventually, prices will come down and there will be discounts, but for enterprise accounts at least of our size (<5000), we're paying almost 100% retail, which kind of sucks, because it's expensive, and pretty easy to burn $50 to $100+ in a day, if you're not careful. In fact we got pushed off the former plan to the token-utility one at the last contract negotiation.
Going to be interesting to determine the metrics we give to engineers for deciding whether the spend on this is worth it. Measuring PRs, lines of code committed, commits fully generated by agentic workflows, etc.
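To put the $50 to $100 daily burn mentioned above on a monthly footing, here is a rough sketch. The 21 working days per month is an assumption, not a number from the thread, and the function name is illustrative.

```python
WORKING_DAYS_PER_MONTH = 21  # assumption, not from the thread


def monthly_spend(daily_usd: float, days: int = WORKING_DAYS_PER_MONTH) -> float:
    """Monthly API spend for one engineer at a steady daily burn rate."""
    return daily_usd * days


for daily in (50, 100):
    print(f"${daily}/day -> ${monthly_spend(daily):,.0f}/month")
```

Under that assumption a single engineer lands at $1,050 to $2,100 a month, roughly 5x to 10x a $200 monthly plan.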
> API pricing drops DRAMATICALLY in enterprise agreements
Do you have any numbers or reports to back that up?
Tokens do have a clearly calculable intrinsic cost. There's the marginal cost of production (i.e. the inference cost) and the amortized R&D cost that goes into the model producing them.
Yes, value is hard to calculate, but luckily market pricing mechanisms exist exactly for this purpose. There isn't a better number to use than what people are willing to pay for them.
So he's saying that on an enterprise plan, he'd be spending $2,180.16. He's not paying that much, but enterprises are.
Lol. They obviously have intrinsic cost, the floor being the cost of electricity. It's hilarious how we are throwing critical thinking out the window when it comes to evaluating biased sources of info.
a little critical thinking led me to read that sentence as $2180 worth of tokens [at current api pricing]
I think it's funnier that you can believe some things have an intrinsic cost and others don't
"[would have spent] $1,199 with Anthropic, $980 with OpenAI"
How many tokens is that, input/output-wise?
(a) I'm curious if you feel like you got $2000 worth of value out of them in the last month?
(b) I'm also curious if you would have gotten similar quality out of a slightly lower-cost provider of an open-weight model? (e.g. Kimi K2.6 and DeepSeek v4 Pro) and what the spend would have been for that.
I myself have managed to spend not quite $4 on OpenRouter and have felt it was very worth it; I just have much smaller, or more targeted requests I guess. (Lately, adding features to a static site generator in Python, or setting up log forwarding via a docker compose file)
Claude Code:
OpenAI Codex: I'm confident I got value out of OpenAI - I've been mainly on Codex for the last few weeks. Not so sure I got that value from Claude, just because I've been using it a lot less and somehow the price came to about the same as OpenAI.
Given the code I've been able to build in the past month I genuinely do think I got value for the API price version, and (don't tell OpenAI or Anthropic) I think I'd have paid full price.
I've not spent nearly enough time with GLM-5.1 and co to compare, but I do know that the prompts I'm using with the agents are not prompts I would have expected to work just three months ago.
The costs are exorbitant and most software is not produced by companies with such a huge moat. Anthropic made a profit through their recent bait and switch pricing. There are zero useful insights online to indicate whether this might die due to commoditisation with good enough open models or fail the race to get more people subsidising unsustainable growth with other people's money. Who knows? In any case they don't seem to be able to drop usage costs, so the business model seems based on wishes.
Usage costs will come down with better hardware. Hardware is improving rapidly each generation.
That trend held true for the past three years, but it doesn't feel as safe to me now.
But memory costs are going way up. And both OpenAI and Anthropic bumped up the price of their frontier models in April.
Algorithms are also improving. I believe it's very unlikely for these two improvements together to not result in one to two orders of magnitude cheaper cost per "intelligence". Of course, that might just make use cases that are too expensive today viable and thereby increase usage further.
Does this analysis factor in potential caching of tokens on the server side? It seems that if they organize things well (as a model provider), they can save quite a lot on that. Looking at my Cursor statistics makes it clear that the token calculations are not at all trivial.
I believe the ccusage tool I used takes cached token pricing into account.
With deepseek and xiaomi mimo models slashing their prices 99%, I don't see a great future for openai / anthropic with regards to their 1T valuations. Maybe 1T valuation will be the whole market, West + East.
They'll still have their dedicated enterprise customers. I think the Chinese providers will pull more of the single users who're paying their own way than those backed by company budget. And it's a pretty good split as the demand becomes better distributed, resulting in better service (I'll never forget just how bad access to Claude became until they got access to Colossus) and less potential for lock-in (we really don't want there to be a duopoly, etc. on good AI).
So how do openai and anthropic plan to keep customers when GLM-5.1 is just as good and open source and a lot cheaper?
I don't see the business model working. My closest friend actually does automation software for large companies.
He does not use Claude or openai at all. He primarily uses gpt 120b on cerebras and glm-5.1 for heavy thinking work. And some other small models for various tasks. All open source.
And these systems are extremely useful for the businesses and are able to run fully automated pipelines that are very stable and fast.
We discuss this a lot, and we both think any business doing heavy agentic work on Claude and openai just isn't aware of exactly how good and cheap open source has gotten in the last year.
So... once the legacy businesses and developers catch up, won't Claude and openai be unable to recoup their costs?
> I don't see the business model working.
Same. It's a nightmare from a Porter's Five Forces perspective.
There will be a ton of businesses competing in this space, and there will be something of a moat due to how capital intensive the business can be, but there will still basically be infinite competitors.
Great for consumers.
For coding you always want to go with the best model in the category, not something that would be the best model if we went 1 year back which GLM 5.1 is, and I'm saying that as a big fan of GLM cause I run a translation site where GLM is good enough for the price.
Most of the money right now is in coding. Openai and Anthropic just have to be 6 months ahead of SOTA open source models and they'll capture most of the enterprise and dev market
Yes I'm an engineer (20 years most in games/graphics industry) and only use it for code. I've been using glm 5.1 this week a lot. I went in expecting another "decent" but not really "up to standard" open source model.
I highly doubt I'll ever use Claude again.
I think you are wrong about Claude being any significant level better
I've been mostly coding with GLM-5.1 as well and I agree with you. DeepSeek V4 Flash is another very good surprise. Incredibly cheap, fast and effective.
For coding like for everything else in life cost is a factor.
Cost for the value delivered. Like if you offered the current SOTA open source models at $0.1/M, I still think I'd be using Opus or 5.5 at $30/M. Or say GPT 5 which was released Aug 25, I don't think I'd use it for coding for even $0.1. I'd def find other uses for it(translations, agentic workflows, prompt guards etc), but for coding I don't think I'd ever completely switch to a SOTA open model
Unless ofc there was an actual speed difference, only reason I'd be willing to go with a worse model couple of percent worse than current best model is if the speed was at least 5x higher. Looking forward to kimi k2.6 offered publicly by Cerebras
Most work is not coding.
And also, people have it wrong... their models are not the main problem anymore. It's the RAG
Depending on RAG is a workflow problem, not an AI problem
For coding assistance, I have tried OpenCode with several large open models through OpenRouter. All were fairly bad compared to Claude Opus. Could you provide some hints on how I should be holding these open models so that I might get more value out of them?
I agree with the common trope that open models lag behind by about a year, but something magical happened just around a year ago when the state of the art models became extremely useful. By this reasoning we're about to see open models perform well, but I'm afraid there is more to it than just waiting for another revolution around the sun.
Note, my application is coding assistance. Open models can be great for other purposes.
Great article. I know this upsets a lot of people who are used to thinking Anthropic/OpenAI are just lighting cash on fire, but they've cornered the market on enterprise who cannot walk away from these $200/month plans
However the valuations are still far far away from actual sanity
> enterprise who cannot walk away from these $200/month plans
Any org with more than 150 users isn't on a $200/month plan; they are forced into API pricing + $20/month/user
For individuals and orgs small enough to get to use the subscription plans, that's all well and good until usage limits keep going down, or cost goes up. If you compare the usage you get on $200/month maxed out vs. what that would cost at API pricing, the $200/month plan is an absolute steal. I doubt it will last long.
Have you tried the large open source code models?
I use glm-5.1 and occasionally deep seek v4.
They are as good or better than Claude's latest models.
And significantly cheaper. I've converted 3 of my engineer friends as well. All three have dropped their $200 month plans they had with anthropic.
We've all been a bit shocked at just how good these models are now.
If you have tried GLM (I specifically find it shockingly good for code), did you find it not competitive with Claude, and why?
I use GLM-5.1.
It's good enough for personal stuff. It doesn't compare to the latest Opus I use at work. You can certainly argue I don't need Opus for work, but there is clearly a difference.
Also, at least with z.ai, GLM-5.1 is s l o w! After using Claude at work, I get really impatient with GLM-5.1 at home. When doing "true" vibe coding (i.e. not really examining the code), Opus is a ton faster (easily 5x).
But yeah, I'm not willing to personally pay for the frontier models. I won't even renew my annual Z.ai plan - it's become too expensive.
Hmm, I use opencode subscription, and glm seems just as fast from the tests I've tried to compare between the two. Tbh it mostly took Claude longer (mostly significantly longer) for the same tests.
Also, and I know you may not want to answer. But could you give me an idea of the type of thing you found glm to be worse with?
I think I've been fairly unbiased in testing a bunch of different development tasks. But am curious if maybe it performs well for some stuff and not others. So if you could share what you feel it's worse at.
Also are you an experienced developer or less experienced?
Perhaps opencode zen isn't using z.ai as a provider?
I'll repeat something I wrote on an entirely separate HN submission.
When DeepSeek V4 Pro came out, I had been mostly coding with GLM-5.1 on a Z.ai coding plan.
I had a large analysis task on a relatively complex codebase. I decided to try the models out.
GLM-5.1 did acceptably but got a few things wrong (easily corrected) and took quite a while to get there.
Opus 4.6 burnt through the US$10 budget I had given it in about 10-15 min, without ever returning from the first prompt.
DeepSeek V4 returned a full analysis within 2-3 min, and I carried on all the way to implementing the feature I was after. Total cost less than US$1.00.
I now mostly alternate between GLM-5.1 and DeepSeek V4 Flash, with an occasional dip into V4 Pro for more complex analyses.
task i am working on right now at work is comparing two versions of apis and documenting responses in their outputs. i suspect a vast majority of work at enterprise is of similar complexity.
right now everyone is using latest and greatest to do dumb stuff like that. that would change fast if companies start caring about costs.
> enterprise who cannot walk away from these $200/month plans
But that's the point of the article. Enterprise plans are starting to get API pricing, not the subsidized subscription pricing.
Anyone actually making money paying all of these monthly fees? Or just hobbyists? I have yet to see any real ROI posted anywhere.
AI has become indispensable but maybe not at all costs. My company just had a company-wide meeting to talk about how they're restricting who can use which models and instructing us to "be more responsible with company's tokens". And it's not a small company by any means.
If nothing else this blog did give me the idea that I should split my $200 Claude Max plan into a $100 CC Max plan and a $100 Codex plan, esp because Claude is now offering 1.5x weekly limits, so the 5x usage is now more like 7.5x usage.
Love how everyone boasted about replacing all the software with ChatGPT and then we end up with coding agents, meaning software engineers are STILL important. The sell is the development tool. It's classic cloud. Where did all the ops people go? Many got subsumed by the cloud companies, YET every company still has DevOps people to manage cloud infrastructure. The layer of abstraction went up but we still need the people to write the glue code and understand the business. OK great, there's a new cash printer in the room. There's a new tool. Let's just start to ground the tooling in its new found gravity, profitability and IPO market dynamics... Reality has set in. The hype cycle is about to explode... Do you remember ride hailing and just how much cash was burned on credits pre Uber IPO? Then remember the IPO itself? These companies are not the new Google. They are a layer on top. Google was still the most efficient cash printing machine in history beyond the US government and might still be. Will be interesting to see what the trillion dollar IPOs turn into. I'm going to say we see those prices get cut to a third in less than 5 years and scale back up over the next 15-20 years.
So it largely sounds like many more people will be able to write software - and will use AI to do it. Existing software engineers will continue to automate their tasks away like they always did, but perhaps at a faster rate.
The impact of AI in other fields seems to be muted.
I think it is applicable to a much wider range of knowledge work, but it's also harder to apply there.
Software development has the huge advantage that mistakes and hallucinations are very easy to spot: the software works or it doesn't.
Spotting errors in a research report or legal brief is a whole lot harder!
But... non-software professionals spend a huge amount of their time on tasks that can be safely automated - reformatting documents, extracting numbers from PDFs, all kinds of flavor of data entry.
Learning how to use a tool like Claude Cowork can take a big dent out of those.
How is the lack of bad news declaring a victory for AI? I am yet to see any company concretely publish analysis about the ROI from AI. Most companies as far as I know are still treating AI investment as sunk cost with no expectation of returns at the moment. We could very well see a world where companies heavily scale back investment.
I think the reasons for them going with API pricing will become abundantly clear when the S-1s become available. If they don't have a story covering how they can get revenue closer to expenses, then they're relying on the market to believe the pixie dust version of their profitability story, which I think people increasingly don't.
I wonder how a focus on per-token API profits will impact the incentives to improve token efficiency and drive down costs through optimized compute. I suppose as long as a few leading labs are competing, we'll see progress in this regard, but it's certainly less in their interest than it is with a flat subscription pricing model.
Who's to say those enterprises won't churn after XYZ comes out with a decent enough model that costs 10x less to use?
There's a whole bag of clever tricks you can play to juice short term results leading to an IPO that may not work longer term.
I'll believe they've found product-market fit when they have a product. Right now they're selling the infrastructure, in a highly subsidized and undifferentiated way (at least over a sufficiently long period of time of, say, a couple of years).
Realistically, OpenAI found product market fit with the OpenAI API playground in 2021. People were using that as ChatGPT at the time.
Companies are kool-aid drinking now due to hype, but given how much they're spending, if they don't see REAL, BIG wins from it soon, they're going to scale it back quickly and switch to Chinese models. Claude isn't worth the API cost for a lot of development work, and once companies have had time to collect and crunch data they'll see this.
>Somehow this fragment turned into headlines like Uber's COO says it's getting harder to justify the money spent on AI tokenmaxxing, because the market for stories about AI failures remains enormous.
I notice this all over the place. Many people hate AI and want it to fail, and they're willing to invent misinformation if it supports that idea.
I wonder how Ed Zitron will shift goal posts this time, and how long it will take for that article, when published, to reach HN front page.
> Anthropic are strongly rumored to be about to have their first profitable quarter.
Is that quarter the same as any other quarter in terms of infrastructure costs (e.g. are there any temporary discounts happening coincidentally)?
Hey man, that discounted rate on Colossus 1 inference is purely coincidental...
Didn't xAI basically donate the compute for that quarter so Anthropic could get to say they turned a profit?
The SpaceX S-1 says they're charging Anthropic $1.25b a month.
It also states that the first few months (this current quarter where Anthropic are reporting profit) are discounted.