AI's economics don't make sense

(wheresyoured.at)

235 points | by spking 2 days ago

191 comments

  • JohnMakin 2 days ago

    I've sort of lost some respect for ed that I had early on in the hype cycle - he's still right about some things, but I can see him slowly and subtly retreating from his strong position, held even a few months ago, that these things will never ever be useful for anything and it's all a scam because they don't actually do anything at all except burn money. He would say it like 8 times a monologue. I remember one podcast maybe ~6 months ago he brought a developer skeptic on, and was trying to get him to say it wasn't actually useful for coding, and the dev was like "maybe not as advertised, but I definitely use it and it is useful to me" and he pivoted off the topic very quickly.

    It seems he realizes he was wrong about that and has pivoted slowly to, "well, maybe they work sometimes, but the cost isn't justified." Which is a reasonable question! I just find his style of never admitting when he is wrong off putting and the way he presents things as absolute fact, when he's guessing like the rest of us. He was right about a lot, wrong about a lot, it's okay to admit that, I don't think his fan base would care.

    • chromacity 2 days ago

      That's essentially how you become an online pundit. The internet rewards provocative takes. If you have a tendency to doubt yourself and revise your views, then (a) your views become less provocative and thus less likely to translate into click-worthy headlines; (b) you end up biting your tongue or saying "I don't know" often enough that it becomes impossible to keep up with the requisite weekly publication schedule.

      Which is to say, it's easy to scapegoat this guy, but I think his approach is not any different from other "opinion piece" bloggers that we all tend to reshare.

      • bdangubic 2 days ago

        > That's essentially how you become an online pundit. The internet rewards provocative takes.

        internet rewards provocative takes - plural.

        this mate has a single take and writes more about this one thing than jrr tolkien did in all his works combined

    • great_tankard 2 days ago

      This is exactly how I feel about him too. I also find his "number big" approach to writing ("check out my 18,000 word blog about something I'm learning about in real time") off-putting, so I've completely stopped engaging with it.

      We need better critics of the industry.

      • cyclonereef 2 days ago

        I always get the sense around a third of the way through his articles that whoever reads his drafts just gives up. It goes from wordy and repetitive to wordy, repetitive, filled with rage-bait exasperation and more filler than content.

        Give the man a 2000 word budget and he could probably write a better article and cover the same information

      • chromacity 2 days ago

        > We need better critics of the industry.

        There's plenty, but they don't have enough material to post once a week. And if you don't post once a week, you don't end up on HN once a month. As simple as that. Looking at the blogs that show up on HN regularly, the usual hit rate is 10-25%.

        • great_tankard 2 days ago

          Yes, but the HN crowd isn't Zitron's main audience. He appeals to smart people who don't understand anything about computing or business. I do not mean this in a disparaging way; it's a curious audience that has somewhat justifiable moral and aesthetic objections to LLMs and especially the companies peddling them.

          The problem is that Zitron has charm, an authoritative voice and a very aggressive online presence. That's a difficult combination to compete against.

          • JohnMakin a day ago

            I started following early 2024, and the scene was much different - he was mostly a lone voice against the insane hype at the time, which definitely was not delivering. I liked hearing that opinion amid a wave of bullshit and slop that, coming off the blockchain mania, was very difficult to stomach.

            The landscape has changed a lot, but his content has remained mostly the same, maybe much more aggressive and less curious (in the beginning he would entertain other viewpoints more often). Since the tech itself has changed around him, his repetitive shtick falls a lot flatter than it used to, because he is completely unwilling to entertain any position other than the one that established his blog/show.

      • Lerc 2 days ago

        >We need better critics of the industry.

        I often wonder if there are people promoting people like Zitron because they want the poor quality criticisms to be prominent enough to be the ones that they face most often. It must be a lot easier than having to address valid criticisms.

    • tuveson 2 days ago

      I remember when CrowdStrike caused that huge outage, he basically blamed Windows / Microsoft for it. I kind of stopped taking him seriously after that. I more-or-less agree with his point of view, but he seems more interested in selling outrage than journalism.

      • JohnMakin 2 days ago

        I agree. Early on, it felt more like journalism, then I think he blew up and found something that works. If you challenge him on this, he will call you insecure or jealous, which I also find obnoxious[0]. I also find it highly ironic that all the ads on his podcast, at least on Apple, are selling AI-related products.

        [0] - https://www.reddit.com/r/BetterOffline/comments/1p5zv33/why_...

        • CodingJeebus 2 days ago

          FWIW, iHeart Radio probably manages his ad runs. He likely has no say over which ads get run on his show, and as I understand, the podcast advertising market has slowed tremendously in 2026. Podcasting platforms can't be as picky as they used to be.

          • causalmodels 2 days ago

            He may not have control over the podcast spots, but his PR firm does have several AI companies as clients.

          • lesostep a day ago

            iHeart Radio ads are usually from other podcasts though. I listen through PodBean and all their ads are for other shows.

            iHeart is so anti-AI they added "Guaranteed Human" in the middle of every podcast they stream.

            Does Apple run additional ads on podcasts?

    • mrandish 2 days ago

      I've only read a few of his pieces here and there and had just assumed he was an AI skeptic, so I never thought his position was LLMs would never be good for anything at any price. That's a pretty extreme thing for any serious person to have ever claimed. Frankly, it seems more like a straw man exaggeration of AI skepticism. I consider myself to generally be an AI skeptic, but to me that means skepticism about:

      1) Nearer-term investment returns on AI businesses and data center build-outs.

      2) Claims that LLMs are now (or soon will be) rapidly displacing most/all senior positions in certain high-skill professions (e.g. software engineering, music/film making, etc.), leading to fewer overall jobs for those kinds of workers and mass unemployment.

      3) The "Foom" overnight takeoff hypothesis that AI will soon be able to iteratively sustain substantial self-improvement directly yielding profound new fundamental capabilities across infinite generations with no human involvement.

      I've never thought that AI isn't already quite useful for some things today, or that no investors will ever make money on AI, or that AI won't displace some workers in some types of jobs, or that using AI isn't already helping accelerate the development of AI. Just that there's been a lot of hype, exaggeration and over-estimation about how much impact, how soon and how broad. There will be a few instances of rapid, large impacts but the majority of it will be slower, more gradual and less disruptive than extreme predictions - and many of the most over-the-top predictions may not ever happen. Not because they can't happen but probably for more mundane economic, logistic and human-factors reasons along the lines of why we're no closer today to the 1950s visions of a flying car in every driveway.

      • JohnMakin a day ago

        Yea, this is a good article documenting how he was claiming this early on in 2024, that the models were as good as they would ever be and mostly worthless:

        https://www.theargumentmag.com/p/ais-biggest-critic-has-lost...

        • mrandish 19 hours ago

          Thanks for that link. It's solidified the growing suspicion I've had that Zitron wasn't worth paying much attention to. If I'd read more than 5 or 6 of his posts I'd probably have gotten there sooner. I now place him alongside AI critics like Gary Marcus whose early intuitions seem to have hardened into an extreme and unchanging broken record instead of a more reasonably nuanced counter to the most frothy AI hype.

          It's sad because such extreme, over-broad views presented as absolutes save AI zealots the trouble of creating straw men of skeptical positions. It's easier to just lump all AI skeptics together with Zitron and Marcus. I guess it's time to call myself something else, like maybe "AI Realist." My skepticism around AI has always been more specifically targeted to questioning more extreme claims about the degree of impact and how soon it will be meaningfully felt across broader society. I've also tried to be clear my concerns are centered on LLMs and not AI or machine learning in general.

          My position regarding the long-term (5-10 yrs) has always acknowledged that LLM-based solutions will continue to improve substantially, find more real-world, meaningful use cases and that the currently unsustainable cost-to-value will eventually normalize to a sustainable equilibrium enabling profitable businesses (after some major financial pain); but, that LLMs as a technology still have some fundamental limits on what they can do which aren't separable from how they innately work. Practically, this means I doubt that LLMs, as one type of AI, can ever fully replace an experienced, highly-effective human's ability to self-develop fundamental new knowledge from novel contexts then reduce that learning to high-value abilities in applied practice and then iteratively build on that loop to discover entire new areas of knowledge which weren't even visible without the prior layer of new knowledge - and then do that over and over. I've never thought that goal is categorically impossible for AI, just that it will require a new and different approach beyond LLMs. While that new approach may incorporate LLMs as an essential component, just evolving, refining and expanding LLMs alone won't get us there. I'm encouraged that recently several top AI research luminaries have been saying similar things.

      • dualvariable 2 days ago

        Yeah, I similarly doubt that LLMs are going to directly lead to AGI just via scaling and might almost be a dead end in that direction.

        But they're still quite useful tools and accelerators or force-multipliers.

        And you're still going to need humans in the loop.

        And I'm very worried that the capex buildout will implode once we hit diminishing returns and good-enough models can be run on substantially smaller footprints.

        It all isn't going away, though, and it will still continue to improve.

      • jcgrillo 2 days ago

        But are there any viable AI products? That's, I think, the root of his claim that it won't ever be good for anything. So far I have yet to hear of a really good, successful AI product. Coding tools arguably kind of work, but that's a pretty small addressable market, and it's still quite unclear whether any of them are viable long-term commercial bets. If you can get good results with Qwen 3.6-27B and Opencode what good is an Anthropic? There are a lot of big, unanswered, foundational questions like that in this space. That's pretty alarming given the huge amounts of capital being tossed around. Commercially, I think the jury is still out on whether LLM driven AI will ever be good for anything, and it's not necessarily an unreasonable position to take given the fundamental weaknesses of the underlying technology.

        • mike_hearn a day ago

          What are you defining as good and successful?? ChatGPT has 800M+ WAU, that seems pretty good and successful to me (not financially but they have time).

          AI companies aren't selling coding tools. Claude Code is not a coding tool! It's a tool that does coding, which is subtly different. The total addressable market for a coding tool is all developers, which is maybe 25-30M people worldwide, the total addressable market for people who need code written is potentially around a few billion or so, maybe more.

          • jcgrillo a day ago

            I'd like to see one of the major AI players demonstrate a successful exit. I don't think Coreweave counts here, because their long-term success is so tightly tied to the AI bubble continuing forever, which it probably won't. I want to see a strong company emerge from the bubble and start delivering real, sustainable value to its customers and investors. That would convince me it's possible to build a decent product and a real business on LLM AI technology.

      • dd8601fn 2 days ago

        Yeah the dotcom crash didn’t prove that the internet was useless for business. And the housing crash didn’t mean houses don’t have value.

        We get hype bubbles. They’re (nearly?) always bigger than the thing they’re about, in a given time and place.

        It’s reasonable to think the AI hype train is one of those, to some degree or another. It’s also reasonable to see great utility in llms, now and in the future.

    • hparadiz 2 days ago

      The economics is spending a few hundred bucks on software for an IC you're already paying over ten grand a month in order to make them more productive. How are supposedly smart industry experts not seeing this obvious fact? Are these guys actually experts?

      • Yizahi 2 days ago

        It's more like spending potentially a thousand bucks (hypothetically - heavy API usage by a developer running top-of-the-line agents at 100% every day, priced to actually be profitable) when you are paying that dev 4 to 6 grand before taxes. Now that would be a close call.

      • Rury 2 days ago

        NVIDIA execs are now saying otherwise: https://fortune.com/2026/04/28/nvidia-executive-cost-of-ai-i...

        Maybe Ed is right even if he's wrong on some things?

      • xienze 2 days ago

        > The economics is spending a few hundred bucks on software for an IC you're already paying over ten grand a month

        Let's be fair here, the endgame is not "a few hundred bucks a month." Not for how much money has been invested. How much extra you have to spend to make developers how much more productive, and will companies go along with it is the trillion dollar question.

        • koliber 2 days ago

          A long time ago a vast majority of people on earth were farmers. They used relatively simple tools like scythes.

          Over a few centuries better tools and technology made it so that <5% of the population in rich countries are farmers. They use tools like million dollar harvesters.

          • legulere 2 days ago

            It's not the 20x efficiency of harvesting technology compared to what agrarian societies had that makes those machines make sense. It's the productivity of the other 95% of the population that makes their labor cost so high that such expensive machines make economic sense.

        • hparadiz 2 days ago

          You know I can just look up the costs per seat, right? It's not that much, and not everyone is a heavy user at an org. And for code the costs are falling per compute cycle.

          • xienze 2 days ago

            First, the key phrase here is "end game." Whatever you're looking at now isn't where prices will be in short order.

            Second, it seems hard to believe that hundreds of billions of dollars would be spent and untold numbers of data centers would be built just to gain a measly couple hundred dollars per seat.

            • fragmede 2 days ago

              But it's a lot of seats. If you get 1 billion people to pay $20/month, that's $20 billion a month, or $240 billion a year. Multiply that by 10 years and you have $2.4 trillion.

      • CodingJeebus 2 days ago

        It's a few hundred bucks per month for now, but that's not going to last. At some point, the industry is going to pivot towards tracking token-based productivity because it's not going to be cheap forever unless FOSS models catch up.

        • m4rtink 2 days ago

          Please don't call open weight models FOSS models - that's actually very wrong, unless you actually have all the training data and can modify the data and training methodology to retrain the model yourself.

        • zozbot234 2 days ago

          FOSS models have effectively caught up wrt. scale, see e.g. the latest DeepSeek V4 series - but they still require major hardware resources (hundreds of gigabytes of RAM for a very lean deployment targeting single- or few-users inference) to run at acceptable throughput.

    • cottoneyejoe 2 days ago

      His reasoning about costs is also completely flawed. API fees aren't the providers' costs. The price is a largely arbitrary number that they think they can get away with, based on what everyone else seems to be doing, and that they also expect to cover on-demand usage as well as their research, marketing, and stock buybacks. They likely have a 60-90% gross margin.

      • senectus1 2 days ago

        Show me an AI company that is making a cash-flow profit.

    • Yizahi 2 days ago

      Ed's writing style is often off-putting, repetitive and sometimes gives almost "desperate" vibes. But he does raise questions no one in the industry is seriously entertaining and exploring. What if those monsters are indeed unprofitable, now what? So while I stopped reading him regularly, I visit once a quarter just to read something not about our inevitable benevolent apocalyptic LLM gods and their Prophet St. Sam, prophesying a complete job loss and despair.

      This reminds me of a Bitfinexed blog situation. That guy researched and proved Tether token scam for years and he was right. But he didn't account for a tiny nuance - Tethers are useful for financial crime and are propped by that public regardless of the financial viability or rejection by every decent financial institution. Turns out you can have a hundred billion of unbacked tokens, if they are "alternatively backed" instead. I suspect LLM monsters may turn out the same way (or not).

      Serious question - are there any LLM bubble critics with more sane and to the point style of writing and not just posting unsubstantiated hype for views like most on YT?

    • CSSer 2 days ago

      Weird, especially since a lot of us have similar opinions. Was he saying that from the start and since shifted focus to it or is it completely new? The conversation about cost isn't exactly a new one.

    • alsetmusic 2 days ago

      I am sympathetic to his view because I also considered the whole AI hype train a complete scam until pretty recently. When I saw enough people validating that coding agents were actually legitimately ok and sometimes good at things, I decided to spend $50 on one to test it out.

      I have been pleasantly surprised at its utility knocking out grunt work. It's not super smart, but it's great at things like writing a python script to edit characteristics of a jsonl file or sorting structured data. I didn't ever expect it to be useful beyond extremely limited output and it's actually kinda good when you know how to narrowly target the tasks. The constraints of code make it a more suitable category than all the other stuff.

      It's still a bs hype machine with Elon saying it might save all of humanity in court today. That's pretty unlikely.

  • joshjob42 2 days ago

    There's a few major problems with the article. The most obvious is that frontier labs are not charging remotely close to the cost of tokens; afaik most estimate north of 80% profit margins. As a reference, providers are profitably providing Kimi K2.6 for $4/1Mtok out. Is that as good as Opus? No, but it's probably at least Sonnet level, so that's ~4x cheaper than Sonnet while still being profitable to serve on the margin. So you aren't plausibly getting into actual subsidization territory until you're over 5:1 sub to nameplate token costs.

    How many tokens can you realistically burn through in one chat session? Opus and many other frontier models do maybe 60tok/s, less than 250k/hr out. On input you can use more, but in most cases cache is 5-10:1 cheaper than new input. Say you average 500ktok in, 90% cache, per request. That amounts to 100-150ktok in new input-equivalent costs, which in most cases is ~20-30ktok in output-equivalent costs. Do a request every minute, that's a total of about 1.5-2Mtok/hr. At API prices that's $50/hr for Opus, but really it probably only costs Anthropic $10/hr to serve that.
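
A rough Python sketch of the per-hour estimate in the paragraph above. The function name and parameters are illustrative only. The 5:1 input-to-output price ratio and the $25 per million output tokens are assumptions picked to land near the comment's figures, not numbers stated in this thread, and nonstop generation at 60 tok/s is treated as an upper bound.

```python
def output_equiv_tokens_per_hour(
    input_tokens_per_request=500_000,  # from the comment: ~500ktok in per request
    cache_hit_fraction=0.90,           # from the comment: 90% served from cache
    cache_discount=10.0,               # assumption: cached input is 5-10x cheaper
    input_to_output_price_ratio=5.0,   # assumption: output priced 5x new input
    requests_per_hour=60,              # one request per minute
    output_tokens_per_sec=60,          # from the comment: ~60 tok/s generation
):
    """Return hourly usage expressed in output-token equivalents."""
    cached = input_tokens_per_request * cache_hit_fraction
    fresh = input_tokens_per_request - cached
    # Cached tokens are discounted back to "new input" tokens.
    new_input_equiv = fresh + cached / cache_discount
    # New-input tokens are then converted to output-token equivalents.
    per_request = new_input_equiv / input_to_output_price_ratio
    generated = output_tokens_per_sec * 3600  # upper bound: generating nonstop
    return per_request * requests_per_hour + generated


low = output_equiv_tokens_per_hour(cache_discount=10.0)
high = output_equiv_tokens_per_hour(cache_discount=5.0)
price_per_mtok = 25.0  # hypothetical list price, USD per 1M output tokens
print(f"{low / 1e6:.2f}-{high / 1e6:.2f} Mtok/hr")
print(f"${low / 1e6 * price_per_mtok:.0f}-${high / 1e6 * price_per_mtok:.0f}/hr at list price")
```

Under these assumptions the result sits in the same ballpark as the 1.5-2Mtok/hr and roughly $50/hr quoted above; changing the cache discount or price ratio moves it noticeably.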

    That said, even if a developer is burning $50/hr, many, many employees at large companies cost more than $100k/yr to employ all costs considered, so making them say 20-30% more productive can easily make that worth it for most. If the labs shave their margins ultimately to more like 20-30%, you'd have ~$15/hr in costs to use the services, and nearly every white collar job is way over 30k/yr to employ. If your salary is 80k, you probably cost the company 200k all in, so making you 15% more productive offsets the $15/hr cost.
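
The break-even claim in the paragraph above can be checked with a few lines of Python. This is a simplified sketch: `break_even_gain` is a hypothetical helper, the 2,000 working hours per year is an assumption, and it treats a productivity gain as worth the same fraction of the employee's all-in cost.

```python
def break_even_gain(tool_cost_per_hour, loaded_cost_per_year, hours_per_year=2000):
    """Fraction of extra output needed for the tool spend to pay for itself."""
    # hours_per_year=2000 is an assumption (about 40 h/week for 50 weeks).
    return tool_cost_per_hour * hours_per_year / loaded_cost_per_year


# Figures from the comment: $15/hr tooling, $200k/yr all-in employee cost.
print(f"{break_even_gain(15, 200_000):.0%}")
# The $50/hr API-price case against the same employee.
print(f"{break_even_gain(50, 200_000):.0%}")
```

The required gain scales linearly with the hourly tool cost and inversely with the loaded cost, so the conclusion depends heavily on which compensation baseline is assumed.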

    So first party providers are not in a horrifying position or anything from a subsidization standpoint. The people in bad shape are Cursor and Perplexity, who don't have frontier models and are dependent on the open source community, which is typically 6-12 months behind the frontier. They have to pay full freight API costs at 80% margin for the big boys to serve their harnesses, which is indeed untenable, and they'll have to either force users to use open source models and/or in house models they can serve at-cost or they will have to charge vastly more.

    Gemini, Claude, and ChatGPT first-party services like Antigravity, Codex, and Claude Code are not in serious trouble though.

    • zozbot234 2 days ago

      It's not even a fixed cost per token (even though it's billed that way, and that's still miles better than a fixed-price all you can eat). You're incurring a cost that's proportional to generated tokens times the context for each (plus the prefill cost for any uncached input), so the expense grows quadratically with your average generated context.
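
A minimal Python sketch of the quadratic growth described above, assuming plain full attention where every generated token attends over everything before it. `context_token_reads` is a hypothetical name, and real serving stacks with cheaper attention variants or caching tricks will not follow this model exactly.

```python
def context_token_reads(prompt_tokens, generated_tokens):
    """Total context positions attended to while generating, under full attention."""
    # Generated token i (0-based) attends over the prompt plus the i tokens
    # produced before it, so the total is an arithmetic series.
    return (prompt_tokens * generated_tokens
            + generated_tokens * (generated_tokens - 1) // 2)


# Doubling the generated length roughly quadruples the work.
for n in (1_000, 2_000, 4_000):
    print(n, context_token_reads(prompt_tokens=0, generated_tokens=n))
```

Per-token billing charges linearly for the same session, which is why long agentic runs are where the gap between price and serving cost is widest.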

      This all becomes extremely visible when trying to do agentic coding with local language models - you quickly realize that controlling context length and model size is just as important as avoiding wasted effort. The real scam is not AI Q&A ala ChatGPT, that's actually quite viable - though marginally less so as conversations grow longer. It's agentic coding with SOTA models and huge contexts.

      • GaggiX 2 days ago

        Using larger contexts often costs more in the APIs or consumes more of your quota but this is becoming less of a problem with models using more clever attention mechanisms and not just full attention on all layers.

        You can look at: https://sebastianraschka.com/llm-architecture-gallery/ and see how much things have changed.

        • margalabargala 2 days ago

          This is also something of a non issue because as context grows and attention gets diluted, the models perform worse. It'll cost Anthropic more to run your 900k context session, yes, but it's in your interest not to have a 900k session in the first place.

    • boelboel 2 days ago

      Isn't this akin to saying Big Pharma companies could easily make money if they just stopped doing expensive research? The massive R&D spend is the core of the business plan; it's the only reason they can demand high prices in the first place. Once OpenAI stops spending billions on training, their pricing power vanishes because users will just migrate to Anthropic or whoever releases the next frontier model. Would imply there'd be space for only one to outlast them all in some sort of war of attrition (perhaps similar to silicon industry).

      • kimetime 2 days ago

        Big Pharma does seem like a good comparison for the frontier lab business model. The labs don't really have the patent protection or distinct diseases pharma does, though. I wonder if labs will start more heavily branding “specialties” instead of general capabilities to develop some differentiation.

    • tobbe2064 2 days ago

      Your math is pretty bad. Going by Swedish standards, $50/h is a yearly cost of $50/h × 40 h/week × 48 weeks/year = $96k/year. At that rate it's a really shitty bargain for a 30% increase in productivity. Even if you drop it to $20/h and sort of break even, you are losing competence building and theory building, decreasing the likelihood of making architectural progress, and risk getting bogged down in a swamp.

      • joshjob42 a day ago

        An employee often costs a company 2-3x their salary, so someone making 100k a year, costing 300k/yr, who is made 33% more productive (100k more worth of work to the company) offsets the compute cost.

    • loeg 2 days ago

      > How many tokens can you realistically burn through in one chat session?

      I've used single digit billions in a couple days, FWIW.

      • kcartlidge 2 days ago

        I'm a fair bit lower than some others as I only use it outside of work hours on my own small projects, but my Cursor account shows (for a random recent date) 12,184,233 tokens in a day. That day feels pretty representative.

        That's with 86 interactions spread intermittently over a couple of hours so if I did a full working day like that I'd be looking at maybe 40 to 50 million.

        • loeg 2 days ago

          My employer is paying for it, so I'm cost insensitive, and this is mostly with Claude / Opus 4.7 (which consumes a lot of tokens?).

      • bwestergard 2 days ago

        What sort of work were you doing?

        • loeg 2 days ago

          Converting a couple hundred kLOC C++ codebase to Rust.

          • bwestergard 2 days ago

            Cool. Sounds like it went well?

            • loeg 2 days ago

              Maybe! Still evaluating if the output does what it's supposed to do.

        • xienze 2 days ago

          Not the parent, but the way developers are basically trying to create entire development "teams" consisting of multiple agents that work around the clock using the latest, most expensive models (naturally) lends itself to burning insane amounts of tokens.

    • lbreakjai 2 days ago

      Problem with this math is it always assumes some ridiculous baseline compensation (or costs, in this case) as a matter of fact. There's an entire world of developers not costing 200k to their employers.

      Truth of the matter in most companies large enough is if you make your devs 30% more productive, then that'd mean 30% more code going through "change management" hell for months. You're not even paying to stand still, you're just pushing even more down a bottleneck. The price most people are willing to pay to make things worse is close to zero.

    • Balinares a day ago

      > providers are profitably providing Kimi K2.6 for $4/1Mtok out.

      Do you perchance have a source for this? Is the profitability assessment comprehensive, including hardware amortization? I've found it hard to track down actual hard numbers for the cost of inference.

    • ToucanLoucan 2 days ago

      > That said, even if a developer is burning $50/hr, many, many employees at large companies cost more than $100k/yr to employ all costs considered, so making them say 20-30% more productive can easily make that worth it for most. If the labs shave their margins ultimately to more like 20-30%, you'd have ~$15/hr in costs to use the services, and nearly every white collar job is way over 30k/yr to employ. If your salary is 80k, you probably cost the company 200k all in, so making you 15% more productive offsets the $15/hr cost.

      Nobody including the connected article is making the argument that this cannot be profitable ever. People are saying "there is no way this admittedly quite interesting tool is going to be able to make back all of this money" and I think they are completely right to say that.

      You can absolutely make money with this stuff, just not at this scale. The buildout for this shit has been certifiably crazy and a number of the involved firms are overleveraged for tens and even hundreds of billions of dollars.

      How in the sweet fuck are you paying that off, plus giving investors dividends, selling this at $15/hour/user??? That math does not math. A quick google says there are between 1.5 and 4.4 million developers in the US alone, let's say it's 5 million, to be generous, and each of them is subbed to this for 8 hours per day, continuously. That's 600 million per year in revenue. If you took ALL that revenue, and put it towards paying down this debt, not leaving any for employee salaries, upkeep, ongoing development, it would take DECADES to pay down what OpenAI already owes.

      And yes I'm sticking directly to code, because that's the only thing I've seen it be really good at. Are we really proposing that every knowledge worker on earth and every manager of such workers is going to have an autonomous agent running all the time!? To do what, make sure they don't have to read or write email? Which even just that example is bringing in a fucking mess of legal, compliance, and security violations because LLMs are not intelligent and are not capable of being properly secured.

      Like I'm sorry, I cannot take this industry seriously when even the most basic back-of-napkin math is saying, nay, screaming from the rooftops that they are FUCKED.

      • belval 2 days ago

        > selling this at $15/hour/user??? That math does not math. A quick google says there are between 1.5 and 4.4 million developers in the US alone, let's say it's 5 million, to be generous, and each of them is subbed to this for 8 hours per day, continuously. That's 600 million per year in revenue

        That math is not mathing. $15/hour/user, with 5M devs, 8hrs and 240 working days per year that is 144B in revenue.

      • vidarh 2 days ago

        By your numbers, it'd be $120/day per developer * 5 million = $600m per day, not per year.

        Of course people don't work every day, but even with European-level holidays that number is off by a factor of 240 or so.
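
The corrected revenue arithmetic from the replies above, as a small Python check. The function name is illustrative; the inputs ($15/hour, 8 hours a day, 5 million developers, 240 working days) are the ones quoted in the thread.

```python
def subscription_revenue(price_per_hour, hours_per_day, users, working_days_per_year):
    """Return (daily, yearly) revenue in dollars for an hourly-priced service."""
    daily = price_per_hour * hours_per_day * users
    return daily, daily * working_days_per_year


daily, yearly = subscription_revenue(15, 8, 5_000_000, 240)
print(f"${daily / 1e6:.0f}M per day, ${yearly / 1e9:.0f}B per year")
```

The $600 million figure is a per-day number, which is where the factor of 240 comes from.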

        • ToucanLoucan 2 days ago

          Quite right, honestly not sure how I fucked that up so bad but I'll own it. Okay so all we need is every coder + 0.6 million more or so in the United States, subscribed to this for 8 hours a day, and the business model can work.

          That still feels incredibly optimistic given how split the community at large seems to be about how good this tech is, and it assumes all those developers also all work for firms large enough to pay for all of that.

          However we are still very much in back of napkin math. We haven't even gone into what it costs to provide these services, how much it's going to cost yet for all these datacenters to be built, how much electricity and water they're going to rip through, their own employees and basic overhead, and all the rest. So IMO, we've now elevated it from "hopeless" to "this could work if a whole lot of other things line up really well."

          • asdfasgasdgasdg 2 days ago ago

            It's not just developers who are using this. My economist friends are. I bet most business analysts and general administration folks are or will be soon. Every normal person I know in my neighborhood is using AI for this thing or that. 50M people are currently subscribed to ChatGPT and it would be very surprising if this number goes down in the future.

            I dunno I think about the language some people are using about AI investment and it is reminiscent of the many years where people were saying Amazon was a bad buy because they never turned a profit. Admittedly AI companies are investing more than the money they've already brought in, but I would be very hesitant to predict that it's all froth given the usefulness I've gleaned from the tools.

            Don't get me wrong, I'm not unconcerned, but I think there are good reasons to suspect that at least some of the AI companies are making sound investments.

          • vidarh a day ago ago

            My fiancée's company has no developers, yet everyone has a paid subscription to LLMs. Certainly not $15/hour, and I don't think it's likely they'll ever pay that for everyone, but I don't find it hard to picture the aggregate cost of subscriptions on a global basis far exceeding $600m/day between far more people on subscriptions cheaper than $15/hour but more expensive than today, and companies ending up paying far more than $15/hour averaged over their developers for additional use. E.g. I already run agents 24/7 just for me. I couldn't yet justify $15/hour, but the amount I'm spending is steadily increasing as I manage to squeeze returns from more and more things.

            Sure, it's back of napkin math, and I also think that several of the companies we see today won't survive and/or will only survive due to consolidation, but I also think the spend is going to be immense.

            With respect to the datacentres, I expect we'll see inference costs crash over the coming years - we're only seeing the beginning of what dedicated ASICs will do to inference, and what work to make models more efficient will do to the need for the very largest models, and while that might drive down the spend on individual subscriptions, I think it will drive up the total spend dramatically as cheaper models become capable enough to put them "everywhere".

            But, yeah, ultimately we're guessing. I'm happy to put my guesses on the record, though, and look forward to looking back and seeing how wrong I got it in a couple of years.

      • Maxatar 2 days ago ago

        You wrote an entire wall of text when you could have just taken 10 seconds to review what you call the "most basic back-of-napkin math" and realized you were off by two and a half orders of magnitude.

      • strongpigeon 2 days ago ago

        > That's 600 million per year in revenue.

        According to your math, that's $600 million per day

        • marcosdumay 2 days ago ago

          Yes, the GP wrote the wrong unit in this place. That supports his conclusion that the pay-off would take decades; if it were actually per year, it would take several centuries.

    • intended 2 days ago ago

      > afaik most estimate north of 80% profit margins

      This seems to be the lynchpin of your argument.

      It makes me wonder if I have been living under a rock, because I have never heard of frontier labs making money. AFAIK all AI firms are simply burning money to acquire customers at this stage. Is this wrong?

      • asdfasgasdgasdg 2 days ago ago

        >It makes me wonder if I have been living under a rock, because I have never heard of frontier labs making money.

        You're confusing the profit from the marginal token and overall profit (basically gross margin and operating margin). The comment you're replying to is calculating that AI labs are probably making a substantial profit per paid token. It's just that so far that profit has not been able to overcome the ongoing R&D and capex costs.

        • kgwgk 2 days ago ago

          > not been able to overcome the ongoing R&D and capex costs.

          And the cost of not-quite-paid tokens.

          • margalabargala 2 days ago ago

            Which may or may not exist, hence this thread.

            • kgwgk a day ago ago

              Non-paid tokens do definitely exist and they weren’t included in the remark about “substantial profit per paid token”. Underpaid/subsidized tokens also exist which don’t provide “substantial profit”.

              • margalabargala a day ago ago

                Are you talking about free promo tokens the company gives out, or are you implying that subscription tokens are sufficiently subsidized so as to be below cost?

      • pmdr 2 days ago ago

        People tend to believe OpenAI and Anthropic can make money any time, the only thing they need to do is to stop training newer/better models. Source? Sam & Dario, of course (trust us, bro). It may (if they sell access at API price) or may not be true, but the scenario where training is stopped is simply unrealistic at this point.

      • dgellow 2 days ago ago

        I’m not exactly sure of the details but I believe they do make _some_ money on inference. But they then have to reinvest it all into training of the next model to stay competitive. So even if inference is positive (I’m seeing inconsistent reported data if that’s the case or not), it is directly spent.

        I do not understand how the companies can end up in positive, unless something fundamental changes

    • doctorpangloss 2 days ago ago

      lots of words.

      do you think per token prices will go up or down in the long term? will the price per task trend down or up?

      what about the price of human labor?

      • redox99 2 days ago ago

        He is proving that the article is based on false information.

        Prices going up or down depends on what labs decide and what users demand. Strong models being profitable at lower prices than what frontier labs offer is a fact.

      • roywiggins 2 days ago ago

        not nearly as many words as Ed Zitron at least

      • GardenLetter27 2 days ago ago

        The price of everything will go down. That is the beauty of the free market.

        • rspeele 2 days ago ago

          If the price of everything would go down it wouldn't be too concerning and everybody would be on board with the "beauty" of it.

          What seems to actually be happening for white collar workers is that the price they can charge for their labor is dropping, but the price of their expenses (housing, food, gas) continues to rise.

        • Yizahi 2 days ago ago

          In the absolutely free market price will go up a lot in the end. Because only one monopoly will exist by that time and it will jack up prices to the maximum tolerable level. And that level can be surprisingly high, because in every human activity there will be a few willing to spend crazy amounts of money for practically anything they perceive as valuable.

          • mike_hearn a day ago ago

            This kind of argument relies on odd definitions of "truly free" that boil down to anarchism, which isn't what anyone who advocates for a free market means.

            • Yizahi a day ago ago

              So what does free market mean then?

              • mike_hearn a day ago ago

                To me at least, it means a market in which the basic rules of commerce are enforced but beyond that the government doesn't micromanage. For example, contracts are enforced, there's some basic truth in advertising laws, there's a trustworthy currency available, and all the other basics of civilization like "your competitor isn't allowed to murder you".

                It's obviously a fuzzy scale.

                In a free market like that it's not guaranteed that everything ends in monopoly. Actually mostly it won't. Monopolies that do occur are due to high costs of entry and are usually temporary.

                • Yizahi a day ago ago

                  In the market you have described we will inevitably end with a monopoly in everything, simply because you didn't mention anything preventing that. To avoid monopoly a much more micromanaging government is required. At minimum we would need a specialized bureaucracy department investigating monopolies, advanced legislative and judicial systems enforcing such laws, a lot of regulation regarding common social good (e.g. you can't just undercut competitors by selling poisonous shit, and you can't just bribe law enforcement to do the same), and we would need overreaching borders/customs/tariffs to block companies from countries not concerned about selling poisonous shit to undercut foreign competitors. And the list goes on.

                  Basically free market advocates fail to see more than a single step in the complex web of dependencies, which tries to prevent neo-feudal monopolization of everything by unchecked, unelected robber barons who are above most laws and taxes.

                  I dislike unnecessary bureaucracy and excessive government control as much as anyone, I was born in the authoritarian USSR after all and I do study history. But I fear neo-feudalism even more. I certainly have zero self-delusions about being in a "ruling class" in that potential free market dystopia.

                  • mike_hearn 10 hours ago ago

                    It's not that we can't see them - I literally named some examples. But where is the evidence for your specific claims, because there's plenty of evidence against them. Markets without much regulation are routinely very competitive. Look at the computing industry, which for most of its history had no industry-specific regulations at all beyond the illegalization of hacking - a simple extension of private property rights.

                    And the effect by which regulation actually strengthens incumbents and reduces competition is well known.

                    A common problem in these discussions is conflation of different goals. You talk about companies selling "poisonous shit". That's not a competition related goal so has nothing to do with anything I've been saying. It's an environmental goal. Governments often pass environmental law fully accepting that it will reduce competition and might strengthen or even create new incumbents - and they don't care! In fact most environmental law is like that because it's exactly as you say, other countries like China don't pass such laws and out-compete local firms as a consequence.

                    But that's not a failure of the free market. It's a failure of environmental law. Or, sometimes not even a failure, just a known tradeoff.

                    As a general rule it's hard to find markets that are controlled by monopolies over the long run without government regulation being to blame. Temporary monopolies can arise naturally and there's nothing wrong with that, but over time they usually fall by the wayside unless a law is preventing that from happening.

        • dgellow 2 days ago ago

          The free market hypothesis is about resource allocation, nothing to do with price of everything going down

  • milesvp 2 days ago ago

    Reading this piece, I'm reminded of a podcast I heard some years ago where they were interviewing an early google marketing employee who was talking about the economics of google search. They said they'd done some surveys and concluded that the average user would get something like $20/year of value, and so that was the most they could realistically charge for search. Meanwhile, they could make something like $500/user in Q4 alone for advertising. So, of course, advertising.

    I just don't think that LLM business models can survive the allure of advertising dollars, any more than Search could, or TV, or Radio, or Movies. Ignoring the talk of copilot putting ads into pull requests, there is just no way that publicly hosted LLMs will not end up inserting ads into the output.

    This looks like what I remember. https://freakonomics.com/podcast/is-google-getting-worse/

    • swader999 2 days ago ago

      The output won't be read by humans (and increasingly this is the case in my own use) so I don't see how that works. If the output itself will be directed by the highest bidder, that doesn't work. Or if the output influences the agent's direction, that doesn't work either.

      • gizajob 2 days ago ago

        Stallman is going to be overjoyed when all the class and variable names in open source repositories have been reformatted to say EnjoyCocaCola and year_of_the_trucks_medicated_pad etc

      • meheleventyone 2 days ago ago

        They could make it work like rewarded video ads in mobile games. Block progress until you watch the ad. Then as dutiful engineers people can consume ads to support the business and avoid being laid off.

        More seriously for software engineering it’ll just cost a lot.

      • IshKebab 2 days ago ago

        What do you mean it "doesn't work"? I can totally see OpenAI take money in return for companies adding custom content ("Everyone agrees Mattresses4u make the best mattresses") to the training data.

        • swader999 2 days ago ago

          The utility of what you're trying to accomplish goes to crap. For example, design me a strength program and it gets corrupted by gyms, trainers in my area etc that have been paid to be promoted in the output, especially if it's subtle. Or all of a sudden I'm getting a stack with Oracle in it all the time...

          • IshKebab a day ago ago

            I didn't say it would be useful... This is pretty much exactly how Google/Amazon search works now. Search for strength training and it will show gyms and trainers in your area that have paid for promotion.

            I think the real problem with that approach is you wouldn't be able to label the sponsored part, which I guess is a legal requirement in some places.

  • iooi 2 days ago ago

    The entire basis of this article is that generating tokens is a variable cost and that that cost will not decrease over time.

    > On an economic basis, a monthly subscription only makes sense with relatively static costs.

    Running a data center is a fixed expense. Whether or not people use that data center to its capacity doesn't change how much the operator pays (electricity use factors into this, since a GPU running at 100% will use more watts than an idle one, but it doesn't move the needle much on other fixed and variable costs of a data center).

    > They also assumed, I imagine, that the cost of tokens would come down over time, versus what actually happened — while prices for some models might have come down, newer “reasoning” models burn way more tokens, which means the cost of inference has, somehow, gotten higher over time.

    This is backwards. When the cost of something goes down, people use it more. This is basic supply and demand. Inference has gotten cheaper already, and will continue to do so.

    Companies subsidizing costs for growth happens all the time. Yes, switching to usage-based pricing instead of subscriptions sucks for customers, but enterprises will continue to pay.

    • xnx 2 days ago ago

      > it doesn't move the needle much on other fixed and variable costs of a data center

      I wonder what the rough costs of a data center look like over the lifetime of one GPU generation?

      10% building

      60% GPU

      30% power

      I haven't gone looking for that information, but I haven't run across it either.

  • lbrito 2 days ago ago

    >At some point, the incredible, toxic burn-rate of generative AI is going to catch up with them, which in turn will lead to price increases, or companies releasing new products and features with wildly onerous rates (..) that will make even stalwart enterprise customers with budget to burn unable to justify the expense.

    I pray this happens soon, but I feel I've been hearing some version of it for a while.

    • ambicapter 2 days ago ago

      Big ships take a while to turn.

    • ToucanLoucan 2 days ago ago

      The only reason it hasn't is the sheer amount of credit being thrown at this tech. Both that and the valuations of the firms in question are stratospherically over-hyped and over-valued.

      This tech has uses. It has quite a lot of them in fact. However there is no usage of ChatGPT or Claude that makes OpenAI or Anthropic worth anything fucking close to what they're valued at right now, and both firms are scrambling to figure out how to get down from the top of the AI house of cards without detonating in the process.

      Meanwhile DeepSeek is coming out with more capable models that run on far less onerous hardware, with far lower compute requirements, and that do basically exactly what the vast majority of users actually want them to do.

      This is going to be a financial bloodbath. Not for anyone actually responsible for it, of course, they'll be fine. It'll be everyone else getting soaked which is the only reason I give two shits.

  • wonderwhyer 2 days ago ago

    Yeah. And weird pricing seems like it's winding down.

    It's interesting to compare it to electricity. Basically Anthropic was selling a flat fee electricity subscription, and when someone started connecting expensive washing machines (OpenClaw) to their subscriptions, instead of changing the pricing model, they banned washing machines...

    I wonder if we will get to "electricity" style pricing for AI. What makes electricity predictable is relatively constant average usage over time, plus the price is manageable. I just don't buy electric house heating, and I manage my electricity spending within some bounds.

    With AI the problem is that we are only now getting to useful AI, and for now it's still too expensive to be useful, so they subsidize until they can stabilize at a "cheap enough and smart enough" level. But it feels like that's still 2 years away, while they are ending the subsidies now. Will be interesting.

    • gruez 2 days ago ago

      >Basically Anthropic was selling a flat fee electricity subscription

      No? It was flat, but with ambiguously stated limits (e.g. 5x, 10x, 20x). They were discriminating on how the "electricity" was used, but that's not that much different than how power companies have different rates for residential users vs industrial users.

      • ethin 2 days ago ago

        Even now they are insanely ambiguous with respect to their usage limits. They don't from what I know openly disclose them anywhere, so them saying "5x increase" is utterly meaningless, alongside "20x" or "10x" or whatnot, because we don't know what "x" is.

    • linkregister 2 days ago ago

      OpenClaw was never banned from the Claude API, only flat-fee plans.

    • swader999 2 days ago ago

      The Uber subscription analogy works well too.

  • wood_spirit 2 days ago ago

    The general problem the average user has with a metered instead of provisioned billing model for computer services is you can’t easily control for cost overruns. From the old days of customers getting stung for hosting costs when slashdotted or DoSed, to last decade’s microservice shock horror of the CI retry loop that burns money overnight, to today’s AI, where you basically have no idea how efficient the AI will be while it ponders your question, you are just setting yourself up for disappointment and cost overruns and a feeling that you’re not getting the value for money you got last week etc.

    • gruez 2 days ago ago

      >The general problem the average user has with a metered instead of provisioned billing model for computer services is you can’t easily control for cost overruns.

      Is this an actual issue aside from people letting their autonomous agents run overnight?

      • wood_spirit 2 days ago ago

        I can speak of myself. Sometimes my session starts out well and I get the AI to cruise to 80%. But then gains after that seem impossible and what was built steadily unravels and then I get the compacting conversation message and realise that I’ve just spent a lot of money on nothing.

  • Glyptodon 2 days ago ago

    I think there's another route this goes. At $7k a year or more per eng in token use, I think it's very reasonable to buy engineers machines with obscene GPUs and RAM and run models locally. And if it doesn't make sense now, someone will figure it out and save companies $10k+/eng over 3 years.
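
    A rough sketch of that trade-off, using the $7k/year and 3-year figures from this comment. The $10k machine price and the zero running cost are hypothetical placeholders, and it assumes a local model is an acceptable substitute for the cloud one.

```python
def local_savings(cloud_cost_per_year, years, machine_cost, running_cost_per_year=0):
    """Savings from replacing cloud token spend with a one-time local machine purchase."""
    cloud_total = cloud_cost_per_year * years
    local_total = machine_cost + running_cost_per_year * years
    return cloud_total - local_total

CLOUD_COST_PER_YEAR = 7_000   # USD per engineer, from the comment
YEARS = 3                     # horizon from the comment
MACHINE_COST = 10_000         # hypothetical price of a GPU workstation

savings = local_savings(CLOUD_COST_PER_YEAR, YEARS, MACHINE_COST)

# Highest machine price that still leaves $10k saved per engineer over the horizon.
max_machine_cost = CLOUD_COST_PER_YEAR * YEARS - 10_000

print(f"savings over {YEARS} years: ${savings:,}")
print(f"max machine cost for $10k savings: ${max_machine_cost:,}")
```

    Power, maintenance, and idle time would all eat into this, which is what the `running_cost_per_year` parameter is there for.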

    • no-name-here 2 days ago ago

      If you only want/need the kind of model output that can be served on a machine costing single digit thousands, aren’t cheaper cloud-served models available? (And as the sister comment points out, sharing hardware allows greater utilization and lower costs per user.)

      • Glyptodon a day ago ago

        That might just mean even more savings as you'd only need a size n cluster for m engineers where n is probably < m.

    • slopinthebag 2 days ago ago

      I imagine there are companies forming now with their entire business model being building "prosumer" inference machines and farms running everything from Qwen 3.6 27b up to GLM 5.1 and everything in between, packaged perfectly for companies to make one-time investments in with the assumption that open models will be getting both more efficient and better over time.

    • charcircuit 2 days ago ago

      That could leave idle time where GPUs are sitting unused. It would be better to have a shared cluster that many engineers all share. And to avoid a cluster not being saturated other companies' queries could also be batched. And oh wait we are back to doing AI inference in the cloud as it is an efficient way to serve AI.

  • pmdr 2 days ago ago

    I wonder how long until this post is flagged/"shadowbanned". Such was the fate of almost all of Ed's posts on HN, with little surprise as to why.

    • CamperBob2 2 days ago ago

      People who don't adjust their prior outlook in light of newer data may not be the best fit around here. I'm OK with that.

      • pmdr 2 days ago ago

        What is the newer data?

        • margalabargala 2 days ago ago

          Extensively discussed elsewhere in this thread. Just start at the top and start reading comments.

          • maplethorpe 2 days ago ago

            Can you summarise? I only reached your comment after scrolling past all the others and I still don't have the answer.

            Is the new data that models are more useful for coding than they once were?

            • dwaltrip a day ago ago

              Cost of tokens goes down over time. Like by a lot. And it will continue to do so.

              Imagine being in 2003 and saying compute costs won’t go down. That’s Ed lol.

              EDIT: Some quick research on this so you guys have actual numbers: https://gist.github.com/dwaltrip/a037be938d2b5ecc8b8b238736e....

              There's multiple separate angles that all contribute to token-costs going down: chip improvements, engineering improvements for running inference in general, AI architecture and training advances that give similar intelligence in a smaller model, improvements in the quality of the training data, data center design / economies of scale, networking and rack-level improvements that are multiplicative with chip advancements, and so on...

              If you analyze the situation for 5 minutes, it's blindingly obvious that price-per-token will continue to improve. And there's a very similar case for intelligence-per-token as well.

              And don't get me wrong -- I have many concerns about how this is all unfolding and how it will impact society. But let's get our basic facts straight.

            • margalabargala 2 days ago ago

              That sounds like a reading comprehension skill issue? In which case I don't see why me summarizing would move the needle.

              But if it helps, no, the data being discussed is surrounding the economics of running inference and R&D, nothing to do with the utility of models for coding.

              • maplethorpe a day ago ago

                Yours is the first from the top to mention this. You might want to consider the physical location of your comment before telling people to read the thread. We could do without the rudeness, too.

  • ameliaquining 2 days ago ago

    As it happens, published just this morning is an article from Kelsey Piper that explains in some detail what's wrong with Zitron's takes: https://www.theargumentmag.com/p/ais-biggest-critic-has-lost...

    • 1attice 2 days ago ago

      I read that and I found it unconvincing. KP is correct that EZ is, by now, emotionally and perhaps ideologically fixated on AI's approaching reckoning, but that's KP psychologizing about Ed's inner states, which is neither fruitful nor relevant to consider when confronting a reasoned argument (or, in Ed's case, several.)

      EZ might have incautiously and incorrectly called the peak several times, but his newsletter is nearly always stacked with citations and insights that, at least to my cursory but frequent inspection, pan out.

      His argument(s) have evolved over time, but what of it? That just shows he's not the dogmatist the author wants him to be. Discourse evolves, get over it.

      2026 Zitron has a good sense of the scale at which AI is requiring enormous financial complexity and volume to realize, and his basic point is that it isn't sustainable in the medium term.

      He is self-evidently correct.

      • _aavaa_ 2 days ago ago

        > His argument(s) have evolved over time, but what of it? That just shows he's not the dogmatist the author wants him to be. Discourse evolves, get over it.

          I disagree. It really reads as if the conclusion is fixed and the arguments change as they are disproven.

        • 1attice 2 days ago ago

          Sometimes it takes any writer some time to tease out what's bothering them. Motivations are like navels, everyone has one, and often they are obscure and strange, even to the motivated.

    • Darwins_Toffees 2 days ago ago

      - Reproduce academic papers
      - Put coding projects online for me so I can share them with friends
      - Determine which books in a set are missing from the school library and find where they’re cheapest online
      - Figure out which soccer club the team I see practicing at the local rec center belongs to and how to register my son
      - Design a bunch of robot-themed handwriting activities for a kindergartner who needs to practice making his uppercase and lowercase letters distinct

      I'm sorry but telling me that this is what AI can do is a sad state of affairs. Like this is google level stuff.

  • BosunoB 2 days ago ago

    All subscription models are subsidized by users who don't use much. The fact that somebody on a $20 sub might get $50 in value isn't crazy if there are 3 people who only get $10 in value. This isn't some sign that the model is broken, it's the intended outcome.

    Also, I didn't read this whole thing, but I have yet to see Zitron respond to the strongest AI financials claim, which is that the models themselves are profitable on a life-cycle basis, even if the companies are not profitable on an annual basis due to capital expenditure. Dario made this claim exactly, and it more or less blows all of Zitron's financials arguments up.
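
    To put numbers on the first point, here is a minimal sketch of the pooled-subscription math. It reads "value" as what it costs to serve each subscriber, which is an assumption, and the usage lists are hypothetical.

```python
def pool_margin(price, usage_costs):
    """Total revenue minus total serving cost for a pool of flat-fee subscribers."""
    revenue = price * len(usage_costs)
    cost = sum(usage_costs)
    return revenue - cost

PRICE = 20  # USD per month, from the comment

# One heavy user ($50) and three light users ($10 each): exactly break-even.
break_even = pool_margin(PRICE, [50, 10, 10, 10])

# One more light user tips the pool into profit.
with_extra_light_user = pool_margin(PRICE, [50, 10, 10, 10, 10])

print(break_even, with_extra_light_user)
```

    With the exact numbers above the pool only breaks even, so the model depends on the ratio of light to heavy users staying favourable.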

    • weakfish 2 days ago ago

      > but I have yet to see Zitron respond to the strongest AI financials claim

      He does in this [0] article.

      [0] https://www.wheresyoured.at/ai-is-really-weird/

      • BosunoB a day ago ago

        Thanks for the link. I'll admit I'm not an expert on the business side of this, but is this really much of a response? He seems to just call it strange accounting and then he moves on.

        It doesn't even feel like particularly strange accounting to me. Aren't there plenty of companies that spend a lot in one year and realize the gains in the next year? If I build a house this year and sell it next year, the house was still profitable, even if next year I'm building 3 more houses to sell in the year after.

    • mrkeen 2 days ago ago

      I subscribed to Claude for a month. I sat down with it for a few sessions, but in each case I ran into a limit before I achieved anything worthwhile. And that was with me babysitting it the whole time to try to get the most out of it. I'm not sure it's possible to use it less (so that others can use it more) and get anything meaningful done.

      • BosunoB a day ago ago

          Most small features take 80-150k tokens to implement, and most large features take 200-250k. A hobbyist working like 10 hours a week can get stuff done without coming close to hitting the weekly usage cap.

    • csande17 2 days ago ago

      Zitron has responded to that claim here: https://www.wheresyoured.at/ai-is-really-weird/#does-anthrop...

      The TL;DR is that Dario likes to talk about imaginary/hypothetical companies a lot in interviews, and those companies' financials don't have a direct basis in reality.

      • BosunoB a day ago ago

        Thanks for the link. There's not much of an argument here from Ed, though, besides that it's an unusual way to view or report margins.

        But it's not that unusual, right? If I build a house this year and sell it next year, the house might still be profitable even if next year I'm building 3 more houses, so the company as a whole is still in the red on an annual basis.

        I mean, I'm not a financial expert but that doesn't seem all that unusual to me.
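
        A toy version of the house example, on a simple cash basis. The build cost and sale price are made-up numbers, and the 1, 3, 9 schedule just extends the "3 more houses next year" idea.

```python
BUILD_COST = 100             # hypothetical cost to build one house
SALE_PRICE = 150             # hypothetical sale price, received one year later
HOUSES_STARTED = [1, 3, 9]   # houses started each year, tripling every year

def annual_cash_flows(started, build_cost, sale_price):
    """Cash in from last year's houses minus cash out for this year's builds."""
    flows = []
    for year, count in enumerate(started):
        sold = started[year - 1] if year > 0 else 0
        flows.append(sold * sale_price - count * build_cost)
    return flows

flows = annual_cash_flows(HOUSES_STARTED, BUILD_COST, SALE_PRICE)
profit_per_house = SALE_PRICE - BUILD_COST

print("annual cash flows:", flows)
print("profit per house:", profit_per_house)
```

        Every house makes money, yet every year shows negative cash flow as long as the build schedule keeps tripling.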

        • csande17 a day ago ago

          The first part of the argument is just noticing that Dario is carefully avoiding making factual claims about Anthropic. Like, if the bank asked you if your construction company was profitable, would it be acceptable to respond: "Well, hypothetically, if a construction company sold houses for more than it cost to build them, that company could be considered profitable. It is possible to imagine a stylized model of a construction company that is theoretically profitable."? If the real, non-hypothetical company that Dario runs has financial results which support this argument, he should probably say them more often.

          The second prong of the argument is basically that, when you invest in Anthropic, you can't just invest in one model and then collect the profits from that model. You're investing in a whole company in the hopes that they can be profitable overall; at some point they'll need to stop spending so much money on training and give it back to the investors instead. Zitron argues that this isn't going to happen because training is actually something that companies need to do to retain customers at all. An analogy here might be the fact that Microsoft has to spend a certain amount of "R&D" budget fixing security vulnerabilities in Windows Server just to retain their current customer base; if attackers found out about a serious security hole but Microsoft didn't fix it, everyone would need to stop using Windows Server. LLM companies do the same kind of thing to fix "jailbreaks" and other unexpected model behaviors.

            The third prong of the argument is that, in general, there's a long history of companies using creative accounting to try and make themselves look profitable and then collapsing because they're not actually profitable. For example, WeWork's "community-adjusted EBITDA" figure claimed the company was profitable using very similar arguments to Dario, and then the company went bankrupt. If you're already cooking the numbers, you have almost arbitrary flexibility to report whatever "margins" you want by excluding some of your costs from the calculation.

          • overrun11 a day ago ago

            > hypothetically, if a construction company sold houses for more than it cost to build them, that company could be considered profitable.

            Construction companies capitalize and depreciate over many years so they can answer "yes" they are profitable even when they are very cashflow negative. This is exactly Dario's point: model training costs are treated as expenses but in practice are much closer to construction costs. Model training effectively produces an asset, the model weights, which will generate revenue for many years into the future.

            > Zitron argues that this isn't going to happen because training is actually something that companies need to do to retain customers at all.

            This is exactly why Dario's point about each training run being profitable is so important. It suggests that this is not true. Customers are happy to use old models long enough to fully pay off their costs.

            > there's a long history of companies using creative accounting

            Zitron seems to know very little about accounting, as evidenced by his using terms like "gross margin" wrong in this article. He's pattern matching against his limited exposure to company financials to find superficial similarities between the AI labs and famous frauds. Find me a company that doesn't report non-GAAP measures. Google search claims 96% of S&P 500 companies do it. Are they all frauds too? Sometimes non-GAAP adjustments are eye-roll-inducing, but they are tolerated because they can be genuinely useful for getting a fuller picture of the business.

    • CodingJeebus 2 days ago ago

      > which is that the models themselves are profitable on a life-cycle basis, even if the companies are not profitable on an annual basis due to capital expenditure.

      Until they file an S1 to go public and show the world the books, take everything they say with a grain of salt. The amount of financial engineering going on in this space is astounding, and I'll believe it when I see an objective 3rd party release an audit confirming this claim.

  • threepts 2 days ago ago

    I thought this burning of cash was all an excuse for the exponential growth we saw in the last 6 years.

    They went from GPT-2, a text-only model with goldfish-esque memory and an 8th-grade reading level, to what we have today: GPT-5, with multimodality, a token window encompassing an encyclopedia, and a Doctorate/Masters level of mastery in major subjects.

    The economics are probably betting on this exponential growth to continue, which if it fails, the cash would burn.

  • bananamogul 2 days ago ago

    The good news is that this might be the end of Oracle.

    • JohnFen 2 days ago ago

      Except that none of the genAI companies are an improvement over Oracle. There's no win in Oracle's passing if it's just replaced with a different company that behaves no better, or even worse.

      • jcgrillo 2 days ago ago

        It's not looking great for a lot of them either :)

  • fancyfredbot 2 days ago ago

    He does have a point about fees. It's not really surprising that the fee structure designed for chatbots would not make sense when applied to long running tasks and agents. But an increase in prices can solve this problem.

    Doubtless some people will reduce usage as a result. But Ed seems to find the idea that a 10 man developer team might spend 80K a year on tokens ridiculous. I don't understand this. Has he seen how much developers are paid? If you get a 20% productivity boost from coding agents, then that's two developers for 80K - effectively very good value.

    Where things could go wrong is in comparison to cheaper models. If it's 5K a year for Qwen, and it's 2/3 as good will you pay 75K extra for Opus? Perhaps not.
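    The arithmetic behind this comment can be sketched roughly as follows. The team size, token spend, and 20% boost come from the comment; the $150K fully loaded cost per developer and the function names are assumptions added purely for illustration.

```python
def token_roi(team_size, annual_token_spend, productivity_boost, cost_per_developer):
    """Return (developer_equivalents, value_of_boost, net_benefit) for one year."""
    # A 20% boost across 10 developers is treated as 2 extra developers of output.
    developer_equivalents = team_size * productivity_boost
    value_of_boost = developer_equivalents * cost_per_developer
    net_benefit = value_of_boost - annual_token_spend
    return developer_equivalents, value_of_boost, net_benefit


def breakeven_boost(team_size, annual_token_spend, cost_per_developer):
    """Smallest productivity boost at which the token spend pays for itself."""
    return annual_token_spend / (team_size * cost_per_developer)


# From the comment: 10 developers, $80K/year on tokens, 20% boost.
# ASSUMPTION: $150K/year fully loaded cost per developer (not from the thread).
equivalents, value, net = token_roi(10, 80_000, 0.20, 150_000)
print(f"{equivalents:.1f} developer-equivalents, worth ${value:,.0f}, net ${net:,.0f}")
print(f"break-even boost: {breakeven_boost(10, 80_000, 150_000):.1%}")
```

    Under these assumptions the spend pays off at a fairly small boost, which is also why the objection in the reply matters: if review overhead eats the individual gain, the effective boost could fall below break-even.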

    • blks 2 days ago ago

      I think that team is better off with a junior developer. This alleged “20% productivity boost” even if it exists, is individual. On the team level, it will be largely offset by people having to review 20% more code.

      • fancyfredbot 2 days ago ago

        Obviously in some cases a junior developer is a better investment if it's a straight up choice.

        Actually I think it'll be rare for a manager to be choosing between either a junior developer or a coding assistant, since each are going to benefit the team in very different ways and it'll often be obvious which you need.

        What I mean is that at the price levels in the article the coding agent still had a realistic chance of positive ROI. People will pay for things with positive ROI.

    • Yizahi 2 days ago ago

      The problem is that LLM cost is more or less the same for generating some fixed amount of code or it will converge to that soon. But developer costs vary wildly based on the seniority*geographical location. Sure some Silicon Valley architect will be always more expensive than any LLM bills he incurred. But a middle tier dev at an outsource or local cheap shop overseas using the same LLM for the same tasks and same token costs? Eeh, it can go either way really.

  • christkv 2 days ago ago

    I'm just flabbergasted at the massively inefficient usage of tokens. What are people doing to spend 500 usd/day in tokens? I just don't understand what you could possibly be doing that would not be complete spaghetti at the end if you run something in an autoloop.

    • doctoboggan 2 days ago ago

      Using Claude code with Opus 4.7 and xhigh effort for a few hours will definitely cost hundreds of usd.

      I am not sure if you would call claude code "an auto loop", but you don't need to be running something crazy like gas town to spend a lot of tokens with Claude.

    • georgeburdell 2 days ago ago

      No employer is telling their employees to use tokens thoughtfully. They might even have token usage leaderboards. One of my team’s agents runs on Opus 4.6 for a fairly narrowly defined scope of a few MCPs and skills. But everyone’s getting their promos and bonuses based on this alone. Next year we’ll get another bonus when we save $1000/day by switching it to Qwen 32B on a Mac Studio

    • intended 2 days ago ago

      It looks like a “People respond to incentives (prices)” situation.

      If something is cheaper than alternatives, spending patterns change. People subsidize corn or power and so consumers alter behavior to take advantage of those prices.

    • xnx 2 days ago ago

      > What are people doing to spend 500 usd/day in tokens

      1) They're lying

      2) Status signalling

      • christkv 2 days ago ago

        There is status in showing your inefficiency?

        • xnx 2 days ago ago

          A $500 Gucci belt doesn't hold up your pants any better.

        • mrguyorama 2 days ago ago

          That's almost all status signalling ever is.

  • gwbas1c 2 days ago ago

    What's the quote?

    > Don't attribute to malice what can be attributed to incompetence.

    We're currently used to SAAS billing models that are either all-you-can-eat subscriptions, or metered around some easy-to-understand metric like # of users, or otherwise number of gigabytes consumed.

    The SAAS economics work that way because the compute consumed is typically too cheap to meter. Some customer uses a little more than average, some customer uses a little less than average; it's not worth the time to even it out to the penny.

    AI is so darn CPU (GPU? AIPU?) intensive that it will only be profitable, and affordable, if it can be metered like electricity and billed with a small margin.

    In SAAS, we're not used to metering and billing computation this way.

  • chankstein38 2 days ago ago

    Before subscribing to Claude, I put $15 into my account so I could use it from Cline in VS Code. After less than a few hours I was out of money. This was basically just to get a simple project set up and a few ~1000-line (AI-generated) code files edited. I have heard Cline is less than ideal with token management, but regardless, these services can easily cost us hundreds or thousands of dollars a month when billed on usage ($15 per 4 hours x 2 for a work day = $30/day; $30 x 25 days = $750/month). And that is assuming my very light usage here could even apply to a larger code base. My guess would be that if I hooked it up to an enterprise project it'd easily skyrocket to $60+/day.
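    The extrapolation in this comment is simple enough to write down as a sketch. The $15 per 4-hour block, two blocks per work day, and 25 days come from the comment; the function name and the use of the $60/day guess as a second scenario are illustrative assumptions.

```python
def usage_bill(spend_per_block, blocks_per_day, working_days):
    """Extrapolate metered spend to (daily, monthly) totals in dollars."""
    daily = spend_per_block * blocks_per_day
    monthly = daily * working_days
    return daily, monthly


# Light usage from the comment: $15 per 4-hour block, 2 blocks a day, 25 days.
light_daily, light_monthly = usage_bill(15, 2, 25)
print(f"light usage: ${light_daily}/day, ${light_monthly}/month")

# The commenter's guess for an enterprise code base: $60/day (lower bound).
_, enterprise_monthly = usage_bill(60, 1, 25)
print(f"enterprise guess: at least ${enterprise_monthly}/month")
```

    This is a straight linear extrapolation; it assumes spend per hour stays constant, which the comment itself doubts for larger code bases.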

  • cheeseblubber 2 days ago ago

    It makes sense if you account for the cost of intelligence getting cheaper every year. Most models are getting far cheaper per unit of intelligence. We get better hardware, architecture, training techniques, inference optimizations and caching. All those improvements add up. In early 2022 you were getting 10x cheaper annually; now it is closer to 2x - 5x cheaper annually. The cost is still dropping, whereas Uber can only get the cost down by so much.

    • mkesper 2 days ago ago

      Better hardware would have to be bought with additional money. And no one can forecast reliably how much optimization is left in the game.

      • cheeseblubber 2 days ago ago

        My problem with the article is that it doesn't even mention this fact. The Uber metaphor is often brought up, but it breaks down at cost optimization. It also wouldn't be fair to say we are at the peak efficiency of LLMs and that there aren't any improvements left.

  • mNovak 2 days ago ago

    Do we know the breakdown of revenue from API vs subscriptions for OAI/Anthropic? That seems very relevant, since this entire article seems to be on the premise that users are only willing to pay for a subsidized subscription and would never pay the 'true' token cost.

    The internet seems to be saying that 70%+ of Anthropic revenue is per-token metered API, which would largely invalidate the article, but I can't find a solid source.

    • swader999 2 days ago ago

      I don't think these companies will give this information up until their hand is forced with an S-1 when they want to IPO. So stay tuned...

  • matchagaucho 2 days ago ago

    Same debate as the dot-com era.

    Customer: “I don’t want to pay more than $100/mo for my website”

    Developer: “What are your goals?”

    Customer: “1M daily visits, 1,000 monthly signups.”

    And we've spent the past 25 years offering serverless compute, auto-scaling, pay-as-you-go for AWS and Internet infrastructure. And the economics are still a hard sell.

  • purplepatrick 2 days ago ago

    I keep seeing articles like this that extrapolate from token pricing onto token costs. This is wrong.

    Companies don’t sell their goods/services at cost. A model’s being priced at, say, $30/M for output tokens doesn’t say anything about what it costs the company to provision the 1M tokens via the model.

    And no, you cannot extrapolate from any company margins that someone may have overheard in an SF coffee shop onto individual product line margins or their trajectory either. This information is usually unknowable even in most SEC filings for public companies.

    It’d be great if people who wrote these articles used, say, AI to look up some basics on how a business operates. It’s really easy to do, believe me ;)

  • jsLavaGoat 2 days ago ago

    Ed could have been right, but I think he's a bit of a front runner that ended up being out too far and not accepting that, for coding at least, the tool is useful. And coding is a big business itself. Of course there are always going to be shenanigans to point out, and I'm glad there are skeptics.

  • alok-g 2 days ago ago

    A possibly naive question: Mobile phone plans, Internet service providers, etc., also often used fixed monthly pricing. How does that keep working (with competition present)? Is the issue monthly pricing, subsidies, or both?

  • latentframe 2 days ago ago

    This seems like a classic capital cycle problem: huge upfront investment, unclear pricing power, and everyone scaling supply at once. That combination usually doesn’t end with great returns.

  • mitjam 2 days ago ago

    I would be curious to see a calculation backwards from TAM. Napkin: 50M developers worldwide (SlashData, 20M in China and India). If every developer had a $200/month subscription, that’s $10B/month. I think many developers are expected to pay much more than that.
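    A minimal sketch of this napkin calculation, with an adoption rate added as a knob. The 50M developers and $200/month are the comment's figures; the `adoption_rate` parameter and the 30% example value are assumptions for illustration only.

```python
def monthly_tam(developers, price_per_month, adoption_rate=1.0):
    """Monthly revenue if `adoption_rate` of all developers pay `price_per_month`."""
    return developers * adoption_rate * price_per_month


# The comment's napkin case: every one of 50M developers pays $200/month.
full = monthly_tam(50_000_000, 200)
print(f"full adoption: ${full / 1e9:.1f}B/month, ${full * 12 / 1e9:.0f}B/year")

# ASSUMPTION: a hypothetical 30% adoption rate, to show how sensitive the total is.
partial = monthly_tam(50_000_000, 200, adoption_rate=0.30)
print(f"30% adoption: ${partial / 1e9:.1f}B/month")
```

    The result scales linearly in each input, so the regional pricing objections raised in the replies translate directly into a smaller total.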

    • warkdarrior 2 days ago ago

      Microsoft made $17B/month in 2024, Google made $25B/month, Amazon $48B/month. And the computing market is growing.

    • Lionga 2 days ago ago

      Most developers in China or India have a monthly salary of 1K USD. If you expect them to pay way more than 200 USD, that's like asking US devs to pay 5K a month. Yeah, not gonna happen.

      And the funny thing is that the estimated pure CAPEX spend of AI companies requires them to earn about $20B to $40B a month just to cover the cost of capital on their trillion dollars of investments.

      • warkdarrior 2 days ago ago

        > Most developers in China or India have a monthly salary of 1K USD. If you expect them to pay way more than 200 USD, that's like asking US devs to pay 5K a month. Yeah, not gonna happen.

        That's exactly what is going to happen. India/China prices will be $100-200/month, US prices will be $5000/month. Keep in mind that most of these costs will be covered by the employer. It'll put downward pressure on dev pay, of course.

  • Ritewut 2 days ago ago

    It makes sense when you realize the goal is not the consumer but large gov and enterprise contracts.

  • Marciplan 2 days ago ago

    I am a paying subscriber to Ed Zitron and I enjoy his writing a lot. He should at some point admit that not everything is bullshit and there is definitely a business model to it. It is fun to read, though

    • mediaman 2 days ago ago

      He has a fun writing style but has so many willful errors, and is so committed to one point of view regardless of the facts, that his writing seems kind of worthless.

      I soured on him when he could not calculate cumulative revenue on an exponential curve, ignored everyone who showed him how to calculate it, and then kept writing that Anthropic’s revenue numbers are fake based on his inability to do math.

      It’s too bad because any heavily hyped industry needs good critics (think Ida Tarbell to Rockefeller) but they should be honest critics, and he’s not, which really undermines not only his but others’ criticism of the industry.
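      For readers wondering what "cumulative revenue on an exponential curve" involves, here is a minimal sketch. It assumes monthly revenue grows by a constant factor between a starting and an ending monthly figure; the numbers are hypothetical and are not Anthropic's actual results.

```python
def cumulative_revenue(start_monthly, end_monthly, months):
    """Total revenue over `months` months of constant-rate (geometric) growth."""
    if months == 1:
        return start_monthly
    # Constant month-over-month growth factor that takes start to end.
    growth = (end_monthly / start_monthly) ** (1 / (months - 1))
    return sum(start_monthly * growth**k for k in range(months))


# HYPOTHETICAL figures in $M: revenue grows from 100/month to 1000/month in a year.
total = cumulative_revenue(100, 1000, 12)
naive_from_end = 1000 * 12   # wrongly treats the final run rate as the whole year
naive_from_start = 100 * 12  # wrongly treats the initial run rate as the whole year
print(f"cumulative: {total:,.0f}  (naive bounds: {naive_from_start:,} to {naive_from_end:,})")
```

      The point of the sketch is only that a year-end annualized run rate and the revenue actually collected during that year are different quantities when growth is steep, so comparing one against the other makes the numbers look inconsistent when they are not.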

    • xnx 2 days ago ago

      It's good to have contrarian viewpoints, but Ed Zitron is so blinded by his AI hate that his articles should be treated not just with skepticism, but heavy suspicion.

  • gnachman 2 days ago ago

    So you’re saying there’s a chance that Oracle will die? Sign me up.

  • ludicrousdispla 2 days ago ago

    Does this mean we can just go back to using software libraries?

  • putzdown 2 days ago ago

    The move from “the subscription model for AI isn’t working given these parameters” to “a subscription model for AI can never work” to “the model was deliberately deceptive” to “it’s a fucking ripoff” is not logical. AI companies are feeling the need to get hold of spiraling costs by increasing prices and limitations. Inference hasn’t gotten cheap enough fast enough, and for some reason they feel they can’t wait longer. That doesn’t mean a subscription service can’t work: only that it will be expensive, maybe vastly so, and will need tiers based on usage with some fluidity for users to move between tiers in a given month. The model is something like HP’s “instant ink” service. Sure, there’s a question whether the moves companies are making now are worth the cost in the eyes of customers. But that’s a question of economics and timing, not a fundamental blow to monthly subscriptions as a model. The article doesn’t deal with these considerations fairly. It’s too much in the direction of a rant, with conspiracy theories thrown in.

  • thinkindie 2 days ago ago

    I don't know when it was introduced, but Claude Code has recently added the cost for your session when you run /usage

  • feverzsj 2 days ago ago

    It makes perfect sense, if you treat it as a Ponzi scheme.

    [0]: https://www.wheresyoured.at/why-are-we-still-doing-this/

  • throwawayajner 2 days ago ago

    Zitron misunderstands the economics of models. Inference costs have dropped 99% in less than 2 years. Models are being commoditized faster than any technology in history.

    A $20 subscription 2 years ago is not providing the same level of intelligence you're getting today.

    Every major lab knows open source models are 6 months behind (See Google's "We have no moat") and none of them plan to make money on inference. Companies are subsidizing users to create moats that persist when models are essentially free for most everyday use.

    • pmdr 2 days ago ago

      > A $20 subscription 2 years ago is not providing the same level of intelligence you're getting today.

      That subscription was then and is now likely still subsidized.

      • davikr 2 days ago ago

        For all we know, there could be 10 people paying for a ChatGPT subscription and not using it enough to subsidize 1 power user _and_ still have money left for profit.

        • pmdr 2 days ago ago

          Oh they'd be sure to let us know if that were the case.

          • warkdarrior 2 days ago ago

            Why would the AI companies advertise that most of their users do not use their subscription in full??

  • OrvalWintermute 2 days ago ago

    I think the company Taalas alone destroys Ed’s arguments

    Because, comparing vs GPUs

    ~16k–17k tokens/second per user

    <1ms latency

    10x power efficiency

    20x cheaper production

    Model to Si ~ 60 to 90 days

    We have every reason to believe SW_to_Si will facilitate improving economics

  • aaroninsf 2 days ago ago

    Ed, my friend, I've got some news for you.

    Economics Don't Make Sense.

    I mean, seriously... our current late-stage capitalist economy is the chaotic sloshing of excess capital or inverted debt in a shallow tub within which clumsy giants are stamping like toddlers, and a parasitic kleptocratic oligarch class balances its efforts biting the toddler ankles in hope of more stamping judged advantageous, and, bagging what water they can.

  • asah 2 days ago ago

    meh - by this logic, every new tech and startup ever is a "scam"

    The truth is that the AI companies are gambling that inference cost will continue following a hyper version of Moore's Law, e.g. Google TurboQuant.

    The countervailing thesis is that frontier models are consuming more and more compute.

    The deepest truth: you often don't need a frontier model to get commercially acceptable results from AI. Thus, bring on the true pricing! and I'll just switch models to something financially sustainable.

    • swader999 2 days ago ago

      WeWork comes to mind. The math is fairly easy if we know what a company like OpenAI's datacenter commitments are, what their subscription and token revenue is right now, and what their operating costs are. This is very basic, and if you had that info you would know exactly whether we are in a bubble or not. Waiting for the S-1s...

  • jcgrillo 2 days ago ago

    The finding out phase has begun.