27 comments

  • amai a day ago

    "companies will ask for discounts when they know a service company is using AI tools"

    "Insurance underwriters are seriously trying now to remove coverage in policies where AI is applied and there's no clear chain of responsibility"

    I see a future coming, where everyone uses AI but nobody admits it.

  • al_borland 2 days ago

    Even if it did work well, with such a rough start and years of false promises, who is really going to trust it? Everyone who does seems to be riding the hype wave.

    • bigstrat2003 2 days ago

      Nobody with any sense believes any of the hype. It's the boy who cried wolf effect: after years of improvement it still sucks and can't get work done, so why on earth would I trust in the future when the AI bros claim "no this time it really is good"?

  • conception 2 days ago

    This is a hilarious article.

    “It passed all the unit tests, the shape of the code looks right,” he said. “It's 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It's a dumpster fire. Throw it away. All that money you spent on it is worthless.”

    “This magic that literally didn’t exist two years ago in more than a toy state is moving at such a rapid rate that it couldn’t even reproduce sqlite three months ago and only got better enough in those weeks to produce a bad version of sqlite! Clearly useless! It has no value, no one is using it to do any work and won’t get better over the next three months or three years!”

    An amazing take.

    • yladiz 2 days ago

      You’re shifting the goalposts. The initial point was that the Rust regeneration of SQLite was wasted money, because it’s unviable due to its slow speed. You’re trying to shift it to be about how it may get better over time. Do you have something that is more specifically refuting the initial quote that doesn’t involve anything about potential improvement?

      • conception a day ago

        The point wasn’t to make a better SQLite; it was to make a functioning Rust SQLite. Which it did. Badly, but you don’t start at race cars. No one was assuming production SQLite.

    • Pamar 2 days ago

      Well... the actual problem is, imho, that the LLMs look like they have reached (or are close to reaching) a plateau. You might be right about the "three months ago it could not produce a working implementation of a DBMS" part... but what if in 3 months (or 3 years) it stays stuck at the 20Kx-slower threshold?

      • jjk166 a day ago

        People have been saying, without any evidence at all, it's reached or about to reach a plateau for years now. We are clearly still seeing significant forward progress. While it's reasonable to think it will hit some plateau eventually, there's no reason to think that right now just happens to be as good as it's ever going to get.

        • butlike a day ago

          Context is the plateau. It's why RAM prices are spiking. We're essentially throwing heap at the problem hoping it will improve. That's not engineering. It's not improving on a fundamental, technical level.

          • rstuart4133 a day ago

            > Context is the plateau. It's why RAM prices are spiking.

            Yes, context is the plateau. But I don't think the bottleneck is RAM. The mechanism described in "Attention Is All You Need" is O(N^2), where N is the size of the context window. I can "feel" this in everyday usage: as the context window grows, the model's responses slow down, a lot. That's due to compute being serialised because there aren't enough resources to do it in parallel. The scarce resources are more likely compute and memory bandwidth than RAM.

            If there is a breakthrough, I suspect it will be models turning the O(N^2) into O(N * ln(N)), which is generally how we speed things up in computer science. That in turn implies abstracting the knowledge in the context window into a hierarchical tree, so the attention mechanism only has to look across a single level in the tree. That in turn requires it to learn and memorise all these abstract concepts.
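            The quadratic cost mentioned above can be made concrete with a toy sketch (my own illustration, not any real model's code; real attention uses learned query/key projections, but a single float per token is enough to show the scaling):

```python
# Toy sketch of naive dot-product self-attention cost (illustrative only).
# Naive attention materialises a full N x N score matrix, so compute and
# memory grow quadratically with the context length N.

def attention_scores(tokens):
    """Materialise the N x N matrix of pairwise dot-product scores."""
    return [[q * k for k in tokens] for q in tokens]

def score_entries(n):
    """Number of score entries the naive mechanism computes for length n."""
    return n * n

# Doubling the context window quadruples the work:
assert score_entries(2048) == 4 * score_entries(1024)
```

            An O(N * ln(N)) scheme of the kind suggested below would need each token to attend to only a logarithmic number of positions (e.g. one level of a hierarchy) instead of all N.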

            When models are trained they learn abstract concepts, which they retrieve near effortlessly, but they don't do that same type of learning while in use. I presume that's because it requires a huge amount of compute, repetition, and time. If only they could do what I do: go to sleep for 8 hours a day, dream about the day's events using local compute, and learn from them. :D Maybe, one day, that will happen, but not any time soon.

          • jjk166 a day ago

            If a bridge girder isn't strong enough to support a load, you add material in the right places to make a larger, stronger girder. That is engineering. The idea that if you're not making fundamental improvements to your formulation of steel you aren't progressing is absurd. If adding RAM leads to improvements, and we have the engineering ability to add more RAM, then we are still making progress.

            • monkaiju a day ago

              Regardless of how true your statement is (just adding metal to a structure is commonly not a way to solve the problem you stated; it just makes the structure heavier, which means other systems have more to support), the point is that it isn't exponential/fundamental progress, which is the type that would be needed to avoid the plateaus folks are mentioning. Also, adding RAM doesn't give you even linear improvements; it's logarithmic.

              • jjk166 17 hours ago

                > just adding metal to a structure is commonly not a way to solve the problem you stated, it just makes the structure heavier which means other systems have more to support

                As a mechanical engineer, that is exactly how you solve that problem.

                > point is that it isnt exponential/fundamental progress

                You just stuck the goalpost on a rocket and shot it into space. You'd be hard-pressed to show evidence that progress in this field was ever exponential; in most fields it never was. Logarithmic progress is typical: you make a lot of progress early on, picking the low-hanging fruit and figuring out the basics, and as the problems get harder and the theory better understood it takes more effort to make improvements, but fundamentally improvements continue.

                Incremental progress from increasing scale is, again, perfectly cromulent. It's how we've made advanced computers that can fit in your pocket, it's how clothing became so cheap it's practically disposable, it's how you can fly across the country for less than the price of a nice dinner. Imagine looking at photolithography, textile manufacturing, or aircraft 5 years after they reached their modern forms and saying "this has plateaued".

                • butlike 6 hours ago

                  A little tangential, but I'm not entirely convinced the things you list at the end are improvements, per se. Clothing is so cheap because it's polyester, which is essentially plastic and is demonstrably bad for the environment. Same thing with 'computers in the pocket': they're so cheap, and refreshed at such a rate, that they become disposable when they really shouldn't be. E-waste is a real problem. Flying across the country... the train is better from a last-mile perspective.

                  In a sense, looking at photolithography, textile manufacturing, or aircraft as you suggest, does show they plateaued, at least to me.

                  Are we sure we want to be making things so cheap they become discardable in the ever-growing landfills of the world?

                • monkaiju 4 hours ago

                  > You'd be hard pressed to show evidence that progress in this field was ever exponential - in most fields it never was.

                  Literally the introduction of transformers was absolutely exponential; in fact, exponential progress is pretty much the defining characteristic of the first chunk of a new technology's development. In CS specifically, there are dozens and dozens of instances of exponential improvements. Like... obviously lol. Also, the plateau folks are mentioning is about a lack of fundamental improvements. Perhaps MEs don't experience exponential improvements, but we do all the time in CS and SWE lol.

      • conception a day ago

        Opus 4.5/4.6 is definitely not a plateau; it's a definite step up in quality.

      • dpoloncsak a day ago

        20Kx slower is still faster than my manager could write it.

    • fzeroracer 2 days ago

      Where are my flying cars?

      • conception a day ago

        Flying cars exist. You can get one.

    • hyperhello 2 days ago

      Why do I want to reproduce sqlite? It’s a library. The point of it is to be already written.

      • justinclift 2 days ago

        Maybe a native rust version of it has value for some people? :)

  • metalman a day ago

    I have a job stalled right now, I believe because a jackass manager decided to get an opinion from AI on the design for a large steel ramp structure, which is wrong, but the jackass has zero capability to respond now that I have asked for clarification as to which numbers were used in his "calculations". As of right now I have a whole string of jobs hung up because of errors in designs and blueprints that are being sent to me to fix, somehow, but with, again, zero capacity to make decisions or deal with the cold, hard reality of figuring out how to build something out of metal: something they need, now, but only one unit, with some indeterminate issue, requirement, or awkwardness that will tie up and jam a whole large organisation. Quite literally, a broken-door-handle repair ends up getting bumped up, and up, to the top, because there is no PO, part numbers, mission statements, or glad-handers, and NOBODY wants it on THEIR desk. So the front doors on a major asset, located on very prime real estate, sit there flapping and getting a hunk of chain wrapped around them for months, and now they can't figure out how to pay for having that suddenly fixed. If I was less busy I would be angry, but it is turning out to be an amusing and bemusing viewpoint on the lurching, jerking debacle brewing itself up there.

    • bwestergard a day ago

      Your job sounds really different from what's typical here on HackerNews. I'm really curious - can you tell us more about it?

      • metalman a day ago

        Metal: the bending and joining of metal for humans to use for something™. They find me through the interwebs, which I have been using since the dawn, off and on, clumsily, but since grade school. The Apple store was one room above a Chinese restaurant and had painted chipboard walls. I have two websites; one is a rental and I own the other. But I am focusing more and more on my core strengths in dealing with physical realities, which I sometimes call "applied geometry", though often there are curves and shapes that don't really have names. As a good deal of the work is designed and communicated about with the use of computers and phones, I also spend a lot of time thinking about how that could be better, so hanging out here, trying to fight the good fight, is part of most days.

  • jjk166 a day ago

    Pretty big jump from "we don't yet know the best ways to use this new tool" to "the tool doesn't work well".

    • specproc a day ago

      I think that's adequately addressed in the article:

      > "The other way to look at this is like there's no free lunch here," said Smiley. "We know what the limitations of the model are. It's hard to teach them new facts. It's hard to reliably retrieve facts. The forward pass through the neural nets is non-deterministic, especially when you have reasoning models that engage an internal monologue to increase the efficiency of next token prediction, meaning you're going to get a different answer every time, right? That monologue is going to be different.

      > "And they have no inductive reasoning capabilities. A model cannot check its own work. It doesn't know if the answer it gave you is right. Those are foundational problems no one has solved in LLM technology. And you want to tell me that's not going to manifest in code quality problems? Of course it's going to manifest."

      You can argue with specifics in there, but they made their case.