Is there a theoretical minimum for the computing power required to, say, target GPT-2? Is there something fundamental preventing a gaming laptop from exceeding Claude Opus?
(All of this math is approximate.) https://stackoverflow.com/questions/62491720/in-latency-valu...
Bear in mind those latency numbers are:
- 5 years old
- CPU only
If you did this on a gaming laptop, the data would all sit on SSDs, which are orders of magnitude slower than GPU memory.
Also, the math AI does is measured in FLOPS (floating-point operations per second).
My laptop CPU (a 7840U) does about 4.1 TFLOPS; an H200 GPU does 3,958 TFLOPS.
OpenAI's GPT-5 was reportedly trained on ~100-200k NVIDIA GPUs.
So:
- accessing data is 1000x slower
- the math is 1000x slower
- they have up to 200,000x more GPUs than a laptop
Now remember that each part of the data is used multiple times, so you end up with the GPU cluster being roughly 1000 x 1000 x 200,000 x (number of passes over the data) faster.
So I don't think there's anything fundamentally impossible about training Claude Opus on your laptop; it's more that the time required would be so astronomically long that it's never going to happen in practice.
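For a rough sense of scale, here's a back-of-envelope sketch using the numbers above. The 90-day cluster run time is my own assumption purely for illustration, and the peak-FLOPS figures are the ones quoted in this thread, not measured:

    # Back-of-envelope comparison: laptop vs. a frontier training cluster.
    laptop_tflops = 4.1        # Ryzen 7840U, approximate peak
    h200_tflops = 3958         # H200, quoted peak
    num_gpus = 200_000         # reported upper bound for a frontier run

    # Raw compute gap: per-GPU speedup times number of GPUs (~1.9e8x).
    compute_gap = (h200_tflops / laptop_tflops) * num_gpus

    # If the cluster finishes in ~90 days (assumed), the laptop would need
    # roughly compute_gap * 90 days on FLOPS alone, before the ~1000x
    # SSD-vs-HBM memory penalty makes it even worse.
    days_on_cluster = 90
    years_on_laptop = compute_gap * days_on_cluster / 365
    print(f"compute gap: {compute_gap:.2e}x")
    print(f"naive laptop estimate: {years_on_laptop:.2e} years")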
You could do it by hand, by calculating the gradients and doing backprop with pen and paper.
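For a toy sense of what that pen-and-paper exercise looks like, here's a minimal sketch for a single weight w with model y = w*x and squared-error loss. The chain-rule gradient dL/dw = 2*(w*x - t)*x is what you'd work out by hand; the code only checks it against a finite-difference estimate (all names and numbers here are made up for illustration):

    # Manual backprop for one weight: y = w*x, L = (y - t)^2.
    x, t, w = 3.0, 7.0, 1.5            # toy input, target, current weight

    def loss(w):
        return (w * x - t) ** 2

    grad_by_hand = 2 * (w * x - t) * x  # chain rule, done on paper: -15.0
    eps = 1e-6
    grad_numeric = (loss(w + eps) - loss(w - eps)) / (2 * eps)

    print(grad_by_hand, grad_numeric)   # both ~ -15.0
    w -= 0.01 * grad_by_hand            # one hand-computed SGD step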
I have an unrelated question: why is the submission URL "twitter.com" when the link leads to "x.com"? Is it still possible to use twitter.com?
Interesting benchmark, but is a GPT-2-class model actually useful for agents? We run ~10k AI companions and found that anything below 7B params struggles hard with reliable JSON output and tool use. The training cost is impressive, but for structured tasks the error rate might be too high for production.
And why does "~" (circa) become a minus?
Luckily that error is blatantly obvious.