6 comments

  • aitchnyu 7 hours ago

    Is there a theoretical minimum for the computing power required to, say, target GPT-2? Is there something fundamental that prevents a gaming laptop from exceeding Claude Opus?

    • willx86 6 hours ago

      (All of this math is approximate.) https://stackoverflow.com/questions/62491720/in-latency-valu...

      Bear in mind this is:
      - 5 years old
      - CPU only

      If you did this on a gaming laptop, the data would all sit on SSDs, which are orders of magnitude slower to access than GPU memory.

      Also, AI is mostly maths, and that throughput is measured in FLOPS: floating-point operations per second.

      My laptop CPU (7840U) has 4.1 TFLOPS; an H200 GPU has 3,958 TFLOPS.

      OpenAI's GPT-5 was reportedly trained on ~100-200k Nvidia GPUs.

      So:
      - accessing data is 1000x slower
      - maths is 1000x slower
      - they have up to 200,000x more GPUs than a laptop

      Now remember that each piece of data is used multiple times, so you start getting into the GPUs being 1000 x 1000 x 200,000 x (number of times the data is accessed) faster.

      So I don't think there's anything fundamentally impossible about training Claude Opus on your laptop; it's more that the time required would be so astronomically high that it's very improbable.
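
      The back-of-envelope multiplication above can be sketched in a few lines of Python. This is only an illustration using the figures quoted in the comment (4.1 and 3,958 TFLOPS, a rough 1000x data-access gap, up to 200,000 GPUs). The `laptop_years` helper and the one-day cluster run in the usage line are hypothetical, not reported numbers. Multiplying the memory gap by the compute gap follows the comment's reasoning and is best read as an upper bound, since a workload is usually limited by one or the other.

```python
# Rough sketch of the comment's estimate; all inputs are the approximate
# figures quoted in the thread, not measured values.
LAPTOP_TFLOPS = 4.1          # 7840U figure quoted in the comment
H200_TFLOPS = 3958.0         # H200 figure quoted in the comment
DATA_ACCESS_GAP = 1_000      # "accessing data is 1000x slower" (rounded)
COMPUTE_GAP = 1_000          # "maths is 1000x slower" (rounded)
CLUSTER_GPUS = 200_000       # upper end of the reported ~100-200k GPUs

# Ratio of the two quoted throughput figures; this is where "1000x" comes from.
compute_ratio = H200_TFLOPS / LAPTOP_TFLOPS

# The comment's combined factor, before counting repeated data accesses.
combined_gap = DATA_ACCESS_GAP * COMPUTE_GAP * CLUSTER_GPUS


def laptop_years(cluster_days: float, gap: float = combined_gap) -> float:
    """Convert a cluster run time in days into laptop years (hypothetical helper)."""
    return cluster_days * gap / 365.0


if __name__ == "__main__":
    print(f"compute ratio: ~{compute_ratio:.0f}x")
    print(f"combined gap:  {combined_gap:.1e}x")
    # Assumed example: a run that takes one day on the cluster.
    print(f"1 cluster day: ~{laptop_years(1.0):.1e} laptop years")
```

      Even a single hypothetical cluster day works out to hundreds of millions of laptop years under these assumptions, which is the "not impossible, just improbable" point.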

    • TylerLives 2 hours ago

      You could do it by hand, by calculating the gradients and doing backprop with pen and paper.

  • JPLeRouzic 8 hours ago

    I have an unrelated question: why is the submission's URL "twitter.com" when the link leads to "x.com"? Is it still possible to use twitter.com?

  • clawsyndicate 5 hours ago

    interesting benchmark but is a gpt-2 class model actually useful for agents? we run ~10k ai companions and found that anything below 7b params struggles hard with reliable json output and tool use. the training cost is impressive but for structured tasks the error rate might be too high for production.

  • GuestFAUniverse 8 hours ago

    And why does "~" (circa) become a minus?

    Luckily that error is blatantly obvious.