Is there a theoretical minimum for the computing power required to, say, target GPT-2? Is there something fundamental preventing a gaming laptop from exceeding Claude Opus?
(All of this math is approximate.) https://stackoverflow.com/questions/62491720/in-latency-valu...
Bear in mind those latency numbers are:
- 5 years old
- CPU only
If you did this on a gaming laptop, the data would all sit on SSDs, which are orders of magnitude slower than GPU memory.
Also, the math AI does is measured in FLOPS (floating-point operations per second).
My laptop CPU (a 7840U) does about 4.1 TFLOPS; an H200 GPU does 3,958 TFLOPS.
OpenAI's GPT-5 was reportedly trained on ~100-200k NVIDIA GPUs.
So:
- accessing data is 1000x slower
- the math is 1000x slower
- they have up to 200,000x more GPUs than a laptop
Now remember that each part of the data is used multiple times, so you end up with the GPU cluster being roughly 1000 x 1000 x 200,000 x (number of passes over the data) faster.
So I don't think there's anything fundamentally impossible about training Claude Opus on your laptop; it's more that the time required would be so astronomically long that it's never going to happen in practice.
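For a rough sense of scale, here's a back-of-envelope sketch using the numbers above. The 90-day cluster run time is my own assumption purely for illustration, and the peak-FLOPS figures are the ones quoted in this thread, not measured:

    # Back-of-envelope comparison: laptop vs. a frontier training cluster.
    laptop_tflops = 4.1        # Ryzen 7840U, approximate peak
    h200_tflops = 3958         # H200, quoted peak
    num_gpus = 200_000         # reported upper bound for a frontier run

    # Raw compute gap: per-GPU speedup times number of GPUs (~1.9e8x).
    compute_gap = (h200_tflops / laptop_tflops) * num_gpus

    # If the cluster finishes in ~90 days (assumed), the laptop would need
    # roughly compute_gap * 90 days on FLOPS alone, before the ~1000x
    # SSD-vs-HBM memory penalty makes it even worse.
    days_on_cluster = 90
    years_on_laptop = compute_gap * days_on_cluster / 365
    print(f"compute gap: {compute_gap:.2e}x")
    print(f"naive laptop estimate: {years_on_laptop:.2e} years")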
You could do it by hand, by calculating the gradients and doing backprop with pen and paper.
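For a toy sense of what that pen-and-paper exercise looks like, here's a minimal sketch for a single weight w with model y = w*x and squared-error loss. The chain-rule gradient dL/dw = 2*(w*x - t)*x is what you'd work out by hand; the code only checks it against a finite-difference estimate (all names and numbers here are made up for illustration):

    # Manual backprop for one weight: y = w*x, L = (y - t)^2.
    x, t, w = 3.0, 7.0, 1.5            # toy input, target, current weight

    def loss(w):
        return (w * x - t) ** 2

    grad_by_hand = 2 * (w * x - t) * x  # chain rule, done on paper: -15.0
    eps = 1e-6
    grad_numeric = (loss(w + eps) - loss(w - eps)) / (2 * eps)

    print(grad_by_hand, grad_numeric)   # both ~ -15.0
    w -= 0.01 * grad_by_hand            # one hand-computed SGD step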
I have an unrelated question: why is the submission URL "twitter.com" when the link leads to "x.com"? Is it still possible to use twitter.com?
Interesting benchmark, but is a GPT-2-class model actually useful for agents? We run ~10k AI companions and found that anything below 7B params struggles hard with reliable JSON output and tool use. The training cost is impressive, but for structured tasks the error rate might be too high for production.
And why does "~" (circa) become a minus?
Luckily that error is blatantly obvious.