How can we compare local LLMs vs. APIs vs. subscriptions objectively?

(wonderwhy-er.medium.com)

2 points | by wonderwhyer 7 hours ago

1 comment

  • wonderwhyer 7 hours ago

    The debate around local vs API vs subscriptions feels mostly anecdotal. I tried building a tool that compares them using “quality-adjusted tokens per dollar.”

    The idea:

    Tokens per dollar

    Weighted input/output pricing (75/25 assumption)

    Benchmark-normalized quality (Arena, Aider, SWE-bench)
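
    For anyone who wants to poke at the arithmetic, here is a minimal sketch of how the three pieces above could combine. It is not the tool's actual code. It assumes the 75/25 split means 75% input tokens and 25% output tokens, that prices are quoted in USD per million tokens, and that quality is already a score between 0 and 1. All function names are made up for illustration.

```python
def blended_price_per_million(input_price: float, output_price: float,
                              input_share: float = 0.75) -> float:
    """Blend input and output prices (USD per 1M tokens) by traffic share.

    Assumption: input_share=0.75 means 75% of tokens are input tokens.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_price + (1.0 - input_share) * output_price


def tokens_per_dollar(input_price: float, output_price: float,
                      input_share: float = 0.75) -> float:
    """Number of blended tokens one dollar buys."""
    price = blended_price_per_million(input_price, output_price, input_share)
    if price <= 0:
        raise ValueError("blended price must be positive")
    return 1_000_000 / price


def quality_adjusted_tokens_per_dollar(input_price: float, output_price: float,
                                       quality: float,
                                       input_share: float = 0.75) -> float:
    """Scale tokens per dollar by a quality score in [0, 1].

    Assumption: quality is a benchmark score already normalized to 0..1.
    """
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must be between 0 and 1")
    return quality * tokens_per_dollar(input_price, output_price, input_share)
```

    With hypothetical prices of $3 per million input tokens and $15 per million output tokens, the blended price works out to 0.75 * 3 + 0.25 * 15 = $6 per million, so about 166,667 tokens per dollar before the quality adjustment. A purely multiplicative quality factor is itself a modeling choice: it treats a model with half the score as worth exactly half as many tokens.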

    Early results surprised me (local often loses economically unless privacy is heavily valued).

    I’m mostly looking for critique of the methodology:

    Is quality-adjusted tokens per dollar even the right metric?

    Is normalizing Elo to % defensible?

    What benchmarks am I missing?
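
    To make the Elo question concrete, here is a rough sketch of two ways a rating could be turned into a 0 to 1 score. The normalization the tool actually uses isn't spelled out here, so both functions are illustrative assumptions, and the names are hypothetical. The first is plain min-max scaling across the compared models. The second is the standard Elo expected-score formula against a chosen reference rating.

```python
def min_max_normalize(ratings: dict[str, float]) -> dict[str, float]:
    """Rescale ratings so the lowest maps to 0.0 and the highest to 1.0.

    Caveat: the result depends on which models are in the comparison set.
    """
    lo, hi = min(ratings.values()), max(ratings.values())
    if hi == lo:
        # All ratings equal: no spread to normalize, treat them as tied.
        return {name: 1.0 for name in ratings}
    return {name: (r - lo) / (hi - lo) for name, r in ratings.items()}


def elo_expected_score(rating: float, reference_rating: float) -> float:
    """Standard Elo expected score of `rating` against `reference_rating`.

    Returns a value in (0, 1); equal ratings give 0.5.
    """
    return 1.0 / (1.0 + 10 ** ((reference_rating - rating) / 400.0))
```

    The trade-off: min-max scores shift whenever a model is added or removed, and the weakest model always scores 0. The expected-score version has a win-probability interpretation, but the choice of reference rating is arbitrary and changes every number.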