HWE Bench: A new unbounded Benchmark for LLMs (GPT 5.5 is on top)

(hwebench.com)

5 points | by fesens 12 hours ago

2 comments

  • fesens 12 hours ago

    Current benchmarks have ceilings, usually 100%. This benchmark aims to be long-lasting, highly correlated with the ability to solve real-world problems and follow complex instructions, and unbounded (meaning scores can always go higher).

  • fabiofachini92 12 hours ago

    Amazing!