Browser Agent Benchmark: Comparing LLM models for web automation

(browser-use.com)

11 points | by MagMueller 2 days ago

6 comments

  • wiradikusuma 2 days ago

    Since we're on this topic, can anyone suggest a good AI-based tool for exploratory (fuzzy?) web testing?

  • pixel_popping 2 days ago

    It's lacking the best model (Opus 4.5) on the benchmark tho.

    • djohnston a day ago

      Yeah but then their own product might not score the highest.

      • pixel_popping 10 hours ago

        Exactly why I'm pointing it out, which feels a bit corrupt, but understandable.

        • djohnston 8 hours ago

          tbh I was a bit cranky yesterday - even if they are #2 on a legit benchmark, that would be impressive

  • MagMueller 2 days ago

    [dead]