3 comments

  • pstrav 5 hours ago

    We are building AI agents for trades businesses (HVAC, electrical, plumbing).

    We tested 13 speech-to-text providers on 100 real customer calls, covering:

    - Background noise (vans, job sites, crying babies)

    - UK regional accents (Northern/Southern England, Scotland, Ireland)

    - Critical info: postcodes, addresses, phone numbers

    - Variable turn length (1-5 words vs 16+)

    Results: a 2.5x gap in word error rate (WER) between the best and worst provider

        Best: Deepgram Flux (15.86% WER)
        Worst: OpenAI Whisper (39.78% WER)
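
For anyone who wants to check figures like these on their own calls, here is a minimal sketch of WER as it is usually defined: word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The function name, the lowercase/whitespace normalisation, and the sample transcript are assumptions for illustration, not the scoring setup used in the test above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    # Assumed normalisation: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcript pair: one wrong word out of five.
example = word_error_rate("my postcode is sw1a 1aa", "my postcode is sw1 1aa")

# The gap quoted above, using the two reported WER figures.
best_wer, worst_wer = 15.86, 39.78
gap = worst_wer / best_wer  # roughly 2.5x
```

Note that WER can exceed 100% when the hypothesis contains many inserted words, so it is an error ratio rather than a true percentage of words wrong.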
    
    Interesting findings:

    (1) Postcode recognition was hardest across ALL providers (50%+ WER).

    (2) Regional variance was massive. Irish accents destroyed most models (20-30% higher WER than Southern England).

    (3) Short confirmations ("yeah", "ok") actually had worse WER than long explanations. Counterintuitive, but likely because the language model has less context to work with.
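
A side note on finding (3): apart from the context explanation, there is also a purely arithmetic effect when WER is computed per turn, because the denominator is the number of reference words. The sketch below only illustrates that arithmetic; the function name is hypothetical, the turn lengths are taken from the 1-5 and 16+ word buckets above, and whether this effect contributed to the measured results is not something the post states.

```python
def wer_from_one_error(reference_words: int) -> float:
    """Per-turn WER when exactly one word in the turn is wrong."""
    if reference_words < 1:
        raise ValueError("a turn needs at least one reference word")
    return 1 / reference_words


# One misrecognised word costs far more in a short turn than in a long one.
for turn_length in (1, 5, 16):
    print(f"{turn_length:>2} words: {wer_from_one_error(turn_length):.2%}")
# prints:
#  1 words: 100.00%
#  5 words: 20.00%
# 16 words: 6.25%
```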

    Full breakdown with graphs: https://x.com/pstrav/status/2018416957003866564

    Context: We're Elyos AI (YC S23), handling 100k+ calls/month for trades businesses worldwide.

  • awooga 5 hours ago

    [dead]

  • asaf_lerner 5 hours ago

    [dead]