We are building AI agents for trades businesses (HVAC, electrical, plumbing).
We tested 13 speech-to-text providers on 100 real customer calls with:
- Background noise (vans, job sites, crying babies)
- UK and Irish regional accents (Northern/Southern England, Scotland, Ireland)
- Critical info: postcodes, addresses, phone numbers
- Variable turn length (1-5 words vs 16+)
Results: a 2.5x performance gap between the best and worst providers.
Interesting findings:
(1) Postcode recognition was the hardest task across ALL providers (50%+ WER).
(2) Regional variance was massive. Irish accents destroyed most models (20-30% higher WER than Southern England).
(3) Short confirmations ("yeah", "ok") actually had worse WER than long explanations. Counter-intuitive, but likely due to less context for the language model.
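For readers unfamiliar with the metric: WER (word error rate) is the word-level edit distance between the reference transcript and the provider's output, divided by the number of reference words. Below is a minimal sketch of that standard definition in Python. It is not our evaluation harness; the function name `wer`, the lowercase-and-split normalization, and the sample utterances are assumptions made up for illustration. It also shows a purely arithmetic effect that may contribute to finding (3): a single wrong word in a one-word turn is already 100% WER.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference words.

    Illustrative only. Normalization here is just lowercasing and whitespace
    splitting, which is an assumption; real pipelines usually do more.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical utterances, not taken from the test set.
short_turn = wer("yeah", "yes")
long_turn = wer(
    "the boiler stopped working on tuesday morning",
    "the boiler stopped working on tuesday warning",
)
print(f"short turn WER: {short_turn:.2f}")
print(f"long turn WER:  {long_turn:.2f}")
```

One substituted word costs the short turn its entire score, while the same single error in the seven-word turn costs one seventh. That does not replace the language-model-context explanation above, but it is worth keeping in mind when comparing WER across turn lengths.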
Full breakdown with graphs: https://x.com/pstrav/status/2018416957003866564
Context: We're Elyos AI (YC S23), handling 100k+ calls/month for trades businesses across the world.