Show HN: Clawphone – Twilio voice/SMS gateway for AI agents using TwiML polling

(github.com)

2 points | by ranacseruet 10 hours ago ago

1 comments

PranayKumarJain an hour ago ago

Nice—this is a very pragmatic “works with just TwiML” approach.
A couple questions / thoughts from building voice agents in production:
- How do you handle barge‑in / interruptions? With <Gather input="speech"> + polling, it’s hard to do true full‑duplex + partial ASR. Have you considered a hybrid mode where you keep the TwiML simplicity for setup, but optionally switch to <Stream> (Media Streams) when people want sub‑second turn-taking? - Twilio’s built-in speech recog is convenient, but in my experience it can be the first thing teams outgrow (accuracy, language coverage, costs, and lack of token-level partials). Do you expose an interface so people can swap STT later without reworking the call control? - For long agent responses: do you chunk <Say> / keep call alive with <Pause>? Any gotchas around Twilio timeouts while the agent is “thinking”?
We’ve run into the same infra-vs-latency tradeoff at eboo.ai (real-time voice agents / telephony + WebRTC). If you ever want a sanity check on the lowest-latency Twilio path (Media Streams + incremental STT + barge-in), happy to compare notes.