Interesting that you went with a voice-to-voice realtime pipeline to reduce latency. speech-swift (which I maintain) could complement it with on-device speaker diarization, so your voice agent can tell speakers apart without depending on the cloud. https://soniqo.audio/guides/diarize