2 comments

  • ipotapov 2 days ago

    Interesting that you went with a voice-to-voice realtime pipeline to reduce latency. speech-swift (which I maintain) could complement this by adding on-device speaker diarization, so your voice agent can tell speakers apart without a cloud dependency. https://soniqo.audio/guides/diarize

  • Applied_AI 3 days ago

    [flagged]