1 comment

  • herzigma an hour ago

    This is great work. Most voice AI optimizes for latency - you made the opposite bet (quality over speed, frontier models over lightweight ones) and that's probably the right call.

    The audio pipeline alone is impressive: on-device VAD, parallel TTS chunking, retry-from-failure mid-pipeline. That's not a weekend project; that's production-grade thinking.

    Here's the thing that excites me most, though - the *cognitive layer* is wide open. The experience harness is solid, but right now every session starts cold. Persistent user memory, context that makes your 50th conversation meaningfully smarter than your first, light orchestration that turns a single question into a structured multi-step inquiry - that's likely where this goes next, and it's a compelling frontier.
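    To make the "starts cold" point concrete, here's a minimal sketch of what a persistent memory layer could look like - store a short summary after each session, then fold recent summaries into the next session's system prompt. All names here (`MemoryStore`, `build_system_prompt`, the file path) are hypothetical, not from the project:

    ```python
    # Hypothetical persistent-memory sketch: after each session, persist a
    # short summary; before the next one, prepend recent summaries to the
    # system prompt so conversation 50 starts warmer than conversation 1.
    import json
    from pathlib import Path

    class MemoryStore:
        def __init__(self, path: str = "memory.json"):
            self.path = Path(path)
            # Load previously stored summaries, if any.
            self.entries = (
                json.loads(self.path.read_text()) if self.path.exists() else []
            )

        def remember(self, summary: str) -> None:
            # Append one session summary and persist to disk.
            self.entries.append(summary)
            self.path.write_text(json.dumps(self.entries))

        def recall(self, limit: int = 5) -> list[str]:
            # Return the most recent summaries, newest last.
            return self.entries[-limit:]

    def build_system_prompt(store: MemoryStore, base: str) -> str:
        # Fold remembered facts into the system prompt for the next session.
        notes = store.recall()
        if not notes:
            return base
        return base + "\n\nKnown about this user:\n" + "\n".join(
            f"- {n}" for n in notes
        )
    ```

    Even something this simple changes the product feel; the orchestration piece (one question fanning out into a structured multi-step inquiry) is the harder, more interesting half.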

    Voice access to frontier reasoning is massively underserved. You've built the right foundation for it.