Research
How we make dictation feel instant.
Notes on the latency engineering behind Yapper. We publish what we learn.
Why every other dictation tool feels slow
Most apps wait for the cloud to transcribe and clean up your speech before showing you anything. That round trip burns 600–1300ms. Yapper transcribes locally with Whisper-v3-turbo, inserts the raw text immediately, then quietly replaces it with the cleaned version 200ms later. By the time the cleaner runs, your text is already on screen.
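The insert-then-replace flow can be sketched in a few lines. This is a minimal illustration, not Yapper's actual code: `TextBuffer`, `clean`, and `dictate` are hypothetical names, the filler-word cleanup stands in for whatever the real cleaner does, and the 200ms delay is simply the figure quoted above.

```python
import asyncio


class TextBuffer:
    """Hypothetical stand-in for the focused text field."""

    def __init__(self):
        self.text = ""
        self.history = []  # every state the user would have seen

    def insert(self, s):
        """Append text and return the (start, end) span it occupies."""
        start = len(self.text)
        self.text += s
        self.history.append(self.text)
        return (start, start + len(s))

    def replace(self, span, s):
        """Swap the text inside span for s."""
        start, end = span
        self.text = self.text[:start] + s + self.text[end:]
        self.history.append(self.text)


async def clean(raw, delay_s):
    """Toy cleaner (assumption): drop filler words, capitalize, add a period."""
    await asyncio.sleep(delay_s)  # simulates the cleanup pass taking time
    words = [w for w in raw.split() if w.lower() not in {"um", "uh"}]
    out = " ".join(words)
    if not out:
        return ""
    return out[:1].upper() + out[1:] + "."


async def dictate(buffer, raw, cleaner_delay_s=0.2):
    # Phase 1: show the raw transcript right away.
    span = buffer.insert(raw)
    # Phase 2: replace it in place once the cleaned version is ready.
    cleaned = await clean(raw, cleaner_delay_s)
    buffer.replace(span, cleaned)
    return cleaned


if __name__ == "__main__":
    buf = TextBuffer()
    asyncio.run(dictate(buf, "um send the report uh tomorrow"))
    print(buf.history)
```

The point of the design is that the user-visible wait ends at phase 1. The cleanup cost still exists, but it is paid after text is already on screen.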
What we’re measuring
Every dictation logs asr_ms, inject_ms, and end-to-end perceived latency. We aim for a median (p50) of ≤150ms on Apple Silicon. More writeups coming.
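As a rough sketch of how such a target could be checked, the snippet below computes a p50 over logged samples. The field names `asr_ms` and `inject_ms` come from the text; `perceived_ms`, `DictationSample`, `p50`, and `meets_target` are hypothetical, and applying the 150ms target to perceived latency is our assumption, since the text does not say which metric it covers. The sample values are made up for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class DictationSample:
    asr_ms: float        # named in the text
    inject_ms: float     # named in the text
    perceived_ms: float  # hypothetical field for end-to-end perceived latency


def p50(samples, field):
    """Median (50th percentile) of one metric across samples."""
    return median(getattr(s, field) for s in samples)


def meets_target(samples, target_ms=150.0):
    """Assumption: the target is checked against perceived latency."""
    return p50(samples, "perceived_ms") <= target_ms


if __name__ == "__main__":
    # Illustrative numbers only, not real measurements.
    samples = [
        DictationSample(asr_ms=90, inject_ms=8, perceived_ms=110),
        DictationSample(asr_ms=120, inject_ms=10, perceived_ms=140),
        DictationSample(asr_ms=200, inject_ms=12, perceived_ms=230),
    ]
    print(p50(samples, "perceived_ms"), meets_target(samples))
```

A p50 target means half of all dictations should come in at or under the threshold, so a slow outlier like the third sample does not fail the check on its own.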