Whisper is an open-source speech recognition model developed by OpenAI and released in 2022. Trained on 680,000 hours of multilingual audio data, it achieves near-human accuracy across many languages and is widely used in both consumer and enterprise applications.
Whisper is notable for two things: multilingual coverage (99 languages) and open weights (freely downloadable and runnable locally). This makes it the default choice for developers who need high-quality transcription without per-request API costs. However, Whisper processes audio in batch mode — not in real time — which makes it slower than streaming alternatives like Deepgram for live voice applications.
Lucy OS1 uses Deepgram nova-3 rather than Whisper for real-time voice conversation. Deepgram's streaming architecture returns partial transcripts in real time, enabling sub-500ms total conversation latency. Whisper's batch processing would add 1-3 seconds of STT latency, making live conversation feel sluggish.
Whisper is trained on audio in 99 languages, making it one of the most comprehensive multilingual STT options available.
Whisper's model weights are publicly available under MIT license. Anyone can run it locally, fine-tune it, or build commercial products on it without per-request fees.
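Running Whisper locally comes down to a couple of lines with the open-source `openai-whisper` package (`pip install openai-whisper`; it also needs `ffmpeg` on the PATH for audio decoding). A minimal sketch — the file name and wrapper function are illustrative, not part of the library:

```python
def transcribe_locally(path: str, size: str = "small") -> str:
    """Transcribe an audio file with a locally downloaded Whisper checkpoint."""
    import whisper  # imported lazily; requires `pip install openai-whisper`

    model = whisper.load_model(size)   # downloads the MIT-licensed weights on first run
    result = model.transcribe(path)    # batch: processes the whole file at once
    return result["text"]

# Example usage (not run here): transcribe_locally("meeting.mp3")
```

Because the weights are downloaded once and cached, there are no per-request fees after the initial setup — only your own compute.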
Whisper processes complete audio files or fixed-length audio chunks rather than streaming. Excellent for transcribing recorded audio; too slow for real-time conversation.
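Concretely, Whisper's encoder consumes fixed 30-second windows, so long recordings are split into chunks before inference rather than streamed sample-by-sample. A small sketch of that chunking arithmetic (the helper is illustrative, not a library function):

```python
WINDOW_SECONDS = 30.0  # Whisper's fixed input window

def chunk_bounds(duration_s: float, window_s: float = WINDOW_SECONDS):
    """Return (start, end) offsets covering a recording of duration_s seconds."""
    bounds = []
    start = 0.0
    while start < duration_s:
        bounds.append((start, min(start + window_s, duration_s)))
        start += window_s
    return bounds

# A 75-second file needs three windows; the last one is partial.
print(chunk_bounds(75.0))  # [(0.0, 30.0), (30.0, 60.0), (60.0, 75.0)]
```

This is why Whisper can't return a partial transcript while you're still speaking: it has to wait for each full window before producing any text.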
Whisper comes in five sizes (Tiny, Base, Small, Medium, Large v3) trading accuracy for speed. Tiny runs on a CPU; Large v3 requires a powerful GPU for reasonable speed.
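The size tiers map to rough parameter counts (figures from the openai/whisper README; treat them as indicative). A sketch of picking a checkpoint by a capacity floor — the selection helper is hypothetical:

```python
# Approximate parameter counts in millions, smallest to largest.
WHISPER_SIZES_M_PARAMS = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "large-v3": 1550,
}

def smallest_size_with_at_least(min_params_m: int) -> str:
    """Pick the smallest checkpoint meeting a rough capacity floor."""
    for name, params in WHISPER_SIZES_M_PARAMS.items():
        if params >= min_params_m:
            return name
    return "large-v3"

print(smallest_size_with_at_least(200))  # small
```

In practice the tradeoff is simple: bigger checkpoints are more accurate but need more VRAM and run slower, so pick the smallest size whose accuracy is acceptable for your audio.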
Is Whisper better than Deepgram?
For batch transcription of multilingual audio, Whisper Large v3 matches Deepgram's accuracy. For real-time voice conversation, Deepgram's streaming architecture is significantly better — Whisper's batch-only design is not suited for low-latency applications.
Can I run Whisper offline?
Yes — the open weights make Whisper runnable on your own hardware. Whisper Small runs on a CPU. Whisper Large v3 requires an NVIDIA GPU with 10GB+ VRAM.
Is Whisper free?
The model weights are free. Running it requires compute — either your own hardware or via a cloud API. OpenAI charges $0.006/minute for the Whisper API; Groq offers Whisper inference significantly cheaper.
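The hosted-API cost is easy to estimate from the per-minute rate quoted above ($0.006/minute; check current pricing before relying on it). A back-of-envelope sketch:

```python
OPENAI_WHISPER_PER_MIN = 0.006  # USD per audio minute, rate quoted in this article

def api_cost_usd(audio_minutes: float, per_min: float = OPENAI_WHISPER_PER_MIN) -> float:
    """Rough hosted-transcription cost for a given amount of audio."""
    return round(audio_minutes * per_min, 2)

# Ten hours of recorded meetings:
print(api_cost_usd(10 * 60))  # 3.6
```

At these rates, hosted Whisper is cheap for occasional use, while self-hosting pays off once volume is high enough that your hardware amortizes below the per-minute fee.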
Lucy OS1 puts these concepts to work in a real, streaming voice AI pipeline — Deepgram STT, GPT-4o-mini, and Cartesia TTS delivering natural voice conversation.
Start talking to Lucy →