Voice AI Glossary · 28 Terms · 2026

Voice AI Glossary

Clear definitions for every term in voice AI, conversational AI, and speech technology — written for practitioners and curious learners alike.

Browse All Terms

From speech-to-text and text-to-speech to LLMs, AI memory, and ambient AI — every key concept explained.

Voice AI

Voice AI is software that understands spoken language and responds with synthesised speech. Unlike text-based AI, voice …

Speech-to-Text (STT)

Speech-to-text (STT) is the technology that converts spoken audio into written text. It is the first layer in any voice …

Text-to-Speech (TTS)

Text-to-speech (TTS) is the technology that converts written text into spoken audio. In voice AI, TTS is the final layer…

AI Memory

AI memory is the ability of an AI system to retain information from past conversations and use it in future interactions…

Conversational AI

Conversational AI is software designed to engage in natural, multi-turn dialogue with humans. Unlike simple chatbots tha…

Ambient AI

Ambient AI refers to AI systems that operate continuously in the background of your environment rather than being invoke…

Real-Time AI

Real-time AI refers to AI systems that respond fast enough to maintain the natural flow of human conversation — typicall…

AI Latency

AI latency is the time between the end of a user's input and the start of the AI's response. In voice AI, this is measur…
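
As a rough illustration of the definition above, the sketch below computes latency as the gap between the moment the user stops speaking and the moment the response starts. The `Turn` class, its field names, and the sample timestamps are hypothetical and chosen only for this example; timestamps are integer milliseconds to keep the arithmetic exact.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One conversational turn (hypothetical structure for illustration)."""
    user_input_end_ms: int   # when the user's input ended
    response_start_ms: int   # when the AI's response began


def latency_ms(turn: Turn) -> int:
    """Latency = start of the AI's response minus end of the user's input."""
    return turn.response_start_ms - turn.user_input_end_ms


# Made-up timestamps for two turns.
turns = [Turn(10_000, 10_450), Turn(22_300, 23_100)]
latencies = [latency_ms(t) for t in turns]
average_latency = sum(latencies) / len(latencies)
```

With these sample values, `latencies` is `[450, 800]` and `average_latency` is `625.0`.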

Large Language Model (LLM)

A large language model (LLM) is an AI system trained on vast quantities of text to predict the next most likely word in …

Voice Assistant

A voice assistant is software that responds to spoken commands and queries using natural language understanding. Traditi…

AI OS

An AI OS (AI operating system) is a persistent AI layer that sits across your digital life — understanding your context,…

Context Window

A context window is the total amount of text an AI language model can 'see' and use when generating a response. Everythi…

Multimodal AI

Multimodal AI is an AI system that can process and generate multiple types of data — text, images, audio, video — within…

Voice Cloning

Voice cloning is the process of creating an AI model that replicates the sound, tone, cadence, and personality of a spec…

AI Hallucination

An AI hallucination occurs when a language model generates confident, fluent output that is factually incorrect or entir…

Speech Recognition

Speech recognition is the technology that enables computers to identify and transcribe spoken words. It is often used in…

Natural Language Processing (NLP)

Natural language processing (NLP) is the branch of AI focused on enabling computers to understand, interpret, and genera…

Fine-Tuning

Fine-tuning is the process of taking a pre-trained AI model and continuing to train it on a smaller, specialised dataset…

Retrieval-Augmented Generation (RAG)

Retrieval-augmented generation (RAG) is an AI architecture that combines retrieval — fetching relevant information from …

Voice Activity Detection (VAD)

Voice activity detection (VAD) is the process of automatically detecting when a person starts and stops speaking in an a…
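
One simple way to picture start and stop detection is an energy threshold over short audio frames. The sketch below is a toy version of that idea, not how any particular product does it: production VADs are often model-based, and the frame length, threshold, and synthetic signal here are assumptions made purely for illustration.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech_segments(samples, frame_len=4, threshold=0.01):
    """Return (start_frame, end_frame) pairs; end_frame is exclusive."""
    energies = frame_energies(samples, frame_len)
    segments = []
    start = None
    for idx, energy in enumerate(energies):
        if energy >= threshold and start is None:
            start = idx                      # speech begins
        elif energy < threshold and start is not None:
            segments.append((start, idx))    # speech ends
            start = None
    if start is not None:                    # still speaking at end of audio
        segments.append((start, len(energies)))
    return segments


# Synthetic signal: silence, a loud burst, then silence again.
silence = [0.0] * 8
burst = [0.5, -0.5] * 4
segments = detect_speech_segments(silence + burst + silence)
```

Here the 24 samples split into six frames of four, and `segments` is `[(2, 4)]`: frames 2 and 3 are flagged as speech.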

Prompt Engineering

Prompt engineering is the practice of crafting inputs to AI language models to elicit better, more accurate, or more app…

Speaker Diarization

Speaker diarization is the process of automatically segmenting an audio recording to identify 'who spoke when.' It label…

Agentic AI

Agentic AI refers to AI systems that do not just generate text responses but take autonomous actions in the world — brow…

Embeddings

Embeddings are numerical representations of text (or other data) that capture semantic meaning. In an embedding space, w…
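
Embedding vectors are commonly compared with cosine similarity. The minimal sketch below uses hand-written three-dimensional vectors as stand-ins; real embeddings come from a trained model and typically have hundreds or thousands of dimensions, so the values in `toy_embeddings` are invented for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented vectors, not output from any real embedding model.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

related = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
unrelated = cosine_similarity(toy_embeddings["cat"], toy_embeddings["invoice"])
```

With these toy vectors, `related` is much higher than `unrelated`, which is the property a search or memory system relies on when it ranks items by similarity.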

Token

A token is the basic unit of text that large language models process. Rather than working with whole words, LLMs break t…
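
To make the idea of sub-word units concrete, here is a toy greedy longest-match tokenizer. It is only a sketch: production tokenizers learn their vocabularies from data with algorithms such as byte-pair encoding, and the tiny `vocab` set below is made up for this example.

```python
def tokenize(text, vocab):
    """Split text into the longest vocabulary pieces, left to right.

    Characters that match no vocabulary piece become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until one matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical vocabulary for illustration.
vocab = {"un", "believ", "able", "token", "s"}
pieces = tokenize("unbelievable", vocab)
```

Here `pieces` is `["un", "believ", "able"]`, and `tokenize("tokens", vocab)` gives `["token", "s"]`.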

AI Inference

AI inference is the process of running a trained AI model to generate outputs from new inputs. When you ask an AI a ques…

OpenAI Whisper

Whisper is an open-source speech recognition model developed by OpenAI and released in 2022. Trained on 680,000 hours of…

AI Ethics

AI ethics is the branch of ethics concerned with the design, development, and deployment of artificial intelligence in w…
