Claude Code Voice Mode: The Complete Guide to /voice

Updated Mar 17, 2026 · 9 min read

Claude Code just got a microphone. The new /voice command lets you talk to Claude directly from the terminal, using push-to-talk voice input instead of typing. It's rolling out to roughly 5% of Claude Code users as of March 2026, and it changes the way you interact with the CLI in some genuinely useful ways.

If you have access, here's everything you need to know to start using it well.

What is Claude Code voice mode?

Voice mode adds speech input to Claude Code's existing terminal interface. Instead of typing a prompt, you press a key, speak, and release. Your speech is transcribed and sent to Claude as a regular text prompt. Claude's response still appears as text in the terminal, the same as any other interaction.

This is not a voice assistant in the Siri or Alexa sense. There is no spoken response, and Claude doesn't read its answers aloud. The voice part is input only: you speak, Claude reads. Everything else about Claude Code works exactly as before: your CLAUDE.md files, your tools, your context window, your permissions. Nothing changes except how you enter your prompt.

Transcription is handled by a speech-to-text model in the voice pipeline: your audio is converted to text, and that text becomes your prompt. Accuracy is high for natural English speech, and technical vocabulary is handled reasonably well once you get a feel for how to phrase things.

How to activate voice mode

Run /voice in your Claude Code session. If the feature is available on your account, Claude Code will request microphone access from your operating system. You will need to grant permission the first time.

Once activated, voice mode uses a push-to-talk model: hold the designated key while speaking, and release it when you are done. The transcription appears in your terminal, and Claude processes it like any typed prompt. You can switch back to typing at any time. Voice mode does not lock you in.

Requirements are straightforward: a working microphone, microphone permissions granted to your terminal application, and access to the voice mode rollout. If /voice returns an error or is not recognized, the feature has most likely not been enabled for your account yet.

When voice mode is actually useful

Voice mode is not universally better than typing. It is better in specific situations, and understanding those situations is what separates productive use from frustration.

Explaining complex bugs

Describing a bug verbally is often faster and more natural than typing it out. "The auth middleware is returning a 403 for users who have a valid session token but whose subscription expired in the last 24 hours, and I think the issue is that the token validation runs before the subscription check, so expired users get an auth error instead of a billing redirect." That sentence takes about 15 seconds to say and 45 seconds to type. The spoken version also tends to include more context, because you are thinking out loud rather than editing as you go.
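To sanity-check that estimate against your own pace, the arithmetic is easy to make explicit. The sketch below counts the words in the example sentence and derives the speaking and typing rates implied by the 15- and 45-second figures. Those two durations are this article's rough estimates, not measurements, so swap in your own timings.

```python
# The example bug description quoted above.
EXAMPLE = (
    "The auth middleware is returning a 403 for users who have a valid "
    "session token but whose subscription expired in the last 24 hours, "
    "and I think the issue is that the token validation runs before the "
    "subscription check, so expired users get an auth error instead of "
    "a billing redirect."
)


def words_per_minute(text: str, seconds: float) -> float:
    """Rate implied by delivering `text` in `seconds` (whitespace-separated words)."""
    return len(text.split()) * 60 / seconds


word_count = len(EXAMPLE.split())          # 51 words
speaking_wpm = words_per_minute(EXAMPLE, 15)  # 15 s is the article's estimate
typing_wpm = words_per_minute(EXAMPLE, 45)    # 45 s is the article's estimate

print(f"{word_count} words")
print(f"speaking: {speaking_wpm:.0f} wpm, typing: {typing_wpm:.0f} wpm")
print(f"speed-up from speaking: {speaking_wpm / typing_wpm:.1f}x")
```

If your own speaking pace is slower than the rate this prints, the gap narrows, but the comparison only flips if you type faster than you talk.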

Dictating architectural decisions

When you are making high-level decisions about your codebase, voice is a natural fit. Talking through which approach to take, explaining tradeoffs, describing how components should interact: these are conversations, and voice mode lets them feel like conversations. You can lay out your reasoning in a continuous stream rather than condensing it into terse typed prompts.

Rapid brainstorming

Need to explore several ideas quickly? Voice mode removes the friction of typing each one. "What if we moved the rate limiter to the edge? Or we could keep it in the API layer but add a circuit breaker pattern. Actually, what about a token bucket per user at the edge with a fallback to the central rate limiter?" That train of thought flows naturally in speech. Typing it out interrupts the thinking.

Accessibility

For developers with repetitive strain injuries, mobility limitations, or any condition that makes sustained typing uncomfortable, voice mode is a significant accessibility improvement. Push-to-talk still needs a key press, but the ability to work with Claude Code with far less typing opens up the tool to people who may have found long terminal sessions physically difficult.

Context dumps after meetings

You just got out of a meeting where three decisions were made and five tasks were assigned. Instead of typing it all out, you can speak the context directly into Claude. "We decided to drop the Redis cache and go with a local LRU cache for the session store. The team wants the API versioning done by next Wednesday. Oh, and the PM said we need to support bulk imports in the CSV endpoint, up to 10,000 rows." Claude gets all the context, and you can follow up with typed prompts to act on it.

When voice mode is not the right choice

Voice mode has clear limitations. Knowing them up front will save you from fighting the tool.

Code-heavy prompts. If your prompt includes specific code, exact file paths, or precise syntax, type it. Saying "open parenthesis, const user equals await get user by ID, open parenthesis, request dot params dot ID, close parenthesis" is absurd. Type the code. Use voice for the explanation around the code.

Noisy environments. Background noise degrades transcription accuracy. If you are in a coffee shop, open office, or anywhere with significant ambient sound, voice mode can produce transcription errors that change the meaning of your prompt.

Quiet environments where others can hear you. Describing your authentication bug out loud in a shared office is going to attract attention. And not the good kind. Voice mode works best when you have some privacy.

Precision-critical prompts. If you need Claude to follow exact constraints ("use exactly this function signature, with this return type, accepting these three parameters in this order"), type it. Spoken language tends toward approximation. Written language tends toward precision. Match the tool to the need.

Tips for getting the most out of voice mode

Speak in complete thoughts. The transcription works best with full sentences and natural phrasing. Fragments and false starts get transcribed literally, which can confuse Claude.

Combine voice with written follow-ups. Use voice for the initial explanation or high-level direction, then switch to typing for specific refinements. "Here's what I need" in voice, "here's exactly how" in text. This hybrid approach plays to the strengths of both input modes.

Use voice for the "why" and text for the "what." Voice is excellent for explaining why you want something done a certain way, providing background context, or describing a problem. Text is better for specifying exactly what Claude should produce. The combination is more effective than either alone.

Front-load the important context. If your voice prompt is long, put the most critical information early. Transcription accuracy is generally consistent throughout, but Claude's attention to your prompt follows the same patterns as with typed input. Lead with what matters most.

Review the transcription before Claude processes it. The transcription appears in your terminal before Claude acts on it. Take a second to scan it. If something was misheard, you can cancel and re-record rather than letting Claude work from a garbled prompt.

Current availability and limitations

As of March 2026, voice mode is in a limited rollout to approximately 5% of Claude Code users, as noted in the Anthropic Claude Code changelog. There is no waitlist or opt-in form. Anthropic is expanding access gradually, and you will know you have it when /voice starts working.

The feature requires microphone access at the OS level. On macOS, this means granting your terminal application (Terminal, iTerm2, Warp, or whatever you use) permission to access the microphone in System Settings, under Privacy & Security > Microphone. On Linux, your terminal needs access to the audio input device through PulseAudio or PipeWire. If microphone permission is not granted, voice mode will not activate.
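On Linux, it can help to confirm that the sound server sees a microphone at all before troubleshooting /voice itself. The following is a minimal sketch of such a check, independent of Claude Code: it assumes the `pactl` utility is installed (it ships with PulseAudio and also works against PipeWire's PulseAudio compatibility layer) and parses the tab-separated output of `pactl list short sources`. The function names are illustrative only.

```python
import shutil
import subprocess


def capture_sources(pactl_output: str) -> list[str]:
    """Return non-monitor source names from `pactl list short sources` output."""
    names = []
    for line in pactl_output.splitlines():
        fields = line.split("\t")  # index, name, driver, sample spec, state
        if len(fields) < 2:
            continue
        name = fields[1]
        # Monitor sources mirror an output device; they are not microphones.
        if not name.endswith(".monitor"):
            names.append(name)
    return names


def list_microphones() -> list[str]:
    """Query the running sound server; returns [] if pactl is not installed."""
    if shutil.which("pactl") is None:
        return []
    result = subprocess.run(
        ["pactl", "list", "short", "sources"],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit (e.g. no server running) yields no sources
    )
    return capture_sources(result.stdout)


if __name__ == "__main__":
    mics = list_microphones()
    if mics:
        print("\n".join(mics))
    else:
        print("No capture source found (or pactl is not installed).")
```

An empty result points to a device or sound-server problem rather than anything in Claude Code. A listed source only shows that a capture device exists, not that your terminal is allowed to use it.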

The transcription model handles English well. Support for other languages has not been officially documented, and accuracy for non-English speech may vary.

There is no voice output. Claude's responses remain text-only. This is a deliberate design choice that keeps the interaction grounded in the terminal workflow rather than turning it into a voice assistant experience.

Voice mode and the output problem

Here is the part that most guides skip. Voice mode makes it faster to get things into Claude. But the output side still has the same challenges.

When you use voice to direct Claude through a complex task, the results are often substantial: architecture documents, technical specifications, detailed reports, structured plans. Claude produces these as markdown in your terminal, and that is where they live until you do something with them.

If you need to share that output with your team in Google Docs, drop it into a Word document for a client, paste it into an email, or send it to Slack, the formatting becomes a problem. Markdown does not paste cleanly into any of these destinations. Headers become plain text. Lists lose their structure. Code blocks turn into monospaced messes or disappear entirely.

This is where Unmarkdown™ fits in. Paste your markdown, pick a template, and copy it formatted for your destination: Google Docs, Word, Slack, OneNote, Email, or Plain Text. The formatting translates correctly because Unmarkdown™ converts the markdown into the specific format each destination expects, not a generic HTML blob that might or might not render properly.

Voice mode makes Claude more conversational. Conversations produce more free-form, document-like output. That output needs to go somewhere. The pipeline is: speak to Claude, get markdown, format with Unmarkdown™, deliver to your team.

Voice mode and context management

One thing to keep in mind: voice prompts tend to be longer and more verbose than typed prompts. You are speaking naturally, which means more words for the same amount of information. This is not a problem in isolation, but it does mean your context window fills up faster.

If you are working on a long session with voice mode, be aware that you will hit compacting sooner than you would with concise typed prompts. The same context window limits apply regardless of how you input your prompts. Longer prompts mean fewer turns before Claude starts summarizing earlier parts of the conversation.
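A back-of-the-envelope model makes the tradeoff concrete. Every number in the sketch below is a hypothetical placeholder, not a measured value or a Claude Code setting: the token budget, the words per prompt, the tokens per response, and the 1.3 tokens-per-word ratio (a crude rule of thumb for English) are all assumptions. It also ignores file reads and tool output, which count against the same window.

```python
def turns_before_limit(
    budget_tokens: int,
    prompt_words: int,
    response_tokens: int,
    tokens_per_word: float = 1.3,  # assumed rough ratio for English text
) -> int:
    """Estimate how many prompt/response turns fit in a fixed token budget."""
    tokens_per_turn = prompt_words * tokens_per_word + response_tokens
    return int(budget_tokens // tokens_per_turn)


# Hypothetical session: same budget and response size, different prompt length.
BUDGET = 100_000    # assumed budget before compacting kicks in
RESPONSE = 1_500    # assumed average tokens per Claude response

typed_turns = turns_before_limit(BUDGET, prompt_words=40, response_tokens=RESPONSE)
spoken_turns = turns_before_limit(BUDGET, prompt_words=120, response_tokens=RESPONSE)

print(f"typed prompts:  ~{typed_turns} turns")
print(f"spoken prompts: ~{spoken_turns} turns")
```

How much the gap matters depends on how large your prompts are relative to everything else in each turn. With long responses the difference is modest; with short responses and long spoken prompts it grows.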

The strategies for preventing compacting still apply. Use CLAUDE.md files for persistent instructions, keep individual prompts focused, and start fresh sessions for distinct tasks rather than running everything in one long conversation. For a deeper look at how to structure those CLAUDE.md files and rules, see Context Engineering for Claude Code: The Complete Guide.

The bottom line

Voice mode is not a gimmick. It is a genuinely useful input method for specific situations: explaining bugs, providing context, brainstorming, and directing high-level work. It does not replace typing for precision tasks, and it is not meant to.

The developers who will get the most out of it are the ones who treat it as another tool in the workflow, not a replacement for the keyboard. Speak when speaking is faster. Type when typing is more precise. Combine both when the task calls for it.

If you have access, try /voice on your next complex bug report or architecture discussion. You will probably find that the hardest part is not the technology. It is getting comfortable talking to your terminal.
