Talk to your terminal
I built a voice MCP for Claude Code. It runs locally, no cloud APIs, and is super fast.
I built this to spec out features. When I'm figuring out what an app should look like or how something should work, I'd rather talk through it than type it. Claude can interview me, push back, ask what happens when the user does X, and I end up with a better spec than I'd have gotten staring at a blank document. Talking through a piece of writing and having Claude shape it also gets closer to what I actually mean than starting from scratch.
Claude can also go off and run a task on its own and come back and tell you it's done. "Hey, I finished. Here's what I changed." You don't have to sit watching the terminal.
Claude Code doesn't have native voice, so I built Interweave. Three models running locally on your Mac, no cloud speech APIs. Hopefully someone reuses it or improves on it. More likely it just gives you an idea.
curl -fsSL https://raw.githubusercontent.com/EndlessHoper/interweave/main/install.sh | bash
How it works
The first version was a standalone Python app using Anthropic's Agent SDK. It listened on your mic, sent what you said to Claude, and played the response back as audio. I wanted it to be interruptible so it ran voice detection while Claude was speaking to catch you cutting in, but it couldn't filter out Claude's own voice from the mic and kept thinking its own speech was the user talking. I scrapped interruptions and went with simple turn-taking, and then realized the whole thing would be more useful as an MCP server than a standalone app. As an MCP server any coding agent can pick it up and it gets access to all their other tools for free.
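Because it speaks plain MCP, registering it with a coding agent is just configuration. A rough sketch of what a project-level `.mcp.json` entry for Claude Code looks like (the server name and launch command here are illustrative placeholders, not the real values the installer writes; the install script sets this up for you):

```json
{
  "mcpServers": {
    "interweave": {
      "command": "uv",
      "args": ["run", "interweave-mcp"]
    }
  }
}
```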
Claude gets three tools, speak_and_listen, speak, and listen, each doing roughly what its name suggests. Under the hood it's Kokoro for generating speech, Parakeet for transcribing it, and a voice detection model that knows when you start and stop talking, all running on Apple Silicon with basically no delay. Claude speaks fully, then opens the mic, and when you pause for about a second and a half it transcribes and responds.
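The end-of-turn logic is simple in principle: after speech has been heard, a long enough pause closes the mic. A minimal sketch, assuming a VAD that emits one boolean per 30 ms audio frame (the frame size and exact threshold here are illustrative, not Interweave's actual values):

```python
# Sketch of pause-based endpointing: the turn ends once ~1.5 s of
# silence accumulates after the user has started speaking.
FRAME_MS = 30
SILENCE_THRESHOLD_MS = 1500  # illustrative pause length that ends a turn

def end_of_turn(vad_frames):
    """Return the frame index where the turn ends, or None if it hasn't."""
    silence_ms = 0
    heard_speech = False
    for i, is_speech in enumerate(vad_frames):
        if is_speech:
            heard_speech = True
            silence_ms = 0  # any speech resets the pause timer
        elif heard_speech:
            silence_ms += FRAME_MS
            if silence_ms >= SILENCE_THRESHOLD_MS:
                return i  # long pause after speech: stop and transcribe
    return None

# Example: 10 speech frames, then silence until the threshold trips.
frames = [True] * 10 + [False] * 60
print(end_of_turn(frames))  # → 59, the frame where silence hits 1.5 s
```

Resetting the silence counter on every speech frame is what lets you pause mid-sentence without losing the turn, as long as the pause stays under the threshold.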
When to open the mic
Voice is just another tool call in Claude's loop alongside everything else.
Claude (tool calling loop)
│
├── read_file("spec.md")
├── edit_file("app.tsx", ...)
├── speak("reading your files")           ← keeps going
├── run_code("npm test")
├── speak_and_listen("should I deploy?")
│     └── "yeah go for it"                ← your voice
├── bash("git push")
└── speak("done, pushed to main")         ← keeps going
The hardest part of building with Interweave is deciding when to open the mic. speak sends audio and keeps going, while speak_and_listen opens the mic and waits for a response. Open it too often and you get a tedious loop of confirmations you didn't need. Too rarely and Claude runs off without you. What you want is a loop where Claude moves fast on its own and only pulls you in when your answer actually changes what happens next.
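The difference between the two calls is blocking behavior. A minimal sketch of the pattern, where `client` and `call_tool` are hypothetical stand-ins for whatever MCP client the agent uses (only the tool names come from Interweave):

```python
# Fire-and-forget vs. blocking voice calls. `client.call_tool` is an
# illustrative stand-in, not a real MCP client API.

def notify(client, message):
    """speak: play audio and keep working -- no mic, nothing to wait for."""
    client.call_tool("speak", {"text": message})

def ask(client, question):
    """speak_and_listen: play audio, open the mic, block for the reply."""
    result = client.call_tool("speak_and_listen", {"text": question})
    return result["transcript"]

# The loop you want: notify for status, ask only at real decision points.
# notify(client, "tests passed")            # agent keeps moving
# answer = ask(client, "should I deploy?")  # agent waits for you
```

Keeping status updates on the non-blocking path and reserving the blocking call for genuine forks in the road is what keeps the loop fast without Claude running off on its own.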
Install
curl -fsSL https://raw.githubusercontent.com/EndlessHoper/interweave/main/install.sh | bash
Requires macOS with Apple Silicon, Python 3.11+, and uv. Around 4.5GB for dependencies and models.
Also works with Codex and OpenCode. See the README for setup.