A small desktop companion that turns your Dockbox into a voice assistant. Speech-to-text and text-to-speech run on your machine — your voice never leaves it. The agent, your data, and your tools stay on the server.
Press the on-screen orb or hit F9 from anywhere. A beep marks the start; you talk; another beep marks the end. The hologram pulses while it thinks, then speaks the reply back. Press again at any time to interrupt.
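The press-to-talk flow above can be pictured as a small state machine. The sketch below is illustrative only, not the app's actual code: the `VoiceTurn` class, its method names, and the state names are hypothetical, and it assumes the end of speech arrives as a separate event (however the app detects it) while any press after the start interrupts the turn.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LISTENING = auto()
    THINKING = auto()
    SPEAKING = auto()


class VoiceTurn:
    """Hypothetical push-to-talk controller, for illustration only."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.events: list[str] = []  # records beeps and interrupts

    def press(self) -> None:
        """Handle an orb click or an F9 press."""
        if self.state is State.IDLE:
            self.events.append("beep:start")
            self.state = State.LISTENING
        else:
            # Assumption: a press in any other state cancels the turn.
            self.events.append("interrupt")
            self.state = State.IDLE

    def speech_ended(self) -> None:
        """Called when recording stops; the second beep plays here."""
        if self.state is State.LISTENING:
            self.events.append("beep:end")
            self.state = State.THINKING

    def reply_ready(self) -> None:
        """Called when the reply text is available to speak."""
        if self.state is State.THINKING:
            self.state = State.SPEAKING

    def playback_done(self) -> None:
        """Called when the spoken reply has finished playing."""
        if self.state is State.SPEAKING:
            self.state = State.IDLE
```

Keeping the interrupt as a single branch in `press` means the same button behaves predictably whether the assistant is still thinking or already speaking.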
Whisper transcribes your voice on the device. The transcript — and only the transcript — goes to your Dockbox over the same authenticated session you already use. The reply streams back as text. Kokoro speaks it locally.
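As a rough sketch of that data flow, the function below wires the three stages together with injected callables. Everything here is an assumption made for illustration: `run_voice_turn`, `transcribe`, `send_transcript`, and `speak` are hypothetical names, not the real Whisper, Dockbox, or Kokoro APIs. The point is only that audio stays inside the local functions and the network call sees nothing but text.

```python
from typing import Callable, Iterable


def run_voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    send_transcript: Callable[[str], Iterable[str]],
    speak: Callable[[str], None],
) -> str:
    """Run one voice turn; only text is handed to the network step."""
    # Local speech-to-text (Whisper in the app; a stand-in callable here).
    transcript = transcribe(audio)

    # The server receives the transcript and streams the reply as text chunks.
    reply_parts = []
    for chunk in send_transcript(transcript):
        reply_parts.append(chunk)
    reply = "".join(reply_parts)

    # Local text-to-speech (Kokoro in the app; a stand-in callable here).
    speak(reply)
    return reply


if __name__ == "__main__":
    # Stub stages so the sketch runs without any speech libraries installed.
    reply = run_voice_turn(
        b"\x00\x01",
        transcribe=lambda audio: "what is on my calendar",
        send_transcript=lambda text: iter(["You have ", "two meetings."]),
        speak=lambda text: None,
    )
    print(reply)
```

For simplicity this version waits for the full reply before speaking. A real client might instead start speaking sentence by sentence as chunks arrive.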
Many of the requests we send to chat assistants don't need a typed message. They need a spoken sentence, said while you're walking, cooking, holding something, or just thinking out loud. Dockbox Voice closes that gap by reusing the agent and tools you already configured.
Whisper runs on your CPU or GPU. The audio is decoded, transcribed, and discarded — it never leaves the machine.
Only the transcript and the reply text traverse the network — over the same authenticated session your Dockbox already uses.
Uses the email-verified Dockbox session you already created — no extra auth, no API keys, no cloud accounts.
Download the installer, sign in, and talk. No terminal, no dependencies, no configuration files to edit by hand.
.exe installer: downloads everything it needs
Download and run JarvisSetup.exe. It installs Jarvis and the speech models in one pass.
The configuration window opens automatically. Enter your Dockbox URL and log in with your existing account — same credentials as the web app.
Choose which Dockbox group your voice turns land in. You can change this later from the settings.
Jarvis launches. Press the hologram or hit F9 from anywhere — you're live.
Same agent, same data, same tools — now reachable through a single button and a microphone.