qcode reads your code, proposes changes, runs commands, and streams diffs. All inference runs on your Mac. No cloud, no API keys, no telemetry. Your code stays on your disk.
The full walkthrough: desktop agent, iPhone takeover, voice + vision, and P2P delegated inference. Airplane mode on for the local segment. Filmed as episode one of Ship It Local, a weekly series reviewing apps built on the QVAC SDK.
6 minutes, one continuous review. Source file also available in the repo under docs/.
ReAct loop with 8 tools: read, write, list, grep, bash, diff, plan, reply. Native tool calling via the SDK with Zod schemas. Multi-turn memory with a sliding window and token budget.
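A minimal sketch of the loop shape described above, assuming a stubbed model function in place of the SDK. `runAgent`, `Model`, and `ToolHandler` are hypothetical names, and the loop is synchronous for brevity, whereas the real agent streams. Only the eight tool names come from the text.

```typescript
// The eight tool names listed above; everything else here is illustrative.
type ToolName = "read" | "write" | "list" | "grep" | "bash" | "diff" | "plan" | "reply";

interface ToolCall {
  name: ToolName;
  arguments: Record<string, unknown>;
}

// Hypothetical stand-ins: a model that picks the next action, and a tool implementation.
type Model = (history: string[]) => ToolCall;
type ToolHandler = (args: Record<string, unknown>) => string;

function runAgent(
  task: string,
  model: Model,
  tools: Partial<Record<ToolName, ToolHandler>>,
  maxSteps = 8,
): string {
  const history: string[] = [`task: ${task}`];
  for (let step = 0; step < maxSteps; step++) {
    // Reason: the model chooses the next tool call from the history so far.
    const call = model(history);
    // "reply" ends the loop and returns the final answer to the user.
    if (call.name === "reply") return String(call.arguments["text"] ?? "");
    // Act: run the tool, or report that no handler is registered for it.
    const handler = tools[call.name];
    const observation = handler ? handler(call.arguments) : `error: no handler for ${call.name}`;
    // Observe: feed the result back into the next turn.
    history.push(`${call.name} -> ${observation}`);
  }
  return "stopped: step limit reached";
}
```

The step limit is an assumption of this sketch: it keeps a confused model from looping forever.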
14 KB vanilla JS PWA, no install, no app store. Served from the Mac daemon over LAN. Voice input, camera, chat, permission approvals, model picker. All over SSE streams.
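For reference, a sketch of how one server-sent event is framed on the wire. The framing (`event:` and `data:` fields, blank line as terminator) follows the SSE format; the event name and payload shape below are assumptions, not qcode's actual protocol.

```typescript
// Serialises one server-sent event. JSON.stringify never emits raw newlines,
// so the payload always fits on a single "data:" line.
function formatSseEvent(event: string, payload: unknown): string {
  const data = JSON.stringify(payload);
  return `event: ${event}\ndata: ${data}\n\n`;
}

// Hypothetical event: one streamed token of assistant text.
const frame = formatSseEvent("token", { text: "hi" });
```

On the client, a browser `EventSource` would receive this as an event of type `token` whose `data` is the JSON string.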
Tap the mic on the PWA, speak the task. Whisper Base Q8 runs on-device via the SDK. Opt-in feature, downloaded from the QVAC registry the first time you enable it.
Take a photo from the iPhone, send it, get back a reading of the image. Qwen3-VL 2B with its projector runs on the Mac. Paste a stack trace, hand over a whiteboard photo, point the camera at an error.
Offload heavy tasks to a stronger peer over Hyperswarm. Same SDK, same agent loop. Tool execution stays on the Mac. Verified end-to-end with Qwen3 8B on a Linux peer.
ask (approve every write), plan-first (approve the plan, auto-run), auto-writes (trusted writes), yolo (demo mode, no prompts). Shell metacharacters are rejected; writes are sandboxed under the project root.
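The two guards mentioned above could look roughly like this. It is a sketch under assumptions: the exact set of rejected characters is not given in the text, and `isCommandAllowed` and `resolveInsideRoot` are hypothetical names.

```typescript
import * as path from "node:path";

// Assumed list of shell metacharacters; the set qcode actually rejects is not documented here.
const SHELL_METACHARACTERS = /[;&|`$<>(){}\\\n]/;

// Returns true when the command contains none of the rejected characters.
function isCommandAllowed(command: string): boolean {
  return !SHELL_METACHARACTERS.test(command);
}

// Resolves `target` against the project root.
// Returns the absolute path, or null when the path would land outside the root.
function resolveInsideRoot(projectRoot: string, target: string): string | null {
  const root = path.resolve(projectRoot);
  const resolved = path.resolve(root, target);
  const relative = path.relative(root, resolved);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  return escapes ? null : resolved;
}
```

This checks paths lexically only. A symlink inside the project that points elsewhere would pass, so a stricter version would also compare real paths.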
A single Express daemon on the Mac runs @qvac/sdk in-process. A capability router dispatches each prompt to the best backend: local Qwen3 1.7B for fast tasks, a P2P peer for heavy work. Tools always execute locally.
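One way to picture the routing decision, as a sketch only. How qcode actually classifies a prompt is not described; the `filesTouched` hint and the `chooseBackend` name are assumptions made for illustration.

```typescript
type Backend = "local" | "peer";

interface TaskHints {
  filesTouched: number;   // hypothetical hint: how many files the task is expected to change
  peerAvailable: boolean; // whether a delegated peer is currently reachable
}

// Heavy work goes to the peer when one is reachable; everything else stays local.
// This only selects where inference runs. Tool execution is local in both cases.
function chooseBackend(hints: TaskHints): Backend {
  const heavy = hints.filesTouched > 1;
  return heavy && hints.peerAvailable ? "peer" : "local";
}
```

Falling back to the local model when no peer is reachable keeps the agent usable offline, at the cost of weaker results on large tasks.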
loadModel, completion({ tools }), toolCallStream.
loadModel({ delegate: { topic, providerPublicKey } }). The SDK forwards the request over Hyperswarm. Tools still resolve locally.
loadModel (primary)
completion (streaming): tokenStream for assistant text.
completion (tools): ToolInput[]. SDK injects tools via Qwen3's native chat template.
toolCallStream: { name, arguments } tool calls parsed by the SDK.
cancel: AbortSignal.
loadModel (delegate)
startQVACProvider
loadModel (Whisper): WHISPER_BASE_Q8_0 on demand when voice is enabled.
loadModel (Vision): QWEN3VL_2B_MULTIMODAL_Q4_K + projector on demand when vision is enabled.
downloadAsset: QWEN3_1_7B_INST_Q4, QWEN3_4B_INST_Q4_K_M, QWEN3_8B_INST_Q4_K_M constants. SDK caches under ~/.qvac/models/.
qcode was built to show what the QVAC SDK unlocks: inference, voice, vision, and delegated compute with no external services. The demo was recorded in airplane mode. No API keys, no cloud, no trackers. Your code never leaves your disk.
~/.qvac/models/ $HOME
Qwen3 1.7B handles single-file edits, reads, and searches reliably. Multi-file refactors benefit from peer delegation to Qwen3 8B.
When the Mac and the peer sit behind the same home router, Hyperswarm's hole punching fails if the router lacks NAT hairpinning. Workarounds: mobile hotspot, VPS peer, or WireGuard with a public relay.
Shared-secret header is fine on a trusted LAN. Public-internet deployment would need TLS + stronger auth, intentionally out of scope for the local-first demo.
Context window capped at 8192 tokens with sliding-window + token-budget management. Fine for most coding tasks, but large refactors are better delegated.
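A sketch of sliding-window trimming under a token budget. The 8192 limit comes from the text; the four-characters-per-token estimate, the `Message` shape, and `trimToBudget` are assumptions, and a real implementation would count with the model's tokenizer.

```typescript
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

const CONTEXT_LIMIT = 8192; // cap stated above

// Rough heuristic (assumption): about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keeps all system messages, then as many of the most recent turns as fit the budget.
function trimToBudget(messages: Message[], budget: number = CONTEXT_LIMIT): Message[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let used = system.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  const kept: Message[] = [];
  // Walk from newest to oldest and stop at the first turn that no longer fits.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```

In practice the budget passed in would be lower than the full limit, leaving room for the model's reply.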
Everything is in the repo. Signed commits. MIT licence.