[ 01 ] · a desktop app for people who live in coding CLIs
Overcli wraps claude, codex, gemini, copilot, and ollama in a single desktop app — conversations, diffs, git worktrees, and usage stats in the same place. Stop juggling terminals. Start reading the work.
over·CLI — a GUI that sits over your CLIs. Yes, that's the name.
A live peek. The three lanes are actually streaming.
[ 02 ] · inside the window
Every surface below exists because somebody got annoyed that it didn't. No feature-ballast, no PM checkboxes — just the pieces you miss the moment you pop back into a raw terminal.
Claude, Codex, Gemini, Copilot, and Ollama — five stream parsers, one consistent UI. Markdown. Syntax highlight. Streaming tokens. Cloud or local, same window.
Overcli sits on top of the CLIs you already use, with the auth they already have. Your Claude Pro, your ChatGPT plan, your gcloud creds, your GitHub Copilot seat — whatever claude, codex, gemini, and copilot are signed into is what overcli runs through. Local ollama for the rest. No API keys to manage. No new bills.
Group related repos into a single workspace and every agent inherits the map: it knows the projects belong together and navigates between them naturally. Ask for a change that spans api-gateway and billing-svc and the agent edits both, runs both test suites, and shows one review. The feature that makes polyrepo feel like monorepo.
Long-running background agents that live next to your work. A doc-writer that keeps /docs in sync as the code changes. A PR-reviewer that quietly comments on every pull request the moment it opens. No Slack pings. No dashboards. Just work getting done.
After every turn, fire a second agent — on a different backend if you like — to review what just landed. Thinking blocks visible, round counter included. Collaboration mode loops rounds until the reviewer stops finding things to fix.
Note: Copilot can't be a reviewer yet — its CLI doesn't read prompts from stdin. When it's your primary, overcli routes the review to claude/codex/gemini and shows a "Routed via X" chip so it's not surprising.
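The routing rule in that note can be sketched as a small pure function. This is only an illustration: `Backend`, `pickReviewer`, and the fallback order are hypothetical names and choices, and the only facts taken from the text are that Copilot cannot review and that the review then goes to claude, codex, or gemini.

```typescript
type Backend = "claude" | "codex" | "gemini" | "copilot" | "ollama";

interface ReviewerChoice {
  reviewer: Backend;
  routedVia?: Backend; // set only when the requested reviewer was swapped out
}

// Assumption: this preference order is illustrative, not Overcli's actual order.
const FALLBACK_ORDER: Backend[] = ["claude", "codex", "gemini"];

function pickReviewer(
  requested: Backend,
  available: Set<Backend>,
): ReviewerChoice | undefined {
  // Copilot's CLI does not read prompts from stdin, so it never reviews.
  if (requested !== "copilot" && available.has(requested)) {
    return { reviewer: requested };
  }
  const fallback = FALLBACK_ORDER.find((b) => available.has(b));
  return fallback ? { reviewer: fallback, routedVia: fallback } : undefined;
}
```

When `routedVia` is set, the UI has what it needs to render the "Routed via X" chip.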
File edits render as diffs. Bash lives in a terminal block. Reads, writes, todos each get their own card — so you can actually see what the agent did.
Claude permission prompts and Codex approval cards (exec + apply_patch) are proper UI elements — not modals interrupting the flow.
Loads prior transcripts straight out of ~/.claude/projects, ~/.codex/sessions, and ~/.gemini/tmp. Nothing gets re-invented; we just read the files you already have.
Syntax highlighting, line-range highlighting, HTML & Markdown preview tabs for previewable files. No context-switch to VS Code to check the agent's work.
One searchable pane for everything the CLIs can do — slash commands, sub-agents, skills, plugins, MCP servers — unified across Claude, Codex, Gemini, and Ollama. Rescan on demand; no YAML to edit.
⌘P file finder. ⌘\ toggles the sidebar. ⌘, opens settings. ⌘K jumps conversations. It is not a website; it is a tool.
Create, update, rebase, merge, push, or remove a git worktree from inside the conversation. Agents work in isolation; you merge when you like what you read.
A live +/− rollup sits above the composer, counting everything the agent has touched this turn. Click to expand the diff, click a file to jump to it. The commit badge is one click away.
A proper UI for your Ollama install: browse the catalog, filter by maker or country, pull and delete with one click, watch server logs live, and see a readout of the GPU you're actually running on.
Rolling 5h / 24h / 7d stats, broken down by backend, model, and project. Know what you're burning before the invoice tells you.
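A rolling breakdown like this is a filter and a group-by. The sketch below assumes a hypothetical `UsageEvent` record and counts tokens; the real app may track cost or requests instead, and none of these names come from Overcli.

```typescript
// Hypothetical usage record; field names are assumptions for illustration.
interface UsageEvent {
  at: number; // epoch milliseconds
  backend: string;
  model: string;
  project: string;
  tokens: number;
}

const HOUR_MS = 60 * 60 * 1000;
const WINDOWS = { "5h": 5 * HOUR_MS, "24h": 24 * HOUR_MS, "7d": 7 * 24 * HOUR_MS } as const;
type WindowName = keyof typeof WINDOWS;

// Sum tokens per backend, model, or project for events inside the window ending at `now`.
function rollup(
  events: UsageEvent[],
  span: WindowName,
  by: "backend" | "model" | "project",
  now: number,
): Map<string, number> {
  const cutoff = now - WINDOWS[span];
  const totals = new Map<string, number>();
  for (const e of events) {
    if (e.at < cutoff || e.at > now) continue; // outside the rolling window
    totals.set(e[by], (totals.get(e[by]) ?? 0) + e.tokens);
  }
  return totals;
}
```

Passing `now` in explicitly keeps the function pure, which makes the three windows easy to test against a fixed clock.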
As you approach a rate limit or cost ceiling, Overcli can step you down automatically — opus → sonnet, cloud → local ollama — so the next turn still ships. Configurable per project. Off by default.
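One way to picture the step-down is an ordered chain of models and a threshold. This is a sketch under assumptions: `StepDownConfig`, the `threshold` fraction, and the model labels are hypothetical, and only the "off by default" and "opus → sonnet, cloud → local" behavior comes from the text.

```typescript
interface Budget {
  used: number; // tokens or dollars consumed in the current window
  limit: number; // the rate limit or cost ceiling for that window
}

// Hypothetical per-project settings; `threshold` is an assumed knob.
interface StepDownConfig {
  enabled: boolean; // off by default, per the description above
  threshold: number; // fraction of the limit at which to step down
  chain: string[]; // ordered from most to least expensive
}

function chooseModel(current: string, budget: Budget, cfg: StepDownConfig): string {
  if (!cfg.enabled || budget.limit <= 0) return current;
  if (budget.used / budget.limit < cfg.threshold) return current;
  const i = cfg.chain.indexOf(current);
  // Unknown model, or already at the end of the chain: stay where we are.
  const next = i === -1 ? undefined : cfg.chain[i + 1];
  return next ?? current;
}
```

Calling `chooseModel` before each turn means a step-down takes effect on the next turn rather than interrupting the current one.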
Per-backend status pills: ready, unauthenticated, missing, error. The app tells you which CLI is broken before you try to use it.
Fire one prompt at every backend at once. Watch the answers land side by side, compare the diffs, keep the winner, discard the rest. The fastest way to tell which agent actually understood the task. See it below ↓
When a flow finishes, it doesn't have to stop caring. Put a run in watch and it keeps an eye on the thing it touched — the PR, the Jira ticket, the Slack thread — answering follow-up questions with the full context of what it just did. A cheap detect pass polls every tick; the real model only wakes when someone actually asks something.
It tends, it never re-does. If a comment asks for real work — a code change, a re-run — the watcher flags you instead of touching the repo, and pings you on the desktop.
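The two-tier idea (cheap detect on every tick, full model only for real questions, humans for real work) might look like the sketch below. Everything here is hypothetical: `ThreadComment`, `WatchDeps`, and `watchTick` are illustrative names, and the functions are synchronous to keep the example short where a real watcher would be async.

```typescript
interface ThreadComment {
  id: string;
  author: string;
  body: string;
}

type Verdict = "ignore" | "question" | "work-request";

// Injected dependencies, all assumed for illustration.
interface WatchDeps {
  detect: (c: ThreadComment) => Verdict; // cheap pass, runs on every new comment
  answer: (c: ThreadComment) => string; // expensive model call, questions only
  notify: (c: ThreadComment) => void; // desktop ping when real work is requested
}

interface TickResult {
  answered: string[];
  flagged: string[];
}

function watchTick(incoming: ThreadComment[], seen: Set<string>, deps: WatchDeps): TickResult {
  const result: TickResult = { answered: [], flagged: [] };
  for (const c of incoming) {
    if (seen.has(c.id)) continue; // already handled on an earlier tick
    seen.add(c.id);
    const verdict = deps.detect(c);
    if (verdict === "question") {
      deps.answer(c); // the only place the full model wakes up
      result.answered.push(c.id);
    } else if (verdict === "work-request") {
      deps.notify(c); // hand off to the human; the repo is never touched
      result.flagged.push(c.id);
    }
  }
  return result;
}
```

The `seen` set is what keeps a quiet thread cheap: a tick with no new comments never reaches the model.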
Add a Model Context Protocol server in one place and Overcli writes it into every CLI's config in the exact format each one wants — ~/.claude.json, ~/.codex/config.toml, ~/.gemini/settings.json — dropping a .bak first. Pick from a curated catalog with one-click install, or copy a server you already have on one backend across to all the others. No hand-edited TOML, no format-juggling.
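To make the format-juggling concrete, here is a rough sketch of rendering one server definition for each of those three files. Treat it as an illustration, not Overcli's writer: `McpServer`, `planWrites`, and the `.bak` suffix naming are hypothetical, the `mcpServers` and `mcp_servers` key names reflect my understanding of those CLIs' config formats and should be checked against their docs, and merging into an existing file is left out.

```typescript
interface McpServer {
  name: string;
  command: string;
  args: string[];
}

// JSON fragment, assuming servers live under a top-level "mcpServers" object.
function toJsonEntry(s: McpServer): string {
  return JSON.stringify({ mcpServers: { [s.name]: { command: s.command, args: s.args } } }, null, 2);
}

// TOML table, assuming servers live under [mcp_servers.<name>].
function toTomlEntry(s: McpServer): string {
  // JSON string escaping is valid for TOML basic strings in common cases.
  const q = (v: string): string => JSON.stringify(v);
  const key = /^[A-Za-z0-9_-]+$/.test(s.name) ? s.name : q(s.name);
  return [
    `[mcp_servers.${key}]`,
    `command = ${q(s.command)}`,
    `args = [${s.args.map(q).join(", ")}]`,
  ].join("\n");
}

const TARGETS = [
  { path: "~/.claude.json", format: "json" },
  { path: "~/.codex/config.toml", format: "toml" },
  { path: "~/.gemini/settings.json", format: "json" },
] as const;

// Plan the writes: back up first, then write the entry in each file's own format.
function planWrites(s: McpServer): { path: string; backup: string; entry: string }[] {
  return TARGETS.map((t) => ({
    path: t.path,
    backup: `${t.path}.bak`, // assumed naming for the backup copy
    entry: t.format === "toml" ? toTomlEntry(s) : toJsonEntry(s),
  }));
}
```

Returning a plan instead of touching the disk keeps the rendering testable; the actual copy-to-backup and write would happen in a separate step.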
[ 03 ] · flows model
Flows are reusable, open-source workflow templates built into Overcli, available to everyone through the same public repo and desktop app. Define a pipeline once, run it consistently across projects, and keep the full audit trail. And the work doesn't end at the diff: a finished run can stay on watch, fielding follow-up questions on the PR or ticket with the context it built and flagging you the moment something needs real work.
Participants
Steps
Role: research / gather context
Produces: research.md
Role: design / implementation plan
Produces: plan.md
Role: implement / run checks / emit diff
Produces: final.diff
↓ the same flow, running
Add a token-bucket rate limiter to /api/login — 100 req/min per IP, return 429 with Retry-After.
what each participant produced
↪ Picking up this thread — switched to ollama/qwen2.5-coder for the build step. Reads plan.md, emits final.diff.
inputs plan.md
+ src/auth/rateLimiter.ts
+ const bucket = buckets.get(ip) ?? refill(ip);
+ if (bucket.tokens < 1) {
+   res.setHeader('Retry-After', bucket.resetIn);
+   return res.status(429).end();
+ }
- // TODO: throttle /login
✓ type-check clean · 18 tests pass
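For readers who want the technique behind that diff, a minimal token-bucket sketch follows. It is not the code from the demo run: `TokenBucketLimiter` and `take` are hypothetical names, the clock is passed in explicitly, and old buckets are never evicted, which a production limiter would need.

```typescript
interface Bucket {
  tokens: number;
  updatedAt: number; // epoch milliseconds of the last refill calculation
}

class TokenBucketLimiter {
  private readonly buckets = new Map<string, Bucket>();
  private readonly capacity: number;
  private readonly refillPerMs: number;

  // Defaults match the prompt above: 100 requests per minute per IP.
  constructor(capacity = 100, perMinute = 100) {
    this.capacity = capacity;
    this.refillPerMs = perMinute / 60000;
  }

  // Draw one token for `ip`. When denied, report the seconds until a token is available.
  take(ip: string, now: number): { allowed: boolean; retryAfterSec: number } {
    const prev = this.buckets.get(ip) ?? { tokens: this.capacity, updatedAt: now };
    const elapsed = Math.max(0, now - prev.updatedAt);
    // Refill lazily based on elapsed time, capped at the bucket capacity.
    const tokens = Math.min(this.capacity, prev.tokens + elapsed * this.refillPerMs);
    if (tokens < 1) {
      this.buckets.set(ip, { tokens, updatedAt: now });
      const retryAfterSec = Math.ceil((1 - tokens) / this.refillPerMs / 1000);
      return { allowed: false, retryAfterSec };
    }
    this.buckets.set(ip, { tokens: tokens - 1, updatedAt: now });
    return { allowed: true, retryAfterSec: 0 };
  }
}
```

In a request handler, a denied `take` would translate to a 429 response with `Retry-After` set to `retryAfterSec`. Because the token is drawn before authentication runs, failed logins count against the limit too.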
↓ same run, still open — steps shipped, now watching for follow-up
The flow doesn't close when the diff lands. The same run stays open in a watch phase — every step above already complete, every artifact still in context — now fielding follow-up on the work it just shipped. A cheap detect pass polls; the full model only wakes when someone actually asks something. It never touches the repo.
↪ New comment from @dana on the PR: “does the limiter count failed logins too?” — answered inline, grounded in plan.md: yes, every request to /login draws a token before auth runs, so failures count.
reads PR thread · plan.md · final.diff
↪ @sam asked to bump the limit to 200/min. That's real work, not a question — so the watcher left it alone and pinged you on the desktop instead of editing the code.
mix providers · mix models · cloud thinking, local speed · tends the work after the diff
[ 03b ] · ready to run
spotlight
One flow, six steps, three backends — each step runs on the model that fits it. Jira ticket in, reviewed PR out. The review step auto-bounces back to build on failure.
Every step targets any backend — keep the heavy reasoning on cloud models, push mechanical work to a local one. Swap models per step without touching the pipeline.
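A flow with per-step backends and a review step that bounces back to build could be modeled roughly like this. It is a sketch under assumptions: `Step`, `onFailGoTo`, `runFlow`, and the bounce cap are hypothetical, and step runners are synchronous here only to keep the example small.

```typescript
interface Step {
  name: string;
  backend: string; // each step can target a different backend
  run: (artifacts: Map<string, string>) => boolean; // false means the step failed
  onFailGoTo?: string; // name of the step to bounce back to on failure
}

function runFlow(steps: Step[], maxBounces = 3): { ok: boolean; trail: string[] } {
  const artifacts = new Map<string, string>(); // shared outputs such as plan.md
  const trail: string[] = []; // audit trail of every step execution
  let bounces = 0;
  let i = 0;
  while (i < steps.length) {
    const step = steps[i];
    if (!step) break;
    trail.push(step.name);
    if (step.run(artifacts)) {
      i += 1;
      continue;
    }
    const goTo = step.onFailGoTo;
    const target = goTo === undefined ? -1 : steps.findIndex((s) => s.name === goTo);
    // No bounce target, or too many bounces already: give up with the trail intact.
    if (target === -1 || bounces >= maxBounces) return { ok: false, trail };
    bounces += 1;
    i = target;
  }
  return { ok: true, trail };
}
```

Swapping a model is then a one-field change on a step, and the bounce cap keeps a stubborn review from looping forever.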
A few more of the good ones:
Survey the diff, review for bugs and security issues in parallel, adversarially verify every finding, then format a clean report.
Gather CI logs, identify the flaky tests, diagnose what's making each one nondeterministic, then draft fixes.
Map the diff, scan for OWASP-style issues, adversarially verify each finding to kill false positives, then format a report.
Diagnose error logs, locate the failing code, draft a fix, then verify the fix preserves behavior before you ship it.
Research the area, design alternatives, draft a full design doc, then adversarially review it for gaps before circulation.
Fan out searches across Confluence + Jira, deep-read the top hits, then synthesize a single answer with citations.
Browse the full set in the desktop app, or in the open flow registry ↗.
[ 04 ] · rebound reviews
A rebound isn't a pipeline — it happens in the conversation. After a turn lands, a reviewer agent reads the actual diff and writes back, round after round. You watch it concede, push back, and re-check in collaboration mode until it stops finding things. The reviewer can stay in the same family — a heavier thinker checking a faster model's work — or be completely independent, like Claude reviewing what Codex just shipped. A second pair of eyes that doesn't share the first one's blind spots.
Added the rate limiter and a test. stop() now kills the collab client instead of a best-effort interrupt(), and the notification / request / close handlers check codexCollab.get(id) === session before acting. Type-check clean, all 332 tests pass.
codex · collab · round 4
Looks fine overall. Killing and removing the persistent collab session on stop() closes the cross-round race I flagged last time.
Only nit: the comment slightly overstates the close change — it guards map eviction, then still inspects session.active. Fine in practice, but the wording is loose.
Conceded — the wording was loose. No code change needed; just a more accurate restatement of what the close handler actually guards.
codex · collab · round 5
Looks fine. The clarification matches the code — no issue, just tighter wording. Nothing left to flag.
One rebound, many shapes — pick a preset, or open Custom… for the dials.
same conversation · round after round · until it's clean
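The collaboration loop reduces to "review, revise, repeat until clean". The sketch below assumes hypothetical `review` and `revise` callbacks and a round cap; it does not reflect how Overcli actually talks to either agent.

```typescript
interface Review {
  findings: string[]; // empty means the reviewer has nothing left to flag
}

// Injected agents, both assumed for illustration.
interface ReboundDeps {
  review: (diff: string, round: number) => Review; // reviewer reads the actual diff
  revise: (diff: string, findings: string[]) => string; // primary agent responds
}

function rebound(
  diff: string,
  deps: ReboundDeps,
  maxRounds = 5,
): { diff: string; rounds: number; clean: boolean } {
  let current = diff;
  for (let round = 1; round <= maxRounds; round++) {
    const { findings } = deps.review(current, round);
    if (findings.length === 0) return { diff: current, rounds: round, clean: true };
    current = deps.revise(current, findings);
  }
  // Round cap reached with findings still open.
  return { diff: current, rounds: maxRounds, clean: false };
}
```

Putting the reviewer on a different backend from the primary only changes which functions are injected; the loop itself stays the same.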
[ 05 ] · colosseum mode
Fire the same task at every backend in parallel — cloud or local. Watch the answers land at their own pace. Compare the diffs side by side. Pick the best. Keep it. Discard the rest.
claude
opus 4.8
codex
gpt-5
gemini
2.5-pro
copilot github
gpt-5
ollama local
llama3-8b
same prompt · five answers · one diff
[ 06 ] · why overcli exists
Most coding-agent tooling optimizes for the demo. Overcli optimizes for the afternoon of the fourth day — when you've shipped three features with it, hit a weird edge, and need to understand what happened without breaking stride. The whole point is to keep you moving: a clean local environment where nothing is hidden, so you never stop to untangle what the agent did.
Diffs you can read, tool cards you can audit, permission flows you can follow, history that came from a file on disk. No invented abstractions. No hidden state. If something is on screen, there's a line of code you can find that put it there. That's the bar.
Written & maintained by Lionel Farr and Owen Farr. Open-source — contributors, issues, and PRs welcome.
[ 07 ] · grab a build
Overcli is in beta, but the macOS builds are now Developer-ID signed and notarized by Apple — they open on a double-click, no right-click ceremony. Windows isn't code-signed yet, so SmartScreen may still flag it. Either way the source is right there; build it yourself.
| Platform | Package | Size | Download |
| --- | --- | --- | --- |
| ⌘ macOS arm64 + x64 | .dmg · .zip | ~120 MB | [download] |
| ⊞ Windows x64 + arm64 | NSIS installer | ~110 MB | [download] |
| 🐧 Linux x64 + arm64 | .AppImage · .deb | ~130 MB | [download] |