[ 01 ] · a desktop app for people who live in coding CLIs

Five coding agents. One honest window.

Overcli wraps claude, codex, gemini, copilot, and ollama in a single desktop app — conversations, diffs, git worktrees, and usage stats in the same place. Stop juggling terminals. Start reading the work.

over·CLI — a GUI that sits over your CLIs. Yes, that's the name.

[ Download · macOS ] view source →

macOS · Windows · Linux │ Electron · React · TypeScript

[App preview: the sidebar lists the payments-platform workspace (api-gateway, billing-svc, ledger-core) and design-system, plus the projects api-gateway, web-app, and design-tokens. Under api-gateway sit the conversations "add rate limiter to /login", "fix flaky signup test", and "migrate to pg14", along with an agent named review on feature/rate-limit. The open conversation, "add a rate limiter to /api/login", runs on Claude CLI · opus 4.8 [1m], with fork, Bypass (dangerous), Effort, and rebound controls above the composer.]

A live peek. The three lanes are actually streaming.

[ 02 ] · inside the window

Built by engineers obsessed with their dev environment.

Every surface below exists because somebody got annoyed that it didn't. No feature-ballast, no PM checkboxes — just the pieces you miss the moment you pop back into a raw terminal.

Multi-backend chat

Claude, Codex, Gemini, Copilot, and Ollama — five stream parsers, one consistent UI. Markdown. Syntax highlight. Streaming tokens. Cloud or local, same window.

No new subscriptions

Overcli sits on top of the CLIs you already use, with the auth they already have. Your Claude Pro, your ChatGPT plan, your gcloud creds, your GitHub Copilot seat — whatever claude, codex, gemini, and copilot are signed into is what overcli runs through. Local ollama for the rest. No API keys to manage. No new bills.

Workspaces — projects of projects

Group related repos into a single workspace and every agent inherits the map: it knows the projects belong together and navigates between them naturally. Ask for a change that spans api-gateway and billing-svc and the agent edits both, runs both test suites, and shows one review. The feature that makes polyrepo feel like monorepo.

Silent agents

Long-running background agents that live next to your work. A doc-writer that keeps /docs in sync as the code changes. A PR-reviewer that quietly comments on every pull request the moment it opens. No Slack pings. No dashboards. Just work getting done.

Rebound reviews

After every turn, fire a second agent — on a different backend if you like — to review what just landed. Thinking blocks visible, round counter included. Collaboration mode loops rounds until the reviewer stops finding things to fix.

Note: Copilot can't be a reviewer yet — its CLI doesn't read prompts from stdin. When it's your primary, overcli routes the review to claude/codex/gemini and shows a "Routed via X" chip so it's not surprising.

Tool cards, not wall-of-text

File edits render as diffs. Bash lives in a terminal block. Reads, writes, todos each get their own card — so you can actually see what the agent did.

Permission & approval, first-class

Claude permission prompts and Codex approval cards (exec + apply_patch) are proper UI elements — not modals interrupting the flow.

History from disk

Loads prior transcripts straight out of ~/.claude/projects, ~/.codex/sessions, and ~/.gemini/tmp. Nothing gets re-invented; we just read the files you already have.

File editor, right there

Syntax highlighting, line-range highlighting, HTML & Markdown preview tabs for previewable files. No context-switch to VS Code to check the agent's work.

Extensions browser

One searchable pane for everything the CLIs can do — slash commands, sub-agents, skills, plugins, MCP servers — unified across Claude, Codex, Gemini, and Ollama. Rescan on demand; no YAML to edit.

Keyboard first

⌘P file finder. ⌘\ toggles the sidebar. ⌘, opens settings. ⌘K jumps conversations. It is not a website; it is a tool.

Agent worktrees

Create, update, rebase, merge, push, or remove a git worktree from inside the conversation. Agents work in isolation; you merge when you like what you read.
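
As a rough illustration of what sits behind those buttons, the sketch below maps two of the actions onto `git worktree` invocations. The `WorktreeAction` type and `gitArgs` helper are hypothetical names invented for this example, not Overcli's internals; only the `git worktree add -b` and `git worktree remove` subcommands are real git.

```typescript
// Hypothetical sketch: translate a UI action into arguments for `git`.
type WorktreeAction =
  | { kind: "create"; path: string; branch: string; base: string }
  | { kind: "remove"; path: string };

function gitArgs(action: WorktreeAction): string[] {
  switch (action.kind) {
    case "create":
      // git worktree add -b <new-branch> <path> <commit-ish>
      return ["worktree", "add", "-b", action.branch, action.path, action.base];
    case "remove":
      // git worktree remove <path>
      return ["worktree", "remove", action.path];
  }
}
```

Keeping the argument list as data makes it easy to show the exact command in the conversation before running it.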

Changes bar

A live +/− rollup sits above the composer, counting everything the agent has touched this turn. Click to expand the diff, click a file to jump to it. The commit badge is one click away.

Local model dashboard

A proper UI for your Ollama install: browse the catalog, filter by maker or country, pull and delete with one click, watch server logs live, and see a readout of the GPU you're actually running on.

Usage dashboard

Rolling 5h / 24h / 7d stats, broken down by backend, model, and project. Know what you're burning before the invoice tells you.
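
A rolling window like this is a filter plus a group-by. The sketch below assumes a hypothetical `UsageRecord` shape and groups by backend only; the real dashboard also breaks usage down by model and project, and its storage format is not described here.

```typescript
// Hypothetical record shape; the field names are assumptions for illustration.
interface UsageRecord {
  at: number;       // timestamp in ms
  backend: string;  // e.g. "claude", "ollama"
  tokens: number;
}

const WINDOWS_MS = {
  "5h": 5 * 60 * 60 * 1000,
  "24h": 24 * 60 * 60 * 1000,
  "7d": 7 * 24 * 60 * 60 * 1000,
};

type WindowName = keyof typeof WINDOWS_MS;

// Sum tokens per backend for records that fall inside the rolling window.
function usageByBackend(records: UsageRecord[], span: WindowName, now: number): Record<string, number> {
  const cutoff = now - WINDOWS_MS[span];
  const totals: Record<string, number> = {};
  for (const record of records) {
    if (record.at < cutoff || record.at > now) continue;
    totals[record.backend] = (totals[record.backend] ?? 0) + record.tokens;
  }
  return totals;
}
```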

Smart downgrades

As you approach a rate limit or cost ceiling, Overcli can step you down automatically — opus → sonnet, cloud → local ollama — so the next turn still ships. Configurable per project. Off by default.
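
One way to picture the step-down is as a ladder of models and a threshold. The sketch below is an assumption-laden illustration: `DowngradePolicy`, `pickModel`, and the `threshold` field are hypothetical, and the real per-project settings may look different.

```typescript
// Hypothetical policy shape, for illustration only.
interface DowngradePolicy {
  enabled: boolean;   // off by default, as the text says
  threshold: number;  // fraction of the limit at which to step down, e.g. 0.9
  ladder: string[];   // ordered from preferred to fallback
}

// Return the model to use for the next turn.
function pickModel(policy: DowngradePolicy, current: string, usedFraction: number): string {
  if (!policy.enabled || usedFraction < policy.threshold) return current;
  const index = policy.ladder.indexOf(current);
  // Unknown model or already at the bottom rung: nothing to step down to.
  if (index === -1 || index === policy.ladder.length - 1) return current;
  return policy.ladder[index + 1];
}
```

With a ladder of `["opus", "sonnet", "ollama"]`, a turn that starts above the threshold on opus would run on sonnet instead.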

Health badges

Per-backend status pills: ready, unauthenticated, missing, error. The app tells you which CLI is broken before you try to use it.

Colosseum

Fire one prompt at every backend at once. Watch the answers land side by side, compare the diffs, keep the winner, discard the rest. The fastest way to tell which agent actually understood the task. See it below ↓

Flow watch

When a flow finishes, it doesn't have to stop caring. Put a run in watch and it keeps an eye on the thing it touched — the PR, the Jira ticket, the Slack thread — answering follow-up questions with the full context of what it just did. A cheap detect pass polls every tick; the real model only wakes when someone actually asks something.

It tends, it never re-does. If a comment asks for real work — a code change, a re-run — the watcher flags you instead of touching the repo, and pings you on the desktop.
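
The split between a cheap detect pass and the full model can be sketched as a single tick function. Everything named here (`watchTick`, `WatchDeps`, the three verdicts) is hypothetical and only illustrates the behavior described above; the model calls are injected as plain functions so the control flow stays visible.

```typescript
// Hypothetical sketch of one watch tick.
type Verdict = "ignore" | "question" | "work-request";

interface WatchEvent {
  author: string;
  text: string;
}

interface WatchDeps {
  detect: (event: WatchEvent) => Verdict;  // cheap pass, runs on every event
  answer: (event: WatchEvent) => string;   // full model, only for questions
  notify: (message: string) => void;       // desktop ping to the human
}

interface TickResult {
  answered: number;
  escalations: number;
  replies: string[];
}

function watchTick(events: WatchEvent[], deps: WatchDeps): TickResult {
  const result: TickResult = { answered: 0, escalations: 0, replies: [] };
  for (const event of events) {
    const verdict = deps.detect(event);
    if (verdict === "question") {
      // Only now does the expensive model run.
      result.replies.push(deps.answer(event));
      result.answered += 1;
    } else if (verdict === "work-request") {
      // The watcher never edits the repo: it flags the human instead.
      deps.notify(`${event.author} asked for real work: ${event.text}`);
      result.escalations += 1;
    }
  }
  return result;
}
```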

Install MCP once, everywhere

Add a Model Context Protocol server in one place and Overcli writes it into every CLI's config in the exact format each one wants — ~/.claude.json, ~/.codex/config.toml, ~/.gemini/settings.json — dropping a .bak first. Pick from a curated catalog with one-click install, or copy a server you already have on one backend across to all the others. No hand-edited TOML, no format-juggling.
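
To make the format difference concrete, here is a small sketch of rendering one server entry for a JSON config and for a TOML config. The helper names are hypothetical, and the key names `mcpServers` (JSON) and `mcp_servers` (TOML) are my understanding of what those CLIs read, so check each CLI's documentation before relying on them. The `.bak` naming is likewise an assumption, and `jira-mcp` is a made-up package name.

```typescript
// Hypothetical sketch; not Overcli's actual writer.
interface McpServer {
  command: string;
  args: string[];
}

// Merge one server into an existing JSON config without dropping other keys.
function withJsonServer(existing: string, name: string, server: McpServer): string {
  const config = existing.trim() === "" ? {} : JSON.parse(existing);
  const servers = { ...(config.mcpServers ?? {}), [name]: server };
  return JSON.stringify({ ...config, mcpServers: servers }, null, 2);
}

// Render the same server as a TOML table (simple strings only).
function codexTomlTable(name: string, server: McpServer): string {
  const args = server.args.map((arg) => JSON.stringify(arg)).join(", ");
  return [
    `[mcp_servers.${name}]`,
    `command = ${JSON.stringify(server.command)}`,
    `args = [${args}]`,
  ].join("\n");
}

// Where the pre-write backup would go, assuming a plain ".bak" suffix.
function backupPath(path: string): string {
  return `${path}.bak`;
}
```

The functions return strings rather than writing files, so the backup can be written and the result previewed before anything on disk changes.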

[ 03 ] · flows model

Define. Verify. Approve. Ship. Learn.

Flows are reusable, open-source workflow templates built into Overcli and available to everyone through the same public repo and desktop app. Define a pipeline once, run it consistently across projects, and keep the full audit trail. The work doesn't end at the diff, either: a finished run can stay on watch, fielding follow-up questions on the PR or ticket with the context it built and flagging you the moment something needs real work.

your request → research (claude · opus 4.8) → design (claude · opus 4.8) → build (ollama · qwen2.5-coder) → diff

Participants

Thinker · Claude · opus 4.8 · 2 steps

Local · Ollama · qwen2.5-coder · 1 step

Steps

1. research · reads: your request

Role: research / gather context

Produces: research.md

2. design · reads: research.md

Role: design / implementation plan

Produces: plan.md

3. build · reads: plan.md

Role: implement / run checks / emit diff

Produces: final.diff

YAML

name: Research + design + build
open_source: true
available_to: everyone
steps:
  - id: research
    use: claude/opus-4-8
  - id: design
    use: claude/opus-4-8
  - id: build
    use: ollama/qwen2.5-coder
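
For readers who think in types, the sketch below models a step with the `id` and `use` fields from the YAML plus the reads/produces pairs listed under Steps, and checks that every step reads something that already exists. The `FlowStep` interface, `parseUse`, and `missingInputs` are hypothetical names for illustration; the actual flow schema may carry more fields.

```typescript
// Hypothetical step shape combining the YAML fields with the Steps list.
interface FlowStep {
  id: string;
  use: string;       // "backend/model", e.g. "ollama/qwen2.5-coder"
  reads: string;     // artifact consumed
  produces: string;  // artifact emitted
}

// Split "backend/model" at the first slash.
function parseUse(use: string): { backend: string; model: string } {
  const slash = use.indexOf("/");
  if (slash <= 0 || slash === use.length - 1) {
    throw new Error(`expected backend/model, got "${use}"`);
  }
  return { backend: use.slice(0, slash), model: use.slice(slash + 1) };
}

// Report steps whose input has not been produced by an earlier step.
function missingInputs(steps: FlowStep[], initial: string): string[] {
  const available = [initial];
  const problems: string[] = [];
  for (const step of steps) {
    if (available.indexOf(step.reads) === -1) {
      problems.push(`${step.id} reads ${step.reads} before it exists`);
    }
    available.push(step.produces);
  }
  return problems;
}
```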

↓ the same flow, running

Research + design + build DONE

1.2k thinking · 842k fast · +96 −12 · 5 files

1 research ✓ → 2 design ✓ → 3 build ✓

Add a token-bucket rate limiter to /api/login — 100 req/min per IP, return 429 with Retry-After.

steps in this thread ✓ research ✓ design ✓ build

what each participant produced

research.md · from research · 6,204 chars · “Where /login is handled + the existing middleware chain”
plan.md · from design · 8,710 chars · “Plan: token-bucket limiter in the auth middleware”
final.diff · from build · +96 / −12 · 5 files · “Limiter, config, and tests”

▸ flow step Step: build · implementer

↪ Picking up this thread — switched to ollama/qwen2.5-coder for the build step. Reads plan.md, emits final.diff.

inputs plan.md

+ src/auth/rateLimiter.ts
+ const bucket = buckets.get(ip) ?? refill(ip);
+ if (bucket.tokens < 1) {
+   res.setHeader('Retry-After', bucket.resetIn);
+   return res.status(429).end();
+ }
- // TODO: throttle /login
✓ type-check clean · 18 tests pass
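
The excerpt above shows only the hot path, so here is a self-contained sketch of a token bucket that fits the request (100 requests per minute per IP, with a Retry-After value when the bucket is empty). It is not the code from final.diff: the `TokenBucketLimiter` class, its `take` method, and the injectable clock are assumptions made for this illustration.

```typescript
// Hypothetical sketch of a per-IP token bucket.
interface Bucket {
  tokens: number;     // tokens currently available
  updatedAt: number;  // timestamp (ms) of the last refill
}

interface Decision {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when allowed
}

class TokenBucketLimiter {
  private buckets = new Map<string, Bucket>();
  private capacity: number;
  private windowMs: number;
  private now: () => number;

  constructor(capacity: number, windowMs: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;  // e.g. 100 requests
    this.windowMs = windowMs;  // e.g. 60000 ms
    this.now = now;            // injectable clock, handy in tests
  }

  take(ip: string): Decision {
    const t = this.now();
    const ratePerMs = this.capacity / this.windowMs;
    const bucket = this.buckets.get(ip) ?? { tokens: this.capacity, updatedAt: t };

    // Refill continuously for the elapsed time, capped at capacity.
    bucket.tokens = Math.min(this.capacity, bucket.tokens + (t - bucket.updatedAt) * ratePerMs);
    bucket.updatedAt = t;
    this.buckets.set(ip, bucket);

    if (bucket.tokens < 1) {
      const msUntilToken = (1 - bucket.tokens) / ratePerMs;
      return { allowed: false, retryAfterSeconds: Math.ceil(msUntilToken / 1000) };
    }
    bucket.tokens -= 1;
    return { allowed: true, retryAfterSeconds: 0 };
  }
}
```

A request handler would call `take(ip)` before authentication and, when `allowed` is false, respond with 429 and set Retry-After to `retryAfterSeconds`.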

↓ same run, still open — steps shipped, now watching for follow-up

Research + design + build WATCHING

detect every 5m · 1 answered · 0 escalations

1 research ✓ → 2 design ✓ → 3 build ✓ → watching

The flow doesn't close when the diff lands. The same run stays open in a watch phase — every step above already complete, every artifact still in context — now fielding follow-up on the work it just shipped. A cheap detect pass polls; the full model only wakes when someone actually asks something. It never touches the repo.

▸ watch tick detect → answer · claude · haiku

↪ New comment from @dana on the PR: “does the limiter count failed logins too?” — answered inline, grounded in plan.md: yes, every request to /login draws a token before auth runs, so failures count.

reads PR thread · plan.md · final.diff

▸ watch tick escalate · flagged for you

↪ @sam asked to bump the limit to 200/min. That's real work, not a question — so the watcher left it alone and pinged you on the desktop instead of editing the code.

mix providers · mix models · cloud thinking, local speed · tends the work after the diff

[ 03b ] · ready to run

35 open-source flows ship with Overcli.

spotlight

Solve a ticket end-to-end

One flow, six steps, three backends — each step runs on the model that fits it. Jira ticket in, reviewed PR out. The review step auto-bounces back to build on failure.

claude codex ollama jira

1. fetch (researcher): pulls the ticket over MCP. claude · haiku 4.5 → ticket.md
2. plan (planner): scopes it against the repo. claude · opus 4.7 → plan.md
3. build (implementer): writes the code. codex · gpt-5.4-mini → diff
4. review (reviewer): auto-bounces to build on fail ↺. claude · opus 4.7 → review.md
5. tests (test-writer): covers the change. codex · gpt-5.4-mini → diff
6. ship (shipper): opens the PR (pauses for you first). ollama · qwen2.5-coder → pr_url

Every step targets any backend — keep the heavy reasoning on cloud models, push mechanical work to a local one. Swap models per step without touching the pipeline.

A few more of the good ones:

Code Review

claude

Survey the diff, review for bugs and security issues in parallel, adversarially verify every finding, then format a clean report.

survey → review ×2 → verify → report

review
quality

Flaky test hunter

claude · codex

Gather CI logs, identify the flaky tests, diagnose what's making each one nondeterministic, then draft fixes.

gather → identify → diagnose → fix

testing
ci

Security review

claude

Map the diff, scan for OWASP-style issues, adversarially verify each finding to kill false positives, then format a report.

map → scan → verify → report

security
review

Debug error logs

claude · codex

Diagnose error logs, locate the failing code, draft a fix, then verify the fix preserves behavior before you ship it.

diagnose → locate → fix → verify

debugging
logs

System design doc

claude · codex

Research the area, design alternatives, draft a full design doc, then adversarially review it for gaps before circulation.

research → design → draft → review

design
docs

Search company knowledge

claude · confluence

Fan out searches across Confluence + Jira, deep-read the top hits, then synthesize a single answer with citations.

search → deep-read → synthesize

research
confluence

Browse the full set in the desktop app, or in the open flow registry ↗.

[ 04 ] · rebound reviews

Ship a turn. Bounce it back. Let a second agent tear it apart.

A rebound isn't a pipeline — it happens in the conversation. After a turn lands, a reviewer agent reads the actual diff and writes back, round after round. You watch it concede, push back, and re-check in collaboration mode until it stops finding things. The reviewer can stay in the same family — a heavier thinker checking a faster model's work — or be completely independent, like Claude reviewing what Codex just shipped. A second pair of eyes that doesn't share the first one's blind spots.

opus 4.7

Added the rate limiter and a test. stop() now kills the collab client instead of a best-effort interrupt(), and the notification / request / close handlers check codexCollab.get(id) === session before acting. Type-check clean, all 332 tests pass.

codex · collab · round 4

codex · gpt-5

Looks fine overall. Killing and removing the persistent collab session on stop() closes the cross-round race I flagged last time.

Only nit: the comment slightly overstates the close change — it guards map eviction, then still inspects session.active. Fine in practice, but the wording is loose.

opus 4.7

Conceded — the wording was loose. No code change needed; just a more accurate restatement of what the close handler actually guards.

codex · collab · round 5

codex · gpt-5 ✓ clean

Looks fine. The clarification matches the code — no issue, just tighter wording. Nothing left to flag.

One rebound, many shapes — pick a preset, or open Custom… for the dials.

Off: No secondary review.
Half-finished work check: Same model hunts stubs, TODOs, and missed branches. Cheap. Code-change turns only.
Security review: A smarter model reads the diff for vulnerabilities. Code-change turns only.
Cheap-and-paranoid: A cheap primary writes; a smart reviewer checks every turn.
Skeptical user: "Did it actually do what I asked?" Catches scope creep, every turn.
Design review: A smart model weighs architecture, abstractions, and approach.
Independent second opinion: A different CLI entirely, for fully independent reasoning.
Custom…: Set reviewer CLI, model, persona, and mode by hand.

reviewer cli: claude (same as primary) · codex · gemini · ollama
mode: review — flag → fix → verify, no loops · collab — ping-pong until the budget's spent
reviewer model: cli default · cheap (sonnet) · smart (opus)
persona: none · half-finished · security · critic · skeptical user · design
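
The collab mode described above amounts to a bounded loop: review, revise, repeat until the reviewer reports clean or the round budget runs out. The sketch below shows that shape with hypothetical names (`collabRebound`, `ReboundDeps`); the reviewer and the primary are injected as plain functions, so it says nothing about how Overcli actually talks to each CLI.

```typescript
// Hypothetical sketch of a collab-mode rebound loop.
interface Review {
  clean: boolean;  // true when the reviewer has nothing left to flag
  notes: string;
}

interface ReboundDeps {
  review: (diff: string, round: number) => Review;  // reviewer agent
  revise: (diff: string, notes: string) => string;  // primary responds to notes
}

interface ReboundResult {
  diff: string;
  rounds: number;
  clean: boolean;
}

function collabRebound(diff: string, maxRounds: number, deps: ReboundDeps): ReboundResult {
  let current = diff;
  for (let round = 1; round <= maxRounds; round++) {
    const verdict = deps.review(current, round);
    if (verdict.clean) return { diff: current, rounds: round, clean: true };
    current = deps.revise(current, verdict.notes);
  }
  // Budget spent without a clean verdict.
  return { diff: current, rounds: maxRounds, clean: false };
}
```

Review mode would be the same loop with the budget fixed so that it cannot ping-pong.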

same conversation · round after round · until it's clean

[ 05 ] · colosseum mode

One prompt. Five agents. One diff to rule them.

Fire the same task at every backend in parallel — cloud or local. Watch the answers land at their own pace. Compare the diffs side by side. Pick the best. Keep it. Discard the rest.
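
Conceptually this is a parallel fan-out in which one failing lane must not sink the others. The sketch below shows that pattern with hypothetical names (`colosseum`, `Runner`, `Lane`); each runner stands in for whatever actually drives a backend and is assumed to return a promise.

```typescript
// Hypothetical sketch of a same-prompt fan-out across backends.
interface Lane {
  backend: string;
  ok: boolean;
  output: string;  // the answer, or the error message when ok is false
}

type Runner = (prompt: string) => Promise<string>;

async function colosseum(prompt: string, runners: Record<string, Runner>): Promise<Lane[]> {
  const names = Object.keys(runners);
  // Start every backend at once; convert each rejection into a failed lane.
  const lanes = names.map((backend) =>
    runners[backend](prompt).then(
      (output): Lane => ({ backend, ok: true, output }),
      (err): Lane => ({ backend, ok: false, output: String(err) }),
    ),
  );
  return Promise.all(lanes);
}
```

Picking a winner then comes down to choosing one lane and discarding the rest.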

claude · opus 4.8
codex · gpt-5
gemini · 2.5-pro
copilot (github) · gpt-5
ollama (local) · llama3-8b

same prompt · five answers · one diff

[ 06 ] · why overcli exists

Move fast without losing the thread.

Most coding-agent tooling optimizes for the demo. Overcli optimizes for the afternoon of the fourth day — when you've shipped three features with it, hit a weird edge, and need to understand what happened without breaking stride. The whole point is to keep you moving: a clean local environment where nothing is hidden, so you never stop to untangle what the agent did.

Diffs you can read, tool cards you can audit, permission flows you can follow, history that came from a file on disk. No invented abstractions. No hidden state. If something is on screen, there's a line of code you can find that put it there. That's the bar.

Written & maintained by Lionel Farr and Owen Farr. Open-source — contributors, issues, and PRs welcome.

[ 07 ] · grab a build

Signed, notarized, and real.

Overcli is in beta, but the macOS builds are now Developer-ID signed and notarized by Apple — they open on a double-click, no right-click ceremony. Windows isn't code-signed yet, so SmartScreen may still flag it. Either way the source is right there; build it yourself.

⌘ macOS arm64 + x64	.dmg · .zip	~ 120 MB	[download]
⊞ Windows x64 + arm64	NSIS installer	~ 110 MB	[download]
🐧 Linux x64 + arm64	.AppImage · .deb	~ 130 MB	[download]