AI orchestrating AI: how we refactored a complex system in 3.5 hours using agent coordination

Connor Turland · October 28, 2025

You need to refactor your webhook system. Three different tunnel implementations are competing for control. Deprecated code has tendrils everywhere. Simple changes require touching six files. Your team is afraid to modify anything.

The traditional answer: assign it to your senior engineer, hope they don't quit halfway through, and wait 2-3 weeks.

What we did instead: We gave it to an AI orchestrator. Not to write the code - to coordinate six specialized AI agents, each handling one piece of the refactor in parallel. Total time: 3.5 hours. Zero human intervention until final review.

This is orchestration. Not the enterprise buzzword - the actual capability of AI agents coordinating other AI agents to accomplish complex work.

The problem: a refactor too risky to attempt

Here's the situation. Our webhook system had evolved organically over 18 months:

  • Three competing tunnel implementations (ngrok, Cloudflare, custom proxy worker)
  • 5,554 lines of deprecated code (ndjson-client, proxy-worker) with references scattered across 59 files
  • Hardcoded request handlers mixed with server infrastructure
  • Tightly coupled verification logic spanning multiple packages

The ask: refactor this completely. Remove deprecated code. Convert to modern Fastify framework. Separate concerns. Don't break anything.

The problem: This wasn't six isolated tasks. It was six interdependent changes where order mattered, dependencies tangled, and one mistake could cascade across the entire system.

Traditional approach: Assign to senior engineer. Timeline: 2-3 weeks. Risk: High. Cost: One burned-out engineer.

What we actually did: AI orchestrating AI

We gave this to Cyrus acting as an orchestrator agent. Not to write all the code - to break down the work, delegate to specialized agents, verify their output, and coordinate the integration.

Here's what happened:

Hour 0:00 - Decomposition

The orchestrator agent analyzed CYPACK-248 and decomposed it into 6 atomic, sequential sub-issues:

  1. CYPACK-249: Remove ndjson-client and proxy-worker (5,554 lines to delete)
  2. CYPACK-250: Convert SharedApplicationServer to Fastify
  3. CYPACK-251: Create config-updater package, extract handlers
  4. CYPACK-252: Simplify CloudflareTunnelClient
  5. CYPACK-253: Rename linear-webhook-client to linear-event-transport
  6. CYPACK-254: Integrate all refactored components in EdgeWorker

Key insight: The orchestrator identified dependencies automatically. Each sub-issue built on the previous one. Sequential execution was the only safe path.

Hour 0:05 - Delegation begins

The orchestrator created the first sub-issue (CYPACK-249) in Linear with detailed acceptance criteria:

- Delete packages/ndjson-client/ directory completely
- Delete apps/proxy-worker/ directory completely
- Remove all NdjsonClient imports from EdgeWorker
- Update test mocks to remove NdjsonClient references
- Verify: 99 tests passing, TypeScript clean, build succeeds

Then it triggered a child agent session to handle this specific task.

Hour 0:45 - First child completes

Child Agent (CYPACK-249) working in isolated git worktree:

  • Deleted 5,554 lines of deprecated code
  • Removed ndjson-client package (27 files)
  • Removed proxy-worker app (17 files)
  • Updated EdgeWorker to use only LinearWebhookClient
  • Fixed all test mocks (7 test files)
  • Reported back: "99 tests passing, TypeScript clean, build succeeds"

Orchestrator response:

  1. Pulled child's worktree
  2. Verified independently: Ran tests, typecheck, build
  3. Merged changes into orchestrator branch
  4. Triggered next child: Created CYPACK-250 agent session
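The four steps above are the orchestrator's core loop, repeated for every phase. A minimal sketch of that loop in TypeScript (the `Phase` interface and its `pull`/`verify`/`merge` hooks are illustrative placeholders, not Cyrus internals):

```typescript
// The orchestrator's per-phase loop: pull the child's work, verify it
// independently, merge only on success, then move to the next phase.
// All three hooks are hypothetical placeholders for illustration.
interface Phase {
  issueId: string;
  pull: () => string;                // returns the child's worktree path
  verify: (path: string) => boolean; // tests + typecheck + build
  merge: (path: string) => void;
}

function runPhases(phases: Phase[]): string[] {
  const merged: string[] = [];
  for (const phase of phases) {
    const path = phase.pull();
    if (!phase.verify(path)) {
      // Stop the pipeline: a bad merge here would cascade into every
      // later phase, so nothing downstream gets triggered.
      break;
    }
    phase.merge(path);
    merged.push(phase.issueId);
  }
  return merged;
}
```

The key design choice is that a failed verification halts the whole chain rather than continuing with the remaining phases.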

Hour 1:15 - Fastify conversion complete

Child Agent (CYPACK-250) working in new worktree:

  • Converted SharedApplicationServer from http.createServer to Fastify
  • Removed 892 lines of pathname-based routing (if statements replaced with Fastify routes)
  • Added module registration system
  • Reported back: "99 tests passing, ready for merge"

Orchestrator: Verified, merged, and triggered CYPACK-251

Hour 1:50 - Config extraction complete

Child Agent (CYPACK-251) working in third worktree:

  • Created new config-updater package
  • Extracted configuration handlers from CloudflareTunnelClient
  • Net: 467 additions, 208 deletions
  • Reported back: Complete

Orchestrator: Verified, merged, and triggered CYPACK-252

Hour 2:20 - Tunnel simplification complete

Child Agent (CYPACK-252) working in fourth worktree:

  • Simplified CloudflareTunnelClient to only handle tunnels
  • Removed all HTTP server and handler logic (54 lines net reduction)
  • Reported back: "62 tests passing, TypeScript clean"

Orchestrator: Verified, merged, and triggered CYPACK-253

Hour 2:45 - Naming refactor hits snag

Child Agent (CYPACK-253) working in fifth worktree:

  • Renamed linear-webhook-client to linear-event-transport
  • Refactored with Fastify integration
  • Reported back: "Implementation complete"

Orchestrator verification: 41 tests failing

The child agent renamed the package but forgot to update test imports. Tests still referenced the old LinearWebhookClient class.

Orchestrator action: Posted detailed feedback to CYPACK-253:

  • Verification failed. 41 tests failing across 6 test files
  • Root cause: Test files still import LinearWebhookClient but this class no longer exists
  • Required: Update imports to LinearEventTransport and fix mock implementations
  • Must fix 6 test files: EdgeWorker.dynamic-tools.test.ts, EdgeWorker.feedback-delivery.test.ts, and 4 others
  • Ensure all 148 tests pass before marking complete

Child Agent (CYPACK-253) - 20 minutes later:

  • Fixed all test imports
  • Updated mock implementations
  • Reported back: "All 189 tests passing"

Orchestrator: Verified, merged, and triggered CYPACK-254

Hour 3:30 - Final integration complete

Child Agent (CYPACK-254) working in sixth worktree:

  • Integrated all refactored components in EdgeWorker
  • Replaced multiple LinearEventTransport instances with single shared instance
  • Added Cloudflare tunnel support to SharedApplicationServer
  • Net: 222 lines removed
  • Reported back: "99 tests passing, complete"

Orchestrator final verification:

  • All 189 tests passing across entire codebase
  • TypeScript compilation successful
  • Build succeeds for all packages
  • Linting clean
  • Created pull request with complete changes

The real impact: orchestration vs. single-agent execution

Let's be specific about what orchestration bought us:

Time comparison

Single senior engineer (estimated):

  • Week 1: Understand dependencies, plan approach, start removing deprecated code
  • Week 2: Convert to Fastify, extract modules, hit integration issues
  • Week 3: Fix cascading test failures, debug mysterious breaks, finally get PR ready
  • Total: 15-20 days of focused work

Orchestrated AI agents (actual):

  • Hour 0-1: Decomposition + first deletion phase complete
  • Hour 1-2: Fastify conversion + config extraction complete
  • Hour 2-3: Tunnel simplification + naming refactor complete
  • Hour 3-4: Final integration + PR creation complete
  • Total: 3.5 hours wall-clock time, zero human intervention

The multiplier: 40-60x faster than human execution

Quality comparison

What human refactors typically produce:

  • Forgotten edge cases discovered in production
  • Test coverage gaps ("I'll add tests later")
  • Integration issues caught during code review
  • Documentation updates deferred

What orchestrated refactor produced:

  • Every sub-issue verified before integration (tests, typecheck, build)
  • Orchestrator caught child agent mistakes before merge (CYPACK-253 test failures)
  • Complete test coverage maintained throughout
  • Each phase independently verifiable

The difference: Quality gates at every step, not just at the end

Risk mitigation

Human refactor risks:

  • Knowledge in one person's head
  • Context loss if engineer switches focus
  • No intermediate rollback points
  • "Works on my machine" surprises

Orchestrated refactor safety:

  • Each sub-issue isolated in separate git worktree
  • Independent verification before any integration
  • Clear rollback points (revert to pre-CYPACK-251 state)
  • Reproducible process documented in Linear comment streams

The orchestration patterns: what made this work

Watching the orchestrator coordinate six agents revealed key patterns:

1. Atomic decomposition

The orchestrator didn't create vague sub-tasks like "refactor webhook system." It created precise, verifiable units:

Bad decomposition:

  • "Clean up the webhook code"
  • "Make it more modular"
  • "Improve the architecture"

Actual decomposition:

  • "Delete packages/ndjson-client/ and apps/proxy-worker/. Remove all imports. Verify 99 tests passing."
  • "Convert SharedApplicationServer constructor from http.createServer to Fastify. Verify build succeeds."
  • "Extract config handlers to new config-updater package. Verify TypeScript clean."

Each sub-issue had concrete acceptance criteria. No ambiguity. No interpretation needed.
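In code terms, an atomic sub-issue can be modeled as a small typed record whose acceptance criteria are machine-checkable commands rather than prose. A sketch of that shape (the `SubIssue` interface is illustrative, and the `pnpm` script names are assumptions about the repo, not confirmed details):

```typescript
// Illustrative model of an atomic sub-issue: every acceptance criterion
// is a command that exits 0 on success, so "done" is a machine-checkable
// fact rather than a judgment call.
interface SubIssue {
  id: string;
  title: string;
  dependsOn: string[];        // ids that must merge first
  acceptanceChecks: string[]; // commands that must all exit 0
}

const subIssues: SubIssue[] = [
  {
    id: "CYPACK-249",
    title: "Remove ndjson-client and proxy-worker",
    dependsOn: [],
    acceptanceChecks: ["pnpm test", "pnpm typecheck", "pnpm build"],
  },
  {
    id: "CYPACK-250",
    title: "Convert SharedApplicationServer to Fastify",
    dependsOn: ["CYPACK-249"],
    acceptanceChecks: ["pnpm test", "pnpm build"],
  },
];

// "Clean up the webhook code" fails this bar; every issue above passes it.
function isAtomic(issue: SubIssue): boolean {
  return issue.acceptanceChecks.length > 0;
}
```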

2. Sequential dependency management

The orchestrator recognized that order mattered:

  • Can't convert to Fastify until deprecated code is removed (CYPACK-249 must complete first)
  • Can't simplify CloudflareTunnelClient until handlers are extracted (CYPACK-251 → CYPACK-252)
  • Can't integrate EdgeWorker until all components are refactored (CYPACK-254 must be last)

Human instinct: Try to parallelize everything for speed. Orchestrator insight: Dependencies force sequence. Respect them.
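Mechanically, that ordering constraint is just a topological sort over the sub-issue dependency graph. A sketch using this refactor's issue ids (the chain shape matches the article; the sort function is an illustration, not how Cyrus is implemented):

```typescript
// Each entry maps a sub-issue id to the ids it depends on.
const deps: Record<string, string[]> = {
  "CYPACK-249": [],
  "CYPACK-250": ["CYPACK-249"],
  "CYPACK-251": ["CYPACK-250"],
  "CYPACK-252": ["CYPACK-251"],
  "CYPACK-253": ["CYPACK-252"],
  "CYPACK-254": ["CYPACK-253"],
};

// Depth-first topological sort; throws if the graph has a cycle.
function executionOrder(graph: Record<string, string[]>): string[] {
  const order: string[] = [];
  const done = new Set<string>();
  const visiting = new Set<string>();
  const visit = (id: string): void => {
    if (done.has(id)) return;
    if (visiting.has(id)) throw new Error(`dependency cycle at ${id}`);
    visiting.add(id);
    for (const dep of graph[id] ?? []) visit(dep);
    visiting.delete(id);
    done.add(id);
    order.push(id);
  };
  for (const id of Object.keys(graph)) visit(id);
  return order;
}
```

For a pure chain like this one, the only valid order the sort can produce is fully sequential, which is exactly the "dependencies force sequence" insight.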

3. Verification before integration

After each child agent completed work, the orchestrator:

  1. Pulled the child's git worktree
  2. Ran independent verification (didn't trust child's self-reported success)
  3. Only merged if verification passed
  4. Provided detailed feedback if verification failed (CYPACK-253 test failure example)

This prevented cascading failures. A bad merge in hour 1 would have broken everything in hours 2-3.
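A sketch of that independent verification gate, with the command runner injected so the policy is testable (the `pnpm` commands are assumptions about the repo's scripts, and the function itself is illustrative):

```typescript
import { execSync } from "node:child_process";

// Independent verification: the orchestrator re-runs the full gate
// itself rather than trusting the child's self-reported success.
const gate = ["pnpm test", "pnpm typecheck", "pnpm build"];

function verifyWorktree(
  worktreePath: string,
  checks: string[],
  run: (cmd: string, cwd: string) => void = (cmd, cwd) =>
    execSync(cmd, { cwd, stdio: "pipe" }),
): boolean {
  for (const cmd of checks) {
    try {
      run(cmd, worktreePath);
    } catch {
      return false; // any failing check blocks the merge
    }
  }
  return true;
}

// Merge only on a passing gate, e.g.:
// if (verifyWorktree("../worktrees/cypack-250", gate)) { /* merge */ }
```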

4. Feedback loops

When CYPACK-253 reported "Implementation complete" but tests were failing, the orchestrator didn't just reject it. It:

  • Identified the specific failure (41 tests, 6 files)
  • Diagnosed root cause (old class name in test imports)
  • Provided actionable fix list
  • Allowed child agent to correct and re-submit

Result: Child agent learned, fixed the issue, and delivered working code. No human escalation needed.
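That reject-diagnose-retry pattern can be sketched as a bounded loop in which verification failures are turned into structured feedback for the next attempt (the `ChildAgent` interface is a hypothetical stand-in, not a Cyrus API):

```typescript
// Feedback loop: on verification failure the orchestrator posts a
// diagnosis and lets the same child retry, up to a retry cap.
// The agent interface here is hypothetical, for illustration only.
interface ChildAgent {
  attempt(feedback?: string): { ok: boolean; failures: string[] };
}

function runWithFeedback(agent: ChildAgent, maxAttempts = 3): boolean {
  let feedback: string | undefined;
  for (let i = 0; i < maxAttempts; i++) {
    const result = agent.attempt(feedback);
    if (result.ok) return true;
    // Turn raw failures into an actionable fix list, the way the
    // orchestrator did for CYPACK-253's 41 failing tests.
    feedback =
      `Verification failed (${result.failures.length} failures). ` +
      `Fix: ${result.failures.join("; ")} and re-run the full suite.`;
  }
  return false; // escalate to a human after repeated failures
}
```

Bounding the retries matters: it is what separates "let the child self-correct" from an unsupervised loop that never escalates.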

5. Isolated workspaces

Each child agent worked in a separate git worktree. Benefits:

  • No merge conflicts between parallel work
  • Failed experiments don't corrupt main branch
  • Orchestrator can pull completed work when ready
  • Clean rollback if a phase fails catastrophically
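The isolation mechanism is plain `git worktree`: each child gets an independent checkout that shares one object store with the main repo. A sketch of the commands involved (the `../worktrees/` layout and `agent/` branch prefix are illustrative conventions, not what Cyrus necessarily uses):

```typescript
// Build the git commands that give each child its own worktree.
// A worktree is an independent checkout sharing one object store,
// so a failed experiment can never corrupt the main branch.
function worktreeAdd(issueId: string): string {
  const slug = issueId.toLowerCase();
  return `git worktree add ../worktrees/${slug} -b agent/${slug}`;
}

function worktreeRemove(issueId: string): string {
  // --force discards a failed experiment along with its checkout
  return `git worktree remove --force ../worktrees/${issueId.toLowerCase()}`;
}

// The orchestrator later pulls finished work with an ordinary merge:
//   git merge agent/cypack-249
```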

When to use orchestration (and when not to)

Orchestration isn't for every task. Here's when it matters:

Use orchestration when:

1. The task has clear sub-problems

  • Refactoring a monolithic service into modules
  • Migrating from one framework to another
  • Extracting shared code into libraries
  • Updating dependencies across multiple packages

Example: CYPACK-248 had six obvious phases. If your task doesn't decompose naturally, orchestration adds overhead.

2. Dependencies between sub-tasks exist

  • Order matters (can't do B until A completes)
  • Integration points need coordination
  • Verification checkpoints prevent cascading failures

Example: Converting to Fastify required removing deprecated code first. Random ordering would have failed.

3. Each sub-task can be independently verified

  • Tests can prove correctness
  • Build/compile/typecheck validates integration
  • Success criteria are objective, not subjective

Example: "99 tests passing" is binary. "Looks good to me" is not.

4. The cost of human coordination exceeds setup cost

  • Task would take multiple days of human focus
  • Requires specialized knowledge in different areas
  • Risk of human error is high
  • You want documentation of the process

Example: This refactor would have taken 2-3 weeks of senior engineer time. Orchestration took 3.5 hours of wall-clock time.

Don't use orchestration when:

The task is simple and linear

  • Single-file changes
  • Obvious implementation path
  • No dependencies to manage
  • Takes less than 1 hour of human time

Example: "Add a new API endpoint" doesn't need orchestration. Just do it.

The requirements are ambiguous

  • Unclear acceptance criteria
  • Design decisions need human judgment
  • Exploratory work ("figure out why X is slow")

Example: "Make the app feel faster" is not orchestratable. "Reduce initial page load from 4s to under 2s" is.

You need creative problem-solving

  • Novel algorithms
  • Complex business logic with edge cases
  • UX design requiring aesthetic judgment

Example: "Design a beautiful checkout flow" needs human creativity. "Implement this checkout flow spec" can be orchestrated.

How to use orchestration with Cyrus

This capability is built into Cyrus. Here's how to trigger it:

Use the "Orchestrator" label

When creating an issue in Linear that needs orchestration:

  1. Write a detailed description of the refactor (like CYPACK-248)
  2. Add the "Orchestrator" label to the issue
  3. Assign to Cyrus
  4. Cyrus will automatically analyze the requirements, decompose the work into sub-issues, delegate each one to a child agent session, verify and merge the results, and open a pull request

What you'll see

Once orchestration begins, watch the Linear comment stream:

Parent issue (orchestrator):

  • Decomposition plan with sub-issues
  • Progress updates as each child completes
  • Verification results (tests, typecheck, build)
  • Merge confirmations
  • Final summary with PR link

Child issues (sub-tasks):

  • Detailed acceptance criteria
  • Agent session activity
  • Implementation summaries
  • Verification results
  • Completion notifications

Best practices

Write clear parent issue descriptions:

  • State the goal clearly
  • List known components that need changes
  • Mention dependencies if you know them
  • Set acceptance criteria for the overall refactor

Let the orchestrator plan:

  • Don't try to create sub-issues yourself
  • The orchestrator will analyze and decompose better than manual planning
  • Trust the dependency analysis

Review the plan before execution:

  • After decomposition, the orchestrator will post the plan
  • Review the sub-issues it created
  • Provide feedback if the decomposition looks wrong
  • Give approval to proceed (or it will proceed automatically)

The future: AI agents coordinating AI agents

What you just read isn't a vision of the future. It's happening now. Cyrus orchestrated this refactor on October 28, 2025. The Linear issues exist. The comment streams are real. The PR merged successfully.

This changes everything.

For engineering leaders:

  • Complex refactors that blocked your roadmap for months can complete in hours
  • Technical debt becomes manageable (pay it down systematically, not heroically)
  • Risk mitigation is built in (verification at every step, not just at the end)
  • Documentation is automatic (every orchestration leaves a complete audit trail)

For senior engineers:

  • Stop being the bottleneck for complex migrations
  • Delegate the tedious parts, focus on architecture decisions
  • No more "I'm the only one who can touch this code"
  • Knowledge transfer happens through documented orchestration patterns

For teams:

  • Velocity becomes predictable (orchestration times are consistent)
  • Onboarding accelerates (new engineers see how complex work gets coordinated)
  • Code quality improves (every phase independently verified)
  • Burnout decreases (AI handles the grunt work of large refactors)

The bottom line

Orchestration isn't about making code prettier. It's about making impossible refactors possible.

The old model: Complex refactor, assign to senior engineer, hope they don't quit, wait 3 weeks, pray it works

The new model: Complex refactor, assign to orchestrator agent, decompose automatically, verify continuously, complete in hours

The difference:

  • 3.5 hours vs 3 weeks (40-60x faster)
  • Zero human hours vs 120 human hours
  • Quality gates at every step vs one big review at the end
  • Complete audit trail vs knowledge in one person's head
  • Reproducible process vs tribal knowledge

The teams that adopt orchestration first will ship faster. The teams that wait will wonder how they're getting lapped by smaller competitors.


Questions about orchestration or want to discuss your specific refactoring challenges? Connect with us @CyrusAgent or book a 20-minute technical discussion with our team.
