Multi-Agent Development with Claude Code - Part 1 - The Waterfall Agentic Workflow

Viktor Vasylkovskyi•April 20, 2026

At some point after getting an AI agent working reliably on my Raspberry Pi, the natural next question became: how much can I actually delegate? Individual tasks, sure. But what about a full feature set — architecture, multiple repos, several weeks of work — handed off in one go?

This is the first post in a three-part series on that experiment. The approach I'm testing is what I'm calling the waterfall agentic workflow. The idea is to front-load all the context — architecture, tech stack, product vision, roadmap — and let Claude Code handle everything from task generation to merged PRs, with me reviewing at the end rather than driving each step.

This post records my motivation and the experiment itself. Instead of writing the code myself and asking the agent for help along the way, I hand it a complete specification and let it run the whole implementation in one go, as a single large batch of work.

The idea

To give the agent a complete view of the product, I wrote the following artefacts:

Architecture — Repositories and System Design
Workflow — Development Workflow, how to validate each repository
Product Vision
Tech Stack — backend and frontend tech used
Roadmap — the detailed vision of the milestone to implement

The hypothesis: if the agent has all of this context upfront, it can generate tasks, create GitHub issues, and implement features without me driving each step.

What works

Given full context, we can use AI for:

1. Generating local tasks from context

The skills/roadmap-to-tasks/SKILL.md skill reads the roadmap and produces concrete task files. This worked well.

2. Creating GitHub issues from local tasks

Using heartbeat-prompts/roadmap-tasks-to-issues.md and skills/issue-writing/SKILL.md, the agent creates properly structured GitHub issues from those task files. Also worked well.

3. Running the main agent to execute tasks one by one

This is where it gets interesting — and where the real learnings came from.

Hitting permission issues

The agent kept stopping on permission prompts, and I had to approve each action manually. Running claude --dangerously-skip-permissions bypassed the prompts, and the agent started working through the tasks and creating PRs.

How the agent works under the hood

Under the hood, the agent spawns sub-agents that work in git worktrees. A worktree lets you check out multiple branches of the same repo simultaneously into separate directories. Each sub-agent gets its own working copy of the codebase on its own branch, so they can work in parallel without stepping on each other. The worktrees appear in the git status output of the main checkout:

  (use "git restore --staged <file>..." to unstage)
        new file:   .claude/worktrees/issue-69-validate-arm64
        new file:   .claude/worktrees/issue-70-keyboard-navigation
        new file:   .claude/worktrees/issue-71-help-command
        new file:   .claude/worktrees/issue-72-device-type
        new file:   .claude/worktrees/issue-73-ssh-config
        new file:   .claude/worktrees/issue-74-scripts-folder
        new file:   .claude/worktrees/issue-75-download-scripts
        new file:   .claude/worktrees/issue-76-prerequisites
        new file:   .claude/worktrees/issue-77-docker

Each issue gets its own worktree and its own sub-agent.
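
As a rough illustration of that layout, here is a minimal Python sketch that builds the git worktree add command for each issue. The directory names come from the status output above. The branch naming and the base branch main are assumptions, and this is not necessarily how Claude Code creates its worktrees internally.

```python
from pathlib import PurePosixPath


def worktree_command(issue_number: int, slug: str, base: str = "main") -> list[str]:
    """Build the `git worktree add` argv for one issue.

    The branch name and the `base` default are assumptions for illustration.
    """
    name = f"issue-{issue_number}-{slug}"
    path = PurePosixPath(".claude/worktrees") / name
    # -b creates a new branch called `name`, starting from `base`,
    # and checks it out into the directory `path`.
    return ["git", "worktree", "add", "-b", name, str(path), base]


# Two of the issues from the status output above.
issues = [(69, "validate-arm64"), (70, "keyboard-navigation")]
for number, slug in issues:
    print(" ".join(worktree_command(number, slug)))
```

Each printed command could be run from the repository root. Every worktree then has its own directory and branch while sharing the same underlying repository history.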

The worktree problem with single-agent sequential work

Worktrees are good for parallelising agent work. However, I found that for this project a single agent working sequentially produces better results, because one agent can see the big picture. Sub-agents seem to fail for the following reasons:

Isolated worktrees don't see related work
Sub-agents lack the full context; the ticket alone is not enough

Worktrees should therefore suit general bug fixing, where the changes are unrelated and can be parallelised, though I have yet to validate this.

Our scope is different, though. We have a full spec of the product and want to build it from the ground up. Here a single agent is preferable, which is what we ended up running.

The fix: have the skill force the agent to work alone. Instead of spawning parallel sub-agents for every issue, the skill instructs a single agent to work sequentially across all issues, maintaining the full picture throughout.
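
To make the difference concrete, here is a conceptual Python sketch of the sequential approach. It is not the skill itself (the skill is a set of written instructions) and not Claude Code's implementation. The names Issue, run_sequentially, and the implement callback are hypothetical. The point it shows is that each issue's prompt carries the full spec plus a summary of everything already completed, which an isolated sub-agent would not see.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Issue:
    number: int
    title: str


def build_prompt(spec: str, history: list[str], issue: Issue) -> str:
    """Combine the full spec, the work done so far, and the next issue."""
    done = "\n".join(history) or "(nothing yet)"
    return (
        f"{spec}\n\n"
        f"Completed so far:\n{done}\n\n"
        f"Now implement #{issue.number}: {issue.title}"
    )


def run_sequentially(
    issues: list[Issue], spec: str, implement: Callable[[str], str]
) -> list[str]:
    """Work through the issues one at a time, in issue-number order.

    `implement` stands in for the agent: it takes a prompt and returns
    a short summary of what was done.
    """
    history: list[str] = []
    for issue in sorted(issues, key=lambda i: i.number):
        summary = implement(build_prompt(spec, history, issue))
        # The summary becomes part of the context for every later issue.
        history.append(f"#{issue.number} {issue.title}: {summary}")
    return history


# Stub agent for demonstration: it only reports how much context it saw.
def stub_agent(prompt: str) -> str:
    return f"saw {len(prompt.splitlines())} lines of context"


for line in run_sequentially(
    [Issue(70, "Keyboard navigation"), Issue(69, "Validate arm64")],
    spec="Full product spec goes here",
    implement=stub_agent,
):
    print(line)
```

The trade-off is the one described above: nothing runs in parallel, but later issues are implemented with knowledge of earlier ones.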

The agent makes mistakes

The agent does make mistakes. It implements the code and opens PRs as expected, but the resulting behavior doesn't always match what I intended, so I still have to step in and correct it.

This raised a question: what if instead of just a writer, we also had a reviewer? An agent team with a reviewer and a worker? That's what the next post covers.

Closing the loop with skills

After observing bugs in the implementation, I created a skill that converts a bug into an actionable task, and reorganised the skills around it:

.claude/skills/bug-creator/SKILL.md — creates a bug ticket from a short description
.claude/skills/feature-roadmap-executor/SKILL.md — handles both bugs and features (renamed from the roadmap-only version)
.claude/skills/feature-doc-creator/SKILL.md — creates feature docs

The loop this enables:

Define features to implement with .claude/skills/feature-doc-creator/SKILL.md
Implement them with .claude/skills/implementation-orchestrator/SKILL.md
Create bug tickets with .claude/skills/bug-creator/SKILL.md
Implement them with .claude/skills/implementation-orchestrator/SKILL.md

I also moved everything under .claude in the repository so that the skills are available as slash commands inside Claude Code.

What we learned from this first experiment

Full context upfront genuinely works — the agent can generate meaningful tasks from architecture docs and a roadmap
Worktrees are the right isolation mechanism for parallel independent work, but not for a sequential full-product build where the agent needs to see everything
Sub-agents without enough context fail in ways that are hard to debug — the ticket alone isn't enough
A single sequential agent with the full spec produces better results than parallel agents with fragmented context
Lint failures happen when the conventions doc doesn't explicitly say "run the linter and fix all errors before committing" — the agent follows instructions literally, not by inference
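
One way to make the lint rule mechanical, rather than relying only on the wording of the conventions doc, is a small gate that refuses to proceed unless the linter exits cleanly. This is a minimal sketch and not part of the original setup. The function name lint_gate is hypothetical, and the command below is a stand-in that simply exits with status 0; substitute the project's real lint command.

```python
import subprocess
import sys


def lint_gate(lint_cmd: list[str]) -> bool:
    """Run the lint command and return True only if it exits with status 0."""
    result = subprocess.run(lint_cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the linter's own output so the errors can be fixed.
        print(result.stdout + result.stderr, file=sys.stderr)
    return result.returncode == 0


# Stand-in for a real linter: a Python process that exits successfully.
passing_linter = [sys.executable, "-c", "raise SystemExit(0)"]

if lint_gate(passing_linter):
    print("commit allowed")
else:
    print("fix lint errors first")
```

A check like this could run before each commit, so a missed instruction shows up as a failed gate instead of a lint failure in the PR.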

The next post covers the writer/reviewer sub-agent loop — running a full orchestrator that spawns a writer, then a reviewer, then loops until the feature passes review.
