Building Workflows Guide
Build a multi-step automation workflow for Claude Code. Chain skills, agents, and shell commands into a repeatable pipeline with defined inputs and outputs.
Let's build a real workflow — a code review pipeline that lints, tests, reviews, and summarizes. By the end you will understand the workflow structure, how steps connect through input/output contracts, and how to publish it to the marketplace.
What is a workflow?
A workflow is a multi-step automation sequence for Claude Code. Where a skill handles one task and an agent provides ongoing expertise, a workflow orchestrates a sequence of operations — each step feeding its output into the next.
| Concept | What it means |
|---|---|
| WORKFLOW.md | The definition file — steps, inputs, outputs, coordination logic |
| steps/ | Optional directory for step-specific instructions |
| vault.yaml | The manifest — category: workflow |
| /workflow-name | How buyers invoke the workflow |
| .claude/workflows/ | Where it lives on the buyer's machine |
The value of a workflow is in the sequence, not any single step. If your automation has one or two steps, a skill is simpler. If it has three or more steps with dependencies between them, a workflow is the right product type.
Workflow structure
Every workflow needs a WORKFLOW.md that defines three things: what goes in, what steps run, and what comes out.
File layout
my-workflow/
WORKFLOW.md # Required — steps, inputs, outputs
steps/ # Optional — detailed step instructions
lint.md
test.md
review.md
summarize.md
vault.yaml # Required — manifest
README.md # Optional — buyer-facing docs
WORKFLOW.md anatomy
A workflow definition has four sections:
- Inputs — what the workflow needs to start
- Steps — the ordered sequence of operations
- Outputs — what the workflow produces
- Error handling — what happens when a step fails
Build a code review workflow
Let's build a pipeline that runs four steps on any codebase: lint, test, review, and summarize.
Step 1 — Scaffold
$ myclaude init code-review-pipeline --category workflow
$ cd code-review-pipeline
You now have WORKFLOW.md and vault.yaml.
Step 2 — Write WORKFLOW.md
Replace the scaffolded content with a complete workflow definition:
# Code Review Pipeline
A four-step code review automation: lint, test, review, summarize.
## Inputs
- **target**: Path to the file or directory to review. Required.
- **language**: Programming language hint. Optional — auto-detected if omitted.
- **strict**: Boolean. When true, the pipeline fails on any lint warning. Default: false.
## Steps
### Step 1: Lint
Run static analysis on the target path.
- **Action**: Execute the project's configured linter (eslint, ruff, clippy, etc.)
based on project config files. If no linter config is found, use language defaults.
- **Input**: `target` path from workflow inputs
- **Output**: List of lint findings with file, line, severity, and message.
- **On failure**: If `strict` is true, halt the pipeline and report lint errors.
If `strict` is false, collect findings and continue.
### Step 2: Test
Run the project's test suite scoped to the target.
- **Action**: Detect and run the test framework (jest, pytest, cargo test, etc.).
Scope to tests related to the target path when possible.
- **Input**: `target` path from workflow inputs
- **Output**: Test results — passed count, failed count, and failure details.
- **On failure**: Collect failure details. Do not halt — the review step
needs to see test failures.
### Step 3: Review
Perform a code review incorporating lint and test results.
- **Action**: Review the target code for security issues, performance problems,
and correctness bugs. Incorporate findings from Step 1 (lint) and Step 2 (tests)
into the review — do not re-flag issues already caught by the linter.
- **Input**: `target` path + Step 1 output + Step 2 output
- **Output**: Review findings, each with severity (CRITICAL/HIGH/MEDIUM/LOW),
location, description, and suggested fix.
- **On failure**: Report what could not be reviewed and why.
### Step 4: Summarize
Produce a single summary of the entire pipeline run.
- **Action**: Aggregate all outputs from Steps 1-3 into a structured summary.
Lead with the most critical findings. Include a pass/fail verdict.
- **Input**: Step 1 output + Step 2 output + Step 3 output
- **Output**: Structured summary with sections for lint, tests, review, and verdict.
## Outputs
The workflow produces a single summary document containing:
1. **Verdict**: PASS, WARN, or FAIL
2. **Lint summary**: N findings (X errors, Y warnings)
3. **Test summary**: N passed, M failed
4. **Review findings**: Ordered by severity
5. **Recommended actions**: Prioritized list of what to fix first
## Error Handling
- If a step cannot execute (e.g., no test framework found), skip it and note
the skip in the summary. Do not halt the pipeline.
- If the target path does not exist, halt immediately with a clear error.
- Each step is isolated — a failure in Step 1 does not prevent Step 2 from running.
That is a complete, publishable workflow definition. Each step has explicit inputs, outputs, and failure behavior. The steps form a dependency chain: Step 3 reads output from Steps 1 and 2, and Step 4 reads output from all three.
Step 3 — Add step files (optional)
For complex workflows, you can break step instructions into separate files under steps/. This keeps WORKFLOW.md focused on the sequence while each step file provides detailed behavior.
Create steps/review.md:
## Review Step — Detailed Instructions
You are performing a code review as part of an automated pipeline. You have
access to lint findings and test results from previous steps.
### Priority order
1. **Security**: Authentication bypasses, injection vectors, data exposure
2. **Correctness**: Logic errors, race conditions, edge cases
3. **Performance**: O(n^2) patterns, unnecessary allocations, N+1 queries
### What to skip
- Style issues already caught by the linter (do not duplicate)
- Test coverage suggestions (out of scope for this step)
- Refactoring opportunities that do not fix a concrete problem
### Output format
For each finding:

[SEVERITY] file:line
Issue: one sentence
Fix: concrete suggestion or code snippet
Maximum 15 findings per file. If more exist, report the top 15 by severity
and note: "N additional findings omitted."
WORKFLOW.md references these step files implicitly. When Claude processes the workflow, it reads both the main definition and any step-specific files.
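The 15-finding cap described in the step file is easy to express precisely. A minimal sketch in Python, assuming a hypothetical finding shape with a `severity` field:

```python
# Keep the top N findings by severity and note how many were omitted,
# mirroring the review step's truncation rule. Finding shape is assumed.

SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

def cap_findings(findings, limit=15):
    ranked = sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])
    kept = ranked[:limit]
    omitted = len(ranked) - len(kept)
    note = f"{omitted} additional findings omitted." if omitted else None
    return kept, note
```

Sorting before truncating guarantees the omitted findings are always the lowest-severity ones.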
Step 4 — Configure vault.yaml
name: code-review-pipeline
version: "1.0.0"
category: workflow
description: "Four-step code review: lint, test, review, and summarize — with structured verdict output."
author: your-username
license: MIT
price: 0
tags: [code-review, automation, pipeline, ci]
Keep description under 160 characters. The category: workflow value tells the CLI to validate workflow-specific structure and install to .claude/workflows/.
See vault.yaml Specification for all available fields.
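Before publishing, the two manifest rules above can be checked with a quick script. This is an illustrative pre-flight check, not the CLI's actual validator:

```python
# Sanity-check a parsed vault.yaml dict against two rules from this guide:
# category must be "workflow" and description must stay under 160 chars.

def check_manifest(manifest):
    errors = []
    if manifest.get("category") != "workflow":
        errors.append("category must be 'workflow'")
    if len(manifest.get("description", "")) >= 160:
        errors.append("description must be under 160 characters")
    return errors
```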
Step 5 — Test locally
Install the workflow in your own environment:
$ myclaude install --local .
Invoke it in a Claude Code session:
/code-review-pipeline
Review src/api/handlers.ts
Test these scenarios:
- Does each step produce output? Check that lint, test, review, and summary sections all appear.
- Do steps pass data forward? The review step should reference lint findings. The summary should aggregate all steps.
- Does error handling work? Point it at a directory with no test framework. Step 2 should skip gracefully, not crash the pipeline.
- Does the strict flag work? Run with strict: true on a file with lint warnings. The pipeline should halt after Step 1.
- Is the summary useful? The verdict should clearly say PASS, WARN, or FAIL with enough context to act on.
Iterate on WORKFLOW.md until the pipeline produces output you would trust in a real review process.
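The verdict these test scenarios exercise can be sketched as a small decision function. The exact thresholds here are assumptions for illustration; your WORKFLOW.md defines the real rules:

```python
# Hypothetical verdict rule: FAIL on test failures, lint errors, or
# critical/high review findings; WARN on warnings or lower-severity
# findings; PASS otherwise.

def verdict(lint_errors, lint_warnings, tests_failed, review_severities):
    if tests_failed > 0 or lint_errors > 0:
        return "FAIL"
    if any(s in ("CRITICAL", "HIGH") for s in review_severities):
        return "FAIL"
    if lint_warnings > 0 or review_severities:
        return "WARN"
    return "PASS"
```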
Step 6 — Publish
$ myclaude validate && myclaude publish
Validating vault.yaml... OK
Scanning content... OK
Uploading files... OK
Creating listing... OK
Published: myclaude.sh/p/your-username-code-review-pipeline
Buyers install with:
$ myclaude install @your-username/code-review-pipeline
Best practices
Idempotency
Every step should produce the same output given the same input, regardless of how many times it runs. Avoid steps that modify the filesystem, create branches, or push commits unless the workflow's explicit purpose is deployment.
Step isolation
Each step should be independent enough that a failure in one does not corrupt the next. Pass data between steps through defined outputs, not side effects. If Step 1 writes to a temp file that Step 2 reads, that is a coupling that will break.
Error handling over error prevention
Do not try to prevent every possible failure. Instead, define what happens when each step fails. A workflow that gracefully handles missing test frameworks is more useful than one that requires a specific test setup.
Input validation
Validate inputs at the start of the workflow, not inside individual steps. If target does not exist, fail immediately with a clear message rather than letting Step 1 discover the problem.
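A fail-fast validation pass looks like this in sketch form. Illustrative only; the parameter names mirror the workflow's declared inputs:

```python
import os

# Validate all workflow inputs up front, before any step runs.
# A missing target halts immediately with a clear error, so Step 1
# never has to discover the problem itself.

def validate_inputs(target, strict=False):
    if not os.path.exists(target):
        raise ValueError(f"target path does not exist: {target}")
    if not isinstance(strict, bool):
        raise ValueError("strict must be a boolean")
    return {"target": target, "strict": strict}
```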
Keep steps focused
Each step should do one thing. If a step is doing lint AND test, split it into two steps. Focused steps are easier to debug, easier to skip, and produce clearer output.
Output structure
Define a consistent output structure across all steps. When every step produces findings in the same format (severity, location, description, fix), the summary step can aggregate them uniformly.
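One way to picture this: when every step emits the same finding shape, aggregation is a single merge-and-sort. A sketch with an assumed shape (severity, location, description, fix):

```python
from dataclasses import dataclass

# A shared finding shape (field names are assumptions for illustration).
# Because lint, test, and review all emit this shape, the summary step
# can merge and rank findings uniformly.

@dataclass
class Finding:
    severity: str      # CRITICAL / HIGH / MEDIUM / LOW
    location: str      # file:line
    description: str
    fix: str

def aggregate(steps):
    # steps maps a step name to its list of findings.
    order = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}
    merged = [f for findings in steps.values() for f in findings]
    return sorted(merged, key=lambda f: order[f.severity])
```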
Workflow vs. other product types
| Question | Answer | Product type |
|---|---|---|
| Is it one task, invoked on demand? | Yes | Skill |
| Is it an ongoing advisory role? | Yes | Agent |
| Is it a multi-step sequence? | Yes | Workflow |
| Does it need multiple specialists? | Yes | Squad |
Workflows shine when the value is in the orchestration. A lint step alone is not worth selling. A lint-test-review-summarize pipeline that produces a structured verdict — that is a product.
Related pages
- vault.yaml Specification — complete field reference including workflow defaults
- Writing Skills Guide — if your workflow is simpler as a single skill
- Building Agents — agents as components within workflows
- Product Categories — all 9 product types compared
- Publishing Your First Product — the basics of publishing any product type