Epic Lifecycle

Every epic in OSTwin is driven by two fundamental roles: the Worker and the Evaluator. Their interaction forms a closed loop that ensures quality: the Worker produces, the Evaluator inspects, and the cycle repeats until the work passes or the system escalates. This page explains how those roles interact through the lifecycle states and what happens when the loop can’t close on its own.

The Workshop Analogy

Think of an OSTwin epic as a craft workshop:

  • The Worker is the artisan at the bench, shaping, assembling, and building the deliverable. In different epics the Worker might be an engineer writing code, a researcher gathering findings, or a writer drafting documentation. The role changes, but the function is the same: produce.

  • The Evaluator is the master inspector, examining the work against acceptance criteria, finding defects, and deciding whether it’s ready to ship. In different epics the Evaluator might be a QA tester, a security auditor, or an architect reviewing design. The role changes, but the function is the same: verify.

In a real workshop, the artisan doesn’t self-certify their own work. They hand it to the inspector, who either stamps it passed or sends it back with a list of defects. If the artisan can’t fix the defects after several attempts, the workshop master steps in for triage, deciding whether to provide guidance, restructure the work, or scrap it entirely.

OSTwin follows exactly this pattern. The lifecycle is the protocol that governs every handoff between Worker and Evaluator.

Core State Flow

The essential loop every epic follows:

   Worker        Evaluator           Outcome
┌──────────┐    ┌──────────┐    ┌───────────────┐
│developing│───►│  review  │───►│ passed        │ ✓ Ship it
│          │    │          │───►│ failed        │──► back to developing
│          │    │          │    │ failed-final  │──► triage
└──────────┘    └──────────┘    └───────────────┘

Expanding this into the full state diagram:

      ┌─────────────┐
      │  planning   │
      └──────┬──────┘
             │
      ┌──────▼──────┐
      │   planned   │
      └──────┬──────┘
             │
      ┌──────▼──────┐
      │    ready    │
      └──────┬──────┘
             │
      ┌──────▼──────┐
┌─────│ developing  │◄────────────┐
│     └──────┬──────┘             │
│            │                    │
│     ┌──────▼──────┐             │
│     │   review    │             │
│     └──────┬──────┘             │
│            │                    │
│      ┌─────┴─────┐              │
│      │           │              │
│ ┌────▼───┐  ┌────▼────┐         │
│ │ passed │  │ failed  ├─────────┘
│ └────────┘  └────┬────┘  (retry)
│                  │
│           ┌──────▼──────┐
│           │   fixing    │──► review
│           └─────────────┘
│
│     ┌──────────────┐
└────►│   blocked    │
      └──────┬───────┘
             │
      ┌──────▼──────┐
      │failed-final │
      └─────────────┘

States and the Roles That Own Them

Each lifecycle state belongs to either the Worker or the Evaluator. Understanding who owns what makes the flow intuitive:

Worker-Owned States

| State | Who | Description |
| --- | --- | --- |
| developing | Worker | The Worker is actively building: writing code, running tests, producing artifacts |
| fixing | Worker | The Worker is addressing specific feedback from the Evaluator’s review |

Evaluator-Owned States

| State | Who | Description |
| --- | --- | --- |
| review | Evaluator | The Evaluator is inspecting the Worker’s deliverables against acceptance criteria |

System-Owned States

| State | Terminal | Description |
| --- | --- | --- |
| planning | No | Epic is being decomposed into tasks by the manager |
| planned | No | Tasks are defined, waiting for dependency gates to clear |
| ready | No | All dependencies satisfied, room can be created |
| passed | Yes | Evaluator approved: all acceptance criteria met, work is done |
| failed | No | Evaluator rejected: specific issues documented, retry is possible |
| failed-final | Yes | Max retries exhausted: the Worker-Evaluator loop cannot close |
| blocked | No | Waiting on external dependency or human input |
| timeout | No | Execution exceeded the configured time limit |
| escalated | No | Sent to architect for design review |
| redesign | No | Architect determined a design change is needed |
| cancelled | Yes | Manually cancelled by the manager or user |

The Worker-Evaluator Handoff

The core cycle is a handoff protocol between two roles:

developing → review

The Worker finishes a task and posts a done message. This triggers the transition to review, where control passes from Worker to Evaluator. The Worker cannot self-approve; the handoff is mandatory.

{"type":"done","from":"worker"} → developing → review

review → passed or failed

The Evaluator examines the deliverables against the Definition of Done and Acceptance Criteria:

{"type":"pass","from":"evaluator"} → review → passed
{"type":"fail","from":"evaluator"} → review → failed

If the Evaluator passes the work, the epic reaches its terminal state. If the Evaluator fails it, the loop continues:

failed → fixing → review

The Worker receives the Evaluator’s feedback and enters fixing. Once fixes are complete, the Worker hands off to the Evaluator again:

failed → fixing → review → passed? or → failed again?

Each failure counts toward the retry limit. When retries are exhausted, the system enters triage.
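The retry accounting described above can be sketched as a small loop. This is a minimal illustration rather than OSTwin's implementation: `run_epic` and the `review` callback are hypothetical names, and only the state names and the default of three rejection cycles are taken from this page.

```python
from typing import Callable, List


def run_epic(review: Callable[[int], bool], max_retries: int = 3) -> List[str]:
    """Walk the Worker-Evaluator loop and return the states visited.

    `review(attempt)` is a hypothetical stand-in for the Evaluator: it
    returns True to pass the given attempt and False to reject it.
    """
    states = ["developing"]
    failures = 0
    while True:
        states.append("review")            # Worker hands off to the Evaluator
        if review(failures + 1):
            states.append("passed")        # terminal: work accepted
            return states
        states.append("failed")
        failures += 1                      # each rejection counts toward the limit
        if failures >= max_retries:
            states.append("failed-final")  # retries exhausted, triage follows
            return states
        states.append("fixing")            # Worker addresses the feedback
```

Calling `run_epic(lambda attempt: attempt == 2)` models an epic that is rejected once and passes on the second review; `run_epic(lambda attempt: False)` models the three-rejection path that ends in failed-final.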

Triage: When the Loop Can’t Close

In the workshop analogy, triage is what happens when the artisan has tried three times and the inspector still won’t sign off. The workshop master steps in.

Attempt 1: develop → review → FAIL
Attempt 2: fix → review → FAIL
Attempt 3: fix → review → FAIL
                           │
                           ▼
failed-final → manager notified → triage
                                    │
                                    ├─► FIX verdict: architect provides guidance, retry allowed
                                    ├─► REDESIGN verdict: epic restructured, loop restarts
                                    └─► REPLAN verdict: plan modified, DAG rebuilt

After triage, the manager receives one of three verdicts:

  • FIX: the implementation approach is sound, the Worker just needs specific guidance. The loop restarts with a clearer direction.
  • REDESIGN: the epic’s architecture needs changes before the Worker can succeed. The epic is restructured and the loop restarts from developing.
  • REPLAN: the epic itself is flawed (too large, wrong scope, missing dependency). The plan is modified and the DAG is rebuilt.

Retry Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| max_retries | 3 | Maximum Evaluator rejection cycles before triage |
| timeout_seconds | 900 | Max time per Worker attempt (15 minutes) |
| retry_delay_seconds | 0 | Delay between retry attempts |

Transition Mechanisms

State transitions are triggered by four mechanisms:

1. Agent Messages (Worker ↔ Evaluator Handoff)

The most common trigger. The Worker’s done message hands control to the Evaluator. The Evaluator’s pass or fail message determines the outcome.

{"type":"done","from":"worker"} → developing → review
{"type":"pass","from":"evaluator"} → review → passed
{"type":"fail","from":"evaluator"} → review → failed
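One way to picture this mechanism is a lookup keyed on message type, sender, and current state. The sketch below covers only the three handoffs listed above; `HANDOFFS` and `next_state` are hypothetical names, and the real message schema may carry more fields than `type` and `from`.

```python
import json
from typing import Optional

# (message type, sender, current state) -> next state.
# Only the three handoffs shown on this page are included.
HANDOFFS = {
    ("done", "worker", "developing"): "review",
    ("pass", "evaluator", "review"): "passed",
    ("fail", "evaluator", "review"): "failed",
}


def next_state(raw_message: str, current: str) -> Optional[str]:
    """Return the state a message moves the room to, or None if it does not apply."""
    msg = json.loads(raw_message)
    return HANDOFFS.get((msg.get("type"), msg.get("from"), current))
```

Keying on the sender as well as the type means a `pass` posted by the Worker is simply ignored, which mirrors the rule that the Worker cannot self-approve.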

2. Manager Orchestration

The manager controls transitions that require cross-room awareness:

  • planning → planned (after task decomposition)
  • planned → ready (after dependency check)
  • failed → fixing (after retry decision)
  • failed → failed-final (after max retries, which triggers triage)

3. Timeout Enforcement

The system monitors execution time and forces transitions when limits are exceeded:

  • developing for > timeout_seconds → timeout
  • timeout → escalated (automatic if the manager does not act within 300 seconds)

4. Manual Override

Users can force state transitions through the dashboard or CLI.

lifecycle.json Format

Each war-room contains a lifecycle.json that defines its state machine:

{
  "states": [
    "planning", "planned", "ready", "developing",
    "review", "fixing", "passed", "failed",
    "failed-final", "blocked", "timeout",
    "escalated", "redesign", "cancelled"
  ],
  "initial": "planning",
  "terminal": ["passed", "failed-final", "cancelled"],
  "transitions": {
    "planning": ["planned", "cancelled"],
    "planned": ["ready", "blocked", "cancelled"],
    "ready": ["developing", "blocked", "cancelled"],
    "developing": ["review", "blocked", "timeout", "cancelled"],
    "review": ["passed", "failed", "blocked", "cancelled"],
    "failed": ["fixing", "failed-final", "escalated"],
    "fixing": ["review", "blocked", "timeout"],
    "timeout": ["escalated", "developing", "cancelled"],
    "escalated": ["redesign", "developing", "failed-final"],
    "redesign": ["developing", "cancelled"],
    "blocked": ["developing", "cancelled"]
  },
  "manager_only": ["passed", "failed-final", "cancelled"]
}
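A definition like this can be used to check whether a requested transition is legal. The following is a minimal sketch under stated assumptions: `can_transition` is a hypothetical helper, the embedded JSON is an excerpt of the file above, and the `manager_only` list is deliberately ignored because this page does not say how it is enforced.

```python
import json

# Excerpt of the lifecycle.json shown above (not the full file).
LIFECYCLE_JSON = """
{
  "terminal": ["passed", "failed-final", "cancelled"],
  "transitions": {
    "developing": ["review", "blocked", "timeout", "cancelled"],
    "review": ["passed", "failed", "blocked", "cancelled"],
    "failed": ["fixing", "failed-final", "escalated"],
    "fixing": ["review", "blocked", "timeout"]
  }
}
"""


def can_transition(lifecycle: dict, current: str, target: str) -> bool:
    """Return True if `target` is listed as reachable from `current`."""
    if current in lifecycle["terminal"]:
        return False  # terminal states have no outgoing transitions
    return target in lifecycle["transitions"].get(current, [])


lifecycle = json.loads(LIFECYCLE_JSON)
```

With this excerpt, `can_transition(lifecycle, "failed", "fixing")` is true, while `can_transition(lifecycle, "failed", "developing")` is false because developing is not listed under failed.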

Timeout Enforcement

The system tracks elapsed time for each non-terminal state:

Room entered "developing" at 2025-01-15T10:00:00Z
Timeout configured: 900 seconds
Current time: 2025-01-15T10:16:00Z
Elapsed: 960 seconds → TIMEOUT triggered

When a timeout fires:

  1. Room status transitions to timeout
  2. The manager is notified via the channel
  3. The manager decides whether to retry, escalate, or cancel
  4. If no action is taken within 300 seconds, auto-escalation occurs
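The elapsed-time arithmetic in the example above can be reproduced with the standard library. This is an illustrative sketch only: `parse_ts`, `elapsed_seconds`, and `timed_out` are hypothetical helpers, and the strict greater-than comparison follows the "developing for > timeout_seconds" rule stated earlier on this page.

```python
from datetime import datetime, timezone


def parse_ts(ts: str) -> datetime:
    """Parse a timestamp in the UTC format used on this page, e.g. 2025-01-15T10:00:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def elapsed_seconds(entered_at: str, now: str) -> int:
    """Whole seconds between entering a state and the current time."""
    return int((parse_ts(now) - parse_ts(entered_at)).total_seconds())


def timed_out(entered_at: str, now: str, timeout_seconds: int = 900) -> bool:
    """True once the room has spent more than `timeout_seconds` in its state."""
    return elapsed_seconds(entered_at, now) > timeout_seconds
```

For the timestamps shown above, `elapsed_seconds("2025-01-15T10:00:00Z", "2025-01-15T10:16:00Z")` gives 960, which exceeds the 900-second default.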

Audit Trail

Every state transition is logged to lifecycle-audit.jsonl in the war-room:

{"ts":"2025-01-15T10:00:00Z","from":"planning","to":"planned","actor":"manager","reason":"Tasks decomposed"}
{"ts":"2025-01-15T10:00:05Z","from":"planned","to":"ready","actor":"system","reason":"Dependencies satisfied"}
{"ts":"2025-01-15T10:00:10Z","from":"ready","to":"developing","actor":"manager","reason":"Worker assigned"}
{"ts":"2025-01-15T10:15:00Z","from":"developing","to":"review","actor":"worker","reason":"TASK-001 done"}
{"ts":"2025-01-15T10:20:00Z","from":"review","to":"failed","actor":"evaluator","reason":"Test coverage 72%, required 95%"}
{"ts":"2025-01-15T10:20:05Z","from":"failed","to":"fixing","actor":"manager","reason":"Retry 1/3"}

This creates a complete, timestamped record of every lifecycle event for debugging and compliance.
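Because the log is one JSON object per line, it is straightforward to analyze with a few lines of code. The sketch below assumes only the `from`, `to`, and `actor` fields shown in the sample entries; `load_transitions` and `count_rejections` are hypothetical helper names.

```python
import json
from typing import Iterable, List, Tuple


def load_transitions(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Parse JSONL audit lines into (from, to, actor) tuples, skipping blank lines."""
    transitions = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        transitions.append((entry["from"], entry["to"], entry["actor"]))
    return transitions


def count_rejections(transitions: List[Tuple[str, str, str]]) -> int:
    """Count Evaluator rejections, i.e. review -> failed transitions."""
    return sum(1 for src, dst, _ in transitions if src == "review" and dst == "failed")


# Three of the sample entries from above, used here as test data.
SAMPLE = [
    '{"ts":"2025-01-15T10:15:00Z","from":"developing","to":"review","actor":"worker","reason":"TASK-001 done"}',
    '{"ts":"2025-01-15T10:20:00Z","from":"review","to":"failed","actor":"evaluator","reason":"Test coverage 72%, required 95%"}',
    '{"ts":"2025-01-15T10:20:05Z","from":"failed","to":"fixing","actor":"manager","reason":"Retry 1/3"}',
]
```

In practice the same functions could be fed an open file handle for lifecycle-audit.jsonl, since iterating a text file yields one line at a time.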

Progress Tracking

Agents report progress through two complementary mechanisms:

progress.json

Machine-readable progress updated by agents via the report_progress MCP tool:

{
  "percent": 65,
  "message": "Implementing TASK-003 of 5. Tests passing for TASK-001 and TASK-002.",
  "updated_at": "2025-01-15T10:12:00Z"
}

The percent field is clamped to 0–100 and drives the dashboard progress bars.
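As a rough sketch of the clamping behaviour, the function below builds a payload in the shape of progress.json. It is not the report_progress tool itself: `build_progress` is a hypothetical name, and truncating fractional values with `int()` is an assumption, since this page only states that the value is clamped.

```python
from datetime import datetime, timezone


def build_progress(percent: float, message: str, now: datetime) -> dict:
    """Build a progress.json-shaped dict with percent clamped to the 0-100 range."""
    clamped = max(0, min(100, int(percent)))  # assumption: fractions are truncated
    return {
        "percent": clamped,
        "message": message,
        "updated_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


# Example timestamp matching the sample file above.
EXAMPLE_TIME = datetime(2025, 1, 15, 10, 12, 0, tzinfo=timezone.utc)
```

A call such as `build_progress(140, "Done", EXAMPLE_TIME)` stores 100 rather than 140, so an over-reporting agent cannot push a dashboard bar past full.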

PROGRESS.md

Human-readable progress notes maintained by the Worker:

## Progress
- [x] TASK-001: Login endpoint (done, tests passing)
- [x] TASK-002: Token validation (done, tests passing)
- [ ] TASK-003: Rate limiting (in progress)
- [ ] TASK-004: Integration tests
- [ ] TASK-005: OpenAPI docs

Lifecycle Customization

The lifecycle can be customized per plan or per epic using the create-lifecycle skill. This skill generates specialized lifecycles for different epic types. For example, a “documentation” lifecycle might skip the QA review step entirely, while a “security” epic might add a security-review stage between developing and review.

Custom lifecycles can also be configured using the Pipeline directive in PLAN.md, which inserts additional review stages with their own correction loops:

Pipeline: architect -> engineer -> security-review -> qa

Each stage containing “review”, “qa”, “audit”, “check”, or “verify” gets pass/fail/escalate transitions with correction loops back through fixing.
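The keyword rule can be illustrated with a short parser. This is a sketch under assumptions, not the create-lifecycle skill's code: `parse_pipeline` and `is_review_stage` are hypothetical names, and the case-insensitive substring match is one plausible reading of "containing".

```python
from typing import List

# Keywords that mark a stage as a review-style stage, as listed above.
REVIEW_KEYWORDS = ("review", "qa", "audit", "check", "verify")


def parse_pipeline(directive: str) -> List[str]:
    """Split a 'Pipeline: a -> b -> c' directive into its stage names."""
    _, _, stages = directive.partition(":")
    return [stage.strip() for stage in stages.split("->") if stage.strip()]


def is_review_stage(stage: str) -> bool:
    """True if the stage name contains any review keyword (case-insensitive)."""
    name = stage.lower()
    return any(keyword in name for keyword in REVIEW_KEYWORDS)


stages = parse_pipeline("Pipeline: architect -> engineer -> security-review -> qa")
review_stages = [stage for stage in stages if is_review_stage(stage)]
```

For the directive above, `review_stages` contains security-review and qa, so those two stages would be the ones to receive pass/fail/escalate transitions.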

Key Files

| File | Purpose |
| --- | --- |
| .agents/war-rooms/*/lifecycle.json | Per-room state machine definition |
| .agents/war-rooms/*/status.txt | Current state |
| .agents/war-rooms/*/progress.json | Completion tracking |
| .agents/mcp/warroom-server.py | Status and progress MCP tools |