12-Factor Agents Compliance Analysis
Reference: 12-Factor Agents
Input Parameters
| Parameter | Description | Required |
|---|---|---|
| docs_path | Path to documentation directory (for existing analyses) | Optional |
| codebase_path | Root path of the codebase to analyze | Required |
Analysis Framework
The full per-factor rubric — principle, search patterns, file patterns, compliance criteria (Strong/Partial/Weak), and anti-patterns for each of the 13 factors — lives in references/factors.md. During the Analysis Workflow, read the relevant factor sections there for the search patterns to run and the criteria to score against.
| # | Factor | Focus |
|---|---|---|
| 1 | Natural Language to Tool Calls | Schema-validated structured outputs from LLM |
| 2 | Own Your Prompts | Prompts as first-class, versioned, templated code |
| 3 | Own Your Context Window | Custom formatting of history/state/tool results |
| 4 | Tools Are Structured Outputs | Validated JSON triggers deterministic code |
| 5 | Unify Execution State | Single state object merging execution + business state |
| 6 | Launch/Pause/Resume | APIs to launch, pause anywhere, resume |
| 7 | Contact Humans with Tools | Human contact as a structured tool call |
| 8 | Own Your Control Flow | Custom routing/retries, not framework defaults |
| 9 | Compact Errors into Context | Errors fed back for self-healing + escalation |
| 10 | Small, Focused Agents | Narrow responsibility, 3-10 steps each |
| 11 | Trigger from Anywhere | CLI/REST/WebSocket/chat/webhook entry points |
| 12 | Stateless Reducer | Pure (state, input) -> (state, output) agents |
| 13 | Pre-fetch Context | Fetch likely-needed data upfront |
See references/factors.md for the complete rubric for every factor above.
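To make one row of the table concrete, here is a hedged sketch of the shape Factor 12 describes, a pure `(state, input) -> (state, output)` function. The `AgentState` fields and the `"finish"` event are assumptions chosen for illustration; the authoritative criteria remain in references/factors.md.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class AgentState:
    """Hypothetical immutable agent state."""
    steps: Tuple[str, ...] = ()
    done: bool = False


def reduce_step(state: AgentState, event: str) -> Tuple[AgentState, str]:
    """Pure reducer: builds a new state and an output, never mutates the input."""
    new_state = replace(
        state,
        steps=state.steps + (event,),
        done=(event == "finish"),
    )
    output = "stop" if new_state.done else "continue"
    return new_state, output
```

When auditing a codebase, the thing to look for is the opposite pattern: agent methods that mutate shared objects or module-level globals between steps.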
Output Format
Gate order: Do not assign Strong / Partial / Weak or treat recommendations as observed facts until Hard gates (after Analysis Workflow) are satisfied for the factors in scope.
Executive Summary Table
| Factor | Status | Notes |
|--------|--------|-------|
| 1. Natural Language -> Tool Calls | **Strong/Partial/Weak** | [Key finding] |
| 2. Own Your Prompts | **Strong/Partial/Weak** | [Key finding] |
| ... | ... | ... |
| 13. Pre-fetch Context | **Strong/Partial/Weak** | [Key finding] |
**Overall**: X Strong, Y Partial, Z Weak
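The Overall line can be derived mechanically from the per-factor statuses. The sketch below assumes statuses are held in a dict keyed by factor number; `overall_line` is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict

LEVELS = ("Strong", "Partial", "Weak")


def overall_line(statuses: Dict[int, str]) -> str:
    """Tally per-factor statuses into the '**Overall**' summary line."""
    unknown = sorted({s for s in statuses.values() if s not in LEVELS})
    if unknown:
        raise ValueError(f"unknown status values: {unknown}")
    counts = Counter(statuses.values())
    # Counter returns 0 for levels that never occur.
    return "**Overall**: " + ", ".join(f"{counts[level]} {level}" for level in LEVELS)
```

For example, `overall_line({1: "Strong", 2: "Partial", 3: "Weak", 4: "Weak"})` returns `**Overall**: 1 Strong, 1 Partial, 2 Weak`.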
Per-Factor Analysis
For each factor, provide:
Current Implementation
- Evidence with file:line references
- Code snippets showing patterns
Compliance Level
- Strong/Partial/Weak with justification
Gaps
- What's missing vs. 12-Factor ideal
Recommendations
- Actionable improvements with code examples
Analysis Workflow
Initial Scan
- Run search patterns for all factors
- Identify key files for each factor
- Note any existing compliance documentation
Deep Dive (per factor)
- Read identified files
- Evaluate against compliance criteria
- Document evidence with file paths
Gap Analysis
- Compare current vs. 12-Factor ideal
- Identify anti-patterns present
- Prioritize by impact
Recommendations
- Provide actionable improvements
- Include before/after code examples
- Reference roadmap if exists
Summary
- Compile executive summary table
- Highlight strengths and critical gaps
- Suggest priority order for improvements
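The initial scan step can be sketched as a small regex search that records repo-relative `file:line` hits per factor. The patterns in `EXAMPLE_PATTERNS` are placeholders invented for this example (the real search patterns live in references/factors.md), and the function names are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, List, Sequence

# Placeholder patterns for two factors; NOT the rubric's real search patterns.
EXAMPLE_PATTERNS: Dict[int, str] = {
    2: r"prompt",               # Factor 2: Own Your Prompts
    9: r"except\s+\w*Error",    # Factor 9: Compact Errors into Context
}


def scan_text(rel_path: str, text: str, patterns: Dict[int, str]) -> Dict[int, List[str]]:
    """Return 'path:line' hits per factor for one file's contents."""
    hits: Dict[int, List[str]] = {factor: [] for factor in patterns}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for factor, pattern in patterns.items():
            if re.search(pattern, line, re.IGNORECASE):
                hits[factor].append(f"{rel_path}:{lineno}")
    return hits


def scan_codebase(
    codebase_path: str,
    patterns: Dict[int, str],
    suffixes: Sequence[str] = (".py",),
) -> Dict[int, List[str]]:
    """Walk codebase_path and merge per-file hits, using repo-relative paths."""
    root = Path(codebase_path)
    merged: Dict[int, List[str]] = {factor: [] for factor in patterns}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            rel = path.relative_to(root).as_posix()
            for factor, found in scan_text(rel, text, patterns).items():
                merged[factor].extend(found)
    return merged
```

A factor whose hit list comes back empty is not automatically Weak at this point; it still needs the one-line rationale note before the deep dive.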
Hard gates (evidence before scores)
Run these in order. Do not skip ahead: each Pass is an objective condition you can check (paths on disk, citations present), not internal certainty.
- Scan gate (after the initial scan, workflow step 1). Pass: for every factor (1–13) you have either (a) ≥1 repo-relative path or glob hit to inspect, or (b) a one-line note with rationale (e.g. search command/output, or “no matches; codebase may omit this concern”). Empty hand-waving (“looks fine”) fails this gate.
- Evidence gate (per factor, before writing Strong / Partial / Weak for that factor). Pass: “Current Implementation” includes ≥1 citation with file path plus line range or short quoted snippet from codebase_path, or an explicit “no evidence located” statement after targeted reads. If evidence is missing after search, default that factor to Weak unless the criterion is clearly N/A (say why).
- Synthesis gate (before writing the executive summary table and per-factor analysis sections). Pass: the Scan gate and Evidence gate are satisfied for the factors in scope. Recommendations may name new files or patterns only as proposals; they must not be presented as observed facts without matching citations from the Evidence gate.
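The Evidence gate can be expressed as a small check that runs before any score is written. This is a sketch under simplifying assumptions: it accepts only `path:line` or `path:start-end` citations (not quoted snippets), and the `FactorFinding` fields and `gated_score` helper are hypothetical names.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Matches e.g. "src/agent.py:10" or "src/agent.py:10-24".
CITATION = re.compile(r"^[\w./-]+:\d+(-\d+)?$")


@dataclass
class FactorFinding:
    """Hypothetical record of what was collected for one factor."""
    factor: int
    citations: List[str] = field(default_factory=list)
    no_evidence_note: Optional[str] = None  # explicit "no evidence located" statement
    na_reason: Optional[str] = None         # why the criterion does not apply
    proposed_score: Optional[str] = None    # Strong / Partial / Weak


def gated_score(finding: FactorFinding) -> str:
    """Return the score the gate allows, or raise if the gate is not passed."""
    if any(CITATION.match(c) for c in finding.citations):
        if finding.proposed_score is None:
            raise ValueError(f"factor {finding.factor}: evidence cited but no score proposed")
        return finding.proposed_score
    if not finding.no_evidence_note:
        raise ValueError(f"factor {finding.factor}: evidence gate not passed")
    # Evidence is missing after search: default to Weak unless clearly N/A.
    return "N/A" if finding.na_reason else "Weak"
```

Raising instead of guessing a score mirrors the gate's intent: a factor without a citation or an explicit no-evidence statement should block the summary, not be scored anyway.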
Quick Reference: Compliance Scoring
| Score | Meaning | Action |
|---|---|---|
| Strong | Fully implements principle | Maintain, minor optimizations |
| Partial | Some implementation, significant gaps | Planned improvements |
| Weak | Minimal or no implementation | High priority for roadmap |
When to Use This Skill
- Evaluating new LLM-powered systems
- Reviewing agent architecture decisions
- Auditing production agentic applications
- Planning improvements to existing agents
- Comparing frameworks or implementations