# Context Engineering Arsenal

Your AI agents don't have a capability problem. They have a memory problem — and this is the system that fixes it.
```
┌─[ THE PROBLEM ]──────────────────────────────────────────┐
│                                                          │
│ Sessions reset. Decisions vanish. Context rots.          │
│                                                          │
│ You re-explain the same thing every session. Your        │
│ agent confidently does the wrong thing — because it      │
│ lost the context that would have stopped it.             │
│                                                          │
│ 76% of loaded context gets ignored. (CL-Bench)           │
│ You're not under-prompting. You're over-loading.         │
│                                                          │
└──────────────────────────────────────────────────────────┘
```
```
┌─[ WHAT CHANGES ]─────────────────────────────────────────┐
│                                                          │
│ A complete context engineering operating system.         │
│ Research-backed. Battle-tested on 150+ agent systems.    │
│ Every framework traces to a published source.            │
│                                                          │
└──────────────────────────────────────────────────────────┘
```
## By the Numbers
| Category | Count | Details |
|---|---|---|
| Agents | 13 | Specialized roles from orchestration to threat analysis |
| Skills | 10 | Standalone commands you can run immediately |
| Workflows | 6 | Multi-agent pipelines for complex operations |
| Frameworks | 50+ | Embedded decision models, rubrics, and protocols |
| Research Sources | 14 | Peer-reviewed papers, production case studies, expert analysis |
| Anti-Patterns | 33 | Named failure modes with detection and remediation |
| Rubric Criteria | 30 | Profile-adaptive health scoring (Solo: 7, Medium: 22, Enterprise: 30) |
| CELF Layers | 8 | L0 Constitution through L7 Delegation |
| Epistemic Tiers | 6 | AXIOM through SPECULATION classification |
| Context Pathologies | 4 | Research-validated LLM failure modes |
| Compression Methods | 3 | Ordered by information loss |
| Templates | 5 | Ready-to-use CLAUDE.md, BRAIN.yaml, STATE.yaml |
| Audit Scripts | 2 | Python-based diagnostics you can run standalone |
## Quick Start

```bash
# Install
squads install context-engineering

# Your first command — run a health check
ce audit

# Scaffold a new project's context architecture
ce scaffold

# Design a payload for an LLM task
ce payload "Analyze legal contracts for hidden risks"

# Plan a sprint's context strategy
ce sprint-start
```
## Onboarding Gradient
| Level | What To Try | Agents Involved |
|---|---|---|
| Simple | ce audit — instant health diagnostic | ContextAuditor |
| Medium | ce scaffold — build your project's context architecture | ContextChief + LayerArchitect |
| Advanced | ce sprint-start — full sprint lifecycle with state persistence | StateArchitect + TokenArchitect + ContextAuditor |
## How To Use — Manual

```
┌─[ CONTEXT SCENARIOS ]────────────────────────────────────┐
│                                                          │
│ Start from the problem, not the feature.                 │
│                                                          │
└──────────────────────────────────────────────────────────┘
```
| Situation | Run This | What Happens |
|---|---|---|
| "My agent keeps forgetting decisions" | ce audit + ce scaffold | StateArchitect builds persistence layer |
| "I'm starting a new project" | ce scaffold | LayerArchitect sets up 8-layer CELF structure |
| "My context window is bloating" | *smart-compact | Compression Trilogy: Offload > Truncate > Summarize |
| "I need to feed a long document to an agent" | ce ingest | ETLEngineer runs 5-stage pipeline with So-What Gate |
| "How healthy is my project's context?" | *five-vitals | 5 structural signals, scored 0-10, graded A-F |
| "I want to design a prompt for a specific task" | ce payload | PayloadForge applies 7 cognitive strategies |
| "I don't know where to start" | *wizard | 5-question guided builder. No experience needed |
| "Something feels wrong but I can't name it" | *diagnose | Layer-by-layer CELF scan, scores 0-3 per layer |
| "I'm designing a multi-agent system" | ce blueprint | AgentDesigner + ThreatSentinel design + stress-test |
| "Is my documentation still accurate?" | *doc-rot | Finds decay. Wrong docs are worse than no docs |
| "How many tokens does my boot cost?" | ce boot-audit | Maps everything loading before your first prompt |
| "Which model should handle this task?" | ce route | Cognitive Scoring Matrix routes to cheapest sufficient model |
| "Where are my tokens being wasted?" | ce profile | 80/20 analysis: find the 20% consuming 80% of budget |
| "Should this file load every turn?" | *optimize-injection | Break-even analysis: persistent vs on-demand |
| "Best way to compress this context?" | *bench | Side-by-side: summarize vs truncate vs distill |
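The 80/20 analysis behind `ce profile` can be sketched in a few lines. This is a minimal illustration, not the squad's implementation: the sample file contents and the rough 4-characters-per-token heuristic are assumptions.

```python
# Sketch: find the smallest set of files consuming ~80% of the token budget.
# Assumes ~4 characters per token — a rough heuristic, not a real tokenizer.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def top_consumers(files: dict[str, str], threshold: float = 0.8) -> list[str]:
    """Return the heaviest files until `threshold` of total cost is covered."""
    costs = {name: estimate_tokens(body) for name, body in files.items()}
    total = sum(costs.values())
    heavy, running = [], 0
    for name, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
        heavy.append(name)
        running += cost
        if running / total >= threshold:
            break
    return heavy

# Illustrative project: one file dominates the budget.
files = {
    "CLAUDE.md": "x" * 8000,
    "STATE.yaml": "x" * 1200,
    "notes.md": "x" * 400,
}
print(top_consumers(files))  # → ['CLAUDE.md']
```

One oversized file covering 80% of the budget on its own is exactly the situation `ce profile` is built to surface.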
## Agents

```
┌─[ THE SQUAD ]────────────────────────────────────────────┐
│                                                          │
│ 13 specialists. Each one does one thing well.            │
│ Depth = frameworks, techniques, and rubrics embedded.    │
│                                                          │
└──────────────────────────────────────────────────────────┘
```
| Agent | Role | Key Capability | Depth |
|---|---|---|---|
| ContextChief | Orchestrator | Routes requests, coordinates workflows, health dashboard | Standard |
| PayloadForge | Payload Designer | 7 cognitive strategies, wizard mode, cultural adaptation | Deep |
| LayerArchitect | Architecture Specialist | 8-layer CELF framework, scaffolding, epistemic classification | Deep |
| ContextAuditor | Health Auditor | 30-criterion rubric, 6 diagnostic skills, boot audit, forensic analysis | Deep |
| StateArchitect | State Engineer | Sprint lifecycles, decision preservation, smart compression | Standard |
| TokenArchitect | Token Economist | Cognitive scoring matrix, cost projection, burn rate tracking | Deep |
| ThreatSentinel | Threat Analyst | 4 pathologies, 5 threats, surgical context repair | Standard |
| ETLEngineer | Ingestion Specialist | 5-stage pipeline, semantic chunking, So-What Gate | Standard |
| AgentDesigner | System Architect | Multi-agent patterns, delegation packages, quality checklist | Standard |
| MetaAgent | Evolution Engine | Self-audit, knowledge curation, calibration, evolution planning | Standard |
| TokenProfiler | Cost Profiler | Per-file token cost, 80/20 analysis, ROI per line | Standard |
| BootAuditor | Boot Inspector | Boot loading map, traffic-light scoring, profile targets | Standard |
| ContextRouter | Routing Strategist | Cognitive Scoring Matrix (extended), cascade design, cost calculators | Deep |
Depth key: Standard = focused competency with clear boundaries. Deep = multiple embedded frameworks, rubrics, or decision matrices that compound during execution.
<details>
<summary><strong>Skills (10)</strong></summary>

| Skill | Command | What It Does |
|---|---|---|
| Five Vitals | *five-vitals | 5 structural health signals, scored 0-10, graded A-F |
| Doc Rot | *doc-rot | Finds documentation decay for deletion (wrong docs > no docs) |
| Epistemic Audit | *epistemic-audit | Validates epistemic coherence (AXIOM through SPECULATION) |
| Diagnose | *diagnose | Layer-by-layer CELF health scan, scores 0-3 per layer |
| Validate | *validate | 30-criterion pass/fail rubric, profile-adaptive (Solo: 7, Medium: 22, Enterprise: 30) |
| Smart Compact | *smart-compact | Compression Trilogy: Offload > Truncate > Summarize |
| Context Surgeon | *context-surgeon | Surgical repair of detected context pathologies |
| Wizard | *wizard | 5-question guided payload builder for beginners |
| Injection Optimizer | *optimize-injection | Persistent vs on-demand loading strategy with MVC formula |
| Compression Bench | *bench | Benchmark compression methods with quality preservation scoring |
</details>
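The persistent-vs-on-demand decision behind `*optimize-injection` can be sketched with a toy cost model. The linear model and the per-load overhead constant below are assumptions for illustration, not the skill's actual MVC formula:

```python
# Sketch: break-even between persistent injection (pay every turn)
# and on-demand loading (pay only on turns that need the file).

def persistent_cost(tokens: int, turns: int) -> int:
    return tokens * turns  # loaded on every turn, no exceptions

def on_demand_cost(tokens: int, turns: int, hit_rate: float,
                   overhead: int = 50) -> float:
    # Loaded only when needed, plus a small per-load retrieval overhead
    # (the 50-token overhead is an illustrative assumption).
    loads = turns * hit_rate
    return loads * (tokens + overhead)

def should_persist(tokens: int, turns: int, hit_rate: float) -> bool:
    return persistent_cost(tokens, turns) <= on_demand_cost(tokens, turns, hit_rate)

# A 2,000-token file needed on only 30% of turns: cheaper on demand.
print(should_persist(2000, turns=100, hit_rate=0.3))  # → False
```

The intuition the skill formalizes: the higher the hit rate, the more persistent loading pays off; low-hit-rate files belong behind an on-demand fetch.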
<details>
<summary><strong>Workflows (6)</strong></summary>

| Workflow | Solves | Agents | Pipeline |
|---|---|---|---|
| full-payload | "I need to give an agent the right context for a task" | Chief + PayloadForge + ThreatSentinel | Brief > Strategy Selection > Payload Assembly > Threat Scan |
| project-scaffold | "I'm starting from zero and need structure" | Chief + LayerArchitect + ContextAuditor | Survey > CELF Mapping > Scaffold > Validation |
| full-audit | "Something feels off and I need a diagnosis" | ContextAuditor + ThreatSentinel | 5 Diagnostics > Pathology Scan > Prioritized Report |
| sprint-lifecycle | "I need context management across a multi-day sprint" | StateArchitect + TokenArchitect + ContextAuditor | Inject > Plan > Execute > Persist > Compact > Validate |
| dense-ingestion | "I have a long document or transcript to process" | ETLEngineer + ContextAuditor | Extract > Transform > Load > So-What Gate > Quality Check |
| agent-blueprint | "I'm designing a new agent or multi-agent system" | Chief + AgentDesigner + ThreatSentinel | Requirements > Pattern Match > Design > Stress Test |
</details>
<details>
<summary><strong>Key Frameworks (10 embedded)</strong></summary>

| Framework | What It Solves |
|---|---|
| 8-Layer CELF | Where does each piece of context live? (L0 Constitution through L7 Delegation) |
| Epistemic Classification | How reliable is this information? (AXIOM > FACT > EVIDENCE > HEURISTIC > INFERENCE > SPECULATION) |
| 7 Cognitive Strategies | Which approach for this LLM task? (Zero-Shot through Multi-Agent) |
| 4 Context Pathologies | What's going wrong? (Poisoning, Distraction, Confusion, Clash) |
| Compression Trilogy | How to shrink context safely? (Offload > Truncate > Summarize) |
| Cognitive Scoring Matrix | Which model for this task? (2-axis scoring: Cognition x Consequence) |
| MVC Formula | What context does this agent need? (Essential / Helpful / Noise) |
| 33 Anti-Patterns | What mistakes to avoid? (Context Dump through Compress-and-Forget) |
| 30-Criterion Rubric | How healthy is this project? (Profile-adaptive: Solo 7, Medium 22, Enterprise 30 criteria) |
| 5 Structural Vitals | Is the system architecturally sound? (Tone, Coherence, Density, Clarity, Hygiene) |
</details>
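The two-axis routing idea behind the Cognitive Scoring Matrix can be sketched as follows. The tier names and the 1–3 scales here are illustrative assumptions; the embedded matrix is richer than this:

```python
# Sketch: route by Cognition x Consequence to the cheapest sufficient tier.
# Tier names are placeholders, not real model identifiers.

TIERS = ["small", "medium", "frontier"]

def route(cognition: int, consequence: int) -> str:
    """Score both axes 1-3; the harder axis dominates the routing decision."""
    score = max(cognition, consequence)
    return TIERS[score - 1]

print(route(cognition=1, consequence=1))  # → small    (e.g. reformat a list)
print(route(cognition=3, consequence=2))  # → frontier (e.g. contract analysis)
```

Taking the max of the two axes encodes the core routing rule: a low-cognition task with high consequences still deserves a stronger model.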
<details>
<summary><strong>Templates (5 included)</strong></summary>

Ready-to-use templates for common context artifacts:
- CLAUDE.md — Solo (~50 lines), Medium (~120 lines), Enterprise (~180 lines)
- BRAIN.yaml — Minimal, Standard, Full (knowledge graph entry points)
- STATE.yaml — Minimal, Standard (with DECISIONS.md template)
</details>
<details>
<summary><strong>Scripts (2 Python)</strong></summary>

```bash
python scripts/diagnose.py /path/to/project   # Layer health scanner
python scripts/validate.py /path/to/project   # Profile-adaptive rubric (7-30 criteria)
```
</details>
<details>
<summary><strong>Vocabulary</strong></summary>

| Term | Meaning |
|---|---|
| CELF | Context Engineering Layered Framework — 8-layer hierarchy (L0-L7) for organizing AI context |
| MVC | Minimum Viable Context — the least context needed for an agent to perform a task well |
| Epistemic Status | How reliable a piece of information is: AXIOM > FACT > EVIDENCE > HEURISTIC > INFERENCE > SPECULATION |
| Context Pathology | A research-validated failure mode of how LLMs process context (Poisoning, Distraction, Confusion, Clash) |
| Compression Trilogy | Three-step protocol for reducing context: Offload (zero loss) > Truncate (low loss) > Summarize (last resort) |
| So-What Gate | Quality filter: every extracted item must answer "So what?", "What action?", "Who does it?" or get discarded |
| CL-Bench | Research benchmark showing 76% of loaded context is ignored by models |
| ACE Cycle | Adaptive Context Evolution — Generate > Reflect > Curate loop for maintaining context quality |
| MemGPT | Research architecture for tiered AI memory: Working Memory > Short-term > Long-term |
| Sprint Blueprint | A plan defining what context loads when, token budget allocation, and execution sequence |
| Prompt Delta | Post-execution improvement artifact: what to observe, hypotheses, modifications for next iteration |
| Cognitive Scoring Matrix | 2-axis framework (Cognition x Consequence) for routing tasks to the right model tier |
</details>
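The Compression Trilogy's ordering can be sketched as a single function that only escalates to the next, lossier step when the cheaper one fails. The item shape and the placeholder summary step are illustrative assumptions, not the `*smart-compact` implementation:

```python
# Sketch of the Compression Trilogy: Offload (zero loss) > Truncate
# (low loss) > Summarize (last resort).

def compress(items: list[dict], budget: int) -> list[dict]:
    def cost(xs):
        return sum(x["tokens"] for x in xs)

    # 1. Offload: move cold reference material out of the window (zero loss).
    kept = [x for x in items if x["hot"]]
    if cost(kept) <= budget:
        return kept

    # 2. Truncate: drop the oldest hot items first (low loss).
    kept.sort(key=lambda x: x["age"])          # newest first (smaller age)
    while len(kept) > 1 and cost(kept) > budget:
        kept.pop()                             # discard the oldest

    # 3. Summarize: only if truncation still overshoots (lossy).
    if cost(kept) > budget:
        kept = [{"hot": True, "age": 0, "tokens": budget,
                 "text": "summary of remaining context"}]
    return kept

items = [
    {"hot": False, "age": 9, "tokens": 500, "text": "old reference doc"},
    {"hot": True,  "age": 5, "tokens": 400, "text": "earlier discussion"},
    {"hot": True,  "age": 2, "tokens": 300, "text": "current decision log"},
]
print([x["text"] for x in compress(items, budget=400)])  # → ['current decision log']
```

Note the ordering is the whole point: offloading the cold reference doc costs nothing, so it always runs before anything destructive.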
## Research Foundation
This squad distills intelligence from 14 sources. Not a reading list — each source contributed specific, testable frameworks that are embedded in the agents.
| Source | Contribution |
|---|---|
| Andrej Karpathy | Core framing: context engineering as a discipline, not prompt engineering |
| Gemini 2.5 Pathology Study | 4 validated context failure modes (Poisoning, Distraction, Confusion, Clash) |
| CL-Bench | Quantified the problem: 76% of loaded context gets ignored by models |
| ACE Framework | Adaptive Context Evolution cycle: +17.1% completion rate, -86.9% latency |
| MemGPT | Tiered memory architecture: Working > Short-term > Long-term |
| FrugalGPT | Cost optimization patterns: up to 98% cost reduction via cascading |
| RouteLLM (ICLR 2025) | Model routing heuristics: route to cheapest sufficient model |
| Structured Distillation | 11x compression ratio while preserving semantic fidelity |
| Lost in the Middle | U-curve attention pattern: models ignore information in the middle of context |
| Multi-model synthesis | Cross-validated across Claude, GPT, Gemini, Grok, DeepSeek, Perplexity |
| Expert transcripts | Agent context engineering patterns from YouTube deep-dives |
| Production systems (150+ agents) | Battle-tested patterns from multi-agent orchestration at scale |
| Epistemic classification research | 6-tier reliability framework for information provenance |
| Token economics literature | Cost modeling, burn rate tracking, ROI-per-token analysis |
## Methodology

How this squad was built — not copy-pasted from prompts, but forged through a 5-stage pipeline:

```
RESEARCH ──> EXTRACT ──> MODEL ──> VALIDATE ────> SHIP
14 sources   Frameworks  Agent     Battle-tested  Production
analyzed     isolated    design +  on 150+ agent  artifact
cross-ref'd  and named   workflow  systems        with audit
                         wiring                   scripts
```
- RESEARCH — 14 sources analyzed. Cross-referenced across 6 LLM providers to eliminate single-source bias.
- EXTRACT — Every actionable framework isolated, named, and given clear boundaries. Theory discarded.
- MODEL — Frameworks assigned to specialist agents. Workflows wired for multi-agent coordination.
- VALIDATE — Tested on production systems running 150+ agents. Failure modes cataloged as anti-patterns.
- SHIP — Packaged with audit scripts, templates, onboarding gradient, and vocabulary. Ready to install and run.
## Who This Is For
You build agents that run across multiple sessions. Workflows, autonomous pipelines, AI assistants with ongoing responsibilities. You've hit the wall where your agent stops being reliable the moment a session ends or a context gets compacted.
You don't want another prompt trick. You want the underlying structure that makes context stick: state management, decision persistence, memory protocols, token discipline.
## Who This Is NOT For
You write one-shot prompts. Your use case starts and ends in a single conversation. You want a template library to copy-paste. If your agent doesn't need to remember anything across sessions, there's nothing here you need.
## The Five Laws
- Clarity Over Complexity — Simple instructions beat sophisticated ambiguous ones
- Information Density, Not Volume — 10 perfect chunks > 1000 mediocre ones
- Context Has Cost — Every token loaded is unavailable for reasoning
- Iterative Refinement — Evidence-based optimization beats intuition
- Robustness Through Diversity — Multiple complementary approaches > single method
## What Makes This Different
- System, not snippets. 13 agents that coordinate through 6 workflows. Not a folder of prompts.
- Research-backed. Every framework traces to a published source. 14 total. Zero invented heuristics.
- Failure modes are named. 4 pathologies, 33 anti-patterns, all cataloged. You diagnose, not guess.
- Profile-adaptive. Solo dev? 7-criterion rubric. Enterprise team? 30 criteria. Same squad, different depth.
- Compression is a protocol. Offload > Truncate > Summarize. Three steps, ordered by information loss.
- Tested on 150+ agent systems. Built from production, not theory.
## Value Equation
The cost of bad context is invisible until it isn't.
| Scenario | Cost |
|---|---|
| Agent hallucinates due to stale context, you debug for 2 hours | ~$300-600 in time |
| Re-explaining project context every session, 15 min/workday | ~60 hours/year wasted |
| Wrong model routing on 100 daily tasks at $0.50 overspend each | ~$18,000/year leaked |
| Building these frameworks yourself from the 14 sources | 40-80 hours of research + prompt engineering |
| Hiring a context engineer | $150-300/hr |
This squad: $49.90. One-time. Every project from now on.
The break-even is your first ce audit. One diagnostic run will surface context waste you didn't know existed. The token savings from a single *smart-compact pass typically exceed the purchase price within a week.
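The routing-leak figure above is plain arithmetic over the table's own estimates:

```python
# Reproducing the table's routing-leak estimate. The inputs are the
# table's stated assumptions, not measured values.
daily_tasks = 100
overspend_per_task = 0.50            # dollars lost per mis-routed task
yearly_leak = daily_tasks * overspend_per_task * 365
print(f"${yearly_leak:,.0f}/year")   # → $18,250/year
```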
## Evolution Path

```
v1.0 Foundation     ██████████████████████░░░░  SHIPPED
     Context architecture, auditing, frameworks,
     state management, 19 anti-patterns catalog.
     10 agents. 8 skills. 6 workflows.

v2.0 Performance    ██████████████████████████  NOW
     Token profiling, boot audit, model routing,
     injection optimization, compression benchmarks.
     13 agents. 10 skills. 6 workflows.

v3.0 Orchestration  ░░░░░░░░░░░░░░░░░░░░░░░░░░  NEXT
     Multi-agent delegation packages with scoped
     context payloads. Pipeline routing across model
     tiers. Context leak detection between agents.
     Sprint token budgets with real-time burn tracking.
     Production telemetry: measure what matters,
     kill what doesn't. The performance engine
     becomes self-correcting.
```
· · · · · · · · · · ·
Forged by l0z4n0 | squads.sh
First forged: 2026-03-17 | v2.0: 2026-03-17
<!-- context is fuel. this squad is the refinery. -->