Pipeline
The 6-stage pluggable processing pipeline: classification, tiered routing, context assembly, transport, scoring, and learning.
Overview
Every message processed by the agent flows through a 6-stage pluggable pipeline defined in crates/clawft-core/src/pipeline/. Each stage is a trait, making stages independently replaceable. The PipelineRegistry maps TaskType variants to specialized Pipeline instances; unregistered task types fall back to the default pipeline.
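The fallback behavior of the registry can be sketched as a map lookup with a default. This is a minimal illustration, not the crate's actual code: the Pipeline struct here is a stand-in that only carries a label, the enum lists three of the task types, and the method names register and get are assumptions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum TaskType {
    CodeGeneration,
    Research,
    Chat,
}

// Stand-in for a full pipeline; it carries only a label so lookups are observable.
#[derive(Debug, PartialEq)]
struct Pipeline {
    name: &'static str,
}

struct PipelineRegistry {
    specialized: HashMap<TaskType, Pipeline>,
    default: Pipeline,
}

impl PipelineRegistry {
    fn new(default: Pipeline) -> Self {
        Self { specialized: HashMap::new(), default }
    }

    // Hypothetical method name: associate a task type with its own pipeline.
    fn register(&mut self, task: TaskType, pipeline: Pipeline) {
        self.specialized.insert(task, pipeline);
    }

    // Unregistered task types fall back to the default pipeline.
    fn get(&self, task: TaskType) -> &Pipeline {
        self.specialized.get(&task).unwrap_or(&self.default)
    }
}
```

With only CodeGeneration registered, a lookup for Chat or Research returns the default pipeline.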
ChatRequest
|
[1. Classifier] -- TaskProfile (type, complexity)
|
[2. Router] -- RoutingDecision (provider, model)
|
[3. Assembler] -- AssembledContext (messages, token estimate)
|
[4. Transport] -- LlmResponse
|
[5. Scorer] -- QualityScore (overall, relevance, coherence)
|
[6. Learner] -- Trajectory
|
LlmResponse
Stage 1: Classifier
File: crates/clawft-core/src/pipeline/classifier.rs
Classifies incoming messages by task type using keyword pattern matching. The classifier produces a TaskProfile containing the detected TaskType and a complexity score.
Task types: CodeGeneration, CodeReview, Research, Creative, Analysis, Math, Chat (default).
The first matching keyword group wins. This is a Level 0 implementation -- no ML, no embeddings, just case-insensitive substring matching. The complexity score is derived from the task type, message length, and tool requirements.
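A first-match keyword classifier of this kind might look like the sketch below. The keyword lists and the order of the groups are invented for illustration; the real groups live in classifier.rs and may differ. Complexity scoring is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum TaskType {
    CodeGeneration,
    CodeReview,
    Research,
    Creative,
    Analysis,
    Math,
    Chat,
}

// Hypothetical keyword groups, checked in order; the first group that matches wins.
const KEYWORD_GROUPS: &[(TaskType, &[&str])] = &[
    (TaskType::CodeReview, &["review this", "code review"]),
    (TaskType::CodeGeneration, &["implement", "write a function"]),
    (TaskType::Research, &["research", "find sources"]),
    (TaskType::Creative, &["poem", "story"]),
    (TaskType::Analysis, &["analyze", "compare"]),
    (TaskType::Math, &["calculate", "solve"]),
];

fn classify(message: &str) -> TaskType {
    // Lowercasing once gives case-insensitive substring matching.
    let lower = message.to_lowercase();
    for (task, keywords) in KEYWORD_GROUPS {
        if keywords.iter().any(|k| lower.contains(*k)) {
            return *task;
        }
    }
    // No group matched: fall back to the default task type.
    TaskType::Chat
}
```

Because the first matching group wins, group order matters: a message containing both "review this" and "implement" is classified by whichever group is listed first.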
Stage 2: Router
Files: crates/clawft-core/src/pipeline/router.rs, tiered_router.rs
Selects the LLM model based on the task classification, user permissions, and cost budgets. Two modes are available:
Static Mode
Always uses the configured default model. No complexity scoring.
Tiered Mode
The tiered router (tiered_router.rs, ~1,650 lines) implements complexity-based routing across model tiers with cost tracking and permission awareness.
Routing flow:
- Classify -- The task type and complexity score arrive from Stage 1.
- Score complexity -- The router refines the score using the task type, message length, and tool requirements.
- Check permissions -- The user's max_tier permission limits model selection.
- Check budget -- The cost tracker enforces daily and monthly spending limits.
- Select tier -- The complexity score maps to a tier; if the budget is exceeded, the router falls back to a lower tier.
- Route -- Returns a RoutingDecision with the model name, tier, and estimated cost.
Tier mapping:
| Tier | Complexity Range | Example Models | Use Case |
|---|---|---|---|
| Free | 0.0 - 0.15 | Local models | Trivial queries |
| Standard | 0.15 - 0.40 | Haiku-class | Simple tasks |
| Premium | 0.40 - 0.70 | Sonnet-class | Moderate complexity |
| Elite | 0.70 - 1.0 | Opus-class | Complex reasoning |
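The tier mapping and the permission and budget checks can be combined as in the following sketch. It makes two assumptions the table does not settle: a boundary value such as 0.15 belongs to the higher tier, and a budget miss steps down one tier at a time. The function names and the fits_budget callback are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    Free,
    Standard,
    Premium,
    Elite,
}

// Assumption: each boundary value belongs to the higher tier (0.15 -> Standard).
fn tier_for_complexity(complexity: f64) -> Tier {
    if complexity < 0.15 {
        Tier::Free
    } else if complexity < 0.40 {
        Tier::Standard
    } else if complexity < 0.70 {
        Tier::Premium
    } else {
        Tier::Elite
    }
}

fn next_lower(tier: Tier) -> Option<Tier> {
    match tier {
        Tier::Elite => Some(Tier::Premium),
        Tier::Premium => Some(Tier::Standard),
        Tier::Standard => Some(Tier::Free),
        Tier::Free => None,
    }
}

// Cap by the user's max_tier, then step down while the budget check fails.
fn select_tier(complexity: f64, max_tier: Tier, fits_budget: impl Fn(Tier) -> bool) -> Tier {
    let mut tier = tier_for_complexity(complexity).min(max_tier);
    while !fits_budget(tier) {
        match next_lower(tier) {
            Some(lower) => tier = lower,
            None => break, // already at the lowest tier
        }
    }
    tier
}
```

For example, a complexity of 0.9 maps to Elite, but a user capped at Premium is routed to Premium, and a budget that only covers Standard pushes the choice down to Standard.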
Stage 3: Assembler
File: crates/clawft-core/src/pipeline/assembler.rs
Assembles the final ChatRequest from the context messages, selected model, tool definitions, and configuration. The TokenBudgetAssembler uses a chars/4 heuristic for token estimation and drops middle messages when the context exceeds the model's token limit, preserving the system prompt and recent turns.
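The trimming strategy can be illustrated roughly as follows. The Message struct, the function names, and the choice to round the chars/4 estimate down are assumptions; the sketch keeps a leading system message, then keeps as many of the most recent messages as fit, so the messages dropped are the ones in between.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: &'static str,
    content: String,
}

// chars/4 heuristic; rounding down is an assumption.
fn estimate_tokens(text: &str) -> usize {
    text.chars().count() / 4
}

fn fit_to_budget(messages: &[Message], max_tokens: usize) -> Vec<Message> {
    // Always keep a leading system prompt, if there is one.
    let (head, rest) = match messages.first() {
        Some(m) if m.role == "system" => (Some(m.clone()), &messages[1..]),
        _ => (None, messages),
    };
    let mut used = head.as_ref().map_or(0, |m| estimate_tokens(&m.content));

    // Walk backwards from the newest message, keeping turns while they fit.
    let mut tail: Vec<Message> = Vec::new();
    for m in rest.iter().rev() {
        let cost = estimate_tokens(&m.content);
        if used + cost > max_tokens {
            break;
        }
        used += cost;
        tail.push(m.clone());
    }
    tail.reverse(); // restore chronological order

    head.into_iter().chain(tail).collect()
}
```

The real assembler also attaches the model, tool definitions, and configuration; this sketch covers only the message trimming.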
Stage 4: Transport
File: crates/clawft-core/src/pipeline/transport.rs
Sends the assembled request to the selected LLM provider via clawft-llm. Handles streaming (via SSE parsing), retries, and failover. The transport stage uses the ClawftLlmAdapter at runtime; during testing, a stub transport returns canned responses.
Stage 5: Scorer (GEPA FitnessScorer)
File: crates/clawft-core/src/pipeline/scorer.rs
Evaluates response quality after the LLM returns. Produces a QualityScore with overall, relevance, and coherence dimensions. These scores serve as fitness signals for the learner stage.
As of v0.2, the FitnessScorer replaces the previous NoopScorer. It evaluates responses on 4 weighted dimensions:
| Dimension | Weight | Measures |
|---|---|---|
| Relevance | 0.35 | How well the response addresses the request |
| Coherence | 0.25 | Logical consistency and flow |
| Completeness | 0.25 | Coverage of the request's requirements |
| Conciseness | 0.15 | Information density without unnecessary verbosity |
Weights are configurable. The overall score is the weighted sum, normalized to [0.0, 1.0]. These scores drive the learner's mutation decisions -- low-scoring trajectories trigger prompt refinement.
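As a rough sketch, the overall score could be computed as below. The struct and function names are hypothetical. Dividing by the weight total is one plausible reading of "normalized": it keeps the result in [0.0, 1.0] even when configured weights do not sum to 1.

```rust
struct Dimensions {
    relevance: f64,
    coherence: f64,
    completeness: f64,
    conciseness: f64,
}

struct Weights {
    relevance: f64,
    coherence: f64,
    completeness: f64,
    conciseness: f64,
}

impl Default for Weights {
    // Default weights from the table above.
    fn default() -> Self {
        Self { relevance: 0.35, coherence: 0.25, completeness: 0.25, conciseness: 0.15 }
    }
}

fn overall(d: &Dimensions, w: &Weights) -> f64 {
    let total = w.relevance + w.coherence + w.completeness + w.conciseness;
    if total <= 0.0 {
        return 0.0; // degenerate configuration: no usable weights
    }
    let sum = w.relevance * d.relevance
        + w.coherence * d.coherence
        + w.completeness * d.completeness
        + w.conciseness * d.conciseness;
    // Assumption: normalize by the weight total, then clamp to [0.0, 1.0].
    (sum / total).clamp(0.0, 1.0)
}
```

With the default weights, a response that is fully relevant but scores zero on the other three dimensions gets an overall score of about 0.35.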
Stage 6: Learner (GEPA TrajectoryLearner)
File: crates/clawft-core/src/pipeline/learner.rs
Records trajectories (request + response + score) for adaptive learning.
As of v0.2, the TrajectoryLearner replaces the previous NoopLearner, implementing GEPA -- Genetic Evolution of Prompt Architectures (ADR-017). The learner operates in three phases:
Phase 1: Trajectory Collection
Every pipeline execution produces a trajectory: the original request, the assembled context, the LLM response, and the fitness score. Trajectories are stored in a ring buffer (configurable size, default 1000).
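A fixed-capacity ring buffer of this kind can be sketched with a VecDeque: once the buffer is full, each new trajectory evicts the oldest one. The type and method names are illustrative, and the payload is left generic instead of modelling the real Trajectory type.

```rust
use std::collections::VecDeque;

struct TrajectoryBuffer<T> {
    items: VecDeque<T>,
    capacity: usize,
}

impl<T> TrajectoryBuffer<T> {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self { items: VecDeque::with_capacity(capacity), capacity }
    }

    // Insert a trajectory, evicting the oldest entry when the buffer is full.
    fn push(&mut self, item: T) {
        if self.items.len() == self.capacity {
            self.items.pop_front();
        }
        self.items.push_back(item);
    }

    fn len(&self) -> usize {
        self.items.len()
    }

    fn oldest(&self) -> Option<&T> {
        self.items.front()
    }
}
```

With a capacity of 3, pushing five items leaves only the last three, so memory use stays bounded regardless of traffic.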
Phase 2: Pattern Extraction
Periodically (configurable interval, default every 50 trajectories), the learner analyzes collected trajectories to extract patterns:
- High-scoring trajectories: what prompt structures produce good results?
- Low-scoring trajectories: what patterns correlate with poor quality?
- Skill-specific trends: which skills are improving or degrading?
Phase 3: Prompt Mutation
Based on extracted patterns, the learner applies mutation strategies to skill prompts:
| Strategy | Description |
|---|---|
| Rephrase | Rewrite unclear instructions using patterns from high-scoring trajectories |
| Add Examples | Insert few-shot examples extracted from successful trajectories |
| Remove Ineffective | Strip instructions that correlate with low scores |
| Emphasize | Strengthen instructions that correlate with high scores |
Mutations are proposed as skill candidates (similar to skill auto-generation) and require approval before activation. This ensures human oversight over prompt evolution while enabling data-driven improvement.
Cost Tracking
File: crates/clawft-core/src/pipeline/cost_tracker.rs (~954 lines)
The cost tracker enforces per-tier budget limits with configurable daily and monthly caps. It operates in conjunction with the tiered router:
- Pre-call: The router queries the cost tracker to check whether the estimated cost fits within the budget. If not, the router downgrades to a cheaper tier.
- Post-call: After the LLM responds, the actual cost (based on token usage) is recorded against the sender's budget.
Cost records are per-sender, enabling multi-tenant deployments where different users have different spending limits.
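The pre-call and post-call steps might be sketched like this. It is a simplification under stated assumptions: one pair of caps shared by all senders instead of per-tier limits, no daily or monthly reset logic, and hypothetical names (can_afford, record).

```rust
use std::collections::HashMap;

#[derive(Default)]
struct Spend {
    daily: f64,
    monthly: f64,
}

struct CostTracker {
    daily_cap: f64,
    monthly_cap: f64,
    spend: HashMap<String, Spend>, // keyed by sender
}

impl CostTracker {
    fn new(daily_cap: f64, monthly_cap: f64) -> Self {
        Self { daily_cap, monthly_cap, spend: HashMap::new() }
    }

    // Pre-call: would the estimated cost still fit under both caps?
    fn can_afford(&self, sender: &str, estimated: f64) -> bool {
        let (daily, monthly) = self
            .spend
            .get(sender)
            .map_or((0.0, 0.0), |s| (s.daily, s.monthly));
        daily + estimated <= self.daily_cap && monthly + estimated <= self.monthly_cap
    }

    // Post-call: record the actual cost against the sender's budget.
    fn record(&mut self, sender: &str, actual: f64) {
        let entry = self.spend.entry(sender.to_string()).or_default();
        entry.daily += actual;
        entry.monthly += actual;
    }
}
```

Keeping the check and the record as separate calls matches the flow above: the router asks before the call using an estimate, and the real cost is written only once token usage is known.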
Rate Limiting
File: crates/clawft-core/src/pipeline/rate_limiter.rs (~632 lines)
Per-sender rate limiting prevents abuse. Configurable limits include requests per minute and requests per hour. When a sender exceeds their rate limit, the pipeline returns an error response without invoking the LLM.
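One way to implement such limits is a sliding window per sender, sketched below. The algorithm used by rate_limiter.rs is not stated here, so the sliding window, the names, and the error strings are all assumptions. The current time is passed in so the behavior is easy to test.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

struct RateLimiter {
    per_minute: usize,
    per_hour: usize,
    history: HashMap<String, VecDeque<Instant>>, // accepted call times per sender
}

impl RateLimiter {
    fn new(per_minute: usize, per_hour: usize) -> Self {
        Self { per_minute, per_hour, history: HashMap::new() }
    }

    // Returns Ok(()) and records the call, or Err without recording it.
    fn check(&mut self, sender: &str, now: Instant) -> Result<(), &'static str> {
        let minute = Duration::from_secs(60);
        let hour = Duration::from_secs(3600);
        let calls = self.history.entry(sender.to_string()).or_default();

        // Forget calls that have left the one-hour window.
        while calls.front().map_or(false, |t| now.duration_since(*t) >= hour) {
            calls.pop_front();
        }

        let last_minute = calls
            .iter()
            .filter(|t| now.duration_since(**t) < minute)
            .count();
        if last_minute >= self.per_minute {
            return Err("per-minute limit exceeded");
        }
        if calls.len() >= self.per_hour {
            return Err("per-hour limit exceeded");
        }
        calls.push_back(now);
        Ok(())
    }
}
```

A caller would run check before invoking the LLM and turn an Err into the error response described above.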
Permissions
File: crates/clawft-core/src/pipeline/permissions.rs (~757 lines)
The permission resolver controls access to tools and model tiers. Permissions are evaluated at two points:
- Router: The user's max_tier permission determines the highest model tier they can use.
- Tool execution: Each tool call is checked against the user's tool permissions and the active skill's allowed_tools list.
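The tool-execution check can be pictured as requiring both conditions at once, as in this sketch. The struct shapes and the function name are hypothetical, and treating the two checks as a strict intersection is an assumption about how the resolver combines them.

```rust
use std::collections::HashSet;

// Hypothetical shapes; the real types are defined in permissions.rs.
struct UserPermissions {
    tools: HashSet<String>,
}

struct Skill {
    allowed_tools: Vec<String>,
}

// A tool call passes only if both the user and the active skill permit it.
fn tool_allowed(user: &UserPermissions, skill: &Skill, tool: &str) -> bool {
    user.tools.contains(tool) && skill.allowed_tools.iter().any(|t| t == tool)
}
```

Under this reading, a tool the user may call is still rejected when the active skill does not list it, and vice versa.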
Pipeline Trait Definitions
File: crates/clawft-core/src/pipeline/traits.rs
All six stages are defined as async traits:
#[async_trait]
pub trait Classifier: Send + Sync {
    async fn classify(&self, request: &ChatRequest) -> TaskProfile;
}

#[async_trait]
pub trait Router: Send + Sync {
    async fn route(&self, profile: &TaskProfile) -> RoutingDecision;
}

#[async_trait]
pub trait Assembler: Send + Sync {
    async fn assemble(&self, messages: Vec<Message>, decision: &RoutingDecision) -> ChatRequest;
}

#[async_trait]
pub trait Transport: Send + Sync {
    async fn send(&self, request: ChatRequest) -> LlmResponse;
}

#[async_trait]
pub trait Scorer: Send + Sync {
    async fn score(&self, request: &ChatRequest, response: &LlmResponse) -> QualityScore;
}

#[async_trait]
pub trait Learner: Send + Sync {
    async fn record(&self, trajectory: Trajectory);
}

Each trait can be implemented independently and injected into the pipeline via set_pipeline() on the AppContext.
Config-Based Stage Selection (v0.3)
File: crates/clawft-types/src/config/mod.rs -- PipelineConfig
As of Sprint 13, the scorer and learner stages can be selected via configuration
instead of hard-coding the implementation. The [pipeline] section of the
config file maps backend names to trait implementations:
[pipeline]
scorer = "fitness"
learner = "trajectory"
Available Backends
| Stage | Backend | Description |
|---|---|---|
| Scorer | "noop" (default) | No-op scorer, returns zero scores |
| Scorer | "fitness" | GEPA FitnessScorer with 4-dimension weighted evaluation |
| Learner | "noop" (default) | No-op learner, discards trajectories |
| Learner | "trajectory" | GEPA TrajectoryLearner with ring buffer and mutation strategies |
Both fields default to "noop" for backward compatibility. To enable the full
GEPA adaptive learning loop, set both to their active implementations:
{
  "pipeline": {
    "scorer": "fitness",
    "learner": "trajectory"
  }
}

The PipelineConfig struct is part of the root Config and is deserialized
alongside all other configuration sections. The pipeline builder reads
config.pipeline.scorer and config.pipeline.learner at startup to
instantiate the correct trait objects.
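The selection step can be sketched as a match on the backend name that yields a boxed trait object. To stay self-contained, the sketch uses a simplified synchronous Scorer trait with a single name method, not the async trait shown earlier. The build_scorer function and its error on unknown names are assumptions; the real builder may handle unknown values differently.

```rust
// Simplified stand-in for the real async Scorer trait.
trait Scorer {
    fn name(&self) -> &'static str;
}

struct NoopScorer;
struct FitnessScorer;

impl Scorer for NoopScorer {
    fn name(&self) -> &'static str {
        "noop"
    }
}

impl Scorer for FitnessScorer {
    fn name(&self) -> &'static str {
        "fitness"
    }
}

// Hypothetical builder step: map the configured backend name to an implementation.
fn build_scorer(backend: &str) -> Result<Box<dyn Scorer>, String> {
    let scorer: Box<dyn Scorer> = match backend {
        "noop" => Box::new(NoopScorer),
        "fitness" => Box::new(FitnessScorer),
        // Assumption: unknown names are rejected, not silently mapped to noop.
        other => return Err(format!("unknown scorer backend: {other}")),
    };
    Ok(scorer)
}
```

The learner could be selected the same way, with "noop" and "trajectory" as the accepted names.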