It is a well-known fact that those people who most want to rule people are, ipso facto, those least suited to do it. Similarly, those DSPy programs that work perfectly in your notebook are the ones least prepared for production traffic.
— Douglas Adams (adapted)
There's a special kind of confidence that comes from watching your DSPy pipeline ace every test case in your notebook. The structured outputs parse perfectly. The chain-of-thought reasoning is eloquent. The metrics look great. You push to production with the swagger of someone who has clearly figured it all out.
Then traffic hits.
The first user sends a 4,000-word manifesto about their cat's dietary preferences. Your context window explodes. The second user submits the same spam message 200 times in a row, and you watch your API bill climb in real-time. The third user sends content in a language your test suite never considered, and the LLM returns a response that makes your Pydantic parser weep.
This is the chapter that stands between your brilliant prototype and a service that survives contact with real users. We're going to take a straightforward DSPy pipeline — a content moderation system — and systematically armor it with every production feature DSPy offers: caching, cost tracking, observability callbacks, fallback chains, streaming, async processing, batch handling, and deployment behind a FastAPI service.
None of this is glamorous work. It's the engineering equivalent of wearing a seatbelt. But it's the difference between "works on my machine" and "handles 10,000 requests a day without waking me up at 3 AM."
In this chapter, we build a production-ready content moderation pipeline that caches responses, tracks costs, emits observability callbacks, falls back across models, streams output, handles async and batch traffic, and ships behind a FastAPI service. First, the project scaffolding:
mkdir ch06_production && cd ch06_production
poetry init --name ch06-production --python ">=3.10,<3.15" --no-interaction

# pyproject.toml
[tool.poetry]
name = "ch06-production"
version = "0.1.0"
description = "Chapter 6: Mostly Harmless (in Production)"
authors = ["Your Name <you@example.com>"]
[tool.poetry.dependencies]
python = ">=3.10,<3.15"
dspy = ">=3.1.3,<4.0.0"
python-dotenv = ">=1.2.2,<2.0.0"
fastapi = ">=0.115.0,<1.0.0"
uvicorn = ">=0.34.0,<1.0.0"
[build-system]
requires = ["poetry-core>=2.0.0,<3.0.0"]
build-backend = "poetry.core.masonry.api"

poetry lock && poetry install && poetry shell

(Note: Poetry 2.x removed the built-in `poetry shell` command; either install the shell plugin, use `poetry env activate`, or prefix commands with `poetry run`.)

Your .env file:
LLM_API_KEY=your-anthropic-api-key-here
ANTHROPIC_API_KEY=your-anthropic-api-key-here

Why both keys? LLM_API_KEY is what we use in our code. ANTHROPIC_API_KEY is what LiteLLM (DSPy's backend) looks for automatically. Belt and suspenders.
Before we bolt on production armor, we need a pipeline worth protecting. Content moderation is ideal for this chapter because it's a real-world problem with clear requirements: take user-generated content, classify it, and decide whether to approve, flag, or reject it.
from pydantic import BaseModel, Field
from typing import Literal
import dspy
class ModerationDecision(BaseModel):
"""Structured output for a moderation decision."""
category: Literal[
"safe", "spam", "toxic", "misinformation",
"adult", "violence", "self_harm"
]
confidence: float = Field(
ge=0.0, le=1.0,
description="Confidence score from 0 to 1"
)
action: Literal["approve", "flag_for_review", "reject"]
explanation: str = Field(
description="Brief explanation of the moderation decision"
    )

The ModerationDecision Pydantic model does three things for us. First, it constrains the category to exactly seven values — the LLM can't invent new ones. Second, it enforces that confidence is a float between 0 and 1. Third, it requires an explanation, which is critical for audit trails in production.
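You can see all three guarantees without touching an LLM, because Pydantic enforces them at construction time. A quick check (the sample values here are made up for illustration):

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError

class ModerationDecision(BaseModel):
    category: Literal[
        "safe", "spam", "toxic", "misinformation",
        "adult", "violence", "self_harm",
    ]
    confidence: float = Field(ge=0.0, le=1.0)
    action: Literal["approve", "flag_for_review", "reject"]
    explanation: str

# A well-formed decision constructs cleanly.
ok = ModerationDecision(
    category="spam",
    confidence=0.97,
    action="reject",
    explanation="Repeated promotional links",
)
print(ok.action)  # reject

# An invented category AND an out-of-range confidence are both caught.
try:
    ModerationDecision(
        category="gibberish",
        confidence=1.5,
        action="approve",
        explanation="nope",
    )
except ValidationError as exc:
    print(len(exc.errors()))  # 2
```

This is exactly the check DSPy runs on the LLM's parsed output, which is why a malformed response fails loudly instead of slipping a novel category into your database.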
Now the signature and module:
class ModerateContent(dspy.Signature):
"""You are a content moderator for a social platform. Analyze the given
user-generated content and determine whether it should be approved,
flagged for human review, or rejected. Be fair and avoid over-censoring
legitimate speech. Only reject content that clearly violates policies."""
content: str = dspy.InputField(
desc="The user-generated content to moderate"
)
context: str = dspy.InputField(
desc="Additional context about where this content appears",
default="general social media post",
)
decision: ModerationDecision = dspy.OutputField(
desc="The structured moderation decision"
)
class ContentModerator(dspy.Module):
    def __init__(self):
        super().__init__()
        self.moderate = dspy.ChainOfThought(ModerateContent)
def forward(self, content, context="general social media post"):
result = self.moderate(content=content, context=context)
        return dspy.Prediction(decision=result.decision)