42Harmless DSPy
Chapter 5

So Long, and Thanks for All the Prompts


A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools. This is also true of AI agents.

— Douglas Adams (adapted)

You've built programs that classify, retrieve, and optimize. Impressive stuff. But every program you've written so far has a fundamental limitation: it can only work with the information you give it.

Ask your Chapter 4 ticket classifier about the weather? It'll try to classify it as a billing issue. Ask your Chapter 3 codebase explorer a question that isn't in the indexed files? It'll hallucinate an answer or admit defeat. Your programs are smart, but they're stuck in a box.

Agents break out of the box.

An agent is a program that can do things — search the web, read GitHub repositories, call APIs, check facts against live sources. Instead of mapping inputs to outputs in a single shot, an agent reasons about what it needs, takes actions to get it, and then reasons some more. It's the difference between a calculator and a research analyst. The calculator answers one question. The analyst goes on a quest.

In this chapter, we build two real-world agents:

  1. A Fact-Checker that searches the web with DuckDuckGo, reads live pages, fetches Wikipedia summaries, and delivers a structured verdict with sourced evidence.
  2. A Code Reviewer that hits the GitHub API, browses repository structures, reads source files, examines pull requests, and produces a thorough code review.

No fake data. No toy knowledge bases. These agents reach out into the real world. Along the way, you'll learn every agent pattern DSPy offers: ReAct for tool-using agents, ProgramOfThought and CodeAct for code-generating agents, and BestOfN/Refine for making any module's output reliably better.


Project Setup

mkdir ch05_agents && cd ch05_agents
poetry init

This chapter has more dependencies — we're reaching out to real APIs:

[tool.poetry]
name = "ch05-agents"
version = "0.1.0"
description = "Chapter 5: So Long, and Thanks for All the Prompts"
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = ">=3.10,<3.15"
dspy = ">=3.1.3,<4.0.0"
python-dotenv = ">=1.2.2,<2.0.0"
requests = ">=2.31.0,<3.0.0"
beautifulsoup4 = ">=4.12.0,<5.0.0"
duckduckgo-search = ">=8.0.0,<9.0.0"

[build-system]
requires = ["poetry-core>=2.0.0,<3.0.0"]
build-backend = "poetry.core.masonry.api"

poetry install

New additions: requests for HTTP calls to the GitHub API, beautifulsoup4 for extracting readable text from web pages, and duckduckgo-search for live web search (no API key needed). Your .env stays the same:

ANTHROPIC_API_KEY=your-anthropic-api-key-here

The Mental Model: Reasoning + Acting

Every DSPy program you've written so far follows a simple pattern: inputs go in, the LLM processes them, outputs come out. One shot. Done.

Agents add a loop. At each iteration, the agent:

  1. Thinks about what it knows and what it still needs
  2. Picks a tool to gather more information or take an action
  3. Calls the tool with specific arguments
  4. Observes the result
  5. Repeats until it has enough information to produce the final answer

This is the ReAct pattern — Reasoning and Acting, interleaved. The "reasoning" part means the agent explains its thinking at each step (like chain-of-thought). The "acting" part means it actually does something about it (like calling a search function). The interleaving is what makes it powerful — each observation informs the next thought, which informs the next action.

| Step | Thought | Action | Observation |
| --- | --- | --- | --- |
| 1 | I should search the web for recent programming language rankings. | search_web("language rankings 2024") | DuckDuckGo results with TIOBE, Stack Overflow survey links |
| 2 | Let me check the TIOBE index page for specifics. | fetch_webpage("tiobe.com/tiobe-index") | Page content with current rankings |
| 3 | Let me also get Wikipedia's overview for context. | get_wikipedia_summary("Python") | Wikipedia summary with usage stats |
| 4 | I have enough evidence. Delivering verdict. | finish() | FactCheckVerdict(mostly true, high confidence) |

The beauty is that your code doesn't hard-code this sequence. The LLM decides what to do at each step. Give it a different claim, and it'll take a completely different path — different searches, different pages, different evidence chain.
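Under the hood, that trace is just a loop. Here's a minimal hand-rolled sketch of the think/act/observe cycle, with stub tools and a canned decision policy standing in for the LLM — every name here is illustrative, not a DSPy API:

```python
# A toy ReAct loop. In a real agent, choose_action is an LLM call;
# here it's a canned policy so the control flow is easy to see.

def search_web(query: str) -> str:
    return f"results for {query!r}"  # stub observation

def finish(answer: str) -> str:
    return answer

TOOLS = {"search_web": search_web, "finish": finish}

def choose_action(history):
    # Stand-in for the reasoning step: look at what we know so far,
    # then decide the next thought + tool call.
    if not history:
        return "I need sources.", "search_web", "language rankings 2024"
    return "I have enough evidence.", "finish", "Python is #1."

def react_loop(max_iters: int = 5) -> str:
    history = []
    for _ in range(max_iters):
        thought, tool_name, arg = choose_action(history)   # steps 1-2: think, pick a tool
        observation = TOOLS[tool_name](arg)                # steps 3-4: call it, observe
        history.append((thought, tool_name, observation))  # step 5: loop with more context
        if tool_name == "finish":
            return observation
    return "ran out of iterations"

print(react_loop())  # → Python is #1.
```

Swap in a real decision-maker (the LLM) and real tools, and you have ReAct — which is exactly the loop dspy.ReAct runs for you.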


Agent 1: The Fact-Checker

Building Real Tools

Before we can build an agent, we need tools. In DSPy, a tool is just a Python function. DSPy wraps it in a dspy.Tool object that handles schema inference, argument validation, and formatting.

Our fact-checker needs three capabilities: searching the web, reading web pages, and looking things up on Wikipedia.

"""
research_agent.py — Chapter 5: Agents
"""

import os
import requests
from bs4 import BeautifulSoup
from typing import Literal

import dspy
from dotenv import load_dotenv
from pydantic import BaseModel, Field
from duckduckgo_search import DDGS

load_dotenv()


def search_web(query: str) -> str:
    """Search the web using DuckDuckGo. Returns top results with titles,
    URLs, and snippets. Use this to find relevant sources for a claim."""
    try:
        results = DDGS().text(query, max_results=5)
        if not results:
            return "No results found."
        output = []
        for r in results:
            output.append(f"- {r['title']}\n  URL: {r['href']}\n  {r['body']}")
        return "\n\n".join(output)
    except Exception as e:
        return f"Search failed: {e}"