Pydantic AI Reaches v1
Which raised the question for me: what exactly is Pydantic AI?

As I wrote recently in ChatGPT 5 Thinking Is So Damn Smart:
I spent some time at the Pydantic booth at the AI Engineer World’s Fair in SF this June. Today, I got their announcement that Pydantic AI v1 is out, and decided to write about it.
The point of that post was how superbly GPT-5 Thinking filled in my knowledge gaps around Pydantic AI. As I sat down to write this post, however, I realized that ChatGPT’s briefing document should actually BE the core of this post. That document was a joint effort: I conceived the need, asked the questions, and played critic and editor through dozens of iterations; GPT-5 Thinking did the research, drafting, and rewriting.
I write as a learning exercise, but spending hours or days trying to improve upon cto4ai’s Field Guide to Pydantic AI below would have impeded rather than furthered my learning.
Ethan Mollick recently shared his experience working with Claude’s new Excel-spreadsheet-editing features:
Just as Claude had produced a very solid solution that would have taken Mollick’s team of MBA students a week to reproduce, so too had ChatGPT 5 Thinking produced a very solid Field Guide to Pydantic AI. So I’ll just post this Guide and move on to the next learning exercise in my very long queue, having learned plenty in my role as its editor.
cto4ai’s Field Guide to Pydantic AI
This Field Guide covers the technical aspects and strategic considerations of Pydantic AI, for CTOs, engineering leaders, and company leadership.
Part 1: Core Features & Framework Comparison
Deep dive into Pydantic AI's core capabilities, architecture, and comparison with competing frameworks like LangGraph, LlamaIndex, and Semantic Kernel.
Pydantic AI v1 — CTO Field Guide (Part 1: Core + Comparison)
Last updated: 2025-09-13 (America/Chicago)
TL;DR
- What it is: a Python agent framework from the Pydantic team that brings FastAPI‑style ergonomics, type‑safe I/O, and dependency injection to LLM apps. It’s provider‑agnostic, integrates with MCP, AG‑UI, and OpenTelemetry/Logfire, and now ships with v1 API stability.
  Sources: Docs, V1 announcement, Changelog & stability policy
- Niche: developer‑centric, typed agents with first‑class observability and production surfaces (durable execution, human‑in‑the‑loop UI protocol). Competitors include LangGraph, LlamaIndex, Semantic Kernel, DSPy, and CrewAI.
  Sources: LangGraph, LlamaIndex Agents, Semantic Kernel Agent Framework, DSPy, CrewAI
What is Pydantic AI?
Pydantic AI is a GenAI agent framework for Python that emphasizes type‑safety, validation, and developer ergonomics (think: FastAPI for agents). You define an Agent with clear input/output types, add function tools, wire dependencies via DI, and run on any supported model provider.
- Homepage / Docs: ai.pydantic.dev
- Agents (concepts & API): Concepts, API
- Dependency Injection: Dependencies
- Model providers: Overview
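To make the shape concrete, here is a stdlib-only sketch of the pattern just described: a typed output contract, a function tool, and explicit dependency injection. This is not Pydantic AI’s actual API (see the docs above for the real `Agent` class); every name here is illustrative.

```python
from dataclasses import dataclass

@dataclass
class Deps:
    db_url: str  # an injected service, e.g. a database connection string

@dataclass
class Answer:
    city: str
    confidence: float  # the typed contract callers can rely on

def lookup_tool(deps: Deps, query: str) -> str:
    # A "function tool": plain Python with dependencies injected, easy to unit-test.
    return f"looked up {query!r} via {deps.db_url}"

@dataclass
class Agent:
    model: str  # provider-agnostic string ID, e.g. "openai:gpt-4o"
    tools: list
    output_type: type

    def run_sync(self, prompt: str, deps: Deps):
        # A real framework would call the model here; this stub just
        # exercises the tool and returns a typed output instance.
        self.tools[0](deps, prompt)
        return self.output_type(city="Chicago", confidence=0.9)

agent = Agent(model="openai:gpt-4o", tools=[lookup_tool], output_type=Answer)
result = agent.run_sync("largest city in Illinois?", deps=Deps(db_url="sqlite:///demo.db"))
```

The point of the shape: because `Deps` is passed explicitly, tools are ordinary testable functions, and because `output_type` is a declared class, downstream code never parses free-form model text.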
Why it matters
- Type‑checked outputs reduce brittle prompt‑parsing.
- DI keeps tools and prompts clean and testable.
- Provider‑agnostic lets you swap models without rewriting app code.
- Observability via OpenTelemetry + Pydantic Logfire gives you traces/spans of prompts, tool calls, and model requests.
Sources: Logfire docs, Logfire product
The niche it fills (and who else is in it)
Pydantic AI focuses on typed, testable, production‑grade agents with built‑in observability and durability.
Framework | Sweet Spot | Typing/DI | Orchestration | Standards & UX | Observability |
---|---|---|---|---|---|
Pydantic AI | Typed agents; DI; portability | Strong (Pydantic types, DI) | Graphs & multi‑agent patterns; durable exec via Temporal/DBOS | MCP (client/server), AG‑UI, A2A | OTel + Logfire; other OTel backends |
LangGraph | Complex agent workflows/state machines | Pythonic types; no built‑in DI | Graph orchestration, stateful agents | Human‑in‑loop & “time‑travel” in platform | LangSmith integration (platform) |
LlamaIndex | Data/RAG‑heavy apps; agents + workflows | Pythonic types; DI varies | Workflows + agents | Ecosystem tools, LlamaCloud | LlamaCloud / community tools |
Semantic Kernel | Enterprise SDK across C#/.NET, Python, Java | SDK types; plugin system | Single/multi‑agent patterns | Plugin ecosystem | Azure/enterprise integrations |
DSPy | Programmatic LLM pipelines w/ auto‑compile/evals | Declarative modules, compiles programs | Pipelines; can drive agent loops | Research‑centric; evals/optimizers | Ecosystem integrations |
CrewAI | Multi‑agent “crews”, YAML first, enterprise builder | Pydantic models for structured output | Crew/flow orchestration | MCP integrations; visual builder | Integrates with Langfuse/others |
Sources: LangGraph, LlamaIndex Agents, Semantic Kernel Agent Framework, DSPy, CrewAI Agents
Positioning takeaway: If you want Python‑native typing/validation, clear DI, portable providers, open standards (MCP/A2A/OTel) and observability out‑of‑the‑box, Pydantic AI is the most “software‑engineering‑friendly” choice right now.
When would I use Pydantic AI? (versus alternatives)
Scenario | Choose Pydantic AI if… | Consider Alternative(s) | Notes |
---|---|---|---|
Typed contracts to your app (strict JSON / enums / schemas); easy testing | You want first-class typing/validation and FastAPI-style DI, and you’ll swap providers without rewrites | LangGraph + Pydantic models; DSPy for programmatic pipelines | Pydantic AI’s type safety + DI keeps tools/business logic clean and testable |
Complex, stateful multi-agent graphs with node-level checkpointing and time-travel | Your graph is moderate and you prefer plain Python + DI | LangGraph | LangGraph excels at large graphs and recovery tooling |
Heavy RAG / knowledge work (indexes, retrievers, query engines) | You’ll still orchestrate agents/tools in Python and plug retrieval underneath | LlamaIndex, LangChain RAG | Use Pydantic AI for the agent layer; call into a RAG lib as a tool |
Cross-language enterprise SDKs (C#, Java) and Azure tilt | You’re all-Python today | Semantic Kernel | SK shines when you need multi-language parity and Azure integrations |
Auto-optimization of prompts/pipelines; researchy LLM programming | You want app-level agents more than compiler-style optimization | DSPy | Pairing Pydantic AI + DSPy is common: DSPy optimizes, Pydantic AI orchestrates |
Standards-forward UX & tools (MCP, AG-UI, A2A) | You want out-of-box MCP tools/servers and a consistent agent↔UI protocol | LangGraph (platform features), CrewAI (builder) | Pydantic AI bakes MCP/AG-UI/A2A into the core story |
Observability-first (OpenTelemetry, redaction, cost/trace views) | You want vendor-neutral OTel plus deep Logfire hooks | LangSmith (LangChain), Langfuse/Weave (3rd-party) | Pydantic AI has OTel primitives; Logfire adds rich traces/costs |
Durable execution / human-in-loop over hours–days | You like Temporal/DBOS-style durability with Python code | LangGraph (checkpointing), bespoke workflow engines | Pydantic AI documents Temporal/DBOS patterns for robust durability |
Python/FastAPI shop, want simple DI and clean test seams | Your team’s muscle memory is Pydantic/FastAPI | — | Lowest friction path for Python backends |
Low-code multi-agent builder for non-dev teammates | You want a visual builder & hosted flows | CrewAI (builder), LangGraph Studio | Pydantic AI is code-first; great for engineers, less for no-code |
Quick checklist: use Pydantic AI when…
- You need typed, validated outputs/tools (schemas, enums, auto‑retry on schema fail).
- You want FastAPI‑style DI to inject DB clients, secrets, and services cleanly.
- You need provider portability (OpenAI ↔ Anthropic ↔ Groq/Bedrock/HF) without rewrites.
- You care about observability from day 1 (OpenTelemetry + optional Logfire, cost/redaction).
- You need durable/human‑in‑loop flows or standards like MCP/AG‑UI/A2A.
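The first checklist item, “auto‑retry on schema fail,” can be sketched with the stdlib alone: validate the model’s raw output, and on failure re‑ask once with the validation error appended to the prompt. The `fake_model` function below stands in for a real LLM call and is purely hypothetical, as is the schema.

```python
import json

def validate(raw: str) -> dict:
    # Enforce a minimal output schema on the raw model text.
    data = json.loads(raw)
    if not isinstance(data.get("temperature_c"), (int, float)):
        raise ValueError("temperature_c must be a number")
    return data

def fake_model(prompt: str) -> str:
    # Stand-in for an LLM: first call violates the schema; once the
    # validation error appears in the prompt, it returns valid JSON.
    if "must be a number" in prompt:
        return '{"temperature_c": 21.5}'
    return '{"temperature_c": "warm"}'

def run_with_retry(prompt: str, retries: int = 1) -> dict:
    while True:
        raw = fake_model(prompt)
        try:
            return validate(raw)
        except ValueError as err:
            if retries == 0:
                raise
            retries -= 1
            prompt = f"{prompt}\n(previous output invalid: {err})"

result = run_with_retry("What's the temperature in Chicago?")
```

In Pydantic AI the validation step is a real Pydantic model and the retry loop is built in; this sketch only shows why feeding the validation error back tends to repair structured output.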
What’s new & unique in v1
- API Stability Commitment — no breaking changes until v2 (earliest April 2026).
  Source: Changelog & policy
- Observability as a first‑class citizen — native OpenTelemetry spans; optional Logfire with deep views and HTTP capture.
  Source: Logfire docs
- Open standards baked in — MCP, AG‑UI, A2A.
  Sources: MCP, AG‑UI, A2A
- Durable execution — examples for Temporal and DBOS.
  Sources: Overview, Temporal, DBOS
- Provider‑agnostic models — OpenAI/Anthropic/Gemini/Groq/Mistral/Cohere/Bedrock/HF.
  Source: Model providers
Quick mental model (how it fits together)
- Agent: typed instructions + tools + output type.
  Source: Agents
- Tools: Python functions; input/return types enforce structure; DI injects clients/secrets.
  Source: Function Tools
- Dependencies (DI): a dataclass or Pydantic model passed to runs and tools.
  Source: Dependencies
- Providers: defined by string IDs (e.g., openai:gpt‑4o, anthropic:claude‑3‑5).
  Source: Providers
- Observability/Durability/UX: OTel + Logfire; Temporal/DBOS; MCP & AG‑UI.
  Sources: Logfire, Durable exec, MCP, AG‑UI
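The provider string IDs above suggest a simple mechanic worth internalizing: the prefix selects a provider adapter, the suffix names the model, so application code never hard‑codes a vendor SDK. A stdlib sketch of that idea, with purely illustrative adapter names (not Pydantic AI internals):

```python
def openai_adapter(model: str, prompt: str) -> str:
    # Stand-in for a call through the OpenAI SDK.
    return f"[openai/{model}] {prompt}"

def anthropic_adapter(model: str, prompt: str) -> str:
    # Stand-in for a call through the Anthropic SDK.
    return f"[anthropic/{model}] {prompt}"

REGISTRY = {"openai": openai_adapter, "anthropic": anthropic_adapter}

def run(model_id: str, prompt: str) -> str:
    # "openai:gpt-4o" -> adapter lookup by prefix, model name passed through.
    provider, _, model = model_id.partition(":")
    return REGISTRY[provider](model, prompt)

reply = run("openai:gpt-4o", "hello")
```

Swapping `"openai:gpt-4o"` for `"anthropic:claude-3-5"` changes only the string, which is the portability claim in miniature.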
Part 2: Links, Resources & Strategic Analysis
Curated resources, industry perspectives, and strategic considerations for adopting Pydantic AI in your organization.
Pydantic AI v1 — CTO Field Guide (Part 2: Links, Media, Editor’s Angle)
Last updated: 2025‑09‑13 (America/Chicago)
Light comparison notes
Pydantic AI vs. LangGraph
- Choose Pydantic AI if you value typing/DI, provider portability and OTel/Logfire out‑of‑box.
- Choose LangGraph if your primary challenge is complex multi‑agent orchestration with checkpointing/state machines (and you’re okay leaning on LangSmith/Platform).
Source: LangGraph overview
Pydantic AI vs. LlamaIndex
- Choose Pydantic AI for typed agents + DI and provider portability.
- Choose LlamaIndex when your app is RAG/data‑heavy and you want its retrieval ecosystem/workflows.
Source: LlamaIndex Agents
Pydantic AI vs. Semantic Kernel
- Choose Pydantic AI for Python‑first ergonomics and DI;
- Choose SK if you need multi‑language SDKs and Azure‑oriented enterprise patterns.
Sources: SK Agent Framework, SK overview
Pydantic AI vs. DSPy
- Choose Pydantic AI for app‑level agents;
- Choose DSPy when you want programmatic pipelines that compile to better prompts/weights and integrate into your stack for optimization/evals.
Sources: DSPy, DSPy paper
Pydantic AI vs. CrewAI
- Choose Pydantic AI when you want typed agents + DI and standards (MCP/AG‑UI/A2A);
- Choose CrewAI for multi‑agent “crew” orchestration and visual builder/enterprise features.
Source: CrewAI Agents
Recent podcast & video appearances (last ~60–90 days)
- Software Engineering Radio #676 — Samuel Colvin on Pydantic & Pydantic AI (2025‑07‑10).
  Links: Episode page, Apple Podcasts
- YouTube (Pydantic channel): Pydantic AI — MCP Sampling (published ~2 months ago).
  Link: Video
- YouTube (AI Engineer World’s Fair): MCP is all you need — Samuel Colvin (published ~2–3 months ago).
  Link: Video
If you want a tighter 60–90 day window by exact publish date, use the timestamps above.
Release notes & ecosystem links
- V1 announcement (2025‑09‑04): Blog post
- Releases: GitHub releases, PyPI
- Stability policy: Changelog (V1 policy & timeline)
- Observability: Logfire docs, Logfire product
- Standards: MCP, AG‑UI, A2A
- Durability: Overview, Temporal, DBOS
- Examples: How to run them
Editor’s angle (for cto4.ai)
If you like FastAPI and Pydantic, this is the closest thing to that experience for agents: you get deterministic interfaces (types), clean injection of real services, model portability, and production add‑ons (OTel, Logfire, MCP, AG‑UI, Temporal/DBOS) that reduce time‑to‑prod. The trade‑off vs. a heavy orchestration platform is that you’ll compose more of your own workflow control—by design, it keeps you in normal Python.