Pydantic AI Output Validators: Runtime Validation with Dependencies

One of the best things about Pydantic AI is how it integrates with Pydantic's validation system. You get automatic output validation, with retries when the model produces something invalid. No manual prompt engineering to get the format right: just define your models and let the framework handle the back-and-forth with the LLM.

I've been playing around with the @output_validator decorator and wanted to jot down some notes, particularly about the case where you need runtime dependencies for validation.

A simple example

Here's a setup where I'm generating SQL queries and want to validate them based on a runtime policy:

import os

import logfire
from dotenv import load_dotenv
from pydantic import BaseModel
from pydantic_ai import Agent, ModelRetry, RunContext

load_dotenv()

logfire.configure(token=os.getenv("LOGFIRE_TOKEN"))
logfire.instrument_pydantic_ai()


class SQLOutput(BaseModel):
    sql_query: str


class SQLValidationDeps(BaseModel):
    allow_select_star: bool


agent = Agent[SQLValidationDeps, SQLOutput](
    "anthropic:claude-3-5-haiku-latest",
    output_type=SQLOutput,
    deps_type=SQLValidationDeps,
    system_prompt="Generate PostgreSQL flavored SQL queries based on user input.",
    retries=3,
)


@agent.output_validator
async def validate_sql(
    ctx: RunContext[SQLValidationDeps], output: SQLOutput
) -> SQLOutput:
    """Validate that SQL queries follow the SELECT * policy."""

    if not ctx.deps.allow_select_star and "SELECT *" in output.sql_query.upper():
        raise ModelRetry(
            "SELECT * is not allowed. Please specify explicit column names."
        )

    return output


# Example usage
if __name__ == "__main__":
    # Demo with SELECT * disabled
    result = agent.run_sync(
        "Get all users who were active yesterday",
        deps=SQLValidationDeps(allow_select_star=False),
    )
    print("Result with SELECT * disabled:")
    print(result.output)
    print()

    # Demo with SELECT * enabled
    result = agent.run_sync(
        "Show me all user data", deps=SQLValidationDeps(allow_select_star=True)
    )
    print("Result with SELECT * enabled:")
    print(result.output)

Why output validators instead of field validators?

I initially thought I could just use Pydantic's regular @field_validator on the SQLOutput model. The problem is that field validators run during model instantiation and don't have access to runtime dependencies.
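To make that limitation concrete, here's roughly what the field-validator version would look like (a sketch using plain Pydantic v2, separate from the agent code above). It can inspect the value, but the policy has to be hard-coded because there's no way to pass allow_select_star in at validation time:

```python
from pydantic import BaseModel, field_validator


class SQLOutput(BaseModel):
    sql_query: str

    @field_validator("sql_query")
    @classmethod
    def no_select_star(cls, v: str) -> str:
        # Runs during model instantiation: it only sees the field value.
        # There's no RunContext here, so the SELECT * policy can't depend
        # on request-scoped deps - it's frozen into the model definition.
        if "SELECT *" in v.upper():
            raise ValueError("SELECT * is not allowed")
        return v
```

That's fine for rules that never change, but useless for per-request policy.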

In my actual use case, I needed to run EXPLAIN {query} on the generated SQL to check it used the correct functions available in the database. To do that I needed the user's database credentials, which came in with the request. A @field_validator has no way to access that context since it's not part of the model being validated.

Output validators solve this. They run after the model is created but before it's returned, with full access to the RunContext and dependencies.

The ModelRetry exception

When validation fails, you raise ModelRetry. This tells Pydantic AI to retry the generation with the feedback you provide. It's not just any exception; it's a signal to the framework.

This matters for error handling. If you're wrapping validation logic in try/except, you need to re-raise ModelRetry or it'll get swallowed:

try:
    # some validation logic
    if validation_fails:
        raise ModelRetry("Try again with better parameters")
except ModelRetry:
    # Re-raise ModelRetry exceptions - don't catch them!
    raise
except Exception as e:
    # Handle other exceptions
    logger.error(f"Unexpected error: {e}")
    raise ModelRetry("An unexpected error occurred, please try again")

Retries

The retries=3 parameter on the agent controls how many times it'll try to generate a valid response before giving up. The default is 1, which I found too low for complex validation. Three retries seemed reasonable for my SQL case: enough to recover from simple mistakes like SELECT * when it's not allowed, but not so many that you're burning tokens on fundamentally broken queries.

The agent keeps trying until either the model produces something that passes validation, or it runs out of retries. When the limit is hit, you get an UnexpectedModelBehavior exception with a message like Exceeded maximum retries (3) for result validation, where the number reflects whatever you set retries to.