Streamlit is synchronous. Pydantic AI is built around async/await. This post is a brain dump of what I figured out when trying to make them play nicely together.
The full code for a simple Streamlit x PydanticAI chat UI is provided below. Skip ahead if you already know about sync vs async in Python, or just want the code.
Quick async primer
In normal Python, statements execute one after another. When you call time.sleep(2), the current thread stops and waits. Nothing else in that thread can run during those 2 seconds.
Async changes this. When you mark a function with async def, it becomes a coroutine that can be paused and resumed. The key difference: await asyncio.sleep(2) doesn't block the interpreter. It tells the event loop "I'm waiting, run other code while I wait."
```python
import asyncio
import time

async def task_a():
    print("Task A starting")
    time.sleep(1)  # Blocking call — does NOT yield to event loop
    print("Task A about to await")
    await asyncio.sleep(2)  # Now we yield control
    print("Task A finished")

async def task_b():
    print("Task B starting")
    await asyncio.sleep(1)
    print("Task B finished")

async def main():
    start = time.time()
    await asyncio.gather(task_a(), task_b())  # Run concurrently
    print(f"Total time: {time.time() - start:.1f} seconds")

asyncio.run(main())

# Sample output:
# Task A starting
# (1-second pause due to blocking time.sleep)
# Task A about to await
# Task B starting
# Task B finished (1 second after it started)
# Task A finished (2 seconds later)
# Total time: ~3.0 seconds
```
The event loop is the scheduler behind all this. It keeps a queue of coroutines, and when one hits an await, it pauses that coroutine and switches to another. When the awaited operation completes, it resumes where it left off.
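This hand-off is easy to see directly. The sketch below uses await asyncio.sleep(0), which does nothing but yield control back to the event loop, so two coroutines take strict turns (the names and step counts here are arbitrary):

```python
import asyncio

# Each await asyncio.sleep(0) hands control back to the event loop,
# which resumes the other coroutine from its ready queue.
order: list[str] = []

async def worker(name: str, steps: int) -> None:
    for i in range(steps):
        order.append(f"{name}{i}")
        await asyncio.sleep(0)  # yield to the event loop

async def main() -> None:
    await asyncio.gather(worker("a", 3), worker("b", 3))

asyncio.run(main())
print(order)  # -> ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

The strictly alternating order shows there's no parallelism here, just cooperative switching at each await.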
In the example above, task_a() blocks the entire event loop for 1 second with time.sleep(1) because that's a synchronous call. task_b() can't start until task_a() reaches its await. After that, both run concurrently. Total runtime: about 3 seconds.
Importantly, only one coroutine runs at a time. This is cooperative concurrency, not parallelism — much like threads under the GIL. Async only helps when operations can run independently. If one operation depends on the output of another, you still have to wait.
This is why Pydantic AI's streaming in a chat UI is well-suited to async. Each token can be processed as it arrives without waiting for the complete response. But in a batch agentic workflow where one agent depends on the previous agent's output, async wouldn't help with that dependency chain.
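A toy version of that dependency chain, with asyncio.sleep standing in for API latency (the fetch functions and timings are made up for illustration):

```python
import asyncio
import time

async def fetch_user() -> dict:
    await asyncio.sleep(0.2)  # simulated API latency
    return {"id": 42}

async def fetch_orders(user: dict) -> list[str]:
    await asyncio.sleep(0.2)  # needs the user id, so it can't start earlier
    return [f"order-for-{user['id']}"]

async def main() -> list[str]:
    user = await fetch_user()          # must complete first
    return await fetch_orders(user)    # dependency forces sequential awaits

start = time.time()
orders = asyncio.run(main())
elapsed = time.time() - start
print(orders, f"{elapsed:.1f}s")  # ~0.4s: the two waits add up
```

Because the second call consumes the first call's result, asyncio.gather() has nothing to overlap; the total time is the sum of the two waits.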
Also, async functions must be called from within an async context. You can't await fetch_data() from synchronous code. You need asyncio.run() to bridge the gap, which creates a new event loop, runs the coroutine, then closes the loop.
```python
# Synchronous context - need to create an event loop
result = asyncio.run(fetch_data())

# Async context - can await directly
async def main():
    result = await fetch_data()
```
Random thought: it'd be nice if you could use Polars asynchronously. Since the lazy API offloads work to Rust, it seems like it should be possible? Similarly for any C-based lib, etc.?
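You can approximate this today with asyncio.to_thread(), which runs a blocking call on a worker thread so the event loop stays free. A hedged sketch — time.sleep stands in for a hypothetical blocking call like LazyFrame.collect(), and whether two such calls truly overlap depends on the library releasing the GIL:

```python
import asyncio
import time

def blocking_query(seconds: float) -> str:
    # Stand-in for a blocking call, e.g. df.lazy().filter(...).collect()
    time.sleep(seconds)
    return f"done after {seconds}s"

async def main() -> list[str]:
    # to_thread pushes each blocking call onto a worker thread,
    # so the event loop can schedule both concurrently.
    return await asyncio.gather(
        asyncio.to_thread(blocking_query, 0.3),
        asyncio.to_thread(blocking_query, 0.3),
    )

start = time.time()
results = asyncio.run(main())
elapsed = time.time() - start
print(results, f"{elapsed:.1f}s")  # ~0.3s, not 0.6s: the threads overlap
```

This isn't a real async API, just a thread-pool escape hatch, but it keeps a blocking query from freezing everything else on the loop.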
Pydantic AI's async design
Pydantic AI's streaming interface is async:
```python
async with agent.run_stream(prompt) as response:
    async for text in response.stream_text():
        print(text)
```
This makes sense for LLM applications where you're waiting for tokens to stream back from remote APIs.
The Streamlit bridge
The solution uses asyncio.run() to execute async code within Streamlit's synchronous environment. It's simple once you see it.
```python
import asyncio

import streamlit as st
from pydantic_ai import Agent
from pydantic_ai.messages import (
    ModelMessage,
    ModelRequest,
    ModelResponse,
    TextPart,
    UserPromptPart,
)


def convert_messages_to_pydantic_ai(messages: list[dict]) -> list[ModelMessage]:
    """Convert simple chat messages to pydantic-ai format."""
    pydantic_messages = []
    for msg in messages:
        if msg["role"] == "user":
            pydantic_messages.append(
                ModelRequest(parts=[UserPromptPart(content=msg["content"])])
            )
        elif msg["role"] == "assistant":
            pydantic_messages.append(
                ModelResponse(parts=[TextPart(content=msg["content"])])
            )
    return pydantic_messages


async def stream_agent_response(
    agent: Agent, prompt: str, message_history: list[ModelMessage] | None = None
) -> str:
    """Stream response from agent with conversation history support."""
    message_placeholder = st.empty()
    full_response = ""
    async with agent.run_stream(prompt, message_history=message_history) as response:
        async for text in response.stream_text():
            full_response = text  # pydantic-ai gives cumulative text
            message_placeholder.markdown(full_response + "▌")
    # Remove cursor and show final text
    message_placeholder.markdown(full_response)
    return full_response


st.set_page_config(page_title="Streamlit + Pydantic-AI", page_icon="⚡")
st.markdown("### Streamlit + Pydantic-AI Streaming Demo")

# Initialise session state
if "messages" not in st.session_state:
    st.session_state.messages = []

# Create agent
agent = Agent(
    "openai:gpt-4o",
    system_prompt="You are a helpful assistant. Be conversational and engaging.",
)

# Display chat history
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Chat input
if prompt := st.chat_input("Type your message..."):
    # Add user message to history and display it
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # Stream the assistant response
    with st.chat_message("assistant"):
        # Convert prior chat history to pydantic-ai format, excluding the
        # prompt we just appended — it's passed to run_stream separately
        message_history = convert_messages_to_pydantic_ai(
            st.session_state.messages[:-1]
        )
        # Stream the response using asyncio.run
        full_response = asyncio.run(
            stream_agent_response(agent, prompt, message_history)
        )

    # Add assistant message to history
    st.session_state.messages.append({"role": "assistant", "content": full_response})
```
There are two things worth noting about the implementation.
Message format conversion: Streamlit uses simple dicts with "role" and "content" keys. Pydantic AI needs its own ModelRequest and ModelResponse objects. The convert_messages_to_pydantic_ai() function handles that translation.
Streaming with st.empty(): The streaming effect uses st.empty() to create a placeholder that gets updated in real-time. The cursor character "▌" gives visual feedback that the response is still generating.
I initially worried this approach might cause full page reloads after each streamed chunk, which would be terrible for networking costs. I ran tests and verified this doesn't happen — only the st.empty() placeholder is updated.
I'm interested in trying out Pydantic AI with FastHTML instead of Streamlit, but it takes a bit more work to get up and running.