# Redis OpenAI Agents
Production-ready Redis integrations for the OpenAI Agents SDK. Replace 5+ separate systems with a single Redis deployment.
- **Session Storage**: Persistent, distributed conversation storage with `AgentSession` and `JSONSession`.
- **Semantic Caching**: Reduce LLM costs by 25%+ with two-level semantic caching.
- **Semantic Routing**: Route queries to agents using vector similarity; no LLM calls required.
- **Vector Search**: Build RAG applications with Redis vector search (HNSW).
- **Token Streaming**: Reliable, replayable token streaming via Redis Streams (see the sketch after this list).
- **Observability**: Built-in metrics with RedisTimeSeries and Prometheus.
- **Middleware**: Around-style middleware for the Agents SDK: cache, router, composition.
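The token-streaming piece, for instance, builds on plain Redis Streams: each token is appended with `XADD` and can be re-read from any offset, which is what makes the stream replayable after a dropped connection. Below is a minimal sketch of that underlying pattern using `redis-py` directly; the stream name `agent:stream:demo` and the `token` field are illustrative, not the package's streaming API.

```python
import redis

# Sketch of the replayable token-stream pattern with plain redis-py
# (illustrative only; not the redis-openai-agents streaming API).
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Producer: append each generated token to a stream.
for token in ["Paris", " is", " the", " capital", "."]:
    r.xadd("agent:stream:demo", {"token": token})

# Consumer: read from the beginning ("0"), so a reconnecting client
# can replay every token it missed.
last_id = "0"
for _stream, messages in r.xread({"agent:stream:demo": last_id}, block=1000):
    for entry_id, fields in messages:
        print(fields["token"], end="")
        last_id = entry_id  # resume point for the next XREAD
print()
```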
## Installation

Install `redis-openai-agents` into a Python (>= 3.10) environment using pip:

```bash
pip install redis-openai-agents
```
Then make sure Redis is reachable with the Search & Query features enabled, either on Redis Cloud or locally in Docker. The official `redis:8` image includes Search, JSON, Time Series, and Bloom filters:

```bash
docker run -d --name redis -p 6379:6379 redis:8
```

For a GUI, run Redis Insight separately: `docker run -d --name redisinsight -p 5540:5540 redis/redisinsight:latest`.
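To confirm the container is up and the modules are available, a quick check with `redis-py` looks like this; a minimal sketch, assuming the default local URL:

```python
import redis

# Sanity check that Redis is reachable and the modules are loaded
# (adjust the URL for your deployment).
r = redis.Redis.from_url("redis://localhost:6379", decode_responses=True)
print(r.ping())          # True if the server is reachable
print(r.module_list())   # should include search, JSON, and timeseries entries
```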
## Quick Start

### Agent Sessions
Replace SQLite sessions with Redis for persistent, distributed conversation storage:
```python
import asyncio

from agents import Agent, Runner
from redis_openai_agents import AgentSession


async def main():
    # Create a Redis-backed session
    session = AgentSession.create(
        user_id="user_123",
        redis_url="redis://localhost:6379"
    )
    # Define your agent
    agent = Agent(name="assistant", instructions="You are a helpful assistant.")
    # Run the agent
    result = await Runner.run(agent, input="Hello!")
    # Store the conversation in Redis
    session.store_agent_result(result)


asyncio.run(main())
```
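Because the session lives in Redis rather than a local SQLite file, any process pointed at the same server can pick the conversation back up. The snippet below sketches that idea; reusing `user_id` to reattach and the `get_items()` accessor are assumptions for illustration, not confirmed APIs:

```python
# Hypothetical: reattach to the same conversation from another process.
# AgentSession.create() with an existing user_id is assumed to reuse the
# stored history; get_items() is an illustrative name, not a confirmed API.
session = AgentSession.create(
    user_id="user_123",
    redis_url="redis://localhost:6379",
)
history = session.get_items()  # hypothetical accessor; see the session docs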
### Semantic Caching

Reduce LLM costs by caching responses to semantically similar queries:
```python
from redis_openai_agents import SemanticCache

cache = SemanticCache(
    redis_url="redis://localhost:6379",
    similarity_threshold=0.9,  # minimum similarity score for a hit
    ttl=3600                   # cache entries expire after one hour
)

# Check the cache before calling the LLM
result = cache.get(query="What is the capital of France?")
if result:
    print(f"Cache hit: {result.response}")
else:
    response = "Paris is the capital of France."
    cache.set(query="What is the capital of France?", response=response)
```
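In practice the cache sits in front of the agent call as a read-through layer: serve from Redis on a hit, otherwise run the agent and store its output. Here is a sketch under the assumptions that the cache API is as shown above and that the run result exposes the SDK's `final_output`:

```python
import asyncio

from agents import Agent, Runner
from redis_openai_agents import SemanticCache

cache = SemanticCache(redis_url="redis://localhost:6379",
                      similarity_threshold=0.9, ttl=3600)
agent = Agent(name="assistant", instructions="You are a helpful assistant.")


async def answer(query: str) -> str:
    # Serve semantically similar queries from Redis without an LLM call.
    hit = cache.get(query=query)
    if hit:
        return hit.response
    # Cache miss: run the agent, then store the answer for future queries.
    result = await Runner.run(agent, input=query)
    cache.set(query=query, response=result.final_output)
    return result.final_output


print(asyncio.run(answer("What is the capital of France?")))
```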
### Semantic Routing

Route queries to the appropriate agent using vector similarity:
```python
from redis_openai_agents import SemanticRouter, Route

router = SemanticRouter(
    name="support-router",
    redis_url="redis://localhost:6379",
    routes=[
        Route(
            name="billing",
            references=["payment issue", "invoice", "refund request"],
            metadata={"agent": "billing_agent"},
            distance_threshold=0.3
        ),
        Route(
            name="technical",
            references=["bug report", "error message", "not working"],
            metadata={"agent": "tech_agent"},
            distance_threshold=0.3
        ),
    ]
)

# Route a query (vector lookup, not an LLM call)
match = router.route("I need help with my payment")
print(f"Route to: {match.name}")  # "billing"
```
## Why Redis OpenAI Agents?

| Challenge | Without Redis | With Redis OpenAI Agents |
|---|---|---|
| Session Storage | SQLite (single-node) | Distributed Redis sessions |
| Caching | None or external service | Built-in semantic cache |
| Vector Search | Pinecone, Qdrant ($70+/mo) | Redis Vector Search (free) |
| Streaming | Custom WebSocket code | Redis Streams (reliable) |
| Metrics | Prometheus + Grafana setup | Built-in TimeSeries |
| Total Services | 5+ separate systems | 1 Redis deployment |