Middleware#
Redis-backed middleware for LangGraph agent workflows, including semantic caching, tool result caching, conversation memory, and semantic routing.
Middleware Implementations#
- class SemanticCacheMiddleware(config)[source]#
Bases: AsyncRedisMiddleware
Middleware that caches LLM responses based on semantic similarity.
Uses redisvl.extensions.llmcache.SemanticCache to store and retrieve cached responses. When a request is semantically similar to a previous request (within the distance threshold), the cached response is returned without calling the LLM.
By default, only “final” responses (those without tool_calls) are cached. This prevents caching intermediate responses that require tool execution.
Example
```python
from langgraph.middleware.redis import (
    SemanticCacheMiddleware,
    SemanticCacheConfig,
)

config = SemanticCacheConfig(
    redis_url="redis://localhost:6379",
    distance_threshold=0.1,
    ttl_seconds=3600,
)
middleware = SemanticCacheMiddleware(config)

async def call_llm(request):
    # Your LLM call here
    return response

# Use middleware
result = await middleware.awrap_model_call(request, call_llm)
```
Initialize the semantic cache middleware.
- Parameters:
config (SemanticCacheConfig) – Configuration for the semantic cache.
- async awrap_model_call(request, handler)[source]#
Wrap a model call with semantic caching.
Checks the cache for a semantically similar request. If found, returns the cached response. Otherwise, calls the handler and optionally caches the result.
- Parameters:
request (ModelRequest) – The model request containing messages.
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The async function to call the model.
- Returns:
The model response (from cache or handler).
- Raises:
Exception – If graceful_degradation is False and cache operations fail.
- Return type:
ModelResponse | AIMessage
- class ToolResultCacheMiddleware(config)[source]#
Bases: AsyncRedisMiddleware
Middleware that caches tool call results using exact-match lookup.
Uses direct Redis GET/SET for deterministic tool result caching. When a tool is called with the same arguments as a previous call, the cached result is returned without executing the tool.
Tool caching is especially useful for expensive operations like:
- API calls to external services
- Database queries
- Web searches
- Complex calculations
Example
```python
from langgraph.middleware.redis import (
    ToolResultCacheMiddleware,
    ToolCacheConfig,
)

config = ToolCacheConfig(
    redis_url="redis://localhost:6379",
    cacheable_tools=["search", "calculate"],
    excluded_tools=["random_number"],
    ttl_seconds=3600,
)
middleware = ToolResultCacheMiddleware(config)

async def execute_tool(request):
    # Your tool execution here
    return result

# Use middleware
result = await middleware.awrap_tool_call(request, execute_tool)
```
Initialize the tool cache middleware.
- Parameters:
config (ToolCacheConfig) – Configuration for the tool cache.
- async awrap_model_call(request, handler)[source]#
Pass through model calls without caching.
This method is part of the LangChain AgentMiddleware protocol. Tool cache middleware only caches tool calls, not model calls.
- Parameters:
request (ModelRequest) – The model request.
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The async function to call the model.
- Returns:
The model response from the handler.
- Return type:
ModelResponse | AIMessage
- async awrap_tool_call(request, handler)[source]#
Wrap a tool call with exact-match caching.
This method is part of the LangChain AgentMiddleware protocol. Checks the cache for an exact tool+args match. If found, returns the cached result. Otherwise, calls the handler and caches the result.
- Parameters:
request (ToolCallRequest) – The tool call request.
handler (Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]]) – The async function to execute the tool.
- Returns:
The tool result (from cache or handler).
- Raises:
Exception – If graceful_degradation is False and cache operations fail.
- Return type:
ToolMessage | Command
- class ConversationMemoryMiddleware(config)[source]#
Bases: AsyncRedisMiddleware
Middleware that injects relevant past messages into context.
Uses redisvl.extensions.message_history.SemanticMessageHistory to store conversation history and retrieve semantically relevant past messages. This enables long-term memory for conversational agents by:
- Storing all messages in Redis with vector embeddings
- Retrieving relevant past context based on the current query
- Injecting context to help the model maintain coherent conversations
Example
```python
from langgraph.middleware.redis import (
    ConversationMemoryMiddleware,
    ConversationMemoryConfig,
)

config = ConversationMemoryConfig(
    redis_url="redis://localhost:6379",
    session_tag="user_123",
    top_k=5,
    distance_threshold=0.7,
)
middleware = ConversationMemoryMiddleware(config)

# Use with your model calls
result = await middleware.awrap_model_call(request, call_model)
```
Initialize the conversation memory middleware.
- Parameters:
config (ConversationMemoryConfig) – Configuration for the conversation memory.
- async awrap_model_call(request, handler)[source]#
Wrap a model call with conversation memory.
This method is part of the LangChain AgentMiddleware protocol. Retrieves relevant past messages based on the current query, injects them into the context, and stores the new exchange.
- Parameters:
request (ModelRequest) – The model request containing messages.
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The async function to call the model.
- Returns:
The model response.
- Raises:
Exception – If graceful_degradation is False and history operations fail.
- Return type:
ModelResponse | AIMessage
- async awrap_tool_call(request, handler)[source]#
Pass through tool calls without memory operations.
This method is part of the LangChain AgentMiddleware protocol. Conversation memory only applies to model calls, not tool calls.
- Parameters:
request (ToolCallRequest) – The tool call request.
handler (Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]]) – The async function to execute the tool.
- Returns:
The tool result from the handler.
- Return type:
ToolMessage | Command
- class SemanticRouterMiddleware(config)[source]#
Bases: AsyncRedisMiddleware
Middleware that routes requests based on semantic similarity.
Uses redisvl.extensions.router.SemanticRouter to classify user intents and route requests to appropriate handlers. This is useful for:
- Directing queries to specialized agents
- Triggering specific workflows based on intent
- Adding routing metadata for downstream processing
Example
```python
from langgraph.middleware.redis import (
    SemanticRouterMiddleware,
    SemanticRouterConfig,
)

routes = [
    {"name": "greeting", "references": ["hello", "hi", "hey"]},
    {"name": "support", "references": ["help", "issue", "problem"]},
]

config = SemanticRouterConfig(
    redis_url="redis://localhost:6379",
    routes=routes,
)
middleware = SemanticRouterMiddleware(config)

# Register custom handler for greeting route
@middleware.register_route_handler("greeting")
async def handle_greeting(request, route_match):
    return {"content": "Hello! How can I help you today?"}
```
Initialize the semantic router middleware.
- Parameters:
config (SemanticRouterConfig) – Configuration for the semantic router.
- async awrap_model_call(request, handler)[source]#
Wrap a model call with semantic routing.
This method is part of the LangChain AgentMiddleware protocol. Determines the route based on the user's message and either:
- Calls a registered route handler if one exists
- Adds routing info to the request and calls the default handler
- Parameters:
request (ModelRequest) – The model request containing messages.
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The async function to call the model.
- Returns:
The model response.
- Raises:
Exception – If graceful_degradation is False and routing fails.
- Return type:
ModelResponse | AIMessage
- async awrap_tool_call(request, handler)[source]#
Pass through tool calls without routing.
This method is part of the LangChain AgentMiddleware protocol. Semantic router only applies to model calls, not tool calls.
- Parameters:
request (ToolCallRequest) – The tool call request.
handler (Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]]) – The async function to execute the tool.
- Returns:
The tool result from the handler.
- Return type:
ToolMessage | Command
- register_route_handler(route_name, handler=None)[source]#
Register a handler for a specific route.
Can be used as a decorator or called directly.
- Parameters:
route_name (str) – The name of the route to handle.
handler (Callable[[ModelRequest, Dict[str, Any]], Awaitable[ModelResponse | AIMessage]] | None) – Optional handler function. If not provided, returns a decorator.
- Returns:
The handler function, or a decorator if handler not provided.
- Return type:
Callable[[Callable[[ModelRequest, Dict[str, Any]], Awaitable[ModelResponse | AIMessage]]], Callable[[ModelRequest, Dict[str, Any]], Awaitable[ModelResponse | AIMessage]]]
Example
```python
# As decorator
@middleware.register_route_handler("greeting")
async def handle_greeting(request, route_match):
    return {"content": "Hello!"}

# Direct registration
middleware.register_route_handler("greeting", handle_greeting)
```
- class MiddlewareStack(middlewares)[source]#
Bases: AgentMiddleware
A stack of middleware that chains calls through all middlewares.
Inherits from LangChain’s AgentMiddleware, so can be used directly with create_agent(middleware=[stack]) or as a single middleware entry.
Middlewares are applied in order: the first middleware wraps the second, which wraps the third, and so on. As a result, the first middleware's before-processing runs first and its after-processing runs last.
Example
```python
from langchain.agents import create_agent
from langgraph.middleware.redis import (
    MiddlewareStack,
    SemanticCacheMiddleware,
    ToolResultCacheMiddleware,
)

stack = MiddlewareStack([
    SemanticCacheMiddleware(cache_config),
    ToolResultCacheMiddleware(tool_config),
])

# Use with create_agent
agent = create_agent(
    model="gpt-4o",
    tools=[...],
    middleware=[stack],  # Pass stack as middleware
)
```
Initialize the middleware stack.
- Parameters:
middlewares (Sequence[AsyncRedisMiddleware]) – List of middleware to chain together.
- async awrap_model_call(request, handler)[source]#
Wrap a model call through all middleware.
This method is part of the LangChain AgentMiddleware protocol.
- Parameters:
request (ModelRequest) – The model request.
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The final handler to call the model.
- Returns:
The model response.
- Return type:
ModelResponse | AIMessage
- async awrap_tool_call(request, handler)[source]#
Wrap a tool call through all middleware.
This method is part of the LangChain AgentMiddleware protocol.
- Parameters:
request (ToolCallRequest) – The tool call request.
handler (Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]]) – The final handler to execute the tool.
- Returns:
The tool result.
- Return type:
ToolMessage | Command
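The wrapping order described above can be illustrated without Redis at all. This is a plain-Python sketch (the `Mw` class, `chained`, and `final_handler` are hypothetical stand-ins, not library code) showing that the first middleware's before-processing runs first and its after-processing runs last:

```python
import asyncio

class Mw:
    """Hypothetical middleware that logs before/after around the next handler."""
    def __init__(self, name, log):
        self.name, self.log = name, log

    async def awrap_model_call(self, request, handler):
        self.log.append(f"{self.name}:before")
        response = await handler(request)
        self.log.append(f"{self.name}:after")
        return response

async def main():
    log = []
    first, second = Mw("first", log), Mw("second", log)

    async def final_handler(request):
        log.append("model")
        return "response"

    # Chain manually, the way a stack would: first wraps second wraps the model.
    async def chained(request):
        return await second.awrap_model_call(request, final_handler)

    await first.awrap_model_call("req", chained)
    return log

print(asyncio.run(main()))
# ['first:before', 'second:before', 'model', 'second:after', 'first:after']
```

The onion ordering means a semantic cache placed first can short-circuit the entire chain on a hit, so it is usually the outermost entry.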
Base Classes#
- class AsyncRedisMiddleware(config)[source]#
Bases: AgentMiddleware, Generic[AsyncRedisClientType]
Abstract base class for async Redis middleware.
This class provides common functionality for all async Redis-based middleware:
- Async Redis client lifecycle management
- Lazy initialization with double-checked locking
- Graceful degradation on Redis errors
- Async context manager support
- Default pass-through implementations for model/tool wrapping
Subclasses must implement:
- _setup_async(): Called once during initialization to set up resources
Example
```python
class MyAsyncMiddleware(AsyncRedisMiddleware):
    async def _setup_async(self) -> None:
        # Initialize resources
        self._cache = SemanticCache(redis_client=self._redis)

config = MiddlewareConfig(redis_url="redis://localhost:6379")
async with MyAsyncMiddleware(config) as middleware:
    result = await middleware.awrap_model_call(request, handler)
```
Initialize the async middleware.
- Parameters:
config (MiddlewareConfig) – Middleware configuration with Redis connection details.
- Raises:
ValueError – If neither redis_url nor redis_client is provided.
- async awrap_model_call(request, handler)[source]#
Wrap a model call with middleware logic.
This method is part of the LangChain AgentMiddleware protocol. Default implementation passes through to the handler. Subclasses can override to add caching, logging, etc.
- Parameters:
request (ModelRequest) – The model request (typically contains messages).
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The async function to call the model.
- Returns:
The model response (ModelResponse or AIMessage).
- Return type:
ModelResponse | AIMessage
- async awrap_tool_call(request, handler)[source]#
Wrap a tool call with middleware logic.
This method is part of the LangChain AgentMiddleware protocol. Default implementation passes through to the handler. Subclasses can override to add caching, logging, etc.
- Parameters:
request (ToolCallRequest) – The tool call request.
handler (Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]]) – The async function to execute the tool.
- Returns:
The tool result message or command.
- Return type:
ToolMessage | Command
Configuration Types#
- class SemanticCacheConfig(redis_url=None, redis_client=None, connection_args=None, graceful_degradation=True, name='llmcache', distance_threshold=0.1, ttl_seconds=None, vectorizer=None, cache_final_only=True, deterministic_tools=None)[source]#
Bases: MiddlewareConfig
Configuration for SemanticCacheMiddleware.
Uses redisvl.extensions.llmcache.SemanticCache for semantic similarity caching.
- Parameters:
redis_url (str | None)
redis_client (redis.client.Redis | redis.asyncio.client.Redis | None)
connection_args (Dict[str, Any] | None)
graceful_degradation (bool)
name (str)
distance_threshold (float)
ttl_seconds (int | None)
vectorizer (Any | None)
cache_final_only (bool)
deterministic_tools (List[str] | None)
- name#
Index name for the semantic cache.
- Type:
str
- distance_threshold#
Maximum distance for cache hits (lower = stricter).
- Type:
float
- ttl_seconds#
Time-to-live for cache entries in seconds.
- Type:
int | None
- vectorizer#
Optional vectorizer for embeddings. If not provided, uses default from redisvl.
- Type:
Any | None
- cache_final_only#
If True, only cache responses without tool_calls.
- Type:
bool
- deterministic_tools#
List of tool names whose results are deterministic. When a request contains tool results, cache lookup is only performed if ALL tool results are from tools in this list. If None, cache is always skipped when tool results are present (safest default).
- Type:
List[str] | None
- cache_final_only: bool = True#
- deterministic_tools: List[str] | None = None#
- distance_threshold: float = 0.1#
- name: str = 'llmcache'#
- ttl_seconds: int | None = None#
- vectorizer: Any | None = None#
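As a concrete illustration of cache_final_only and deterministic_tools working together, a configuration for an agent whose only tool is a pure calculator might look like the following sketch (field values are illustrative, not recommendations):

```python
from langgraph.middleware.redis import SemanticCacheConfig

# Responses that follow tool results are only served from cache when every
# tool involved appears in deterministic_tools; "calculate" is assumed pure.
config = SemanticCacheConfig(
    redis_url="redis://localhost:6379",
    distance_threshold=0.1,            # lower = stricter semantic matching
    ttl_seconds=3600,
    cache_final_only=True,             # skip responses that still request tool calls
    deterministic_tools=["calculate"],
)
```

Leaving deterministic_tools as None keeps the safest behavior: any request containing tool results bypasses the cache entirely.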
- class ToolCacheConfig(redis_url=None, redis_client=None, connection_args=None, graceful_degradation=True, name='toolcache', distance_threshold=0.1, ttl_seconds=None, vectorizer=None, cacheable_tools=None, excluded_tools=<factory>, volatile_arg_names=None, ignored_arg_names=None, side_effect_prefixes=None)[source]#
Bases: MiddlewareConfig
Configuration for ToolResultCacheMiddleware.
Uses exact-match Redis GET/SET for deterministic tool result caching. The cache key is {name}:{tool_name}:{sorted_json_args}.
- Parameters:
redis_url (str | None)
redis_client (redis.client.Redis | redis.asyncio.client.Redis | None)
connection_args (Dict[str, Any] | None)
graceful_degradation (bool)
name (str)
distance_threshold (float)
ttl_seconds (int | None)
vectorizer (Any | None)
cacheable_tools (List[str] | None)
excluded_tools (List[str])
volatile_arg_names (AbstractSet[str] | None)
ignored_arg_names (AbstractSet[str] | None)
side_effect_prefixes (Tuple[str, ...] | None)
- name#
Key prefix for the tool cache.
- Type:
str
- distance_threshold#
Deprecated – ignored. Kept for backward compatibility. Tool cache uses exact-match, not vector similarity.
- Type:
float
- ttl_seconds#
Time-to-live for cache entries in seconds.
- Type:
int | None
- vectorizer#
Deprecated – ignored. Kept for backward compatibility. Tool cache does not use vector embeddings.
- Type:
Any | None
- cacheable_tools#
List of tool names to cache. If None, all tools except excluded_tools are cached.
- Type:
List[str] | None
- excluded_tools#
List of tool names to never cache.
- Type:
List[str]
- volatile_arg_names#
Set of argument names whose presence prevents caching (e.g. {"timestamp", "now", "date"}). Checked recursively at any nesting depth. None disables the check.
- Type:
AbstractSet[str] | None
- ignored_arg_names#
Set of argument names to strip from the cache key before serialization (e.g. {"request_id", "trace_id"}). Stripped at the top level only. None disables stripping.
- Type:
AbstractSet[str] | None
- side_effect_prefixes#
Tuple of tool-name prefixes that indicate side-effecting tools which should never be cached (e.g. ("send_", "delete_", "create_")). None disables the check.
- Type:
Tuple[str, ...] | None
- cacheable_tools: List[str] | None = None#
- distance_threshold: float = 0.1#
- excluded_tools: List[str]#
- ignored_arg_names: AbstractSet[str] | None = None#
- name: str = 'toolcache'#
- side_effect_prefixes: Tuple[str, ...] | None = None#
- ttl_seconds: int | None = None#
- vectorizer: Any | None = None#
- volatile_arg_names: AbstractSet[str] | None = None#
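The exact-match key scheme and top-level argument stripping described above can be sketched in plain Python. This is a simplified illustration, not the library's internal code; build_cache_key is a hypothetical helper:

```python
import json

def build_cache_key(name, tool_name, args, ignored_arg_names=frozenset()):
    """Sketch of an exact-match key: prefix, tool name, canonical JSON args.

    Top-level args named in ignored_arg_names are stripped before
    serialization, so e.g. a per-request trace id does not defeat cache hits.
    """
    filtered = {k: v for k, v in args.items() if k not in ignored_arg_names}
    # sort_keys makes serialization independent of argument order.
    return f"{name}:{tool_name}:{json.dumps(filtered, sort_keys=True)}"

k1 = build_cache_key("toolcache", "search",
                     {"query": "redis", "request_id": "abc"},
                     ignored_arg_names={"request_id"})
k2 = build_cache_key("toolcache", "search",
                     {"request_id": "xyz", "query": "redis"},
                     ignored_arg_names={"request_id"})
assert k1 == k2  # same logical call -> same key, despite differing request_id
```

Sorting keys plus stripping ignored arguments is what makes two logically identical calls collide on the same Redis key.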
- class ConversationMemoryConfig(redis_url=None, redis_client=None, connection_args=None, graceful_degradation=True, name='conversation_memory', session_tag=None, top_k=5, distance_threshold=0.7, vectorizer=None, ttl_seconds=None)[source]#
Bases: MiddlewareConfig
Configuration for ConversationMemoryMiddleware.
Uses redisvl.extensions.session_manager.SemanticSessionManager for semantic message history.
- Parameters:
redis_url (str | None)
redis_client (redis.client.Redis | redis.asyncio.client.Redis | None)
connection_args (Dict[str, Any] | None)
graceful_degradation (bool)
name (str)
session_tag (str | None)
top_k (int)
distance_threshold (float)
vectorizer (Any | None)
ttl_seconds (int | None)
- name#
Index name for message history.
- Type:
str
- session_tag#
Tag to identify the conversation session.
- Type:
str | None
- top_k#
Number of relevant messages to retrieve.
- Type:
int
- distance_threshold#
Maximum cosine distance for relevant messages. Higher values (e.g. 0.9) are more permissive, lower values (e.g. 0.1) require near-exact matches. Default 0.7 works well for typical conversations.
- Type:
float
- vectorizer#
Optional vectorizer for embeddings.
- Type:
Any | None
- ttl_seconds#
Time-to-live for messages in seconds.
- Type:
int | None
- distance_threshold: float = 0.7#
- name: str = 'conversation_memory'#
- session_tag: str | None = None#
- top_k: int = 5#
- ttl_seconds: int | None = None#
- vectorizer: Any | None = None#
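To build intuition for distance_threshold, here is a small self-contained cosine-distance example. It is pure Python, independent of redisvl, and the vectors are made up for illustration:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0.0 = same direction, 1.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

query = [1.0, 0.0]
close = [0.9, 0.1]  # nearly the same direction as the query
far = [0.1, 0.9]    # mostly orthogonal to the query

threshold = 0.7  # the config default
assert cosine_distance(query, close) <= threshold  # would be retrieved
assert cosine_distance(query, far) > threshold     # would be filtered out
```

Real embedding vectors have hundreds of dimensions, but the thresholding logic is the same: messages whose distance to the current query exceeds the threshold are simply not injected.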
- class SemanticRouterConfig(redis_url=None, redis_client=None, connection_args=None, graceful_degradation=True, name='semantic_router', routes=<factory>, vectorizer=None, max_k=3, aggregation_method='avg')[source]#
Bases: MiddlewareConfig
Configuration for SemanticRouterMiddleware.
Uses redisvl.extensions.router.SemanticRouter for intent-based routing.
- Parameters:
redis_url (str | None)
redis_client (redis.client.Redis | redis.asyncio.client.Redis | None)
connection_args (Dict[str, Any] | None)
graceful_degradation (bool)
name (str)
routes (List[Dict[str, Any]])
vectorizer (Any | None)
max_k (int)
aggregation_method (str)
- name#
Index name for the router.
- Type:
str
- routes#
List of route configurations. Each route should have:
- name: Route identifier
- references: List of example phrases for this route
- distance_threshold: Optional distance threshold for this route
- Type:
List[Dict[str, Any]]
- vectorizer#
Optional vectorizer for embeddings.
- Type:
Any | None
- max_k#
Maximum number of routes to consider.
- Type:
int
- aggregation_method#
Method to aggregate route scores.
- Type:
str
- aggregation_method: str = 'avg'#
- max_k: int = 3#
- name: str = 'semantic_router'#
- routes: List[Dict[str, Any]]#
- vectorizer: Any | None = None#
- class MiddlewareConfig(redis_url=None, redis_client=None, connection_args=None, graceful_degradation=True)[source]#
Bases: object
Base configuration for all Redis middleware.
- Parameters:
redis_url (str | None)
redis_client (redis.client.Redis | redis.asyncio.client.Redis | None)
connection_args (Dict[str, Any] | None)
graceful_degradation (bool)
- redis_url#
Redis connection URL. If not provided, redis_client must be set.
- Type:
str | None
- redis_client#
Existing Redis client instance to use.
- Type:
redis.client.Redis | redis.asyncio.client.Redis | None
- connection_args#
Additional arguments for Redis connection.
- Type:
Dict[str, Any] | None
- graceful_degradation#
If True, middleware passes through on Redis errors.
- Type:
bool
- connection_args: Dict[str, Any] | None = None#
- graceful_degradation: bool = True#
- redis_client: redis.client.Redis | redis.asyncio.client.Redis | None = None#
- redis_url: str | None = None#
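The graceful_degradation behavior amounts to a try/except around the cache path that falls back to the real handler when Redis is unavailable. A minimal self-contained sketch (FlakyCache and wrapped_call are hypothetical, not the library's code):

```python
import asyncio

class FlakyCache:
    """Stand-in cache whose lookups always fail, simulating a Redis outage."""
    async def lookup(self, request):
        raise ConnectionError("redis unreachable")

async def wrapped_call(request, handler, cache, graceful_degradation=True):
    try:
        hit = await cache.lookup(request)
        if hit is not None:
            return hit
    except Exception:
        if not graceful_degradation:
            raise  # surface the cache failure to the caller
        # graceful: swallow the error and fall through to the real handler
    return await handler(request)

async def handler(request):
    return f"fresh:{request}"

result = asyncio.run(wrapped_call("q", handler, FlakyCache()))
print(result)  # fresh:q
```

With graceful_degradation=True the agent keeps working (just without caching) during a Redis outage; setting it to False is useful in tests, where a silent cache failure should be a hard error.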