Middleware#
Redis-backed middleware for LangGraph agent workflows, including semantic caching, tool result caching, conversation memory, and semantic routing.
Middleware Implementations#
- class SemanticCacheMiddleware(config)[source]#
Bases: AsyncRedisMiddleware
Middleware that caches LLM responses based on semantic similarity.
Uses redisvl.extensions.llmcache.SemanticCache to store and retrieve cached responses. When a request is semantically similar to a previous request (within the distance threshold), the cached response is returned without calling the LLM.
By default, only “final” responses (those without tool_calls) are cached. This prevents caching intermediate responses that require tool execution.
Example
```python
from langgraph.middleware.redis import (
    SemanticCacheMiddleware,
    SemanticCacheConfig,
)

config = SemanticCacheConfig(
    redis_url="redis://localhost:6379",
    distance_threshold=0.1,
    ttl_seconds=3600,
)
middleware = SemanticCacheMiddleware(config)

async def call_llm(request):
    # Your LLM call here
    return response

# Use middleware
result = await middleware.awrap_model_call(request, call_llm)
```
Initialize the semantic cache middleware.
- Parameters:
config (SemanticCacheConfig) – Configuration for the semantic cache.
- async awrap_model_call(request, handler)[source]#
Wrap a model call with semantic caching.
Checks the cache for a semantically similar request. If found, returns the cached response. Otherwise, calls the handler and optionally caches the result.
- Parameters:
request (ModelRequest) – The model request containing messages.
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The async function to call the model.
- Returns:
The model response (from cache or handler).
- Raises:
Exception – If graceful_degradation is False and cache operations fail.
- Return type:
ModelResponse | AIMessage
- class ToolResultCacheMiddleware(config)[source]#
Bases: AsyncRedisMiddleware
Middleware that caches tool call results using exact-match lookup.
Uses direct Redis GET/SET for deterministic tool result caching. When a tool is called with the same arguments as a previous call, the cached result is returned without executing the tool.
Tool caching is especially useful for expensive operations like:
- API calls to external services
- Database queries
- Web searches
- Complex calculations
Example
```python
from langgraph.middleware.redis import (
    ToolResultCacheMiddleware,
    ToolCacheConfig,
)

config = ToolCacheConfig(
    redis_url="redis://localhost:6379",
    cacheable_tools=["search", "calculate"],
    excluded_tools=["random_number"],
    ttl_seconds=3600,
)
middleware = ToolResultCacheMiddleware(config)

async def execute_tool(request):
    # Your tool execution here
    return result

# Use middleware
result = await middleware.awrap_tool_call(request, execute_tool)
```
Initialize the tool cache middleware.
- Parameters:
config (ToolCacheConfig) – Configuration for the tool cache.
- async awrap_model_call(request, handler)[source]#
Pass through model calls without caching.
This method is part of the LangChain AgentMiddleware protocol. Tool cache middleware only caches tool calls, not model calls.
- Parameters:
request (ModelRequest) – The model request.
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The async function to call the model.
- Returns:
The model response from the handler.
- Return type:
ModelResponse | AIMessage
- async awrap_tool_call(request, handler)[source]#
Wrap a tool call with exact-match caching.
This method is part of the LangChain AgentMiddleware protocol. Checks the cache for an exact tool+args match. If found, returns the cached result. Otherwise, calls the handler and caches the result.
- Parameters:
request (ToolCallRequest) – The tool call request.
handler (Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]]) – The async function to execute the tool.
- Returns:
The tool result (from cache or handler).
- Raises:
Exception – If graceful_degradation is False and cache operations fail.
- Return type:
ToolMessage | Command
- class ConversationMemoryMiddleware(config)[source]#
Bases: AsyncRedisMiddleware
Middleware that injects relevant past messages into context.
Uses redisvl.extensions.message_history.SemanticMessageHistory to store conversation history and retrieve semantically relevant past messages. This enables long-term memory for conversational agents by:
- Storing all messages in Redis with vector embeddings
- Retrieving relevant past context based on the current query
- Injecting context to help the model maintain coherent conversations
Example
```python
from langgraph.middleware.redis import (
    ConversationMemoryMiddleware,
    ConversationMemoryConfig,
)

config = ConversationMemoryConfig(
    redis_url="redis://localhost:6379",
    session_tag="user_123",
    top_k=5,
    distance_threshold=0.7,
)
middleware = ConversationMemoryMiddleware(config)

# Use with your model calls
result = await middleware.awrap_model_call(request, call_model)
```
Initialize the conversation memory middleware.
- Parameters:
config (ConversationMemoryConfig) – Configuration for the conversation memory.
- async awrap_model_call(request, handler)[source]#
Wrap a model call with conversation memory.
This method is part of the LangChain AgentMiddleware protocol. Retrieves relevant past messages based on the current query, injects them into the context, and stores the new exchange.
- Parameters:
request (ModelRequest) – The model request containing messages.
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The async function to call the model.
- Returns:
The model response.
- Raises:
Exception – If graceful_degradation is False and history operations fail.
- Return type:
ModelResponse | AIMessage
- async awrap_tool_call(request, handler)[source]#
Pass through tool calls without memory operations.
This method is part of the LangChain AgentMiddleware protocol. Conversation memory only applies to model calls, not tool calls.
- Parameters:
request (ToolCallRequest) – The tool call request.
handler (Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]]) – The async function to execute the tool.
- Returns:
The tool result from the handler.
- Return type:
ToolMessage | Command
- class SemanticRouterMiddleware(config)[source]#
Bases: AsyncRedisMiddleware
Middleware that routes requests based on semantic similarity.
Uses redisvl.extensions.router.SemanticRouter to classify user intents and route requests to appropriate handlers. This is useful for:
- Directing queries to specialized agents
- Triggering specific workflows based on intent
- Adding routing metadata for downstream processing
Example
```python
from langgraph.middleware.redis import (
    SemanticRouterMiddleware,
    SemanticRouterConfig,
)

routes = [
    {"name": "greeting", "references": ["hello", "hi", "hey"]},
    {"name": "support", "references": ["help", "issue", "problem"]},
]

config = SemanticRouterConfig(
    redis_url="redis://localhost:6379",
    routes=routes,
)
middleware = SemanticRouterMiddleware(config)

# Register custom handler for greeting route
@middleware.register_route_handler("greeting")
async def handle_greeting(request, route_match):
    return {"content": "Hello! How can I help you today?"}
```
Initialize the semantic router middleware.
- Parameters:
config (SemanticRouterConfig) – Configuration for the semantic router.
- async awrap_model_call(request, handler)[source]#
Wrap a model call with semantic routing.
This method is part of the LangChain AgentMiddleware protocol. Determines the route based on the user's message and either:
- Calls a registered route handler if one exists
- Adds routing info to the request and calls the default handler
- Parameters:
request (ModelRequest) – The model request containing messages.
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The async function to call the model.
- Returns:
The model response.
- Raises:
Exception – If graceful_degradation is False and routing fails.
- Return type:
ModelResponse | AIMessage
- async awrap_tool_call(request, handler)[source]#
Pass through tool calls without routing.
This method is part of the LangChain AgentMiddleware protocol. Semantic router only applies to model calls, not tool calls.
- Parameters:
request (ToolCallRequest) – The tool call request.
handler (Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]]) – The async function to execute the tool.
- Returns:
The tool result from the handler.
- Return type:
ToolMessage | Command
- register_route_handler(route_name, handler=None)[source]#
Register a handler for a specific route.
Can be used as a decorator or called directly.
- Parameters:
route_name (str) – The name of the route to handle.
handler (Callable[[ModelRequest, Dict[str, Any]], Awaitable[ModelResponse | AIMessage]] | None) – Optional handler function. If not provided, returns a decorator.
- Returns:
The handler function, or a decorator if handler not provided.
- Return type:
Callable[[Callable[[ModelRequest, Dict[str, Any]], Awaitable[ModelResponse | AIMessage]]], Callable[[ModelRequest, Dict[str, Any]], Awaitable[ModelResponse | AIMessage]]]
Example
```python
# As decorator
@middleware.register_route_handler("greeting")
async def handle_greeting(request, route_match):
    return {"content": "Hello!"}

# Direct registration
middleware.register_route_handler("greeting", handle_greeting)
```
- class MiddlewareStack(middlewares)[source]#
Bases: AgentMiddleware
A stack of middleware that chains calls through all middlewares.
Inherits from LangChain’s AgentMiddleware, so can be used directly with create_agent(middleware=[stack]) or as a single middleware entry.
Middlewares are applied in order: the first middleware wraps the second, which wraps the third, and so on. As a result, the first middleware's before-processing runs first and its after-processing runs last.
Example
```python
from langchain.agents import create_agent
from langgraph.middleware.redis import (
    MiddlewareStack,
    SemanticCacheMiddleware,
    ToolResultCacheMiddleware,
)

stack = MiddlewareStack([
    SemanticCacheMiddleware(cache_config),
    ToolResultCacheMiddleware(tool_config),
])

# Use with create_agent
agent = create_agent(
    model="gpt-4o",
    tools=[...],
    middleware=[stack],  # Pass stack as middleware
)
```
Initialize the middleware stack.
- Parameters:
middlewares (Sequence[AsyncRedisMiddleware]) – List of middleware to chain together.
- async awrap_model_call(request, handler)[source]#
Wrap a model call through all middleware.
This method is part of the LangChain AgentMiddleware protocol.
- Parameters:
request (ModelRequest) – The model request.
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The final handler to call the model.
- Returns:
The model response.
- Return type:
ModelResponse | AIMessage
- async awrap_tool_call(request, handler)[source]#
Wrap a tool call through all middleware.
This method is part of the LangChain AgentMiddleware protocol.
- Parameters:
request (ToolCallRequest) – The tool call request.
handler (Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]]) – The final handler to execute the tool.
- Returns:
The tool result.
- Return type:
ToolMessage | Command
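The wrapping order described above can be illustrated without Redis at all. This is a plain-Python sketch (the `Mw` class, `chained`, and `final_handler` are hypothetical stand-ins, not library code) showing that the first middleware's before-processing runs first and its after-processing runs last:

```python
import asyncio

class Mw:
    """Hypothetical middleware that logs before/after around the next handler."""
    def __init__(self, name, log):
        self.name, self.log = name, log

    async def awrap_model_call(self, request, handler):
        self.log.append(f"{self.name}:before")
        response = await handler(request)
        self.log.append(f"{self.name}:after")
        return response

async def main():
    log = []
    first, second = Mw("first", log), Mw("second", log)

    async def final_handler(request):
        log.append("model")
        return "response"

    # Chain manually, the way a stack would: first wraps second wraps the model.
    async def chained(request):
        return await second.awrap_model_call(request, final_handler)

    await first.awrap_model_call("req", chained)
    return log

print(asyncio.run(main()))
# ['first:before', 'second:before', 'model', 'second:after', 'first:after']
```

The onion ordering means a semantic cache placed first can short-circuit the entire chain on a hit, so it is usually the outermost entry.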
Base Classes#
- class AsyncRedisMiddleware(config)[source]#
Bases: AgentMiddleware, Generic[AsyncRedisClientType]
Abstract base class for async Redis middleware.
This class provides common functionality for all async Redis-based middleware:
- Async Redis client lifecycle management
- Lazy initialization with double-checked locking
- Graceful degradation on Redis errors
- Async context manager support
- Default pass-through implementations for model/tool wrapping
Subclasses must implement:
- _setup_async(): Called once during initialization to set up resources
Example
```python
class MyAsyncMiddleware(AsyncRedisMiddleware):
    async def _setup_async(self) -> None:
        # Initialize resources
        self._cache = SemanticCache(redis_client=self._redis)

config = MiddlewareConfig(redis_url="redis://localhost:6379")
async with MyAsyncMiddleware(config) as middleware:
    result = await middleware.awrap_model_call(request, handler)
```
Initialize the async middleware.
- Parameters:
config (MiddlewareConfig) – Middleware configuration with Redis connection details.
- Raises:
ValueError – If neither redis_url nor redis_client is provided.
- async awrap_model_call(request, handler)[source]#
Wrap a model call with middleware logic.
This method is part of the LangChain AgentMiddleware protocol. Default implementation passes through to the handler. Subclasses can override to add caching, logging, etc.
- Parameters:
request (ModelRequest) – The model request (typically contains messages).
handler (Callable[[ModelRequest], Awaitable[ModelResponse]]) – The async function to call the model.
- Returns:
The model response (ModelResponse or AIMessage).
- Return type:
ModelResponse | AIMessage
- async awrap_tool_call(request, handler)[source]#
Wrap a tool call with middleware logic.
This method is part of the LangChain AgentMiddleware protocol. Default implementation passes through to the handler. Subclasses can override to add caching, logging, etc.
- Parameters:
request (ToolCallRequest) – The tool call request.
handler (Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]]) – The async function to execute the tool.
- Returns:
The tool result message or command.
- Return type:
ToolMessage | Command
Configuration Types#
- class SemanticCacheConfig(redis_url=None, redis_client=None, connection_args=None, graceful_degradation=True, name='llmcache', distance_threshold=0.1, ttl_seconds=None, vectorizer=None, cache_final_only=True, deterministic_tools=None)[source]#
Bases: MiddlewareConfig
Configuration for SemanticCacheMiddleware.
Uses redisvl.extensions.llmcache.SemanticCache for semantic similarity caching.
- Parameters:
redis_url (str | None)
redis_client (redis.client.Redis | redis.asyncio.client.Redis | None)
connection_args (Dict[str, Any] | None)
graceful_degradation (bool)
name (str)
distance_threshold (float)
ttl_seconds (int | None)
vectorizer (Any | None)
cache_final_only (bool)
deterministic_tools (List[str] | None)
- name#
Index name for the semantic cache.
- Type:
str
- distance_threshold#
Maximum distance for cache hits (lower = stricter).
- Type:
float
- ttl_seconds#
Time-to-live for cache entries in seconds.
- Type:
int | None
- vectorizer#
Optional vectorizer for embeddings. If not provided, uses default from redisvl.
- Type:
Any | None
- cache_final_only#
If True, only cache responses without tool_calls.
- Type:
bool
- deterministic_tools#
List of tool names whose results are deterministic. When a request contains tool results, cache lookup is only performed if ALL tool results are from tools in this list. If None, cache is always skipped when tool results are present (safest default).
- Type:
List[str] | None
- cache_final_only: bool = True#
- deterministic_tools: List[str] | None = None#
- distance_threshold: float = 0.1#
- name: str = 'llmcache'#
- ttl_seconds: int | None = None#
- vectorizer: Any | None = None#
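As a concrete illustration of cache_final_only and deterministic_tools working together, a configuration for an agent whose only tool is a pure calculator might look like the following sketch (field values are illustrative, not recommendations):

```python
from langgraph.middleware.redis import SemanticCacheConfig

# Responses that follow tool results are only served from cache when every
# tool involved appears in deterministic_tools; "calculate" is assumed pure.
config = SemanticCacheConfig(
    redis_url="redis://localhost:6379",
    distance_threshold=0.1,            # lower = stricter semantic matching
    ttl_seconds=3600,
    cache_final_only=True,             # skip responses that still request tool calls
    deterministic_tools=["calculate"],
)
```

Leaving deterministic_tools as None keeps the safest behavior: any request containing tool results bypasses the cache entirely.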
- class ToolCacheConfig(redis_url=None, redis_client=None, connection_args=None, graceful_degradation=True, name='toolcache', distance_threshold=0.1, ttl_seconds=None, vectorizer=None, cacheable_tools=None, excluded_tools=<factory>, volatile_arg_names=None, ignored_arg_names=None, side_effect_prefixes=None)[source]#
Bases: MiddlewareConfig
Configuration for ToolResultCacheMiddleware.
Uses exact-match Redis GET/SET for deterministic tool result caching. The cache key is {name}:{tool_name}:{sorted_json_args}.
- Parameters:
redis_url (str | None)
redis_client (redis.client.Redis | redis.asyncio.client.Redis | None)
connection_args (Dict[str, Any] | None)
graceful_degradation (bool)
name (str)
distance_threshold (float)
ttl_seconds (int | None)
vectorizer (Any | None)
cacheable_tools (List[str] | None)
excluded_tools (List[str])
volatile_arg_names (AbstractSet[str] | None)
ignored_arg_names (AbstractSet[str] | None)
side_effect_prefixes (Tuple[str, ...] | None)
- name#
Key prefix for the tool cache.
- Type:
str
- distance_threshold#
Deprecated – ignored. Kept for backward compatibility. Tool cache uses exact-match, not vector similarity.
- Type:
float
- ttl_seconds#
Time-to-live for cache entries in seconds.
- Type:
int | None
- vectorizer#
Deprecated – ignored. Kept for backward compatibility. Tool cache does not use vector embeddings.
- Type:
Any | None
- cacheable_tools#
List of tool names to cache. If None, all tools except excluded_tools are cached.
- Type:
List[str] | None
- excluded_tools#
List of tool names to never cache.
- Type:
List[str]
- volatile_arg_names#
Set of argument names whose presence prevents caching (e.g. {"timestamp", "now", "date"}). Checked recursively at any nesting depth. None disables the check.
- Type:
AbstractSet[str] | None
- ignored_arg_names#
Set of argument names to strip from the cache key before serialization (e.g. {"request_id", "trace_id"}). Stripped at the top level only. None disables stripping.
- Type:
AbstractSet[str] | None
- side_effect_prefixes#
Tuple of tool-name prefixes that indicate side-effecting tools which should never be cached (e.g. ("send_", "delete_", "create_")). None disables the check.
- Type:
Tuple[str, ...] | None
- cacheable_tools: List[str] | None = None#
- distance_threshold: float = 0.1#
- excluded_tools: List[str]#
- ignored_arg_names: AbstractSet[str] | None = None#
- name: str = 'toolcache'#
- side_effect_prefixes: Tuple[str, ...] | None = None#
- ttl_seconds: int | None = None#
- vectorizer: Any | None = None#
- volatile_arg_names: AbstractSet[str] | None = None#
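The exact-match key scheme and top-level argument stripping described above can be sketched in plain Python. This is a simplified illustration, not the library's internal code; build_cache_key is a hypothetical helper:

```python
import json

def build_cache_key(name, tool_name, args, ignored_arg_names=frozenset()):
    """Sketch of an exact-match key: prefix, tool name, canonical JSON args.

    Top-level args named in ignored_arg_names are stripped before
    serialization, so e.g. a per-request trace id does not defeat cache hits.
    """
    filtered = {k: v for k, v in args.items() if k not in ignored_arg_names}
    # sort_keys makes serialization independent of argument order.
    return f"{name}:{tool_name}:{json.dumps(filtered, sort_keys=True)}"

k1 = build_cache_key("toolcache", "search",
                     {"query": "redis", "request_id": "abc"},
                     ignored_arg_names={"request_id"})
k2 = build_cache_key("toolcache", "search",
                     {"request_id": "xyz", "query": "redis"},
                     ignored_arg_names={"request_id"})
assert k1 == k2  # same logical call -> same key, despite differing request_id
```

Sorting keys plus stripping ignored arguments is what makes two logically identical calls collide on the same Redis key.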
- class ConversationMemoryConfig(redis_url=None, redis_client=None, connection_args=None, graceful_degradation=True, name='conversation_memory', session_tag=None, top_k=5, distance_threshold=0.7, vectorizer=None, ttl_seconds=None)[source]#
Bases: MiddlewareConfig
Configuration for ConversationMemoryMiddleware.
Uses redisvl.extensions.session_manager.SemanticSessionManager for semantic message history.
- Parameters:
redis_url (str | None)
redis_client (redis.client.Redis | redis.asyncio.client.Redis | None)
connection_args (Dict[str, Any] | None)
graceful_degradation (bool)
name (str)
session_tag (str | None)
top_k (int)
distance_threshold (float)
vectorizer (Any | None)
ttl_seconds (int | None)
- name#
Index name for message history.
- Type:
str
- session_tag#
Tag to identify the conversation session.
- Type:
str | None
- top_k#
Number of relevant messages to retrieve.
- Type:
int
- distance_threshold#
Maximum cosine distance for relevant messages. Higher values (e.g. 0.9) are more permissive, lower values (e.g. 0.1) require near-exact matches. Default 0.7 works well for typical conversations.
- Type:
float
- vectorizer#
Optional vectorizer for embeddings.
- Type:
Any | None
- ttl_seconds#
Time-to-live for messages in seconds.
- Type:
int | None
- distance_threshold: float = 0.7#
- name: str = 'conversation_memory'#
- session_tag: str | None = None#
- top_k: int = 5#
- ttl_seconds: int | None = None#
- vectorizer: Any | None = None#
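To build intuition for distance_threshold, here is a small self-contained cosine-distance example. It is pure Python, independent of redisvl, and the vectors are made up for illustration:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0.0 = same direction, 1.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

query = [1.0, 0.0]
close = [0.9, 0.1]  # nearly the same direction as the query
far = [0.1, 0.9]    # mostly orthogonal to the query

threshold = 0.7  # the config default
assert cosine_distance(query, close) <= threshold  # would be retrieved
assert cosine_distance(query, far) > threshold     # would be filtered out
```

Real embedding vectors have hundreds of dimensions, but the thresholding logic is the same: messages whose distance to the current query exceeds the threshold are simply not injected.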
- class SemanticRouterConfig(redis_url=None, redis_client=None, connection_args=None, graceful_degradation=True, name='semantic_router', routes=<factory>, vectorizer=None, max_k=3, aggregation_method='avg')[source]#
Bases: MiddlewareConfig
Configuration for SemanticRouterMiddleware.
Uses redisvl.extensions.router.SemanticRouter for intent-based routing.
- Parameters:
redis_url (str | None)
redis_client (redis.client.Redis | redis.asyncio.client.Redis | None)
connection_args (Dict[str, Any] | None)
graceful_degradation (bool)
name (str)
routes (List[Dict[str, Any]])
vectorizer (Any | None)
max_k (int)
aggregation_method (str)
- name#
Index name for the router.
- Type:
str
- routes#
List of route configurations. Each route should have:
- name: Route identifier
- references: List of example phrases for this route
- distance_threshold: Optional distance threshold for this route
- Type:
List[Dict[str, Any]]
- vectorizer#
Optional vectorizer for embeddings.
- Type:
Any | None
- max_k#
Maximum number of routes to consider.
- Type:
int
- aggregation_method#
Method to aggregate route scores.
- Type:
str
- aggregation_method: str = 'avg'#
- max_k: int = 3#
- name: str = 'semantic_router'#
- routes: List[Dict[str, Any]]#
- vectorizer: Any | None = None#
- class MiddlewareConfig(redis_url=None, redis_client=None, connection_args=None, graceful_degradation=True)[source]#
Bases: object
Base configuration for all Redis middleware.
- Parameters:
redis_url (str | None)
redis_client (redis.client.Redis | redis.asyncio.client.Redis | None)
connection_args (Dict[str, Any] | None)
graceful_degradation (bool)
- redis_url#
Redis connection URL. If not provided, redis_client must be set.
- Type:
str | None
- redis_client#
Existing Redis client instance to use.
- Type:
redis.client.Redis | redis.asyncio.client.Redis | None
- connection_args#
Additional arguments for Redis connection.
- Type:
Dict[str, Any] | None
- graceful_degradation#
If True, middleware passes through on Redis errors.
- Type:
bool
- connection_args: Dict[str, Any] | None = None#
- graceful_degradation: bool = True#
- redis_client: redis.client.Redis | redis.asyncio.client.Redis | None = None#
- redis_url: str | None = None#
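The graceful_degradation behavior amounts to a try/except around the cache path that falls back to the real handler when Redis is unavailable. A minimal self-contained sketch (FlakyCache and wrapped_call are hypothetical, not the library's code):

```python
import asyncio

class FlakyCache:
    """Stand-in cache whose lookups always fail, simulating a Redis outage."""
    async def lookup(self, request):
        raise ConnectionError("redis unreachable")

async def wrapped_call(request, handler, cache, graceful_degradation=True):
    try:
        hit = await cache.lookup(request)
        if hit is not None:
            return hit
    except Exception:
        if not graceful_degradation:
            raise  # surface the cache failure to the caller
        # graceful: swallow the error and fall through to the real handler
    return await handler(request)

async def handler(request):
    return f"fresh:{request}"

result = asyncio.run(wrapped_call("q", handler, FlakyCache()))
print(result)  # fresh:q
```

With graceful_degradation=True the agent keeps working (just without caching) during a Redis outage; setting it to False is useful in tests, where a silent cache failure should be a hard error.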