Advanced Vector Search

This guide covers advanced vector search features for production deployments, including distance normalization, performance tuning, and algorithm-specific parameters.

Distance Normalization

Different distance metrics return values in different ranges, which makes raw scores hard to compare:

COSINE: 0 to 2 (0 = identical, 2 = opposite)
L2: 0 to ∞ (0 = identical, larger = more different)
IP (Inner Product): -∞ to ∞ (larger = more similar)

The normalizeDistance parameter converts all scores to a 0-1 similarity range where 1 = most similar and 0 = least similar.

Basic Usage

import { VectorQuery, VectorDistanceMetric } from 'redis-vl';

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    distanceMetric: VectorDistanceMetric.COSINE,
    normalizeDistance: true, // Convert to 0-1 similarity scores
    numResults: 10,
});

const results = await index.search(query);
results.documents.forEach((doc) => {
    console.log(`Similarity: ${doc.score.toFixed(3)}`); // 0.0 to 1.0
});

When to Use Distance Normalization

✅ Use when:

Displaying similarity scores to users (e.g., "95% match" in search results)
Comparing results across different distance metrics
Thresholding or ranking on a consistent 0-1 score range
Building recommendation systems where similarity percentage matters
A/B testing or analytics where scores need to be interpretable

⚠️ Consider not using when:

Using Inner Product (IP) with non-normalized vectors (results may be unreliable)
Debugging or mathematical analysis that needs raw distances
Latency-critical paths (the overhead is minimal, but normalization is still an extra post-processing step)
Feeding algorithms that expect raw distance values
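
As a minimal sketch of the thresholding and "95% match" use cases listed above, the helpers below work on already-normalized scores. The `ScoredDoc` shape and both function names are hypothetical and not part of the library; they only assume each result carries a 0-1 similarity.

```typescript
// Hypothetical result shape: an id plus a normalized 0-1 similarity score.
interface ScoredDoc {
    id: string;
    score: number;
}

// Keep only results at or above a minimum similarity.
function filterBySimilarity(docs: ScoredDoc[], minSimilarity: number): ScoredDoc[] {
    return docs.filter((doc) => doc.score >= minSimilarity);
}

// Format a 0-1 similarity as a label such as "95% match".
// Out-of-range inputs are clamped so the label never shows <0% or >100%.
function toMatchLabel(score: number): string {
    const clamped = Math.min(1, Math.max(0, score));
    return `${Math.round(clamped * 100)}% match`;
}

// Illustrative input only, not real search output.
const sampleDocs: ScoredDoc[] = [
    { id: 'a', score: 0.95 },
    { id: 'b', score: 0.72 },
    { id: 'c', score: 0.4 },
];
const strongMatches = filterBySimilarity(sampleDocs, 0.7);
const matchLabels = strongMatches.map((doc) => `${doc.id}: ${toMatchLabel(doc.score)}`);
console.log(matchLabels.join('\n'));
```

A fixed threshold like 0.7 only has a stable meaning once scores are normalized. The same cutoff applied to raw L2 or IP values would mean something different for each metric.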

Distance Metric Normalization

COSINE Distance:

similarity = 1 - (distance / 2)

L2 Distance:

similarity = 1 / (1 + distance)

Inner Product (IP):

// Only works reliably with normalized vectors
similarity = (distance + 1) / 2
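
The three formulas above can be written as one function. This is a sketch of the arithmetic only, not the library's internal implementation; the `Metric` type and `toSimilarity` name are assumptions made for illustration.

```typescript
type Metric = 'COSINE' | 'L2' | 'IP';

// Convert a raw score into a 0-1 similarity using the formulas from this section.
function toSimilarity(metric: Metric, distance: number): number {
    switch (metric) {
        case 'COSINE':
            // Raw range 0..2 maps linearly onto 1..0.
            return 1 - distance / 2;
        case 'L2':
            // Raw range 0..infinity maps onto 1..0 (never reaching 0).
            return 1 / (1 + distance);
        case 'IP':
            // Assumes unit-length vectors, so the raw value lies in -1..1.
            return (distance + 1) / 2;
        default:
            throw new Error(`Unknown metric: ${metric}`);
    }
}

console.log(toSimilarity('COSINE', 0.5)); // 0.75
console.log(toSimilarity('L2', 1)); // 0.5
console.log(toSimilarity('IP', 1)); // 1
```

Note that the L2 mapping is not linear: equal steps in distance do not produce equal steps in similarity, which matters if you compare thresholds across metrics.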

Example with Different Metrics

COSINE (Recommended)

Best for: Most semantic search use cases, text embeddings

import { VectorQuery, VectorDistanceMetric } from 'redis-vl';

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    distanceMetric: VectorDistanceMetric.COSINE,
    normalizeDistance: true,
    numResults: 5,
});

const results = await index.search(query);
// Scores in [0, 1] range where 1 = most similar

Why use COSINE:

Works well with embeddings from language models
Handles different vector magnitudes
Most common choice for semantic search

L2 (Euclidean)

Best for: Clustering, exact matching, image embeddings

import { VectorQuery, VectorDistanceMetric } from 'redis-vl';

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    distanceMetric: VectorDistanceMetric.L2,
    normalizeDistance: true,
    numResults: 5,
});

const results = await index.search(query);
// Scores in [0, 1] range where 1 = most similar

Why use L2:

Actual geometric distance between vectors
Good for computer vision tasks
Sensitive to vector magnitude

IP (Inner Product)

Best for: Pre-normalized vectors, maximum inner product search (MIPS)

import { VectorQuery, VectorDistanceMetric } from 'redis-vl';

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    distanceMetric: VectorDistanceMetric.IP,
    normalizeDistance: true, // ⚠️ Only reliable with normalized vectors
    numResults: 5,
});

const results = await index.search(query);
// Scores in [0, 1] range where 1 = most similar

Why use IP:

Faster than COSINE for normalized vectors
Use when all vectors are unit length
⚠️ Warning: Distance normalization with IP assumes unit-length vectors; with other vectors the 0-1 score is unreliable
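
Because IP normalization depends on unit-length vectors, it can help to normalize embeddings before indexing and querying. The helpers below are a plain-TypeScript sketch with hypothetical names; they are not library functions, and some embedding models already return unit vectors.

```typescript
// Euclidean (L2) length of a vector.
function l2Norm(v: number[]): number {
    return Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
}

// Scale a vector to unit length; a zero vector has no direction to keep.
function toUnitVector(v: number[]): number[] {
    const norm = l2Norm(v);
    if (norm === 0) {
        throw new Error('Cannot normalize a zero vector');
    }
    return v.map((x) => x / norm);
}

// Check whether a vector is already (approximately) unit length.
function isUnitLength(v: number[], tolerance: number = 1e-6): boolean {
    return Math.abs(l2Norm(v) - 1) <= tolerance;
}

// Inner product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
    if (a.length !== b.length) {
        throw new Error('Vectors must have the same length');
    }
    return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

const unitA = toUnitVector([3, 4]);
const unitB = toUnitVector([-4, 3]);
console.log(isUnitLength([3, 4]), isUnitLength(unitA));
console.log(dot(unitA, unitB)); // Inner product of unit vectors stays within -1..1
```

For unit vectors the inner product is bounded by -1 and 1, which is what keeps `(distance + 1) / 2` inside the 0-1 range.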

HNSW Algorithm Tuning

HNSW (Hierarchical Navigable Small World) is a widely used approximate nearest neighbor index algorithm. The efRuntime parameter controls the accuracy vs speed tradeoff during search.

efRuntime Parameter

Controls the size of the dynamic candidate list during search. Higher values = better recall but slower queries.

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    efRuntime: 200, // Query-time override; if omitted, the index's EF_RUNTIME setting applies
    numResults: 10,
});

Guidelines:

Small indexes (under 10,000 vectors): efRuntime = 100-200
Medium indexes (10,000 to 1M vectors): efRuntime = 200-500
Large indexes (over 1M vectors): efRuntime = 500-1000
Rule of thumb: Start with 2-5x your numResults
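
The guidelines above can be encoded as a small lookup to pick a starting point. This is only a sketch of the listed ranges; the function and type names are hypothetical, and the two guidelines can disagree for small numResults, so treat either result as a first guess to validate by measurement.

```typescript
interface EfRange {
    min: number;
    max: number;
}

// Starting range by index size, following the guideline list above.
function efRuntimeRangeForIndexSize(vectorCount: number): EfRange {
    if (vectorCount < 10000) {
        return { min: 100, max: 200 };
    }
    if (vectorCount <= 1000000) {
        return { min: 200, max: 500 };
    }
    return { min: 500, max: 1000 };
}

// Starting range from the "2-5x numResults" rule of thumb.
function efRuntimeRangeForNumResults(numResults: number): EfRange {
    return { min: 2 * numResults, max: 5 * numResults };
}

console.log(efRuntimeRangeForIndexSize(50000)); // { min: 200, max: 500 }
console.log(efRuntimeRangeForNumResults(10)); // { min: 20, max: 50 }
```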

Combined HNSW Tuning

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    efRuntime: 300,
    numResults: 10,
});

Hybrid Search Policy

When combining vector search with metadata filters, the query execution policy significantly impacts performance.

hybridPolicy Parameter

BATCHES Policy

Best for: Non-selective filters (filter keeps MOST documents)

Iteratively searches the vector index in batches, filtering each batch until K matches are found.

import { VectorQuery } from 'redis-vl';

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    filter: '@price:[0 1000000]', // Keeps 95% of data
    hybridPolicy: 'BATCHES',
    numResults: 10,
});

When to use (non-selective filters):

Filter keeps the majority of documents
Example: @price:[0 1000000] when most items are under $1M (keeps 95%)
Example: @inStock:{true} when 90% of items are in stock
Result: Large candidate set (e.g., 90,000 out of 100,000 docs)

How it works: Vector search in batches → filter each batch → stop when K results found

Performance: Fast when most nearest neighbors pass the filter, because the first batches usually yield K results. Slow when the filter is selective, because many batches must be fetched and discarded.

ADHOC_BF Policy

Best for: Selective filters (filter keeps FEW documents)

Computes vector scores for all documents passing the filter (brute force on filtered set).

import { VectorQuery } from 'redis-vl';

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    filter: '@category:{electronics}', // Keeps only 10% of data
    hybridPolicy: 'ADHOC_BF',
    numResults: 10,
});

When to use (selective filters):

Filter keeps a minority of documents
Example: @category:{niche_category} keeps 5% of docs
Example: @isPremium:{true} when only 10% are premium
Result: Small candidate set (e.g., 10,000 out of 100,000 docs)

How it works: Filter first → compute vector scores for all filtered docs → return top K

Performance: Fast when the filter is selective, because only the few documents that pass are scored. The cost grows with the number of documents that pass the filter.

Impact: Choosing the wrong policy can result in 5-10x slower queries!

Rule of Thumb:

Selective filter (keeps FEW docs) → Use ADHOC_BF
Non-selective filter (keeps MOST docs) → Use BATCHES
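
The two "How it works" flows can be compared on synthetic data by counting how many vectors each one touches. The code below is a toy simulation under simplifying assumptions (precomputed distances, an exact index walked in distance order); it is not Redis's implementation, and every name in it is hypothetical.

```typescript
interface Doc {
    id: number;
    distance: number; // Precomputed distance to the query vector
    passesFilter: boolean;
}

interface SearchOutcome {
    ids: number[];
    vectorsTouched: number;
}

// Batch flow: walk nearest neighbors batch by batch, filter each batch, stop at k hits.
function simulateBatches(docs: Doc[], k: number, batchSize: number): SearchOutcome {
    const byDistance = docs.slice().sort((a, b) => a.distance - b.distance);
    const hits: number[] = [];
    let vectorsTouched = 0;
    for (let start = 0; start < byDistance.length && hits.length < k; start += batchSize) {
        const batch = byDistance.slice(start, start + batchSize);
        vectorsTouched += batch.length;
        for (const doc of batch) {
            if (doc.passesFilter && hits.length < k) {
                hits.push(doc.id);
            }
        }
    }
    return { ids: hits, vectorsTouched };
}

// Brute-force flow: filter first, score every passing doc, keep the top k.
function simulateAdhocBruteForce(docs: Doc[], k: number): SearchOutcome {
    const passing = docs.filter((doc) => doc.passesFilter);
    const topK = passing.slice().sort((a, b) => a.distance - b.distance).slice(0, k);
    return { ids: topK.map((doc) => doc.id), vectorsTouched: passing.length };
}

// Build synthetic docs where distance equals id and a predicate decides the filter.
function makeDocs(total: number, passes: (id: number) => boolean): Doc[] {
    const docs: Doc[] = [];
    for (let id = 0; id < total; id++) {
        docs.push({ id, distance: id, passesFilter: passes(id) });
    }
    return docs;
}

const onePercentPass = makeDocs(1000, (id) => id % 100 === 0);
console.log(simulateBatches(onePercentPass, 10, 50).vectorsTouched);
console.log(simulateAdhocBruteForce(onePercentPass, 10).vectorsTouched);
```

Both flows return the same ids in this simulation; only the amount of work differs. Plugging in your own total, pass rate, and batch size gives a rough feel for which side does less work before you benchmark real queries.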

batchSize Parameter

When using BATCHES policy, control the batch processing size:

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    filter: '@category:{electronics}',
    hybridPolicy: 'BATCHES',
    batchSize: 100, // Process 100 documents per batch
    numResults: 10,
});

Guidelines:

Default: Redis determines automatically
Small batches (50-100): Better memory usage
Large batches (500-1000): Better throughput
Note: Only works with BATCHES policy

SVS-VAMANA Algorithm Tuning

SVS-VAMANA is Redis's newest vector index algorithm, optimized for billion-scale vector search. These parameters are only relevant if your index uses SVS-VAMANA.

searchWindowSize Parameter

Controls the search window size for KNN queries.

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    searchWindowSize: 100,
    numResults: 10,
});

Guidelines:

Larger values improve recall but increase latency
Start with 2-5x your numResults
Tune based on your recall requirements

useSearchHistory Parameter

Controls whether to use search history for improved performance.

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    useSearchHistory: 'AUTO', // 'OFF', 'ON', or 'AUTO'
    numResults: 10,
});

Options:

OFF: Disable search history
ON: Always use search history
AUTO: Let Redis decide based on query patterns (recommended)

searchBufferCapacity Parameter

Tunes the internal buffer for compressed vectors.

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    searchBufferCapacity: 1000,
    numResults: 10,
});

Guidelines:

Higher values: Better performance with memory overhead
Lower values: Lower memory with potential performance cost
Typically: 100-2000 depending on your workload

Combined VAMANA Tuning

const query = new VectorQuery({
    vector: embedding,
    vectorField: 'embedding',
    searchWindowSize: 200,
    useSearchHistory: 'AUTO',
    searchBufferCapacity: 1000,
    numResults: 10,
});

Complete Production Example

Here's a production-ready query combining multiple advanced features:

import { VectorQuery, VectorDistanceMetric, HuggingFaceVectorizer } from 'redis-vl';

// Generate embedding
const vectorizer = new HuggingFaceVectorizer({
    model: 'Xenova/all-MiniLM-L6-v2',
});
const queryEmbedding = await vectorizer.embed('laptop computer');

// Create optimized query
const query = new VectorQuery({
    // Basic parameters
    vector: queryEmbedding,
    vectorField: 'embedding',
    numResults: 20,
    returnFields: ['title', 'price', 'category'],

    // Distance normalization for user-friendly scores
    distanceMetric: VectorDistanceMetric.COSINE,
    normalizeDistance: true,

    // HNSW tuning for better recall
    efRuntime: 300,

    // Hybrid search with a metadata filter (choose the policy from measured filter selectivity)
    filter: '@category:{electronics} @price:[0 1000]',
    hybridPolicy: 'BATCHES',
    batchSize: 100,
});

// Execute search
const results = await index.search(query);

// Process results with normalized scores
results.documents.forEach((doc, i) => {
    console.log(`${i + 1}. ${doc.value.title}`);
    console.log(`   Similarity: ${(doc.score * 100).toFixed(1)}%`);
    console.log(`   Price: $${doc.value.price}`);
});

Performance Tuning Checklist

Before Production

Test different efRuntime values to find accuracy/speed balance
Choose correct hybridPolicy based on filter selectivity
Enable normalizeDistance if you need interpretable 0-1 similarity scores
Set appropriate batchSize for your workload
Monitor query latency and adjust parameters
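
One way to act on the first checklist item is to measure recall and latency at several efRuntime values and then pick the cheapest setting that meets a recall target. The sketch below assumes you have already collected such measurements; the `Measurement` shape, the function name, and the numbers are illustrative placeholders, not benchmark results.

```typescript
interface Measurement {
    efRuntime: number;
    recall: number; // 0-1, measured against ground truth
    p95LatencyMs: number;
}

// Return the lowest-latency measurement that meets the recall target, if any.
function pickEfRuntime(measurements: Measurement[], targetRecall: number): Measurement | undefined {
    const candidates = measurements.filter((m) => m.recall >= targetRecall);
    if (candidates.length === 0) {
        return undefined;
    }
    return candidates.reduce((best, m) => (m.p95LatencyMs < best.p95LatencyMs ? m : best));
}

// Placeholder numbers for illustration only.
const sweep: Measurement[] = [
    { efRuntime: 100, recall: 0.9, p95LatencyMs: 4 },
    { efRuntime: 200, recall: 0.96, p95LatencyMs: 7 },
    { efRuntime: 400, recall: 0.98, p95LatencyMs: 13 },
];

const choice = pickEfRuntime(sweep, 0.95);
console.log(choice ? choice.efRuntime : 'no setting meets the target');
```

Returning `undefined` when nothing meets the target keeps the "raise efRuntime further or revisit the index" decision explicit instead of silently accepting low recall.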

Monitoring

Track these metrics:

Query latency (p50, p95, p99)
Recall rate (if you have ground truth)
Filter selectivity (% of data matching filter)
Memory usage with different buffer sizes
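
The first three metrics are simple to compute once you log per-query latencies, result ids, and filter match counts. The helpers below are a sketch with hypothetical names; `percentile` uses the nearest-rank method, which is one of several common percentile definitions.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
    if (samples.length === 0) {
        throw new Error('No samples');
    }
    const sorted = samples.slice().sort((a, b) => a - b);
    const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
    return sorted[rank - 1];
}

// Fraction of the true nearest neighbors that the query actually returned.
function recallAtK(retrievedIds: string[], trueNeighborIds: string[]): number {
    if (trueNeighborIds.length === 0) {
        throw new Error('Ground truth is empty');
    }
    const found = trueNeighborIds.filter((id) => retrievedIds.indexOf(id) !== -1);
    return found.length / trueNeighborIds.length;
}

// Fraction of documents that pass a filter.
function filterSelectivity(matchingDocs: number, totalDocs: number): number {
    if (totalDocs <= 0) {
        throw new Error('totalDocs must be positive');
    }
    return matchingDocs / totalDocs;
}

const latenciesMs = [10, 20, 30, 40, 50];
console.log(percentile(latenciesMs, 50), percentile(latenciesMs, 95));
console.log(recallAtK(['a', 'b', 'x', 'd'], ['a', 'b', 'c', 'd']));
console.log(filterSelectivity(10000, 100000));
```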

Common Issues

Slow queries with filters:

Try switching hybridPolicy (BATCHES ↔ ADHOC_BF)
Increase batchSize if using BATCHES
Check filter selectivity

Low recall:

Increase efRuntime (HNSW)
Increase searchWindowSize (VAMANA)
Verify distance metric matches index

High memory usage:

Decrease searchBufferCapacity (VAMANA)
Decrease batchSize
Use smaller efRuntime

Next Steps

Search Index - Basic vector search
Schema - Define vector fields
Vectorizers - Generate embeddings
API Reference - Complete API documentation
