Schema

A schema in RedisVL defines index settings and field configurations in a structured format. It consists of three main components:

Component   Description
index       Index-specific settings like name, key prefix, and storage type
fields      Field definitions with types and attributes
version     Schema version (currently 0.1.0)
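
As a rough illustration of how the three components fit together, the shape of a schema object can be sketched with TypeScript types. The type names below (StorageType, IndexInfo, FieldDefinition, SchemaObject) and the summarize helper are hypothetical and exist only for this sketch; RedisVL may export its own types under different names.

```typescript
// Hypothetical types for illustration only; not RedisVL's exported types.
type StorageType = 'hash' | 'json';

interface IndexInfo {
  name: string;
  prefix?: string;
  storage_type: StorageType;
}

interface FieldDefinition {
  name: string;
  type: 'text' | 'tag' | 'numeric' | 'geo' | 'vector';
  attrs?: Record<string, unknown>;
}

interface SchemaObject {
  version?: string;
  index: IndexInfo;
  fields: FieldDefinition[];
}

const productSchema: SchemaObject = {
  version: '0.1.0',
  index: { name: 'products', prefix: 'product:', storage_type: 'hash' },
  fields: [
    { name: 'title', type: 'text' },
    { name: 'price', type: 'numeric' },
  ],
};

// Summarize a schema as "name (storage): field:type, ..." for logging.
function summarize(schema: SchemaObject): string {
  const fields = schema.fields.map((f) => `${f.name}:${f.type}`).join(', ');
  return `${schema.index.name} (${schema.index.storage_type}): ${fields}`;
}

const summary = summarize(productSchema);
// summary === 'products (hash): title:text, price:numeric'
```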

Creating a Schema

From Object

The most common way to create a schema is using IndexSchema.fromObject():

import { IndexSchema } from 'redisvl';

const schema = IndexSchema.fromObject({
  index: {
    name: 'products',
    prefix: 'product:',
    storage_type: 'hash',
  },
  fields: [
    { name: 'title', type: 'text' },
    { name: 'category', type: 'tag' },
    { name: 'price', type: 'numeric' },
  ],
});

From YAML

For complex schemas, YAML files provide better readability:

# schema.yaml
version: '0.1.0'

index:
  name: products
  prefix: 'product:'
  storage_type: hash

fields:
  - name: title
    type: text
  - name: category
    type: tag
  - name: price
    type: numeric

Load the file with IndexSchema.fromYAML():

const schema = IndexSchema.fromYAML('schema.yaml');

Programmatically

Build schemas dynamically by adding fields:

const schema = new IndexSchema({
  name: 'products',
  prefix: 'product:',
  storageType: 'hash',
});

// Add single field
schema.addField({ name: 'title', type: 'text' });

// Add multiple fields
schema.addFields([
  { name: 'category', type: 'tag' },
  { name: 'price', type: 'numeric' },
  {
    name: 'embedding',
    type: 'vector',
    attrs: { dims: 768, algorithm: 'hnsw', distance_metric: 'cosine' },
  },
]);
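
One reason to build a schema programmatically is that the field list can be derived from data. The sketch below infers field definitions from a sample record; the inferField heuristic and both function names are assumptions made for illustration, not part of RedisVL.

```typescript
type FieldType = 'text' | 'tag' | 'numeric';

interface FieldDefinition {
  name: string;
  type: FieldType;
}

// Hypothetical heuristic: numbers become numeric fields, strings without
// whitespace become tags, and anything else is indexed as text.
function inferField(name: string, sample: unknown): FieldDefinition {
  if (typeof sample === 'number') return { name, type: 'numeric' };
  if (typeof sample === 'string' && !/\s/.test(sample)) return { name, type: 'tag' };
  return { name, type: 'text' };
}

// Build one field definition per key of the sample record.
function inferFields(record: Record<string, unknown>): FieldDefinition[] {
  return Object.entries(record).map(([name, value]) => inferField(name, value));
}

const fields = inferFields({
  title: 'Wireless noise-cancelling headphones',
  category: 'electronics',
  price: 199.99,
});
// fields: title → text, category → tag, price → numeric
```

The resulting array has the same shape as the argument to schema.addFields() shown above, so it could be passed there directly.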

Field Types

RedisVL supports several field types for different data:

Text Fields

Full-text search with stemming and phonetic matching:

{
  name: 'description',
  type: 'text',
  attrs: {
    weight: 2.0,    // Boost importance in search
    no_stem: false, // Enable stemming
  }
}

Tag Fields

Exact-match filtering for categorical data:

{
  name: 'category',
  type: 'tag',
  attrs: {
    separator: ',', // Multi-value separator
    case_sensitive: false
  }
}
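
To make the separator and case_sensitive attributes concrete, here is a small sketch of how a multi-value string breaks into individual tags. It approximates the documented behavior of Redis tag fields (split on the separator, trim whitespace, lowercase unless case-sensitive); splitTags is a hypothetical helper, not the actual indexing code.

```typescript
interface TagAttrs {
  separator: string;
  case_sensitive: boolean;
}

// Approximate how a raw field value is broken into individual tags.
function splitTags(raw: string, attrs: TagAttrs): string[] {
  return raw
    .split(attrs.separator)
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0)
    .map((tag) => (attrs.case_sensitive ? tag : tag.toLowerCase()));
}

const tags = splitTags('Electronics, Audio,Wireless', {
  separator: ',',
  case_sensitive: false,
});
// tags → ['electronics', 'audio', 'wireless']
```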

Numeric Fields

Range queries and sorting:

{
  name: 'price',
  type: 'numeric',
  attrs: {
    sortable: true
  }
}
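
A numeric field is typically queried with a range clause in Redis query syntax, such as @price:[10 50]. The sketch below builds such clauses as plain strings; numericRange is a hypothetical helper written for this page and is not a RedisVL function.

```typescript
interface RangeOptions {
  exclusiveMin?: boolean;
  exclusiveMax?: boolean;
}

// Build a numeric range clause. Leave a bound undefined for an open range;
// a leading "(" marks a bound as exclusive.
function numericRange(
  field: string,
  min?: number,
  max?: number,
  opts: RangeOptions = {},
): string {
  const lo = min === undefined ? '-inf' : `${opts.exclusiveMin ? '(' : ''}${min}`;
  const hi = max === undefined ? '+inf' : `${opts.exclusiveMax ? '(' : ''}${max}`;
  return `@${field}:[${lo} ${hi}]`;
}

const between = numericRange('price', 10, 50);
// between === '@price:[10 50]'
const atLeast = numericRange('price', 100);
// atLeast === '@price:[100 +inf]'
const below = numericRange('price', undefined, 20, { exclusiveMax: true });
// below === '@price:[-inf (20]'
```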

Geo Fields

Location-based search:

{
  name: 'location',
  type: 'geo'
}
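
Redis geo fields expect each value as a "longitude,latitude" string, with longitude first. A minimal sketch of a formatter with range checks follows; toGeoValue is a hypothetical helper, and the latitude limit is the one documented for the Redis GEOADD command, assumed here to apply to indexed geo fields as well.

```typescript
// Latitude limit documented for GEOADD; assumed to apply to geo fields too.
const MAX_LATITUDE = 85.05112878;

// Format a coordinate pair as "longitude,latitude" (longitude first).
function toGeoValue(lon: number, lat: number): string {
  if (!Number.isFinite(lon) || lon < -180 || lon > 180) {
    throw new RangeError(`longitude out of range: ${lon}`);
  }
  if (!Number.isFinite(lat) || lat < -MAX_LATITUDE || lat > MAX_LATITUDE) {
    throw new RangeError(`latitude out of range: ${lat}`);
  }
  return `${lon},${lat}`;
}

const sanFrancisco = toGeoValue(-122.4194, 37.7749);
// sanFrancisco === '-122.4194,37.7749'
```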

Vector Fields

Semantic similarity search using vector embeddings. RedisVL provides three approaches to define vector fields.

Best for: Prototypes, config files, quick development

A plain object literal is simple and concise for basic vector fields:

{
  name: 'embedding',
  type: 'vector',
  attrs: {
    algorithm: 'hnsw',
    dims: 768,
    distance_metric: 'cosine',
    datatype: 'float32'
  }
}

Pros:

  • ✅ Simple and readable
  • ✅ Works with IndexSchema.fromObject()
  • ✅ Good for configuration files (YAML/JSON)

Cons:

  • ❌ No type checking on attributes
  • ❌ Easy to make mistakes with attribute names
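
One way to soften the cons listed above while keeping object literals is a thin typed wrapper of your own. The sketch below is an assumption-laden illustration: VectorAttrs and vectorField are hypothetical names, the attribute names follow the object example above, and the 'flat' algorithm and the lowercase metric values are assumptions rather than confirmed RedisVL options.

```typescript
// Hypothetical attribute type; names mirror the object-literal example.
type VectorAttrs = {
  algorithm: 'flat' | 'hnsw';
  dims: number;
  distance_metric: 'cosine' | 'l2' | 'ip';
  datatype?: 'float32' | 'float64';
};

// Build a vector field definition, checking dims at runtime and
// defaulting datatype to float32 when it is not given.
function vectorField(name: string, attrs: VectorAttrs) {
  if (!Number.isInteger(attrs.dims) || attrs.dims <= 0) {
    throw new RangeError(`dims must be a positive integer, got ${attrs.dims}`);
  }
  return {
    name,
    type: 'vector' as const,
    attrs: { datatype: 'float32' as const, ...attrs },
  };
}

const embeddingField = vectorField('embedding', {
  algorithm: 'hnsw',
  dims: 768,
  distance_metric: 'cosine',
});
// A misspelled key such as `distanc_metric`, or `dims: '768'`,
// is now rejected by the TypeScript compiler.
```

The returned object has the same shape as the plain definition above, so it can sit in a fields array next to hand-written entries.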

Vector Field Parameters

Common Parameters (all approaches):

  • name: Field name in Redis
  • dims: Vector dimensions (must match embedding model)
  • distanceMetric: COSINE, L2, or IP
  • datatype: float32 (default) or float64

HNSW-Specific Parameters:

  • m: Number of bi-directional links per node (default: 16)
    • Higher = better recall, more memory
    • Typical range: 8-64
  • efConstruction: Build-time accuracy parameter (default: 200)
    • Higher = better index quality, slower build
    • Typical range: 100-500
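
For intuition about the three distanceMetric options, the sketch below computes the textbook definitions for two vectors. These are assumptions about what the names refer to; the scores Redis actually returns may be transformed or scaled differently, so treat this as an illustration and not as the scoring code.

```typescript
// Inner product (IP): sum of element-wise products.
function innerProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vectors must have the same dims');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cosine distance: 1 minus cosine similarity. Returns NaN for zero vectors.
function cosineDistance(a: number[], b: number[]): number {
  const norms = Math.sqrt(innerProduct(a, a)) * Math.sqrt(innerProduct(b, b));
  return 1 - innerProduct(a, b) / norms;
}

// L2: straight-line (Euclidean) distance.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vectors must have the same dims');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

const orthogonal = cosineDistance([1, 0], [0, 1]); // 1: unrelated directions
const identical = cosineDistance([3, 4], [3, 4]);  // 0: same direction
const straightLine = euclideanDistance([0, 0], [3, 4]); // 5
```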

Storage Types

HASH Storage

Best for simple, flat data structures:

{
  index: {
    name: 'users',
    storage_type: 'hash'
  }
}

Characteristics:

  • ✅ Fast and memory-efficient
  • ✅ Good for simple key-value pairs
  • ❌ Cannot store nested objects or arrays
  • ❌ All values stored as strings
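
The two limitations above can be made concrete with a small sketch that prepares a record for HASH storage: every value becomes a string, and nested values are rejected. toHashFields is a hypothetical helper for illustration; it is not how RedisVL itself loads data.

```typescript
type HashFields = Record<string, string>;

// Convert a flat record to string values, refusing nested objects and arrays.
function toHashFields(doc: Record<string, unknown>): HashFields {
  const out: HashFields = {};
  for (const [key, value] of Object.entries(doc)) {
    if (value === null || value === undefined) continue; // skip missing values
    if (typeof value === 'object') {
      throw new TypeError(`"${key}" is nested; HASH storage only holds flat values`);
    }
    out[key] = String(value);
  }
  return out;
}

const flat = toHashFields({ name: 'Ada', age: 36, active: true });
// flat → { name: 'Ada', age: '36', active: 'true' }
```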

JSON Storage

Best for complex, nested data:

{
  index: {
    name: 'products',
    storage_type: 'json'
  },
  fields: [
    // JSONPath syntax for nested fields
    { name: '$.title', type: 'text', attrs: { as: 'title' } },
    { name: '$.category', type: 'tag', attrs: { as: 'category' } },
    { name: '$.metadata.rating', type: 'numeric', attrs: { as: 'rating' } },
    {
      name: '$.embedding',
      type: 'vector',
      attrs: {
        as: 'embedding',
        dims: 384,
        algorithm: 'hnsw',
        distanceMetric: 'cosine'
      }
    },
  ]
}

JSON Path Syntax:

  • Use $.fieldName for top-level fields
  • Use $.parent.child for nested fields
  • Use attrs.as to define the field alias for queries
  • The as attribute is the name you'll use in search queries

Example Document:

{
  "title": "Laptop",
  "category": "electronics",
  "metadata": {
    "rating": 4.5,
    "reviews": 120
  },
  "embedding": [0.1, 0.2, ...]
}
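
To show how a path and its alias relate to a document like this one, the sketch below resolves simple dotted paths and collects the values under their aliases. resolvePath is a hypothetical helper that handles only the $.a.b form used on this page, not full JSONPath, and the embedding is left out for brevity.

```typescript
// Resolve a simple dotted path such as "$.metadata.rating".
// Returns undefined when any segment is missing.
function resolvePath(doc: unknown, path: string): unknown {
  if (!path.startsWith('$.')) throw new Error(`unsupported path: ${path}`);
  let current: unknown = doc;
  for (const key of path.slice(2).split('.')) {
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

const product = {
  title: 'Laptop',
  category: 'electronics',
  metadata: { rating: 4.5, reviews: 120 },
};

// Alias → JSON path, mirroring the attrs.as values in the schema.
const aliases: Record<string, string> = {
  title: '$.title',
  rating: '$.metadata.rating',
};

const indexed: Record<string, unknown> = {};
for (const [alias, path] of Object.entries(aliases)) {
  indexed[alias] = resolvePath(product, path);
}
// indexed → { title: 'Laptop', rating: 4.5 }
```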

Querying:

// Use the alias, not the JSON path
const query = new VectorQuery({
  vector: embedding,
  vectorField: 'embedding', // ← Uses alias from attrs.as
  filter: '@category:{electronics} @rating:[4.0 5.0]',
  numResults: 10,
});
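
The filter string in this query combines a tag clause and a numeric clause; in Redis query syntax, clauses separated by a space must all match. A minimal sketch for assembling such strings follows. tagClause and allOf are hypothetical helpers, and the escaping rule (backslash before any character that is not a letter, digit, or underscore) is a conservative assumption, not a statement of exactly what Redis requires.

```typescript
// Escape characters that may be treated as syntax inside a tag value.
function escapeTag(value: string): string {
  return value.replace(/[^A-Za-z0-9_]/g, (ch) => `\\${ch}`);
}

// Build "@field:{a | b}", which matches documents tagged with any listed value.
function tagClause(field: string, values: string[]): string {
  if (values.length === 0) throw new Error('at least one tag value is required');
  return `@${field}:{${values.map(escapeTag).join(' | ')}}`;
}

// Join clauses with spaces so that every clause must match.
function allOf(...clauses: string[]): string {
  return clauses.join(' ');
}

const filter = allOf(tagClause('category', ['electronics']), '@rating:[4 5]');
// filter === '@category:{electronics} @rating:[4 5]'
```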

Characteristics:

  • ✅ Supports nested objects and arrays
  • ✅ Type preservation (numbers, booleans, arrays)
  • ✅ JSONPath queries with full path support
  • ✅ Field aliases for cleaner query syntax
  • ❌ Slightly higher memory usage than HASH

Error Handling

RedisVL throws a SchemaValidationError when a field configuration is invalid:

import { IndexSchema, SchemaValidationError } from 'redisvl';

try {
  const schema = IndexSchema.fromObject({
    index: { name: 'test', prefix: 'doc:', storage_type: 'hash' },
    fields: [
      { name: 'embedding', type: 'vector', attrs: { dims: -1 } }, // Invalid dims
    ],
  });
} catch (error) {
  if (error instanceof SchemaValidationError) {
    console.error('Invalid schema:', error.message);
  } else {
    throw error; // Re-throw anything that is not a validation error
  }
}
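
When field definitions come from user input or generated config, it can help to collect every problem up front instead of stopping at the first thrown error. The sketch below is a stand-in pre-check with made-up rules (known types, unique names, positive integer dims); findSchemaIssues is hypothetical and does not reproduce RedisVL's own validation.

```typescript
interface FieldDefinition {
  name: string;
  type: string;
  attrs?: { dims?: number };
}

const KNOWN_TYPES = new Set(['text', 'tag', 'numeric', 'geo', 'vector']);

// Return a list of human-readable problems; an empty list means no issues found.
function findSchemaIssues(fields: FieldDefinition[]): string[] {
  const issues: string[] = [];
  const seen = new Set<string>();
  for (const field of fields) {
    if (seen.has(field.name)) issues.push(`duplicate field name: ${field.name}`);
    seen.add(field.name);
    if (!KNOWN_TYPES.has(field.type)) {
      issues.push(`unknown type "${field.type}" on ${field.name}`);
    }
    if (field.type === 'vector') {
      const dims = field.attrs?.dims;
      if (dims === undefined || !Number.isInteger(dims) || dims <= 0) {
        issues.push(`invalid dims on ${field.name}`);
      }
    }
  }
  return issues;
}

const issues = findSchemaIssues([
  { name: 'embedding', type: 'vector', attrs: { dims: -1 } },
  { name: 'embedding', type: 'text' },
]);
// issues → ['invalid dims on embedding', 'duplicate field name: embedding']
```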

Next Steps