Schema

Schema in RedisVL provides a structured format to define index settings and field configurations using the following three components:

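Component   Description
version     The version of the schema spec. The currently supported version is 0.1.0.
index       Index-specific settings such as name, key prefix, key separator, and storage type.
fields      The subset of fields within your data to include in the index, along with any custom settings.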
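
To make the index settings more concrete, here is a minimal sketch of how a key prefix and key separator would typically combine with a record ID to form a Redis key. The helper `make_key` is hypothetical and not part of RedisVL, and the assumption that keys are joined exactly this way is not confirmed by this page.

```python
def make_key(prefix: str, key_separator: str, record_id: str) -> str:
    """Join prefix and ID with the separator (assumed convention, not the RedisVL API)."""
    return f"{prefix}{key_separator}{record_id}"


# With the settings used in the examples on this page (prefix "user", separator ":")
key = make_key("user", ":", "1001")
print(key)
```

Under that assumption, every document in the index shares the same prefix, which is what lets the index pick up keys by prefix.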

IndexSchema

class IndexSchema(*, index, fields={}, version='0.1.0')

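A schema definition for a search index in Redis, used in RedisVL to configure index settings and organize vector and metadata fields.

The class offers methods to create an index schema from a YAML file or a Python dictionary, supporting flexible schema definitions and easy integration into various workflows.

An example schema.yaml file might look like this: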

version: '0.1.0'

index:
    name: user-index
    prefix: user
    key_separator: ":"
    storage_type: json

fields:
    - name: user
      type: tag
    - name: credit_score
      type: tag
    - name: embedding
      type: vector
      attrs:
        algorithm: flat
        dims: 3
        distance_metric: cosine
        datatype: float32

Loading the schema for RedisVL from a YAML file is as simple as:

from redisvl.schema import IndexSchema

schema = IndexSchema.from_yaml("schema.yaml")

Loading the schema for RedisVL from a Python dictionary is just as simple:

from redisvl.schema import IndexSchema

schema = IndexSchema.from_dict({
    "index": {
        "name": "user-index",
        "prefix": "user",
        "key_separator": ":",
        "storage_type": "json",
    },
    "fields": [
        {"name": "user", "type": "tag"},
        {"name": "credit_score", "type": "tag"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "algorithm": "flat",
                "dims": 3,
                "distance_metric": "cosine",
                "datatype": "float32"
            }
        }
    ]
})

NOTE

The fields attribute in the schema must contain unique field names to ensure correct and unambiguous field references.
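
Because duplicate names are rejected, it can help to check a list of field definitions before building a schema from it. The following is a minimal sketch using only the standard library; `find_duplicate_field_names` is a hypothetical helper for illustration, not a RedisVL function.

```python
from collections import Counter
from typing import Any, Dict, List


def find_duplicate_field_names(fields: List[Dict[str, Any]]) -> List[str]:
    """Return the field names that occur more than once, in first-seen order."""
    counts = Counter(field["name"] for field in fields)
    return [name for name, count in counts.items() if count > 1]


fields = [
    {"name": "user", "type": "tag"},
    {"name": "credit_score", "type": "tag"},
    {"name": "user", "type": "text"},  # same name as the first field
]

duplicates = find_duplicate_field_names(fields)
print(duplicates)
```

An empty result means the list is safe to pass on as the `fields` entry of a schema dictionary.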

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

  • Parameters:
    • index (IndexInfo)
    • fields (Dict[str, BaseField])
    • version (Literal['0.1.0'])

add_field(field_inputs)

Adds a single field to the index schema based on the specified field type and attributes.

This method allows for the addition of individual fields to the schema, providing flexibility in defining the structure of the index.

  • Parameters: field_inputs (Dict[str, Any]) – A field to add.
  • Raises: ValueError – If the field name or type is not provided, or if the name already exists within the schema.

# Add a tag field
schema.add_field({"name": "user", "type": "tag"})

# Add a vector field
schema.add_field({
    "name": "user-embedding",
    "type": "vector",
    "attrs": {
        "dims": 1024,
        "algorithm": "flat",
        "datatype": "float32"
    }
})

add_fields(fields)

Extends the schema with additional fields.

This method allows dynamically adding new fields to the index schema. It processes a list of field definitions.

  • Parameters: fields (List[Dict[str, Any]]) – A list of fields to add.
  • Raises: ValueError – If a field with the same name already exists in the schema.

schema.add_fields([
    {"name": "user", "type": "tag"},
    {"name": "bio", "type": "text"},
    {
        "name": "user-embedding",
        "type": "vector",
        "attrs": {
            "dims": 1024,
            "algorithm": "flat",
            "datatype": "float32"
        }
    }
])

classmethod from_dict(data)

Create an IndexSchema from a dictionary.

  • Parameters: data (Dict[str, Any]) – The index schema data.
  • Returns: The index schema.
  • Return type: IndexSchema

from redisvl.schema import IndexSchema

schema = IndexSchema.from_dict({
    "index": {
        "name": "docs-index",
        "prefix": "docs",
        "storage_type": "hash",
    },
    "fields": [
        {
            "name": "doc-id",
            "type": "tag"
        },
        {
            "name": "doc-embedding",
            "type": "vector",
            "attrs": {
                "algorithm": "flat",
                "dims": 1536
            }
        }
    ]
})

classmethod from_yaml(file_path)

Create an IndexSchema from a YAML file.

  • Parameters: file_path (str) – The path to the YAML file.
  • Returns: The index schema.
  • Return type: IndexSchema

from redisvl.schema import IndexSchema

schema = IndexSchema.from_yaml("schema.yaml")

remove_field(field_name)

Removes a field from the schema based on the specified name.

This method is useful for dynamically altering the schema by removing existing fields.

  • Parameters: field_name (str) – The name of the field to be removed.

to_dict()

Serialize the index schema model to a dictionary, handling Enums and other special cases properly.

  • Returns: The index schema as a dictionary.
  • Return type: Dict[str, Any]

to_yaml(file_path, overwrite=True)

Write the index schema to a YAML file.

  • Parameters:
    • file_path (str) – The path to the YAML file.
    • overwrite (bool) – Whether to overwrite the file if it already exists.
  • Raises: FileExistsError – If the file already exists and overwrite is False.
  • Return type: None
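
As a rough illustration of the overwrite behavior, the sketch below writes a schema to a temporary file, tries to write it again with overwrite=False, and then reloads it with from_yaml. It uses only the methods documented on this page. The import is guarded so the snippet is skipped where redisvl is not installed, and the wrapper function `roundtrip_demo` is a hypothetical name used for illustration.

```python
import os
import tempfile

try:
    from redisvl.schema import IndexSchema
except ImportError:  # redisvl is not installed; the demo below is skipped
    IndexSchema = None


def roundtrip_demo():
    """Write a schema to YAML, reload it, and report what happened."""
    if IndexSchema is None:
        return None

    schema = IndexSchema.from_dict({
        "index": {"name": "user-index", "prefix": "user", "storage_type": "json"},
        "fields": [
            {"name": "user", "type": "tag"},
            {"name": "embedding", "type": "vector",
             "attrs": {"algorithm": "flat", "dims": 3}},
        ],
    })

    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "schema.yaml")
        schema.to_yaml(path)  # first write creates the file

        refused = False
        try:
            schema.to_yaml(path, overwrite=False)  # file exists now
        except FileExistsError:
            refused = True  # the second write was rejected

        reloaded = IndexSchema.from_yaml(path)

    return reloaded.field_names, refused


result = roundtrip_demo()
print(result)
```

Passing overwrite=False is a reasonable default for scripts that should never clobber a hand-edited schema file.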

property field_names: List[str]

A list of field names associated with the index schema.

  • Returns: A list of field names from the schema.
  • Return type: List[str]

fields: Dict[str, BaseField]

Fields associated with the search index and their properties.

index: IndexInfo

Details of the basic index configurations.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

version: Literal['0.1.0']

Version of the underlying index schema.

Defining Fields

Fields in the schema can be defined in YAML format or as a Python dictionary, specifying a name, type, an optional path, and attributes for customization.

YAML Example:

- name: title
  type: text
  path: $.document.title
  attrs:
    weight: 1.0
    no_stem: false
    withsuffixtrie: true

Python Dictionary Example:

{
    "name": "location",
    "type": "geo",
    "attrs": {
        "sortable": True
    }
}
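
When many fields are defined in code, a small builder can keep the dictionaries consistent. The following is a minimal sketch that assembles the name, type, optional path, and attrs keys described above; `make_field` is a hypothetical helper, not part of RedisVL.

```python
from typing import Any, Dict, Optional


def make_field(name: str, field_type: str,
               path: Optional[str] = None, **attrs: Any) -> Dict[str, Any]:
    """Build a field definition dict; path and attrs are included only when given."""
    field: Dict[str, Any] = {"name": name, "type": field_type}
    if path is not None:
        field["path"] = path
    if attrs:
        field["attrs"] = attrs
    return field


# Mirrors the YAML example: a text field with an explicit path
title = make_field("title", "text", path="$.document.title",
                   weight=1.0, no_stem=False, withsuffixtrie=True)

# Mirrors the dictionary example: a sortable geo field
location = make_field("location", "geo", sortable=True)

print(title)
print(location)
```

The resulting dictionaries have the same shape as the examples above, so they can be collected into a list and used as the `fields` entry of a schema dictionary.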

Supported Field Types and Attributes

Each field type supports specific attributes that customize its behavior. Below are the field types and their available attributes:

Text Field Attributes:

  • weight: Importance of the field in result calculation.
  • no_stem: Disables stemming during indexing.
  • withsuffixtrie: Optimizes queries by maintaining a suffix trie.
  • phonetic_matcher: Enables phonetic matching.
  • sortable: Allows sorting on this field.

Tag Field Attributes:

  • separator: Character for splitting text into individual tags.
  • case_sensitive: Case sensitivity in tag matching.
  • withsuffixtrie: Suffix trie optimization for queries.
  • sortable: Enables sorting based on the tag field.

Numeric and Geo Field Attributes:

  • Both numeric and geo fields support the sortable attribute, enabling sorting on these fields.

Common Vector Field Attributes:

  • dims: Dimensionality of the vector.
  • algorithm: Indexing algorithm (flat or hnsw).
  • datatype: Float datatype of the vector (bfloat16, float16, float32, float64).
  • distance_metric: Metric for measuring query relevance (COSINE, L2, IP).

HNSW Vector Field Specific Attributes:

  • m: Max outgoing edges per node in each layer.
  • ef_construction: Max edge candidates during build time.
  • ef_runtime: Max top candidates during search.
  • epsilon: Range search boundary factor.
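
The vector attribute values listed above can be checked with a few lines of plain Python before a schema is built. This is a sketch under stated assumptions: `check_vector_attrs` is a hypothetical helper, the comparison is case-insensitive because the examples on this page use both lowercase and uppercase spellings, and flagging HNSW-only attributes on a flat field is a convenience check here, not documented library behavior. The HNSW numbers are illustrative values only.

```python
from typing import Any, Dict, List

ALGORITHMS = {"flat", "hnsw"}
DATATYPES = {"bfloat16", "float16", "float32", "float64"}
DISTANCE_METRICS = {"cosine", "l2", "ip"}
HNSW_ONLY = {"m", "ef_construction", "ef_runtime", "epsilon"}


def check_vector_attrs(attrs: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a vector field's attrs (empty if none)."""
    problems: List[str] = []

    dims = attrs.get("dims")
    if not isinstance(dims, int) or dims <= 0:
        problems.append("dims must be a positive integer")

    algorithm = str(attrs.get("algorithm", "")).lower()
    if algorithm not in ALGORITHMS:
        problems.append(f"unknown algorithm: {algorithm!r}")

    # datatype and distance_metric are only checked when present
    for key, allowed in (("datatype", DATATYPES),
                         ("distance_metric", DISTANCE_METRICS)):
        if key in attrs and str(attrs[key]).lower() not in allowed:
            problems.append(f"unknown {key}: {attrs[key]!r}")

    # Likely mistake: HNSW tuning knobs on a flat field
    if algorithm == "flat":
        stray = sorted(HNSW_ONLY.intersection(attrs))
        if stray:
            problems.append(f"HNSW-only attributes on a flat field: {stray}")

    return problems


hnsw_attrs = {
    "dims": 1024,
    "algorithm": "hnsw",
    "datatype": "float32",
    "distance_metric": "cosine",
    "m": 16,                 # illustrative values, not recommendations
    "ef_construction": 200,
    "ef_runtime": 10,
    "epsilon": 0.01,
}

print(check_vector_attrs(hnsw_attrs))
print(check_vector_attrs({"dims": 3, "algorithm": "flat", "m": 16}))
```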

Note: See the fully documented Redis-supported fields and options here: https://redis.io/commands/ft.create/