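Best practices for Redis Query Engine performance
Note:
If you're using Redis Software or Redis Cloud, see the Best practices for scalable Redis Query Engine page.
Checklist
The following are some basic steps to ensure good performance of the Redis Query Engine (RQE).
- Create a Redis data model with your query patterns in mind.
- Use the sizing calculator to ensure your Redis architecture is sized for the expected load.
- Provision Redis nodes with enough resources (RAM, CPU, network) to support the expected maximum load.
- Review FT.INFO and FT.PROFILE outputs for anomalies and/or errors.
- Conduct load testing in a test environment with real-world queries and a load generated by either memtier_benchmark or a custom load application.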
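Indexing considerations
General
Non-threaded search
Threaded (query performance factor or QPF) search
- Put both query fields and any projected fields (RETURN or LOAD) in the index.
- Set all fields to SORTABLE.
- Set TAG fields to UNF.
- Optional: set TEXT fields to NOSTEM if the use case supports it.
- Use DIALECT 4.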
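The threaded-search items above can be sketched as a small Python helper that assembles an FT.CREATE argument list. This is an illustration only: the index name, key prefix, and field names are hypothetical, and the helper builds the command without contacting a server. With a client such as redis-py, a list like this could be passed to `execute_command(*create_cmd)`.

```python
# Sketch: build an FT.CREATE command that follows the threaded-search checklist.
# The index name, prefix, and fields below are hypothetical examples.

def schema_field(path, alias, field_type):
    """Return the SCHEMA tokens for one field with the checklist options applied."""
    tokens = [path, "AS", alias, field_type]
    if field_type == "TEXT":
        tokens.append("NOSTEM")   # optional: only if the use case works without stemming
    tokens.append("SORTABLE")     # checklist: set all fields to SORTABLE
    if field_type == "TAG":
        tokens.append("UNF")      # checklist: set TAG fields to UNF
    return tokens


def build_create(index, prefix, fields):
    """Assemble the full FT.CREATE argument list for a JSON index."""
    cmd = ["FT.CREATE", index, "ON", "JSON", "PREFIX", "1", prefix, "SCHEMA"]
    for path, alias, field_type in fields:
        cmd += schema_field(path, alias, field_type)
    return cmd


create_cmd = build_create(
    "idx:example",
    "example:",
    [("$.title", "title", "TEXT"), ("$.sku", "sku", "TAG"), ("$.price", "price", "NUMERIC")],
)
print(" ".join(create_cmd))
```

Generating the options from the field type keeps every field consistent, which is easier to review than a hand-written schema.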
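Query optimization
- Avoid returning large result sets. Use CURSOR or LIMIT.
- Avoid wildcard searches.
- Avoid projecting all fields (for example, LOAD *). Project only those fields that are part of the index schema.
- If queries are long-running, enable threading (query performance factor) to reduce contention for the main Redis thread.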
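One way to apply the first and third items is to read results in batches through a cursor while loading only named fields. The Python sketch below only assembles the argument lists for FT.AGGREGATE with WITHCURSOR and for FT.CURSOR READ; it does not talk to a server. The batch size and the cursor id are placeholders, and in practice the cursor id comes from the server's reply to the first command.

```python
# Sketch: batch results with a cursor and project only named fields.
# Batch size and cursor id are placeholder values for illustration.

def aggregate_with_cursor(index, query, fields, batch_size, dialect=4):
    """FT.AGGREGATE that loads only `fields` and returns results in batches."""
    return (
        ["FT.AGGREGATE", index, query]
        + ["LOAD", str(len(fields)), *fields]        # targeted projection, never LOAD *
        + ["WITHCURSOR", "COUNT", str(batch_size)]   # the server keeps the remaining rows
        + ["DIALECT", str(dialect)]
    )


def cursor_read(index, cursor_id, batch_size):
    """FT.CURSOR READ for the next batch of an open cursor."""
    return ["FT.CURSOR", "READ", index, str(cursor_id), "COUNT", str(batch_size)]


first_cmd = aggregate_with_cursor("jsonidx:profiles", "@t:[1299 1299]", ["id", "name"], 100)
next_cmd = cursor_read("jsonidx:profiles", 12345, 100)  # 12345 stands in for a real cursor id
print(" ".join(first_cmd))
print(" ".join(next_cmd))
```

Deriving the LOAD count from the field list avoids a mismatch between the count and the names that follow it.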
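Validate performance (FT.PROFILE)
You can analyze FT.PROFILE output to gain insights about query execution.
The following informational items are available for analysis:
- Total execution time
- Execution time per shard
- Coordination time (for multi-sharded environments)
- Breakdown of the query into fundamental components, such as UNION and INTERSECT
- Warnings, such as TIMEOUT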
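To profile a query you already run, it can help to derive the FT.PROFILE form from the original command rather than retyping it. The following Python sketch does that for an argument list; the helper name `to_profile` is hypothetical, and the code only rewrites the list without executing anything.

```python
# Sketch: rewrite an FT.SEARCH or FT.AGGREGATE argument list as FT.PROFILE.
# `to_profile` is a hypothetical helper name used only for this illustration.

def to_profile(command):
    """Return the FT.PROFILE equivalent of a search or aggregate command."""
    name, index, *rest = command
    kinds = {"FT.SEARCH": "SEARCH", "FT.AGGREGATE": "AGGREGATE"}
    kind = kinds.get(name.upper())
    if kind is None:
        raise ValueError(f"cannot profile {name}")
    # FT.PROFILE <index> SEARCH|AGGREGATE QUERY <query and its arguments>
    return ["FT.PROFILE", index, kind, "QUERY", *rest]


search_cmd = ["FT.SEARCH", "jsonidx:profiles", "@t:[1299 1299]", "LIMIT", "0", "10"]
profile_cmd = to_profile(search_cmd)
print(" ".join(profile_cmd))
```

The layout of the reply varies between versions, so this sketch stops at building the command and leaves parsing of the output aside.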
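Anti-patterns
When designing and querying indexes in RQE, certain practices can hinder performance, scalability, and maintainability. Here are some common anti-patterns to avoid:
- Large documents: storing excessively large documents in Redis slows data retrieval and increases memory usage. Break data down into smaller, more focused records whenever possible.
- Deeply nested fields: retrieving or indexing deeply nested JSON fields is computationally expensive. Use a flatter schema for better performance.
- Large result sets: fetching unnecessarily large result sets puts pressure on memory and network resources. Limit results to only what is needed.
- Wildcarding: using wildcard patterns indiscriminately in queries can lead to large and inefficient scans, especially when the index is large.
- Large projections: including too many fields in query results increases memory overhead and slows query execution. Limit projections to essential fields.
The following examples depict an anti-pattern index schema and query, followed by corrected versions designed for scalability with RQE.
Anti-pattern index schema
The following schema introduces challenges for scalability and performance: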
FT.CREATE jsonidx:profiles ON JSON PREFIX 1 profiles:
SCHEMA $.tags.* as t NUMERIC SORTABLE
$.firstName as name TEXT
$.location as loc GEO
Issues:
- Minimal schema definition: the schema is sparse and lacks fields like lastName, id, and version that might be frequently queried. This results in additional operations to fetch these fields separately, reducing efficiency.
- Missing SORTABLE flag for text fields: sorting operations on unsortable fields require full-text processing, which is slow.
- Wildcard indexing: $.tags.* creates a broad index that can lead to excessive memory usage and reduced query performance.
Anti-pattern query
The following query is inefficient and not optimized for vertical scaling:
FT.AGGREGATE jsonidx:profiles '@t:[1299 1299]' LOAD * LIMIT 0 10
Issues:
- Wildcard projection (LOAD *): retrieving all fields in the result set is inefficient and increases memory usage, especially if the documents are large.
- Unnecessary fields: fields that aren't required for the current operation are still fetched, slowing down execution.
- Lack of advanced query syntax: without specifying a query dialect or leveraging features like tagging, the query may perform unnecessary computations.
Improved index schema
Here’s an optimized schema that adheres to best practices for vertical scaling:
FT.CREATE jsonidx:profiles ON JSON PREFIX 1 profiles:
SCHEMA $.tags.* as t NUMERIC SORTABLE
$.firstName as name TEXT NOSTEM SORTABLE
$.lastName as lastname TEXT NOSTEM SORTABLE
$.location as loc GEO SORTABLE
$.id as id TAG SORTABLE UNF
$.ver as ver TAG SORTABLE UNF
Improvements:
- NOSTEM for text fields: prevents stemming on fields like firstName and lastName to allow for exact matches (e.g., "Smith" stays "Smith").
- Expanded schema: adds commonly queried fields like lastName, id, and version (indexed here as ver), making queries more efficient by reducing the need for post-query data retrieval.
- TAG fields: id and ver are defined as TAG fields to support fast filtering with exact matches.
- SORTABLE for all relevant fields: ensures that sorting operations are efficient without requiring full-text scanning.
You might be wondering why $.tags.* as t NUMERIC SORTABLE is acceptable in the improved schema when it wasn't previously.
The inclusion of $.tags.* is acceptable when:
- It has a clear purpose: it is actively used in queries, such as filtering on numeric ranges or matching specific values.
- Other fields in the schema complement it: these fields reduce over-reliance on $.tags.* for all query operations, distributing the load more evenly.
- Projections and limits are managed carefully: queries that use $.tags.* should avoid loading unnecessary fields or returning excessively large result sets.
Improved query
The following query is better suited for vertical scaling:
FT.AGGREGATE jsonidx:profiles '@t:[1299 1299]'
LOAD 6 id t name lastname loc ver
LIMIT 0 10
DIALECT 3
Improvements:
- Targeted projection: the LOAD clause specifies only essential fields (id, t, name, lastname, loc, ver), reducing memory and network overhead.
- Limited results: the LIMIT clause ensures the query retrieves only the first 10 results, avoiding large result sets.
- DIALECT 3: sets the query dialect explicitly, enabling newer RQE syntax and features rather than relying on the server default.
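The difference between the two queries can be made concrete with a small check. The Python sketch below is a hypothetical lint, not part of RQE: it inspects a command's argument list for a few of the patterns this page advises against and is run on both the anti-pattern query and the improved one.

```python
# Sketch: flag a few query anti-patterns from this page in an argument list.
# `lint_query` is a hypothetical helper; it inspects tokens and runs nothing.

def lint_query(command):
    """Return warnings for LOAD *, unbounded results, and a missing DIALECT."""
    args = [str(t) for t in command[3:]]   # skip command name, index, and query string
    upper = [t.upper() for t in args]
    warnings = []
    if "LOAD" in upper:
        i = upper.index("LOAD")
        if i + 1 < len(args) and args[i + 1] == "*":
            warnings.append("LOAD * projects every field")
    if "LIMIT" not in upper and "WITHCURSOR" not in upper:
        warnings.append("no LIMIT or cursor")
    if "DIALECT" not in upper:
        warnings.append("no explicit DIALECT")
    return warnings


anti_pattern = ["FT.AGGREGATE", "jsonidx:profiles", "@t:[1299 1299]",
                "LOAD", "*", "LIMIT", "0", "10"]
improved = ["FT.AGGREGATE", "jsonidx:profiles", "@t:[1299 1299]",
            "LOAD", "6", "id", "t", "name", "lastname", "loc", "ver",
            "LIMIT", "0", "10", "DIALECT", "3"]

print(lint_query(anti_pattern))
print(lint_query(improved))
```

A check like this could run in code review or tests to catch regressions before a query reaches production.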