数据非规范化
了解非规范化策略
源数据库中的数据通常是规范化的。 这意味着列不能具有复合值(例如数组)和实体之间的关系 表示为不同 table 之间的主键到外键的映射。 规范化数据模型减少了冗余并提高了写入查询的数据完整性,但这是随之而来的 以牺牲速度为代价。 另一方面,Redis 缓存专注于加快读取查询的速度,因此 RDI 提供了数据非规范化来帮助实现这一点。
嵌套策略
嵌套是 RDI 用于对源数据库中的多对一关系进行非规范化的策略。
它通过表示
父对象(“一个”)作为 JSON 文档,子对象(“多个”)嵌套在 JSON 映射中
属性。下图显示了映射中子对象的嵌套
叫InvoiceLineItems
:

您可以使用nest
块,如以下示例所示:
source:
server_name: chinook # Optional. Use the value of `debezium.source.topic.prefix` property in Debezium's `application.properties`
schema: public
table: InvoiceLine
output:
- uses: redis.write
with:
nest: # cannot co-exist with other parameters such as 'key'
parent:
# server_name: chinook
# schema: public
table: Invoice
nesting_key: InvoiceLineId # cannot be composite
parent_key: InvoiceId # cannot be composite
path: $.InvoiceLineItems # path must start from document root ($)
structure: map # only map supported for now
on_update: merge # only merge supported for now
data_type: json # only json supported for now
The job has a
with
section under output
that includes the nest
block.
The job must include the following attributes in the nest
block:
parent
: This specifies the RDI data stream for the parent entities. Typically, you only
need to supply the parent table
name, unless you are nesting children under a parent that comes from
a different source database. If you do this then you must also specify server_name
and
schema
attributes.
nesting-key
: The field of the child entity that stores the unique ID (primary key) of the child entity.
parent-key
: The field in the parent entity that stores the unique ID (foreign key) of the parent entity.
child_key
: The field in the child entity that stores the unique ID (foreign key) of the parent entity.
You only need to add this attribute if the name of the child's foreign key field is different from the parent's.
path
: The JSONPath
for the map where you want to store the child entities. The path must start with the $
character, which denotes
the document root.
structure
: (Optional) The type of JSON nesting structure for the child entities. Currently, only a JSON map
is supported so if you supply this attribute then the value must be map
.
There are several important things to note when you use nesting:
-
When you specify nest
in the job, you must also set the data_type
attribute to json
and
the on_update
attribute to merge
in the surrounding output
block.
-
Key expressions are not supported for the nest
output blocks. The parent key is always calculated
using the following template:
<nest.parent.table>:<nest.parent_key>:<nest.parent_key.value | nest.child_key.value>
For example:
Invoice:InvoiceId:1
-
If you specify expire
in the nest
output block then this will set the expiration on the parent object.
-
You can only use one level of nesting.
-
If you are using PostgreSQL then you must make the following change for all child tables that you want to nest:
ALTER TABLE <TABLE_NAME> REPLICA IDENTITY FULL;
This configuration affects the information written to the write-ahead log (WAL) and whether it is available
for RDI to capture. By default, PostgreSQL only records
modified fields in the log, which means that it might omit the parent_key
. This can cause incorrect updates to the
Redis key in the destination database.
See the
Debezium PostgreSQL Connector Documentation
for more information about this.
On this page