数据非规范化

了解非规范化策略

源数据库中的数据通常是规范化的。 这意味着列不能具有复合值(例如数组)和实体之间的关系 表示为不同 table 之间的主键到外键的映射。 规范化数据模型减少了冗余并提高了写入查询的数据完整性,但这是随之而来的 以牺牲速度为代价。 另一方面,Redis 缓存专注于加快读取查询的速度,因此 RDI 提供了数据非规范化来帮助实现这一点。

嵌套策略

嵌套是 RDI 用于对源数据库中的多对一关系进行非规范化的策略。 它通过表示 父对象(“一个”)作为 JSON 文档,子对象(“多个”)嵌套在 JSON 映射中 属性。下图显示了映射中子对象的嵌套 叫InvoiceLineItems:

您可以使用nest块,如以下示例所示:

source:
  server_name: chinook # Optional. Use the value of `debezium.source.topic.prefix` property in Debezium's `application.properties`
  schema: public
  table: InvoiceLine
output:
  - uses: redis.write
    with:
      nest: # cannot co-exist with other parameters such as 'key'
        parent:
          # server_name: chinook
          # schema: public
          table: Invoice
        nesting_key: InvoiceLineId # cannot be composite
        parent_key: InvoiceId # cannot be composite
        path: $.InvoiceLineItems # path must start from document root ($)
        structure: map # only map supported for now
      on_update: merge # only merge supported for now
      data_type: json # only json supported for now

The job has a with section under output that includes the nest block. The job must include the following attributes in the nest block:

  • parent: This specifies the RDI data stream for the parent entities. Typically, you only need to supply the parent table name, unless you are nesting children under a parent that comes from a different source database. If you do this then you must also specify server_name and schema attributes.
  • nesting-key: The field of the child entity that stores the unique ID (primary key) of the child entity.
  • parent-key: The field in the parent entity that stores the unique ID (foreign key) of the parent entity.
  • child_key: The field in the child entity that stores the unique ID (foreign key) of the parent entity. You only need to add this attribute if the name of the child's foreign key field is different from the parent's.
  • path: The JSONPath for the map where you want to store the child entities. The path must start with the $ character, which denotes the document root.
  • structure: (Optional) The type of JSON nesting structure for the child entities. Currently, only a JSON map is supported so if you supply this attribute then the value must be map.

There are several important things to note when you use nesting:

  • When you specify nest in the job, you must also set the data_type attribute to json and the on_update attribute to merge in the surrounding output block.

  • Key expressions are not supported for the nest output blocks. The parent key is always calculated using the following template:

    <nest.parent.table>:<nest.parent_key>:<nest.parent_key.value | nest.child_key.value>
    

    For example:

    Invoice:InvoiceId:1
    
  • If you specify expire in the nest output block then this will set the expiration on the parent object.

  • You can only use one level of nesting.

  • If you are using PostgreSQL then you must make the following change for all child tables that you want to nest:

    ALTER TABLE <TABLE_NAME> REPLICA IDENTITY FULL;
    

    This configuration affects the information written to the write-ahead log (WAL) and whether it is available for RDI to capture. By default, PostgreSQL only records modified fields in the log, which means that it might omit the parent_key. This can cause incorrect updates to the Redis key in the destination database. See the Debezium PostgreSQL Connector Documentation for more information about this.

RATE THIS PAGE
Back to top ↑