为 RDI 准备 PostgreSQL
准备 PostgreSQL 数据库以使用 RDI
PostgreSQL 支持多个逻辑解码插件来启用 CDC。如果您不想使用本机pgoutput
逻辑复制流支持
然后,您必须将首选插件安装到 PostgreSQL 服务器中。完成此作后,
您必须启用复制槽,并配置具有执行复制权限的用户。
如果您使用 Heroku Postgres 之类的服务来托管
您的数据库,则这可能会限制您可以使用的插件。如果您无法使用首选
插件可以尝试pgoutput
decoder(如果您使用的是 PostgreSQL 10 或更高版本)。
如果这不适合您,那么您将无法将 RDI 用于您的数据库。
Amazon RDS for PostgreSQL
请按照以下步骤使用 Amazon RDS for PostgreSQL 启用 CDC:
-
设置 instance 参数
rds.logical_replication
设置为 1。 -
检查
wal_level
parameter 设置为logical
通过运行查询SHOW wal_level
作为数据库 RDS 主用户。该参数在多区域复制中可能没有此值 设置。您无法手动更改该值,但当您设置rds.logical_replication
parameter 设置为 1。如果它没有改变,那么你可能只需要 重新启动数据库实例。您可以手动重新启动,也可以等待重新启动 在维护时段内。 -
设置 Debezium
plugin.name
参数设置为pgoutput
. -
从具有
rds_replication
角色。该角色授予 管理逻辑槽和使用逻辑槽流式传输数据的权限。默认情况下,只有 AWS 上的主用户账户具有rds_replication
角色,但如果您具有管理员权限, 您可以使用如下所示的查询将角色授予其他账户:GRANT rds_replication TO <my_user>
To enable accounts other than the master account to create an initial snapshot, you must grant
SELECT
permission to the accounts on the tables to be captured. See the documentation about security for PostgreSQL logical replication for more information.
Install the logical decoding output plug-in
As of PostgreSQL 9.4, the only way to read changes to the write-ahead-log is to
install a logical decoding output plug-in.
These plug-ins are written in C using PostgreSQL-specific APIs, as described in the
PostgreSQL documentation.
The PostgreSQL connector uses one of Debezium’s supported logical decoding
plug-ins to receive change events from the database in either the default
pgoutput
format (supplied with PostgreSQL) or the
Protobuf
format.
See the
decoderbufs Protobuf plug-in documentation
for more details about how to compile it and also its requirements and limitations.
For simplicity, Debezium also provides a container image that compiles and installs the plug-ins
on top of the upstream PostgreSQL server image. Use this image as an example of the steps
involved in the installation.
Note:
The Debezium logical decoding plug-ins have been tested on Linux machines, but if you are
using Windows or other operating systems, the installation steps might be different from
those listed here.
Plug-in differences
Plug-ins don't all behave in exactly the same way. All of them refresh information about
the database schema when they detect that it has changed, but the pgoutput
plug-in is
more "eager" than some other plug-ins to do this. For example, pgoutput
will refresh
when it detects a change to the default value of a column but other plug-ins won't
notice this until another, more significant change happens (such as adding a new table
column).
The Debezium project maintains a
Java class that tracks the known differences between plug-ins.
Configure the PostgreSQL server
If you want to use a logical decoding plug-in other than the default pgoutput
then
you must first configure it in the postgresql.conf
file. Set the shared_preload_libraries
parameter to load your plug-in at startup. For example, to load the decoderbufs
plug-in, you would add the following line:
# MODULES
shared_preload_libraries = 'decoderbufs'
Add the line below to configure the replication slot (for any plug-in).
This instructs the server to use logical decoding with the write-ahead log.
# REPLICATION
wal_level = logical
You can also set other PostgreSQL streaming replication parameters if you need them.
For example, you can use max_wal_senders
and max_replication_slots
to increase
the number of connectors that can access the sending server concurrently,
and wal_keep_size
to limit the maximum WAL size that a replication slot retains.
The
configuration parameters
documentation describes all the parameters you can use.
PostgreSQL’s logical decoding uses replication slots. These are guaranteed to retain all the WAL
segments that Debezium needs even when Debezium suffers an outage. You should monitor replication
slots carefully to avoid excessive disk consumption and other conditions such as catalog bloat that can arise
if a replication slot is used infrequently. See the PostgreSQL documentation about
replication slots
for more information.
If you are using a synchronous_commit
setting other than on
, then you should set wal_writer_delay
to a value of about 10 milliseconds to ensure a low latency for change events. If you don't set this then
the default value of about 200 milliseconds will apply.
Note:
This guide summarizes the operation of the PostgreSQL write-ahead log, but we strongly
recommend you consult the PostgreSQL write-ahead log
documentation to get a better understanding.
Set up permissions
The Debezium connector needs a database user that has the REPLICATION and LOGIN roles so that it
can perform replications. By default, a superuser has these roles but for security reasons, you
should give the minimum necessary permissions to the Debezium user rather than full superuser
permissions.
If you have administrator privileges then you can create a role for your Debezium user
using a query like the following. Note that these are the minimum permissions the user
needs to perform replications, but you might also need to grant other permissions.
CREATE ROLE <name> REPLICATION LOGIN;
Set privileges for Debezium to create PostgreSQL publications with pgoutput
The Debezium user needs specific permissions to work with the pgoutput
plug-in.
The plug-in captures change events from the
publications
that PostgreSQL produces for your chosen source tables. A publication contains change events from
one or more tables that are filtered using criteria from a publication specification.
If you have administrator privileges, you can create the publication specification
manually or you can grant the Debezium user the privileges to create the specification
automatically. The required privileges are:
- Replication privileges in the database to add the table to a publication.
CREATE
privileges on the database to add publications.
SELECT
privileges on the tables to copy the initial table data. Table owners
automatically have SELECT
permission for the table.
To add a table to a publication, the user must be an owner of the table. However, in
this case, the source table already exists, so you must use a PostgreSQL replication
group to share ownership between the Debezium user and the original owner. Configure
the replication group using the following commands:
-
Create the replication group (the name replication_group
here is
just an example):
CREATE ROLE replication_group;
-
Add the original owner of the table to the group:
GRANT replication_group TO original_owner;
-
Add the Debezium replication user to the group:
GRANT replication_group TO replication_user;
-
Transfer ownership of the table to replication_group
:
ALTER TABLE table_name OWNER TO replication_group;
You must also set the value of the publication.autocreate.mode
parameter to filtered
to allow Debezium to specify the publication configuration. See the
Debezium documentation for publication.autocreate.mode
to learn more about this setting.
Configure PostgreSQL for replication with the Debezium connector host
You must configure the database to allow replication with the host that runs
the PostgreSQL Debezium connector. To do this, add an entry to the
host-based authentication file, pg_hba.conf
, for each client that needs to
use replication. For example, to enable replication for <youruser>
locally,
on the server machine, you would add a line like the following:
local replication <youruser> trust
To allow <youruser>
on localhost to receive replication changes using IPV4,
add the line:
host replication <youruser> 127.0.0.1/32 trust
To allow <youruser>
on localhost to receive replication changes using IPV6,
add the line:
host replication <youruser> ::1/128 trust
Find out more from the PostgreSQL pages about
pg_hba.conf
and
network address types.
Supported PostgreSQL topologies
You can use the Debezium PostgreSQL connector with a standalone PostgreSQL server or
with a cluster of servers.
For versions 12 and below, PostgreSQL supports logical replication slots on only primary servers.
This means that Debezium can only connect to a primary server for CDC and the connection will
stop if this server fails. If the same server is promoted to primary when service resumes
then you can simply restart the Debezium connector. However, if a different server is
promoted to primary, then you must reconfigure Debezium to use the new server
before restarting. Also, make sure the new server has the correct plug-in and configuration
for Debezium.
On this page