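Configure high availability for replica shards
Configure high availability for replica shards so that the cluster automatically migrates replica shards to an available node.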
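When you enable database replication, Redis Enterprise Software creates a replica of each primary shard. The replica shard is always located on a different node than the primary shard so that your data is highly available. If the primary shard fails, or if the node hosting the primary shard fails, the replica is promoted to primary.
Without replica high availability (replica_ha) enabled, the promoted primary shard becomes a single point of failure because it is the only copy of the data.
Enabling replica_ha configures the cluster to automatically replicate the promoted replica on an available node. This automatically returns the database to a state with two copies of the data: the former replica shard, which has been promoted to primary, and a new replica shard.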
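An available node:
- Meets replica migration requirements, such as rack-awareness.
- Has enough available RAM to store the replica shard.
- Does not also contain the primary shard.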
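The three criteria above can be read as a simple predicate over candidate nodes. The following is a minimal sketch only: the `Node` class, its fields, and the rule that a rack-aware replica must sit on a different rack than its primary are illustrative assumptions, not part of any Redis Enterprise API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical model of a cluster node, used only for illustration.
    node_id: int
    rack: str
    free_ram_bytes: int
    primary_shards: set = field(default_factory=set)


def is_available_node(node, shard_id, shard_size_bytes, primary_rack, rack_aware=True):
    """Return True if `node` could host a new replica of `shard_id`."""
    if shard_id in node.primary_shards:
        return False  # the node already holds the primary copy
    if node.free_ram_bytes < shard_size_bytes:
        return False  # not enough RAM for the replica shard
    if rack_aware and node.rack == primary_rack:
        return False  # assumed rack-awareness rule: use a different rack
    return True


GIB = 2**30
nodes = [
    Node(1, "rack-a", 8 * GIB),
    Node(3, "rack-a", 8 * GIB, primary_shards={"shard-1"}),
    Node(4, "rack-b", 1 * GIB),
    Node(5, "rack-b", 8 * GIB),
]
candidates = [
    n.node_id
    for n in nodes
    if is_available_node(n, "shard-1", 2 * GIB, primary_rack="rack-a")
]
print(candidates)  # [5]
```

In this example only node 5 qualifies: node 1 shares a rack with the primary, node 3 holds the primary shard, and node 4 has too little free RAM.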
In practice, replica migration creates a new replica shard and copies the data from the primary shard to the new replica shard.
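For example:
- Node:2 has a primary shard and node:3 has the corresponding replica shard.
- Either:
  - Node:2 fails and the replica shard on node:3 is promoted to primary.
  - Node:3 fails and the primary shard is no longer replicated to the replica shard on the failed node.
- If replica HA is enabled, a new replica shard is created on an available node.
- The data from the primary shard is replicated to the new replica shard.

- Replica HA follows all prerequisites of replica migration, such as rack-awareness.
- Replica HA migrates as many shards as possible based on available DRAM in the target node. When no DRAM is available, replica HA stops migrating replica shards to that node.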
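As a rough illustration of the DRAM rule above, the sketch below fills a single target node with replica shards until nothing else fits. The function name, the shard tuples, and the choice to skip a shard that does not fit (rather than stop at the first one) are assumptions made for this example; the actual placement logic is internal to the cluster.

```python
def place_replicas(shards, free_dram):
    """Place shards on one target node while DRAM remains.

    `shards` is a list of (shard_id, size) tuples in migration order.
    Returns the ids that were placed and the DRAM left over.
    """
    placed = []
    remaining = free_dram
    for shard_id, size in shards:
        if size > remaining:
            continue  # this shard does not fit on the target node
        placed.append(shard_id)
        remaining -= size
    return placed, remaining


placed, remaining = place_replicas([("s1", 3), ("s2", 4), ("s3", 2)], free_dram=6)
print(placed, remaining)  # ['s1', 's3'] 1
```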
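Configure high availability for replica shards
If replica high availability is enabled for both the cluster and a database, the database's replica shards automatically migrate to another node when a primary or replica shard fails. If replica HA is not enabled at the cluster level, replica HA will not migrate replica shards even if replica HA is enabled for a database.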
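The interaction between the two levels amounts to a logical AND. A tiny sketch, with hypothetical parameter names that mirror the `slave_ha` setting at each level:

```python
def replica_ha_active(cluster_slave_ha: bool, db_slave_ha: bool) -> bool:
    """Replica shards migrate only when both levels enable replica HA."""
    return cluster_slave_ha and db_slave_ha


# Print the full truth table for the two settings.
for cluster in (True, False):
    for db in (True, False):
        print(f"cluster={cluster!s:5} db={db!s:5} -> {replica_ha_active(cluster, db)}")
```

Only the first combination (both enabled) yields `True`.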
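Replica high availability is enabled for the cluster by default.
When you create a database using the Cluster Manager UI, replica high availability is enabled for the database by default if you enable replication.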

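To use replication without replica high availability, clear the Replica high availability checkbox.
You can also enable or turn off replica high availability for a database using rladmin or the REST API.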
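Configure cluster policy for replica HA
To enable or turn off replica high availability by default for the entire cluster, use one of the following methods:
- rladmin tune cluster:
  rladmin tune cluster slave_ha { enabled | disabled }
- Update cluster policy REST API request:
  PUT /v1/cluster/policy { "slave_ha": <boolean> }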
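The REST request above could be prepared from Python roughly as follows. This is a sketch under stated assumptions: the host name and credentials are placeholders, and port 9443 with HTTP basic authentication is assumed as the REST API endpoint setup. The example only builds the request and does not send it.

```python
import base64
import json
import urllib.request


def build_policy_request(host, user, password, enabled, port=9443):
    """Build (but do not send) a PUT /v1/cluster/policy request.

    The port and basic-auth scheme are assumptions; adjust for your cluster.
    """
    body = json.dumps({"slave_ha": enabled}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{host}:{port}/v1/cluster/policy",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder host and credentials, for illustration only.
req = build_policy_request("cluster.example.com", "admin@example.com", "secret", True)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To send it, pass `req` to urllib.request.urlopen() with a suitable TLS context.
```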
Turn off replica HA for a database
To turn off replica high availability for a specific database using rladmin, run:
rladmin tune db db:<ID> slave_ha disabled
You can use the database name in place of db:<ID> in the preceding command.
Configuration options
You can see the current configuration options for replica HA with:
rladmin info cluster
Grace period
By default, replica HA has a 10-minute grace period after node failure and before new replica shards are created.
Note:
The default grace period is 30 minutes for containerized applications using Redis Enterprise Software for Kubernetes.
To configure this grace period from rladmin, run:
rladmin tune cluster slave_ha_grace_period <time_in_seconds>
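Because the setting takes seconds while the defaults are quoted in minutes, a small helper can avoid unit mistakes. The helper below is hypothetical and only formats the command string; it does not run rladmin.

```python
def grace_period_command(minutes: float) -> str:
    """Format the rladmin command for a grace period given in minutes."""
    seconds = int(minutes * 60)
    return f"rladmin tune cluster slave_ha_grace_period {seconds}"


print(grace_period_command(10))  # rladmin tune cluster slave_ha_grace_period 600
print(grace_period_command(30))  # rladmin tune cluster slave_ha_grace_period 1800
```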
Shard priority
Replica shard migration is based on priority. When memory resources are limited, the most important replica shards are migrated first:
- slave_ha_priority - Replica shards with higher integer values are migrated before shards with lower values.
  To assign priority to a database, run:
  rladmin tune db db:<ID> slave_ha_priority <positive integer>
  You can use the database name in place of db:<ID> in the preceding command.
- Active-Active databases - Active-Active database synchronization uses replica shards to synchronize between the replicas.
- Database size - It is easier and more efficient to move replica shards of smaller databases.
- Database UID - The replica shards of databases with a higher UID are moved first.
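One way to picture the ordering is as a multi-key sort. The sketch below assumes the four criteria are applied strictly in the listed order as tie-breakers, and that Active-Active databases and smaller databases go first; the `Db` class and its fields are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Db:
    # Hypothetical database record, for illustration only.
    uid: int
    slave_ha_priority: int
    active_active: bool
    size_bytes: int


def migration_order(dbs):
    """Sort databases by the assumed replica migration priority."""
    return sorted(
        dbs,
        key=lambda d: (
            -d.slave_ha_priority,  # higher priority value first
            not d.active_active,   # Active-Active databases first
            d.size_bytes,          # smaller databases first
            -d.uid,                # higher UID first
        ),
    )


dbs = [
    Db(uid=1, slave_ha_priority=0, active_active=False, size_bytes=4_000),
    Db(uid=2, slave_ha_priority=5, active_active=False, size_bytes=9_000),
    Db(uid=3, slave_ha_priority=0, active_active=True, size_bytes=8_000),
    Db(uid=4, slave_ha_priority=0, active_active=False, size_bytes=4_000),
]
order = [d.uid for d in migration_order(dbs)]
print(order)  # [2, 3, 4, 1]
```

Database 2 wins on explicit priority, database 3 on being Active-Active, and databases 4 and 1 tie on size, so the higher UID goes first.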
Cooldown periods
Both the cluster and the database have cooldown periods.
After node failure, the cluster cooldown period (slave_ha_cooldown_period) prevents another replica migration due to another node failure for any database in the cluster until the cooldown period ends. The default is one hour.
After a database is migrated with replica HA, it cannot go through another migration due to another node failure until the cooldown period for the database (slave_ha_bdb_cooldown_period) ends. The default is two hours.
To configure cooldown periods, use rladmin tune cluster:
- For the cluster:
  rladmin tune cluster slave_ha_cooldown_period <time_in_seconds>
- For all databases in the cluster:
  rladmin tune cluster slave_ha_bdb_cooldown_period <time_in_seconds>
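The two cooldowns act as independent gates on a new migration. A minimal sketch of that check, assuming timestamps in seconds and the default values of one hour (3600) and two hours (7200); the function and parameter names are hypothetical:

```python
def migration_allowed(now, last_cluster_migration, last_db_migration,
                      cluster_cooldown=3600, db_cooldown=7200):
    """Return True if neither cooldown period is still running.

    Timestamps are in seconds; None means no earlier migration happened.
    """
    cluster_ok = (last_cluster_migration is None
                  or now - last_cluster_migration >= cluster_cooldown)
    db_ok = (last_db_migration is None
             or now - last_db_migration >= db_cooldown)
    return cluster_ok and db_ok


print(migration_allowed(10_000, 5_000, 5_000))  # False: database cooldown still active
print(migration_allowed(10_000, 8_000, None))   # False: cluster cooldown still active
print(migration_allowed(13_000, 5_000, 5_000))  # True: both cooldowns have ended
```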
Alerts
The following alerts are sent during replica HA activation:
- Shard migration begins after the grace period.
- Shard migration fails because there is no available node (sent hourly).
- Shard migration is delayed because of the cooldown period.