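Maintenance mode for cluster nodes
Prepare cluster nodes for maintenance.
Redis Enterprise Software |
---|
Use maintenance mode to prevent data loss during hardware patching or operating system maintenance on Redis Enterprise servers. When maintenance mode is on, all shards move off the node under maintenance and migrate to another available node.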
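Activate maintenance mode
When you activate maintenance mode, Redis Enterprise does the following:
-
Checks whether shutting down the node would cause quorum loss. If so, maintenance mode does not turn on.
Maintenance mode does not protect against quorum loss. If you activate maintenance mode for the majority of nodes in a cluster and restart them simultaneously, quorum is lost, which can lead to data loss.
-
If no maintenance mode snapshot exists, or if you use overwrite_snapshot when you activate maintenance mode, Redis Enterprise creates a new node snapshot that records the node's shard and endpoint configuration.
-
Marks the node as a quorum node to prevent shards and endpoints from migrating to it.
At this point, rladmin status displays the node's shards field in yellow, which indicates that shards cannot migrate to the node.
-
Migrates shards and binds endpoints to other nodes when space is available.
Maintenance mode does not demote a master node by default. The cluster elects a new master node when the original master node restarts.
Add the demote_node option to the rladmin command to demote a master node when you activate maintenance mode.
To activate maintenance mode for a node, run the following command: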
rladmin node <node_id> maintenance_mode on overwrite_snapshot
You can start server maintenance if:
-
All shards and endpoints have moved to other nodes
-
Enough nodes are still online to maintain quorum
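The two conditions above can be written as a small pre-flight check. This is a minimal sketch, not part of Redis Enterprise: the function name and its parameters are hypothetical, and it assumes that quorum means a strict majority of cluster nodes remains online, with the node under maintenance counted as offline.

```python
def safe_to_start_maintenance(
    shards_on_node: int,
    endpoints_on_node: int,
    total_nodes: int,
    nodes_offline: int,
) -> bool:
    """Return True when both conditions from the list above hold."""
    # Condition 1: nothing is left on the node that will go down.
    node_is_empty = shards_on_node == 0 and endpoints_on_node == 0

    # Condition 2 (assumption): quorum is a strict majority of nodes online.
    # nodes_offline should include the node you are about to maintain.
    nodes_online = total_nodes - nodes_offline
    keeps_quorum = nodes_online > total_nodes / 2

    return node_is_empty and keeps_quorum


# Three-node cluster, one node drained and about to be taken down.
print(safe_to_start_maintenance(0, 0, total_nodes=3, nodes_offline=1))
```

If you turned on maintenance mode without replica shard migration, replica shards stay on the node by design, so the first condition would need to be relaxed for that case.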
Prevent replica shard migration
If you do not have enough resources available to move all of the shards to other nodes, you can turn maintenance mode on without migrating the replica shards.
Before you prevent replica shard migration during maintenance mode, consider the following effects:
-
Replica shards remain on the node during maintenance.
-
If the maintenance node fails, the master shards do not have replica shards to maintain data redundancy and high availability.
-
Replica shards that remain on the node can still be promoted during failover to preserve availability.
To activate maintenance mode without replica shard migration, run:
rladmin node <node_id> maintenance_mode on evict_ha_replica disabled evict_active_active_replica disabled
Demote a master node
If maintenance might affect connectivity to the master node, you can demote the master node when you activate maintenance mode. This lets the cluster elect a new master node.
To demote a master node when activating maintenance mode, run:
rladmin node <node_id> maintenance_mode on demote_node
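If you script these activations, a small helper can assemble the rladmin arguments from the options described so far. This is a sketch under stated assumptions: the function and its parameter names are hypothetical, and passing several options in a single invocation (and in this order) is an assumption, because the examples above show each option on its own.

```python
from typing import List


def maintenance_on_args(
    node_id: int,
    *,
    migrate_replicas: bool = True,
    demote_node: bool = False,
    overwrite_snapshot: bool = False,
) -> List[str]:
    """Build the argument list for `rladmin node <node_id> maintenance_mode on`."""
    args = ["rladmin", "node", str(node_id), "maintenance_mode", "on"]
    if not migrate_replicas:
        # Keep replica shards on the node during maintenance.
        args += [
            "evict_ha_replica", "disabled",
            "evict_active_active_replica", "disabled",
        ]
    if demote_node:
        # Let the cluster elect a new master node.
        args.append("demote_node")
    if overwrite_snapshot:
        # Replace an existing maintenance mode snapshot.
        args.append("overwrite_snapshot")
    return args


# Prints the command line without running it.
print(" ".join(maintenance_on_args(2, demote_node=True)))
```

To run the command on a cluster node, you could pass the list to subprocess.run(args, check=True).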
Verify maintenance mode activation
To verify maintenance mode for a node, use rladmin status and review the node's shards field. If that value is displayed in yellow (shown earlier), then the node is in maintenance mode.
Avoid activating maintenance mode when it is already active. Maintenance mode activations stack. If you activate maintenance mode for a node that is already in maintenance mode, you will have to deactivate maintenance mode twice in order to restore full functionality.
Deactivate maintenance mode
When you deactivate maintenance mode, Redis Enterprise:
-
Loads a specified node snapshot or defaults to the latest maintenance mode snapshot.
-
Unmarks the node as a quorum node to allow shards and endpoints to migrate to the node.
-
Restores the shards and endpoints that were in the node at the time of the snapshot.
-
Deletes the snapshot.
To deactivate maintenance mode after server maintenance, run:
rladmin node <node_id> maintenance_mode off
By default, a snapshot is required to deactivate maintenance mode. If the snapshot cannot be restored, deactivation is cancelled and the node remains in maintenance mode. In such events, it may be necessary to reset node status.
Specify a snapshot
When you turn off maintenance mode, you can restore the node configuration from a maintenance mode snapshot or any snapshot previously created by rladmin node <node_id> snapshot create. If you do not specify a snapshot, Redis Enterprise uses the latest maintenance mode snapshot by default.
To get a list of available snapshots, run:
rladmin node <node_id> snapshot list
To specify a snapshot when you turn maintenance mode off, run:
rladmin node <node_id> maintenance_mode off snapshot_name <snapshot_name>
Note:
If an error occurs when you turn on maintenance mode, the snapshot is not deleted.
When you rerun the command, use the snapshot from the initial attempt since it contains the original state of the node.
Skip shard restoration
You can prevent the migrated shards and endpoints from returning to the original node after you turn off maintenance mode.
To turn maintenance mode off and skip shard restoration, run:
rladmin node <node_id> maintenance_mode off skip_shards_restore
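The deactivation variants above differ only in their trailing options, which a script can express as follows. This is a minimal sketch with hypothetical names. It treats snapshot_name and skip_shards_restore as alternatives because the source shows them only in separate commands; whether rladmin accepts both together is not confirmed here. The snapshot name in the usage line is a placeholder.

```python
from typing import List, Optional


def maintenance_off_args(
    node_id: int,
    snapshot_name: Optional[str] = None,
    skip_shards_restore: bool = False,
) -> List[str]:
    """Build the argument list for `rladmin node <node_id> maintenance_mode off`."""
    if snapshot_name and skip_shards_restore:
        # Sketch-level restriction (assumption), not a documented rladmin rule.
        raise ValueError("choose either snapshot_name or skip_shards_restore")

    args = ["rladmin", "node", str(node_id), "maintenance_mode", "off"]
    if snapshot_name:
        # Restore from a specific snapshot instead of the latest one.
        args += ["snapshot_name", snapshot_name]
    if skip_shards_restore:
        # Leave migrated shards and endpoints where they are.
        args.append("skip_shards_restore")
    return args


print(" ".join(maintenance_off_args(2, snapshot_name="my_snapshot")))
```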
Reset node status
In extreme cases, you may need to reset a node's status. Run the following commands to do so:
$ rladmin tune node <node_id> max_listeners 100
$ rladmin tune node <node_id> quorum_only disabled
Use these commands with caution. For best results, contact Support before running these commands.
Cluster status example
This example shows how the output of rladmin status changes when you turn on maintenance mode for a node.
The cluster status before turning on maintenance mode:
redislabs@rp1_node1:/opt$ rladmin status
CLUSTER NODES:
NODE:ID  ROLE    ADDRESS     EXTERNAL_ADDRESS  HOSTNAME   SHARDS
*node:1  master  172.17.0.2                    rp1_node1  2/100
node:2   slave   172.17.0.4                    rp3_node1  2/100
node:3   slave   172.17.0.3                    rp2_node1  0/100
The cluster status after turning on maintenance mode:
redislabs@rp1_node1:/opt$ rladmin node 2 maintenance_mode on
Performing maintenance_on action on node:2: 0%
created snapshot NodeSnapshot<name=maintenance_mode_2019-03-14_09-50-59,time=None,node_uid=2>
node:2 will not accept any more shards
Performing maintenance_on action on node:2: 100%
OK
redislabs@rp1_node1:/opt$ rladmin status
CLUSTER NODES:
NODE:ID  ROLE    ADDRESS     EXTERNAL_ADDRESS  HOSTNAME   SHARDS
*node:1  master  172.17.0.2                    rp1_node1  2/100
node:2   slave   172.17.0.4                    rp3_node1  0/0
node:3   slave   172.17.0.3                    rp2_node1  2/100
After turning on maintenance mode for node 2, Redis Enterprise saves a snapshot of its configuration and then moves its shards and endpoints to node 3.
Now node 2 has 0/0 shards because shards cannot migrate to it while it is in maintenance mode.
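To check the SHARDS column from a script rather than by eye, the node table can be parsed as text. This is a rough sketch that assumes the abridged layout shown in the example above (a node:<id> token first, and the first <used>/<limit> token on the line being the SHARDS value). Real rladmin status output may contain more columns, so treat the parsing rule and the function names as assumptions.

```python
import re
from typing import Dict, Tuple

SAMPLE_AFTER = """\
CLUSTER NODES:
NODE:ID ROLE ADDRESS EXTERNAL_ADDRESS HOSTNAME SHARDS
*node:1 master 172.17.0.2 rp1_node1 2/100
node:2 slave 172.17.0.4 rp3_node1 0/0
node:3 slave 172.17.0.3 rp2_node1 2/100
"""


def parse_node_shards(status_text: str) -> Dict[int, Tuple[int, int]]:
    """Map node ID to (shards used, shard limit) from the node table."""
    result: Dict[int, Tuple[int, int]] = {}
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # A leading '*' marks the node that rladmin is running on.
        node = re.fullmatch(r"\*?node:(\d+)", fields[0])
        if not node:
            continue
        for field in fields[1:]:
            shards = re.fullmatch(r"(\d+)/(\d+)", field)
            if shards:
                result[int(node.group(1))] = (
                    int(shards.group(1)),
                    int(shards.group(2)),
                )
                break
    return result


def is_drained(status_text: str, node_id: int) -> bool:
    """True when the node appears in the table and holds no shards."""
    used_and_limit = parse_node_shards(status_text).get(node_id)
    return used_and_limit is not None and used_and_limit[0] == 0


print(parse_node_shards(SAMPLE_AFTER))
print(is_drained(SAMPLE_AFTER, 2))
```

Note that plain-text parsing cannot see the yellow highlighting, so this only checks shard counts, not the maintenance mode indicator itself.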
Maintenance mode REST API
You can also turn maintenance mode on or off using REST API requests to POST /nodes/{node_uid}/actions/{action}.
Activate maintenance mode (REST API)
Use POST /nodes/{node_uid}/actions/maintenance_on to activate maintenance mode:
POST https://<hostname>:9443/v1/nodes/<node_id>/actions/maintenance_on
{
"overwrite_snapshot": true,
"evict_ha_replica": true,
"evict_active_active_replica": true
}
You can set evict_ha_replica and evict_active_active_replica to false to prevent replica shard migration.
The maintenance_on request returns a JSON response body:
{
"status":"queued",
"task_id":"<task-id-guid>"
}
Deactivate maintenance mode (REST API)
Use POST /nodes/{node_uid}/actions/maintenance_off to deactivate maintenance mode:
POST https://<hostname>:9443/v1/nodes/<node_id>/actions/maintenance_off
{ "skip_shards_restore": false }
The skip_shards_restore boolean flag allows the maintenance_off action to skip shard restoration when set to true.
The maintenance_off request returns a JSON response body:
{
"status":"queued",
"task_id":"<task-id-guid>"
}
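The activation and deactivation requests share the same shape, so one helper can build either. This is a minimal sketch using only the Python standard library. The helper name and the example hostname are hypothetical, and authentication and TLS certificate handling are not described in this section, so they are left out and must be added before the request is sent.

```python
import json
import urllib.request

ALLOWED_ACTIONS = ("maintenance_on", "maintenance_off")


def build_action_request(
    hostname: str, node_id: int, action: str, body: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a node maintenance action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    url = f"https://{hostname}:9443/v1/nodes/{node_id}/actions/{action}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Deactivate maintenance mode on node 2 and skip shard restoration.
request = build_action_request(
    "cluster.example.com", 2, "maintenance_off", {"skip_shards_restore": True}
)
print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen(request) should return the queued status and task_id shown above, once credentials and certificate settings for your cluster are in place.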
Track action status
You can send a request to GET /nodes/{node_uid}/actions/{action} to track the status of the maintenance_on and maintenance_off actions.
This request returns the status of the maintenance_on action:
GET https://<hostname>:9443/v1/nodes/<node_id>/actions/maintenance_on
The response body:
{
"status":"completed",
"task_id":"38c7405b-26a7-4379-b84c-cab4b3db706d"
}
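Because the actions are queued, a script usually has to poll until the status changes. The loop below is a sketch of that pattern: fetch_status is a hypothetical callable that performs the GET request and returns the decoded JSON body, and only the "queued" and "completed" values shown in this section are assumed. Any other status is treated as not yet complete, which may not match how the API reports failures.

```python
import time
from typing import Callable


def wait_for_completion(
    fetch_status: Callable[[], dict],
    attempts: int = 30,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the action reports status "completed" or attempts run out."""
    for attempt in range(attempts):
        response = fetch_status()
        if response.get("status") == "completed":
            return response
        # Wait between polls, but not after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"action not completed after {attempts} attempts")


# Simulated responses stand in for real GET requests.
fake_responses = iter([
    {"status": "queued", "task_id": "<task-id-guid>"},
    {"status": "completed", "task_id": "<task-id-guid>"},
])
print(wait_for_completion(lambda: next(fake_responses), sleep=lambda s: None))
```

Passing sleep as a parameter keeps the loop easy to exercise without real delays.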