2.3. Data Rebalancing

2.3.1. Automatic Data Rebalancing

Automatic rebalancing is the default mode. The rebalance process starts automatically after nodes are added (by default, unless the --no-rebalance option is specified) or before a node is removed. Rebalancing can also be started manually. The purpose of the rebalance process is to evenly distribute the partitions of each sharded table across the replication groups.
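
For example, you can add nodes without triggering the automatic rebalance and start it manually later. A minimal sketch, assuming the nodes add subcommand syntax of your Shardman version; the node names node4 and node5 are hypothetical:

# add new nodes (hypothetical names) without starting the automatic rebalance
$ shardmanctl nodes add -n node4,node5 --no-rebalance
# start the rebalance manually afterwards
$ shardmanctl rebalance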

For each sharded table, the rebalance process iteratively finds the replication groups with the maximum and the minimum number of partitions and creates a task to move one partition to the replication group with the minimum number of partitions. This is repeated as long as the condition max - min > 1 holds, that is, until the partition counts of the replication groups differ by at most one. Partitions are moved using logical replication. Partitions of colocated tables are moved together with the partitions of the sharded tables they reference.

Keep in mind that max_logical_replication_workers must be rather large, since the rebalance process uses up to min(max_replication_slots, max_logical_replication_workers, max_worker_processes, max_wal_senders)/3 parallel threads. In practice, you can use max_logical_replication_workers = Repfactor + 3 * task_num, where task_num is the number of parallel rebalance tasks; for example, with Repfactor = 2 and task_num = 3 this yields 2 + 3 * 3 = 11.

To manually rebalance sharded tables in the cluster0 cluster, run the following command (where etcd1, etcd2, and etcd3 are nodes of the etcd cluster):

$ shardmanctl --store-endpoints http://etcd1:2379,http://etcd2:2379,http://etcd3:2379 rebalance

If the process fails, run the shardmanctl cleanup command with the --after-rebalance option.
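
For example, a possible invocation using the same etcd endpoints as in the rebalance command above:

$ shardmanctl --store-endpoints http://etcd1:2379,http://etcd2:2379,http://etcd3:2379 cleanup --after-rebalance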

2.3.2. Manual Data Rebalancing

There are cases when partitions of sharded tables must be placed on specific cluster nodes. To handle this task, Shardman supports a manual data rebalancing mode.

How it works:

  1. Get the list of sharded tables using the shardmanctl tables sharded list command. The output will look something like this:

    $ shardmanctl tables sharded list
    
    Sharded tables:
    
    public.doc
    public.resolution
    public.users
    
                                
  2. Request information about the selected sharded tables. Example:

    $ shardmanctl tables sharded info -t public.users
    
    Table public.users
    
    Partitions:
                                        
    Partition    RgID     Shard                            Master
    0            1        clover-1-shrn1                   shrn1:5432
    1            2        clover-2-shrn2                   shrn2:5432
    2            3        clover-3-shrn3                   shrn3:5432
    3            1        clover-1-shrn1                   shrn1:5432
    4            2        clover-2-shrn2                   shrn2:5432
    5            3        clover-3-shrn3                   shrn3:5432
    6            1        clover-1-shrn1                   shrn1:5432
    7            2        clover-2-shrn2                   shrn2:5432
    8            3        clover-3-shrn3                   shrn3:5432
    9            1        clover-1-shrn1                   shrn1:5432
    10           2        clover-2-shrn2                   shrn2:5432
    11           3        clover-3-shrn3                   shrn3:5432
    12           1        clover-1-shrn1                   shrn1:5432
    13           2        clover-2-shrn2                   shrn2:5432
    14           3        clover-3-shrn3                   shrn3:5432
    15           1        clover-1-shrn1                   shrn1:5432
    16           2        clover-2-shrn2                   shrn2:5432
    17           3        clover-3-shrn3                   shrn3:5432
    18           1        clover-1-shrn1                   shrn1:5432
    19           2        clover-2-shrn2                   shrn2:5432
    20           3        clover-3-shrn3                   shrn3:5432
    21           1        clover-1-shrn1                   shrn1:5432
    22           2        clover-2-shrn2                   shrn2:5432
    23           3        clover-3-shrn3                   shrn3:5432
    
                                
  3. Move a partition to a new shard as shown below:

    $ shardmanctl --log-level debug tables sharded partmove -t public.users --partnum 1 --shard clover-1-shrn1
    
    2023-07-26T06:00:36.900Z        DEBUG   cmd/common.go:105       Waiting for metadata lock...
    2023-07-26T06:00:36.936Z        DEBUG   rebalance/service.go:256        take extension lock
    2023-07-26T06:00:36.938Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=3
    2023-07-26T06:00:36.938Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=2
    2023-07-26T06:00:36.938Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=1
    2023-07-26T06:00:36.951Z        DEBUG   broadcaster/worker.go:51        repgroup 3 connect established
    2023-07-26T06:00:36.951Z        DEBUG   broadcaster/worker.go:51        repgroup 2 connect established
    2023-07-26T06:00:36.952Z        DEBUG   broadcaster/worker.go:51        repgroup 1 connect established
    2023-07-26T06:00:36.952Z        DEBUG   extension/lock.go:35    Waiting for extension lock...
    2023-07-26T06:00:36.976Z        INFO    rebalance/service.go:276        Performing move partition...
    2023-07-26T06:00:36.977Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=3
    2023-07-26T06:00:36.978Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=2
    2023-07-26T06:00:36.978Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=1
    2023-07-26T06:00:36.987Z        DEBUG   broadcaster/worker.go:51        repgroup 1 connect established
    2023-07-26T06:00:36.989Z        DEBUG   broadcaster/worker.go:51        repgroup 2 connect established
    2023-07-26T06:00:36.992Z        DEBUG   broadcaster/worker.go:51        repgroup 3 connect established
    2023-07-26T06:00:36.992Z        DEBUG   rebalance/service.go:71 Performing cleanup after possible rebalance operation failure
    2023-07-26T06:00:37.077Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=3
    2023-07-26T06:00:37.077Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=1
    2023-07-26T06:00:37.077Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=2
    2023-07-26T06:00:37.082Z        DEBUG   rebalance/service.go:422        Rebalance will run 1 tasks
    2023-07-26T06:00:37.095Z        DEBUG   rebalance/service.go:452        Guessing that rebalance() can use 3 workers
    2023-07-26T06:00:37.096Z        DEBUG   rebalance/job.go:352    state: Idle     {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:37.111Z        DEBUG   rebalance/job.go:352    state: ConnsEstablished {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:37.171Z        DEBUG   rebalance/job.go:352    state: WaitInitCopy     {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:38.073Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move", "state": "WaitInitialCatchup"}
    2023-07-26T06:00:38.073Z        DEBUG   rebalance/job.go:352    state: WaitInitialCatchup       {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:38.084Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move", "state": "WaitFullSync"}
    2023-07-26T06:00:38.084Z        DEBUG   rebalance/job.go:352    state: WaitFullSync     {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:38.108Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move", "state": "Committing"}
    2023-07-26T06:00:38.108Z        DEBUG   rebalance/job.go:352    state: Committing       {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:38.254Z        DEBUG   rebalance/job.go:352    state: Complete {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:38.258Z        DEBUG   rebalance/service.go:583        Produce and process tasks on destination replication groups...
    2023-07-26T06:00:38.258Z        DEBUG   rebalance/service.go:594        Produce and process tasks on source replication groups...
    2023-07-26T06:00:38.258Z        DEBUG   rebalance/service.go:606        wait all tasks finish
    2023-07-26T06:00:38.258Z        DEBUG   rebalance/service.go:531        Analyzing table public.users in rg 1    {"table": "public.users", "rgid": 1, "action": "analyze"}
    2023-07-26T06:00:38.573Z        DEBUG   rebalance/service.go:531        Analyzing table public.users in rg 2    {"table": "public.users", "rgid": 2, "action": "analyze"}
    2023-07-26T06:00:38.833Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=1
    2023-07-26T06:00:38.833Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=2
    2023-07-26T06:00:38.833Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=3
    
                                

    In this example, partition number 1 of the public.users table is moved to the clover-1-shrn1 shard.

    After a partition of a sharded table has been moved manually, automatic data rebalancing is disabled for this table and all tables colocated with it.

To get the list of tables with automatic rebalancing disabled, run the shardmanctl tables sharded norebalance command. Example:

$ shardmanctl tables sharded norebalance

public.users

                

To enable automatic data rebalancing for a selected sharded table, run the shardmanctl tables sharded rebalance command, as shown in the example below:

$ shardmanctl tables sharded rebalance -t public.users

2023-07-26T07:07:00.657Z        DEBUG   cmd/common.go:105       Waiting for metadata lock...
2023-07-26T07:07:00.687Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=1
2023-07-26T07:07:00.687Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=2
2023-07-26T07:07:00.687Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=3
2023-07-26T07:07:00.697Z        DEBUG   broadcaster/worker.go:51        repgroup 1 connect established
2023-07-26T07:07:00.698Z        DEBUG   broadcaster/worker.go:51        repgroup 2 connect established
2023-07-26T07:07:00.698Z        DEBUG   broadcaster/worker.go:51        repgroup 3 connect established
2023-07-26T07:07:00.698Z        DEBUG   extension/lock.go:35    Waiting for extension lock...
2023-07-26T07:07:00.719Z        DEBUG   rebalance/service.go:381        Planned moving pnum 21 for table users from rg 1 to rg 2
2023-07-26T07:07:00.719Z        INFO    rebalance/service.go:244        Performing rebalance...
2023-07-26T07:07:00.720Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=1
2023-07-26T07:07:00.720Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=2
2023-07-26T07:07:00.720Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=3
2023-07-26T07:07:00.732Z        DEBUG   broadcaster/worker.go:51        repgroup 3 connect established
2023-07-26T07:07:00.732Z        DEBUG   broadcaster/worker.go:51        repgroup 1 connect established
2023-07-26T07:07:00.734Z        DEBUG   broadcaster/worker.go:51        repgroup 2 connect established
2023-07-26T07:07:00.734Z        DEBUG   rebalance/service.go:71 Performing cleanup after possible rebalance operation failure
2023-07-26T07:07:00.791Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=1
2023-07-26T07:07:00.791Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=2
2023-07-26T07:07:00.791Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=3
2023-07-26T07:07:00.795Z        DEBUG   rebalance/service.go:422        Rebalance will run 1 tasks
2023-07-26T07:07:00.809Z        DEBUG   rebalance/service.go:452        Guessing that rebalance() can use 3 workers
2023-07-26T07:07:00.809Z        DEBUG   rebalance/job.go:352    state: Idle     {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:00.823Z        DEBUG   rebalance/job.go:352    state: ConnsEstablished {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:00.880Z        DEBUG   rebalance/job.go:352    state: WaitInitCopy     {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:01.886Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move", "state": "WaitInitialCatchup"}
2023-07-26T07:07:01.886Z        DEBUG   rebalance/job.go:352    state: WaitInitialCatchup       {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:01.904Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move", "state": "WaitFullSync"}
2023-07-26T07:07:01.905Z        DEBUG   rebalance/job.go:352    state: WaitFullSync     {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:01.932Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move", "state": "Committing"}
2023-07-26T07:07:01.932Z        DEBUG   rebalance/job.go:352    state: Committing       {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:02.057Z        DEBUG   rebalance/job.go:352    state: Complete {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:02.060Z        DEBUG   rebalance/service.go:583        Produce and process tasks on destination replication groups...
2023-07-26T07:07:02.060Z        DEBUG   rebalance/service.go:594        Produce and process tasks on source replication groups...
2023-07-26T07:07:02.060Z        DEBUG   rebalance/service.go:531        Analyzing table public.users in rg 2    {"table": "public.users", "rgid": 2, "action": "analyze"}
2023-07-26T07:07:02.060Z        DEBUG   rebalance/service.go:606        wait all tasks finish
2023-07-26T07:07:02.321Z        DEBUG   rebalance/service.go:531        Analyzing table public.users in rg 1    {"table": "public.users", "rgid": 1, "action": "analyze"}
2023-07-26T07:07:02.587Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=3
2023-07-26T07:07:02.587Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=2
2023-07-26T07:07:02.587Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=1

                

To enable automatic data rebalancing for all sharded tables, run the shardmanctl rebalance command with the --force option.

$ shardmanctl rebalance --force
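
As in the earlier examples, the etcd store endpoints can be passed explicitly; a sketch assuming the same etcd nodes:

$ shardmanctl --store-endpoints http://etcd1:2379,http://etcd2:2379,http://etcd3:2379 rebalance --force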