Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

Поиск
Список
Период
Сортировка
От Goel, Dhruv
Тема Avoiding deadlock errors in CREATE INDEX CONCURRENTLY
Дата
Msg-id 6BF27921-C662-4C4A-A9DC-4E124C180CAF@amazon.com
обсуждение исходный текст
Ответы Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY  (Kuntal Ghosh <kuntalghosh.2007@gmail.com>)
Список pgsql-hackers

Hi,

Currently any DDL operations (Create Indexes, Drop Indexes etc.) when run during an existing concurrent index build on the same table causes the index build to fail with “deadlock detected”. This is a pain-point specially when we want to kick-off multiple concurrent index builds on the same table; the index build will reach phase 3 (consuming resources) and then fail with deadlock errors.

 

I have a patch that might improve the build times and reduce deadlock occurrences. Is this something the community would be interested in? I might be missing some documentation changes in the patch but wanted to get some feedback on the functional aspect of the patch first.

 

Problem:

In the Concurrent Index creation implementation there are three waits that are relevant:

  1. Wait 1 at start of Phase 2: Postgres waits for all transactions that started before this transaction and conflict with “Share Lock” on this relation. This is to make sure from this point forward all HOT updates to the table will be compatible with the new index. 
  2. Wait 2 at the start of Phase 3: Postgres waits for all transactions that started before this transaction and conflict with “Share Lock” on this relation. 
  3. Wait 3 at the end of Phase 3: PG waits for all transactions that started before this transaction primarily because they should not start using the index as they might be using an older snapshot and the index does not have all the entries (missing deleted tuples) for snapshot.

 

Typically, all the three wait states can cause deadlocks. Deadlocks due to the third wait state is reproduced by transactions that are waiting for a lock to be freed from “CREATE INDEX CONCURRENTLY” will cause deadlocks (primarily DDLs). The former 2 waits are much harder to reproduce with the test case being a Insert/Update/Delete as first statement of the transaction and then another DDL which causes lock escalation.

 

Proposed Solution:

We remove the third wait state completely from the concurrent index build. When we mark the index as ready, we also mark “indcheckxmin” to true which essentially enforces Postgres to not use this index for older snapshots.

 

Tests:

Added an isolation test which breaks without the patch. Manual test with a Repeatable Read Transaction that has an older snapshot with a tuple that has been deleted since and not part of the index.

 

 

May the force be with you,

Dhruv

 

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: vacuumdb and new VACUUM options
Следующее
От: Andrey Borodin
Дата:
Сообщение: Re: pglz performance