Обсуждение: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

Поиск
Список
Период
Сортировка

Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
"Goel, Dhruv"
Дата:

Hi,

Currently any DDL operations (Create Indexes, Drop Indexes etc.) when run during an existing concurrent index build on the same table causes the index build to fail with “deadlock detected”. This is a pain-point specially when we want to kick-off multiple concurrent index builds on the same table; the index build will reach phase 3 (consuming resources) and then fail with deadlock errors.

 

I have a patch that might improve the build times and reduce deadlock occurrences. Is this something the community would be interested in? I might be missing some documentation changes in the patch but wanted to get some feedback on the functional aspect of the patch first.

 

Problem:

In the Concurrent Index creation implementation there are three waits that are relevant:

  1. Wait 1 at start of Phase 2: Postgres waits for all transactions that started before this transaction and conflict with “Share Lock” on this relation. This is to make sure from this point forward all HOT updates to the table will be compatible with the new index. 
  2. Wait 2 at the start of Phase 3: Postgres waits for all transactions that started before this transaction and conflict with “Share Lock” on this relation. 
  3. Wait 3 at the end of Phase 3: PG waits for all transactions that started before this transaction primarily because they should not start using the index as they might be using an older snapshot and the index does not have all the entries (missing deleted tuples) for snapshot.

 

Typically, all the three wait states can cause deadlocks. Deadlocks due to the third wait state is reproduced by transactions that are waiting for a lock to be freed from “CREATE INDEX CONCURRENTLY” will cause deadlocks (primarily DDLs). The former 2 waits are much harder to reproduce with the test case being a Insert/Update/Delete as first statement of the transaction and then another DDL which causes lock escalation.

 

Proposed Solution:

We remove the third wait state completely from the concurrent index build. When we mark the index as ready, we also mark “indcheckxmin” to true which essentially enforces Postgres to not use this index for older snapshots.

 

Tests:

Added an isolation test which breaks without the patch. Manual test with a Repeatable Read Transaction that has an older snapshot with a tuple that has been deleted since and not part of the index.

 

 

May the force be with you,

Dhruv

 

Вложения

Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
Kuntal Ghosh
Дата:
Hello,

On Wed, May 15, 2019 at 1:45 PM Goel, Dhruv <goeldhru@amazon.com> wrote:

 

Proposed Solution:

We remove the third wait state completely from the concurrent index build. When we mark the index as ready, we also mark “indcheckxmin” to true which essentially enforces Postgres to not use this index for older snapshots.

 

I think there is a problem in the proposed solution. When phase 3 is reached, the index is valid. But it might not contain tuples deleted just before the reference snapshot was taken. Hence, we wait for those transactions that might have older snapshot. The TransactionXmin of these transactions can be greater than the xmin of the pg_index entry for this index.
Instead of waiting in the third phase, if we just set indcheckxmin as true, the above transactions will be able to use the index which is wrong. (because they won't find the recently deleted tuples from the index that are still live according to their snapshots)

The respective code from get_relation_info:
if (index->indcheckxmin &&                                                                                                                                                                 
                 !TransactionIdPrecedes(HeapTupleHeaderGetXmin(indexRelation->rd_indextuple->t_data), TransactionXmin))                                         
 { /* don't use this index */ }

Please let me know if I'm missing something.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
"Goel, Dhruv"
Дата:

Yes, you are correct. The test case here was that if a tuple is inserted after the reference snapshot is taken in Phase 2 and before the index is marked ready. If this tuple is deleted before the reference snapshot of Phase 3, it will never make it to the index. I have fixed this problem by making pg_index tuple updates transactional (I believe there is no reason why it has to be in place now) so that the xmin of the pg_index tuple is same the xmin of the snapshot in Phase 3.

Attached the amended patch.

 

From: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Date: Wednesday, May 15, 2019 at 3:45 AM
To: "Goel, Dhruv" <goeldhru@amazon.com>
Cc: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Subject: Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

 

Hello,

 

On Wed, May 15, 2019 at 1:45 PM Goel, Dhruv <goeldhru@amazon.com> wrote:

 

Proposed Solution:

We remove the third wait state completely from the concurrent index build. When we mark the index as ready, we also mark “indcheckxmin” to true which essentially enforces Postgres to not use this index for older snapshots.

 

I think there is a problem in the proposed solution. When phase 3 is reached, the index is valid. But it might not contain tuples deleted just before the reference snapshot was taken. Hence, we wait for those transactions that might have older snapshot. The TransactionXmin of these transactions can be greater than the xmin of the pg_index entry for this index.

Instead of waiting in the third phase, if we just set indcheckxmin as true, the above transactions will be able to use the index which is wrong. (because they won't find the recently deleted tuples from the index that are still live according to their snapshots)

 

The respective code from get_relation_info:

if (index->indcheckxmin &&                                                                                                                                                                 

                 !TransactionIdPrecedes(HeapTupleHeaderGetXmin(indexRelation->rd_indextuple->t_data), TransactionXmin))                                         

 { /* don't use this index */ }

 

Please let me know if I'm missing something.

 

--

Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

Вложения

Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
Tom Lane
Дата:
"Goel, Dhruv" <goeldhru@amazon.com> writes:
> Yes, you are correct. The test case here was that if a tuple is inserted after the reference snapshot is taken in
Phase2 and before the index is marked ready. If this tuple is deleted before the reference snapshot of Phase 3, it will
nevermake it to the index. I have fixed this problem by making pg_index tuple updates transactional (I believe there is
noreason why it has to be in place now) so that the xmin of the pg_index tuple is same the xmin of the snapshot in
Phase3. 

I think you are mistaken that doing transactional updates in pg_index
is OK.  If memory serves, we rely on xmin of the pg_index row for purposes
such as detecting whether a concurrently-created index is safe to use yet.
So a transactional update would restart that clock and result in temporary
denial of service.

            regards, tom lane



Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
Andres Freund
Дата:
Hi,

On June 9, 2019 8:36:37 AM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>"Goel, Dhruv" <goeldhru@amazon.com> writes:
>I think you are mistaken that doing transactional updates in pg_index
>is OK.  If memory serves, we rely on xmin of the pg_index row for
>purposes
>such as detecting whether a concurrently-created index is safe to use
>yet.

We could replace that with storing a 64 xid in a normal column nowadays.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On June 9, 2019 8:36:37 AM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think you are mistaken that doing transactional updates in pg_index
>> is OK.  If memory serves, we rely on xmin of the pg_index row for
>> purposes such as detecting whether a concurrently-created index is safe
>> to use yet.

> We could replace that with storing a 64 xid in a normal column nowadays.

Perhaps, but that's a nontrivial change that'd be prerequisite to
doing what's suggested in this thread.

            regards, tom lane



Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
"Goel, Dhruv"
Дата:
> On Jun 9, 2019, at 5:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andres Freund <andres@anarazel.de> writes:
>> On June 9, 2019 8:36:37 AM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I think you are mistaken that doing transactional updates in pg_index
>>> is OK.  If memory serves, we rely on xmin of the pg_index row for
>>> purposes such as detecting whether a concurrently-created index is safe
>>> to use yet.

I took a deeper look regarding this use case but was unable to find more evidence. As part of this patch, we
essentiallymake concurrently-created index safe to use only if transaction started after the xmin of Phase 3. Even
todayconcurrent indexes can not be used for transactions before this xmin because of the wait (which I am trying to get
ridof in this patch), is there any other denial of service you are talking about? Both the other states indislive,
indisreadycan be transactional updates as far as I understand. Is there anything more I am missing here? 


Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
"Goel, Dhruv"
Дата:
> On Jun 10, 2019, at 1:20 PM, Goel, Dhruv <goeldhru@amazon.com> wrote:
>
>
>> On Jun 9, 2019, at 5:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Andres Freund <andres@anarazel.de> writes:
>>> On June 9, 2019 8:36:37 AM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> I think you are mistaken that doing transactional updates in pg_index
>>>> is OK.  If memory serves, we rely on xmin of the pg_index row for
>>>> purposes such as detecting whether a concurrently-created index is safe
>>>> to use yet.
>
> I took a deeper look regarding this use case but was unable to find more evidence. As part of this patch, we
essentiallymake concurrently-created index safe to use only if transaction started after the xmin of Phase 3. Even
todayconcurrent indexes can not be used for transactions before this xmin because of the wait (which I am trying to get
ridof in this patch), is there any other denial of service you are talking about? Both the other states indislive,
indisreadycan be transactional updates as far as I understand. Is there anything more I am missing here? 


Hi,

I did some more concurrency testing here through some python scripts which compare the end state of the concurrently
createdindexes. I also back-ported this patch to PG 9.6 and ran some custom concurrency tests (Inserts, Deletes, and
CreateIndex Concurrently) which seem to succeed. The intermediate states unfortunately are not easy to test in an
automatedmanner, but to be fair concurrent indexes could never be used for older transactions. Do you have more
inputs/ideason this patch? 

Thanks,
Dhruv


Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
Thomas Munro
Дата:
On Sun, Jun 30, 2019 at 7:30 PM Goel, Dhruv <goeldhru@amazon.com> wrote:
> > On Jun 10, 2019, at 1:20 PM, Goel, Dhruv <goeldhru@amazon.com> wrote:
> >> On Jun 9, 2019, at 5:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Andres Freund <andres@anarazel.de> writes:
> >>> On June 9, 2019 8:36:37 AM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>>> I think you are mistaken that doing transactional updates in pg_index
> >>>> is OK.  If memory serves, we rely on xmin of the pg_index row for
> >>>> purposes such as detecting whether a concurrently-created index is safe
> >>>> to use yet.
> >
> > I took a deeper look regarding this use case but was unable to find more evidence. As part of this patch, we
essentiallymake concurrently-created index safe to use only if transaction started after the xmin of Phase 3. Even
todayconcurrent indexes can not be used for transactions before this xmin because of the wait (which I am trying to get
ridof in this patch), is there any other denial of service you are talking about? Both the other states indislive,
indisreadycan be transactional updates as far as I understand. Is there anything more I am missing here? 
>
> I did some more concurrency testing here through some python scripts which compare the end state of the concurrently
createdindexes. I also back-ported this patch to PG 9.6 and ran some custom concurrency tests (Inserts, Deletes, and
CreateIndex Concurrently) which seem to succeed. The intermediate states unfortunately are not easy to test in an
automatedmanner, but to be fair concurrent indexes could never be used for older transactions. Do you have more
inputs/ideason this patch? 

I noticed that check-world passed several times with this patch
applied, but the most recent CI run failed in multiple-cic:

+error in steps s2i s1i: ERROR:  cache lookup failed for index 26303

https://travis-ci.org/postgresql-cfbot/postgresql/builds/555472214

--
Thomas Munro
https://enterprisedb.com



Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
Thomas Munro
Дата:
On Mon, Jul 8, 2019 at 9:51 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> I noticed that check-world passed several times with this patch
> applied, but the most recent CI run failed in multiple-cic:
>
> +error in steps s2i s1i: ERROR:  cache lookup failed for index 26303
>
> https://travis-ci.org/postgresql-cfbot/postgresql/builds/555472214

And in another run, this time on Windows, create_index failed:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46455

-- 
Thomas Munro
https://enterprisedb.com



Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
"Goel, Dhruv"
Дата:
Hi,

Thank you Thomas.

On Jul 7, 2019, at 3:15 PM, Thomas Munro <thomas.munro@gmail.com> wrote:

On Mon, Jul 8, 2019 at 9:51 AM Thomas Munro <thomas.munro@gmail.com> wrote:
I noticed that check-world passed several times with this patch
applied, but the most recent CI run failed in multiple-cic:

+error in steps s2i s1i: ERROR:  cache lookup failed for index 26303

https://travis-ci.org/postgresql-cfbot/postgresql/builds/555472214

And in another run, this time on Windows, create_index failed:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46455

--
Thomas Munro
https://enterprisedb.com
I have attached the revised patch. I ran check-world multiple times on my machine and it seems to succeed now. Do you mind kicking-off the CI build with the latest patch?
Вложения

Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
Thomas Munro
Дата:
On Tue, Jul 9, 2019 at 10:33 AM Goel, Dhruv <goeldhru@amazon.com> wrote:
> I have attached the revised patch. I ran check-world multiple times on my machine and it seems to succeed now. Do you
mindkicking-off the CI build with the latest patch?
 

Thanks.

It's triggered automatically when you post patches to the thread and
also once a day, though it took ~35 minutes to get around to noticing
your new version due to other activity in other threads, and general
lack of horsepower.  I'm planning to fix that with more horses.

It passed on both OSes.  See here:

http://cfbot.cputube.org/dhruv-goel.html

-- 
Thomas Munro
https://enterprisedb.com



RE: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
"imai.yoshikazu@fujitsu.com"
Дата:
Hi Dhruv,

On Sun, June 30, 2019 at 7:30 AM, Goel, Dhruv wrote:
> > On Jun 10, 2019, at 1:20 PM, Goel, Dhruv <goeldhru@amazon.com> wrote:
> >> On Jun 9, 2019, at 5:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Andres Freund <andres@anarazel.de> writes:
> >>> On June 9, 2019 8:36:37 AM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>>> I think you are mistaken that doing transactional updates in
> >>>> pg_index is OK.  If memory serves, we rely on xmin of the pg_index
> >>>> row for purposes such as detecting whether a concurrently-created
> >>>> index is safe to use yet.
> >
> > I took a deeper look regarding this use case but was unable to find more evidence. As part of this patch, we
essentially
> make concurrently-created index safe to use only if transaction started after the xmin of Phase 3. Even today
concurrent
> indexes can not be used for transactions before this xmin because of the wait (which I am trying to get rid of in
this
> patch), is there any other denial of service you are talking about? Both the other states indislive, indisready can
> be transactional updates as far as I understand. Is there anything more I am missing here?
>
>
> Hi,
>
> I did some more concurrency testing here through some python scripts which compare the end state of the concurrently
> created indexes. I also back-ported this patch to PG 9.6 and ran some custom concurrency tests (Inserts, Deletes, and
> Create Index Concurrently) which seem to succeed. The intermediate states unfortunately are not easy to test in an
> automated manner, but to be fair concurrent indexes could never be used for older transactions. Do you have more
> inputs/ideas on this patch?

According to the commit 3c8404649 [1], transactional update in pg_index is not safe in non-MVCC catalog scans before
PG9.4.
But it seems to me that we can use transactional update in pg_index after the commit 813fb03155 [2] which got rid of
SnapshotNow. 

If we apply this patch back to 9.3 or earlier, we might need to consider another way or take the Andres suggestion
(whichI don't understand the way fully though), but which version do you want/do we need to apply this patch? 

Also, if we apply this patch in this way, there are several comments to be fixed which state the method of CREATE INDEX
CONCURRENTLY.

ex.
[index.c]
/*
* validate_index - support code for concurrent index builds
...
* After completing validate_index(), we wait until all transactions that
* were alive at the time of the reference snapshot are gone; this is
* necessary to be sure there are none left with a transaction snapshot
* older than the reference (and hence possibly able to see tuples we did
* not index).  Then we mark the index "indisvalid" and commit.  Subsequent
* transactions will be able to use it for queries.
...
valiate_index()
{
}


[1]
https://github.com/postgres/postgres/commit/3c84046490bed3c22e0873dc6ba492e02b8b9051#diff-b279fc6d56760ed80ce4178de1401d2c
[2]
https://github.com/postgres/postgres/commit/813fb0315587d32e3b77af1051a0ef517d187763#diff-b279fc6d56760ed80ce4178de1401d2c

--
Yoshikazu Imai



Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
Michael Paquier
Дата:
On Mon, Oct 28, 2019 at 05:17:52AM +0000, imai.yoshikazu@fujitsu.com wrote:
> According to the commit 3c8404649 [1], transactional update in
> pg_index is not safe in non-MVCC catalog scans before PG9.4.
> But it seems to me that we can use transactional update in pg_index
> after the commit 813fb03155 [2] which got rid of SnapshotNow.

That's actually this part of the patch:
-   /* Assert that current xact hasn't done any transactional updates */
-   Assert(GetTopTransactionIdIfAny() == InvalidTransactionId);
And this thread (for commit 3c84046):
https://www.postgresql.org/message-id/19082.1349481400@sss.pgh.pa.us

And while looking at this patch, I have doubts that what you are doing
is actually safe either.

> If we apply this patch back to 9.3 or earlier, we might need to
> consider another way or take the Andres suggestion (which I don't
> understand the way fully though), but which version do you want/do
> we need to apply this patch?

Per the arguments of upthread, storing a 64-bit XID would require a
catalog change and you cannot backpatch that.  I would suggest to keep
this patch focused on HEAD, and keep it as an improvement of the
existing features.  Concurrent deadlock risks caused by CCI exist
since the feature came to life.

> Also, if we apply this patch in this way, there are several comments
> to be fixed which state the method of CREATE INDEX CONCURRENTLY.

Are we sure as well that all the cache lookup failures are addressed?
The CF robot does not complain per its latest status, but are we sure
to be out of the ground here?

The indentation of your patch is wrong in some places by the way.
--
Michael

Вложения

Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

От
Michael Paquier
Дата:
On Fri, Nov 08, 2019 at 10:30:39AM +0900, Michael Paquier wrote:
> Per the arguments of upthread, storing a 64-bit XID would require a
> catalog change and you cannot backpatch that.  I would suggest to keep
> this patch focused on HEAD, and keep it as an improvement of the
> existing features.  Concurrent deadlock risks caused by CCI exist
> since the feature came to life.

Marked as returned with feedback per lack of activity and the patch
was waiting on author for a bit more than two weeks.
--
Michael

Вложения