Обсуждение: WaitForOlderSnapshots in DETACH PARTITION causes deadlocks

Поиск
Список
Период
Сортировка

WaitForOlderSnapshots in DETACH PARTITION causes deadlocks

От
"Imseih (AWS), Sami"
Дата:

Hi,

 

While recently looking into partition maintenance, I found a case in which

DETACH PARTITION FINALIZE could case deadlocks. This occurs when a

ALTER TABLE DETACH CONCURRENTLY, followed by a cancel, the followed by

an ALTER TABLE DETACH FINALIZE, and this sequence of steps occur in the

middle of other REPEATABLE READ/SERIALIZABLE transactions. These RR/SERIALIZABLE

should not accessed the partition until the FINALIZE step starts. See the attached

repro.

 

This seems to occur as the FINALIZE is calling WaitForOlderSnapshot [1]

to make sure that all older snapshots are completed before the finalize

is completed and the detached partition is removed from the parent table

association.

 

WaitForOlderSnapshots is used here to ensure that snapshots older than

the start of the ALTER TABLE DETACH CONCURRENTLY are completely removed

to guarantee consistency, however it does seem to cause deadlocks for at

least RR/SERIALIZABLE transactions.

 

There are cases [2] in which certain operations are accepted as not being

MVCC-safe, and now I am wondering if this is another case? Deadlocks are

not a good scenario, and WaitForOlderSnapshot does not appear to do

anything for READ COMMITTED transactions.

 

So do we actually need WaitForOlderSnapshot in the FINALIZE code? and

Could be acceptable that this operation is marked as not MVCC-safe like

the other aforementioned operations?

 

Perhaps I am missing some important point here, so any feedback will be

appreciated.

 

Regards,

 

Sami Imseih

Amazon Web Services (AWS)

 

 

[1] https://github.com/postgres/postgres/blob/master/src/backend/commands/tablecmds.c#L18757-L18764

[2] https://www.postgresql.org/docs/current/mvcc-caveats.html

Вложения

Re: WaitForOlderSnapshots in DETACH PARTITION causes deadlocks

От
Michael Paquier
Дата:
On Mon, Jul 24, 2023 at 07:40:04PM +0000, Imseih (AWS), Sami wrote:
> WaitForOlderSnapshots is used here to ensure that snapshots older than
> the start of the ALTER TABLE DETACH CONCURRENTLY are completely removed
> to guarantee consistency, however it does seem to cause deadlocks for at
> least RR/SERIALIZABLE transactions.
>
> There are cases [2] in which certain operations are accepted as not being
> MVCC-safe, and now I am wondering if this is another case? Deadlocks are
> not a good scenario, and WaitForOlderSnapshot does not appear to do
> anything for READ COMMITTED transactions.
>
> So do we actually need WaitForOlderSnapshot in the FINALIZE code? and
> Could be acceptable that this operation is marked as not MVCC-safe like
> the other aforementioned operations?

I guess that there is an argument with lifting that a bit.  Based on
the test case your are providing, a deadlock occuring between the
FINALIZE and a scan of the top-level partitioned table in a
transaction that began before the DETACH CONCURRENTLY does not seem
like the best answer to have from the user perspective.  I got to
wonder whether there is room to make the wait for older snapshots in
the finalize phase more robust, though, and actually make it wait
until the first transaction commits rather than fail because of a
deadlock like that.

> Perhaps I am missing some important point here, so any feedback will be
> appreciated.

Adding Alvaro in CC as the author of 71f4c8c6 for input, FYI.
--
Michael

Вложения

Re: WaitForOlderSnapshots in DETACH PARTITION causes deadlocks

От
Alvaro Herrera
Дата:
On 2023-Jul-25, Michael Paquier wrote:

> On Mon, Jul 24, 2023 at 07:40:04PM +0000, Imseih (AWS), Sami wrote:
> > WaitForOlderSnapshots is used here to ensure that snapshots older than
> > the start of the ALTER TABLE DETACH CONCURRENTLY are completely removed
> > to guarantee consistency, however it does seem to cause deadlocks for at
> > least RR/SERIALIZABLE transactions.
> > 
> > There are cases [2] in which certain operations are accepted as not being
> > MVCC-safe, and now I am wondering if this is another case? Deadlocks are
> > not a good scenario, and WaitForOlderSnapshot does not appear to do
> > anything for READ COMMITTED transactions.
> > 
> > So do we actually need WaitForOlderSnapshot in the FINALIZE code? and
> > Could be acceptable that this operation is marked as not MVCC-safe like
> > the other aforementioned operations?
> 
> I guess that there is an argument with lifting that a bit.  Based on
> the test case your are providing, a deadlock occuring between the
> FINALIZE and a scan of the top-level partitioned table in a
> transaction that began before the DETACH CONCURRENTLY does not seem
> like the best answer to have from the user perspective.  I got to
> wonder whether there is room to make the wait for older snapshots in
> the finalize phase more robust, though, and actually make it wait
> until the first transaction commits rather than fail because of a
> deadlock like that.

It took a lot of work to get SERIALIZABLE/RR transactions to work with
DETACHED CONCURRENTLY, so I'm certainly not in a hurry to give up and
just "document that it doesn't work".  I don't think that would be an
acceptable feature regression.

I spent some time yesterday trying to turn the reproducer into an
isolationtester spec file.  I have not succeeded yet, but I'll continue
with that this afternoon.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Every machine is a smoke machine if you operate it wrong enough."
https://twitter.com/libseybieda/status/1541673325781196801