Race condition in dependency searches

From: Tom Lane
Subject: Race condition in dependency searches
Msg-id: 26527.1549572789@sss.pgh.pa.us
List: pgsql-hackers

While I've been staring at dependency.c, I've realized that this bit
in findDependentObjects is unsafe:

                /*
                 * First, release caller's lock on this object and get
                 * deletion lock on the owning object.  (We must release
                 * caller's lock to avoid deadlock against a concurrent
                 * deletion of the owning object.)
                 */
                ReleaseDeletionLock(object);
                AcquireDeletionLock(&otherObject, 0);

                /*
                 * The owning object might have been deleted while we waited
                 * to lock it; if so, neither it nor the current object are
                 * interesting anymore.  We test this by checking the
                 * pg_depend entry (see notes below).
                 */
                if (!systable_recheck_tuple(scan, tup))
                {
                    systable_endscan(scan);
                    ReleaseDeletionLock(&otherObject);
                    return;
                }

                /*
                 * Okay, recurse to the owning object instead of proceeding.

The unstated assumption here is that if the pg_depend entry we are looking
at has gone away, then both the current object and its owning object must
be gone too.  That was a safe assumption when this code was written,
fifteen years ago, because nothing except object deletion could cause
a pg_depend entry to disappear.  Since then, however, we have merrily
handed people a bunch of foot-guns with which they can munge pg_depend
like mad, for example ALTER EXTENSION ... DROP.

Hence, if the pg_depend entry is gone, that might only mean that somebody
randomly decided to remove the dependency.  Now, I think it's legit to
decide that we needn't remove the previously-owning object in that case.
But it's not okay to just pack up shop and return, because if the current
object is still there, proceeding with deletion of whatever we were
deleting would be bad.  That could leave us with scenarios like triggers
whose function is gone, views referring to a deceased table, indexes
dependent on a vanished datatype or opclass, yadda yadda.

It seems like what ought to happen here, if systable_recheck_tuple fails,
is to reacquire the deletion lock that we gave up on "object", then
check to see if "object" is still there, and if so continue with deleting
it.  Only if it in fact isn't there is it OK to conclude that 
findDependentObjects needn't do any more work at this recursion level.
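
To illustrate the control flow proposed above, here is a toy model in C. It is only a sketch under stated assumptions, not a patch: ToyObject, ToyAction, and the toy_* functions are hypothetical stand-ins invented for this example, the boolean flag models the result of systable_recheck_tuple, and the "exists" field models an existence check that, as noted below, does not exist off the shelf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an object we may hold a deletion lock on. */
typedef struct ToyObject
{
    bool exists;    /* is the object's catalog row still present? */
    bool locked;    /* do we currently hold its deletion lock? */
} ToyObject;

/* What the caller should do next at this recursion level. */
typedef enum ToyAction
{
    TOY_RECURSE_TO_OWNER,       /* pg_depend entry still valid */
    TOY_CONTINUE_WITH_OBJECT,   /* entry gone, but object is still there */
    TOY_NOTHING_TO_DO           /* entry gone and object gone too */
} ToyAction;

static void toy_acquire_lock(ToyObject *o) { o->locked = true; }
static void toy_release_lock(ToyObject *o) { o->locked = false; }

/*
 * depend_entry_still_valid models the result of rechecking the pg_depend
 * tuple after we have waited for the owner's lock.
 */
static ToyAction
toy_visit_owned_object(ToyObject *object, ToyObject *owner,
                       bool depend_entry_still_valid)
{
    assert(object != NULL && owner != NULL);

    /* Give up our lock first, to avoid deadlock, then lock the owner. */
    toy_release_lock(object);
    toy_acquire_lock(owner);

    if (depend_entry_still_valid)
        return TOY_RECURSE_TO_OWNER;    /* owner lock stays held */

    /* The dependency vanished: the owner is no longer our concern. */
    toy_release_lock(owner);

    /* Proposed change: relock "object" and see whether it survived. */
    toy_acquire_lock(object);
    if (object->exists)
        return TOY_CONTINUE_WITH_OBJECT;

    /* Only now is it safe to conclude there is no more work here. */
    toy_release_lock(object);
    return TOY_NOTHING_TO_DO;
}
```

The point of the third outcome is that an early return is justified only after the object itself has been rechecked under lock; the current code returns as soon as the pg_depend entry is found to be missing.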

I do not think we have any off-the-shelf way of asking whether an
ObjectAddress's referent still exists.  We could probably teach
objectaddress.c to provide one --- it seems to have enough infrastructure
already to know where to find the object's (main) catalog tuple.
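
As a rough picture of what such an existence check might look like, the following toy model maps an address to the catalog that should hold the object's main row and then looks for that row. Everything here is hypothetical: ToyAddress, ToyCatalog, and toy_object_exists are invented for illustration and do not correspond to anything in objectaddress.c, and a real version would have to do a proper catalog lookup while holding the deletion lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified analogue of an object address. */
typedef struct ToyAddress
{
    unsigned classId;   /* which catalog the object lives in */
    unsigned objectId;  /* identifier of its row in that catalog */
} ToyAddress;

/* Hypothetical in-memory "catalog": a class id plus the row ids it holds. */
typedef struct ToyCatalog
{
    unsigned        classId;
    const unsigned *rowIds;
    size_t          nrows;
} ToyCatalog;

/*
 * Return true if the row that addr refers to is still present.
 * An address naming an unknown catalog is treated as nonexistent.
 */
static bool
toy_object_exists(const ToyCatalog *catalogs, size_t ncatalogs,
                  const ToyAddress *addr)
{
    assert(catalogs != NULL && addr != NULL);

    for (size_t i = 0; i < ncatalogs; i++)
    {
        if (catalogs[i].classId != addr->classId)
            continue;

        /* Found the owning catalog; look for the object's main row. */
        for (size_t j = 0; j < catalogs[i].nrows; j++)
        {
            if (catalogs[i].rowIds[j] == addr->objectId)
                return true;
        }
        return false;
    }
    return false;
}
```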

This issue seems independent of the partition problems I'm working
on right now, so I plan to leave it for later.

            regards, tom lane

