Обсуждение: BUG #17122: panic on prepare with subsequent pg_advisory_lock() and pg_advisory_xact_lock_shared()

Поиск
Список
Период
Сортировка

BUG #17122: panic on prepare with subsequent pg_advisory_lock() and pg_advisory_xact_lock_shared()

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17122
Logged by:          Alexander Pyhalov
Email address:      a.pyhalov@postgrespro.ru
PostgreSQL version: 13.3
Operating system:   Ubuntu 20.04
Description:

The issue is reproducible at least on  on PG 13.3 and 14beta2

mkdir d;
initdb -D d;

Increase max_prepared_transactions in postgresql.conf  (for example, set it
to 10).

The following statements lead to panic:
begin;
select pg_advisory_lock(1);
select pg_advisory_xact_lock_shared(1);
prepare transaction 'test';

2021-07-23 17:15:59.868 MSK [61851] PANIC:  we seem to have dropped a bit
somewhere
2021-07-23 17:15:59.868 MSK [61851] STATEMENT:  prepare transaction
'test';

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fc72923c859 in __GI_abort () at abort.c:79
#2  0x000055f80d4facdd in errfinish (filename=0x55f80d6b53c3 "lock.c",
lineno=3466, funcname=0x55f80d6b6200 <__func__.13837> "PostPrepare_Locks")
at elog.c:592
#3  0x000055f80d340143 in PostPrepare_Locks (xid=485) at lock.c:3466
#4  0x000055f80cf50942 in PrepareTransaction () at xact.c:2519
#5  0x000055f80cf512b9 in CommitTransactionCommand () at xact.c:3034
#6  0x000055f80d35c487 in finish_xact_command () at postgres.c:2662
#7  0x000055f80d359edb in exec_simple_query (query_string=0x55f80f0890a0
"prepare transaction 'test';") at postgres.c:1264
#8  0x000055f80d35e587 in PostgresMain (argc=1, argv=0x55f80f0b38b0,
dbname=0x55f80f0856d8 "postgres", username=0x55f80f0b37d0 "leoric") at
postgres.c:4339
#9  0x000055f80d29ddf4 in BackendRun (port=0x55f80f0abc20) at
postmaster.c:4526
#10 0x000055f80d29d4f6 in BackendStartup (port=0x55f80f0abc20) at
postmaster.c:4210
#11 0x000055f80d2997ec in ServerLoop () at postmaster.c:1739
#12 0x000055f80d298f85 in PostmasterMain (argc=3, argv=0x55f80f083650) at
postmaster.c:1412
#13 0x000055f80d193dcd in main (argc=3, argv=0x55f80f083650) at main.c:210


PG Bug reporting form <noreply@postgresql.org> writes:
> The following statements lead to panic:
> begin;
> select pg_advisory_lock(1);
> select pg_advisory_xact_lock_shared(1);
> prepare transaction 'test';

Thanks for the report!  Looks like the shared lock is sufficient
to cause the problem:

regression=# begin;
BEGIN
regression=*# select pg_advisory_xact_lock_shared(1);
 pg_advisory_xact_lock_shared
------------------------------

(1 row)

regression=*# prepare transaction 'test';
PANIC:  we seem to have dropped a bit somewhere
server closed the connection unexpectedly

Taking only the other lock works fine.  Even more interesting,
"select pg_advisory_xact_lock(1);" also works, as does
"select pg_advisory_lock_shared(1);".  So it's specific to
xact-level shared locks, which seems downright weird.

I also tried this:

regression=# begin;
BEGIN
regression=*# select pg_advisory_lock_shared(1);
 pg_advisory_lock_shared
-------------------------

(1 row)

regression=*# select pg_advisory_xact_lock(1);
 pg_advisory_xact_lock
-----------------------

(1 row)

regression=*# select pg_advisory_lock(1);
 pg_advisory_lock
------------------

(1 row)

regression=*# prepare transaction 'test';
ERROR:  cannot PREPARE while holding both session-level and transaction-level locks on the same object

which makes me wonder why we didn't issue that error in your
example case.  However, that's not sufficient to explain
the crash with only the xact-shared lock.

BTW, note that we do manage to prepare the transaction before
crashing, so you must do
commit prepared 'test';
to clean up before trying again.

            regards, tom lane



I wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
>> The following statements lead to panic:
>> begin;
>> select pg_advisory_lock(1);
>> select pg_advisory_xact_lock_shared(1);
>> prepare transaction 'test';

> Thanks for the report!  Looks like the shared lock is sufficient
> to cause the problem:

Ah, scratch that.  I can't reproduce it now, and I think I messed
up yesterday by not remembering that pg_advisory_lock() takes a
session-level lock that would continue to be held after PREPARE.

> I also tried this:
> ...
> regression=*# prepare transaction 'test';
> ERROR:  cannot PREPARE while holding both session-level and transaction-level locks on the same object
> which makes me wonder why we didn't issue that error in your
> example case.

The reason why not is that the code that's meant to detect that is
just fundamentally inadequate.  It's examining individual LOCALLOCK
entries to detect conflicts, but we keep separate LOCALLOCK entries
for different lockmodes on the same object.  So pg_advisory_lock()
creates an entry with lockmode ExclusiveLock, while
pg_advisory_xact_lock_shared() creates a different entry with lockmode
AccessShareLock.  But they are pointing at the same PROCLOCK, and the
restriction we're dealing with here applies at the PROCLOCK level.

So I think we need something like the attached to fix this.

In the longer term, maybe it'd be better to rethink how we represent
LOCALLOCK entries so that they are one-to-one with PROCLOCKs.  But
rearranging that data structure for the convenience of PREPARE
TRANSACTION is probably going to be a net loss performance-wise.
In any case, we certainly wouldn't risk back-patching such a change.

            regards, tom lane

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 108b4d9023..8c2138f107 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -3204,6 +3204,102 @@ LockRefindAndRelease(LockMethod lockMethodTable, PGPROC *proc,
     }
 }

+/*
+ * CheckForSessionAndXactLocks
+ *        Check to see if transaction holds both session-level and xact-level
+ *        locks on the same object; if so, throw an error.
+ *
+ * If we have both session- and transaction-level locks on the same object,
+ * PREPARE TRANSACTION must fail.  This should never happen with regular
+ * locks, since we only take those at session level in some special operations
+ * like VACUUM.  It's possible to hit this with advisory locks, though.
+ *
+ * It would be nice if we could keep the session hold and give away the
+ * transactional hold to the prepared xact.  However, that would require two
+ * PROCLOCK objects, and we cannot be sure that another PROCLOCK will be
+ * available when it comes time for PostPrepare_Locks to do the deed.
+ * So for now, we error out while we can still do so safely.
+ *
+ * Since the LOCALLOCK table stores a separate entry for each lockmode,
+ * we can't implement this check by examining LOCALLOCK entries in isolation.
+ * We must build a transient hashtable that is indexed by locktag only.
+ */
+static void
+CheckForSessionAndXactLocks(void)
+{
+    typedef struct
+    {
+        LOCKTAG        lock;        /* identifies the lockable object */
+        bool        sessLock;    /* is any lockmode held at session level? */
+        bool        xactLock;    /* is any lockmode held at xact level? */
+    } PerLockTagEntry;
+
+    HASHCTL        hash_ctl;
+    HTAB       *lockhtab;
+    HASH_SEQ_STATUS status;
+    LOCALLOCK  *locallock;
+
+    /* Create a local hash table keyed by LOCKTAG only */
+    hash_ctl.keysize = sizeof(LOCKTAG);
+    hash_ctl.entrysize = sizeof(PerLockTagEntry);
+    hash_ctl.hcxt = CurrentMemoryContext;
+
+    lockhtab = hash_create("CheckForSessionAndXactLocks table",
+                           256, /* arbitrary initial size */
+                           &hash_ctl,
+                           HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+    /* Scan local lock table to find entries for each LOCKTAG */
+    hash_seq_init(&status, LockMethodLocalHash);
+
+    while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
+    {
+        LOCALLOCKOWNER *lockOwners = locallock->lockOwners;
+        PerLockTagEntry *hentry;
+        bool        found;
+        int            i;
+
+        /*
+         * Ignore VXID locks.  We don't want those to be held by prepared
+         * transactions, since they aren't meaningful after a restart.
+         */
+        if (locallock->tag.lock.locktag_type == LOCKTAG_VIRTUALTRANSACTION)
+            continue;
+
+        /* Ignore it if we don't actually hold the lock */
+        if (locallock->nLocks <= 0)
+            continue;
+
+        /* Otherwise, find or make an entry in lockhtab */
+        hentry = (PerLockTagEntry *) hash_search(lockhtab,
+                                                 (void *) &locallock->tag.lock,
+                                                 HASH_ENTER, &found);
+        if (!found)                /* initialize, if newly created */
+            hentry->sessLock = hentry->xactLock = false;
+
+        /* Scan to see if we hold lock at session or xact level or both */
+        for (i = locallock->numLockOwners - 1; i >= 0; i--)
+        {
+            if (lockOwners[i].owner == NULL)
+                hentry->sessLock = true;
+            else
+                hentry->xactLock = true;
+        }
+
+        /*
+         * We can throw error immediately when we see both types of locks; no
+         * need to wait around to see if there are more violations.
+         */
+        if (hentry->sessLock && hentry->xactLock)
+            ereport(ERROR,
+                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                     errmsg("cannot PREPARE while holding both session-level and transaction-level locks on the same
object")));
+    }
+
+    /* Success, so clean up */
+    hash_destroy(lockhtab);
+}
+
 /*
  * AtPrepare_Locks
  *        Do the preparatory work for a PREPARE: make 2PC state file records
@@ -3211,11 +3307,10 @@ LockRefindAndRelease(LockMethod lockMethodTable, PGPROC *proc,
  *
  * Session-level locks are ignored, as are VXID locks.
  *
- * There are some special cases that we error out on: we can't be holding any
- * locks at both session and transaction level (since we must either keep or
- * give away the PROCLOCK object), and we can't be holding any locks on
- * temporary objects (since that would mess up the current backend if it tries
- * to exit before the prepared xact is committed).
+ * For the most part, we don't need to touch shared memory for this ---
+ * all the necessary state information is in the locallock table.
+ * Fast-path locks are an exception, however: we move any such locks to
+ * the main table before allowing PREPARE TRANSACTION to succeed.
  */
 void
 AtPrepare_Locks(void)
@@ -3223,12 +3318,10 @@ AtPrepare_Locks(void)
     HASH_SEQ_STATUS status;
     LOCALLOCK  *locallock;

-    /*
-     * For the most part, we don't need to touch shared memory for this ---
-     * all the necessary state information is in the locallock table.
-     * Fast-path locks are an exception, however: we move any such locks to
-     * the main table before allowing PREPARE TRANSACTION to succeed.
-     */
+    /* First, verify there aren't locks of both xact and session level */
+    CheckForSessionAndXactLocks();
+
+    /* Now do the per-locallock cleanup work */
     hash_seq_init(&status, LockMethodLocalHash);

     while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
@@ -3264,19 +3357,7 @@ AtPrepare_Locks(void)
         if (!haveXactLock)
             continue;

-        /*
-         * If we have both session- and transaction-level locks, fail.  This
-         * should never happen with regular locks, since we only take those at
-         * session level in some special operations like VACUUM.  It's
-         * possible to hit this with advisory locks, though.
-         *
-         * It would be nice if we could keep the session hold and give away
-         * the transactional hold to the prepared xact.  However, that would
-         * require two PROCLOCK objects, and we cannot be sure that another
-         * PROCLOCK will be available when it comes time for PostPrepare_Locks
-         * to do the deed.  So for now, we error out while we can still do so
-         * safely.
-         */
+        /* This can't happen, because we already checked it */
         if (haveSessionLock)
             ereport(ERROR,
                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
diff --git a/src/test/regress/expected/prepared_xacts.out b/src/test/regress/expected/prepared_xacts.out
index eb77c18788..ba8e3ccc6c 100644
--- a/src/test/regress/expected/prepared_xacts.out
+++ b/src/test/regress/expected/prepared_xacts.out
@@ -151,6 +151,22 @@ SELECT gid FROM pg_prepared_xacts;

 -- Clean up
 DROP TABLE pxtest1;
+-- Test detection of session-level and xact-level locks on same object
+BEGIN;
+SELECT pg_advisory_lock(1);
+ pg_advisory_lock
+------------------
+
+(1 row)
+
+SELECT pg_advisory_xact_lock_shared(1);
+ pg_advisory_xact_lock_shared
+------------------------------
+
+(1 row)
+
+PREPARE TRANSACTION 'foo6';  -- fails
+ERROR:  cannot PREPARE while holding both session-level and transaction-level locks on the same object
 -- Test subtransactions
 BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
   CREATE TABLE pxtest2 (a int);
diff --git a/src/test/regress/expected/prepared_xacts_1.out b/src/test/regress/expected/prepared_xacts_1.out
index 0857d259e0..2cd50ad947 100644
--- a/src/test/regress/expected/prepared_xacts_1.out
+++ b/src/test/regress/expected/prepared_xacts_1.out
@@ -153,6 +153,23 @@ SELECT gid FROM pg_prepared_xacts;

 -- Clean up
 DROP TABLE pxtest1;
+-- Test detection of session-level and xact-level locks on same object
+BEGIN;
+SELECT pg_advisory_lock(1);
+ pg_advisory_lock
+------------------
+
+(1 row)
+
+SELECT pg_advisory_xact_lock_shared(1);
+ pg_advisory_xact_lock_shared
+------------------------------
+
+(1 row)
+
+PREPARE TRANSACTION 'foo6';  -- fails
+ERROR:  prepared transactions are disabled
+HINT:  Set max_prepared_transactions to a nonzero value.
 -- Test subtransactions
 BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
   CREATE TABLE pxtest2 (a int);
diff --git a/src/test/regress/sql/prepared_xacts.sql b/src/test/regress/sql/prepared_xacts.sql
index d8249a27dc..2f0bb55bb4 100644
--- a/src/test/regress/sql/prepared_xacts.sql
+++ b/src/test/regress/sql/prepared_xacts.sql
@@ -88,6 +88,12 @@ SELECT gid FROM pg_prepared_xacts;
 -- Clean up
 DROP TABLE pxtest1;

+-- Test detection of session-level and xact-level locks on same object
+BEGIN;
+SELECT pg_advisory_lock(1);
+SELECT pg_advisory_xact_lock_shared(1);
+PREPARE TRANSACTION 'foo6';  -- fails
+
 -- Test subtransactions
 BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
   CREATE TABLE pxtest2 (a int);


> 25 июля 2021 г., в 02:44, Tom Lane <tgl@sss.pgh.pa.us> написал(а):
>
> +PREPARE TRANSACTION 'foo6';  -- fails
> +ERROR:  prepared transactions are disabled
> +HINT:  Set max_prepared_transactions to a nonzero value.

Do we actually expect this particular error?

Best regards, Andrey Borodin.



Andrey Borodin <x4mmm@yandex-team.ru> writes:
>> 25 июля 2021 г., в 02:44, Tom Lane <tgl@sss.pgh.pa.us> написал(а):
>> +PREPARE TRANSACTION 'foo6';  -- fails
>> +ERROR:  prepared transactions are disabled
>> +HINT:  Set max_prepared_transactions to a nonzero value.

> Do we actually expect this particular error?

Of course.  "make installcheck" will produce that result file,
unless you've changed max_prepared_transactions on the server
being tested.

            regards, tom lane




> 25 июля 2021 г., в 19:32, Tom Lane <tgl@sss.pgh.pa.us> написал(а):
>
> Andrey Borodin <x4mmm@yandex-team.ru> writes:
>>> 25 июля 2021 г., в 02:44, Tom Lane <tgl@sss.pgh.pa.us> написал(а):
>>> +PREPARE TRANSACTION 'foo6';  -- fails
>>> +ERROR:  prepared transactions are disabled
>>> +HINT:  Set max_prepared_transactions to a nonzero value.
>
>> Do we actually expect this particular error?
>
> Of course.  "make installcheck" will produce that result file,
> unless you've changed max_prepared_transactions on the server
> being tested.

Sorry for the noise. I did not know that suffixes for .out files work this way.

Best regards, Andrey Borodin.


On Sat, Jul 24, 2021 at 05:44:57PM -0400, Tom Lane wrote:
> The reason why not is that the code that's meant to detect that is
> just fundamentally inadequate.  It's examining individual LOCALLOCK
> entries to detect conflicts, but we keep separate LOCALLOCK entries
> for different lockmodes on the same object.  So pg_advisory_lock()
> creates an entry with lockmode ExclusiveLock, while
> pg_advisory_xact_lock_shared() creates a different entry with lockmode
> AccessShareLock.  But they are pointing at the same PROCLOCK, and the
> restriction we're dealing with here applies at the PROCLOCK level.

For the archives: this has been applied as of 6310809.
--
Michael

Вложения