BUG #17949: Adding an index introduces serialisation anomalies.

From: PG Bug reporting form
Subject: BUG #17949: Adding an index introduces serialisation anomalies.
Date:
Msg-id: 17949-a0f17035294a55e2@postgresql.org
Replies: Re: BUG #17949: Adding an index introduces serialisation anomalies.  (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17949
Logged by:          Artem Anisimov
Email address:      artem.anisimov.255@gmail.com
PostgreSQL version: 15.3
Operating system:   fedora 38
Description:

Hello dear pg authors,

I have come across a behaviour in pg that contradicts
https://www.postgresql.org/docs/15/transaction-iso.html#XACT-SERIALIZABLE.

I've minimised my testcase to a scenario that essentially duplicates the
one you describe in the following paragraph:
> In particular, it is possible to see unique constraint violations
> caused by conflicts with overlapping Serializable transactions
> even after explicitly checking that the key isn't present
> before attempting to insert it.
> This can be avoided by making sure that all Serializable transactions
> that insert potentially conflicting keys explicitly check if they can
> do so first.

At the end of the report there is a C reproducer that highlights the
problem. Let me give a high-level overview of the scenario first.

I have N threads each trying to "acquire an exclusive lock" this way:
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  SELECT * FROM locks WHERE path = $1;
  quit if there already is a lock
  INSERT INTO locks(path, ...) VALUES($1, ...);
  COMMIT;
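
The anomaly corresponds to the following interleaving of two such transactions (the session labels T1/T2 are mine, added for illustration):

-- T1                                       -- T2
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM locks WHERE path = 'xyz';     -- returns 0 rows
                                            BEGIN ISOLATION LEVEL SERIALIZABLE;
                                            SELECT * FROM locks WHERE path = 'xyz';   -- returns 0 rows
INSERT INTO locks(path) VALUES('xyz');
COMMIT;                                     -- succeeds
                                            INSERT INTO locks(path) VALUES('xyz');
                                            COMMIT;  -- expected: serialization failure;
                                                     -- observed with the index: also succeeds

With the index in place, both transactions sometimes commit and the table ends up with two rows for the same path.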

Once all threads have attempted to acquire a lock, I count how many of them
succeeded in inserting a row. Normally only one thread succeeds, which is
what I expect. However, if I add a HASH or a BTREE index on "path", it
becomes possible for multiple threads to complete a successful INSERT,
which violates serialisability.

The problem can be reproduced with the default postgresql.conf, but it takes
some time. If I increase "shared_buffers" to 1024MB, the issue appears
almost immediately (fewer chances to promote predicate locks to locks on the
whole table?).
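
For reference, the faster variant can be set up with ALTER SYSTEM (this assumes superuser access; shared_buffers only takes effect after a server restart):

ALTER SYSTEM SET shared_buffers = '1024MB';
-- then restart the server, e.g. with pg_ctl restart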

I've seen this behaviour with pg 13, 15 and 16 (503b055).

Now let us see the reproducer. It has two primary components:
1. test_once() spawns 32 threads that try to acquire a lock, waits for them,
and counts the number of "acquired exclusive locks",
2. try_acquire_lock() executes a transaction described above.

To build the reproducer:
$ gcc -std=c99 -o test test.c -pthread `pkg-config libpq --cflags --libs`

To run the reproducer:
$ ./test
or
$ DB_CONNSTR="dbname=abc host=def user=ghi" ./test

The reproducer:

#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

#include <libpq-fe.h>


// Comment this to make serialisation anomalies go away.
#define WITH_INDEX


// I have seen the problem with as few as 3 threads. 32 threads make
// the issue appear much sooner.
#define NR_THREADS (32)
#define NR_RUNS (1024 * 1024)

static PGconn *conns[NR_THREADS];

static void* try_acquire_lock(void *arg)
{
        PGconn *c = arg;
        PGresult *res;
        int ntuples;

        res = PQexec(c, "BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE");
        assert(PQresultStatus(res) == PGRES_COMMAND_OK);
        PQclear(res);

        res = PQexec(c, "SELECT * FROM locks WHERE path = 'xyz'");
        assert(PQresultStatus(res) == PGRES_TUPLES_OK);
        ntuples = PQntuples(res);
        PQclear(res);

        if (ntuples > 0) {
                // someone else already has a lock
                res = PQexec(c, "COMMIT");
                assert(PQresultStatus(res) == PGRES_COMMAND_OK);
                PQclear(res);
                return NULL;
        }

        // The results of the INSERT and the COMMIT are deliberately not
        // asserted: overlapping serializable transactions are expected to
        // abort here with a serialization failure, which simply counts as
        // "lock not acquired".
        res = PQexec(c, "INSERT INTO locks(path) VALUES('xyz')");
        PQclear(res);

        res = PQexec(c, "COMMIT");
        PQclear(res);

        return NULL;
}

static void test_once(void)
{
        PGconn *c = conns[0];
        PGresult *res;
        int ntuples;

        pthread_t thrs[NR_THREADS];
        for (int i = 0; i < NR_THREADS; ++i)
                pthread_create(&thrs[i], NULL, &try_acquire_lock, conns[i]);
        for (int i = 0; i < NR_THREADS; ++i)
                pthread_join(thrs[i], NULL);

        res = PQexec(c, "SELECT * FROM locks WHERE path = 'xyz'");
        assert(PQresultStatus(res) == PGRES_TUPLES_OK);
        ntuples = PQntuples(res);
        PQclear(res);

        if (ntuples != 1)
                printf("ntuples = %d\n", ntuples);
        assert(ntuples == 1);

        res = PQexec(c, "TRUNCATE TABLE locks");
        assert(PQresultStatus(res) == PGRES_COMMAND_OK);
        PQclear(res);
}

static void prepare_db(void)
{
        PGconn *c = conns[0];
        PGresult *res;

        // Result deliberately not checked: the table may not exist yet.
        res = PQexec(c, "DROP TABLE locks");
        PQclear(res);

        res = PQexec(c, "CREATE TABLE locks (path TEXT NOT NULL)");
        assert(PQresultStatus(res) == PGRES_COMMAND_OK);
        PQclear(res);

#ifdef WITH_INDEX
        res = PQexec(c, "CREATE INDEX ON locks USING HASH(path)");
        assert(PQresultStatus(res) == PGRES_COMMAND_OK);
        PQclear(res);
#endif
}

int main(void)
{
        const char *connstr = getenv("DB_CONNSTR");
        if (connstr == NULL)
                connstr = "dbname=postgres";

        for (int i = 0; i < NR_THREADS; ++i) {
                conns[i] = PQconnectdb(connstr);
                assert(PQstatus(conns[i]) == CONNECTION_OK);
        }
        prepare_db();

        for (int i = 0; i < NR_RUNS; ++i)
                test_once();

        for (int i = 0; i < NR_THREADS; ++i)
                PQfinish(conns[i]);

        return 0;
}

