Re: Online enabling of checksums

From: Andres Freund
Subject: Re: Online enabling of checksums
Msg-id: 20180406222323.4idsn37rye6zzvfj@alap3.anarazel.de
In response to: Re: Online enabling of checksums  (Daniel Gustafsson <daniel@yesql.se>)
Responses: Re: Online enabling of checksums  (Daniel Gustafsson <daniel@yesql.se>)
List: pgsql-hackers

Hi,

On 2018-04-06 02:28:17 +0200, Daniel Gustafsson wrote:
> Applying this makes the _cancel test pass, moving the failure instead to the
> following _enable test (which matches what coypu and mylodon are seeing).

FWIW, I'm somewhat annoyed that I'm now spending time debugging this to
get the buildfarm green again.

I'm fairly certain that the bug here is a simple race condition in the
test (not the main code!):

The flag informing whether the worker has started is cleared via an
on_shmem_exit() hook:

static void
launcher_exit(int code, Datum arg)
{
    ChecksumHelperShmem->abort = false;
    pg_atomic_clear_flag(&ChecksumHelperShmem->launcher_started);
}

but the wait in the test is done via functions like:

    CREATE OR REPLACE FUNCTION test_checksums_on() RETURNS boolean AS $$
    DECLARE
        enabled boolean;
    BEGIN
        LOOP
            SELECT setting = 'on' INTO enabled FROM pg_catalog.pg_settings WHERE name = 'data_checksums';
            IF enabled THEN
                EXIT;
            END IF;
            PERFORM pg_sleep(1);
        END LOOP;
        RETURN enabled;
    END;
    $$ LANGUAGE plpgsql;

    INSERT INTO t1 (b, c) VALUES (generate_series(1,10000), 'starting values');

    CREATE OR REPLACE FUNCTION test_checksums_off() RETURNS boolean AS $$
    DECLARE
        enabled boolean;
    BEGIN
        PERFORM pg_sleep(1);
        SELECT setting = 'off' INTO enabled FROM pg_catalog.pg_settings WHERE name = 'data_checksums';
        RETURN enabled;
    END;
    $$ LANGUAGE plpgsql;

which just waits for setting checksums to have finished.  It's
exceedingly unsurprising that a 'pg_sleep(1)' is not a reliable way to
make sure that a process has finished exiting.  Follow-up tests then fail
because the process is still running.
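
A minimal sketch of a more robust wait, assuming the launcher is visible in
pg_stat_activity under some identifiable backend_type.  The function name
wait_for_backend_exit() and the pattern passed to it are hypothetical and not
part of the patch.  It polls for the process to actually be gone, with an
upper bound on the wait, instead of relying on a single fixed sleep.  It also
assumes the launcher_started flag has already been cleared by the time the
entry disappears from pg_stat_activity, which would need to be checked
against the order in which the exit hooks run.

```sql
-- Hypothetical helper, not part of the patch: poll until no backend whose
-- backend_type matches type_pattern is left, or give up after a timeout.
CREATE OR REPLACE FUNCTION wait_for_backend_exit(
    type_pattern text,
    timeout interval DEFAULT '30 seconds')
RETURNS boolean AS $$
DECLARE
    deadline timestamptz := clock_timestamp() + timeout;
BEGIN
    LOOP
        -- pg_stat_activity is cached for the duration of a transaction, so
        -- discard the cached snapshot before every check.
        PERFORM pg_stat_clear_snapshot();

        -- Done once no matching process is visible any more.
        IF NOT EXISTS (SELECT 1
                       FROM pg_catalog.pg_stat_activity
                       WHERE backend_type LIKE type_pattern) THEN
            RETURN true;
        END IF;

        -- Report failure instead of hanging the test forever.
        IF clock_timestamp() > deadline THEN
            RETURN false;
        END IF;

        PERFORM pg_sleep(0.1);
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

A test step could then run something like
SELECT wait_for_backend_exit('checksumhelper%'); before starting the next
test, where the pattern is only a guess at how the worker is named.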

Also:
    CREATE OR REPLACE FUNCTION reader_loop() RETURNS boolean AS $$
    DECLARE
        counter integer;
    BEGIN
        FOR counter IN 1..30 LOOP
            PERFORM count(a) FROM t1;
            PERFORM pg_sleep(0.2);
        END LOOP;
        RETURN True;
    END;
    $$ LANGUAGE plpgsql;

Really?  Let's just force the test to take at least 6s purely from
sleeping?

Greetings,

Andres Freund