Re: [PATCH] Logical decoding timeline following take II

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: [PATCH] Logical decoding timeline following take II
Дата
Msg-id 20161112150757.qgy5cw4e7buxm2my@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: [PATCH] Logical decoding timeline following take II  (Petr Jelinek <petr@2ndquadrant.com>)
Ответы Re: [PATCH] Logical decoding timeline following take II
Список pgsql-hackers
On 2016-10-24 17:49:13 +0200, Petr Jelinek wrote:
> + * Determine which timeline to read an xlog page from and set the
> + * XLogReaderState's state->currTLI to that timeline ID.

"XLogReaderState's state->currTLI" - the state's a bit redundant.

> + * The caller must also make sure it doesn't read past the current redo pointer
> + * so it doesn't fail to notice that the current timeline became historical.
> + */

Not sure what that means? The redo pointer usually menas the "logical
start" of the last checkpoint, but I don't see how thta could be meant
here?

> +static void
> +XLogReadDetermineTimeline(XLogReaderState *state, XLogRecPtr wantPage, uint32 wantLength)
> +{
> +    const XLogRecPtr lastReadPage = state->readSegNo * XLogSegSize + state->readOff;
> +
> +    elog(DEBUG4, "Determining timeline for read at %X/%X+%X",
> +            (uint32)(wantPage>>32), (uint32)wantPage, wantLength);

Doing this on every single read from a page seems quite verbose.   It's
also (like most or all the following debug elogs) violating the casing
prescribed by the message style guidelines.


> +    /*
> +     * If the desired page is currently read in and valid, we have nothing to do.
> +     *
> +     * The caller should've ensured that it didn't previously advance readOff
> +     * past the valid limit of this timeline, so it doesn't matter if the current
> +     * TLI has since become historical.
> +     */
> +    if (lastReadPage == wantPage &&
> +        state->readLen != 0 &&
> +        lastReadPage + state->readLen >= wantPage + Min(wantLength,XLOG_BLCKSZ-1))
> +    {
> +        elog(DEBUG4, "Wanted data already valid"); //XXX
> +        return;
> +    }

With that kind of comment/XXX present, this surely can't be ready for
committer?


> +    /*
> +     * If we're reading from the current timeline, it hasn't become historical
> +     * and the page we're reading is after the last page read, we can again
> +     * just carry on. (Seeking backwards requires a check to make sure the older
> +     * page isn't on a prior timeline).
> +     */

How can ThisTimeLineID become historical?

> +    if (state->currTLI == ThisTimeLineID && wantPage >= lastReadPage)
> +    {
> +        Assert(state->currTLIValidUntil == InvalidXLogRecPtr);
> +        elog(DEBUG4, "On current timeline");
> +        return;
> +    }

Also, is it actually ok to rely on ThisTimeLineID here? That's IIRC not
maintained outside of the startup process, until recovery ended (cf it
being set in InitXLOGAccess() called via RecoveryInProgress()).


>          /*
> -         * TODO: we're going to have to do something more intelligent about
> -         * timelines on standbys. Use readTimeLineHistory() and
> -         * tliOfPointInHistory() to get the proper LSN? For now we'll catch
> -         * that case earlier, but the code and TODO is left in here for when
> -         * that changes.
> +         * Check which timeline to get the record from.
> +         *
> +         * We have to do it each time through the loop because if we're in
> +         * recovery as a cascading standby, the current timeline might've
> +         * become historical.
>           */

I guess you mean cascading as "replicating logically from a physical
standby"? I don't think it's good to use cascading here, because that's
usually thought to mean doing physical SR from a standby...



> diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
> index 318726e..4315fb3 100644
> --- a/src/backend/replication/logical/logicalfuncs.c
> +++ b/src/backend/replication/logical/logicalfuncs.c
> @@ -234,12 +234,6 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
>      rsinfo->setResult = p->tupstore;
>      rsinfo->setDesc = p->tupdesc;
>  
> -    /* compute the current end-of-wal */
> -    if (!RecoveryInProgress())
> -        end_of_wal = GetFlushRecPtr();
> -    else
> -        end_of_wal = GetXLogReplayRecPtr(NULL);
> -
>      ReplicationSlotAcquire(NameStr(*name));
>  
>      PG_TRY();
> @@ -279,6 +273,12 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
>          /* invalidate non-timetravel entries */
>          InvalidateSystemCaches();
>  
> +        if (!RecoveryInProgress())
> +            end_of_wal = GetFlushRecPtr();
> +        else
> +            end_of_wal = GetXLogReplayRecPtr(NULL);
> +
> +        /* Decode until we run out of records */
>          while ((startptr != InvalidXLogRecPtr && startptr < end_of_wal) ||
>                 (ctx->reader->EndRecPtr != InvalidXLogRecPtr && ctx->reader->EndRecPtr < end_of_wal))
>          {

That seems like a pretty random change?


> diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
> index deaa7f5..caff9a6 100644
> --- a/src/include/access/xlogreader.h
> +++ b/src/include/access/xlogreader.h
> @@ -27,6 +27,10 @@
>  
>  #include "access/xlogrecord.h"
>  
> +#ifndef FRONTEND
> +#include "nodes/pg_list.h"
> +#endif

Why is that needed?
> diff --git a/src/test/modules/test_slot_timelines/README b/src/test/modules/test_slot_timelines/README
> new file mode 100644
> index 0000000..585f02f
> --- /dev/null
> +++ b/src/test/modules/test_slot_timelines/README
> @@ -0,0 +1,19 @@
> +A test module for logical decoding failover and timeline following.
> +
> +This module provides a minimal way to maintain logical slots on replicas
> +that mirror the state on the master. It doesn't make decoding possible,
> +just tracking slot state so that a decoding client that's using the master
> +can follow a physical failover to the standby. The master doesn't know
> +about the slots on the standby, they're synced by a client that connects
> +to both.
> +
> +This is intentionally not part of the test_decoding module because that's meant
> +to serve as example code, where this module exercises internal server features
> +by unsafely exposing internal state to SQL. It's not the right way to do
> +failover, it's just a simple way to test it from the perl TAP framework to
> +prove the feature works.
> +
> +In a practical implementation of this approach a bgworker on the master would
> +monitor slot positions and relay them to a bgworker on the standby that applies
> +the position updates without exposing slot internals to SQL. That's too complex
> +for this test framework though.

I still don't want to have this code in postgres, it's just an example
for doing something bad.  Lets get proper feedback control in, and go
from there.

> diff --git a/src/test/modules/test_slot_timelines/expected/load_extension_1.out
b/src/test/modules/test_slot_timelines/expected/load_extension_1.out
> new file mode 100644
> index 0000000..0db21e4
> --- /dev/null
> +++ b/src/test/modules/test_slot_timelines/expected/load_extension_1.out
> @@ -0,0 +1,7 @@
> +CREATE EXTENSION test_slot_timelines;
> +SELECT test_slot_timelines_create_logical_slot('test_slot', 'test_decoding');
> +ERROR:  replication slots can only be used if max_replication_slots > 0
> +SELECT test_slot_timelines_advance_logical_slot('test_slot', txid_current()::text::xid, txid_current()::text::xid,
pg_current_xlog_location(),pg_current_xlog_location());
 
> +ERROR:  replication slots can only be used if max_replication_slots > 0
> +SELECT pg_drop_replication_slot('test_slot');
> +ERROR:  replication slots can only be used if max_replication_slots > 0
> diff --git a/src/test/modules/test_slot_timelines/sql/load_extension.sql
b/src/test/modules/test_slot_timelines/sql/load_extension.sql
> new file mode 100644
> index 0000000..2440355

It doesn't seem worthwhlie to maintain a test for this - why do we run
the tests with those settings in the first place?

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Gilles Darold
Дата:
Сообщение: Re: Patch to implement pg_current_logfile() function
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: pg_dump, pg_dumpall and data durability