Re: Logical Decoding follows timelines

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Logical Decoding follows timelines
Дата
Msg-id 54A7BF61.9080708@vmware.com
обсуждение исходный текст
Ответ на Re: Logical Decoding follows timelines  (Simon Riggs <simon@2ndQuadrant.com>)
Ответы Re: Logical Decoding follows timelines  (Andres Freund <andres@2ndquadrant.com>)
Список pgsql-hackers
On 12/17/2014 10:35 AM, Simon Riggs wrote:
> On 16 December 2014 at 21:17, Simon Riggs <simon@2ndquadrant.com> wrote:
>
>>>> This patch is a WIP version of doing that, but only currently attempts
>
>>> With the patch, XLogSendLogical uses the same logic to calculate SendRqstPtr
>>> that XLogSendPhysical does. It would be good to refactor that into a common
>>> function, rather than copy-paste.
>>
>> Some of the logic is similar, but not all.
>>
>>> SendRqstPtr isn't actually used for anything in XLogSendLogical.
>>
>> It exists to allow the call which resets TLI.
>>
>> I'll see if I can make it exactly identical; I didn't think so when I
>> first looked, will look again.
>
> Yes, that works. New version attached

Some comments, mostly on readability (not all of these were this patch's 
fault):

>             /*
>              * Check that the timeline the client requested for exists, and
>              * the requested start location is on that timeline.
>              */
>             (void) ReadSendTimeLine(cmd->timeline);
>
>             /*
>              * Found the requested timeline in the history. Check that
>              * requested startpoint is on that timeline in our history.
>              *
>              * This is quite loose on purpose. We only check that we didn't
>              * fork off the requested timeline before the switchpoint. We
>              * don't check that we switched *to* it before the requested
>              * starting point. This is because the client can legitimately
>              * request to start replication from the beginning of the WAL
>              * segment that contains switchpoint, but on the new timeline, so
>              * that it doesn't end up with a partial segment. If you ask for a
>              * too old starting point, you'll get an error later when we fail
>              * to find the requested WAL segment in pg_xlog.
>              *
>              * XXX: we could be more strict here and only allow a startpoint
>              * that's older than the switchpoint, if it's still in the same
>              * WAL segment.
>              */

The first comment implies that the ReadSendTimeLine call checks that the 
requested start location is on the timeline, but that's actually done by 
the code that follows the second comment. I would merge these two 
comments, and move the ReadSendTimeLine call below the merged comment.

> @@ -577,8 +571,8 @@ StartReplication(StartReplicationCmd *cmd)
>               * that's older than the switchpoint, if it's still in the same
>               * WAL segment.
>               */
> -            if (!XLogRecPtrIsInvalid(switchpoint) &&
> -                switchpoint < cmd->startpoint)
> +            if (!XLogRecPtrIsInvalid(sendTimeLineValidUpto) &&
> +                sendTimeLineValidUpto < cmd->startpoint)
>              {
>                  ereport(ERROR,
>                          (errmsg("requested starting point %X/%X on timeline %u is not in this server's history",

IMHO using the local 'switchpoint' variable was more clear.

> @@ -941,6 +936,8 @@ StartLogicalReplication(StartReplicationCmd *cmd)
>       * Force a disconnect, so that the decoding code doesn't need to care
>       * about an eventual switch from running in recovery, to running in a
>       * normal environment. Client code is expected to handle reconnects.
> +     * This covers the race condition where we are promoted half way
> +     * through starting up.
>       */
>      if (am_cascading_walsender && !RecoveryInProgress())
>      {

We could exit recovery immediately after this check. Why is this check 
needed?

>      /*
> +     * Find the timeline for the start location, or throw an error.
> +     *
> +     * Logical replication relies upon replication slots. Each slot has a
> +     * single timeline history baked into it, so this should be easy.
> +     */

I don't understand what "baked in" means here.

> +    /*
> +     * Get the SendRqstPtr and follow any timeline changes.
> +     */
> +    SendRqstPtr = GetLatestRequestPtr();

The old comment used to say "Figure out how far we can safely send the 
WAL". I think that was much more clear. It's not clear what following 
timeline changes means here, and the fact that it "gets the SendRqstPtr" 
is obvious from the code.

> +
> +static XLogRecPtr
> +GetLatestRequestPtr(void)

This function desperately needs comment to explain what it does. I don't 
much like its name either.

> +static TimeLineID
> +ReadSendTimeLine(TimeLineID tli)

Ditto. This function is also missing a "return".

I think it would slightly more intuitive if this function didn't set the 
global variables directly, but simply returned the returned values to 
the caller.

- Heikki




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: Problems with approach #2 to value locking (INSERT ... ON CONFLICT UPDATE/IGNORE patch)
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: Problems with approach #2 to value locking (INSERT ... ON CONFLICT UPDATE/IGNORE patch)