Обсуждение: Helper functions for wait_for_catchup() in Cluster.pm

Поиск
Список
Период
Сортировка

Helper functions for wait_for_catchup() in Cluster.pm

От
"Drouvot, Bertrand"
Дата:
Hi hackers,

please find attached a patch proposal to define $SUBJECT.

The idea has been raised in [1], where we are adding more calls to wait_for_catchup() in 'replay' mode.

The current code already has 25 of those, so the proposed patch is defining a new wait_for_replay_catchup() function.

While at it, adding also:

- wait_for_write_catchup(): called 5 times
- wait_for_sent_catchup() and wait_for_flush_catchup() for consistency purpose (while there is
currently no occurrences of wait_for_catchup() in 'sent' or 'flush' mode.).

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

[1]: https://www.postgresql.org/message-id/20230106034036.2m4qnn7ep7b5ipet%40awork3.anarazel.de
Вложения

Re: Helper functions for wait_for_catchup() in Cluster.pm

От
Alvaro Herrera
Дата:
On 2023-Jan-17, Drouvot, Bertrand wrote:

> The idea has been raised in [1], where we are adding more calls to
> wait_for_catchup() in 'replay' mode.

This seems mostly useless as presented.  Maybe if you're able to reduce
the noise on the second argument it would be worth something -- namely,
if the wrapper function receives a node instead of an LSN: perhaps
wait_for_replay_catchup() would use the flush LSN from the given node,
wait_for_write_catchup() would use the write LSN, and
wait_for_sent_catchup() would use the insert LSN.  (I didn't check in
your patch if there are callsites that do something else).  This would
in several cases let you also remove the line with the assignment of
appropriate LSN to a separate variable.  If you did it that way, maybe
the code would become a tiny bit smaller overall.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Helper functions for wait_for_catchup() in Cluster.pm

От
"Drouvot, Bertrand"
Дата:
Hi,

On 1/17/23 12:23 PM, Alvaro Herrera wrote:
> On 2023-Jan-17, Drouvot, Bertrand wrote:
> 
>> The idea has been raised in [1], where we are adding more calls to
>> wait_for_catchup() in 'replay' mode.
> 
> This seems mostly useless as presented.  Maybe if you're able to reduce
> the noise on the second argument it would be worth something -- namely,
> if the wrapper function receives a node instead of an LSN: perhaps
> wait_for_replay_catchup() would use the flush LSN from the given node,
> wait_for_write_catchup() would use the write LSN, and
> wait_for_sent_catchup() would use the insert LSN.  (I didn't check in
> your patch if there are callsites that do something else).  This would
> in several cases let you also remove the line with the assignment of
> appropriate LSN to a separate variable.  If you did it that way, maybe
> the code would become a tiny bit smaller overall.
> 

Thanks for looking at it!

The current calls are done that way:

wait_for_replay_catchup called:
- 8 times with write LSN as an argument
- 1 time with insert LSN as an argument
- 16 times with flush LSN as an argument

wait_for_write_catchup called:
- 5 times with write LSN as an argument

So it looks like that providing a node as a second argument
would not help for the wait_for_replay_catchup() case.

Worth to use the node as an argument for wait_for_write_catchup()? (though it would be
weird to have different types of arguments between wait_for_replay_catchup() and wait_for_write_catchup()).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Helper functions for wait_for_catchup() in Cluster.pm

От
Alvaro Herrera
Дата:
On 2023-Jan-18, Drouvot, Bertrand wrote:

> The current calls are done that way:
> 
> wait_for_replay_catchup called:
> - 8 times with write LSN as an argument
> - 1 time with insert LSN as an argument
> - 16 times with flush LSN as an argument

> So it looks like that providing a node as a second argument
> would not help for the wait_for_replay_catchup() case.

... unless we changed the calls that wait for reply that use write or
insert so that they use flush instead.  Surely everything should still
work, right?  Flushing would still occur, either right after the write
(as the transaction commits) or ~200ms afterwards when WAL writer
catches up to that point.

I suppose this may fail to be true if there is some test that is
specifically testing whether writing WAL without flushing works, which
should rare enough, but if it does exist, in that one place we can use
the underlying wait_for_catchup().

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: Helper functions for wait_for_catchup() in Cluster.pm

От
"Drouvot, Bertrand"
Дата:
Hi,

On 1/18/23 10:59 AM, Alvaro Herrera wrote:
> On 2023-Jan-18, Drouvot, Bertrand wrote:
> 
>> The current calls are done that way:
>>
>> wait_for_replay_catchup called:
>> - 8 times with write LSN as an argument
>> - 1 time with insert LSN as an argument
>> - 16 times with flush LSN as an argument
> 
>> So it looks like that providing a node as a second argument
>> would not help for the wait_for_replay_catchup() case.
> 
> ... unless we changed the calls that wait for reply that use write or
> insert so that they use flush instead.  

That's a good idea, thanks! Please find attached V2 doing so.

> Surely everything should still
> work, right?  

Right.

> Flushing would still occur, either right after the write
> (as the transaction commits) or ~200ms afterwards when WAL writer
> catches up to that point.
> 
> I suppose this may fail to be true if there is some test that is
> specifically testing whether writing WAL without flushing works, which
> should rare enough, but if it does exist, 

I don't see this kind of test.

Please note that V2 does not contain wait_for_flush_catchup() and
wait_for_sent_catchup() anymore as: 1) they are not used yet
and 2) it lets to their author (if any) decide the node->lsn() mode to be used.

This is also mentioned as a comment in V2.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения

Re: Helper functions for wait_for_catchup() in Cluster.pm

От
Alvaro Herrera
Дата:
Looking again, I have two thoughts for making things easier:

1. I don't think wait_for_write_catchup is necessary, because
calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments
would already do the same thing.  So what we should do is patch places
that currently give those two arguments, so that they don't.

2. Because wait_for_replay_catchup is an instance method, passing the
second node as argument is needlessly noisy, because that's already
known as $self.  So we can just say

  $primary_node->wait_for_replay_catchup($standby_node);

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Helper functions for wait_for_catchup() in Cluster.pm

От
"Drouvot, Bertrand"
Дата:
Hi,

On 1/24/23 7:27 PM, Alvaro Herrera wrote:
> Looking again, I have two thoughts for making things easier:
> 
> 1. I don't think wait_for_write_catchup is necessary, because
> calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments
> would already do the same thing.  So what we should do is patch places
> that currently give those two arguments, so that they don't.
> 

Agree but there is one place where the node passed as the second argument is not the "$self":

src/bin/pg_rewind/t/007_standby_source.pl:$node_b->wait_for_write_catchup('node_c', $node_a);

So it looks like there is still a need for wait_for_write_catchup().

> 2. Because wait_for_replay_catchup is an instance method, passing the
> second node as argument is needlessly noisy, because that's already
> known as $self.  So we can just say
> 
>    $primary_node->wait_for_replay_catchup($standby_node);
> 

Yeah, but same here, there is places where the node passed as the second argument is not the "$self":

src/bin/pg_rewind/t/007_standby_source.pl:$node_b->wait_for_replay_catchup('node_c', $node_a);
src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
src/test/recovery/t/001_stream_rep.pl:  $node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);

So it looks like there is still a need for wait_for_replay_catchup() with 2 parameters.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Helper functions for wait_for_catchup() in Cluster.pm

От
Alvaro Herrera
Дата:
On 2023-Jan-26, Drouvot, Bertrand wrote:

> On 1/24/23 7:27 PM, Alvaro Herrera wrote:

> > 1. I don't think wait_for_write_catchup is necessary, because
> > calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments
> > would already do the same thing.  So what we should do is patch places
> > that currently give those two arguments, so that they don't.
> 
> Agree but there is one place where the node passed as the second argument is not the "$self":
> 
> src/bin/pg_rewind/t/007_standby_source.pl:$node_b->wait_for_write_catchup('node_c', $node_a);
> 
> So it looks like there is still a need for wait_for_write_catchup().

Hmm, I think that one can use the more general wait_for_catchup.


> > 2. Because wait_for_replay_catchup is an instance method, passing the
> > second node as argument is needlessly noisy, because that's already
> > known as $self.  So we can just say
> > 
> >    $primary_node->wait_for_replay_catchup($standby_node);
> 
> Yeah, but same here, there is places where the node passed as the second argument is not the "$self":
> 
> src/bin/pg_rewind/t/007_standby_source.pl:$node_b->wait_for_replay_catchup('node_c', $node_a);
> src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
> src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
> src/test/recovery/t/001_stream_rep.pl:  $node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
> 
> So it looks like there is still a need for wait_for_replay_catchup() with 2 parameters.

Ah, cascading replication.  In that case, let's make the second
parameter optional.  If it's not given, $self is used.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"En las profundidades de nuestro inconsciente hay una obsesiva necesidad
de un universo lógico y coherente. Pero el universo real se halla siempre
un paso más allá de la lógica" (Irulan)



Re: Helper functions for wait_for_catchup() in Cluster.pm

От
"Drouvot, Bertrand"
Дата:
Hi,

On 1/26/23 10:42 AM, Alvaro Herrera wrote:
> On 2023-Jan-26, Drouvot, Bertrand wrote:
> 
>> On 1/24/23 7:27 PM, Alvaro Herrera wrote:
> 
>>> 1. I don't think wait_for_write_catchup is necessary, because
>>> calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments
>>> would already do the same thing. 

Having a closer look, it does not seem to be the case. The default mode
in wait_for_catchup() is 'replay' and the default mode for the lsn is 'write'.

But in wait_for_write_catchup() we are making use of 'write' for both.

> 
>>> 2. Because wait_for_replay_catchup is an instance method, passing the
>>> second node as argument is needlessly noisy, because that's already
>>> known as $self.  So we can just say
>>>
>>>     $primary_node->wait_for_replay_catchup($standby_node);
>>
>> Yeah, but same here, there is places where the node passed as the second argument is not the "$self":
>>
>> src/bin/pg_rewind/t/007_standby_source.pl:$node_b->wait_for_replay_catchup('node_c', $node_a);
>> src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
>> src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
>> src/test/recovery/t/001_stream_rep.pl:  $node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
>>
>> So it looks like there is still a need for wait_for_replay_catchup() with 2 parameters.
> 
> Ah, cascading replication.  In that case, let's make the second
> parameter optional.  If it's not given, $self is used.
> 

Good point, done in V3 attached.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения

Re: Helper functions for wait_for_catchup() in Cluster.pm

От
Alvaro Herrera
Дата:
On 2023-Jan-26, Drouvot, Bertrand wrote:

> Hi,
> 
> On 1/26/23 10:42 AM, Alvaro Herrera wrote:
> > On 2023-Jan-26, Drouvot, Bertrand wrote:
> > 
> > > On 1/24/23 7:27 PM, Alvaro Herrera wrote:
> > 
> > > > 1. I don't think wait_for_write_catchup is necessary, because
> > > > calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments
> > > > would already do the same thing.
> 
> Having a closer look, it does not seem to be the case. The default mode
> in wait_for_catchup() is 'replay' and the default mode for the lsn is 'write'.
> 
> But in wait_for_write_catchup() we are making use of 'write' for both.

But that turns
 $node->wait_for_catchup('foobar', 'write')
into
 $node->wait_for_write_catchup('foobar');
so I don't see much value in it.  Also, the patch series from which this
patch spawned in the first place doesn't wait for write AFAICS.

After adding some more POD docs for it, I pushed the one for replay.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
¡Ay, ay, ay!  Con lo mucho que yo lo quería (bis)
se fue de mi vera ... se fue para siempre, pa toíta ... pa toíta la vida
¡Ay Camarón! ¡Ay Camarón!                                (Paco de Lucía)



Re: Helper functions for wait_for_catchup() in Cluster.pm

От
"Drouvot, Bertrand"
Дата:
Hi,

On 2/13/23 11:58 AM, Alvaro Herrera wrote:
> On 2023-Jan-26, Drouvot, Bertrand wrote:
> 
>> Hi,
>>
>> On 1/26/23 10:42 AM, Alvaro Herrera wrote:
>>> On 2023-Jan-26, Drouvot, Bertrand wrote:
>>>
>>>> On 1/24/23 7:27 PM, Alvaro Herrera wrote:
>>>
>>>>> 1. I don't think wait_for_write_catchup is necessary, because
>>>>> calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments
>>>>> would already do the same thing.
>>
>> Having a closer look, it does not seem to be the case. The default mode
>> in wait_for_catchup() is 'replay' and the default mode for the lsn is 'write'.
>>
>> But in wait_for_write_catchup() we are making use of 'write' for both.
> 
> But that turns
>   $node->wait_for_catchup('foobar', 'write')
> into
>   $node->wait_for_write_catchup('foobar');
> so I don't see much value in it.

Agree.

>  Also, the patch series from which this
> patch spawned in the first place doesn't wait for write AFAICS.
> 

Right, it does wait for replay only.

> After adding some more POD docs for it, I pushed the one for replay.
> 

Thanks!

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com