Обсуждение: Helper functions for wait_for_catchup() in Cluster.pm
Hi hackers, please find attached a patch proposal to define $SUBJECT. The idea has been raised in [1], where we are adding more calls to wait_for_catchup() in 'replay' mode. The current code already has 25 of those, so the proposed patch is defining a new wait_for_replay_catchup() function. While at it, adding also: - wait_for_write_catchup(): called 5 times - wait_for_sent_catchup() and wait_for_flush_catchup() for consistency purpose (while there is currently no occurrences of wait_for_catchup() in 'sent' or 'flush' mode.). Looking forward to your feedback, Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com [1]: https://www.postgresql.org/message-id/20230106034036.2m4qnn7ep7b5ipet%40awork3.anarazel.de
Вложения
On 2023-Jan-17, Drouvot, Bertrand wrote: > The idea has been raised in [1], where we are adding more calls to > wait_for_catchup() in 'replay' mode. This seems mostly useless as presented. Maybe if you're able to reduce the noise on the second argument it would be worth something -- namely, if the wrapper function receives a node instead of an LSN: perhaps wait_for_replay_catchup() would use the flush LSN from the given node, wait_for_write_catchup() would use the write LSN, and wait_for_sent_catchup() would use the insert LSN. (I didn't check in your patch if there are callsites that do something else). This would in several cases let you also remove the line with the assignment of appropriate LSN to a separate variable. If you did it that way, maybe the code would become a tiny bit smaller overall. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Hi, On 1/17/23 12:23 PM, Alvaro Herrera wrote: > On 2023-Jan-17, Drouvot, Bertrand wrote: > >> The idea has been raised in [1], where we are adding more calls to >> wait_for_catchup() in 'replay' mode. > > This seems mostly useless as presented. Maybe if you're able to reduce > the noise on the second argument it would be worth something -- namely, > if the wrapper function receives a node instead of an LSN: perhaps > wait_for_replay_catchup() would use the flush LSN from the given node, > wait_for_write_catchup() would use the write LSN, and > wait_for_sent_catchup() would use the insert LSN. (I didn't check in > your patch if there are callsites that do something else). This would > in several cases let you also remove the line with the assignment of > appropriate LSN to a separate variable. If you did it that way, maybe > the code would become a tiny bit smaller overall. > Thanks for looking at it! The current calls are done that way: wait_for_replay_catchup called: - 8 times with write LSN as an argument - 1 time with insert LSN as an argument - 16 times with flush LSN as an argument wait_for_write_catchup called: - 5 times with write LSN as an argument So it looks like that providing a node as a second argument would not help for the wait_for_replay_catchup() case. Worth to use the node as an argument for wait_for_write_catchup()? (though it would be weird to have different types of arguments between wait_for_replay_catchup() and wait_for_write_catchup()). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 2023-Jan-18, Drouvot, Bertrand wrote: > The current calls are done that way: > > wait_for_replay_catchup called: > - 8 times with write LSN as an argument > - 1 time with insert LSN as an argument > - 16 times with flush LSN as an argument > So it looks like that providing a node as a second argument > would not help for the wait_for_replay_catchup() case. ... unless we changed the calls that wait for reply that use write or insert so that they use flush instead. Surely everything should still work, right? Flushing would still occur, either right after the write (as the transaction commits) or ~200ms afterwards when WAL writer catches up to that point. I suppose this may fail to be true if there is some test that is specifically testing whether writing WAL without flushing works, which should rare enough, but if it does exist, in that one place we can use the underlying wait_for_catchup(). -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Hi, On 1/18/23 10:59 AM, Alvaro Herrera wrote: > On 2023-Jan-18, Drouvot, Bertrand wrote: > >> The current calls are done that way: >> >> wait_for_replay_catchup called: >> - 8 times with write LSN as an argument >> - 1 time with insert LSN as an argument >> - 16 times with flush LSN as an argument > >> So it looks like that providing a node as a second argument >> would not help for the wait_for_replay_catchup() case. > > ... unless we changed the calls that wait for reply that use write or > insert so that they use flush instead. That's a good idea, thanks! Please find attached V2 doing so. > Surely everything should still > work, right? Right. > Flushing would still occur, either right after the write > (as the transaction commits) or ~200ms afterwards when WAL writer > catches up to that point. > > I suppose this may fail to be true if there is some test that is > specifically testing whether writing WAL without flushing works, which > should rare enough, but if it does exist, I don't see this kind of test. Please note that V2 does not contain wait_for_flush_catchup() and wait_for_sent_catchup() anymore as: 1) they are not used yet and 2) it lets to their author (if any) decide the node->lsn() mode to be used. This is also mentioned as a comment in V2. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Looking again, I have two thoughts for making things easier: 1. I don't think wait_for_write_catchup is necessary, because calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments would already do the same thing. So what we should do is patch places that currently give those two arguments, so that they don't. 2. Because wait_for_replay_catchup is an instance method, passing the second node as argument is needlessly noisy, because that's already known as $self. So we can just say $primary_node->wait_for_replay_catchup($standby_node); -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Hi, On 1/24/23 7:27 PM, Alvaro Herrera wrote: > Looking again, I have two thoughts for making things easier: > > 1. I don't think wait_for_write_catchup is necessary, because > calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments > would already do the same thing. So what we should do is patch places > that currently give those two arguments, so that they don't. > Agree but there is one place where the node passed as the second argument is not the "$self": src/bin/pg_rewind/t/007_standby_source.pl:$node_b->wait_for_write_catchup('node_c', $node_a); So it looks like there is still a need for wait_for_write_catchup(). > 2. Because wait_for_replay_catchup is an instance method, passing the > second node as argument is needlessly noisy, because that's already > known as $self. So we can just say > > $primary_node->wait_for_replay_catchup($standby_node); > Yeah, but same here, there is places where the node passed as the second argument is not the "$self": src/bin/pg_rewind/t/007_standby_source.pl:$node_b->wait_for_replay_catchup('node_c', $node_a); src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary); src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary); src/test/recovery/t/001_stream_rep.pl: $node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary); So it looks like there is still a need for wait_for_replay_catchup() with 2 parameters. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 2023-Jan-26, Drouvot, Bertrand wrote: > On 1/24/23 7:27 PM, Alvaro Herrera wrote: > > 1. I don't think wait_for_write_catchup is necessary, because > > calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments > > would already do the same thing. So what we should do is patch places > > that currently give those two arguments, so that they don't. > > Agree but there is one place where the node passed as the second argument is not the "$self": > > src/bin/pg_rewind/t/007_standby_source.pl:$node_b->wait_for_write_catchup('node_c', $node_a); > > So it looks like there is still a need for wait_for_write_catchup(). Hmm, I think that one can use the more general wait_for_catchup. > > 2. Because wait_for_replay_catchup is an instance method, passing the > > second node as argument is needlessly noisy, because that's already > > known as $self. So we can just say > > > > $primary_node->wait_for_replay_catchup($standby_node); > > Yeah, but same here, there is places where the node passed as the second argument is not the "$self": > > src/bin/pg_rewind/t/007_standby_source.pl:$node_b->wait_for_replay_catchup('node_c', $node_a); > src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary); > src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary); > src/test/recovery/t/001_stream_rep.pl: $node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary); > > So it looks like there is still a need for wait_for_replay_catchup() with 2 parameters. Ah, cascading replication. In that case, let's make the second parameter optional. If it's not given, $self is used. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "En las profundidades de nuestro inconsciente hay una obsesiva necesidad de un universo lógico y coherente. Pero el universo real se halla siempre un paso más allá de la lógica" (Irulan)
Hi, On 1/26/23 10:42 AM, Alvaro Herrera wrote: > On 2023-Jan-26, Drouvot, Bertrand wrote: > >> On 1/24/23 7:27 PM, Alvaro Herrera wrote: > >>> 1. I don't think wait_for_write_catchup is necessary, because >>> calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments >>> would already do the same thing. Having a closer look, it does not seem to be the case. The default mode in wait_for_catchup() is 'replay' and the default mode for the lsn is 'write'. But in wait_for_write_catchup() we are making use of 'write' for both. > >>> 2. Because wait_for_replay_catchup is an instance method, passing the >>> second node as argument is needlessly noisy, because that's already >>> known as $self. So we can just say >>> >>> $primary_node->wait_for_replay_catchup($standby_node); >> >> Yeah, but same here, there is places where the node passed as the second argument is not the "$self": >> >> src/bin/pg_rewind/t/007_standby_source.pl:$node_b->wait_for_replay_catchup('node_c', $node_a); >> src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary); >> src/test/recovery/t/001_stream_rep.pl:$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary); >> src/test/recovery/t/001_stream_rep.pl: $node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary); >> >> So it looks like there is still a need for wait_for_replay_catchup() with 2 parameters. > > Ah, cascading replication. In that case, let's make the second > parameter optional. If it's not given, $self is used. > Good point, done in V3 attached. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
On 2023-Jan-26, Drouvot, Bertrand wrote: > Hi, > > On 1/26/23 10:42 AM, Alvaro Herrera wrote: > > On 2023-Jan-26, Drouvot, Bertrand wrote: > > > > > On 1/24/23 7:27 PM, Alvaro Herrera wrote: > > > > > > 1. I don't think wait_for_write_catchup is necessary, because > > > > calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments > > > > would already do the same thing. > > Having a closer look, it does not seem to be the case. The default mode > in wait_for_catchup() is 'replay' and the default mode for the lsn is 'write'. > > But in wait_for_write_catchup() we are making use of 'write' for both. But that turns $node->wait_for_catchup('foobar', 'write') into $node->wait_for_write_catchup('foobar'); so I don't see much value in it. Also, the patch series from which this patch spawned in the first place doesn't wait for write AFAICS. After adding some more POD docs for it, I pushed the one for replay. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ ¡Ay, ay, ay! Con lo mucho que yo lo quería (bis) se fue de mi vera ... se fue para siempre, pa toíta ... pa toíta la vida ¡Ay Camarón! ¡Ay Camarón! (Paco de Lucía)
Hi, On 2/13/23 11:58 AM, Alvaro Herrera wrote: > On 2023-Jan-26, Drouvot, Bertrand wrote: > >> Hi, >> >> On 1/26/23 10:42 AM, Alvaro Herrera wrote: >>> On 2023-Jan-26, Drouvot, Bertrand wrote: >>> >>>> On 1/24/23 7:27 PM, Alvaro Herrera wrote: >>> >>>>> 1. I don't think wait_for_write_catchup is necessary, because >>>>> calling wait_for_catchup() and omitting the 'mode' and 'lsn' arguments >>>>> would already do the same thing. >> >> Having a closer look, it does not seem to be the case. The default mode >> in wait_for_catchup() is 'replay' and the default mode for the lsn is 'write'. >> >> But in wait_for_write_catchup() we are making use of 'write' for both. > > But that turns > $node->wait_for_catchup('foobar', 'write') > into > $node->wait_for_write_catchup('foobar'); > so I don't see much value in it. Agree. > Also, the patch series from which this > patch spawned in the first place doesn't wait for write AFAICS. > Right, it does wait for replay only. > After adding some more POD docs for it, I pushed the one for replay. > Thanks! Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com