On Tue, Apr 11, 2023 at 01:10:57PM -0700, Andres Freund wrote:
> On 2023-04-11 11:04:50 +0200, Drouvot, Bertrand wrote:
> > On 4/11/23 10:55 AM, Drouvot, Bertrand wrote:
> > > I think we might want to add:
> > >
> > > $node_primary->wait_for_replay_catchup($node_standby);
> > >
> > > before calling the slot creation.
> Pushed. Seems like a clear race in the test, so I didn't think it was worth
> waiting for testing it on hoverfly.
We'll see what happens in the next run.
> I think we should lower the log level, but perhaps wait for a few more cycles
> in case there are random failures?
Fine with me.
> I wonder if we should make the connections in poll_query_until to reduce
> verbosity - it's pretty annoying how much that can bloat the log. Perhaps also
> introduce some backoff? It's really annoying to have to trawl through all
> those logs when there's a problem.
Agreed. My ranked wish list for poll_query_until is:
1. Exponential backoff
2. Closed-loop time control via Time::HiRes or similar, instead of assuming
that ten loops complete in ~1s. I've seen the loop take 3x as long as the
intended timeout.
3. Connect less often than today's once per probe
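
To make items 1 and 2 concrete, here is a rough sketch, not a patch. The
name poll_with_backoff, its option names, and the default values are all
made up for illustration; it is not the existing poll_query_until code, and
it takes a generic probe callback rather than running psql. It does nothing
for item 3.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper, not the real poll_query_until.  $probe is a code
# ref that returns true once the awaited condition holds.  Returns the
# number of probes made on success, or 0 on timeout.
sub poll_with_backoff
{
    my ($probe, %opt) = @_;
    my $timeout   = $opt{timeout}   // 180;     # assumed default, seconds
    my $delay     = $opt{initial}   // 0.01;    # assumed first sleep, seconds
    my $max_delay = $opt{max_delay} // 1.0;     # assumed cap on one sleep

    # Time::HiRes::time() is wall-clock time with sub-second resolution;
    # a monotonic clock would be more robust where one is available.
    my $deadline = time() + $timeout;
    my $attempts = 0;

    while (1)
    {
        $attempts++;
        return $attempts if $probe->();

        # Closed-loop control: check the time actually remaining rather
        # than counting iterations and assuming each one costs ~0.1s.
        my $remaining = $deadline - time();
        return 0 if $remaining <= 0;

        # Never sleep past the deadline.
        sleep($delay < $remaining ? $delay : $remaining);

        # Exponential backoff, capped at $max_delay.
        $delay *= 2;
        $delay = $max_delay if $delay > $max_delay;
    }
}
```

A caller would pass something like sub { $node->safe_psql(...) eq 't' } as
the probe. Since the deadline is checked against the clock, a slow probe
eats into the budget instead of silently stretching the total wait.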