>> Speeding up recovery or failover activity via a faster promote is a
>> desirable thing. So, maybe, we should look at teaching the relevant
>> code about using "KnownPreparedList"? I know that would increase the
>> size of this patch and would mean more testing, but this seems to be
>> the last remaining optimization in this code path.
>
> That's a good idea, worth having in this patch. Actually we may not
> want to call KnownPreparedRecreateFiles() here, as promotion has not
> been synonymous with an end-of-recovery checkpoint for a couple of
> releases now.
>
Once implemented, a good way to performance-test this could be to set
checkpoint_timeout to a large value, like an hour, then generate
enough 2PC WAL while ensuring that no checkpoint happens, automatically
or otherwise. We could then measure the time taken to recover on
startup to see how effective the change is.
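
As a rough, untested sketch of the workload generator, something like
the script below could emit the SQL for psql. The table name, the GID
prefix and the function name are made up for illustration, and it
assumes max_prepared_transactions is set high enough for the number of
transactions left prepared (and at least 1 in any case).

```python
# Hypothetical helper: builds a SQL script that generates 2PC WAL.
# Nothing here is part of the patch; names are illustrative only.

def make_2pc_script(n_xacts, table="twophase_bench", keep_prepared=0):
    """Return SQL text running n_xacts two-phase transactions.

    The last keep_prepared transactions are left in the prepared
    state, so they are still outstanding when the server is stopped.
    """
    if not 0 <= keep_prepared <= n_xacts:
        raise ValueError("keep_prepared must be between 0 and n_xacts")

    lines = [f"CREATE TABLE IF NOT EXISTS {table} (id int);"]
    for i in range(n_xacts):
        gid = f"bench_{i}"
        lines.append("BEGIN;")
        lines.append(f"INSERT INTO {table} VALUES ({i});")
        lines.append(f"PREPARE TRANSACTION '{gid}';")
        # Commit everything except the trailing keep_prepared xacts.
        if i < n_xacts - keep_prepared:
            lines.append(f"COMMIT PREPARED '{gid}';")
    return "\n".join(lines) + "\n"
```

The output can be written to a file and run with psql -f. With
checkpoint_timeout = 1h (and, I assume, max_wal_size raised enough that
WAL volume does not trigger a checkpoint either), stopping the server
with "pg_ctl stop -m immediate" and timing the next startup should show
the recovery cost with and without the patch.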

> Most of the 2PC syncs just won't happen: such transactions normally
> don't last long, and the number you would get during a checkpoint is
> much lower than what would happen between two checkpoints. When
> working on Postgres-XC, the number of 2PC was really more funky.
>
Yes, Postgres-XL is full of 2PC, so hopefully this optimization should
help a lot in that case as well.

Regards,
Nikhils
-- 
 Nikhil Sontakke                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services