Re: Making pg_rewind faster
| From | Robert Haas |
|---|---|
| Subject | Re: Making pg_rewind faster |
| Date | |
| Msg-id | CA+TgmoaZX4x3qHwxED+Y+2O+FTbyHz=HGW9mwPKGLv5YJbe08A@mail.gmail.com |
| In response to | Re: Making pg_rewind faster (Srinath Reddy Sadipiralla <srinath2133@gmail.com>) |
| Responses | Re: Making pg_rewind faster |
| List | pgsql-hackers |
On Thu, Oct 9, 2025 at 3:09 PM Srinath Reddy Sadipiralla <srinath2133@gmail.com> wrote:
> just a second late :( i was about to post a patch addressing the refactors which Robert mentioned, anyway will have a look at your latest patch John thanks :), curious about the tap test.
>
> while i was writing the patch something suddenly struck me, that is why we are even depending on last_common_segno, because once we reached decide_wal_file_action it means that the file exists in both target and source, AFAIK this can only happen with wal segments older than or equal to last_common_segno because once the promotion completes the filename of the WAL files gets changed with the new timelineID(2), for ex: if the last_common_segno is 000000010000000000000003 then based on the rules in XLogInitNewTimeline
> 1) if the timeline switch happens in middle of segment, copy data from the last WAL segment and create WAL file with same segno but different timelineID, in this case the starting WAL file for the new timeline will be 000000020000000000000003
> 2) if the timeline switch happens at segment boundary, just create next segment, for this case the starting WAL file for the new timeline will be 000000020000000000000004
>
> so basically the files which exists in source and not in target like the new timeline WAL segments will be copied to target in total before we reach decide_wal_file_action, so i think we don't need to think about copying WAL files after divergence point by calculating and checking against last_common_segno which we are doing in our current approach, i think we can just do

What makes me nervous about this is that it isn't necessarily the case that the servers were perfectly in sync at the time of the failure. Suppose that the primary was in the middle of writing 000000010000000000000003. The standby might also have this file, but it might contain less valid data than the one on the primary; therefore, if we don't copy the file, the two servers might not have an identical file. Maybe that wouldn't really matter, in the sense that the extra valid data that exists on the original primary shouldn't prevent it from replaying WAL on the new primary's timeline, which is probably all we really care about. But it feels dangerous to me.

--
Robert Haas
EDB: http://www.enterprisedb.com
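
For anyone following the file names in this thread, here is a minimal sketch of the naming rule described in the quoted message. It assumes the default 16 MB wal_segment_size; the helper names (`wal_file_name`, `first_segment_on_new_timeline`) are made up for illustration and are not PostgreSQL functions. It only reproduces the two cases listed above and does not model how much valid data each copy of a segment contains, which is the part I'm worried about.

```python
# Assumption: default wal_segment_size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
# Number of segments that fit in one 4 GB "xlogid" (256 with 16 MB segments).
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE


def wal_file_name(tli: int, segno: int) -> str:
    """Build a 24-hex-digit WAL file name: timeline, high part, low part."""
    return "%08X%08X%08X" % (
        tli,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


def first_segment_on_new_timeline(
    new_tli: int, last_common_segno: int, switch_mid_segment: bool
) -> str:
    """Hypothetical helper mirroring the two cases in the quoted message.

    Mid-segment switch: same segment number, new timeline ID.
    Switch at a segment boundary: next segment number, new timeline ID.
    """
    segno = last_common_segno if switch_mid_segment else last_common_segno + 1
    return wal_file_name(new_tli, segno)


if __name__ == "__main__":
    print(wal_file_name(1, 3))
    print(first_segment_on_new_timeline(2, 3, switch_mid_segment=True))
    print(first_segment_on_new_timeline(2, 3, switch_mid_segment=False))
```

Note that in both cases 000000010000000000000003 can exist on both servers under the same name, so the name alone says nothing about whether the two copies hold the same amount of valid WAL.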