Обсуждение: Fwd: Re: New 8.4 hot standby feature
On Tuesday 27 January 2009 13:13:32 you wrote: > On Tue, 2009-01-27 at 12:53 -0500, Gabi Julien wrote: > > I have merged the last hot standby patch (v9g) to 8.4 devel and I am > > pleased with the experience. This is promising stuff. Perhaps it is a bit > > too soon to ask questions here but here it is: > > > > 1. Speed of recovery > > > > With a archive_timeout of 60 seconds, it can take about 4 minutes before > > I see the reflected changes in the replica. This is normal since, in > > addition to the WAL log shipping, it takes more time to do the recovery > > itself. Still, is there any way besides the archive_timeout config option > > to speed up the recovery of WAL logs on the hot standby? > > Is the recovery itself the bottleneck? Yes, the logs are shipped every minute but the recevory is 3 or 4 times longer. > > There's a performance improvement submitted here: > http://archives.postgresql.org/message-id/a778a7260810280033n43f70d36x8c437 >eacf9a5461e@mail.gmail.com > > But I haven't been following the development of it closely, so you'll > have to read the thread to see whether it will meet your needs or not. Thanks I will take a look at it. Also, I came across the record log shipping feature too in my research: http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STANDBY-RECORD Could this help? If the logs are smaller then I could potentially afford shipping then at a higher frequency. > > Regards, > Jeff Davis -------------------------------------------------------
On Tue, 2009-01-27 at 14:28 -0500, Gabi Julien wrote: > Could this help? If the logs are smaller then I could potentially afford > shipping then at a higher frequency. > See if there are times during which the recovery process isn't doing anything (i.e. just waiting for WAL data). If so, something like this might help. If it's constantly working as hard as it can, then probably not. An important question you should ask yourself is whether it can keep up in the steady state at all. If the primary is producing segments faster than the standby is recovering them, I don't think there's any way around that. Regards, Jeff Davis
Hi, On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com> wrote: > Yes, the logs are shipped every minute but the recevory is 3 or 4 times > longer. Are you disabling full_page_writes? It may slow down recovery several times. > Thanks I will take a look at it. Also, I came across the record log shipping > feature too in my research: > > http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STANDBY-RECORD > > Could this help? If the logs are smaller then I could potentially afford > shipping then at a higher frequency. No. Even if the logs are shipped frequently, they cannot be applied until the log file fills. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Fujii Masao wrote:
Is pg_clearxlogtail going to be in contrib or integrated in some other way?Hi, On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com> wrote:Yes, the logs are shipped every minute but the recevory is 3 or 4 times longer.Are you disabling full_page_writes? It may slow down recovery several times.Thanks I will take a look at it. Also, I came across the record log shipping feature too in my research: http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STANDBY-RECORD Could this help? If the logs are smaller then I could potentially afford shipping then at a higher frequency.No. Even if the logs are shipped frequently, they cannot be applied until the log file fills. Regards,
Hi, On Thu, Jan 29, 2009 at 12:23 AM, Jason Long <mailing.list@supernovasoftware.com> wrote: > Is pg_clearxlogtail going to be in contrib or integrated in some other way? I also hope so. The related topic was discussed before. http://archives.postgresql.org/pgsql-hackers/2009-01/msg00639.php Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Tuesday 27 January 2009 16:25:44 you wrote: > On Tue, 2009-01-27 at 14:28 -0500, Gabi Julien wrote: > > Could this help? If the logs are smaller then I could potentially afford > > shipping then at a higher frequency. > > See if there are times during which the recovery process isn't doing > anything (i.e. just waiting for WAL data). If so, something like this > might help. If it's constantly working as hard as it can, then probably > not. > > An important question you should ask yourself is whether it can keep up > in the steady state at all. If the primary is producing segments faster > than the standby is recovering them, I don't think there's any way > around that. The load on the slave is close to 0 so it does not explain the speed of recovery. Also the shipping of the 16MB WAL log takes only 1 second on the LAN. I guess the problem is probably what Fujii Masao explained. The WAL log shipped are not yet usable or something like that. I won't try to increase the frequency of log shipping because of that. Also, my setting of 60 seconds is the lowest frequency suggested by the documentation anyways. However, I have found the v4 patch about the PITR performance improvement. I will give it a try and report here. I might try pg_clearxlogtail too if I have time. > > Regards, > Jeff Davis
On Tuesday 27 January 2009 21:47:36 you wrote: > Hi, > > On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com> wrote: > > Yes, the logs are shipped every minute but the recevory is 3 or 4 times > > longer. > > Are you disabling full_page_writes? It may slow down recovery several > times. It looks like you found my problem. Everything I needed to know is described here: http://www.postgresql.org/docs/8.3/interactive/wal-configuration.html Setting checkpoint_timeout to 55 seconds speeds up the recovery to the level I want. Ironically, it makes the pg_last_recovered_xact_timestamp() function more reliable too on how up to date the replica is. I am not sure that I can take this for granted however. I will disable full_page_writes to make sure this agressive checkpoint_timeout setting won't slow down my master database too much. Can I be reassured on the fact that, if the master database crashes and some data is lost, at least the replica would keep its integrity (even though it is not in sync)? My settings: full_page_writes = off checkpoint_timeout = 55s checkpoint_completion_target = 0.7 archive_mode = on archive_command = './archive_command.sh "%p" "%f"' archive_timeout = 60 Also, would it be possible to recompile postgresql by using a different size (smaller) then 16M for WAL logs and would that be a smart thing to try? Thanks a lot to all of you. > > > Thanks I will take a look at it. Also, I came across the record log > > shipping feature too in my research: > > > > http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STAN > >DBY-RECORD > > > > Could this help? If the logs are smaller then I could potentially afford > > shipping then at a higher frequency. > > No. Even if the logs are shipped frequently, they cannot be applied until > the log file fills. > > Regards,
On Wednesday 28 January 2009 18:35:18 Gabi Julien wrote: > On Tuesday 27 January 2009 21:47:36 you wrote: > > Hi, > > > > On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com> > > wrote: > > > Yes, the logs are shipped every minute but the recevory is 3 or 4 times > > > longer. > > > > Are you disabling full_page_writes? It may slow down recovery several > > times. > > It looks like you found my problem. Everything I needed to know is > described here: > > http://www.postgresql.org/docs/8.3/interactive/wal-configuration.html > > Setting checkpoint_timeout to 55 seconds speeds up the recovery to the > level I want. Ironically, it makes the pg_last_recovered_xact_timestamp() > function more reliable too on how up to date the replica is. I am not sure > that I can take this for granted however. This is a good question actually. If I set the checkpoint_timeout to something less then the archive_timeout, can I take this for granted the fact that pg_last_recovered_xact_timestamp() will always accurately tell me how up to date the replica is?