Discussion: wal_segment size vs max_wal_size
There is apparently some misbehavior if max_wal_size is less than 5 * wal_segment_size.

For example, if you build with --with-wal-segsize=64, then the recovery test fails unless you set max_wal_size to at least 320MB in PostgresNode.pm. The issue is that pg_basebackup fails with:

pg_basebackup: could not get transaction log end position from server: ERROR: could not find any WAL files

This should probably be made friendlier in some way. But it also shows that bigger WAL segment sizes are apparently not well-chartered territory lately.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Sep 21, 2016 at 8:33 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> There is apparently some misbehavior if max_wal_size is less than 5 *
> wal_segment_size.
>
> For example, if you build with --with-wal-segsize=64, then the recovery
> test fails unless you set max_wal_size to at least 320MB in
> PostgresNode.pm. The issue is that pg_basebackup fails with:
>

In the recovery tests, max_wal_size is set to 128MB. Now, when you build with --with-wal-segsize=64, max_wal_size (expressed in segments) is calculated as follows:

max_wal_size = 128 / ((64 * 1024 * 1024) / (1024 * 1024)) = 2

and CheckPointSegments is calculated as follows:

CheckPointSegments = 2 / (2 + 0.5) = 0.8, rounded to 1. (Default is 3.)

Hence, checkpoints occur very frequently on the master.

> pg_basebackup: could not get transaction log end position from server:
> ERROR: could not find any WAL files

This error occurs when the recovery test tries to take a backup from the standby using the above settings.

pg_basebackup scans pg_xlog and includes all WAL files in the range between 'startptr' and 'endptr', regardless of the timeline the file is stamped with. 'startptr' is initialized to ControlFile->checkPointCopy.redo and 'endptr' is initialized to ControlFile->minRecoveryPoint. Now, whenever we redo a CHECKPOINT_ONLINE record, we update checkPointCopy.redo, and whenever we flush logs, we update minRecoveryPoint.

In this case, we have frequent checkpoints on the master, which in turn update checkPointCopy.redo on the standby frequently. Sometimes it even goes ahead of minRecoveryPoint. At this point the range from 'startptr' to 'endptr' contains no WAL files, so if you call pg_basebackup, it will throw the aforesaid error.

> This should probably be made friendlier in some way. But it also shows
> that bigger WAL segment sizes are apparently not well-chartered
> territory lately.
>

Well, there can be multiple solutions to this problem.
1. If somebody intends to increase the WAL segment size, they should increase max_wal_size accordingly.
2. In the recovery test, we can add some delay before taking the backup so that the pending logs in the buffer get flushed. (Not a good solution.)
3. In CreateRestartPoint(), we can force an XLogFlush to update minRecoveryPoint.

Thoughts?

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
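For readers who want to check the arithmetic in the message above, here is a minimal sketch in Python. It is not PostgreSQL code: the function names are hypothetical, the 0.5 is the checkpoint_completion_target value used in the message, and the truncate-then-clamp-to-1 step is an assumption about how the server arrives at the "rounded to 1" result.

```python
def max_wal_size_in_segments(max_wal_size_mb, wal_segment_size_mb):
    """Convert max_wal_size from MB into a count of WAL segments.

    Hypothetical helper; mirrors the 128MB / 64MB = 2 step in the message.
    """
    return max_wal_size_mb // wal_segment_size_mb


def checkpoint_segments(max_wal_segments, completion_target=0.5):
    """Derive CheckPointSegments from max_wal_size in segments.

    Assumption: the fractional result is truncated and never allowed to
    drop below 1, which reproduces the value quoted in the message.
    """
    target = max_wal_segments / (2.0 + completion_target)
    return max(1, int(target))


if __name__ == "__main__":
    # (max_wal_size in MB, WAL segment size in MB)
    for max_mb, seg_mb in [(128, 16), (128, 64), (320, 64)]:
        segs = max_wal_size_in_segments(max_mb, seg_mb)
        print(max_mb, seg_mb, segs, checkpoint_segments(segs))
```

Under these assumptions, 128MB with 16MB segments gives 8 segments and CheckPointSegments = 3, which matches the "Default is 3" remark. 128MB with 64MB segments gives 2 segments and CheckPointSegments = 1. The 320MB workaround gives 5 segments and CheckPointSegments = 2.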
On Mon, Sep 26, 2016 at 4:00 PM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:
> On Wed, Sep 21, 2016 at 8:33 PM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> There is apparently some misbehavior if max_wal_size is less than 5 *
>> wal_segment_size.
>>
>
>> This should probably be made friendlier in some way. But it also shows
>> that bigger WAL segment sizes are apparently not well-chartered
>> territory lately.
>>
> Well, there can be multiple solutions to this problem.
> 1. If somebody intends to increase wal segment size, he should
> increase max_wal_size accordingly.
> 2. In recovery test, we can add some delay before taking backup so
> that the pending logs in the buffer
> gets flushed. (Not a good solution)
> 3. In CreateRestartPoint() method, we can force a XLogFlush to update
> minRecoveryPoint.
>

IIRC, there is already a patch to update the minRecoveryPoint correctly, can you check if that solves the problem for you?

[1] - https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Mon, Sep 26, 2016 at 5:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> IIRC, there is already a patch to update the minRecoveryPoint
> correctly, can you check if that solves the problem for you?
>
> [1] - https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp
>

+1. I've tested after applying the patch. This clearly solves the problem.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
On Mon, Sep 26, 2016 at 9:30 PM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:
> On Mon, Sep 26, 2016 at 5:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> IIRC, there is already a patch to update the minRecoveryPoint
>> correctly, can you check if that solves the problem for you?
>>
>> [1] - https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp
>>
> +1. I've tested after applying the patch. This clearly solves the problem.

Even if many things have been discussed on this thread, Horiguchi-san's first patch is still the best approach found after several lookups and attempts when messing with the recovery code.

--
Michael
On 9/26/16 8:38 PM, Michael Paquier wrote:
> On Mon, Sep 26, 2016 at 9:30 PM, Kuntal Ghosh
> <kuntalghosh.2007@gmail.com> wrote:
>> On Mon, Sep 26, 2016 at 5:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>
>>> IIRC, there is already a patch to update the minRecoveryPoint
>>> correctly, can you check if that solves the problem for you?
>>>
>>> [1] - https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp
>>>
>> +1. I've tested after applying the patch. This clearly solves the problem.
>
> Even if many things have been discussed on this thread,
> Horiguchi-san's first patch is still the best approach found after
> several lookups and attempts when messing with the recovery code.

What is the status of that patch then? The above thread seems to have stopped.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Sep 30, 2016 at 11:05 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 9/26/16 8:38 PM, Michael Paquier wrote:
>> On Mon, Sep 26, 2016 at 9:30 PM, Kuntal Ghosh
>> <kuntalghosh.2007@gmail.com> wrote:
>>> On Mon, Sep 26, 2016 at 5:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>>
>>>> IIRC, there is already a patch to update the minRecoveryPoint
>>>> correctly, can you check if that solves the problem for you?
>>>>
>>>> [1] - https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp
>>>>
>>> +1. I've tested after applying the patch. This clearly solves the problem.
>>
>> Even if many things have been discussed on this thread,
>> Horiguchi-san's first patch is still the best approach found after
>> several lookups and attempts when messing with the recovery code.
>
> What is the status of that patch then? The above thread seems to have
> stopped.

The conclusion is to use the original patch proposed by Horiguchi-san, and with a test case I have added you get that:
https://www.postgresql.org/message-id/CAB7nPqTv5gmKQcNDoFGTGqoqXz2xLz4RRw247oqOJzZTVy6-7Q%40mail.gmail.com

--
Michael