Discussion: Summary of plans to avoid the annoyance of Freezing
Freezing is painful for VLDBs and high transaction rate systems. We have a number of proposals to improve things...
1. Allow parallel cores to be used with vacuumdb (Dilip)
The idea is that if we have to get the job done, let's do it as fast as we can using brute force. Reasonable thinking, but there are more efficient approaches.
2. Freeze Map (Sawada)
This works and we have a mostly-ready patch. I'm down to do final review on this, which is why I'm producing this summary and working out what's the best next action for Postgres.
3. Speed up autovacuums when they are triggered to avoid wraparounds (Simon)
Idea is to do a VACUUM scan which only freezes tuples. If we dirty a page from freezing we then also prune it, but don't attempt to scan indexes to remove the now-truncated dead tuples.
This looks very straightforward, no technical issues. Might even be able to backpatch it.
[patch investigated but not finished yet]
4. 64-bit Xids (Alexander)
Proposal rejected
5. 64-bit Xid Visibility, with Xid Epoch in page header (Heikki)
* Epoch is stored on new pages, using a new page format. Epoch of existing pages is stored in control file - code changes to bufpage.c look isolated
* All transaction age tests are made using both xid and epoch - the code changes to tqual.c look isolated
* We won't need to issue anti-wraparound vacuums again, ever
* We will still need to issue clog-trimming vacuums, so we'll need a new parameter to control when to start scanning to avoid clog growing too large
* relfrozenxid will be set to InvalidTransactionId once all existing pages have been made visible
* relhintxid and relhintepoch will be required so we have a backstop to indicate where to truncate clog (and same for multixacts)
* There is no need to FREEZE except when UPDATEing/DELETEing pages from previous epochs
* VACUUM FREEZE will only freeze old pages; on a new cluster it will work same as VACUUM
* VACUUMs that touch every non-all-visible page will be able to set relhintxid to keep clog trimmed, so never need to scan all blocks in table
* Code changes seem fairly isolated
* No action required at pg_upgrade
* Additional benefit: we can move to 32-bit CRC checksums on data pages at same time as doing this (seamlessly).
* 8 bytes additional space required per page (~0.1% growth in database size; with the default 8192-byte page, 8 / 8192 is about 0.098%)
* (Any other changes to page headers can be made or reserved at this time)
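The epoch-plus-xid age test described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from any patch in this thread: the type and function names (`full_xid`, `make_full_xid`, `xid32_precedes`, `full_xid_precedes`) are hypothetical, and special xids such as frozen or bootstrap values are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration only. */
typedef uint32_t xid32;     /* a 32-bit xid as stored in a tuple header */
typedef uint64_t full_xid;  /* epoch in the high 32 bits, xid in the low 32 bits */

/* Combine a page-level epoch with a 32-bit tuple xid into one 64-bit value. */
static inline full_xid
make_full_xid(uint32_t page_epoch, xid32 xid)
{
    return ((full_xid) page_epoch << 32) | xid;
}

/*
 * Circular 32-bit comparison: only meaningful while the two xids are less
 * than 2^31 apart, which is the limit that forces freezing today.
 */
static inline bool
xid32_precedes(xid32 a, xid32 b)
{
    return (int32_t) (a - b) < 0;
}

/* With the epoch attached, ordering is a plain integer comparison. */
static inline bool
full_xid_precedes(full_xid a, full_xid b)
{
    return a < b;
}
```

With these definitions, xid 100 and xid 3000000000 in the same epoch are ordered correctly by `full_xid_precedes`, while `xid32_precedes` gets the answer wrong because the two values are more than 2^31 apart.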
To me 3) would have been useful if we'd done it earlier. Now that we have 2) and 5), I don't see much point pursuing 3).
The main options are 2) or 5)
Freeze Map (2) makes Freezing more efficient for larger tables, but doesn't avoid the need altogether. 5) is a deeper treatment of the underlying problem and is a better architecture for the future of Postgres, IMHO.
I was previously a proponent of (2) as a practical way forwards, but my proposal here today is that we don't do anything further on 2) yet, and seek to make progress on 5) instead.
If 5) fails to bring a workable solution by the Jan 2016 CF then we commit 2) instead.
If Heikki wishes to work on (5), that's good. Otherwise, I think it's something I can understand and deliver by 1 Jan, though likely for 1 Nov CF.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Aug 9, 2015 at 11:03 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> If 5) fails to bring a workable solution by the Jan 2016 CF then we commit
> 2) instead.
Is there actually a conflict there? I didn't think so.
--
Peter Geoghegan
On 10 August 2015 at 07:14, Peter Geoghegan <pg@heroku.com> wrote:
> On Sun, Aug 9, 2015 at 11:03 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > If 5) fails to bring a workable solution by the Jan 2016 CF then we commit
> > 2) instead.
>
> Is there actually a conflict there? I didn't think so.
I didn't explain myself fully, thank you for asking.
Having a freeze map would be wholly unnecessary if we don't ever need to freeze whole tables again. Freezing would still be needed on individual blocks where an old row has been updated or deleted; a freeze map would not help there either.
So there is no conflict, but options 2) and 3) are completely redundant if we go for 5). After investigation, I now think 5) is achievable in 9.6, but if I am wrong for whatever reason, we have 2) as a backstop.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2015-08-10 07:26:29 +0100, Simon Riggs wrote:
> On 10 August 2015 at 07:14, Peter Geoghegan <pg@heroku.com> wrote:
> > On Sun, Aug 9, 2015 at 11:03 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > > If 5) fails to bring a workable solution by the Jan 2016 CF then we commit
> > > 2) instead.
> >
> > Is there actually a conflict there? I didn't think so.
>
> I didn't explain myself fully, thank you for asking.
>
> Having a freeze map would be wholly unnecessary if we don't ever need to
> freeze whole tables again. Freezing would still be needed on individual
> blocks where an old row has been updated or deleted; a freeze map would not
> help there either.
>
> So there is no conflict, but options 2) and 3) are completely redundant if
> we go for 5). After investigation, I now think 5) is achievable in 9.6, but
> if I am wrong for whatever reason, we have 2) as a backstop.
I don't think that's true. You can't ever delete the clog without freezing. There's no need for anti-wraparound scans anymore, but you still need to freeze once.
Andres
On 08/10/2015 11:17 AM, Andres Freund wrote:
> On 2015-08-10 07:26:29 +0100, Simon Riggs wrote:
>> On 10 August 2015 at 07:14, Peter Geoghegan <pg@heroku.com> wrote:
>>> On Sun, Aug 9, 2015 at 11:03 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>> If 5) fails to bring a workable solution by the Jan 2016 CF then we commit
>>>> 2) instead.
>>>
>>> Is there actually a conflict there? I didn't think so.
>>
>> I didn't explain myself fully, thank you for asking.
>>
>> Having a freeze map would be wholly unnecessary if we don't ever need to
>> freeze whole tables again. Freezing would still be needed on individual
>> blocks where an old row has been updated or deleted; a freeze map would not
>> help there either.
>>
>> So there is no conflict, but options 2) and 3) are completely redundant if
>> we go for 5). After investigation, I now think 5) is achievable in 9.6, but
>> if I am wrong for whatever reason, we have 2) as a backstop.
>
> I don't think that's true. You can't ever delete the clog without
> freezing. There's no need for anti-wraparound scans anymore, but you
> still need to freeze once.
What's your definition of freezing? As long as you remove all dead tuples, you can just leave the rest in place with their original XID+epoch, and assume that everything old enough is committed.
- Heikki
On 2015-08-10 11:25:37 +0300, Heikki Linnakangas wrote:
> On 08/10/2015 11:17 AM, Andres Freund wrote:
> >On 2015-08-10 07:26:29 +0100, Simon Riggs wrote:
> >>So there is no conflict, but options 2) and 3) are completely redundant if
> >>we go for 5). After investigation, I now think 5) is achievable in 9.6, but
> >>if I am wrong for whatever reason, we have 2) as a backstop.
> >
> >I don't think that's true. You can't ever delete the clog without
> >freezing. There's no need for anti-wraparound scans anymore, but you
> >still need to freeze once.
>
> What's your definition of freezing? As long as you remove all dead tuples,
> you can just leave the rest in place with their original XID+epoch in place,
> and assume that everything old enough is committed.
Hm. Right. -ENCOFFEE (I really ran out of beans). Sorry for that.
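Heikki's point, that anything older than some horizon can be assumed committed once dead tuples have been removed, can be sketched as below. This is a hedged illustration only: `xid_status`, `clog_lookup_fn`, `xid_status_with_horizon` and `toy_clog_lookup` are hypothetical names, the horizon stands in for whatever value the proposal would track (the summary calls it relhintxid/relhintepoch), and the toy lookup is not how the real clog works.

```c
#include <stdint.h>

/* Hypothetical status values for illustration. */
typedef enum { XID_IN_PROGRESS, XID_COMMITTED, XID_ABORTED } xid_status;

/* Stand-in for a commit-log lookup keyed by a 64-bit (epoch + xid) value. */
typedef xid_status (*clog_lookup_fn)(uint64_t full_xid);

/*
 * If the transaction is older than the horizon below which all dead tuples
 * have already been removed, treat it as committed without consulting the
 * clog. Only newer transactions need a lookup, so the clog below the
 * horizon can be truncated.
 */
static xid_status
xid_status_with_horizon(uint64_t full_xid, uint64_t clog_horizon,
                        clog_lookup_fn lookup)
{
    if (full_xid < clog_horizon)
        return XID_COMMITTED;   /* no clog access needed */
    return lookup(full_xid);
}

/* Toy lookup for demonstration: odd xids aborted, even xids committed. */
static xid_status
toy_clog_lookup(uint64_t full_xid)
{
    return (full_xid % 2 == 1) ? XID_ABORTED : XID_COMMITTED;
}
```

The design choice this illustrates is that the tuple keeps its original xid; nothing on the page has to be rewritten for the clog to shrink.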
Simon,
Thank you for this summary! I was losing track, myself.
On 08/09/2015 11:03 PM, Simon Riggs wrote:
> Freezing is painful for VLDBs and high transaction rate systems. We have
> a number of proposals to improve things...
> 3. Speed up autovacuums when they are triggered to avoid wraparounds (Simon)
> Idea is to do a VACUUM scan which only freezes tuples. If we dirty a
> page from freezing we then also prune it, but don't attempt to scan
> indexes to remove the now-truncated dead tuples.
> This looks very straightforward, no technical issues. Might even be able
> to backpatch it.
> [patch investigated but not finished yet]
There's a lesser version of this item which remains relevant unless we implement (5). That is, currently the same autovacuum_vacuum_cost_delay (AVVD) applies to regular autovacuums as to anti-wraparound autovacuums. If the user has set AV with a high delay, this means that anti-wraparound AV may never complete. For that reason, we ought to have a separate delay parameter for anti-wraparound autovacuums, which defaults to a lower number (like 5ms), or even to zero.
Of course, if we implement (5), that's not necessary, since AV will never trigger an anti-wraparound freeze.
> Having a freeze map would be wholly unnecessary if we don't ever need to
> freeze whole tables again. Freezing would still be needed on individual
> blocks where an old row has been updated or deleted; a freeze map would
> not help there either.
>
> So there is no conflict, but options 2) and 3) are completely redundant
> if we go for 5). After investigation, I now think 5) is achievable in
> 9.6, but if I am wrong for whatever reason, we have 2) as a backstop.
It's not redundant. Users may still want to freeze for two reasons:
1. to shrink the clog and multixact logs
2. to support INDEX-ONLY SCAN
In both of those cases, having a freeze map would speed up the manual vacuum freeze considerably. Otherwise, we're just punting on the problem, and making it worse for users who wait too long.
Now, it might still be the case that the *overhead* of a freeze map is a bad tradeoff if we don't have to worry about forced wraparound. But that's a different argument.
BTW, has it occurred to anyone that implementing XID epoch headers is going to mean messing with multixact logs again? Just thought I'd open a papercut and pour some lemon juice on it.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 10 August 2015 at 18:02, Josh Berkus <josh@agliodbs.com> wrote:
> There's a lesser version of this item which remains relevant unless we
> implement (5). That is, currently the same autovacuum_vacuum_cost_delay
> (AVVD) applies to regular autovacuums as does to anti-wraparound
> autovacuums. If the user has set AV with a high delay, this means that
> anti-wraparound AV may never complete. For that reason, we ought to
> have a separate parameter for AVVD, which defaults to a lower number
> (like 5ms), or even to zero.
>
> Of course, if we implement (5), that's not necessary, since AV will
> never trigger an anti-wraparound freeze.
Good idea.
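The separate-delay idea can be sketched in a few lines. This is only an illustration of the proposal as stated in the thread: `av_delay_settings`, its fields, and `choose_vacuum_delay_ms` are hypothetical, and `wraparound_delay_ms` does not correspond to any existing setting.

```c
#include <stdbool.h>

/* Hypothetical settings struct for illustration only. */
typedef struct
{
    int regular_delay_ms;     /* stands in for the existing autovacuum delay */
    int wraparound_delay_ms;  /* the proposed separate, lower setting */
} av_delay_settings;

/*
 * Pick the throttling delay for a vacuum run. An anti-wraparound run uses
 * its own (lower) delay so that a high regular delay cannot keep it from
 * ever completing.
 */
static int
choose_vacuum_delay_ms(const av_delay_settings *settings, bool for_wraparound)
{
    return for_wraparound ? settings->wraparound_delay_ms
                          : settings->regular_delay_ms;
}
```

For example, with an illustrative regular delay of 20 ms and the 5 ms figure suggested above for the wraparound case, an anti-wraparound run would be throttled at 5 ms while ordinary autovacuums keep the 20 ms delay.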
> > Having a freeze map would be wholly unnecessary if we don't ever need to
> > freeze whole tables again. Freezing would still be needed on individual
> > blocks where an old row has been updated or deleted; a freeze map would
> > not help there either.
> >
> > So there is no conflict, but options 2) and 3) are completely redundant
> > if we go for 5). After investigation, I now think 5) is achievable in
> > 9.6, but if I am wrong for whatever reason, we have 2) as a backstop.
>
> It's not redundant. Users may still want to freeze for two reasons:
> 1. to shrink the clog and multixact logs
> 2. to support INDEX-ONLY SCAN
Freezing is not a necessary pre-condition for either of those things, I am happy to say. There is confusion here because for (1) the shrink was performed after freezing, but when you have access to the epoch there is no need for exhaustive freezing, only in special cases, as noted. If we are lucky those special cases will mean a massive reduction in I/O. For (2) a normal VACUUM is sufficient and, as Robert observes, maybe just HOT is enough.
In the new world, the clog can be shrunk when everything has been hinted. Given that is possible with just a normal VACUUM, I think the new anti-freeze design (hey, we need a name, right?) will mean the clog actually stays smaller in most cases than it does now.
> In both of those cases, having a freeze map would speed up the manual
> vacuum freeze considerably. Otherwise, we're just punting on the
> problem, and making it worse for users who wait too long.
There would be no further need for the VACUUM FREEZE command. It would do nothing desirable.
> Now, it might still be the case that the *overhead* of a freeze map is a
> bad tradeoff if we don't have to worry about forced wraparound. But
> that's a different argument.
I myself was in favour of the freeze map solution for some time, but I'm not anymore. Thank the discussions at PGCon slowly working their way into my brain.
> BTW, has it occurred to anyone that implementing XID epoch headers is
> going to mean messing with multixact logs again? Just thought I'd open
> a papercut and pour some lemon juice on it.
I doubt we have seen the last of that pain, but it's not my fingers on the chopping board, so squeeze all you want. ;-)
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 08/10/2015 10:31 AM, Simon Riggs wrote:
> Freezing is not a necessary pre-condition for either of those things, I
> am happy to say. There is confusion here because for ( 1 ) the shrink
> was performed after freezing, but when you have access to the epoch
> there is no need for exhaustive freezing - only in special cases, as
> noted. If we are lucky those special cases will mean a massive reduction
> in I/O. For ( 2 ) a normal VACUUM is sufficient and as Robert observes,
> maybe just HOT is enough.
Yeah, saw your explanation on this on the other thread. Good point.
Question: does regular vacuum update the visibility map in the same way vacuum freeze does?
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 10 August 2015 at 19:21, Josh Berkus <josh@agliodbs.com> wrote:
> Question: does regular vacuum update the visibility map in the same way
> vacuum freeze does?
Yes
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2015-08-10 07:03:02 +0100, Simon Riggs wrote:
> I was previously a proponent of (2) as a practical way forwards, but my
> proposal here today is that we don't do anything further on 2) yet, and
> seek to make progress on 5) instead.
>
> If 5) fails to bring a workable solution by the Jan 2016 CF then we commit
> 2) instead.
>
> If Heikki wishes to work on (5), that's good. Otherwise, I think it's
> something I can understand and deliver by 1 Jan, though likely for 1 Nov CF.
I highly doubt that we can get either variant into 9.6 if we only start to seriously review them by then. Heikki's lsn ranges patch essentially was a variant of 5) and it ended up being a rather complicated patch. I don't think using an explicit epoch is going to be that much simpler.
So I think we need to decide now.
My vote is that we should try to get freeze maps into 9.6 - that seems more realistic given that we have a patch right now. Yes, it might end up being superfluous churn, but it's rather localized. I think we've put off significant incremental improvements with the promise of more radical stuff too often.
Greetings,
Andres Freund
On 9/6/15 7:25 AM, Andres Freund wrote:
> On 2015-08-10 07:03:02 +0100, Simon Riggs wrote:
>> I was previously a proponent of (2) as a practical way forwards, but my
>> proposal here today is that we don't do anything further on 2) yet, and
>> seek to make progress on 5) instead.
>>
>> If 5) fails to bring a workable solution by the Jan 2016 CF then we commit
>> 2) instead.
>>
>> If Heikki wishes to work on (5), that's good. Otherwise, I think it's
>> something I can understand and deliver by 1 Jan, though likely for 1 Nov CF.
>
> I highly doubt that we can get either variant into 9.6 if we only start
> to seriously review them by then. Heikki's lsn ranges patch essentially
> was a variant of 5) and it ended up being a rather complicated patch. I
> don't think using an explicit epoch is going to be that much simpler.
>
> So I think we need to decide now.
>
> My vote is that we should try to get freeze maps into 9.6 - that seems
> more realistic given that we have a patch right now. Yes, it might end
> up being superfluous churn, but it's rather localized. I think
> we've put off significant incremental improvements with the promise
> of more radical stuff too often.
I'm concerned with how to test this. Right now it's rather difficult to test things like epoch rollover, especially in a way that would expose race conditions and other corner cases. We obviously got burned by that on the MultiXact changes, and a lot of our best developers had to spend a huge amount of time fixing that.
ISTM that a way to unit test things like CLOG/MXID truncation and visibility logic should be created before attempting a change like this. Would having this kind of test infrastructure have helped with the LSN patch development? More importantly, would it have reduced the odds of the MXID bugs, or made it easier to diagnose them?
In any case, thanks Simon for the summary. I really like the idea and will help with it if I can.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
On Sun, Sep 6, 2015 at 8:25 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-08-10 07:03:02 +0100, Simon Riggs wrote:
>> I was previously a proponent of (2) as a practical way forwards, but my
>> proposal here today is that we don't do anything further on 2) yet, and
>> seek to make progress on 5) instead.
>>
>> If 5) fails to bring a workable solution by the Jan 2016 CF then we commit
>> 2) instead.
>>
>> If Heikki wishes to work on (5), that's good. Otherwise, I think it's
>> something I can understand and deliver by 1 Jan, though likely for 1 Nov CF.
>
> I highly doubt that we can get either variant into 9.6 if we only start
> to seriously review them by then. Heikki's lsn ranges patch essentially
> was a variant of 5) and it ended up being a rather complicated patch. I
> don't think using an explicit epoch is going to be that much simpler.
>
> So I think we need to decide now.
>
> My vote is that we should try to get freeze maps into 9.6 - that seems
> more realistic given that we have a patch right now. Yes, it might end
> up being superfluous churn, but it's rather localized. I think
> we've put off significant incremental improvements with the promise
> of more radical stuff too often.
I strongly support that plan. I think it's unlikely we're going to have something better ready in time for 9.6, and freeze maps by themselves would bring enormous relief to many people. And the worst thing that happens is that we rip this back out again because it isn't needed, which isn't really going to be a lot of work.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Sep 6, 2015 at 1:25 PM, Andres Freund <andres@anarazel.de> wrote:
> My vote is that we should try to get freeze maps into 9.6 - that seems
> more realistic given that we have a patch right now. Yes, it might end
> up being superfluous churn, but it's rather localized. I think
> we've put off significant incremental improvements with the promise
> of more radical stuff too often.
Superfluous churn in the code isn't too bad. But superfluous churn in data formats might be a bit more scary. Would we be able to handle pg_upgrade from a database with or without a freezemap? Would you have to upgrade once to add the freezemap then again to remove it?
--
greg
On Wed, Sep 9, 2015 at 10:03 PM, Greg Stark <stark@mit.edu> wrote:
> On Sun, Sep 6, 2015 at 1:25 PM, Andres Freund <andres@anarazel.de> wrote:
>> My vote is that we should try to get freeze maps into 9.6 - that seems
>> more realistic given that we have a patch right now. Yes, it might end
>> up being superfluous churn, but it's rather localized. I think
>> we've put off significant incremental improvements with the promise
>> of more radical stuff too often.
>
> Superfluous churn in the code isn't too bad. But superfluous churn in
> data formats might be a bit more scary. Would we be able to handle
> pg_upgrade from a database with or without a freezemap? Would you have
> to upgrade once to add the freezemap then again to remove it?
>
Currently the freeze map patch adds a frozen bit to the visibility map when upgrading to 9.6. The visibility map is not critical information, and is generated by VACUUM. So we can drop it and create a new visibility map by running VACUUM, if the table is not large.
Regards,
--
Masahiko Sawada
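The "frozen bit in the visibility map" idea can be pictured with a toy bitmap that keeps two bits per heap page. This is a sketch under stated assumptions, not the patch's code: the macro and function names (`VM_ALL_VISIBLE`, `VM_ALL_FROZEN`, `vm_set`, `vm_get`, `vm_clear`, `freeze_scan_can_skip`) and the exact bit layout are chosen here for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values and layout: two bits per heap page. */
#define VM_ALL_VISIBLE    0x01  /* every tuple on the page is visible to all */
#define VM_ALL_FROZEN     0x02  /* every tuple on the page is already frozen */
#define VM_BITS_PER_PAGE  2
#define VM_PAGES_PER_BYTE (8 / VM_BITS_PER_PAGE)
#define VM_PAGE_MASK      0x03

/* Set one or both flags for a heap block. */
static void
vm_set(uint8_t *map, uint32_t blkno, uint8_t flags)
{
    uint32_t byte = blkno / VM_PAGES_PER_BYTE;
    uint32_t shift = (blkno % VM_PAGES_PER_BYTE) * VM_BITS_PER_PAGE;

    map[byte] |= (uint8_t) ((flags & VM_PAGE_MASK) << shift);
}

/* Read the flags for a heap block. */
static uint8_t
vm_get(const uint8_t *map, uint32_t blkno)
{
    uint32_t byte = blkno / VM_PAGES_PER_BYTE;
    uint32_t shift = (blkno % VM_PAGES_PER_BYTE) * VM_BITS_PER_PAGE;

    return (uint8_t) ((map[byte] >> shift) & VM_PAGE_MASK);
}

/* Clear both flags, as would be needed when a page is modified. */
static void
vm_clear(uint8_t *map, uint32_t blkno)
{
    uint32_t byte = blkno / VM_PAGES_PER_BYTE;
    uint32_t shift = (blkno % VM_PAGES_PER_BYTE) * VM_BITS_PER_PAGE;

    map[byte] &= (uint8_t) ~(VM_PAGE_MASK << shift);
}

/* A freezing scan may skip a page only if it is marked all-frozen. */
static bool
freeze_scan_can_skip(const uint8_t *map, uint32_t blkno)
{
    return (vm_get(map, blkno) & VM_ALL_FROZEN) != 0;
}
```

Because the map is derived data, discarding it loses no information: a later VACUUM can set the bits again, which is the property the reply above relies on.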
On Wed, Sep 9, 2015 at 6:03 AM, Greg Stark <stark@mit.edu> wrote:
> On Sun, Sep 6, 2015 at 1:25 PM, Andres Freund <andres@anarazel.de> wrote:
> > My vote is that we should try to get freeze maps into 9.6 - that seems
> > more realistic given that we have a patch right now. Yes, it might end
> > up being superfluous churn, but it's rather localized. I think
> > we've put off significant incremental improvements with the promise
> > of more radical stuff too often.
>
> Superfluous churn in the code isn't too bad. But superfluous churn in
> data formats might be a bit more scary. Would we be able to handle
> pg_upgrade from a database with or without a freezemap? Would you have
> to upgrade once to add the freezemap then again to remove it?
Surely we wouldn't introduce and remove freeze-maps between minor versions. So either it is a new major version, in which case you would be doing the upgrade anyway, or they would be added and then removed again all within one development cycle; and running unreleased code always has on-disk incompatibility churn. Or am I missing your point here?
Cheers,
Jeff