Обсуждение: New vacuum option to do only freezing
Hi, Attached patch adds a new option FREEZE_ONLY to VACUUM command. This option is same as FREEZE option except for it disables reclaiming dead tuples. That is, with this option vacuum does pruning HOT chain, freezing live tuples and maintaining both visibility map and freespace map but does not collect dead tuples and invoke neither heap vacuum nor index vacuum. This option will be useful if user wants to prevent XID wraparound a table as quick as possible, especially when table is quite large and is about to XID wraparound. I think this usecase was mentioned in some threads but I couldn't find them. Currently this patch just adds the new option to VACUUM command but it might be good to make autovacuum use it when emergency vacuum is required. This is a performance-test result for FREEZE option and FREEZE_ONLY option. I've tested them on the table which is about 3.8GB table without indexes and randomly modified. * FREEZE INFO: aggressively vacuuming "public.pgbench_accounts" INFO: "pgbench_accounts": removed 5 row versions in 8 pages INFO: "pgbench_accounts": found 5 removable, 30000000 nonremovable row versions in 491804 out of 491804 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 722 There were 0 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. CPU: user: 4.20 s, system: 16.47 s, elapsed: 50.28 s. VACUUM Time: 50301.262 ms (00:50.301) * FREEZE_ONLY INFO: aggressively vacuuming "public.pgbench_accounts" INFO: "pgbench_accounts": found 4 removable, 30000000 nonremovable row versions in 491804 out of 491804 pages DETAIL: freeze 30000000 rows There were 0 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. CPU: user: 3.10 s, system: 14.85 s, elapsed: 44.56 s. VACUUM Time: 44589.794 ms (00:44.590) Feedback is very welcome. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Mon, Oct 1, 2018 at 7:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Hi, > > Attached patch adds a new option FREEZE_ONLY to VACUUM command. This > option is same as FREEZE option except for it disables reclaiming dead > tuples. That is, with this option vacuum does pruning HOT chain, > freezing live tuples and maintaining both visibility map and freespace > map but does not collect dead tuples and invoke neither heap vacuum > nor index vacuum. This option will be useful if user wants to prevent > XID wraparound a table as quick as possible, especially when table is > quite large and is about to XID wraparound. I think this usecase was > mentioned in some threads but I couldn't find them. > > Currently this patch just adds the new option to VACUUM command but it > might be good to make autovacuum use it when emergency vacuum is > required. > > This is a performance-test result for FREEZE option and FREEZE_ONLY > option. I've tested them on the table which is about 3.8GB table > without indexes and randomly modified. > > * FREEZE > INFO: aggressively vacuuming "public.pgbench_accounts" > INFO: "pgbench_accounts": removed 5 row versions in 8 pages > INFO: "pgbench_accounts": found 5 removable, 30000000 nonremovable > row versions in 491804 out of 491804 pages > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 722 > There were 0 unused item pointers. > Skipped 0 pages due to buffer pins, 0 frozen pages. > 0 pages are entirely empty. > CPU: user: 4.20 s, system: 16.47 s, elapsed: 50.28 s. > VACUUM > Time: 50301.262 ms (00:50.301) > > * FREEZE_ONLY > INFO: aggressively vacuuming "public.pgbench_accounts" > INFO: "pgbench_accounts": found 4 removable, 30000000 nonremovable > row versions in 491804 out of 491804 pages > DETAIL: freeze 30000000 rows > There were 0 unused item pointers. > Skipped 0 pages due to buffer pins, 0 frozen pages. > 0 pages are entirely empty. > CPU: user: 3.10 s, system: 14.85 s, elapsed: 44.56 s. > VACUUM > Time: 44589.794 ms (00:44.590) > > Feedback is very welcome. > Added to the next commit fest. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Oct 4, 2018 at 6:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, Oct 1, 2018 at 7:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > Hi, > > > > Attached patch adds a new option FREEZE_ONLY to VACUUM command. This > > option is same as FREEZE option except for it disables reclaiming dead > > tuples. That is, with this option vacuum does pruning HOT chain, > > freezing live tuples and maintaining both visibility map and freespace > > map but does not collect dead tuples and invoke neither heap vacuum > > nor index vacuum. This option will be useful if user wants to prevent > > XID wraparound a table as quick as possible, especially when table is > > quite large and is about to XID wraparound. I think this usecase was > > mentioned in some threads but I couldn't find them. > > > > Currently this patch just adds the new option to VACUUM command but it > > might be good to make autovacuum use it when emergency vacuum is > > required. > > > > This is a performance-test result for FREEZE option and FREEZE_ONLY > > option. I've tested them on the table which is about 3.8GB table > > without indexes and randomly modified. > > > > * FREEZE > > INFO: aggressively vacuuming "public.pgbench_accounts" > > INFO: "pgbench_accounts": removed 5 row versions in 8 pages > > INFO: "pgbench_accounts": found 5 removable, 30000000 nonremovable > > row versions in 491804 out of 491804 pages > > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 722 > > There were 0 unused item pointers. > > Skipped 0 pages due to buffer pins, 0 frozen pages. > > 0 pages are entirely empty. > > CPU: user: 4.20 s, system: 16.47 s, elapsed: 50.28 s. > > VACUUM > > Time: 50301.262 ms (00:50.301) > > > > * FREEZE_ONLY > > INFO: aggressively vacuuming "public.pgbench_accounts" > > INFO: "pgbench_accounts": found 4 removable, 30000000 nonremovable > > row versions in 491804 out of 491804 pages > > DETAIL: freeze 30000000 rows > > There were 0 unused item pointers. > > Skipped 0 pages due to buffer pins, 0 frozen pages. > > 0 pages are entirely empty. > > CPU: user: 3.10 s, system: 14.85 s, elapsed: 44.56 s. > > VACUUM > > Time: 44589.794 ms (00:44.590) > > > > Feedback is very welcome. > > > > Added to the next commit fest. Rebaed to the current HEAD. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
Hi, On 10/1/18, 5:23 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > Attached patch adds a new option FREEZE_ONLY to VACUUM command. This > option is same as FREEZE option except for it disables reclaiming dead > tuples. That is, with this option vacuum does pruning HOT chain, > freezing live tuples and maintaining both visibility map and freespace > map but does not collect dead tuples and invoke neither heap vacuum > nor index vacuum. This option will be useful if user wants to prevent > XID wraparound a table as quick as possible, especially when table is > quite large and is about to XID wraparound. I think this usecase was > mentioned in some threads but I couldn't find them. I've thought about this a bit myself. One of the reasons VACUUM can take so long is because of all the index scans needed. If you're in a potential XID wraparound situation and just need a quick way out, it would be nice to have a way to do the minimum amount of work necessary to reclaim transaction IDs. At a high level, I think there are some improvements to this design we should consider. 1. Create a separate FREEZE command instead of adding a new VACUUM option The first line of the VACUUM documentation reads, "VACUUM reclaims storage occupied by dead tuples," which is something that we would explicitly not be doing with FREEZE_ONLY. I think it makes sense to reuse many of the VACUUM code paths to implement this feature, but from a user perspective, it should be separate. 2. We should reclaim transaction IDs from dead tuples as well Unless we also have a way to freeze XMAX like we do XMIN, I doubt this feature will be useful for the imminent-XID-wraparound use-case. In short, we won't be able to advance relfrozenxid and relminmxid beyond the oldest XMAX value for the relation. IIUC the idea of freezing XMAX doesn't really exist yet. Either the XMAX is aborted/invalid and can be reset to InvalidTransactionId, or it is committed and the tuple can be removed if it beyond the freezing threshold. So, we probably also want to look into adding a way to freeze XMAX, either by setting it to FrozenTransactionId or by setting the hint bits to (HEAP_XMAX_COMMITTED | HEAP_XMIN_INVALID) as is done for XMIN. Looking closer, I see that the phrase "freezing XMAX" is currently used to refer to setting it to InvalidTransactionId if it is aborted or invalid (e.g. lock-only). > Currently this patch just adds the new option to VACUUM command but it > might be good to make autovacuum use it when emergency vacuum is > required. This also seems like a valid use-case, but it should definitely be done as a separate effort after this feature has been committed. > This is a performance-test result for FREEZE option and FREEZE_ONLY > option. I've tested them on the table which is about 3.8GB table > without indexes and randomly modified. > > * FREEZE > ... > Time: 50301.262 ms (00:50.301) > > * FREEZE_ONLY > ... > Time: 44589.794 ms (00:44.590) I'd be curious to see the improvements you get when there are several indexes on the relation. The ability to skip the index scans is likely how this feature will really help speed things up. Nathan
On Mon, Oct 1, 2018 at 6:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Attached patch adds a new option FREEZE_ONLY to VACUUM command. This > option is same as FREEZE option except for it disables reclaiming dead > tuples. That is, with this option vacuum does pruning HOT chain, > freezing live tuples and maintaining both visibility map and freespace > map but does not collect dead tuples and invoke neither heap vacuum > nor index vacuum. This option will be useful if user wants to prevent > XID wraparound a table as quick as possible, especially when table is > quite large and is about to XID wraparound. I think this usecase was > mentioned in some threads but I couldn't find them. The feature seems like a reasonable one, but the name doesn't seem like a good choice. It doesn't only freeze - e.g. it HOT-prunes, as it must. Maybe consider something like (without_index_cleanup true) or (index_cleanup false). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Nov 2, 2018 at 1:32 AM Bossart, Nathan <bossartn@amazon.com> wrote: > > Hi, > > On 10/1/18, 5:23 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > > Attached patch adds a new option FREEZE_ONLY to VACUUM command. This > > option is same as FREEZE option except for it disables reclaiming dead > > tuples. That is, with this option vacuum does pruning HOT chain, > > freezing live tuples and maintaining both visibility map and freespace > > map but does not collect dead tuples and invoke neither heap vacuum > > nor index vacuum. This option will be useful if user wants to prevent > > XID wraparound a table as quick as possible, especially when table is > > quite large and is about to XID wraparound. I think this usecase was > > mentioned in some threads but I couldn't find them. > Thank you for the comment! > I've thought about this a bit myself. One of the reasons VACUUM can > take so long is because of all the index scans needed. If you're in a > potential XID wraparound situation and just need a quick way out, it > would be nice to have a way to do the minimum amount of work necessary > to reclaim transaction IDs. At a high level, I think there are some > improvements to this design we should consider. > > 1. Create a separate FREEZE command instead of adding a new VACUUM > option > > The first line of the VACUUM documentation reads, "VACUUM reclaims > storage occupied by dead tuples," which is something that we would > explicitly not be doing with FREEZE_ONLY. No. Actually FREEZE_ONLY option (maybe will be changed its name) could reclaim dead tuples by HOT-purning. If a page have HOT-updated chains the FREEZE_ONLY prunes them and reclaim disk space occupied. > I think it makes sense to > reuse many of the VACUUM code paths to implement this feature, but > from a user perspective, it should be separate. I'm concernced that since the existing users already have recognized that vacuuming and freezing are closely related they would get confused more if we have a similar purpose feature with different name. > > 2. We should reclaim transaction IDs from dead tuples as well > > Unless we also have a way to freeze XMAX like we do XMIN, I doubt this > feature will be useful for the imminent-XID-wraparound use-case. In > short, we won't be able to advance relfrozenxid and relminmxid beyond > the oldest XMAX value for the relation. > IIUC the idea of freezing> XMAX doesn't really exist yet. Either the XMAX is aborted/invalid and > can be reset to InvalidTransactionId, or it is committed and the tuple > can be removed if it beyond the freezing threshold. So, we probably > also want to look into adding a way to freeze XMAX, either by setting > it to FrozenTransactionId or by setting the hint bits to > (HEAP_XMAX_COMMITTED | HEAP_XMIN_INVALID) as is done for XMIN. That's a good point. If the oldest xmax is close to the old relfrozenxid we will not be able to advance relfrozenxid enough. However, since dead tuples are vacuumed by autovacuum periodically I think that we can advance relfrozenxid enough in common case. There is possible that we eventually need to do vacuum with removing dead tuples after done FREEZE_ONLY but it would be a rare case. Thought? > > Looking closer, I see that the phrase "freezing XMAX" is currently > used to refer to setting it to InvalidTransactionId if it is aborted > or invalid (e.g. lock-only). > > > Currently this patch just adds the new option to VACUUM command but it > > might be good to make autovacuum use it when emergency vacuum is > > required. > > This also seems like a valid use-case, but it should definitely be > done as a separate effort after this feature has been committed. Agreed. > > > This is a performance-test result for FREEZE option and FREEZE_ONLY > > option. I've tested them on the table which is about 3.8GB table > > without indexes and randomly modified. > > > > * FREEZE > > ... > > Time: 50301.262 ms (00:50.301) > > > > * FREEZE_ONLY > > ... > > Time: 44589.794 ms (00:44.590) > > I'd be curious to see the improvements you get when there are several > indexes on the relation. The ability to skip the index scans is > likely how this feature will really help speed things up. > I've tested performance of FREEZE option and FREEZE_ONLY option using a 3GB table having 3 indexes. Before do vacuum I modified 1 % of data on the table. * FREEZE Time: 78677.211 ms (01:18.677) Time: 86958.452 ms (01:26.958) Time: 78351.190 ms (01:18.351) * FREEZE_ONLY Time: 19913.863 ms (00:19.914) Time: 18917.379 ms (00:18.917) Time: 20048.541 ms (00:20.049) Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Fri, Nov 2, 2018 at 2:02 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Oct 1, 2018 at 6:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Attached patch adds a new option FREEZE_ONLY to VACUUM command. This > > option is same as FREEZE option except for it disables reclaiming dead > > tuples. That is, with this option vacuum does pruning HOT chain, > > freezing live tuples and maintaining both visibility map and freespace > > map but does not collect dead tuples and invoke neither heap vacuum > > nor index vacuum. This option will be useful if user wants to prevent > > XID wraparound a table as quick as possible, especially when table is > > quite large and is about to XID wraparound. I think this usecase was > > mentioned in some threads but I couldn't find them. > Thank you for the comment! > The feature seems like a reasonable one, but the name doesn't seem > like a good choice. It doesn't only freeze - e.g. it HOT-prunes, as > it must. Maybe consider something like (without_index_cleanup true) > or (index_cleanup false). Adding a field-and-value style option might be worth. Or maybe we can add one option for example freeze_without_index_cleanup? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On 11/5/18, 2:07 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > On Fri, Nov 2, 2018 at 1:32 AM Bossart, Nathan <bossartn@amazon.com> wrote: >> 1. Create a separate FREEZE command instead of adding a new VACUUM >> option >> >> The first line of the VACUUM documentation reads, "VACUUM reclaims >> storage occupied by dead tuples," which is something that we would >> explicitly not be doing with FREEZE_ONLY. > > No. Actually FREEZE_ONLY option (maybe will be changed its name) could > reclaim dead tuples by HOT-purning. If a page have HOT-updated chains > the FREEZE_ONLY prunes them and reclaim disk space occupied. I see. >> I think it makes sense to >> reuse many of the VACUUM code paths to implement this feature, but >> from a user perspective, it should be separate. > > I'm concernced that since the existing users already have recognized > that vacuuming and freezing are closely related they would get > confused more if we have a similar purpose feature with different > name. That seems reasonable to me. Perhaps decoupling this option from FREEZE would make it clearer to users and easier to name. This would allow users to do both VACUUM (WITHOUT_INDEX_CLEANUP) and VACUUM (FREEZE, WITHOUT_INDEX_CLEANUP). >> 2. We should reclaim transaction IDs from dead tuples as well >> >> Unless we also have a way to freeze XMAX like we do XMIN, I doubt this >> feature will be useful for the imminent-XID-wraparound use-case. In >> short, we won't be able to advance relfrozenxid and relminmxid beyond >> the oldest XMAX value for the relation. >> IIUC the idea of freezing> XMAX doesn't really exist yet. Either the XMAX is aborted/invalid and >> can be reset to InvalidTransactionId, or it is committed and the tuple >> can be removed if it beyond the freezing threshold. So, we probably >> also want to look into adding a way to freeze XMAX, either by setting >> it to FrozenTransactionId or by setting the hint bits to >> (HEAP_XMAX_COMMITTED | HEAP_XMIN_INVALID) as is done for XMIN. > > That's a good point. If the oldest xmax is close to the old > relfrozenxid we will not be able to advance relfrozenxid enough. > However, since dead tuples are vacuumed by autovacuum periodically I > think that we can advance relfrozenxid enough in common case. There is > possible that we eventually need to do vacuum with removing dead > tuples after done FREEZE_ONLY but it would be a rare case. Thought? Given that a stated goal of this patch is to help recover from near wraparound, I think this is very important optimization. It's true that you might be able to advance relfrozenxid/relminmxid a bit in some cases, but there are many others where you won't. For example, if I create a row, delete it, and insert many more rows, my table's XID age would be stuck at the row deletion. If I was in a situation where this table was near wraparound and autovacuum wasn't keeping up, I would run VACUUM (WITHOUT_INDEX_CLEANUP, FREEZE) with the intent of reclaiming transaction IDs as fast as possible. >> I'd be curious to see the improvements you get when there are several >> indexes on the relation. The ability to skip the index scans is >> likely how this feature will really help speed things up. >> > > I've tested performance of FREEZE option and FREEZE_ONLY option using > a 3GB table having 3 indexes. Before do vacuum I modified 1 % of data > on the table. > > * FREEZE > Time: 78677.211 ms (01:18.677) > Time: 86958.452 ms (01:26.958) > Time: 78351.190 ms (01:18.351) > > * FREEZE_ONLY > Time: 19913.863 ms (00:19.914) > Time: 18917.379 ms (00:18.917) > Time: 20048.541 ms (00:20.049) Nice. Nathan
On Mon, Nov 5, 2018 at 3:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Adding a field-and-value style option might be worth. Or maybe we can > add one option for example freeze_without_index_cleanup? That seems non-orthogonal. We have an existing flag to force freezing (FREEZE); we don't need a second option that does the same thing. Skipping the index scans (and thus necessarily the second heap pass) is a separate behavior which we don't currently have a way to control. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon 5 Nov 2018, 12:49 Robert Haas <robertmhaas@gmail.com wrote:
That seems non-orthogonal. We have an existing flag to force freezing
(FREEZE); we don't need a second option that does the same thing.
Skipping the index scans (and thus necessarily the second heap pass)
is a separate behavior which we don't currently have a way to control.
It sounds like it might be better to name this "VACUUM (FAST)” and document that it skips some of the normal (and necessary) work that vacuum does and is only suitable for avoiding wraparounds and not sufficient for avoiding bloat
On Mon, Nov 5, 2018 at 9:12 PM Greg Stark <stark@mit.edu> wrote: > It sounds like it might be better to name this "VACUUM (FAST)” and document that it skips some of the normal (and necessary)work that vacuum does and is only suitable for avoiding wraparounds and not sufficient for avoiding bloat We could do that, but I don't see why that's better than VACUUM (SKIP_INDEX_SCANS) or similar. There are, perhaps, multiple kinds of shortcuts that could make vacuum run faster, but skipping index scans is what it is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Nov 6, 2018 at 2:48 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Nov 5, 2018 at 3:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Adding a field-and-value style option might be worth. Or maybe we can > > add one option for example freeze_without_index_cleanup? > > That seems non-orthogonal. We have an existing flag to force freezing > (FREEZE); we don't need a second option that does the same thing. > Skipping the index scans (and thus necessarily the second heap pass) > is a separate behavior which we don't currently have a way to control. > We already have disable_page_skipping option, not (page_skipping false). So ISMT disable_index_cleanup would be more natural. Also, since what to do with this option is not only skipping vacuum indexes but also skipping removeing dead tuples on heap, I think that the option should have a more understandable name for users indicating that both it removes dead tuples less than the normal vacuum and it's aimed to freeze tuple more faster. Of course we document them, though. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On 2018-Nov-08, Masahiko Sawada wrote: > On Tue, Nov 6, 2018 at 2:48 AM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Mon, Nov 5, 2018 at 3:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > Adding a field-and-value style option might be worth. Or maybe we can > > > add one option for example freeze_without_index_cleanup? > > > > That seems non-orthogonal. We have an existing flag to force freezing > > (FREEZE); we don't need a second option that does the same thing. > > Skipping the index scans (and thus necessarily the second heap pass) > > is a separate behavior which we don't currently have a way to control. > > We already have disable_page_skipping option, not (page_skipping > false). So ISMT disable_index_cleanup would be more natural. Also, > since what to do with this option is not only skipping vacuum indexes > but also skipping removeing dead tuples on heap, I think that the > option should have a more understandable name for users indicating > that both it removes dead tuples less than the normal vacuum and it's > aimed to freeze tuple more faster. Of course we document them, though. I would name this based on the fact that it freezes quickly -- something like FAST_FREEZE perhaps. The other things seem implementation details. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 8, 2018 at 4:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > We already have disable_page_skipping option, not (page_skipping > false). So ISMT disable_index_cleanup would be more natural. Sure. > Also, > since what to do with this option is not only skipping vacuum indexes > but also skipping removeing dead tuples on heap, I think that the > option should have a more understandable name for users indicating > that both it removes dead tuples less than the normal vacuum and it's > aimed to freeze tuple more faster. Of course we document them, though. Well, I actually don't think that you should control two behaviors with the same option. If you want to vacuum and skip index cleanup, you should say VACUUM (disable_index_cleanup). If you want to vacuum, disable index cleanup, and skip aggressively, you should say VACUUM (freeze, disable_index_cleanup). Both behaviors seem useful. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Nov 9, 2018 at 2:56 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Nov 8, 2018 at 4:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > We already have disable_page_skipping option, not (page_skipping > > false). So ISMT disable_index_cleanup would be more natural. > > Sure. > > > Also, > > since what to do with this option is not only skipping vacuum indexes > > but also skipping removeing dead tuples on heap, I think that the > > option should have a more understandable name for users indicating > > that both it removes dead tuples less than the normal vacuum and it's > > aimed to freeze tuple more faster. Of course we document them, though. > > Well, I actually don't think that you should control two behaviors > with the same option. If you want to vacuum and skip index cleanup, > you should say VACUUM (disable_index_cleanup). If you want to vacuum, > disable index cleanup, and skip aggressively, you should say VACUUM > (freeze, disable_index_cleanup). Both behaviors seem useful. > Attached the updated version patch incorporated all comments I got. The patch includes the following changes. * Changed the option name to DISABLE_INDEX_CLENAUP. I agreed with Robert and Nathan, having an option separated from FREEZE. * Instead of freezing xmax I've changed the behaviour of the new option (DISABLE_INDEX_CLEANUP) so that it sets dead tuples as dead instead of as unused and skips both index vacuum and index cleanup. That is, we remove the storage of dead tuple but leave dead itemids for the index lookups. These are removed by the next vacuum execution enabling index cleanup. Currently HOT-pruning doesn't set the root of the chain as unused even if the whole chain is dead. Since setting tuples as dead removes storage the freezing xmax is no longer required. The second change might conflict with the retail index deletion patch[1], which makes HOT-pruning *mark* tuples as dead (i.g., using ItemIdMarkDead() instead). I think that this patch would not block that patch but I've not considered deeply yet the combination with these two pathes. [1] https://www.postgresql.org/message-id/flat/425db134-8bba-005c-b59d-56e50de3b41e%40postgrespro.ru Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
Hi, On 1/8/19, 7:03 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > Attached the updated version patch incorporated all comments I got. Thanks for the new patch! > * Instead of freezing xmax I've changed the behaviour of the new > option (DISABLE_INDEX_CLEANUP) so that it sets dead tuples as dead > instead of as unused and skips both index vacuum and index cleanup. > That is, we remove the storage of dead tuple but leave dead itemids > for the index lookups. These are removed by the next vacuum execution > enabling index cleanup. Currently HOT-pruning doesn't set the root of > the chain as unused even if the whole chain is dead. Since setting > tuples as dead removes storage the freezing xmax is no longer > required. I tested this option with a variety of cases (HOT, indexes, etc.), and it seems to work well. I haven't looked too deeply into the implications of using LP_DEAD this way, but it seems like a reasonable approach at first glance. + <varlistentry> + <term><literal>DISABLE_INDEX_CLEANUP</literal></term> + <listitem> + <para> + <command>VACUUM</command> removes dead tuples and prunes HOT-updated + tuples chain for live tuples on table. If the table has any dead tuple + it removes them from both table and indexes for re-use. With this + option <command>VACUUM</command> marks tuples as dead (i.e., it doesn't + remove tuple storage) and disables removing dead tuples from indexes. + This is suitable for avoiding transaction ID wraparound but not + sufficient for avoiding index bloat. This option cannot be used in + conjunction with <literal>FULL</literal> option. + </para> + </listitem> + </varlistentry> I think we should clarify the expected behavior when this option is used on a table with no indexes. We probably do not want to fail, as this could disrupt VACUUM commands that touch several tables. Also, we don't need to set tuples as dead instead of unused, which appears to be what this patch is actually doing: + if (nindexes > 0 && disable_index_cleanup) + lazy_set_tuples_dead(onerel, blkno, buf, vacrelstats); + else + lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer); In this case, perhaps we should generate a log statement that notes that DISABLE_INDEX_CLEANUP is being ignored and set disable_index_cleanup to false during processing. + if (disable_index_cleanup) + ereport(elevel, + (errmsg("\"%s\": marked %.0f row versions in %u pages as dead", + RelationGetRelationName(onerel), + tups_vacuumed, vacuumed_pages))); + else + ereport(elevel, + (errmsg("\"%s\": removed %.0f row versions in %u pages", + RelationGetRelationName(onerel), + tups_vacuumed, vacuumed_pages))); Should the first log statement only be generated when there are also indexes? +static void +lazy_set_tuples_dead(Relation onerel, BlockNumber blkno, Buffer buffer, + LVRelStats *vacrelstats) This function looks very similar to lazy_vacuum_page(). Perhaps the two could be combined? Nathan
On Thu, Jan 10, 2019 at 5:23 AM Bossart, Nathan <bossartn@amazon.com> wrote: > > Hi, > > On 1/8/19, 7:03 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > > Attached the updated version patch incorporated all comments I got. > > Thanks for the new patch! > > > * Instead of freezing xmax I've changed the behaviour of the new > > option (DISABLE_INDEX_CLEANUP) so that it sets dead tuples as dead > > instead of as unused and skips both index vacuum and index cleanup. > > That is, we remove the storage of dead tuple but leave dead itemids > > for the index lookups. These are removed by the next vacuum execution > > enabling index cleanup. Currently HOT-pruning doesn't set the root of > > the chain as unused even if the whole chain is dead. Since setting > > tuples as dead removes storage the freezing xmax is no longer > > required. > > I tested this option with a variety of cases (HOT, indexes, etc.), and > it seems to work well. I haven't looked too deeply into the > implications of using LP_DEAD this way, but it seems like a reasonable > approach at first glance. Thank you for reviewing the patch! > > + <varlistentry> > + <term><literal>DISABLE_INDEX_CLEANUP</literal></term> > + <listitem> > + <para> > + <command>VACUUM</command> removes dead tuples and prunes HOT-updated > + tuples chain for live tuples on table. If the table has any dead tuple > + it removes them from both table and indexes for re-use. With this > + option <command>VACUUM</command> marks tuples as dead (i.e., it doesn't > + remove tuple storage) and disables removing dead tuples from indexes. > + This is suitable for avoiding transaction ID wraparound but not > + sufficient for avoiding index bloat. This option cannot be used in > + conjunction with <literal>FULL</literal> option. > + </para> > + </listitem> > + </varlistentry> > > I think we should clarify the expected behavior when this option is > used on a table with no indexes. We probably do not want to fail, as > this could disrupt VACUUM commands that touch several tables. Also, > we don't need to set tuples as dead instead of unused, which appears > to be what this patch is actually doing: > > + if (nindexes > 0 && disable_index_cleanup) > + lazy_set_tuples_dead(onerel, blkno, buf, vacrelstats); > + else > + lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer); > > In this case, perhaps we should generate a log statement that notes > that DISABLE_INDEX_CLEANUP is being ignored and set > disable_index_cleanup to false during processing. Agreed. How about output a NOTICE message before calling lazy_scan_heap() in lazy_vacuum_rel()? > > + if (disable_index_cleanup) > + ereport(elevel, > + (errmsg("\"%s\": marked %.0f row versions in %u pages as dead", > + RelationGetRelationName(onerel), > + tups_vacuumed, vacuumed_pages))); > + else > + ereport(elevel, > + (errmsg("\"%s\": removed %.0f row versions in %u pages", > + RelationGetRelationName(onerel), > + tups_vacuumed, vacuumed_pages))); > > Should the first log statement only be generated when there are also > indexes? You're right. Will fix. > > +static void > +lazy_set_tuples_dead(Relation onerel, BlockNumber blkno, Buffer buffer, > + LVRelStats *vacrelstats) > > This function looks very similar to lazy_vacuum_page(). Perhaps the > two could be combined? > Yes, I intentionally separed them as I was concerned the these functions have different assumptions and usage. But the combining them also could work. Will change it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Jan 10, 2019 at 2:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Jan 10, 2019 at 5:23 AM Bossart, Nathan <bossartn@amazon.com> wrote: > > > > Hi, > > > > On 1/8/19, 7:03 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > > > Attached the updated version patch incorporated all comments I got. > > > > Thanks for the new patch! > > > > > * Instead of freezing xmax I've changed the behaviour of the new > > > option (DISABLE_INDEX_CLEANUP) so that it sets dead tuples as dead > > > instead of as unused and skips both index vacuum and index cleanup. > > > That is, we remove the storage of dead tuple but leave dead itemids > > > for the index lookups. These are removed by the next vacuum execution > > > enabling index cleanup. Currently HOT-pruning doesn't set the root of > > > the chain as unused even if the whole chain is dead. Since setting > > > tuples as dead removes storage the freezing xmax is no longer > > > required. > > > > I tested this option with a variety of cases (HOT, indexes, etc.), and > > it seems to work well. I haven't looked too deeply into the > > implications of using LP_DEAD this way, but it seems like a reasonable > > approach at first glance. > > Thank you for reviewing the patch! > > > > > + <varlistentry> > > + <term><literal>DISABLE_INDEX_CLEANUP</literal></term> > > + <listitem> > > + <para> > > + <command>VACUUM</command> removes dead tuples and prunes HOT-updated > > + tuples chain for live tuples on table. If the table has any dead tuple > > + it removes them from both table and indexes for re-use. With this > > + option <command>VACUUM</command> marks tuples as dead (i.e., it doesn't > > + remove tuple storage) and disables removing dead tuples from indexes. > > + This is suitable for avoiding transaction ID wraparound but not > > + sufficient for avoiding index bloat. This option cannot be used in > > + conjunction with <literal>FULL</literal> option. > > + </para> > > + </listitem> > > + </varlistentry> > > > > I think we should clarify the expected behavior when this option is > > used on a table with no indexes. We probably do not want to fail, as > > this could disrupt VACUUM commands that touch several tables. Also, > > we don't need to set tuples as dead instead of unused, which appears > > to be what this patch is actually doing: > > > > + if (nindexes > 0 && disable_index_cleanup) > > + lazy_set_tuples_dead(onerel, blkno, buf, vacrelstats); > > + else > > + lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer); > > > > In this case, perhaps we should generate a log statement that notes > > that DISABLE_INDEX_CLEANUP is being ignored and set > > disable_index_cleanup to false during processing. > > Agreed. How about output a NOTICE message before calling > lazy_scan_heap() in lazy_vacuum_rel()? > > > > > + if (disable_index_cleanup) > > + ereport(elevel, > > + (errmsg("\"%s\": marked %.0f row versions in %u pages as dead", > > + RelationGetRelationName(onerel), > > + tups_vacuumed, vacuumed_pages))); > > + else > > + ereport(elevel, > > + (errmsg("\"%s\": removed %.0f row versions in %u pages", > > + RelationGetRelationName(onerel), > > + tups_vacuumed, vacuumed_pages))); > > > > Should the first log statement only be generated when there are also > > indexes? > > You're right. Will fix. > > > > > +static void > > +lazy_set_tuples_dead(Relation onerel, BlockNumber blkno, Buffer buffer, > > + LVRelStats *vacrelstats) > > > > This function looks very similar to lazy_vacuum_page(). Perhaps the > > two could be combined? > > > > Yes, I intentionally separed them as I was concerned the these > functions have different assumptions and usage. But the combining them > also could work. Will change it. > Attached the updated patch. Please review it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Fri, Jan 11, 2019 at 1:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Attached the updated patch. Please review it. I'm quite confused by this patch. It seems to me that the easiest way to implement this patch would be to (1) make lazy_space_alloc take the maxtuples = MaxHeapTuplesPerPage branch when the new option is specified, and then (2) forget about them after each page i.e. if (nindexes == 0 && vacrelstats->num_dead_tuples > 0) { ... } else if (skipping index cleanup) vacrelstats->num_dead_tuples = 0; I don't see why it should touch the logic inside lazy_vacuum_page() or the decision about whether to truncate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Jan 12, 2019 at 3:27 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Jan 11, 2019 at 1:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Attached the updated patch. Please review it. > > I'm quite confused by this patch. It seems to me that the easiest way > to implement this patch would be to (1) make lazy_space_alloc take the > maxtuples = MaxHeapTuplesPerPage branch when the new option is > specified, and then (2) forget about them after each page i.e. > > if (nindexes == 0 && > vacrelstats->num_dead_tuples > 0) > { > ... > } > else if (skipping index cleanup) > vacrelstats->num_dead_tuples = 0; > > I don't see why it should touch the logic inside lazy_vacuum_page() or > the decision about whether to truncate. > I think that because the tuples that got dead after heap_page_prune() looked are recorded but not removed without lazy_vacuum_page() we need to process them in lazy_vacuum_page(). For decision about whether to truncate we should not change it, so I will fix it. It should be an another option to control whether to truncate if we want. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Jan 14, 2019 at 9:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > I think that because the tuples that got dead after heap_page_prune() > looked are recorded but not removed without lazy_vacuum_page() we need > to process them in lazy_vacuum_page(). For decision about whether to > truncate we should not change it, so I will fix it. It should be an > another option to control whether to truncate if we want. I think that heap_page_prune() is going to truncate dead tuples to dead line pointers. At that point the tuple is gone. The only remaining step is to mark the dead line pointer as unused, which can't be done without scanning the indexes. So I don't understand what lazy_vacuum_page() would need to do in this case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Jan 16, 2019 at 5:24 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Jan 14, 2019 at 9:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I think that because the tuples that got dead after heap_page_prune() > > looked are recorded but not removed without lazy_vacuum_page() we need > > to process them in lazy_vacuum_page(). For decision about whether to > > truncate we should not change it, so I will fix it. It should be an > > another option to control whether to truncate if we want. > > I think that heap_page_prune() is going to truncate dead tuples to > dead line pointers. At that point the tuple is gone. The only > remaining step is to mark the dead line pointer as unused, which can't > be done without scanning the indexes. So I don't understand what > lazy_vacuum_page() would need to do in this case. > vacuumlazy.c:L1022 switch (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buf)) { case HEAPTUPLE_DEAD: /* * Ordinarily, DEAD tuples would have been removed by * heap_page_prune(), but it's possible that the tuple * state changed since heap_page_prune() looked. In * particular an INSERT_IN_PROGRESS tuple could have * changed to DEAD if the inserter aborted. So this * cannot be considered an error condition. * (snip) */ if (HeapTupleIsHotUpdated(&tuple) || HeapTupleIsHeapOnly(&tuple)) nkeep += 1; else tupgone = true; /* we can delete the tuple */ all_visible = false; break; As the above comment says, it's possible that the state of an INSERT_IN_PROGRESS tuple could be changed to 'dead' after heap_page_prune(). Since such tuple is not truncated at this point we record it and set it as UNUSED in lazy_vacuum_page(). I think that the DISABLE_INDEX_CLEANUP case is the same; we need to process them after recorded. Am I missing something? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Wed, Jan 16, 2019 at 3:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > As the above comment says, it's possible that the state of an > INSERT_IN_PROGRESS tuple could be changed to 'dead' after > heap_page_prune(). Since such tuple is not truncated at this point we > record it and set it as UNUSED in lazy_vacuum_page(). I think that the > DISABLE_INDEX_CLEANUP case is the same; we need to process them after > recorded. Am I missing something? I believe you are. Think about it this way. After the first pass over the heap has been completed but before we've done anything to the indexes, let alone started the second pass over the heap, somebody could kill the vacuum process. Somebody could in fact yank the plug out of the wall, stopping the entire server in its tracks. If they do that, then lazy_vacuum_page() will never get executed. Yet, the heap can't be in any kind of corrupted state at this point, right? We know that the system is resilient against crashes, and killing a vacuum or even the whole server midway through does not leave the system in any kind of bad state. If it's fine for lazy_vacuum_page() to never be reached in that case, it must also be fine for it never to be reached if we ask for vacuum to stop cleanly before lazy_vacuum_page(). In the case of the particular comment to which you are referring, that comment is part of lazy_scan_heap(), not lazy_vacuum_page(), so I don't see how it bears on the question of whether we need to call lazy_vacuum_page(). It's true that, at any point in time, an in-progress transaction could abort. And if it does then some insert-in-progress tuples could become dead. But if that happens, then the next vacuum will remove them, just as it will remove any tuples that become dead for that reason when vacuum isn't running in the first place. You can't use that as a justification for needing a second heap pass, because if it were, then you'd also need a THIRD heap pass in case a transaction aborts after the second heap pass has visited the pages, and a fourth heap pass in case a transaction aborts after the third heap pass has visited the pages, etc. etc. forever. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Jan 17, 2019 at 1:33 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Jan 16, 2019 at 3:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > As the above comment says, it's possible that the state of an > > INSERT_IN_PROGRESS tuple could be changed to 'dead' after > > heap_page_prune(). Since such tuple is not truncated at this point we > > record it and set it as UNUSED in lazy_vacuum_page(). I think that the > > DISABLE_INDEX_CLEANUP case is the same; we need to process them after > > recorded. Am I missing something? > > I believe you are. Think about it this way. After the first pass > over the heap has been completed but before we've done anything to the > indexes, let alone started the second pass over the heap, somebody > could kill the vacuum process. Somebody could in fact yank the plug > out of the wall, stopping the entire server in its tracks. If they do > that, then lazy_vacuum_page() will never get executed. Yet, the heap > can't be in any kind of corrupted state at this point, right? We know > that the system is resilient against crashes, and killing a vacuum or > even the whole server midway through does not leave the system in any > kind of bad state. If it's fine for lazy_vacuum_page() to never be > reached in that case, it must also be fine for it never to be reached > if we ask for vacuum to stop cleanly before lazy_vacuum_page(). > Yes, that's right. It doesn't necessarily need to be reached lazy_vacuum_page(). > In the case of the particular comment to which you are referring, that > comment is part of lazy_scan_heap(), not lazy_vacuum_page(), so I > don't see how it bears on the question of whether we need to call > lazy_vacuum_page(). It's true that, at any point in time, an > in-progress transaction could abort. And if it does then some > insert-in-progress tuples could become dead. But if that happens, > then the next vacuum will remove them, just as it will remove any > tuples that become dead for that reason when vacuum isn't running in > the first place. You can't use that as a justification for needing a > second heap pass, because if it were, then you'd also need a THIRD > heap pass in case a transaction aborts after the second heap pass has > visited the pages, and a fourth heap pass in case a transaction aborts > after the third heap pass has visited the pages, etc. etc. forever. > The reason why I processed the tuples that became dead after the first heap pass is that I was not sure the reason why we ignore such tuples in the second heap pass despite of there already have been the code doing so which has been used for a long time. I thought we can do that in the same manner even in DISABLE_INDEX_CLEANUP case. Also, since I thought that lazy_vacuum_page() is the best place to set them as DEAD I modified it (In the previous patch I introduced another function setting them as DEAD aside from lazy_vacuum_page(). But since these were almost same I merged them). However, as you mentioned it's true that these tuples will be removed by the next vacuum even if we ignore. Also these tuples would not be many and without this change the patch would get more simple. So if the risk of this change is greater than the benefit, we should not do that. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Jan 17, 2019 at 1:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > The reason why I processed the tuples that became dead after the first > heap pass is that I was not sure the reason why we ignore such tuples > in the second heap pass despite of there already have been the code > doing so which has been used for a long time. I thought we can do that > in the same manner even in DISABLE_INDEX_CLEANUP case. Also, since I > thought that lazy_vacuum_page() is the best place to set them as DEAD > I modified it (In the previous patch I introduced another function > setting them as DEAD aside from lazy_vacuum_page(). But since these > were almost same I merged them). The race you're concerned about is extremely narrow. We HOT-prune the page, and then immediately afterward -- probably a few milliseconds later -- we loop over the tuples still on the page and check the status of each one. The only time we get a different answer is when a transaction aborts in those few milliseconds. We don't worry about handling those because it's a very rare condition. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Jan 19, 2019 at 5:08 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Jan 17, 2019 at 1:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > The reason why I processed the tuples that became dead after the first > > heap pass is that I was not sure the reason why we ignore such tuples > > in the second heap pass despite of there already have been the code > > doing so which has been used for a long time. I thought we can do that > > in the same manner even in DISABLE_INDEX_CLEANUP case. Also, since I > > thought that lazy_vacuum_page() is the best place to set them as DEAD > > I modified it (In the previous patch I introduced another function > > setting them as DEAD aside from lazy_vacuum_page(). But since these > > were almost same I merged them). > > The race you're concerned about is extremely narrow. We HOT-prune the > page, and then immediately afterward -- probably a few milliseconds > later -- we loop over the tuples still on the page and check the > status of each one. The only time we get a different answer is when a > transaction aborts in those few milliseconds. We don't worry about > handling those because it's a very rare condition. > Understood and Agreed. I've attached the new version patch incorporated the review comments. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On 1/21/19, 2:23 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > Understood and Agreed. I've attached the new version patch > incorporated the review comments. I took a look at the latest version of the patch. + <varlistentry> + <term><literal>DISABLE_INDEX_CLEANUP</literal></term> + <listitem> + <para> + <command>VACUUM</command> removes dead tuples and prunes HOT-updated + tuples chain for live tuples on table. If the table has any dead tuple + it removes them from both table and indexes for re-use. With this + option <command>VACUUM</command> doesn't completely remove dead tuples + and disables removing dead tuples from indexes. This is suitable for + avoiding transaction ID wraparound but not sufficient for avoiding + index bloat. This option is ignored if the table doesn't have index. + Also, this cannot be used in conjunction with <literal>FULL</literal> + option. + </para> + </listitem> + </varlistentry> IMHO we could document this feature at a slightly higher level without leaving out any really important user-facing behavior. Here's a quick attempt to show what I am thinking: With this option, VACUUM skips all index cleanup behavior and only marks tuples as "dead" without reclaiming the storage. While this can help reclaim transaction IDs faster to avoid transaction ID wraparound (see Section 24.1.5), it will not reduce bloat. Note that this option is ignored for tables that have no indexes. Also, this option cannot be used in conjunction with the FULL option, since FULL requires rewriting the table. + /* Notify user that DISABLE_INDEX_CLEANUP option is ignored */ + if (!vacrelstats->hasindex && (options & VACOPT_DISABLE_INDEX_CLEANUP)) + ereport(NOTICE, + (errmsg("DISABLE_INDEX_CLEANUP is ignored because table \"%s\" does not have index", + RelationGetRelationName(onerel)))); It might make more sense to emit this in lazy_scan_heap() where we determine the value of skip_index_vacuum. + if (skip_index_vacuum) + ereport(elevel, + (errmsg("\"%s\": marked %.0f row versions as dead in %u pages", + RelationGetRelationName(onerel), + tups_vacuumed, vacuumed_pages))); IIUC tups_vacuumed will include tuples removed during HOT pruning, so it could be inaccurate to say all of these tuples have only been marked "dead." Perhaps we could keep a separate count of tuples removed via HOT pruning in case we are using DISABLE_INDEX_CLEANUP. There might be similar problems with the values stored in vacrelstats that are used at the end of heap_vacuum_rel() (e.g. tuples_deleted). I would suggest adding this option to vacuumdb in this patch set as well. Nathan
On Fri, Feb 01, 2019 at 10:43:37PM +0000, Bossart, Nathan wrote: > I would suggest adding this option to vacuumdb in this patch set as > well. Doing it would be simple enough, for now I have moved it to next CF. -- Michael
Вложения
On 2019-Feb-01, Bossart, Nathan wrote: > IMHO we could document this feature at a slightly higher level without > leaving out any really important user-facing behavior. Here's a quick > attempt to show what I am thinking: > > With this option, VACUUM skips all index cleanup behavior and > only marks tuples as "dead" without reclaiming the storage. > While this can help reclaim transaction IDs faster to avoid > transaction ID wraparound (see Section 24.1.5), it will not > reduce bloat. Hmm ... don't we compact out the storage for dead tuples? If we do (and I think we should) then this wording is not entirely correct. > Note that this option is ignored for tables > that have no indexes. Also, this option cannot be used in > conjunction with the FULL option, since FULL requires > rewriting the table. I would remove the "Also," in there, since it seems to me to give the wrong impression about those two things being similar, but they're not: one is convenient behavior, the other is a limitation. > + /* Notify user that DISABLE_INDEX_CLEANUP option is ignored */ > + if (!vacrelstats->hasindex && (options & VACOPT_DISABLE_INDEX_CLEANUP)) > + ereport(NOTICE, > + (errmsg("DISABLE_INDEX_CLEANUP is ignored because table \"%s\" does not have index", > + RelationGetRelationName(onerel)))); > > It might make more sense to emit this in lazy_scan_heap() where we > determine the value of skip_index_vacuum. Why do we need a NOTICE here? I think this case should be silent. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Feb 2, 2019 at 7:00 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > On 2019-Feb-01, Bossart, Nathan wrote: > > > IMHO we could document this feature at a slightly higher level without > > leaving out any really important user-facing behavior. Here's a quick > > attempt to show what I am thinking: > > > > With this option, VACUUM skips all index cleanup behavior and > > only marks tuples as "dead" without reclaiming the storage. > > While this can help reclaim transaction IDs faster to avoid > > transaction ID wraparound (see Section 24.1.5), it will not > > reduce bloat. > > Hmm ... don't we compact out the storage for dead tuples? If we do (and > I think we should) then this wording is not entirely correct. Yeah, we remove tuple and leave the dead ItemId. So we actually reclaim the almost tuple storage. > > > Note that this option is ignored for tables > > that have no indexes. Also, this option cannot be used in > > conjunction with the FULL option, since FULL requires > > rewriting the table. > > I would remove the "Also," in there, since it seems to me to give the > wrong impression about those two things being similar, but they're not: > one is convenient behavior, the other is a limitation. Agreed. > > > + /* Notify user that DISABLE_INDEX_CLEANUP option is ignored */ > > + if (!vacrelstats->hasindex && (options & VACOPT_DISABLE_INDEX_CLEANUP)) > > + ereport(NOTICE, > > + (errmsg("DISABLE_INDEX_CLEANUP is ignored because table \"%s\" does not have index", > > + RelationGetRelationName(onerel)))); > > > > It might make more sense to emit this in lazy_scan_heap() where we > > determine the value of skip_index_vacuum. > > Why do we need a NOTICE here? I think this case should be silent. > Okay, removed it. On Fri, Feb 1, 2019 at 11:43 PM Bossart, Nathan <bossartn@amazon.com> wrote: > > + if (skip_index_vacuum) > + ereport(elevel, > + (errmsg("\"%s\": marked %.0f row versions as dead in %u pages", > + RelationGetRelationName(onerel), > + tups_vacuumed, vacuumed_pages))); > > IIUC tups_vacuumed will include tuples removed during HOT pruning, so > it could be inaccurate to say all of these tuples have only been > marked "dead." Perhaps we could keep a separate count of tuples > removed via HOT pruning in case we are using DISABLE_INDEX_CLEANUP. > There might be similar problems with the values stored in vacrelstats > that are used at the end of heap_vacuum_rel() (e.g. tuples_deleted). Yeah, tups_vacuumed include such tuples so the message is inaccurate. So I think that we should not change the message but we can report the dead item pointers we left in errdetail. That's more accurate and would help user. postgres(1:1130)=# vacuum (verbose, disable_index_cleanup) test; INFO: vacuuming "public.test" INFO: "test": removed 49 row versions in 1 pages INFO: "test": found 49 removable, 51 nonremovable row versions in 1 out of 1 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 509 There were 0 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. 49 tuples are left as dead. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM Attached the updated patch and the patch for vacuumdb. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On 2/3/19, 1:48 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > On Sat, Feb 2, 2019 at 7:00 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> >> On 2019-Feb-01, Bossart, Nathan wrote: >> >> > IMHO we could document this feature at a slightly higher level without >> > leaving out any really important user-facing behavior. Here's a quick >> > attempt to show what I am thinking: >> > >> > With this option, VACUUM skips all index cleanup behavior and >> > only marks tuples as "dead" without reclaiming the storage. >> > While this can help reclaim transaction IDs faster to avoid >> > transaction ID wraparound (see Section 24.1.5), it will not >> > reduce bloat. >> >> Hmm ... don't we compact out the storage for dead tuples? If we do (and >> I think we should) then this wording is not entirely correct. > > Yeah, we remove tuple and leave the dead ItemId. So we actually > reclaim the almost tuple storage. Ah, yes. I was wrong here. Thank you for clarifying. > Attached the updated patch and the patch for vacuumdb. Thanks! I am hoping to take a deeper look at this patch in the next couple of days. Nathan
Sorry for the delay. I finally got a chance to look through the latest patches. On 2/3/19, 1:48 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > On Fri, Feb 1, 2019 at 11:43 PM Bossart, Nathan <bossartn@amazon.com> wrote: >> >> + if (skip_index_vacuum) >> + ereport(elevel, >> + (errmsg("\"%s\": marked %.0f row versions as dead in %u pages", >> + RelationGetRelationName(onerel), >> + tups_vacuumed, vacuumed_pages))); >> >> IIUC tups_vacuumed will include tuples removed during HOT pruning, so >> it could be inaccurate to say all of these tuples have only been >> marked "dead." Perhaps we could keep a separate count of tuples >> removed via HOT pruning in case we are using DISABLE_INDEX_CLEANUP. >> There might be similar problems with the values stored in vacrelstats >> that are used at the end of heap_vacuum_rel() (e.g. tuples_deleted). > > Yeah, tups_vacuumed include such tuples so the message is inaccurate. > So I think that we should not change the message but we can report the > dead item pointers we left in errdetail. That's more accurate and > would help user. > > postgres(1:1130)=# vacuum (verbose, disable_index_cleanup) test; > INFO: vacuuming "public.test" > INFO: "test": removed 49 row versions in 1 pages > INFO: "test": found 49 removable, 51 nonremovable row versions in 1 > out of 1 pages > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 509 > There were 0 unused item pointers. > Skipped 0 pages due to buffer pins, 0 frozen pages. > 0 pages are entirely empty. > 49 tuples are left as dead. > CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > VACUUM This seems reasonable to me. The current version of the patches builds cleanly, passes 'make check-world', and seems to work well in my own manual tests. I have a number of small suggestions, but I think this will be ready-for- committer soon. + Assert(!skip_index_vacuum); There are two places in lazy_scan_heap() that I see this without any comment. Can we add a short comment explaining why this should be true at these points? + if (skip_index_vacuum) + appendStringInfo(&buf, ngettext("%.0f tuple is left as dead.\n", + "%.0f tuples are left as dead.\n", + nleft), + nleft); I think we could emit this metric for all cases, not only when DISABLE_INDEX_CLEANUP is used. + /* + * Remove tuples from heap if the table has no index. If the table + * has index but index vacuum is disabled, we don't vacuum and forget + * them. The vacrelstats->dead_tuples could have tuples which became + * dead after checked at HOT-pruning time which are handled by + * lazy_vacuum_page() but we don't worry about handling those because + * it's a very rare condition and these would not be a large number. + */ Based on this, it sounds like nleft could be inaccurate. Do you think it is worth adjusting the log message to reflect that, or is this discrepancy something we should just live with? I think adding something like "at least N tuples left marked dead" is arguably a bit misleading, since the only time the number is actually higher is when this "very rare condition" occurs. + /* + * Skip index vacuum if it's requested for table with indexes. In this + * case we use the one-pass strategy and don't remove tuple storage. + */ + skip_index_vacuum = + (options & VACOPT_DISABLE_INDEX_CLEANUP) != 0 && vacrelstats->hasindex; AFAICT we don't actually need to adjust this based on vacrelstats->hasindex because we are already checking for indexes everywhere we check for this option. What do you think about leaving that part out? + if (vacopts->disable_index_cleanup) + { + /* DISABLE_PAGE_SKIPPING is supported since 12 */ + Assert(serverVersion >= 120000); + appendPQExpBuffer(sql, "%sDISABLE_INDEX_CLEANUP", sep); + sep = comma; + } s/DISABLE_PAGE_SKIPPING/DISABLE_INDEX_CLEANUP/ + printf(_(" --disable-index-cleanup disable index vacuuming and index clenaup\n")); s/clenaup/cleanup/ Nathan
On Wed, Feb 27, 2019 at 10:02 AM Bossart, Nathan <bossartn@amazon.com> wrote: > > Sorry for the delay. I finally got a chance to look through the > latest patches. > > On 2/3/19, 1:48 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > > On Fri, Feb 1, 2019 at 11:43 PM Bossart, Nathan <bossartn@amazon.com> wrote: > >> > >> + if (skip_index_vacuum) > >> + ereport(elevel, > >> + (errmsg("\"%s\": marked %.0f row versions as dead in %u pages", > >> + RelationGetRelationName(onerel), > >> + tups_vacuumed, vacuumed_pages))); > >> > >> IIUC tups_vacuumed will include tuples removed during HOT pruning, so > >> it could be inaccurate to say all of these tuples have only been > >> marked "dead." Perhaps we could keep a separate count of tuples > >> removed via HOT pruning in case we are using DISABLE_INDEX_CLEANUP. > >> There might be similar problems with the values stored in vacrelstats > >> that are used at the end of heap_vacuum_rel() (e.g. tuples_deleted). > > > > Yeah, tups_vacuumed include such tuples so the message is inaccurate. > > So I think that we should not change the message but we can report the > > dead item pointers we left in errdetail. That's more accurate and > > would help user. > > > > postgres(1:1130)=# vacuum (verbose, disable_index_cleanup) test; > > INFO: vacuuming "public.test" > > INFO: "test": removed 49 row versions in 1 pages > > INFO: "test": found 49 removable, 51 nonremovable row versions in 1 > > out of 1 pages > > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 509 > > There were 0 unused item pointers. > > Skipped 0 pages due to buffer pins, 0 frozen pages. > > 0 pages are entirely empty. > > 49 tuples are left as dead. > > CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > > VACUUM > > This seems reasonable to me. > > The current version of the patches builds cleanly, passes 'make > check-world', and seems to work well in my own manual tests. I have a > number of small suggestions, but I think this will be ready-for- > committer soon. Thank you for reviewing the patch! > > + Assert(!skip_index_vacuum); > > There are two places in lazy_scan_heap() that I see this without any > comment. Can we add a short comment explaining why this should be > true at these points? Agreed. Will add a comment. > > + if (skip_index_vacuum) > + appendStringInfo(&buf, ngettext("%.0f tuple is left as dead.\n", > + "%.0f tuples are left as dead.\n", > + nleft), > + nleft); > > I think we could emit this metric for all cases, not only when > DISABLE_INDEX_CLEANUP is used. I think that tups_vacuumed shows total number of vacuumed tuples and is already shown in the log message. The 'nleft' counts the total number of recorded dead tuple but not counts tuples are removed during HOT-pruning. Is this a valuable for users in non-DISABLE_INDEX_CLEANUP case? > > + /* > + * Remove tuples from heap if the table has no index. If the table > + * has index but index vacuum is disabled, we don't vacuum and forget > + * them. The vacrelstats->dead_tuples could have tuples which became > + * dead after checked at HOT-pruning time which are handled by > + * lazy_vacuum_page() but we don't worry about handling those because > + * it's a very rare condition and these would not be a large number. > + */ > > Based on this, it sounds like nleft could be inaccurate. Do you think > it is worth adjusting the log message to reflect that, or is this > discrepancy something we should just live with? I think adding > something like "at least N tuples left marked dead" is arguably a bit > misleading, since the only time the number is actually higher is when > this "very rare condition" occurs. Hmm, I think it's true that we leave 'nleft' dead tuples because it includes both tuples marked as dead and tuples not marked yet. So I think the log message works. Maybe the comment leads misreading. Will fix it. > > + /* > + * Skip index vacuum if it's requested for table with indexes. In this > + * case we use the one-pass strategy and don't remove tuple storage. > + */ > + skip_index_vacuum = > + (options & VACOPT_DISABLE_INDEX_CLEANUP) != 0 && vacrelstats->hasindex; > > AFAICT we don't actually need to adjust this based on > vacrelstats->hasindex because we are already checking for indexes > everywhere we check for this option. What do you think about leaving > that part out? > Yeah, I think you're right. Will fix. > + if (vacopts->disable_index_cleanup) > + { > + /* DISABLE_PAGE_SKIPPING is supported since 12 */ > + Assert(serverVersion >= 120000); > + appendPQExpBuffer(sql, "%sDISABLE_INDEX_CLEANUP", sep); > + sep = comma; > + } > > s/DISABLE_PAGE_SKIPPING/DISABLE_INDEX_CLEANUP/ > Will fix. > + printf(_(" --disable-index-cleanup disable index vacuuming and index clenaup\n")); > > s/clenaup/cleanup/ Will fix. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On 2/27/19, 2:08 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: >> + if (skip_index_vacuum) >> + appendStringInfo(&buf, ngettext("%.0f tuple is left as dead.\n", >> + "%.0f tuples are left as dead.\n", >> + nleft), >> + nleft); >> >> I think we could emit this metric for all cases, not only when >> DISABLE_INDEX_CLEANUP is used. > > I think that tups_vacuumed shows total number of vacuumed tuples and > is already shown in the log message. The 'nleft' counts the total > number of recorded dead tuple but not counts tuples are removed during > HOT-pruning. Is this a valuable for users in non-DISABLE_INDEX_CLEANUP > case? I think it is valuable. When DISABLE_INDEX_CLEANUP is not used or it is used for a relation with no indexes, it makes it clear that no tuples were left marked as dead. Also, it looks like all of the other information here is provided regardless of the options used. IMO it is good to list all of the stats so that users have the full picture of what VACUUM did. Nathan
On Thu, Feb 28, 2019 at 2:46 AM Bossart, Nathan <bossartn@amazon.com> wrote: > > On 2/27/19, 2:08 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > >> + if (skip_index_vacuum) > >> + appendStringInfo(&buf, ngettext("%.0f tuple is left as dead.\n", > >> + "%.0f tuples are left as dead.\n", > >> + nleft), > >> + nleft); > >> > >> I think we could emit this metric for all cases, not only when > >> DISABLE_INDEX_CLEANUP is used. > > > > I think that tups_vacuumed shows total number of vacuumed tuples and > > is already shown in the log message. The 'nleft' counts the total > > number of recorded dead tuple but not counts tuples are removed during > > HOT-pruning. Is this a valuable for users in non-DISABLE_INDEX_CLEANUP > > case? > > I think it is valuable. When DISABLE_INDEX_CLEANUP is not used or it > is used for a relation with no indexes, it makes it clear that no > tuples were left marked as dead. Also, it looks like all of the other > information here is provided regardless of the options used. IMO it > is good to list all of the stats so that users have the full picture > of what VACUUM did. > I see your point. That seems good to me. Attached the updated version patch. I've incorporated all review comments I got and have changed the number of tuples being reported as 'removed tuples'. With this option, tuples completely being removed is only tuples marked as unused during HOT-pruning, other dead tuples are left. So we count those tuples during HOT-pruning and reports it as removed tuples. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Thu, Feb 28, 2019 at 3:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Attached the updated version patch. Regarding the user interface for this patch, please have a look at the concerns I mention in https://www.postgresql.org/message-id/CA+TgmoZORX_UUv67rjaSX_aswkdZWV8kWfKfrWxyLdCqFqj+Yw@mail.gmail.com I provided a suggestion there as to how to resolve the issue but naturally others may feel differently. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2/28/19, 12:13 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > Attached the updated version patch. I've incorporated all review > comments I got and have changed the number of tuples being reported as > 'removed tuples'. With this option, tuples completely being removed is > only tuples marked as unused during HOT-pruning, other dead tuples are > left. So we count those tuples during HOT-pruning and reports it as > removed tuples. Thanks for the new patches. Beyond the reloptions discussion, most of my feedback is wording suggestions. + <command>VACUUM</command> removes dead tuples and prunes HOT-updated + tuples chain for live tuples on table. If the table has any dead tuple + it removes them from both table and indexes for re-use. With this + option <command>VACUUM</command> doesn't completely remove dead tuples + and disables removing dead tuples from indexes. This is suitable for + avoiding transaction ID wraparound (see + <xref linkend="vacuum-for-wraparound"/>) but not sufficient for avoiding + index bloat. This option is ignored if the table doesn't have index. + This cannot be used in conjunction with <literal>FULL</literal> + option. There are a couple of small changes I would make. How does something like this sound? VACUUM removes dead tuples and prunes HOT-updated tuple chains for live tuples on the table. If the table has any dead tuples, it removes them from both the table and its indexes and marks the corresponding line pointers as available for re-use. With this option, VACUUM still removes dead tuples from the table, but it does not process any indexes, and the line pointers are marked as dead instead of available for re-use. This is suitable for avoiding transaction ID wraparound (see Section 24.1.5) but not sufficient for avoiding index bloat. This option is ignored if the table does not have any indexes. This cannot be used in conjunction with the FULL option. - * Returns the number of tuples deleted from the page and sets - * latestRemovedXid. + * Returns the number of tuples deleted from the page and set latestRemoveXid + * and increment nunused. I would say something like: "Returns the number of tuples deleted from the page, sets latestRemovedXid, and updates nunused." + /* + * hasindex = true means two-pass strategy; false means one-pass. But we + * always use the one-pass strategy when index vacuum is disabled. + */ I think the added sentence should make it more clear that hasindex will still be true when DISABLE_INDEX_CLEANUP is used. Maybe something like: /* * hasindex = true means two-pass strategy; false means one-pass * * If DISABLE_INDEX_CLEANUP is used, hasindex may still be true, * but we'll always use the one-pass strategy. */ tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false, - &vacrelstats->latestRemovedXid); + &vacrelstats->latestRemovedXid, + &tups_pruned); Why do we need a separate tups_pruned argument in heap_page_prune()? Could we add the result of heap_page_prune() to tups_pruned instead, then report the total number of removed tuples as tups_vacuumed + tups_pruned elsewhere? + * If there are no indexes or we skip index vacuum then we can vacuum + * the page right now instead of doing a second scan. How about: If there are no indexes or index cleanup is disabled, we can vacuum the page right now instead of doing a second scan. + /* + * Here, we have indexes but index vacuum is disabled. We don't + * vacuum dead tuples on heap but forget them as we skip index + * vacuum. The vacrelstats->dead_tuples could have tuples which + * became dead after checked at HOT-pruning time but aren't marked + * as dead yet. We don't process them because it's a very rare + * condition and the next vacuum will process them. + */ I would suggest a few small changes: /* * Here, we have indexes but index vacuum is disabled. Instead of * vacuuming the dead tuples on the heap, we just forget them. * * Note that vacrelstats->dead_tuples could include tuples which * became dead after HOT-pruning but are not marked dead yet. We * do not process them because this is a very rare condition, and * the next vacuum will process them anyway. */ - /* If no indexes, make log report that lazy_vacuum_heap would've made */ + /* + * If no index or disables index vacuum, make log report that lazy_vacuum_heap + * would've made. If index vacuum is disabled, we didn't remove all dead + * tuples but did for tuples removed by HOT-pruning. + */ if (vacuumed_pages) ereport(elevel, (errmsg("\"%s\": removed %.0f row versions in %u pages", RelationGetRelationName(onerel), - tups_vacuumed, vacuumed_pages))); + skip_index_vacuum ? tups_pruned : tups_vacuumed, + vacuumed_pages))); How about: /* * If no index or index vacuum is disabled, make log report that * lazy_vacuum_heap would've made. If index vacuum is disabled, * we don't include the tuples that we marked dead, but we do * include tuples removed by HOT-pruning. */ Another interesting thing I noticed is that this "removed X row versions" message is only emitted if vacuumed_pages is greater than 0. However, if we only did HOT pruning, tups_vacuumed will be greater than 0 while vacuumed_pages will still be 0, so some information will be left out. I think this is already the case, though, so this could probably be handled in a separate thread. Nathan
On Tue, Mar 5, 2019 at 8:27 AM Bossart, Nathan <bossartn@amazon.com> wrote: > > On 2/28/19, 12:13 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > > Attached the updated version patch. I've incorporated all review > > comments I got and have changed the number of tuples being reported as > > 'removed tuples'. With this option, tuples completely being removed is > > only tuples marked as unused during HOT-pruning, other dead tuples are > > left. So we count those tuples during HOT-pruning and reports it as > > removed tuples. > > Thanks for the new patches. Beyond the reloptions discussion, most of > my feedback is wording suggestions. Thank you for the comment! > > + <command>VACUUM</command> removes dead tuples and prunes HOT-updated > + tuples chain for live tuples on table. If the table has any dead tuple > + it removes them from both table and indexes for re-use. With this > + option <command>VACUUM</command> doesn't completely remove dead tuples > + and disables removing dead tuples from indexes. This is suitable for > + avoiding transaction ID wraparound (see > + <xref linkend="vacuum-for-wraparound"/>) but not sufficient for avoiding > + index bloat. This option is ignored if the table doesn't have index. > + This cannot be used in conjunction with <literal>FULL</literal> > + option. > > There are a couple of small changes I would make. How does something > like this sound? > > VACUUM removes dead tuples and prunes HOT-updated tuple chains for > live tuples on the table. If the table has any dead tuples, it > removes them from both the table and its indexes and marks the > corresponding line pointers as available for re-use. With this > option, VACUUM still removes dead tuples from the table, but it > does not process any indexes, and the line pointers are marked as > dead instead of available for re-use. This is suitable for > avoiding transaction ID wraparound (see Section 24.1.5) but not > sufficient for avoiding index bloat. This option is ignored if > the table does not have any indexes. This cannot be used in > conjunction with the FULL option. Hmm, that's good idea but I'm not sure that user knows the word 'line pointer' because it is used only at pageinspect document. I wonder if the word 'item identifier' would rather be appropriate here because this is used at at describing page layout(in storage.sgml). Thought? > > - * Returns the number of tuples deleted from the page and sets > - * latestRemovedXid. > + * Returns the number of tuples deleted from the page and set latestRemoveXid > + * and increment nunused. > > I would say something like: "Returns the number of tuples deleted from > the page, sets latestRemovedXid, and updates nunused." Fixed. > > + /* > + * hasindex = true means two-pass strategy; false means one-pass. But we > + * always use the one-pass strategy when index vacuum is disabled. > + */ > > I think the added sentence should make it more clear that hasindex > will still be true when DISABLE_INDEX_CLEANUP is used. Maybe > something like: > > /* > * hasindex = true means two-pass strategy; false means one-pass > * > * If DISABLE_INDEX_CLEANUP is used, hasindex may still be true, > * but we'll always use the one-pass strategy. > */ > Fixed. > tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false, > - &vacrelstats->latestRemovedXid); > + &vacrelstats->latestRemovedXid, > + &tups_pruned); > > Why do we need a separate tups_pruned argument in heap_page_prune()? > Could we add the result of heap_page_prune() to tups_pruned instead, > then report the total number of removed tuples as tups_vacuumed + > tups_pruned elsewhere? Hmm, I thought that we should report only the number of tuples completely removed but we already count the tulples marked as redirected as tups_vacuumed. Let me summarize the fate of dead tuples. I think we can roughly classify dead tuples as follows. 1. root tuple of HOT chain that became dead 2. root tuple of HOT chain that became redirected 3. other tupels of HOT chain that became unused 4. tuples that became dead after HOT pruning The tuples of #1 through #3 either have only ItemIDs or have been completely removed but tuples of #4 has its tuple storage because they are not processed when HOT-pruning. Currently tups_vacuumed counts all of them, nleft (= vacrelstats->num_dead_tuples) counts #1 + #4. I think that the number of removed tuples being reported would be #1 + #2 + #3. Or should we use #2 + #3 instead? > > + * If there are no indexes or we skip index vacuum then we can vacuum > + * the page right now instead of doing a second scan. > > How about: > > If there are no indexes or index cleanup is disabled, we can > vacuum the page right now instead of doing a second scan. Fixed. > > + /* > + * Here, we have indexes but index vacuum is disabled. We don't > + * vacuum dead tuples on heap but forget them as we skip index > + * vacuum. The vacrelstats->dead_tuples could have tuples which > + * became dead after checked at HOT-pruning time but aren't marked > + * as dead yet. We don't process them because it's a very rare > + * condition and the next vacuum will process them. > + */ > > I would suggest a few small changes: > > /* > * Here, we have indexes but index vacuum is disabled. Instead of > * vacuuming the dead tuples on the heap, we just forget them. > * > * Note that vacrelstats->dead_tuples could include tuples which > * became dead after HOT-pruning but are not marked dead yet. We > * do not process them because this is a very rare condition, and > * the next vacuum will process them anyway. > */ Fixed. > > - /* If no indexes, make log report that lazy_vacuum_heap would've made */ > + /* > + * If no index or disables index vacuum, make log report that lazy_vacuum_heap > + * would've made. If index vacuum is disabled, we didn't remove all dead > + * tuples but did for tuples removed by HOT-pruning. > + */ > if (vacuumed_pages) > ereport(elevel, > (errmsg("\"%s\": removed %.0f row versions in %u pages", > RelationGetRelationName(onerel), > - tups_vacuumed, vacuumed_pages))); > + skip_index_vacuum ? tups_pruned : tups_vacuumed, > + vacuumed_pages))); > > How about: > > /* > * If no index or index vacuum is disabled, make log report that > * lazy_vacuum_heap would've made. If index vacuum is disabled, > * we don't include the tuples that we marked dead, but we do > * include tuples removed by HOT-pruning. > */ Fixed. > > Another interesting thing I noticed is that this "removed X row > versions" message is only emitted if vacuumed_pages is greater than 0. > However, if we only did HOT pruning, tups_vacuumed will be greater > than 0 while vacuumed_pages will still be 0, so some information will > be left out. I think this is already the case, though, so this could > probably be handled in a separate thread. > Hmm, since this log message is corresponding to the one that lazy_vacuum_heap makes and total number of removed tuples are always reported, it seems consistent to me. Do you have another point? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Hello, I have some other comments. At Mon, 4 Mar 2019 23:27:10 +0000, "Bossart, Nathan" <bossartn@amazon.com> wrote in <48410154-E6C5-4C07-8122-8D04E3BCD1F8@amazon.com> > On 2/28/19, 12:13 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > > Attached the updated version patch. I've incorporated all review > > comments I got and have changed the number of tuples being reported as > > 'removed tuples'. With this option, tuples completely being removed is > > only tuples marked as unused during HOT-pruning, other dead tuples are > > left. So we count those tuples during HOT-pruning and reports it as > > removed tuples. > > Thanks for the new patches. Beyond the reloptions discussion, most of > my feedback is wording suggestions. > > + <command>VACUUM</command> removes dead tuples and prunes HOT-updated > + tuples chain for live tuples on table. If the table has any dead tuple > + it removes them from both table and indexes for re-use. With this > + option <command>VACUUM</command> doesn't completely remove dead tuples > + and disables removing dead tuples from indexes. This is suitable for > + avoiding transaction ID wraparound (see > + <xref linkend="vacuum-for-wraparound"/>) but not sufficient for avoiding > + index bloat. This option is ignored if the table doesn't have index. > + This cannot be used in conjunction with <literal>FULL</literal> > + option. > > There are a couple of small changes I would make. How does something > like this sound? > > VACUUM removes dead tuples and prunes HOT-updated tuple chains for > live tuples on the table. If the table has any dead tuples, it > removes them from both the table and its indexes and marks the > corresponding line pointers as available for re-use. With this > option, VACUUM still removes dead tuples from the table, but it > does not process any indexes, and the line pointers are marked as > dead instead of available for re-use. This is suitable for > avoiding transaction ID wraparound (see Section 24.1.5) but not > sufficient for avoiding index bloat. This option is ignored if > the table does not have any indexes. This cannot be used in > conjunction with the FULL option. > > - * Returns the number of tuples deleted from the page and sets > - * latestRemovedXid. > + * Returns the number of tuples deleted from the page and set latestRemoveXid > + * and increment nunused. > > I would say something like: "Returns the number of tuples deleted from > the page, sets latestRemovedXid, and updates nunused." > > + /* > + * hasindex = true means two-pass strategy; false means one-pass. But we > + * always use the one-pass strategy when index vacuum is disabled. > + */ > > I think the added sentence should make it more clear that hasindex > will still be true when DISABLE_INDEX_CLEANUP is used. Maybe > something like: > > /* > * hasindex = true means two-pass strategy; false means one-pass > * > * If DISABLE_INDEX_CLEANUP is used, hasindex may still be true, > * but we'll always use the one-pass strategy. > */ > > tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false, > - &vacrelstats->latestRemovedXid); > + &vacrelstats->latestRemovedXid, > + &tups_pruned); > > Why do we need a separate tups_pruned argument in heap_page_prune()? > Could we add the result of heap_page_prune() to tups_pruned instead, > then report the total number of removed tuples as tups_vacuumed + > tups_pruned elsewhere? > > + * If there are no indexes or we skip index vacuum then we can vacuum > + * the page right now instead of doing a second scan. > > How about: > > If there are no indexes or index cleanup is disabled, we can > vacuum the page right now instead of doing a second scan. > > + /* > + * Here, we have indexes but index vacuum is disabled. We don't > + * vacuum dead tuples on heap but forget them as we skip index > + * vacuum. The vacrelstats->dead_tuples could have tuples which > + * became dead after checked at HOT-pruning time but aren't marked > + * as dead yet. We don't process them because it's a very rare > + * condition and the next vacuum will process them. > + */ > > I would suggest a few small changes: > > /* > * Here, we have indexes but index vacuum is disabled. Instead of > * vacuuming the dead tuples on the heap, we just forget them. > * > * Note that vacrelstats->dead_tuples could include tuples which > * became dead after HOT-pruning but are not marked dead yet. We > * do not process them because this is a very rare condition, and > * the next vacuum will process them anyway. > */ > > - /* If no indexes, make log report that lazy_vacuum_heap would've made */ > + /* > + * If no index or disables index vacuum, make log report that lazy_vacuum_heap > + * would've made. If index vacuum is disabled, we didn't remove all dead > + * tuples but did for tuples removed by HOT-pruning. > + */ > if (vacuumed_pages) > ereport(elevel, > (errmsg("\"%s\": removed %.0f row versions in %u pages", > RelationGetRelationName(onerel), > - tups_vacuumed, vacuumed_pages))); > + skip_index_vacuum ? tups_pruned : tups_vacuumed, > + vacuumed_pages))); > > How about: > > /* > * If no index or index vacuum is disabled, make log report that > * lazy_vacuum_heap would've made. If index vacuum is disabled, > * we don't include the tuples that we marked dead, but we do > * include tuples removed by HOT-pruning. > */ > > Another interesting thing I noticed is that this "removed X row > versions" message is only emitted if vacuumed_pages is greater than 0. > However, if we only did HOT pruning, tups_vacuumed will be greater > than 0 while vacuumed_pages will still be 0, so some information will > be left out. I think this is already the case, though, so this could > probably be handled in a separate thread. + nleft; /* item pointers we left */ The name seems to be something other, and the comment doesn't makes sense at least.. for me.. Looking below, + "%.0f tuples are left as dead.\n", + nleft), + nleft); How about "nleft_dead; /* iterm pointers left as dead */"? In this block: - if (nindexes == 0 && + if ((nindexes == 0 || skip_index_vacuum) && vacrelstats->num_dead_tuples > 0) { Is it right that vacuumed_pages is incremented and FSM is updated while the page is not actually vacuumed? tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false, - &vacrelstats->latestRemovedXid); + &vacrelstats->latestRemovedXid, + &tups_pruned); tups_pruned looks as "HOT pruned tuples". It is named "unused" in the function's parameters. (But I think it is useless. Please see the details below.) I tested it with a simple data set. (autovacuum = off) drop table if exists t; create table t with (fillfactor=50) as select a, a % 3 as b from generate_series(1, 9) a; create index on t(a); update t set a = -a where b = 0; update t set b = b + 1 where b = 1; We now have 9 tuples, 15 versions and 3 out of 6 "old" tuples are to be "left dead" by DISABLE_INDEX_CLEANUP vacuum. It means, three tuples ends with "left dead", three tuples are removed and 12 tuples will survive the vacuum below. vacuum (verbose, freeze ,disable_index_cleanup, disable_page_skipping) t; > INFO: "t": removed 0 row versions in 1 pages > INFO: "t": found 0 removable, 9 nonremovable row versions in 1 out of 1 pages > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 925 > There were 0 unused item pointers. > Skipped 0 pages due to buffer pins, 0 frozen pages. > 0 pages are entirely empty. > 3 tuples are left as dead. Three tuple versions have vanished. Actually they were removed but not shown in the message. heap_prune_chain() doesn't count a live root entry of a chain as "unused (line pointer)" since it is marked as "redierected". As the result the vanished tuples are counted in tups_vacuumed, not in tups_pruned. Maybe the name tups_vacuumed is confusing. After removing tups_pruned code it works correctly. > INFO: "t": removed 6 row versions in 1 pages > INFO: "t": found 6 removable, 9 nonremovable row versions in 1 out of 1 pages > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 932 > There were 0 unused item pointers. > Skipped 0 pages due to buffer pins, 0 frozen pages. > 0 pages are entirely empty. > 3 tuples are left as dead. I see two choices of the second line above. 1> "t": found 6 removable, 9 nonremovable row versions in 1 out of 1 pages "removable" includes "left dead" tuples. 2> "t": found 3 removable, 12 nonremovable row versions in 1 out of 1 pages "removable" excludes "left dead" tuples. If you prefer the latter, removable and nonremoveable need to be corrected using nleft. > INFO: "t": found 3 removable, 12 nonremovable row versions in 1 out of 1 pages > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 942 > There were 0 unused item pointers. > Skipped 0 pages due to buffer pins, 0 frozen pages. > 0 pages are entirely empty. > 3 tuples are left as dead. > CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 3/5/19, 1:22 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote: > On Tue, Mar 5, 2019 at 8:27 AM Bossart, Nathan <bossartn@amazon.com> wrote: >> + <command>VACUUM</command> removes dead tuples and prunes HOT-updated >> + tuples chain for live tuples on table. If the table has any dead tuple >> + it removes them from both table and indexes for re-use. With this >> + option <command>VACUUM</command> doesn't completely remove dead tuples >> + and disables removing dead tuples from indexes. This is suitable for >> + avoiding transaction ID wraparound (see >> + <xref linkend="vacuum-for-wraparound"/>) but not sufficient for avoiding >> + index bloat. This option is ignored if the table doesn't have index. >> + This cannot be used in conjunction with <literal>FULL</literal> >> + option. >> >> There are a couple of small changes I would make. How does something >> like this sound? >> >> VACUUM removes dead tuples and prunes HOT-updated tuple chains for >> live tuples on the table. If the table has any dead tuples, it >> removes them from both the table and its indexes and marks the >> corresponding line pointers as available for re-use. With this >> option, VACUUM still removes dead tuples from the table, but it >> does not process any indexes, and the line pointers are marked as >> dead instead of available for re-use. This is suitable for >> avoiding transaction ID wraparound (see Section 24.1.5) but not >> sufficient for avoiding index bloat. This option is ignored if >> the table does not have any indexes. This cannot be used in >> conjunction with the FULL option. > > Hmm, that's good idea but I'm not sure that user knows the word 'line > pointer' because it is used only at pageinspect document. I wonder if > the word 'item identifier' would rather be appropriate here because > this is used at at describing page layout(in storage.sgml). Thought? That seems reasonable to me. It seems like ItemIdData is referred to as "item pointer," "line pointer," or "item identifier" depending on where you're looking, but ItemPointerData is also referred to as "item pointer." I think using "item identifier" is appropriate here for clarity and consistency with storage.sgml. >> tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false, >> - &vacrelstats->latestRemovedXid); >> + &vacrelstats->latestRemovedXid, >> + &tups_pruned); >> >> Why do we need a separate tups_pruned argument in heap_page_prune()? >> Could we add the result of heap_page_prune() to tups_pruned instead, >> then report the total number of removed tuples as tups_vacuumed + >> tups_pruned elsewhere? > > Hmm, I thought that we should report only the number of tuples > completely removed but we already count the tulples marked as > redirected as tups_vacuumed. Let me summarize the fate of dead tuples. > I think we can roughly classify dead tuples as follows. > > 1. root tuple of HOT chain that became dead > 2. root tuple of HOT chain that became redirected > 3. other tupels of HOT chain that became unused > 4. tuples that became dead after HOT pruning > > The tuples of #1 through #3 either have only ItemIDs or have been > completely removed but tuples of #4 has its tuple storage because they > are not processed when HOT-pruning. > > Currently tups_vacuumed counts all of them, nleft (= > vacrelstats->num_dead_tuples) counts #1 + #4. I think that the number > of removed tuples being reported would be #1 + #2 + #3. Or should we > use #2 + #3 instead? I think I'm actually more in favor of what was in v6. IIRC that version of the patch didn't modify how we tracked the "removed" tuples at all, but it just added the "X item identifiers left marked dead" metric. Since even the tuples we are leaving marked dead lose storage, that seems accurate enough to me. >> Another interesting thing I noticed is that this "removed X row >> versions" message is only emitted if vacuumed_pages is greater than 0. >> However, if we only did HOT pruning, tups_vacuumed will be greater >> than 0 while vacuumed_pages will still be 0, so some information will >> be left out. I think this is already the case, though, so this could >> probably be handled in a separate thread. >> > > Hmm, since this log message is corresponding to the one that > lazy_vacuum_heap makes and total number of removed tuples are always > reported, it seems consistent to me. Do you have another point? Here's an example: postgres=# CREATE TABLE test (a INT, b INT); CREATE TABLE postgres=# CREATE INDEX ON test (a); CREATE INDEX postgres=# INSERT INTO test VALUES (1, 2); INSERT 0 1 After only HOT updates, the "removed X row versions in Y pages" message is not emitted: postgres=# UPDATE test SET b = 3; UPDATE 1 postgres=# UPDATE test SET b = 4; UPDATE 1 postgres=# VACUUM (FREEZE, VERBOSE) test; INFO: aggressively vacuuming "public.test" INFO: index "test_a_idx" now contains 1 row versions in 2 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. INFO: "test": found 2 removable, 1 nonremovable row versions in 1 out of 1 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 494 There were 1 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM After non-HOT updates, the "removed" message is emitted: postgres=# UPDATE test SET a = 5; UPDATE 1 postgres=# VACUUM (FREEZE, VERBOSE) test; INFO: aggressively vacuuming "public.test" INFO: scanned index "test_a_idx" to remove 1 row versions DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s INFO: "test": removed 1 row versions in 1 pages DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s INFO: index "test_a_idx" now contains 1 row versions in 2 pages DETAIL: 1 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. INFO: "test": found 1 removable, 1 nonremovable row versions in 1 out of 1 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 495 There were 1 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM Nathan
On Tue, Mar 5, 2019 at 8:01 PM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > Hello, I have some other comments. > Thank you for the comment! On Tue, Mar 5, 2019 at 8:01 PM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > > + nleft; /* item pointers we left */ > > The name seems to be something other, and the comment doesn't > makes sense at least.. for me.. Looking below, > > + "%.0f tuples are left as dead.\n", > + nleft), > + nleft); > > How about "nleft_dead; /* iterm pointers left as dead */"? Fixed. > > > > In this block: > > - if (nindexes == 0 && > + if ((nindexes == 0 || skip_index_vacuum) && > vacrelstats->num_dead_tuples > 0) > { > > Is it right that vacuumed_pages is incremented and FSM is updated > while the page is not actually vacuumed? Good catch. I think the FSM stuff is right because we actually did HOT pruning but the increment of vacuumed_page seems wrong to me. I think we should not increment it and not report "removed XX row version in YY pages" message. > > > tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false, > - &vacrelstats->latestRemovedXid); > + &vacrelstats->latestRemovedXid, > + &tups_pruned); > > tups_pruned looks as "HOT pruned tuples". It is named "unused" in > the function's parameters. (But I think it is useless. Please see > the details below.) > > > I tested it with a simple data set. > > (autovacuum = off) > drop table if exists t; > create table t with (fillfactor=50) as select a, a % 3 as b from generate_series(1, 9) a; > create index on t(a); > update t set a = -a where b = 0; > update t set b = b + 1 where b = 1; > > We now have 9 tuples, 15 versions and 3 out of 6 "old" tuples are > to be "left dead" by DISABLE_INDEX_CLEANUP vacuum. It means, > three tuples ends with "left dead", three tuples are removed and > 12 tuples will survive the vacuum below. > > vacuum (verbose, freeze ,disable_index_cleanup, disable_page_skipping) t; > > > INFO: "t": removed 0 row versions in 1 pages > > INFO: "t": found 0 removable, 9 nonremovable row versions in 1 out of 1 pages > > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 925 > > There were 0 unused item pointers. > > Skipped 0 pages due to buffer pins, 0 frozen pages. > > 0 pages are entirely empty. > > 3 tuples are left as dead. > > Three tuple versions have vanished. Actually they were removed > but not shown in the message. > > heap_prune_chain() doesn't count a live root entry of a chain as > "unused (line pointer)" since it is marked as "redierected". As > the result the vanished tuples are counted in tups_vacuumed, not > in tups_pruned. Maybe the name tups_vacuumed is confusing. After > removing tups_pruned code it works correctly. > > > INFO: "t": removed 6 row versions in 1 pages > > INFO: "t": found 6 removable, 9 nonremovable row versions in 1 out of 1 pages > > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 932 > > There were 0 unused item pointers. > > Skipped 0 pages due to buffer pins, 0 frozen pages. > > 0 pages are entirely empty. > > 3 tuples are left as dead. > > I see two choices of the second line above. > > 1> "t": found 6 removable, 9 nonremovable row versions in 1 out of 1 pages > > "removable" includes "left dead" tuples. > > 2> "t": found 3 removable, 12 nonremovable row versions in 1 out of 1 pages > > "removable" excludes "left dead" tuples. > > If you prefer the latter, removable and nonremoveable need to be > corrected using nleft. I think that the first vacuum should report the former message because it's true that the table has 6 removable tuples. We remove 6 tuples but leave 3 item pointers. So in the second vacuum, it should be "found 0 removable, 9 nonremovable row versions ..." and "3 tuples are left as dead". But to report more precisely it'd be better to report "0 tuples and 3 item identifiers are left as dead" here. Attached updated patch incorporated all of comments. Also I've added new reloption vacuum_index_cleanup as per discussion on the "reloption to prevent VACUUM from truncating empty pages at the end of relation" thread. Autovacuums also can skip index cleanup when the reloption is set to false. Since the setting this to false might lead some problems I've made autovacuums report the number of dead tuples and dead itemids we left. Regards, Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Tue, Mar 5, 2019 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Attached updated patch incorporated all of comments. Also I've added > new reloption vacuum_index_cleanup as per discussion on the "reloption > to prevent VACUUM from truncating empty pages at the end of relation" > thread. Autovacuums also can skip index cleanup when the reloption is > set to false. Since the setting this to false might lead some problems > I've made autovacuums report the number of dead tuples and dead > itemids we left. It seems to me that the disable_index_cleanup should be renamed index_cleanup and the default should be changed to true, for consistency with the reloption (and, perhaps, other patches). - num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0; + num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = + nleft_dead_itemids = nleft_dead_tuples = 0; I would suggest leaving the existing line alone (and not adding an extra space to it as the patch does now) and just adding a second initialization on the next line as a separate statement. a = b = c = d = e = 0 isn't such great coding style that we should stick to it rigorously even when it ends up having to be broken across lines. + /* Index vacuum must be enabled in two-pass vacuum */ + Assert(!skip_index_vacuum); I am a big believer in naming consistency. Please, let's use the same name everywhere! If it's going to be index_cleanup, then call the reloption vacuum_index_cleanup, and call the option index_cleanup, and call the variable index_cleanup. Picking a different subset of cleanup, index, vacuum, skip, and disable for each new name makes it harder to understand. - * If there are no indexes then we can vacuum the page right now - * instead of doing a second scan. + * If there are no indexes or index vacuum is disabled we can + * vacuum the page right now instead of doing a second scan. This comment is wrong. That wouldn't be safe. And that's probably why it's not what the code does. - /* If no indexes, make log report that lazy_vacuum_heap would've made */ + /* + * If no index or index vacuum is disabled, make log report that + * lazy_vacuum_heap would've make. + */ if (vacuumed_pages) Hmm, does this really do what the comment claims? It looks to me like we only increment vacuumed_pages when we call lazy_vacuum_page(), and we (correctly) don't do that when index cleanup is disabled, but then here this claims that if (vacuumed_pages) will be true in that case. I wonder if it would be cleaner to rename vacrelstate->hasindex to 'useindex' and set it to false if there are no indexes or index cleanup is disabled. But that might actually be worse, not sure. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Mar 7, 2019 at 3:55 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Mar 5, 2019 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Attached updated patch incorporated all of comments. Also I've added > > new reloption vacuum_index_cleanup as per discussion on the "reloption > > to prevent VACUUM from truncating empty pages at the end of relation" > > thread. Autovacuums also can skip index cleanup when the reloption is > > set to false. Since the setting this to false might lead some problems > > I've made autovacuums report the number of dead tuples and dead > > itemids we left. > > It seems to me that the disable_index_cleanup should be renamed > index_cleanup and the default should be changed to true, for > consistency with the reloption (and, perhaps, other patches). Hmm, the patch already has new reloption vacuum_index_cleanup and default value is true but you meant I should change its name to index_cleanup? > > - num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0; > + num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = > + nleft_dead_itemids = nleft_dead_tuples = 0; > > I would suggest leaving the existing line alone (and not adding an > extra space to it as the patch does now) and just adding a second > initialization on the next line as a separate statement. a = b = c = d > = e = 0 isn't such great coding style that we should stick to it > rigorously even when it ends up having to be broken across lines. Fixed. > > + /* Index vacuum must be enabled in two-pass vacuum */ > + Assert(!skip_index_vacuum); > > I am a big believer in naming consistency. Please, let's use the same > name everywhere! If it's going to be index_cleanup, then call the > reloption vacuum_index_cleanup, and call the option index_cleanup, and > call the variable index_cleanup. Picking a different subset of > cleanup, index, vacuum, skip, and disable for each new name makes it > harder to understand. Fixed. > > - * If there are no indexes then we can vacuum the page right now > - * instead of doing a second scan. > + * If there are no indexes or index vacuum is disabled we can > + * vacuum the page right now instead of doing a second scan. > > This comment is wrong. That wouldn't be safe. And that's probably > why it's not what the code does. Fixed. > > - /* If no indexes, make log report that lazy_vacuum_heap would've made */ > + /* > + * If no index or index vacuum is disabled, make log report that > + * lazy_vacuum_heap would've make. > + */ > if (vacuumed_pages) > > Hmm, does this really do what the comment claims? It looks to me like > we only increment vacuumed_pages when we call lazy_vacuum_page(), and > we (correctly) don't do that when index cleanup is disabled, but then > here this claims that if (vacuumed_pages) will be true in that case. You're right, vacuumed_pages never be > 0 in disable_index_cleanup case. Fixed. > > I wonder if it would be cleaner to rename vacrelstate->hasindex to > 'useindex' and set it to false if there are no indexes or index > cleanup is disabled. But that might actually be worse, not sure. > I tried the changes and it seems good idea to me. Fixed. Attached the updated version patches. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Thu, Mar 7, 2019 at 1:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Hmm, the patch already has new reloption vacuum_index_cleanup and > default value is true but you meant I should change its name to > index_cleanup? No, I mean that you should make it so that someone writes VACUUM (INDEX_CLEANUP false) instead of VACUUM (DISABLE_INDEX_CLEANUP). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Mar 8, 2019 at 12:04 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Mar 7, 2019 at 1:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Hmm, the patch already has new reloption vacuum_index_cleanup and > > default value is true but you meant I should change its name to > > index_cleanup? > > No, I mean that you should make it so that someone writes VACUUM > (INDEX_CLEANUP false) instead of VACUUM (DISABLE_INDEX_CLEANUP). > IIUC we've discussed the field-and-value style vacuum option. I suggested that since we have already the disable_page_skipping option the disable_page_skipping option would be more natural style and consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent with its reloption but not with other vacuum options. So why does only this option (and probably up-coming new options) need to support new style? Do we need the same change to the existing options? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Fri, Mar 8, 2019 at 12:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > IIUC we've discussed the field-and-value style vacuum option. I > suggested that since we have already the disable_page_skipping option > the disable_page_skipping option would be more natural style and > consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent > with its reloption but not with other vacuum options. So why does only > this option (and probably up-coming new options) need to support new > style? Do we need the same change to the existing options? Well, it's too late to change to change DISABLE_PAGE_SKIPPING to work some other way; it's been released, and we're stuck with it at this point. However, I generally believe that it is preferable to phrase options positively then negatively, so that for example one writes EXPLAIN (ANALYZE, TIMING OFF) not EXPLAIN (ANALYZE, NO_TIMING). So I'd like to do it that way for the new options that we're proposing to add. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Mar 27, 2019 at 12:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Sat, Mar 23, 2019 at 3:25 AM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Fri, Mar 8, 2019 at 12:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > IIUC we've discussed the field-and-value style vacuum option. I > > > suggested that since we have already the disable_page_skipping option > > > the disable_page_skipping option would be more natural style and > > > consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent > > > with its reloption but not with other vacuum options. So why does only > > > this option (and probably up-coming new options) need to support new > > > style? Do we need the same change to the existing options? > > > > Well, it's too late to change to change DISABLE_PAGE_SKIPPING to work > > some other way; it's been released, and we're stuck with it at this > > point. > > Agreed. > > > However, I generally believe that it is preferable to phrase > > options positively then negatively, so that for example one writes > > EXPLAIN (ANALYZE, TIMING OFF) not EXPLAIN (ANALYZE, NO_TIMING). So > > I'd like to do it that way for the new options that we're proposing to > > add. > > Agreed with using phrase options positively than negatively. Since > DISABLE_PAGE_SKIPPING is an option for emergency we might be able to > rename for consistency in a future release. > > Attached updated version patches. The patch adds the basic functionality to disable index cleanup but one possible argument could be whether we should always disable it when anti-wraparound vacuum. As discussed on another thread[1] anti-wraparound vacuum still could lead the I/O burst problem and take a long time, especially for append-only large table. Originally the purpose of this feature is to resolve the problem that vacuum takes a long time even if the table has just a few dead tuples, which is a quite common situation of anti-wraparound vacuum. It might be too late to discuss but if we always disable it when anti-wraparound vacuum then users don't need to do "VACUUM (INDEX_CLEANUP false)" manually on PostgreSQL 12. Dose anyone have opinions? [1] https://www.postgresql.org/message-id/CAC8Q8t%2Bj36G_bLF%3D%2B0iMo6jGNWnLnWb1tujXuJr-%2Bx8ZCCTqoQ%40mail.gmail.com Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Mar 28, 2019 at 2:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > The patch adds the basic functionality to disable index cleanup but > one possible argument could be whether we should always disable it > when anti-wraparound vacuum. As discussed on another thread[1] > anti-wraparound vacuum still could lead the I/O burst problem and take > a long time, especially for append-only large table. Originally the > purpose of this feature is to resolve the problem that vacuum takes a > long time even if the table has just a few dead tuples, which is a > quite common situation of anti-wraparound vacuum. It might be too late > to discuss but if we always disable it when anti-wraparound vacuum > then users don't need to do "VACUUM (INDEX_CLEANUP false)" manually on > PostgreSQL 12. Dose anyone have opinions? I think we can respect the configured value of the option even for aggressive vacuums, but I don't think we should change aggressive vacuums to work that way by default. You are correct that the table might have only a few dead tuples, but it might also have a lot of dead tuples; I have heard rumors of a PostgreSQL installation that had autovacuum = off and non-stop wraparound autovacuums desperately trying to forestall shutdown. That's probably a lot less likely now that we have the freeze map and such a system would almost surely have a nasty bloat problem, but disabling index cleanup by default would make it worse. I think the solution in the long run here is to (1) allow the index_cleanup option (or the corresponding reloption) to override the default behavior and (2) eventually change the default behavior from 'always yes' to 'depends on how many dead tuples we found'. But I think that the second of those things is not appropriate to consider changing in PG 12 at this point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Mar 29, 2019 at 4:39 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Mar 28, 2019 at 2:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > The patch adds the basic functionality to disable index cleanup but > > one possible argument could be whether we should always disable it > > when anti-wraparound vacuum. As discussed on another thread[1] > > anti-wraparound vacuum still could lead the I/O burst problem and take > > a long time, especially for append-only large table. Originally the > > purpose of this feature is to resolve the problem that vacuum takes a > > long time even if the table has just a few dead tuples, which is a > > quite common situation of anti-wraparound vacuum. It might be too late > > to discuss but if we always disable it when anti-wraparound vacuum > > then users don't need to do "VACUUM (INDEX_CLEANUP false)" manually on > > PostgreSQL 12. Dose anyone have opinions? > > I think we can respect the configured value of the option even for > aggressive vacuums, but I don't think we should change aggressive > vacuums to work that way by default. You are correct that the table > might have only a few dead tuples, but it might also have a lot of > dead tuples; I have heard rumors of a PostgreSQL installation that had > autovacuum = off and non-stop wraparound autovacuums desperately > trying to forestall shutdown. That's probably a lot less likely now > that we have the freeze map and such a system would almost surely have > a nasty bloat problem, but disabling index cleanup by default would > make it worse. Understood and agreed. Always setting it to false would affect much and there are users who expect anti-wraparound vacuums to reclaim garbage. > > I think the solution in the long run here is to (1) allow the > index_cleanup option (or the corresponding reloption) to override the > default behavior and (2) eventually change the default behavior from > 'always yes' to 'depends on how many dead tuples we found'. But I > think that the second of those things is not appropriate to consider > changing in PG 12 at this point. The current patch already takes (1) and I agreed with you that (2) would be for PG 13 or later. So the patch would be helpful for such users as well. Attached updated patches. These patches are applied on top of 0001 patch on parallel vacuum thread[1]. [1] https://www.postgresql.org/message-id/CAD21AoBaFcKBAeL5_%2B%2Bj%2BVzir2vBBcF4juW7qH8b3HsQY%3DQ6%2Bw%40mail.gmail.com Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Fri, Mar 29, 2019 at 2:16 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Attached updated patches. These patches are applied on top of 0001 > patch on parallel vacuum thread[1]. + bool index_cleanup = true; /* by default */ I think we should instead initialize index_cleanup to the reloption value, if there is one, or true if none, and then let it be overridden by the loop that follows, where whatever the user specifies in the SQL command is processed. That way, any explicitly-specified option within the command itself wins, and the reloption sets the default. As you have it, index cleanup is disabled when the reloption is set to false even if the user wrote VACUUM (INDEX_CLEANUP TRUE). + appendStringInfo(&buf, + _("%.0f tuples and %.0f item identifiers are left as dead.\n"), + vacrelstats->nleft_dead_tuples, + vacrelstats->nleft_dead_itemids); I tend to think we should omit this line entirely if both values are 0, as will very often be the case. + if ((params->options & VACOPT_FULL) != 0 && + (params->options & VACOPT_INDEX_CLEANUP) == 0) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("VACUUM option INDEX_CLEANUP cannot be set to false with FULL"))); I think it would be better to just ignore the INDEX_CLEANUP option when FULL is specified. I wasn't all that happy with the documentation changes you proposed. Please find attached a proposed set of doc changes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Вложения
On Fri, Mar 29, 2019 at 10:46 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Mar 29, 2019 at 2:16 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Attached updated patches. These patches are applied on top of 0001 > > patch on parallel vacuum thread[1]. > > + bool index_cleanup = true; /* by default */ > > I think we should instead initialize index_cleanup to the reloption > value, if there is one, or true if none, and then let it be overridden > by the loop that follows, where whatever the user specifies in the SQL > command is processed. That way, any explicitly-specified option > within the command itself wins, and the reloption sets the default. > As you have it, index cleanup is disabled when the reloption is set to > false even if the user wrote VACUUM (INDEX_CLEANUP TRUE). > Yeah, but since multiple relations might be specified in VACUUM command we need to process index_cleanup option after opened each relations. Maybe we need to process all options except for INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel() and process it in manner of you suggested after opened the relation. Is that right? > + appendStringInfo(&buf, > + _("%.0f tuples and %.0f item identifiers are left > as dead.\n"), > + vacrelstats->nleft_dead_tuples, > + vacrelstats->nleft_dead_itemids); > > I tend to think we should omit this line entirely if both values are > 0, as will very often be the case. Fixed. > > + if ((params->options & VACOPT_FULL) != 0 && > + (params->options & VACOPT_INDEX_CLEANUP) == 0) > + ereport(ERROR, > + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > + errmsg("VACUUM option INDEX_CLEANUP cannot be set to > false with FULL"))); > > I think it would be better to just ignore the INDEX_CLEANUP option > when FULL is specified. Okay, but why do we ignore that in this case while we complain in the case of FULL and DISABLE_PAGE_SKIPPING? > > I wasn't all that happy with the documentation changes you proposed. > Please find attached a proposed set of doc changes. Thank you! I've incorporated these changes. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Yeah, but since multiple relations might be specified in VACUUM > command we need to process index_cleanup option after opened each > relations. Maybe we need to process all options except for > INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel() > and process it in manner of you suggested after opened the relation. > Is that right? Blech, no, let's not do that. We'd better use some method that can indicate yes/no/default. Something like psql's trivalue thingy, but probably not exactly that. We can define an enum for this purpose, for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}. Or maybe there's some other way. But let's not pass bits of the parse tree around any more than really required. > > I think it would be better to just ignore the INDEX_CLEANUP option > > when FULL is specified. > > Okay, but why do we ignore that in this case while we complain in the > case of FULL and DISABLE_PAGE_SKIPPING? Well, that's a fair point, I guess. If we go that that route, we'll need to make sure that setting the reloption doesn't prevent VACUUM FULL from working -- the complaint must only affect an explicit option on the VACUUM command line. I think we should have a regression test for that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Mar 30, 2019 at 5:04 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Yeah, but since multiple relations might be specified in VACUUM > > command we need to process index_cleanup option after opened each > > relations. Maybe we need to process all options except for > > INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel() > > and process it in manner of you suggested after opened the relation. > > Is that right? > > Blech, no, let's not do that. We'd better use some method that can > indicate yes/no/default. Something like psql's trivalue thingy, but > probably not exactly that. We can define an enum for this purpose, > for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}. Or > maybe there's some other way. But let's not pass bits of the parse > tree around any more than really required. I've defined an enum VacOptTernaryValue representing enabled/disabled/default and added index_cleanup variable as the new enum type to VacuumParams. The vacuum options that uses the reloption value as the default value such as index cleanup and new truncation option can use this enum and set either enabled or disabled after opened the relation when it’s set to default. Please find the attached patches. > > > > I think it would be better to just ignore the INDEX_CLEANUP option > > > when FULL is specified. > > > > Okay, but why do we ignore that in this case while we complain in the > > case of FULL and DISABLE_PAGE_SKIPPING? > > Well, that's a fair point, I guess. If we go that that route, we'll > need to make sure that setting the reloption doesn't prevent VACUUM > FULL from working -- the complaint must only affect an explicit option > on the VACUUM command line. I think we should have a regression test > for that. I've added regression tests. Since we check it before setting index_cleanup based on reloptions we doesn't prevent VAUCUM FULL from working even when the vacuum_index_cleanup is false. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested I have read this patch. I like the concept and would like it to get committed. Question I have after reading the patch is around this construct: /* - * If there are no indexes then we can vacuum the page right now - * instead of doing a second scan. + * If there are no indexes we can vacuum the page right now instead of + * doing a second scan. Also we don't do that but forget dead tuples + * when index cleanup is disabled. */ This seems to change behavior on heap tuples, even though the option itself is documented to be about "Indexes" only. Thisneeds either better explanation what "forget dead tuples" means and that it does not lead to some kind of internal inconsistency,or documentation on what is the effect on heap tuples. This same block raises a question on "after I enable this option, do a vacuum, decide I don't like it, what do I need todo to disable it back?" - just set it back, or set and perform a vacuum, or set and perform a VACUUM FULL as somethingwas "forgotten"? It may happen this concept of "forgetting" is documented somewhere in the near comments but I'd prefer it to be stated explicitly. The new status of this patch is: Waiting on Author
Hello. At Mon, 1 Apr 2019 14:26:15 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoCKKwvgQgWxKwPPmVFJjQVj6c=oV9dtdiRthdV+WjnD4w@mail.gmail.com> > On Sat, Mar 30, 2019 at 5:04 AM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > Yeah, but since multiple relations might be specified in VACUUM > > > command we need to process index_cleanup option after opened each > > > relations. Maybe we need to process all options except for > > > INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel() > > > and process it in manner of you suggested after opened the relation. > > > Is that right? > > > > Blech, no, let's not do that. We'd better use some method that can > > indicate yes/no/default. Something like psql's trivalue thingy, but > > probably not exactly that. We can define an enum for this purpose, > > for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}. Or > > maybe there's some other way. But let's not pass bits of the parse > > tree around any more than really required. > > I've defined an enum VacOptTernaryValue representing > enabled/disabled/default and added index_cleanup variable as the new It is defined as ENABLED=0, DISABLED=1, DEFAULT=2. At leat ENABLED=0 and DISABLED=1 are misleading. > enum type to VacuumParams. The vacuum options that uses the reloption > value as the default value such as index cleanup and new truncation > option can use this enum and set either enabled or disabled after > opened the relation when it’s set to default. Please find the attached > patches. +static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def); This is actually a type converter of boolean. It is used to read VACUUM option but not used to read reloption. It seems useless. Finally the ternary value is determined to true or false before use. It is simple that index_cleanup finaly be read as bool. We could add another boolean to indicate that the value is set or not, but I think it would be better that the ternary type is a straightfoward expansion of bool.{DEFAULT = -1, DISABLED = 0, ENABLED = 1} and make sure that index_cleanup is not DEFAULT at a certain point. So, how about this? #define VACOPT_TERNARY_DEFAULT -1 typedef int VacOptTernaryValue; /* -1 is undecided, 0 is false, 1 is true */ /* No longer the value mustn't be left DEFAULT */ Assert (params->index_cleanup != VACOPT_TERNARY_DEFAULT); > > > > I think it would be better to just ignore the INDEX_CLEANUP option > > > > when FULL is specified. > > > > > > Okay, but why do we ignore that in this case while we complain in the > > > case of FULL and DISABLE_PAGE_SKIPPING? > > > > Well, that's a fair point, I guess. If we go that that route, we'll > > need to make sure that setting the reloption doesn't prevent VACUUM > > FULL from working -- the complaint must only affect an explicit option > > on the VACUUM command line. I think we should have a regression test > > for that. > > I've added regression tests. Since we check it before setting > index_cleanup based on reloptions we doesn't prevent VAUCUM FULL from > working even when the vacuum_index_cleanup is false. + errmsg("VACUUM option INDEX_CLEANUP cannot be set to false with FULL"))); I'm not one to talk on this, but this seems somewhat confused. "VACUUM option INDEX_CLEANUP cannot be set to false with FULL being specified" or "INDEX_CLEANUP cannot be disabled for VACUUM FULL"? And in the following part: + /* Set index cleanup option based on reloptions */ + if (params->index_cleanup == VACUUM_OPTION_DEFAULT) + { + if (onerel->rd_options == NULL || + ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup) + params->index_cleanup = VACUUM_OPTION_ENABLED; + else + params->index_cleanup = VACUUM_OPTION_DISABLED; + } + The option should not be false while VACUUM FULL, and maybe we should complain in WARNING or NOTICE that the relopt is ignored. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Apr 3, 2019 at 10:56 AM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > Hello. > > At Mon, 1 Apr 2019 14:26:15 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoCKKwvgQgWxKwPPmVFJjQVj6c=oV9dtdiRthdV+WjnD4w@mail.gmail.com> > > On Sat, Mar 30, 2019 at 5:04 AM Robert Haas <robertmhaas@gmail.com> wrote: > > > > > > On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > Yeah, but since multiple relations might be specified in VACUUM > > > > command we need to process index_cleanup option after opened each > > > > relations. Maybe we need to process all options except for > > > > INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel() > > > > and process it in manner of you suggested after opened the relation. > > > > Is that right? > > > > > > Blech, no, let's not do that. We'd better use some method that can > > > indicate yes/no/default. Something like psql's trivalue thingy, but > > > probably not exactly that. We can define an enum for this purpose, > > > for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}. Or > > > maybe there's some other way. But let's not pass bits of the parse > > > tree around any more than really required. > > > > I've defined an enum VacOptTernaryValue representing > > enabled/disabled/default and added index_cleanup variable as the new > Thank you for reviewing the patch! > It is defined as ENABLED=0, DISABLED=1, DEFAULT=2. At leat > ENABLED=0 and DISABLED=1 are misleading. Indeed, will fix. > > > enum type to VacuumParams. The vacuum options that uses the reloption > > value as the default value such as index cleanup and new truncation > > option can use this enum and set either enabled or disabled after > > opened the relation when it’s set to default. Please find the attached > > patches. > > +static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def); > > This is actually a type converter of boolean. It is used to read > VACUUM option but not used to read reloption. It seems useless. > > > Finally the ternary value is determined to true or false before > use. It is simple that index_cleanup finaly be read as bool. We > could add another boolean to indicate that the value is set or > not, but I think it would be better that the ternary type is a > straightfoward expansion of bool.{DEFAULT = -1, DISABLED = 0, > ENABLED = 1} and make sure that index_cleanup is not DEFAULT at a > certain point. > > So, how about this? > > #define VACOPT_TERNARY_DEFAULT -1 > typedef int VacOptTernaryValue; /* -1 is undecided, 0 is false, 1 is true */ Hmm, if we do that we set either VAOPT_TERNARY_DEFAULT, true or false to index_cleanup, but I'm not sure this is a good approach. I think we would want VACOPT_TERNARY_TRUE and VACOPT_TERNARY_FALSE as we defined new type as a ternary value and already have VACOPT_TERNARY_DEFAULT. > > /* No longer the value mustn't be left DEFAULT */ > Assert (params->index_cleanup != VACOPT_TERNARY_DEFAULT); Agreed, will add it. > > > > > > > I think it would be better to just ignore the INDEX_CLEANUP option > > > > > when FULL is specified. > > > > > > > > Okay, but why do we ignore that in this case while we complain in the > > > > case of FULL and DISABLE_PAGE_SKIPPING? > > > > > > Well, that's a fair point, I guess. If we go that that route, we'll > > > need to make sure that setting the reloption doesn't prevent VACUUM > > > FULL from working -- the complaint must only affect an explicit option > > > on the VACUUM command line. I think we should have a regression test > > > for that. > > > > I've added regression tests. Since we check it before setting > > index_cleanup based on reloptions we doesn't prevent VAUCUM FULL from > > working even when the vacuum_index_cleanup is false. > > + errmsg("VACUUM option INDEX_CLEANUP cannot be set to false with FULL"))); > > I'm not one to talk on this, but this seems somewhat confused. > > "VACUUM option INDEX_CLEANUP cannot be set to false with FULL being specified" > > or > > "INDEX_CLEANUP cannot be disabled for VACUUM FULL"? I prefer the former, will fix. > > > And in the following part: > > + /* Set index cleanup option based on reloptions */ > + if (params->index_cleanup == VACUUM_OPTION_DEFAULT) > + { > + if (onerel->rd_options == NULL || > + ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup) > + params->index_cleanup = VACUUM_OPTION_ENABLED; > + else > + params->index_cleanup = VACUUM_OPTION_DISABLED; > + } > + > > The option should not be false while VACUUM FULL, I think that we need to complain only when INDEX_CLEANUP option is disabled by an explicit option on the VACUUM command and FULL option is specified. It's no problem when vacuum_index_cleanup is false and FULL option is true. Since internally we don't use index cleanup when vacuum full I guess that we don't need to require index_cleanup being always true even when full option is specified. > and maybe we > should complain in WARNING or NOTICE that the relopt is ignored. I think when users want to control index cleanup behavior manually they specify INDEX_CLEANUP option on the VACUUM command. So it seems to me that overwriting a reloption by an explicit option would be a natural behavior. I'm concerned that these message would rather confuse users. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
At Wed, 3 Apr 2019 12:10:03 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoBkaHka5sav5N6vvoKS9qpmrWRBdyNGP8S7M0SsPd0iyQ@mail.gmail.com> > > And in the following part: > > > > + /* Set index cleanup option based on reloptions */ > > + if (params->index_cleanup == VACUUM_OPTION_DEFAULT) > > + { > > + if (onerel->rd_options == NULL || > > + ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup) > > + params->index_cleanup = VACUUM_OPTION_ENABLED; > > + else > > + params->index_cleanup = VACUUM_OPTION_DISABLED; > > + } > > + > > > > The option should not be false while VACUUM FULL, > > I think that we need to complain only when INDEX_CLEANUP option is > disabled by an explicit option on the VACUUM command and FULL option > is specified. It's no problem when vacuum_index_cleanup is false and > FULL option is true. Since internally we don't use index cleanup when > vacuum full I guess that we don't need to require index_cleanup being > always true even when full option is specified. I know it's safe. It's just about integrity of option values. So I don't insist on that. > > and maybe we > > should complain in WARNING or NOTICE that the relopt is ignored. > > I think when users want to control index cleanup behavior manually > they specify INDEX_CLEANUP option on the VACUUM command. So it seems > to me that overwriting a reloption by an explicit option would be a > natural behavior. I'm concerned that these message would rather > confuse users. If it "cannot be specified with FULL", it seems strange that it's safe being specified by reloptions. I'm rather thinking that INDEX_CLEANUP = false is ignorable even being specified with FULL option, aand DISABLE_PAGE_SKIPPING for VACUUM FULL shuld be ignored since VACUUM FULL doesn't skip pages in the first place. Couldn't we silence the DISABLE_PAGE_SKIPPING & FULL case instead of complaining about INDEX_CLEANUP & FULL? If so, I feel just ignoring the relopt cases is reasonable. Yeah, perhaps I'm warrying too much. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Apr 3, 2019 at 1:17 PM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > At Wed, 3 Apr 2019 12:10:03 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoBkaHka5sav5N6vvoKS9qpmrWRBdyNGP8S7M0SsPd0iyQ@mail.gmail.com> > > > And in the following part: > > > > > > + /* Set index cleanup option based on reloptions */ > > > + if (params->index_cleanup == VACUUM_OPTION_DEFAULT) > > > + { > > > + if (onerel->rd_options == NULL || > > > + ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup) > > > + params->index_cleanup = VACUUM_OPTION_ENABLED; > > > + else > > > + params->index_cleanup = VACUUM_OPTION_DISABLED; > > > + } > > > + > > > > > > The option should not be false while VACUUM FULL, > > > > I think that we need to complain only when INDEX_CLEANUP option is > > disabled by an explicit option on the VACUUM command and FULL option > > is specified. It's no problem when vacuum_index_cleanup is false and > > FULL option is true. Since internally we don't use index cleanup when > > vacuum full I guess that we don't need to require index_cleanup being > > always true even when full option is specified. > > I know it's safe. It's just about integrity of option values. So > I don't insist on that. > > > > and maybe we > > > should complain in WARNING or NOTICE that the relopt is ignored. > > > > I think when users want to control index cleanup behavior manually > > they specify INDEX_CLEANUP option on the VACUUM command. So it seems > > to me that overwriting a reloption by an explicit option would be a > > natural behavior. I'm concerned that these message would rather > > confuse users. > > If it "cannot be specified with FULL", it seems strange that it's > safe being specified by reloptions. > > I'm rather thinking that INDEX_CLEANUP = false is ignorable even > being specified with FULL option, aand DISABLE_PAGE_SKIPPING for > VACUUM FULL shuld be ignored since VACUUM FULL doesn't skip pages > in the first place. > > Couldn't we silence the DISABLE_PAGE_SKIPPING & FULL case instead > of complaining about INDEX_CLEANUP & FULL? If so, I feel just > ignoring the relopt cases is reasonable. Agreed with being silent even when INDEX_CLEANUP/vacuum_index_cleanup = false and FULL = true. For DISABLE_PAGE_SKIPPING, it should be a separate patch and changes the existing behavior. Maybe need other discussion. For VacOptTernaryValue part, I've incorporated the review comments but left the new enum type since it seems to be more straightforward for now. I might change that if there are other way. Attached the updated version patches including the DISABLE_PAGE_SKIPPING part (0003). Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Wed, Apr 3, 2019 at 1:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Attached the updated version patches including the > DISABLE_PAGE_SKIPPING part (0003). I am confused about nleft_dead_tuples. It looks like it gets incremented whenever we set tupgone = true, regardless of whether we are doing index cleanup. But if we ARE doing index cleanup then the dead tuple will not be left. And if we are not doing index vacuum then we still don't need this for anything, because tups_vacuumed is counting the same thing. I may be confused. But if I'm not, then I think this should just be ripped out, and we should only keep nleft_dead_itemids. As far as VacOptTernaryValue, I think it would be safer to change this so that VACOPT_TERNARY_DEFAULT = 0. That way palloc0 will fill in the value that people are likely to want by default, which makes it less likely that people will accidentally write future code that doesn't clean up indexes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hello. At Wed, 3 Apr 2019 11:55:00 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+Tgmoas581jpJ0TPaA38OhjXHgbLy8z1fuuHH7CaNkrboZJeA@mail.gmail.com> > On Wed, Apr 3, 2019 at 1:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Attached the updated version patches including the > > DISABLE_PAGE_SKIPPING part (0003). > > I am confused about nleft_dead_tuples. It looks like it gets > incremented whenever we set tupgone = true, regardless of whether we > are doing index cleanup. But if we ARE doing index cleanup then the > dead tuple will not be left. And if we are not doing index vacuum > then we still don't need this for anything, because tups_vacuumed is > counting the same thing. I may be confused. But if I'm not, then I > think this should just be ripped out, and we should only keep > nleft_dead_itemids. tups_vacuumed is including heap_page_prune()ed tuples, which aren't counted as "tupgone". > As far as VacOptTernaryValue, I think it would be safer to change this > so that VACOPT_TERNARY_DEFAULT = 0. That way palloc0 will fill in the > value that people are likely to want by default, which makes it less > likely that people will accidentally write future code that doesn't > clean up indexes. It's convincing. My compalint was enabled=0 and disabled=1 is confusing so I'm fine with default=0, disabled=1, enabled=2. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, Apr 4, 2019 at 9:18 AM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > Hello. > > At Wed, 3 Apr 2019 11:55:00 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+Tgmoas581jpJ0TPaA38OhjXHgbLy8z1fuuHH7CaNkrboZJeA@mail.gmail.com> > > On Wed, Apr 3, 2019 at 1:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > Attached the updated version patches including the > > > DISABLE_PAGE_SKIPPING part (0003). > > > > I am confused about nleft_dead_tuples. It looks like it gets > > incremented whenever we set tupgone = true, regardless of whether we > > are doing index cleanup. But if we ARE doing index cleanup then the > > dead tuple will not be left. And if we are not doing index vacuum > > then we still don't need this for anything, because tups_vacuumed is > > counting the same thing. I may be confused. But if I'm not, then I > > think this should just be ripped out, and we should only keep > > nleft_dead_itemids. > > tups_vacuumed is including heap_page_prune()ed tuples, which > aren't counted as "tupgone". Yes. tup_vacuumed counts not only HOT pruned tuples but also tuples that became dead after heap_page_prune(). When index clenaup is disabled, the former leaves only itemid whereas the latter leaves itemid and heap tuple as we don't remove. nleft_dead_tuples counts only the latter to report precisely. I think nleft_dead_tuples should be incremented only when index cleanup is disabled, and the that part comment should be polished. > > > As far as VacOptTernaryValue, I think it would be safer to change this > > so that VACOPT_TERNARY_DEFAULT = 0. That way palloc0 will fill in the > > value that people are likely to want by default, which makes it less > > likely that people will accidentally write future code that doesn't > > clean up indexes. > > It's convincing. My compalint was enabled=0 and disabled=1 is > confusing so I'm fine with default=0, disabled=1, enabled=2. Okay, fixed. Attached the updated version patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Attached the updated version patch. Committed with a little bit of documentation tweaking. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 4/4/19, 12:06 PM, "Robert Haas" <robertmhaas@gmail.com> wrote: > Committed with a little bit of documentation tweaking. Thanks! I noticed a very small typo in the new documentation. diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml index fdd8151220..c652f8b0bc 100644 --- a/doc/src/sgml/ref/vacuum.sgml +++ b/doc/src/sgml/ref/vacuum.sgml @@ -199,7 +199,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet and the table itself will accumulate dead line pointers that cannot be removed until index cleanup is completed. This option has no effect for tables that do not have an index and is ignored if the - <literal>FULL</literal> is used. + <literal>FULL</literal> option is used. </para> </listitem> </varlistentry> Nathan
On Fri, Apr 5, 2019 at 4:06 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Attached the updated version patch. > > Committed with a little bit of documentation tweaking. > Thank you for committing them! Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Apr 4, 2019 at 5:37 PM Bossart, Nathan <bossartn@amazon.com> wrote: > Thanks! I noticed a very small typo in the new documentation. Committed, thanks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Apr 5, 2019 at 10:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Apr 5, 2019 at 4:06 AM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > Attached the updated version patch. > > > > Committed with a little bit of documentation tweaking. > > > > Thank you for committing them! > BTW should we support this option for toast tables as well? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Hi, On 2019-04-04 15:05:55 -0400, Robert Haas wrote: > On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Attached the updated version patch. > > Committed with a little bit of documentation tweaking. I've closed the commitfest entry. I hope that's accurate? Greetings, Andres Freund
On Sat, Apr 6, 2019 at 10:23 AM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2019-04-04 15:05:55 -0400, Robert Haas wrote: > > On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > Attached the updated version patch. > > > > Committed with a little bit of documentation tweaking. > > I've closed the commitfest entry. I hope that's accurate? > Yes, but Fujii-san pointed out that this option doesn't support toast tables and I think there is not specific reason why not supporting them. So it might be good to add toast.vacuum_index_cleanup. Attached patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Sat, Apr 06, 2019 at 11:31:31AM +0900, Masahiko Sawada wrote: > Yes, but Fujii-san pointed out that this option doesn't support toast > tables and I think there is not specific reason why not supporting > them. So it might be good to add toast.vacuum_index_cleanup. Attached > patch. Being able to control that option at toast level sounds sensible. I have added an open item about that. -- Michael
Вложения
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: >> Attached the updated version patch. > Committed with a little bit of documentation tweaking. topminnow just failed an assertion from this patch: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=topminnow&dt=2019-04-14%2011%3A01%3A48 The symptoms are: TRAP: FailedAssertion("!((params->index_cleanup == VACOPT_TERNARY_ENABLED && nleft_dead_tuples == 0 && nleft_dead_itemids== 0) || params->index_cleanup == VACOPT_TERNARY_DISABLED)", File: "/home/nm/farm/mipsel_deb8_gcc_32/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c",Line: 1404) ... 2019-04-14 14:49:16.328 CEST [15282:5] LOG: server process (PID 18985) was terminated by signal 6: Aborted 2019-04-14 14:49:16.328 CEST [15282:6] DETAIL: Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_depend Just looking at the logic around index_cleanup, I rather think that that assertion is flat out wrong: + /* No dead tuples should be left if index cleanup is enabled */ + Assert((params->index_cleanup == VACOPT_TERNARY_ENABLED && + nleft_dead_tuples == 0 && nleft_dead_itemids == 0) || + params->index_cleanup == VACOPT_TERNARY_DISABLED); Either it's wrong, or this is: + /* + * Since this dead tuple will not be vacuumed and + * ignored when index cleanup is disabled we count + * count it for reporting. + */ + if (params->index_cleanup == VACOPT_TERNARY_ENABLED) + nleft_dead_tuples++; The poor quality of that comment suggests that maybe the code is just as confused. (I also think that that "ternary option" stuff is unreadably overdesigned notation, which possibly contributed to getting this wrong. If even the author can't keep it straight, it's unreadable.) regards, tom lane
On Mon, Apr 15, 2019 at 12:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Robert Haas <robertmhaas@gmail.com> writes: > > On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > >> Attached the updated version patch. > > > Committed with a little bit of documentation tweaking. > > topminnow just failed an assertion from this patch: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=topminnow&dt=2019-04-14%2011%3A01%3A48 > > The symptoms are: > > TRAP: FailedAssertion("!((params->index_cleanup == VACOPT_TERNARY_ENABLED && nleft_dead_tuples == 0 && nleft_dead_itemids== 0) || params->index_cleanup == VACOPT_TERNARY_DISABLED)", File: "/home/nm/farm/mipsel_deb8_gcc_32/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c",Line: 1404) > ... > 2019-04-14 14:49:16.328 CEST [15282:5] LOG: server process (PID 18985) was terminated by signal 6: Aborted > 2019-04-14 14:49:16.328 CEST [15282:6] DETAIL: Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_depend > > Just looking at the logic around index_cleanup, I rather think that > that assertion is flat out wrong: > > + /* No dead tuples should be left if index cleanup is enabled */ > + Assert((params->index_cleanup == VACOPT_TERNARY_ENABLED && > + nleft_dead_tuples == 0 && nleft_dead_itemids == 0) || > + params->index_cleanup == VACOPT_TERNARY_DISABLED); > > Either it's wrong, or this is: > > + /* > + * Since this dead tuple will not be vacuumed and > + * ignored when index cleanup is disabled we count > + * count it for reporting. > + */ > + if (params->index_cleanup == VACOPT_TERNARY_ENABLED) > + nleft_dead_tuples++; > Ugh, I think the assertion is right but the above condition is completely wrong. We should increment nleft_dead_tuples when index cleanup is *not* enabled. For nleft_dead_itemids we require that index cleanup is disabled as follows. { /* * Here, we have indexes but index cleanup is disabled. Instead of * vacuuming the dead tuples on the heap, we just forget them. * * Note that vacrelstats->dead_tuples could have tuples which * became dead after HOT-pruning but are not marked dead yet. * We do not process them because it's a very rare condition, and * the next vacuum will process them anyway. */ Assert(params->index_cleanup == VACOPT_TERNARY_DISABLED); nleft_dead_itemids += vacrelstats->num_dead_tuples; } Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Apr 15, 2019 at 9:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, Apr 15, 2019 at 12:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Robert Haas <robertmhaas@gmail.com> writes: > > > On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > >> Attached the updated version patch. > > > > > Committed with a little bit of documentation tweaking. > > > > topminnow just failed an assertion from this patch: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=topminnow&dt=2019-04-14%2011%3A01%3A48 > > > > The symptoms are: > > > > TRAP: FailedAssertion("!((params->index_cleanup == VACOPT_TERNARY_ENABLED && nleft_dead_tuples == 0 && nleft_dead_itemids== 0) || params->index_cleanup == VACOPT_TERNARY_DISABLED)", File: "/home/nm/farm/mipsel_deb8_gcc_32/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c",Line: 1404) > > ... > > 2019-04-14 14:49:16.328 CEST [15282:5] LOG: server process (PID 18985) was terminated by signal 6: Aborted > > 2019-04-14 14:49:16.328 CEST [15282:6] DETAIL: Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_depend > > > > Just looking at the logic around index_cleanup, I rather think that > > that assertion is flat out wrong: > > > > + /* No dead tuples should be left if index cleanup is enabled */ > > + Assert((params->index_cleanup == VACOPT_TERNARY_ENABLED && > > + nleft_dead_tuples == 0 && nleft_dead_itemids == 0) || > > + params->index_cleanup == VACOPT_TERNARY_DISABLED); > > > > Either it's wrong, or this is: > > > > + /* > > + * Since this dead tuple will not be vacuumed and > > + * ignored when index cleanup is disabled we count > > + * count it for reporting. > > + */ > > + if (params->index_cleanup == VACOPT_TERNARY_ENABLED) > > + nleft_dead_tuples++; > > > > Ugh, I think the assertion is right but the above condition is > completely wrong. We should increment nleft_dead_tuples when index > cleanup is *not* enabled. Here is a draft patch to fix this issue. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
Masahiko Sawada <sawada.mshk@gmail.com> writes: >> Ugh, I think the assertion is right but the above condition is >> completely wrong. We should increment nleft_dead_tuples when index >> cleanup is *not* enabled. > Here is a draft patch to fix this issue. So the real issue here, I fear, is that we've got no consistent testing of the whole case block for HEAPTUPLE_DEAD, as you can easily confirm by checking the code coverage report at https://coverage.postgresql.org/src/backend/access/heap/vacuumlazy.c.gcov.html This is perhaps unsurprising, given the code comment that points out that we can only reach that block if the tuple's state changed since heap_page_prune() a few lines above. Still, it means that this patch hasn't been tested in that scenario, until we were lucky enough for a slow buildfarm machine like topminnow to hit it. What's more, because that block is the only way for "tupgone" to be set, we also don't reach the "if (tupgone)" block at lines 1183ff. And I think this patch has probably broken that, too. Surely, if we are not going to remove the tuple, we should not increment tups_vacuumed? And maybe we have to freeze it instead? How is it that we can, or should, treat this situation as different from a dead-but-not-removable tuple? I have a very strong feeling that this patch was not fully baked. BTW, I can reproduce the crash fairly easily (within a few cycles of the regression tests) by (a) inserting pg_usleep(10000) after the heap_page_prune call, and (b) making autovacuum much more aggressive. regards, tom lane
On Mon, Apr 15, 2019 at 1:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Masahiko Sawada <sawada.mshk@gmail.com> writes: > >> Ugh, I think the assertion is right but the above condition is > >> completely wrong. We should increment nleft_dead_tuples when index > >> cleanup is *not* enabled. > > > Here is a draft patch to fix this issue. > > So the real issue here, I fear, is that we've got no consistent testing > of the whole case block for HEAPTUPLE_DEAD, as you can easily confirm > by checking the code coverage report at > https://coverage.postgresql.org/src/backend/access/heap/vacuumlazy.c.gcov.html > > This is perhaps unsurprising, given the code comment that points out that > we can only reach that block if the tuple's state changed since > heap_page_prune() a few lines above. Still, it means that this patch > hasn't been tested in that scenario, until we were lucky enough for > a slow buildfarm machine like topminnow to hit it. > > What's more, because that block is the only way for "tupgone" to be > set, we also don't reach the "if (tupgone)" block at lines 1183ff. > And I think this patch has probably broken that, too. Surely, if we > are not going to remove the tuple, we should not increment tups_vacuumed? > And maybe we have to freeze it instead? How is it that we can, or should, > treat this situation as different from a dead-but-not-removable tuple? > > I have a very strong feeling that this patch was not fully baked. I think you're right, but I don't understand the comment in the preceding paragraph. How does this problem prevent tupgone from getting set? It looks to me like nleft_dead_tuples should be ripped out. It's basically trying to count the same thing as tups_vacuumed, and there doesn't seem to be any need to count that thing twice. And then just below this block: /* save stats for use later */ vacrelstats->tuples_deleted = tups_vacuumed; vacrelstats->new_dead_tuples = nkeep; vacrelstats->nleft_dead_itemids = nleft_dead_itemids; We should do something like: if (params->index_cleanup == VACOPT_TERNARY_DISABLED) { nkeep += tups_vacuumed; tups_vacuumed = 0; } -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Apr 15, 2019 at 1:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I have a very strong feeling that this patch was not fully baked. > I think you're right, but I don't understand the comment in the > preceding paragraph. How does this problem prevent tupgone from > getting set? My point is that I suspect that tupgone *shouldn't* get set. It's not (going to be) gone. > It looks to me like nleft_dead_tuples should be ripped out. That was pretty much what I was thinking too. It makes more sense just to treat this case identically to dead-but-not-yet-removable. I have substantial doubts about nleft_dead_itemids being worth anything, as well. > We should do something like: > if (params->index_cleanup == VACOPT_TERNARY_DISABLED) > { > nkeep += tups_vacuumed; > tups_vacuumed = 0; > } No. I'm thinking there should be exactly one test of index_cleanup in this logic, and what it would be is along the lines of if (HeapTupleIsHotUpdated(&tuple) || HeapTupleIsHeapOnly(&tuple) || + params->index_cleanup == VACOPT_TERNARY_DISABLED) nkeep += 1; else In general, this thing has a strong whiff of "large patch with a small patch struggling to get out". regards, tom lane
On Mon, Apr 15, 2019 at 3:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > No. I'm thinking there should be exactly one test of index_cleanup > in this logic, and what it would be is along the lines of > > if (HeapTupleIsHotUpdated(&tuple) || > HeapTupleIsHeapOnly(&tuple) || > + params->index_cleanup == VACOPT_TERNARY_DISABLED) > nkeep += 1; > else I'm not sure that's correct. If you do that, it'll end up in the non-tupgone case, which might try to freeze a tuple that should've been removed. Or am I confused? My idea of the mechanism of action of this patch is that we accumulate the tuples just as if we were going to vacuum them, but then at the end of each page we forget them all, sorta like there are no indexes. In that mental model, I don't really see why this part of this logic needs any adjustment at all vs. pre-patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Apr 15, 2019 at 3:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> No. I'm thinking there should be exactly one test of index_cleanup >> in this logic, and what it would be is along the lines of ... > I'm not sure that's correct. If you do that, it'll end up in the > non-tupgone case, which might try to freeze a tuple that should've > been removed. Or am I confused? If we're failing to remove it, and it's below the desired freeze horizon, then we'd darn well better freeze it instead, no? Since we know that the tuple only just became dead, I suspect that the case would be unreachable in practice. But the approach you propose risks violating the invariant that all old tuples will either be removed or frozen. regards, tom lane
On Mon, Apr 15, 2019 at 09:07:16PM -0400, Tom Lane wrote: > If we're failing to remove it, and it's below the desired freeze > horizon, then we'd darn well better freeze it instead, no? > > Since we know that the tuple only just became dead, I suspect > that the case would be unreachable in practice. But the approach > you propose risks violating the invariant that all old tuples > will either be removed or frozen. Please note that I have added an open item for this investigation (see "topminnow triggered assertion failure with vacuum_index_cleanup"). -- Michael
Вложения
On Tue, Apr 16, 2019 at 4:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Robert Haas <robertmhaas@gmail.com> writes: > > On Mon, Apr 15, 2019 at 1:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> I have a very strong feeling that this patch was not fully baked. > > > I think you're right, but I don't understand the comment in the > > preceding paragraph. How does this problem prevent tupgone from > > getting set? > > My point is that I suspect that tupgone *shouldn't* get set. > It's not (going to be) gone. > > > It looks to me like nleft_dead_tuples should be ripped out. > > That was pretty much what I was thinking too. tups_vacuumed counts not only (1)dead-but-not-yet-removable tuple but also HOT-pruned tuples. These HOT-pruned tuples include both (2)the tuples we removed both its itemid and tuple storage and the tuples (3)we removed only its tuple storage and marked itemid as dead. So we cannot add tups_vacuumed to nkeeps as it includes completely removed tuple like tuples-(2). I added nleft_dead_itemids to count only tuples-(3) and nleft_dead_tuples to count only tuples-(1) for reporting. Tuples-(2) are removed even when index cleanup is disabled. > It makes more sense > just to treat this case identically to dead-but-not-yet-removable. > I have substantial doubts about nleft_dead_itemids being worth > anything, as well. I think that the adding tuples-(3) to nkeeps would be a good idea. If we do that, nleft_dead_tuples is no longer necessary. On the other hand, I think we need nleft_dead_itemids to report how many itemids we left when index cleanup is disabled. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I'm not sure that's correct. If you do that, it'll end up in the > > non-tupgone case, which might try to freeze a tuple that should've > > been removed. Or am I confused? > > If we're failing to remove it, and it's below the desired freeze > horizon, then we'd darn well better freeze it instead, no? I don't know that that's safe. IIRC, the freeze code doesn't cope nicely with being given a tuple that actually ought to have been deleted. It'll just freeze it anyway, which is obviously bad. Unless this has been changed since I last looked at it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2019-Apr-16, Robert Haas wrote: > On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > I'm not sure that's correct. If you do that, it'll end up in the > > > non-tupgone case, which might try to freeze a tuple that should've > > > been removed. Or am I confused? > > > > If we're failing to remove it, and it's below the desired freeze > > horizon, then we'd darn well better freeze it instead, no? > > I don't know that that's safe. IIRC, the freeze code doesn't cope > nicely with being given a tuple that actually ought to have been > deleted. It'll just freeze it anyway, which is obviously bad. Umm, but if we fail to freeze it, we'll leave a tuple around that's below the relfrozenxid for the table, causing later pg_commit to be truncated and error messages saying that the tuple cannot be read, no? I remember that for a similar case in multixact-land, what we do is generate a fresh multixact that carries the members that are still alive (ie. those that cause the multixact to be kept rather than remove it), and relabel the tuple with that one. So the old multixact can be removed safely. Obviously we cannot do that for XIDs, but I do wonder what can possibly cause a tuple to be unfreezable yet the XID to be below the freeze horizon. Surely if the transaction is that old, we should have complained about it, and generated a freeze horizon that was even older? > Unless this has been changed since I last looked at it. I don't think so. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Apr 16, 2019 at 11:26 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > I'm not sure that's correct. If you do that, it'll end up in the > > > non-tupgone case, which might try to freeze a tuple that should've > > > been removed. Or am I confused? > > > > If we're failing to remove it, and it's below the desired freeze > > horizon, then we'd darn well better freeze it instead, no? > > I don't know that that's safe. IIRC, the freeze code doesn't cope > nicely with being given a tuple that actually ought to have been > deleted. It'll just freeze it anyway, which is obviously bad. Hmm, I think that we already choose to leave HEAPTUPLE_DEAD tuples and might freeze them if HeapTupleIsHotUpdated() || HeapTupleIsHeapOnly(L1083 at vacuumlazy.c) is true, which actually have to be deleted. What difference between these tuples and the tuples that we intentionally leave when index cleanup is disabled? Maybe I'm missing something and confused. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2019-Apr-16, Robert Haas wrote: >> On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> If we're failing to remove it, and it's below the desired freeze >>> horizon, then we'd darn well better freeze it instead, no? >> I don't know that that's safe. IIRC, the freeze code doesn't cope >> nicely with being given a tuple that actually ought to have been >> deleted. It'll just freeze it anyway, which is obviously bad. > Umm, but if we fail to freeze it, we'll leave a tuple around that's > below the relfrozenxid for the table, causing later pg_commit to be > truncated and error messages saying that the tuple cannot be read, no? Yeah. If you think that it's unsafe to freeze the tuple, then this entire patch is ill-conceived and needs to be reverted. I don't know how much more plainly I can put it: index_cleanup cannot be a license to ignore the freeze horizon. (Indeed, I do not quite see what the point of the feature is otherwise. Why would you run a vacuum with this option at all, if not to increase the table's relfrozenxid? But you can *not* advance relfrozenxid if you left old XIDs behind.) regards, tom lane
Hi, On 2019-04-16 10:54:34 -0400, Alvaro Herrera wrote: > On 2019-Apr-16, Robert Haas wrote: > > On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > I'm not sure that's correct. If you do that, it'll end up in the > > > > non-tupgone case, which might try to freeze a tuple that should've > > > > been removed. Or am I confused? > > > > > > If we're failing to remove it, and it's below the desired freeze > > > horizon, then we'd darn well better freeze it instead, no? > > > > I don't know that that's safe. IIRC, the freeze code doesn't cope > > nicely with being given a tuple that actually ought to have been > > deleted. It'll just freeze it anyway, which is obviously bad. > > Umm, but if we fail to freeze it, we'll leave a tuple around that's > below the relfrozenxid for the table, causing later pg_commit to be > truncated and error messages saying that the tuple cannot be read, no? Is the below-relfrozenxid case actually reachable? Isn't the theory of that whole codeblock that we ought to only get there if a transaction concurrently commits? * Ordinarily, DEAD tuples would have been removed by * heap_page_prune(), but it's possible that the tuple * state changed since heap_page_prune() looked. In * particular an INSERT_IN_PROGRESS tuple could have * changed to DEAD if the inserter aborted. So this * cannot be considered an error condition. And in case there was a concurrent transaction at the time of the heap_page_prune(), it got to be above the OldestXmin passed to HeapTupleSatisfiesVacuum() to - as it's the same OldestXmin value. And as FreezeLimit should always be older than than OldestXmin, we should never get into a situation where heap_page_prune() couldn't prune something that we would have been forced to remove? > > I don't know that that's safe. IIRC, the freeze code doesn't cope > > nicely with being given a tuple that actually ought to have been > > deleted. It'll just freeze it anyway, which is obviously bad. > > > > Unless this has been changed since I last looked at it. > > I don't think so. I think it has changed a bit - these days heap_prepare_freeze_tuple() will detect that case, and error out: /* * If we freeze xmax, make absolutely sure that it's not an XID * that is important. (Note, a lock-only xmax can be removed * independent of committedness, since a committed lock holder has * released the lock). */ if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) && TransactionIdDidCommit(xid)) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), errmsg_internal("cannot freeze committed xmax %u", xid))); and the equivalent multixact case: if (TransactionIdDidCommit(xid)) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), errmsg_internal("cannot freeze committed update xid %u", xid))); We even complain if xmin is uncommitted and would need to be frozen: if (TransactionIdPrecedes(xid, cutoff_xid)) { if (!TransactionIdDidCommit(xid)) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), errmsg_internal("uncommitted xmin %u from before xid cutoff %u needs to be frozen", xid, cutoff_xid))); I IIRC added that after one of the multixact issues lead to precisely that, heap_prepare_freeze_tuple() leading to a valid xmax just being emptied out, resurfacing dead tuples (and HOT corruption and such). These messages are obviously intended to be a backstop against continuing to corrupt further, than actually something a user should ever see in a working system. Greetings, Andres Freund
Hi, On 2019-04-16 11:38:01 -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > On 2019-Apr-16, Robert Haas wrote: > >> On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >>> If we're failing to remove it, and it's below the desired freeze > >>> horizon, then we'd darn well better freeze it instead, no? > > >> I don't know that that's safe. IIRC, the freeze code doesn't cope > >> nicely with being given a tuple that actually ought to have been > >> deleted. It'll just freeze it anyway, which is obviously bad. > > > Umm, but if we fail to freeze it, we'll leave a tuple around that's > > below the relfrozenxid for the table, causing later pg_commit to be > > truncated and error messages saying that the tuple cannot be read, no? > > Yeah. If you think that it's unsafe to freeze the tuple, then this > entire patch is ill-conceived and needs to be reverted. I don't > know how much more plainly I can put it: index_cleanup cannot be a > license to ignore the freeze horizon. (Indeed, I do not quite see > what the point of the feature is otherwise. Why would you run a > vacuum with this option at all, if not to increase the table's > relfrozenxid? But you can *not* advance relfrozenxid if you left > old XIDs behind.) As I just wrote - I don't think this codepath can ever deal with tuples that old. Greetings, Andres Freund
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> If we're failing to remove it, and it's below the desired freeze >> horizon, then we'd darn well better freeze it instead, no? > I don't know that that's safe. IIRC, the freeze code doesn't cope > nicely with being given a tuple that actually ought to have been > deleted. It'll just freeze it anyway, which is obviously bad. Looking at heap_prepare_freeze_tuple, it looks to me like it'd notice the problem and throw an error. The two possible reasons for a tuple to be dead are xmin aborted and xmax committed, right? There are tests in there that will complain if either of those is true and the xid is below the freeze horizon. Given that we don't get here except when the tuple has just become dead, it probably is all right to assume that it can't possibly get selected for freezing, and let those tests backstop the assumption. (BTW, I don't understand why that code will throw "found xmin %u from before relfrozenxid %u" if HeapTupleHeaderXminFrozen is true? Shouldn't the whole if-branch at lines 6113ff be skipped if xmin_frozen?) regards, tom lane PS: I see that mandrill just replicated the topminnow failure that started this discussion.
Hi, On 2019-04-16 12:01:36 -0400, Tom Lane wrote: > (BTW, I don't understand why that code will throw "found xmin %u from > before relfrozenxid %u" if HeapTupleHeaderXminFrozen is true? Shouldn't > the whole if-branch at lines 6113ff be skipped if xmin_frozen?) I *think* that just looks odd, but isn't actively wrong. That's because TransactionIdIsNormal() won't trigger, as: #define HeapTupleHeaderGetXmin(tup) \ ( \ HeapTupleHeaderXminFrozen(tup) ? \ FrozenTransactionId : HeapTupleHeaderGetRawXmin(tup) \ ) which afaict makes xmin_frozen = ((xid == FrozenTransactionId) || HeapTupleHeaderXminFrozen(tuple)); redundant. Looks like that was introduced relatively recently, in commit d2599ecfcc74fea9fad1720a70210a740c716730 Author: Alvaro Herrera <alvherre@alvh.no-ip.org> Date: 2018-05-04 15:24:44 -0300 Don't mark pages all-visible spuriously @@ -6814,6 +6815,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple, /* Process xmin */ xid = HeapTupleHeaderGetXmin(tuple); + xmin_frozen = ((xid == FrozenTransactionId) || + HeapTupleHeaderXminFrozen(tuple)); if (TransactionIdIsNormal(xid)) { if (TransactionIdPrecedes(xid, relfrozenxid)) @@ -6832,9 +6835,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple, frz->t_infomask |= HEAP_XMIN_FROZEN; changed = true; + xmin_frozen = true; } - else - totally_frozen = false; } Greetings, Andres Freund
So after thinking about this a bit more ... ISTM that what we have here is a race condition (ie, tuple changed state since heap_page_prune), and that ideally we want the code to resolve it as if no race had happened. That is, either of these behaviors would be acceptable: 1. Delete the tuple, just as heap_page_prune would've done if it had seen it DEAD. (Possibly we could implement that by jumping back and doing heap_page_prune again, but that seems pretty messy and bug-prone. In any case, if we're not doing index cleanup then this must reduce to "replace tuple by a dead line pointer", not remove it altogether.) 2. Act as if the tuple were still live, just as would've happened if the state didn't change till just after we looked instead of just before. Option #2 is a lot simpler and safer, and can be implemented as I suggested earlier, assuming we're all good with the assumption that heap_prepare_freeze_tuple isn't going to do anything bad. However ... it strikes me that there's yet another assumption in here that this patch has broken. Namely, notice that the reason we normally don't get here is that what heap_page_prune does with an already-DEAD tuple is reduce it to a dead line pointer and then count it in its return value, which gets added to tups_vacuumed. But then what lazy_scan_heap's per-tuple loop does is /* * DEAD item pointers are to be vacuumed normally; but we don't * count them in tups_vacuumed, else we'd be double-counting (at * least in the common case where heap_page_prune() just freed up * a non-HOT tuple). */ if (ItemIdIsDead(itemid)) { lazy_record_dead_tuple(vacrelstats, &(tuple.t_self)); all_visible = false; continue; } When this patch is active, it will *greatly* increase the odds that we report a misleading tups_vacuumed total, for two different reasons: * DEAD tuples reduced to dead line pointers during heap_page_prune will be counted as tups_vacuumed, even though we don't take the further step of removing the dead line pointer, as always happened before. * When, after some vacuum cycles with index_cleanup disabled, we finally do one with index_cleanup enabled, there are going to be a heck of a lot of old dead line pointers to clean out, which the existing logic won't count at all. That was only barely tolerable before, and it seems like this has pushed it over the bounds of silliness. People are going to be wondering why vacuum reports that it removed zillions of index entries and no tuples. I'm thinking that we really need to upgrade vacuum's reporting totals so that it accounts in some more-honest way for pre-existing dead line pointers. The patch as it stands has made the reporting even more confusing, rather than less so. BTW, the fact that dead line pointers will accumulate without limit makes me even more dubious of the proposition that this "feature" will be safe to enable as a reloption in production. I really think that we ought to restrict it to be a manual VACUUM option, to be used only when you're desperate to freeze old tuples. regards, tom lane
I wrote: > I'm thinking that we really need to upgrade vacuum's reporting totals > so that it accounts in some more-honest way for pre-existing dead > line pointers. The patch as it stands has made the reporting even more > confusing, rather than less so. Here's a couple of ideas about that: 1. Ignore heap_page_prune's activity altogether, on the grounds that it's just random chance that any cleanup done there was done during VACUUM and not some preceding DML operation. Make tups_vacuumed be the count of dead line pointers removed. The advantage of this way is that tups_vacuumed would become independent of previous non-VACUUM pruning activity, as well as preceding index-cleanup-disabled VACUUMs. But maybe it's hiding too much information. 2. Have heap_page_prune count, and add to tups_vacuumed, only HOT tuples that it deleted entirely. The action of replacing a DEAD root tuple with a dead line pointer doesn't count for anything. Then also add the count of dead line pointers removed to tups_vacuumed. Approach #2 is closer to the way we've defined tups_vacuumed up to now, but I think it'd be more realistic in cases where previous pruning or index-cleanup-disabled vacuums have left lots of dead line pointers. I'm not especially wedded to either of these --- any other ideas? regards, tom lane
On Wed, Apr 17, 2019 at 5:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I wrote: > > I'm thinking that we really need to upgrade vacuum's reporting totals > > so that it accounts in some more-honest way for pre-existing dead > > line pointers. The patch as it stands has made the reporting even more > > confusing, rather than less so. > > Here's a couple of ideas about that: > > 1. Ignore heap_page_prune's activity altogether, on the grounds that > it's just random chance that any cleanup done there was done during > VACUUM and not some preceding DML operation. Make tups_vacuumed > be the count of dead line pointers removed. The advantage of this > way is that tups_vacuumed would become independent of previous > non-VACUUM pruning activity, as well as preceding index-cleanup-disabled > VACUUMs. But maybe it's hiding too much information. > > 2. Have heap_page_prune count, and add to tups_vacuumed, only HOT > tuples that it deleted entirely. The action of replacing a DEAD > root tuple with a dead line pointer doesn't count for anything. > Then also add the count of dead line pointers removed to tups_vacuumed. > > Approach #2 is closer to the way we've defined tups_vacuumed up to > now, but I think it'd be more realistic in cases where previous > pruning or index-cleanup-disabled vacuums have left lots of dead > line pointers. On top of the approach #2, how about reporting the number of line pointers we left so that user can notice that there are many dead line pointers in the table? > > I'm not especially wedded to either of these --- any other ideas? Or, how about reporting the vacuumed tuples and line pointers we separately? It might be too detailed but since with this patch we delete either only tuples or only line pointers, it's more accurate. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Masahiko Sawada <sawada.mshk@gmail.com> writes: > On Wed, Apr 17, 2019 at 5:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I'm thinking that we really need to upgrade vacuum's reporting totals >>> so that it accounts in some more-honest way for pre-existing dead >>> line pointers. The patch as it stands has made the reporting even more >>> confusing, rather than less so. > Or, how about reporting the vacuumed tuples and line pointers we > separately? It might be too detailed but since with this patch we > delete either only tuples or only line pointers, it's more accurate. Yeah, if we wanted to expose these complications more directly, we could think about adding or changing the main counters. I was wondering about whether we should have it report "x bytes and y line pointers freed", rather than counting tuples per se. regards, tom lane
On Wed, Apr 17, 2019 at 10:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yeah, if we wanted to expose these complications more directly, we > could think about adding or changing the main counters. I was wondering > about whether we should have it report "x bytes and y line pointers > freed", rather than counting tuples per se. I like that idea, but I'm pretty sure that there are very few users that are aware of these distinctions at all. It would be a good idea to clearly document them. -- Peter Geoghegan
On Thu, Apr 18, 2019 at 4:20 AM Peter Geoghegan <pg@bowt.ie> wrote: > > On Wed, Apr 17, 2019 at 10:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Yeah, if we wanted to expose these complications more directly, we > > could think about adding or changing the main counters. I was wondering > > about whether we should have it report "x bytes and y line pointers > > freed", rather than counting tuples per se. > It looks good idea to me. > I like that idea, but I'm pretty sure that there are very few users > that are aware of these distinctions at all. It would be a good idea > to clearly document them. I completely agreed. I'm sure that only a few user can do the action of enabling index cleanup when the report says there are many dead line pointers in the table. It brought me an another idea of reporting something like "x bytes freed, y bytes can be freed after index cleanup". That is, we report how much bytes including tuples and line pointers we freed and how much bytes of line pointers can be freed after index cleanup. While index cleanup is enabled, the latter value should always be 0. If the latter value gets large user can be aware of necessity of doing index cleanup. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Apr 18, 2019 at 3:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Apr 18, 2019 at 4:20 AM Peter Geoghegan <pg@bowt.ie> wrote: > > > > On Wed, Apr 17, 2019 at 10:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > Yeah, if we wanted to expose these complications more directly, we > > > could think about adding or changing the main counters. I was wondering > > > about whether we should have it report "x bytes and y line pointers > > > freed", rather than counting tuples per se. > > > > It looks good idea to me. > > > I like that idea, but I'm pretty sure that there are very few users > > that are aware of these distinctions at all. It would be a good idea > > to clearly document them. > > I completely agreed. I'm sure that only a few user can do the action > of enabling index cleanup when the report says there are many dead > line pointers in the table. > > It brought me an another idea of reporting something like "x bytes > freed, y bytes can be freed after index cleanup". That is, we report > how much bytes including tuples and line pointers we freed and how > much bytes of line pointers can be freed after index cleanup. While > index cleanup is enabled, the latter value should always be 0. If the > latter value gets large user can be aware of necessity of doing index > cleanup. > Attached the draft version of patch based on the discussion so far. This patch fixes two issues: the assertion error topminnow reported and the contents of the vacuum logs. For the first issue, I've changed lazy_scan_heap so that it counts the tuples that became dead after HOT pruning in nkeep when index cleanup is disabled. As per discussions so far, it would be no problem to try to freeze tuples that ought to have been deleted. For the second issue, I've changed lazy vacuum so that it reports both the number of kilobytes we freed and the number of kilobytes can be freed after index cleanup. The former value includes the size of not only heap tuples but also line pointers. That is, when a normal tuple has been marked as either dead or redirected we count only the size of heap tuple, and when it has been marked as unused we count the size of both heap tuple and line pointer. Similarly when either a redirect line pointer or a dead line pointer become unused we count only the size of line pointer. The latter value we report could be non-zero only when index cleanup is disabled; it counts the number of bytes of dead line pointers we left. The advantage of this change is that user can be aware of both how many bytes we freed and how many bytes we left due to skipping index cleanup. User can be aware of the necessity of doing index cleanup by seeing the latter value. Also with this patch, we count only tuples that has been marked as unused as deleted tuples. The action of replacing a dead root tuple with a dead line pointer doesn't count for anything. It would be close to the meaning of "deleted tuples" and less confusion. We do that when actually marking rather than when recording because we could record and forget dead tuples. Here is the sample of the report by VACUUM VERBOSE. I used the following script and the vacuum report is changed by the patch. * Script DROP TABLE IF EXISTS test; CREATE TABLE test (c int primary key, d int); INSERT INTO test SELECT * FROM generate_series(1,10000); DELETE FROM test WHERE c < 2000; VACUUM (INDEX_CLEANUP FALSE, VERBOSE) test; VACUUM (INDEX_CLEANUP TRUE, VERBOSE) test; * HEAD (when index cleanup is disabled) INFO: vacuuming "public.test" INFO: "test": found 1999 removable, 8001 nonremovable row versions in 45 out of 45 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 504 There were 0 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. 0 tuples and 1999 item identifiers are left as dead. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM * Patched (when index cleanup is disabled) INFO: vacuuming "public.test" INFO: "test": 55 kB freed, 7996 bytes can be freed after index cleanup, table size: 360 kB DETAIL: found 8001 nonremovable row versions in 45 out of 45 pages. 0 dead row versions cannot be removed yet, oldest xmin: 1660 There were 0 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM * HEAD (when index cleanup is enabled) INFO: vacuuming "public.test" INFO: scanned index "test_pkey" to remove 1999 row versions DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s INFO: "test": removed 1999 row versions in 9 pages DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s INFO: index "test_pkey" now contains 8001 row versions in 30 pages DETAIL: 1999 index row versions were removed. 5 index pages have been deleted, 0 are currently reusable. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. INFO: "test": found 0 removable, 91 nonremovable row versions in 10 out of 45 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 504 There were 0 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. 0 tuples and 0 item identifiers are left as dead. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM * Patched (when index cleanup is enabled) INFO: vacuuming "public.test" INFO: scanned index "test_pkey" to remove 1999 row versions DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s INFO: "test": removed 1999 row versions in 9 pages DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s INFO: index "test_pkey" now contains 8001 row versions in 30 pages DETAIL: 1999 index row versions were removed. 5 index pages have been deleted, 0 are currently reusable. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. INFO: "test": 7996 bytes freed, table size: 360 kB DETAIL: found 91 nonremovable row versions in 10 out of 45 pages. 0 dead row versions cannot be removed yet, oldest xmin: 1660 There were 0 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM The patch doesn't address the concern that Tom had, which is it might not be safe in production to accumulate the dead line pointers without limit when the reloption is set to false. I think this is a separate issue from the above two issues so I'd like to discuss that after these are solved. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Tue, Apr 23, 2019 at 7:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > For the second issue, I've changed lazy vacuum so that it reports both > the number of kilobytes we freed and the number of kilobytes can be > freed after index cleanup. I am not very convinced that this reporting is in any useful to users. Despite N kilobytes of tuples having been freed, the pages themselves are still allocated and the actual ability to reuse that space may be dependent on lots of factors that the user can't control like the sizes of newly-inserted tuples and the degree to which the free space map is accurate. I feel like we're drifting off into inventing new kinds of reporting here instead of focusing on fixing the reported defects of the already-committed patch, but perhaps I am taking too narrow a view of the situation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Apr 26, 2019 at 12:11 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Apr 23, 2019 at 7:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > For the second issue, I've changed lazy vacuum so that it reports both > > the number of kilobytes we freed and the number of kilobytes can be > > freed after index cleanup. > > I am not very convinced that this reporting is in any useful to users. > Despite N kilobytes of tuples having been freed, the pages themselves > are still allocated and the actual ability to reuse that space may be > dependent on lots of factors that the user can't control like the > sizes of newly-inserted tuples and the degree to which the free space > map is accurate. Hmm, it's a term problem? The phrase 'x bytes vacuumed' would solve it? > > I feel like we're drifting off into inventing new kinds of reporting > here instead of focusing on fixing the reported defects of the > already-committed patch, but perhaps I am taking too narrow a view of > the situation. I should have divided the patches into two: fixing assertion error and the reporting. I think we could think the latter issue also is a kind of bug because it can report something like "1000 index tuples vacuumed but 0 heap tuple vacuumed" in case where the vacuumed table had only dead line pointers. Maybe I should add an another open item for the latter. Attached the split version patches. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
Hi, This thread is referenced an open item, and we ought to make some progress on it. On a more cosmetic note: On 2019-04-16 09:10:19 -0700, Andres Freund wrote: > On 2019-04-16 12:01:36 -0400, Tom Lane wrote: > > (BTW, I don't understand why that code will throw "found xmin %u from > > before relfrozenxid %u" if HeapTupleHeaderXminFrozen is true? Shouldn't > > the whole if-branch at lines 6113ff be skipped if xmin_frozen?) > > I *think* that just looks odd, but isn't actively wrong. That's because > TransactionIdIsNormal() won't trigger, as: > > #define HeapTupleHeaderGetXmin(tup) \ > ( \ > HeapTupleHeaderXminFrozen(tup) ? \ > FrozenTransactionId : HeapTupleHeaderGetRawXmin(tup) \ > ) > > which afaict makes > xmin_frozen = ((xid == FrozenTransactionId) || > HeapTupleHeaderXminFrozen(tuple)); > redundant. > > Looks like that was introduced relatively recently, in > > commit d2599ecfcc74fea9fad1720a70210a740c716730 > Author: Alvaro Herrera <alvherre@alvh.no-ip.org> > Date: 2018-05-04 15:24:44 -0300 > > Don't mark pages all-visible spuriously > > > @@ -6814,6 +6815,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple, > > /* Process xmin */ > xid = HeapTupleHeaderGetXmin(tuple); > + xmin_frozen = ((xid == FrozenTransactionId) || > + HeapTupleHeaderXminFrozen(tuple)); > if (TransactionIdIsNormal(xid)) > { > if (TransactionIdPrecedes(xid, relfrozenxid)) > @@ -6832,9 +6835,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple, > > frz->t_infomask |= HEAP_XMIN_FROZEN; > changed = true; > + xmin_frozen = true; > } > - else > - totally_frozen = false; > } Alvaro, could we perhaps clean this up a bit? This is pretty confusing looking. I think this probably could just be changed to bool xmin_frozen = false; xid = HeapTupleHeaderGetXmin(tuple); if (xid == FrozenTransactionId) xmin_frozen = true; else if (TransactionIdIsNormal(xid)) or somesuch. There's no need to check for HeapTupleHeaderXminFrozen(tuple) etc, because HeapTupleHeaderGetXmin() already does so - and if it didn't, the issue Tom points out would be problematic. Greetings, Andres Freund
Hi, On 2019-04-06 16:13:53 +0900, Michael Paquier wrote: > On Sat, Apr 06, 2019 at 11:31:31AM +0900, Masahiko Sawada wrote: > > Yes, but Fujii-san pointed out that this option doesn't support toast > > tables and I think there is not specific reason why not supporting > > them. So it might be good to add toast.vacuum_index_cleanup. Attached > > patch. > > Being able to control that option at toast level sounds sensible. I > have added an open item about that. Robert, what is your stance on this open item? It's been an open item for about three weeks, without any progress. Greetings, Andres Freund
On Wed, May 1, 2019 at 12:14 PM Andres Freund <andres@anarazel.de> wrote: > On 2019-04-06 16:13:53 +0900, Michael Paquier wrote: > > On Sat, Apr 06, 2019 at 11:31:31AM +0900, Masahiko Sawada wrote: > > > Yes, but Fujii-san pointed out that this option doesn't support toast > > > tables and I think there is not specific reason why not supporting > > > them. So it might be good to add toast.vacuum_index_cleanup. Attached > > > patch. > > > > Being able to control that option at toast level sounds sensible. I > > have added an open item about that. > > Robert, what is your stance on this open item? It's been an open item > for about three weeks, without any progress. The actual bug in this patch needs to be fixed, but I see we have another open item for that. This open item, as I understand it, is all about whether we should add another reloption so that you can control this behavior separately for TOAST tables. In my opinion, that's not a critical change and the open item should be dropped, but others might see it differently. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Apr 16, 2019 at 12:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > So after thinking about this a bit more ... > > ISTM that what we have here is a race condition (ie, tuple changed state > since heap_page_prune), and that ideally we want the code to resolve it > as if no race had happened. That is, either of these behaviors would > be acceptable: > > 1. Delete the tuple, just as heap_page_prune would've done if it had seen > it DEAD. (Possibly we could implement that by jumping back and doing > heap_page_prune again, but that seems pretty messy and bug-prone. > In any case, if we're not doing index cleanup then this must reduce to > "replace tuple by a dead line pointer", not remove it altogether.) > > 2. Act as if the tuple were still live, just as would've happened if the > state didn't change till just after we looked instead of just before. > > Option #2 is a lot simpler and safer, and can be implemented as I > suggested earlier, assuming we're all good with the assumption that > heap_prepare_freeze_tuple isn't going to do anything bad. After studying this more carefully, I agree. You and Andres and Alvaro are all correct, and I'm plain wrong. Thanks for explaining. I have committed a patch that changes the logic as per your suggestion, and also removes nleft_dead_tuples and nleft_dead_itemids. I'll reply separately to your other point about tups_vacuumed reporting. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, On 2019-05-02 10:20:25 -0400, Robert Haas wrote: > On Tue, Apr 16, 2019 at 12:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > So after thinking about this a bit more ... > > 2. Act as if the tuple were still live, just as would've happened if the > > state didn't change till just after we looked instead of just before. > > > > Option #2 is a lot simpler and safer, and can be implemented as I > > suggested earlier, assuming we're all good with the assumption that > > heap_prepare_freeze_tuple isn't going to do anything bad. > > After studying this more carefully, I agree. You and Andres and > Alvaro are all correct, and I'm plain wrong. Thanks for explaining. > I have committed a patch that changes the logic as per your > suggestion, and also removes nleft_dead_tuples and nleft_dead_itemids. It'd be good if somebody could make a pass over the safety mechanisms in heap_prepare_freeze_tuple(). I added them at some point, after a data corrupting bug related to freezing, but they really were more intended as a secondary layer of defense, rather than the primary one. My understanding is still that we assume that we never should reach heap_prepare_freeze_tuple() for something that is below the horizon, even after this change, right? Greetings, Andres Freund
On Tue, Apr 16, 2019 at 4:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: > > I'm thinking that we really need to upgrade vacuum's reporting totals > > so that it accounts in some more-honest way for pre-existing dead > > line pointers. The patch as it stands has made the reporting even more > > confusing, rather than less so. > > Here's a couple of ideas about that: > > 1. Ignore heap_page_prune's activity altogether, on the grounds that > it's just random chance that any cleanup done there was done during > VACUUM and not some preceding DML operation. Make tups_vacuumed > be the count of dead line pointers removed. The advantage of this > way is that tups_vacuumed would become independent of previous > non-VACUUM pruning activity, as well as preceding index-cleanup-disabled > VACUUMs. But maybe it's hiding too much information. > > 2. Have heap_page_prune count, and add to tups_vacuumed, only HOT > tuples that it deleted entirely. The action of replacing a DEAD > root tuple with a dead line pointer doesn't count for anything. > Then also add the count of dead line pointers removed to tups_vacuumed. > > Approach #2 is closer to the way we've defined tups_vacuumed up to > now, but I think it'd be more realistic in cases where previous > pruning or index-cleanup-disabled vacuums have left lots of dead > line pointers. > > I'm not especially wedded to either of these --- any other ideas? I think it's almost impossible to have clear reporting here with only a single counter. There are two clearly-distinct cleanup operations going on here: (1) removing tuples from pages, and (2) making dead line pointers unused so that they can be reused for new tuples. They happen in equal quantity when there are no HOT updates: pruning makes dead tuples into dead line pointers, and then index vacuuming allows the dead line pointers to be set unused. But if there are HOT updates, intermediate tuples in each HOT chain are removed from the page but the line pointers are directly set to unused, so VACUUM could remove a lot of tuples but not need to make very many dead line pointers unused. On the other hand, the opposite could also happen: maybe lots of single-page HOT-pruning has happened prior to VACUUM, so VACUUM has lots of dead line pointers to make unused but removes very few tuples because that's already been done. For the most part, tups_vacuumed seems to be intending to count #1 rather than #2. While the comments for heap_page_prune and heap_prune_chain are not as clear as might be desirable, it appears to me that those functions are counting tuples removed from a page, ignoring everything that might happen to line pointers -- so using the return value of this function is consistent with wanting to count #1. However, there's one place that seems slightly unclear about this, namely this comment: /* * DEAD item pointers are to be vacuumed normally; but we don't * count them in tups_vacuumed, else we'd be double-counting (at * least in the common case where heap_page_prune() just freed up * a non-HOT tuple). */ If we're counting tuples removed from pages, then it's not merely that we would be double-counting, but that we would be counting completely the wrong thing. However, as far as I can see, that's just an issue with the comments; the code is in all cases counting tuple removals, not dead line pointers marked unused. If I understand correctly, your first proposal amounts to redefining tups_vacuumed to count #2 rather than #1, and your second proposal amounts to making tups_vacuumed count #1 + #2 rather than #1. I suggest a third option: have two counters. tups_vacuum can continue to count #1, with just a comment adjustment. And then we can add a second counter which is incremented every time lazy_vacuum_page does ItemIdSetUnused, which will count #2. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2019-05-02 11:09:10 -0400, Robert Haas wrote: > If I understand correctly, your first proposal amounts to redefining > tups_vacuumed to count #2 rather than #1, and your second proposal > amounts to making tups_vacuumed count #1 + #2 rather than #1. I > suggest a third option: have two counters. +1
On Thu, May 2, 2019 at 10:28 AM Andres Freund <andres@anarazel.de> wrote: > It'd be good if somebody could make a pass over the safety mechanisms in > heap_prepare_freeze_tuple(). I added them at some point, after a data > corrupting bug related to freezing, but they really were more intended > as a secondary layer of defense, rather than the primary one. My > understanding is still that we assume that we never should reach > heap_prepare_freeze_tuple() for something that is below the horizon, > even after this change, right? I think so. This code is hit if the tuple wasn't dead yet at the time that heap_page_prune() decided not to prune it, but it is dead by the time we retest the tuple status. But the same value of OldestXmin was used for both checks, so the only way that can really happen is if the XID in the tuple header was running before and is no longer running. However, if the XID was running at the time that heap_page_prune() ran, than OldestXmin certainly can't be newer than that XID. And therefore the value we're intending to set for relfrozenxid has surely got to be older, so we could hardly prune using OldestXmin as the threshold and then relfrozenxid to a newer XID. Actually, I now believe that my original concern here was exactly backwards. Prior to the logic change in this commit, with index vacuum disabled, a tuple that becomes dead between the heap_page_prune() check and the lazy_scan_heap() check would have followed the tupgone = true path. That would cause it to be treated as if the second vacuum pass were going to remove it - i.e. not frozen. But with index cleanup disabled, there will never be a second vacuum pass. So a tuple that was actually being kept was not sent to heap_prepare_freeze_tuple() with, basically, no justification. Imagine for example that the tuple was not old in terms of its XID age, but its MXID age was somehow ancient. Then we'd fail to freeze it on the theory that it was going to be removed, but yet not remove it. Oops. The revised logic - which is as per Tom's suggestion - does the right thing, which is treat the tuple as one we've chosen to keep. While looking at this code, I think I may have spotted another bug, or at least a near-bug. lazy_record_dead_tuple() thinks it's OK to just forget about dead tuples if there's not enough memory, which I think is OK in the normal case where the dead tuple has been truncated to a dead line pointer. But in the tupgone = true case it's NOT ok, because in that case we're leaving behind not just a dead line pointer but an actual tuple which we have declined to freeze on the assumption that it will be removed later. But if lazy_record_dead_tuple() forgets about it, then it won't be removed later. That could lead to tuples remaining that precede the freeze horizons. The reason why this may be only a near-bug is that we seem to try pretty hard to make sure that we'll never call that function in the first place without enough space being available. Still, it seems to me that it would be safer to change the code to look like: if (vacrelstats->num_dead_tuples >= vacrelstats->max_dead_tuples) elog(ERROR, "oh crap"); -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2019-May-01, Andres Freund wrote: > Alvaro, could we perhaps clean this up a bit? This is pretty confusing > looking. I think this probably could just be changed to > > bool xmin_frozen = false; > > xid = HeapTupleHeaderGetXmin(tuple); > > if (xid == FrozenTransactionId) > xmin_frozen = true; > else if (TransactionIdIsNormal(xid)) > > or somesuch. There's no need to check for > HeapTupleHeaderXminFrozen(tuple) etc, because HeapTupleHeaderGetXmin() > already does so - and if it didn't, the issue Tom points out would be > problematic. Ah, yeah, that's simpler. I would like to introduce a couple of very minor changes to the proposed style, per the attached. * don't initialize xmin_frozen at all; rather, only set its value to the correct one when we have determined what it is. Doing premature initialization is what led to some of those old bugs, so I prefer not to do it. * Handle the BootstrapXid and InvalidXid cases explicitly, by setting xmin_frozen to true when xmin is not normal. After all, those XID values do not need any freezing. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
Pushed. I added one comment to explain xmin_frozen also, which otherwise seemed a bit mysterious. I did not backpatch, though, so 9.6-11 are a bit different, but I'm not sure it's a good idea at this point, though it should be pretty innocuous. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, May 3, 2019 at 12:09 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Apr 16, 2019 at 4:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I wrote: > > > I'm thinking that we really need to upgrade vacuum's reporting totals > > > so that it accounts in some more-honest way for pre-existing dead > > > line pointers. The patch as it stands has made the reporting even more > > > confusing, rather than less so. > > > > Here's a couple of ideas about that: > > > > 1. Ignore heap_page_prune's activity altogether, on the grounds that > > it's just random chance that any cleanup done there was done during > > VACUUM and not some preceding DML operation. Make tups_vacuumed > > be the count of dead line pointers removed. The advantage of this > > way is that tups_vacuumed would become independent of previous > > non-VACUUM pruning activity, as well as preceding index-cleanup-disabled > > VACUUMs. But maybe it's hiding too much information. > > > > 2. Have heap_page_prune count, and add to tups_vacuumed, only HOT > > tuples that it deleted entirely. The action of replacing a DEAD > > root tuple with a dead line pointer doesn't count for anything. > > Then also add the count of dead line pointers removed to tups_vacuumed. > > > > Approach #2 is closer to the way we've defined tups_vacuumed up to > > now, but I think it'd be more realistic in cases where previous > > pruning or index-cleanup-disabled vacuums have left lots of dead > > line pointers. > > > > I'm not especially wedded to either of these --- any other ideas? > > I think it's almost impossible to have clear reporting here with only > a single counter. There are two clearly-distinct cleanup operations > going on here: (1) removing tuples from pages, and (2) making dead > line pointers unused so that they can be reused for new tuples. They > happen in equal quantity when there are no HOT updates: pruning makes > dead tuples into dead line pointers, and then index vacuuming allows > the dead line pointers to be set unused. But if there are HOT > updates, intermediate tuples in each HOT chain are removed from the > page but the line pointers are directly set to unused, so VACUUM could > remove a lot of tuples but not need to make very many dead line > pointers unused. On the other hand, the opposite could also happen: > maybe lots of single-page HOT-pruning has happened prior to VACUUM, so > VACUUM has lots of dead line pointers to make unused but removes very > few tuples because that's already been done. > > For the most part, tups_vacuumed seems to be intending to count #1 > rather than #2. While the comments for heap_page_prune and > heap_prune_chain are not as clear as might be desirable, it appears to > me that those functions are counting tuples removed from a page, > ignoring everything that might happen to line pointers -- so using the > return value of this function is consistent with wanting to count #1. > However, there's one place that seems slightly unclear about this, > namely this comment: > > /* > * DEAD item pointers are to be vacuumed normally; but we don't > * count them in tups_vacuumed, else we'd be double-counting (at > * least in the common case where heap_page_prune() just freed up > * a non-HOT tuple). > */ > > If we're counting tuples removed from pages, then it's not merely that > we would be double-counting, but that we would be counting completely > the wrong thing. However, as far as I can see, that's just an issue > with the comments; the code is in all cases counting tuple removals, > not dead line pointers marked unused. > > If I understand correctly, your first proposal amounts to redefining > tups_vacuumed to count #2 rather than #1, and your second proposal > amounts to making tups_vacuumed count #1 + #2 rather than #1. I > suggest a third option: have two counters. tups_vacuum can continue > to count #1, with just a comment adjustment. And then we can add a > second counter which is incremented every time lazy_vacuum_page does > ItemIdSetUnused, which will count #2. > > Thoughts? I agree to have an another counter. Please note that non-HOT-updated tuples that became dead after hot pruning (that is 'tupgone' tuples) are changed to unused directly in lazy_page_vacuum. Therefore we would need not to increment tups_vacuumed in tupgone case if we increment the new counter in lazy_page_vacuum. For the contents of vacuum verbose report, it could be worth to discuss whether reporting the number of deleted line pointers would be helpful for users. The reporting the number of line pointers we left might be more helpful for users. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, May 2, 2019 at 1:39 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, May 1, 2019 at 12:14 PM Andres Freund <andres@anarazel.de> wrote: > > On 2019-04-06 16:13:53 +0900, Michael Paquier wrote: > > > On Sat, Apr 06, 2019 at 11:31:31AM +0900, Masahiko Sawada wrote: > > > > Yes, but Fujii-san pointed out that this option doesn't support toast > > > > tables and I think there is not specific reason why not supporting > > > > them. So it might be good to add toast.vacuum_index_cleanup. Attached > > > > patch. > > > > > > Being able to control that option at toast level sounds sensible. I > > > have added an open item about that. > > > > Robert, what is your stance on this open item? It's been an open item > > for about three weeks, without any progress. > > The actual bug in this patch needs to be fixed, but I see we have > another open item for that. This open item, as I understand it, is > all about whether we should add another reloption so that you can > control this behavior separately for TOAST tables. In my opinion, > that's not a critical change and the open item should be dropped, but > others might see it differently. I agree that this item is neither critical and bug. But this is an (my) oversight and is a small patch and I think there is no specific reason why we don't dare to include this in 12. So if this patch could get reviewed enough I think we can have it in 12. Since the previous patch conflicts with current HEAD I've attached the rebased version patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
Hi, On 2019-05-09 14:14:20 +0900, Masahiko Sawada wrote: > I agree that this item is neither critical and bug. But this is an > (my) oversight and is a small patch and I think there is no specific > reason why we don't dare to include this in 12. So if this patch could > get reviewed enough I think we can have it in 12. Since the previous > patch conflicts with current HEAD I've attached the rebased version > patch. Robert, this indeed looks near trivial. What do you think? > diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml > index 44a61ef..1e1b0e8 100644 > --- a/doc/src/sgml/ref/create_table.sgml > +++ b/doc/src/sgml/ref/create_table.sgml > @@ -1406,7 +1406,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM > </varlistentry> > > <varlistentry id="reloption-vacuum-index-cleanup" xreflabel="vacuum_index_cleanup"> > - <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>) > + <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>boolean</type>) > <indexterm> > <primary><varname>vacuum_index_cleanup</varname> storage parameter</primary> > </indexterm> > diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c > index cfbabb5..022b3a0 100644 > --- a/src/backend/access/common/reloptions.c > +++ b/src/backend/access/common/reloptions.c > @@ -147,7 +147,7 @@ static relopt_bool boolRelOpts[] = > { > "vacuum_index_cleanup", > "Enables index vacuuming and index cleanup", > - RELOPT_KIND_HEAP, > + RELOPT_KIND_HEAP | RELOPT_KIND_TOAST, > ShareUpdateExclusiveLock > }, > true > diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c > index e4c03de..2379b3d 100644 > --- a/src/bin/psql/tab-complete.c > +++ b/src/bin/psql/tab-complete.c > @@ -1056,6 +1056,7 @@ static const char *const table_storage_parameters[] = { > "toast.autovacuum_vacuum_scale_factor", > "toast.autovacuum_vacuum_threshold", > "toast.log_autovacuum_min_duration", > + "toast.vacuum_index_clenaup", > "toast.vacuum_truncate", > "toast_tuple_target", > "user_catalog_table", typo. Sawada-san, it'd be good if you could add at least some minimal tests in the style of the no_index_cleanup test in vacuum.sql. - Andres
Andres Freund <andres@anarazel.de> writes: > Robert, this indeed looks near trivial. What do you think? > On 2019-05-09 14:14:20 +0900, Masahiko Sawada wrote: >> + "toast.vacuum_index_clenaup", Not trivial enough to not have typos, apparently. regards, tom lane
On Sat, May 18, 2019 at 5:55 AM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2019-05-09 14:14:20 +0900, Masahiko Sawada wrote: > > I agree that this item is neither critical and bug. But this is an > > (my) oversight and is a small patch and I think there is no specific > > reason why we don't dare to include this in 12. So if this patch could > > get reviewed enough I think we can have it in 12. Since the previous > > patch conflicts with current HEAD I've attached the rebased version > > patch. > > Robert, this indeed looks near trivial. What do you think? > > > diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml > > index 44a61ef..1e1b0e8 100644 > > --- a/doc/src/sgml/ref/create_table.sgml > > +++ b/doc/src/sgml/ref/create_table.sgml > > @@ -1406,7 +1406,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM > > </varlistentry> > > > > <varlistentry id="reloption-vacuum-index-cleanup" xreflabel="vacuum_index_cleanup"> > > - <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>) > > + <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>boolean</type>) > > <indexterm> > > <primary><varname>vacuum_index_cleanup</varname> storage parameter</primary> > > </indexterm> > > diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c > > index cfbabb5..022b3a0 100644 > > --- a/src/backend/access/common/reloptions.c > > +++ b/src/backend/access/common/reloptions.c > > @@ -147,7 +147,7 @@ static relopt_bool boolRelOpts[] = > > { > > "vacuum_index_cleanup", > > "Enables index vacuuming and index cleanup", > > - RELOPT_KIND_HEAP, > > + RELOPT_KIND_HEAP | RELOPT_KIND_TOAST, > > ShareUpdateExclusiveLock > > }, > > true > > diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c > > index e4c03de..2379b3d 100644 > > --- a/src/bin/psql/tab-complete.c > > +++ b/src/bin/psql/tab-complete.c > > @@ -1056,6 +1056,7 @@ static const char *const table_storage_parameters[] = { > > "toast.autovacuum_vacuum_scale_factor", > > "toast.autovacuum_vacuum_threshold", > > "toast.log_autovacuum_min_duration", > > + "toast.vacuum_index_clenaup", > > "toast.vacuum_truncate", > > "toast_tuple_target", > > "user_catalog_table", > > typo. > > Sawada-san, it'd be good if you could add at least some minimal tests in > the style of the no_index_cleanup test in vacuum.sql. > Thank you for comments. Attached updated version patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
On Fri, May 17, 2019 at 4:55 PM Andres Freund <andres@anarazel.de> wrote: > Robert, this indeed looks near trivial. What do you think? Hmm, yeah, I guess that'd be OK. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, May 20, 2019 at 11:43:19AM +0900, Masahiko Sawada wrote: > Thank you for comments. Attached updated version patch. This is an open item present for quite some time, so I have looked at the patch. The base patch is fine. +INSERT INTO no_index_cleanup(i, t) VALUES(1, repeat('1234567890',30000)); Do we really need a string as long as that? Is actually the existing set of tests that helpful? We now have only two VACUUM queries which run on no_index_cleanup, both of them using FULL so the reloption as well as the value of INDEX_CLEANUP are ignored. Wouldn't it be better to redesign a bit the tests with more combinations of options like that? -- Toast inherit the value from the table ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false); -- Value directly defined for both ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false, toast.vacuum_index_cleanup = true); ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true, toast.vacuum_index_cleanup = false); It seems to me that we'd want tests to make sure that indexes are actually cleaned up, where pageinspect could prove to be useful. -- Michael
Вложения
On Tue, Jun 18, 2019 at 10:39 PM Michael Paquier <michael@paquier.xyz> wrote: > +INSERT INTO no_index_cleanup(i, t) VALUES(1, repeat('1234567890',30000)); > Do we really need a string as long as that? Specifying EXTERNAL storage might make things easier. I have used PLAIN storage to test the 1/3 of a page restriction within nbtree, and to test a bug in amcheck that was related to TOAST compression. > It seems to me that we'd want tests to make sure that indexes are > actually cleaned up, where pageinspect could prove to be useful. That definitely seems preferable, but it'll be a bit tricky to do it in a way that doesn't run into buildfarm issues due to alignment. I suggest an index on a text column to avoid problems. -- Peter Geoghegan
On Wed, Jun 19, 2019 at 12:51:41PM -0700, Peter Geoghegan wrote: > On Tue, Jun 18, 2019 at 10:39 PM Michael Paquier <michael@paquier.xyz> wrote: >> +INSERT INTO no_index_cleanup(i, t) VALUES(1, repeat('1234567890',30000)); >> Do we really need a string as long as that? > > Specifying EXTERNAL storage might make things easier. I have used > PLAIN storage to test the 1/3 of a page restriction within nbtree, and > to test a bug in amcheck that was related to TOAST compression. Ah, good point here. That makes sense. >> It seems to me that we'd want tests to make sure that indexes are >> actually cleaned up, where pageinspect could prove to be useful. > > That definitely seems preferable, but it'll be a bit tricky to do it > in a way that doesn't run into buildfarm issues due to alignment. I > suggest an index on a text column to avoid problems. I am not completely sure how tricky that may be, so I'll believe you on this one :) So, to keep things simple and if we want to get something into v12, I would suggest to just stress more combinations even if the changes are not entirely visible yet. If we get a couple of queries to run with the option disabled on the table, its toast or both by truncating and filling in the table in-between, we may be able to catch some issues by stressing those code paths. And I finish with the attached. Thoughts? -- Michael
Вложения
On Thu, Jun 20, 2019 at 03:50:32PM +0900, Michael Paquier wrote: > And I finish with the attached. Thoughts? So, are there any objections with this patch? Or do people think that it's too late for v12 and that it is better to wait until v13 opens for business? -- Michael
Вложения
On Sun, Jun 23, 2019 at 10:29:25PM +0900, Michael Paquier wrote: > So, are there any objections with this patch? Or do people think that > it's too late for v12 and that it is better to wait until v13 opens > for business? Committed, and open item closed. -- Michael