Discussion: Setting visibility map in VACUUM's second phase
Hi All,

I briefly mentioned this idea in one of the other threads, but I'm starting a new thread to highlight the point.

Today, we set the visibility map *only* in the first phase of vacuum. This works when the page has no dead tuples. But when vacuum itself is removing one or more dead tuples from the page, the visibility map update must wait for the next vacuum to come along. If the page receives no other action between this and the next vacuum, the VM bit will get set next time, but not before the next vacuum reads and rescans the page.

To explain this further, please see the test below on current master. I ran pgbench for a few seconds and then vacuumed the pgbench_accounts table (stripping out non-interesting output):

INFO: "pgbench_accounts": found 4709 removable, 1000000 nonremovable row versions in 16475 out of 16475 pages

So VACUUM removed all the dead tuples from 16475 pages. One would expect the next VACUUM to have nothing to do, because the just-vacuumed table is not being accessed at all. But if I run vacuum again, we get this:

INFO: "pgbench_accounts": found 0 removable, 1000000 nonremovable row versions in 16475 out of 16475 pages

It's only the third vacuum on the table that skips all the heap pages, because they are all now marked as all-visible in the VM:

INFO: "pgbench_accounts": found 0 removable, 0 nonremovable row versions in 0 out of 16475 pages

If I repeat the same test with the attached patch, the second vacuum itself skips all the heap pages, because the VM is up-to-date at the end of the first vacuum. Please see the output from the first and second vacuum with the patch applied:

INFO: "pgbench_accounts": found 4163 removable, 1000000 nonremovable row versions in 16465 out of 16465 pages
INFO: "pgbench_accounts": found 0 removable, 0 nonremovable row versions in 0 out of 16465 pages

So the idea that the patch implements is this.
When we scan pages in the first phase of vacuum, if we find a page that has all-visible tuples but also one or more dead tuples that we know the second phase of vacuum will remove, we mark such a page with a special flag called PD_ALL_VISIBLE_OR_DEAD (I'm not married to the name, so feel free to suggest a better one). What this flag tells the second phase of vacuum is to mark the page all-visible and set the VM bit, unless some intermediate action on the page again inserts a not-all-visible tuple. If such an action happens, the PD_ALL_VISIBLE_OR_DEAD flag will be cleared by that backend, and the second phase of vacuum will not set the all-visible flag and VM bit.

The patch itself is quite small and works as intended. One thing that demanded special handling is the fact that visibilitymap_set() requires a cutoff xid to tell Hot Standby to resolve conflicts appropriately. We could have scanned the page again during the second phase to compute the cutoff xid, but I thought we could overload the pd_prune_xid field in the page header to store this information, which is already computed in the first phase. We don't need this field when a page is in the PD_ALL_VISIBLE_OR_DEAD state, because there is nothing to prune on the page in that state. I added a new page header flag to tell whether the XID stored in pd_prune_xid is a prune XID or a cutoff XID, but maybe it's not required: if the page has the PD_ALL_VISIBLE_OR_DEAD flag set, then we can assume that the XID is a cutoff XID; otherwise it's a prune XID. So PageIsPrunable() will need to consult pd_prune_xid only if PD_ALL_VISIBLE_OR_DEAD is clear.

AFAICS pgbench itself may not be the most appropriate benchmark to test this feature, because between successive vacuums of the large pgbench_accounts table, I think almost every page in that table will have dead tuples and need vacuum's attention.
But I think this will be a very powerful enhancement for cases where a user does a lot of UPDATE/DELETE operations on a large table and then calls VACUUM to remove all the dead tuples. On current master, the visibility map will remain unset (as shown by the vacuum output above), so index-only scans won't work until the table is vacuumed one more time. Another downside on current master is that the next vacuum will once again read all those heap blocks into shared buffers and scan them over again. This unnecessary IO activity can be avoided with this idea. I don't have any reasonable hardware to test this though, so any help is highly appreciated.

Comments?

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
Attachments
Pavan Deolasee <pavan.deolasee@gmail.com> writes:
> So the idea that the patch implements is this. When we scan pages in
> the first phase of vacuum, if we find a page that has all-visible
> tuples but also has one or more dead tuples that we know the second
> phase of vacuum will remove, we mark such page with a special flag
> called PD_ALL_VISIBLE_OR_DEAD (I'm not married to the name, so feel
> free to suggest a better name). What this flag tells the second phase
> of vacuum is to mark the page all-visible and set the VM bit unless
> some intermediate action on the page again inserts a not-all-visible
> tuple. If such an action happens, the PD_ALL_VISIBLE_OR_DEAD flag will
> be cleared by that backend and the second phase of vacuum will not set
> all-visible flag and VM bit.

This seems overly complicated, as well as a poor use of a precious page header flag bit. Why not just recheck all-visibility status after performing the deletions? Keep in mind that even if there were some not-all-visible tuples when we were on the page the first time, they might have become all-visible by the time we return. So this is going out of its way to produce a less-than-optimal result.

> The patch itself is quite small and works as intended. One thing that
> demanded special handling is the fact that visibilitymap_set()
> requires a cutoff xid to tell the Hot Standby to resolve conflicts
> appropriately. We could have scanned the page again during the second
> phase to compute the cutoff xid, but I thought we can overload the
> pd_prune_xid field in the page header to store this information which
> is already computed in the first phase.

And this is *far* too cute. The use of that field is complicated enough already without having its meaning vary depending on randomly-designed flag bits. And it's by no means obvious that using a by-now-old value when we return to the page is a good idea anyway (or even correct).
I think taking a second whack at setting the visibility bit is a fine idea, but let's drop all the rest of this premature optimization. regards, tom lane
On Thu, Dec 6, 2012 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think taking a second whack at setting the visibility bit is a fine
> idea, but let's drop all the rest of this premature optimization.

+1. If there's any optimization needed here, we should try to do it by remembering relevant details from the first vacuum pass in backend-private memory, rather than by changing the on-disk format.

One other thought: I'm wondering if we shouldn't try to push the work of setting the all-visible bit into heap_page_prune(). That would allow HOT pruning to set the bit. For example, consider an all-visible page. A tuple is HOT-updated and the page becomes not-all-visible. Now the page is pruned, removing the old tuple and changing the line pointer to a redirect. Presto, the page is all-visible again.

Also, it seems to me that heap_page_prune() is already figuring out most of the stuff we need to know in order to decide whether to set PD_ALL_VISIBLE. The logic looks quite different from what is happening in the vacuum code, because the vacuum code iterates over all of the line pointers while the pruning code is walking HOT chains. But it seems to me that a page can't be all-visible unless there are no dead line pointers and no HOT chains of length != 1, and heap_prune_chain() does manage to call HeapTupleSatisfiesVacuum() for every tuple, so the raw information seems to be available without any additional CLOG lookups.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> One other thought: I'm wondering if we shouldn't try to push the work
> of setting the all-visible bit into heap_page_prune().

Hm, maybe ...

> But it seems to me that a page can't be all-visible unless there are
> no dead line pointers and no HOT chains of length != 1, and
> heap_prune_chain() does manage to call HeapTupleSatisfiesVacuum() for
> every tuple, so the raw information seems like it is available without
> any additional CLOG lookups.

HeapTupleSatisfiesVacuum is interested in whether a dead tuple is dead to everybody, but I don't think it figures out whether a live tuple is live to everybody. On the assumption that most tuples are live, adding the latter calculation might represent significant expense.

regards, tom lane
On Thu, Dec 6, 2012 at 11:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think taking a second whack at setting the visibility bit is a fine
> idea, but let's drop all the rest of this premature optimization.

Fair enough. I thought about doing it that way but was worried that an additional page scan would raise eyebrows. It does not affect the common case, because we would have done that scan anyway in the subsequent vacuum; but in the worst case, where most of the pages do not remain all-visible by the time we come back to the second phase of vacuum, this additional line pointer scan will add some overhead. With a couple of pluses for the approach, I won't mind doing it this way though.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
On Fri, Dec 7, 2012 at 12:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Dec 6, 2012 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think taking a second whack at setting the visibility bit is a fine
>> idea, but let's drop all the rest of this premature optimization.
>
> +1.
>
> If there's any optimization needed here, we should try to do it by
> remembering relevant details from the first vacuum pass in
> backend-private memory, rather than by changing the on-disk format.

Yeah, I talked about that approach on the other thread. I thought we could store the page LSN in backend-private memory for all such pages and compare it with the current page LSN to know whether the page saw an intermediate action that demands a recheck of all-visibility. But I agree to set these aggressive optimizations aside for now and revisit them if necessary.

> One other thought: I'm wondering if we shouldn't try to push the work
> of setting the all-visible bit into heap_page_prune(). That would
> allow HOT pruning to set the bit. For example, consider an
> all-visible page. A tuple is HOT-updated and the page becomes
> not-all-visible. Now the page is pruned, removing the old tuple and
> changing the line pointer to a redirect. Presto, page is all-visible
> again.

+1. I had submitted a patch for that way back in 2008 or 2009, but blame me for not pursuing it to the end.

http://archives.postgresql.org/message-id/2e78013d0812042257x175e5a45w5edeaff14f7249ac@mail.gmail.com

Alex Hunsaker had reviewed that patch and confirmed a significant improvement in the vacuum time. I think the patch needed some rework, but I dropped the ball or got busy with other things while waiting on others. If you think it's useful to have, I will do the necessary work and submit it for the next commitfest.
Excerpts from Alex's comments: "The only major difference was with this patch vacuum time (after the first select after some hot updates) was significantly reduced for my test case (366ms vs 16494ms). There was no noticeable (within noise) select or update slow down. I was able to trigger WARNING: PD_ALL_VISIBLE flag once while running pgbench but have not be able to re-create it... (should I keep trying?)" Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee
On Friday, December 07, 2012 12:06 AM Robert Haas wrote:
> On Thu, Dec 6, 2012 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I think taking a second whack at setting the visibility bit is a fine
> > idea, but let's drop all the rest of this premature optimization.
>
> +1.
>
> If there's any optimization needed here, we should try to do it by
> remembering relevant details from the first vacuum pass in
> backend-private memory, rather than by changing the on-disk format.
>
> One other thought: I'm wondering if we shouldn't try to push the work
> of setting the all-visible bit into heap_page_prune(). That would
> allow HOT pruning to set the bit. For example, consider an
> all-visible page. A tuple is HOT-updated and the page becomes
> not-all-visible. Now the page is pruned, removing the old tuple and
> changing the line pointer to a redirect. Presto, page is all-visible
> again.

I think it can reduce some of vacuum's load as well; the only concern is that it should not make page pruning too heavy. By the way, isn't this idea similar to the patch at the link below?

http://archives.postgresql.org/pgsql-hackers/2010-02/msg02344.php

With Regards,
Amit Kapila.
On Fri, Dec 7, 2012 at 9:21 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> On Thu, Dec 6, 2012 at 11:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think taking a second whack at setting the visibility bit is a fine
>> idea, but let's drop all the rest of this premature optimization.
>
> Fair enough. I thought about doing it that way but was worried that an
> additional page scan will raise eyebrows. While it does not affect the
> common case because we would have done that scan anyways in the
> subsequent vacuum, but in the worst case where most of the pages not
> remain all-visible by the time we come back to the second phase of
> vacuum, this additional line pointer scan will add some overhead. With
> couple of pluses for the approach, I won't mind doing it this way
> though.

Revised patch attached. This time it is much less invasive. I added a new function heap_page_is_all_visible() to check if the given page is all-visible and also return the visibility cutoff xid. A couple of things:

- We use the same OldestXmin computed in the first phase for the visibility checks. We could potentially recompute this at the start of the second phase, but I wasn't convinced that's needed or even correct.
- It would be a good idea to consolidate the page scanning in a single place to avoid code duplication, but maybe we will do that some other day.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
Attachments
On Friday, December 7, 2012, Pavan Deolasee wrote:
Revised patch attached. This time much less invasive. I added a new
function heap_page_is_all_visible() to check if the given page is
all-visible and also return the visibility cutoff xid.
Hi Pavan,
lazy_vacuum_page now pins and unpins the vmbuffer for each page it marks all-visible, which seems like a lot of needless traffic since the next vmbuffer is likely to be the same as the previous one.
Could it instead get vmbuffer passed down to it from the two calling sites? (One call site would have to have it introduced, just for this purpose, the other should be able to use its existing one.)
Cheers,
Jeff
Hi Jeff,

On Thu, Jan 24, 2013 at 2:41 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> lazy_vacuum_page now pins and unpins the vmbuffer for each page it marks
> all-visible, which seems like a lot of needless traffic since the next
> vmbuffer is likely to be the same as the previous one.

Good idea. Even though the cost of pinning/unpinning may not be high relative to the vacuum cost itself, it seems worth doing because we already do that in other places. Do you have any other review comments on the patch, or shall I fix this and send an updated patch?

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
On Thu, Jan 24, 2013 at 1:28 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Hi Jeff,
>
> On Thu, Jan 24, 2013 at 2:41 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> lazy_vacuum_page now pins and unpins the vmbuffer for each page it marks
>> all-visible, which seems like a lot of needless traffic since the next
>> vmbuffer is likely to be the same as the previous one.
>
> Good idea. Even though the cost of pinning/unpinning may not be high
> with respect to the vacuum cost itself, but it seems to be a good idea
> because we already do that at other places. Do you have any other
> review comments on the patch or I'll fix this and send an updated
> patch soon.

That was the only thing that stood out to me. You had some doubts about visibility_cutoff_xid, but I don't know enough about that to address them. I agree that it would be nice to avoid the code duplication, but I don't see a reasonable way to do that.

It applies cleanly (some offsets), builds without warnings, and passes make check under cassert. No documentation or regression tests are needed. We want this, and it does it.

I have verified the primary objective (setting the VM bit sooner) but haven't experimentally verified that this actually increases throughput for some workload. For pgbench when all data fits in shared_buffers, at least it doesn't cause a readily noticeable slowdown.

I haven't devised any patch-specific test cases, either for correctness or performance. I just used make check, pgbench, and the "vacuum verbose" verification you told us about in the original submission.

If we are going to scan a block twice, I wonder if it should be done the other way around. If there are any dead tuples that need to be removed from indexes, there is no point in dirtying the page with a HOT prune on the first pass when it will just be dirtied again after the indexes are vacuumed. I don't see this idea holding up your patch though, as surely it would be more invasive than what you are doing.

Cheers,
Jeff
On Thu, Jan 24, 2013 at 9:31 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Thu, Jan 24, 2013 at 1:28 AM, Pavan Deolasee
> <pavan.deolasee@gmail.com> wrote:
>> Good idea. Even though the cost of pinning/unpinning may not be high
>> with respect to the vacuum cost itself, but it seems to be a good idea
>> because we already do that at other places. Do you have any other
>> review comments on the patch or I'll fix this and send an updated
>> patch soon.
>
> That was the only thing that stood out to me.

The attached patch gets that improvement. It is also rebased on the latest head.

> You had some doubts about visibility_cutoff_xid, but I don't know
> enough about that to address them. I agree that it would be nice to
> avoid the code duplication, but I don't see a reasonable way to do
> that.

I have left a comment there to ensure that someone changing the code also takes pains to look at the other part.

> It applies cleanly (some offsets), builds without warnings, passes
> make check under cassert. No documentation or regression tests are
> needed. We want this, and it does it.
>
> I have verified the primary objective (setting vm sooner) but haven't
> experimentally verified that this actually increases throughput for
> some workload. For pgbench when all data fits in shared_buffers, at
> least it doesn't cause a readily noticeable slow down.
>
> I haven't devised any patch-specific test cases, either for
> correctness or performance. I just used make check, pgbench, and the
> "vacuum verbose" verification you told us about in the original
> submission.

Thanks for the tests and all the review.

> If we are going to scan a block twice, I wonder if it should be done
> the other way around. If there are any dead tuples that need to be
> removed from indexes, there is no point in dirtying the page with a
> HOT prune on the first pass when it will just be dirtied again after
> the indexes are vacuumed.
> I don't see this idea holding up your patch though, as surely it would
> be more invasive than what you are doing.

I think there is merit in this idea. But as you rightly said, we should tackle that as a separate patch.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
Attachments
On Sat, Jan 26, 2013 at 11:25 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> On Thu, Jan 24, 2013 at 9:31 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> On Thu, Jan 24, 2013 at 1:28 AM, Pavan Deolasee
>> <pavan.deolasee@gmail.com> wrote:
>>> Good idea. Even though the cost of pinning/unpinning may not be high
>>> with respect to the vacuum cost itself, but it seems to be a good idea
>>> because we already do that at other places. Do you have any other
>>> review comments on the patch or I'll fix this and send an updated
>>> patch soon.
>>
>> That was the only thing that stood out to me.
>
> The attached patch gets that improvement. Also rebased on the latest head.

Hi Pavan,

I get this warning:

    vacuumlazy.c:890: warning: passing argument 6 of 'lazy_vacuum_page'
    makes pointer from integer without a cast

and make check then fails. I've added '&' to that line, and it now passes make check with --enable-cassert.

At line 1096, when you release the vmbuffer, you don't set it to InvalidBuffer like the other places in the code do. It seems like this would lead to a crash or assertion failure, but it does not seem to do so. Other places do:

    if (BufferIsValid(vmbuffer))
    {
        ReleaseBuffer(vmbuffer);
        vmbuffer = InvalidBuffer;
    }

Also, the "Note: If you change anything below, also look at" should probably say "Note: If you change anything in the for loop below, also look at". Otherwise I'd be wondering how far below the caveat applies.

I've attached a patch with these changes made. Does this look OK?

Thanks,
Jeff
Attachments
On Sun, Feb 3, 2013 at 2:31 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Hi Pavan,
>
> I get this warning:
> vacuumlazy.c:890: warning: passing argument 6 of 'lazy_vacuum_page'
> makes pointer from integer without a cast
>
> and make check then fails.
>
> I've added '&' to that line, and it now passes make check with --enable-cassert.

Stupid me. Obviously I did not run make check before submitting the patch, but I'm surprised my short pgbench test did not catch this. Thanks a lot for finding and fixing this.

> At line 1096, when you release the vmbuffer, you don't set it to
> InvalidBuffer like the other places in the code do. It seems like
> this would lead to a crash or assertion failure, but it does not
> seem to do so.

That's harmless, because vmbuffer is just a local variable in that function, we are at the end of the function, and the variable is not used again. But it helps to be consistent, so I'm OK with your change.

> Also, the "Note: If you change anything below, also look at" should
> probably say "Note: If you change anything in the for loop below, also
> look at". Otherwise I'd be wondering how far below the caveat
> applies.

Ok.

> I've attached a patch with these changes made. Does this look OK?

Looks good to me. I also repeated pgbench and make check, and they work as expected. I'll add it to the CF and also mark the patch "ready for committer".

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
On 03.02.2013 08:24, Pavan Deolasee wrote: > On Sun, Feb 3, 2013 at 2:31 AM, Jeff Janes<jeff.janes@gmail.com> wrote: >> I've attached a patch with these changes made. Does this look OK? > > Looks good to me. I also repeated pgbench and make check and they work > as expected. I'll add it to the CF and also mark the patch "ready for > committer" Looks good to me too. Committed, thanks. - Heikki