Обсуждение: Write visibility map during CLUSTER/VACUUM FULL

Поиск
Список
Период
Сортировка

Write visibility map during CLUSTER/VACUUM FULL

От
Alexander Korotkov
Дата:
Hi!

I found it weird that CLUSTER/VACUUM FULL don't write visibility map.
Attached patch implements writing visibility map in
heapam_relation_copy_for_cluster().

I've studied previous attempt to implement this [1].  The main problem
of that attempt was usage of existing heap_page_is_all_visible() and
visibilitymap_set() functions.  These functions works through buffer
manager, while heap rewriting is made bypass buffer manager.

In my patch visibility map pages are handled in the same way as heap
pages are.  RewriteStateData holds contents of single VM page.  Once
heap page is finished, corresponding bits are set to VM page etc.  VM
pages are flushed one-by-one to smgr.

This patch have to implement its own check if tuple is allvisible.
But it appears to be possible to simplify this check assuming that all
tuples already past HeapTupleSatisfiesVacuum(), which sets hint bits.

Links
1. https://www.postgresql.org/message-id/flat/20131203162556.GC27105%40momjian.us#90e4a6e77d92076650dcf1d96b32ba38

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

Re: Write visibility map during CLUSTER/VACUUM FULL

От
Alexander Korotkov
Дата:
On Sun, Sep 1, 2019 at 11:07 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> This patch have to implement its own check if tuple is allvisible.
> But it appears to be possible to simplify this check assuming that all
> tuples already past HeapTupleSatisfiesVacuum(), which sets hint bits.

Forgot to check tuple xmin against oldest xmin.  Fixed.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

Re: Write visibility map during CLUSTER/VACUUM FULL

От
Amit Kapila
Дата:
On Sun, Sep 1, 2019 at 1:37 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
>
> Hi!
>
> I found it weird that CLUSTER/VACUUM FULL don't write visibility map.
> Attached patch implements writing visibility map in
> heapam_relation_copy_for_cluster().
>
> I've studied previous attempt to implement this [1].  The main problem
> of that attempt was usage of existing heap_page_is_all_visible() and
> visibilitymap_set() functions.  These functions works through buffer
> manager, while heap rewriting is made bypass buffer manager.
>
> In my patch visibility map pages are handled in the same way as heap
> pages are.
>

I haven't studied this patch in detail, but while glancing I observed
that this doesn't try to sync the vm pages as we do for heap pages in
the end (during end_heap_rewrite).  Am I missing something?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Write visibility map during CLUSTER/VACUUM FULL

От
Alexander Korotkov
Дата:
On Wed, Sep 11, 2019 at 3:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sun, Sep 1, 2019 at 1:37 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > I found it weird that CLUSTER/VACUUM FULL don't write visibility map.
> > Attached patch implements writing visibility map in
> > heapam_relation_copy_for_cluster().
> >
> > I've studied previous attempt to implement this [1].  The main problem
> > of that attempt was usage of existing heap_page_is_all_visible() and
> > visibilitymap_set() functions.  These functions works through buffer
> > manager, while heap rewriting is made bypass buffer manager.
> >
> > In my patch visibility map pages are handled in the same way as heap
> > pages are.
> >
>
> I haven't studied this patch in detail, but while glancing I observed
> that this doesn't try to sync the vm pages as we do for heap pages in
> the end (during end_heap_rewrite).  Am I missing something?

You're not missed anything.  Yes, VM need sync.  Will fix this.  And I
just noticed I need a closer look to what is going on with TOAST.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Write visibility map during CLUSTER/VACUUM FULL

От
Alexander Korotkov
Дата:
On Thu, Sep 12, 2019 at 4:55 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Wed, Sep 11, 2019 at 3:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Sun, Sep 1, 2019 at 1:37 PM Alexander Korotkov
> > <a.korotkov@postgrespro.ru> wrote:
> > > I found it weird that CLUSTER/VACUUM FULL don't write visibility map.
> > > Attached patch implements writing visibility map in
> > > heapam_relation_copy_for_cluster().
> > >
> > > I've studied previous attempt to implement this [1].  The main problem
> > > of that attempt was usage of existing heap_page_is_all_visible() and
> > > visibilitymap_set() functions.  These functions works through buffer
> > > manager, while heap rewriting is made bypass buffer manager.
> > >
> > > In my patch visibility map pages are handled in the same way as heap
> > > pages are.
> > >
> >
> > I haven't studied this patch in detail, but while glancing I observed
> > that this doesn't try to sync the vm pages as we do for heap pages in
> > the end (during end_heap_rewrite).  Am I missing something?
>
> You're not missed anything.  Yes, VM need sync.  Will fix this.  And I
> just noticed I need a closer look to what is going on with TOAST.

Attached patch syncs VM during end_heap_rewrite().

However, VM for TOAST still isn't read.  It appear to be much more
difficult to write VM for TOAST, because it's written by insertion
tuples one-by-one.  Despite it seems to fill TOAST heap pages
sequentially (assuming no FSM exists yet), it's quite hard to handle
page-switching event with reasonable level of abstraction.
Nevertheless, I find this patch useful in current shape.  Even if we
don't write VM for TOAST, it's still useful to do for regular heap.
Additionally, one of key advantages of having VM is index-only scan,
which don't work for TOAST anyway.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

Re: Write visibility map during CLUSTER/VACUUM FULL

От
Andres Freund
Дата:
Hi,

On 2019-09-13 22:22:50 +0300, Alexander Korotkov wrote:
> On Thu, Sep 12, 2019 at 4:55 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > On Wed, Sep 11, 2019 at 3:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > On Sun, Sep 1, 2019 at 1:37 PM Alexander Korotkov
> > > <a.korotkov@postgrespro.ru> wrote:
> > > > I found it weird that CLUSTER/VACUUM FULL don't write visibility map.
> > > > Attached patch implements writing visibility map in
> > > > heapam_relation_copy_for_cluster().
> > > >
> > > > I've studied previous attempt to implement this [1].  The main problem
> > > > of that attempt was usage of existing heap_page_is_all_visible() and
> > > > visibilitymap_set() functions.  These functions works through buffer
> > > > manager, while heap rewriting is made bypass buffer manager.
> > > >
> > > > In my patch visibility map pages are handled in the same way as heap
> > > > pages are.

Why do you want to do that? To me that doesn't immediately seems like a
good idea, and I've not seen justification for it in this thread. Did I
miss something?


> > > I haven't studied this patch in detail, but while glancing I observed
> > > that this doesn't try to sync the vm pages as we do for heap pages in
> > > the end (during end_heap_rewrite).  Am I missing something?
> >
> > You're not missed anything.  Yes, VM need sync.  Will fix this.  And I
> > just noticed I need a closer look to what is going on with TOAST.
> 
> Attached patch syncs VM during end_heap_rewrite().
> 
> However, VM for TOAST still isn't read.  It appear to be much more
> difficult to write VM for TOAST, because it's written by insertion
> tuples one-by-one.  Despite it seems to fill TOAST heap pages
> sequentially (assuming no FSM exists yet), it's quite hard to handle
> page-switching event with reasonable level of abstraction.
> Nevertheless, I find this patch useful in current shape.  Even if we
> don't write VM for TOAST, it's still useful to do for regular heap.
> Additionally, one of key advantages of having VM is index-only scan,
> which don't work for TOAST anyway.

Have you looked at the discussion around
https://www.postgresql.org/message-id/20190408010427.4l63qr7h2fjcyp77%40alap3.anarazel.de
?

That's not quite the same thing as you're tackling here, but I wonder if
some of the logic could be the same. Especially for toast.


> +/* Write contents of VM page */
> +static void
> +rewrite_flush_vm_page(RewriteState state)
> +{
> +    Assert(state->rs_vm_buffer_valid);
> +
> +    if (state->rs_use_wal)
> +        log_newpage(&state->rs_new_rel->rd_node,
> +                    VISIBILITYMAP_FORKNUM,
> +                    state->rs_vm_blockno,
> +                    state->rs_vm_buffer,
> +                    true);
> +    RelationOpenSmgr(state->rs_new_rel);
> +
> +    PageSetChecksumInplace(state->rs_vm_buffer, state->rs_vm_blockno);
> +
> +    smgrextend(state->rs_new_rel->rd_smgr, VISIBILITYMAP_FORKNUM,
> +               state->rs_vm_blockno, (char *) state->rs_vm_buffer, true);
> +
> +    state->rs_vm_buffer_valid = false;
> +}
> +
> +/* Set VM flags to the VM page */
> +static void
> +rewrite_set_vm_flags(RewriteState state)
> +{
> +    BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(state->rs_blockno);
> +    uint32        mapByte = HEAPBLK_TO_MAPBYTE(state->rs_blockno);
> +    uint8        mapOffset = HEAPBLK_TO_OFFSET(state->rs_blockno);
> +    char       *map;
> +    uint8        flags;
> +
> +    if (mapBlock != state->rs_vm_blockno && state->rs_vm_buffer_valid)
> +        rewrite_flush_vm_page(state);
> +
> +    if (!state->rs_vm_buffer_valid)
> +    {
> +        PageInit(state->rs_vm_buffer, BLCKSZ, 0);
> +        state->rs_vm_blockno = mapBlock;
> +        state->rs_vm_buffer_valid = true;
> +    }
> +
> +    flags = (state->rs_all_visible ? VISIBILITYMAP_ALL_VISIBLE : 0) |
> +            (state->rs_all_frozen ? VISIBILITYMAP_ALL_FROZEN : 0);
> +
> +    map = PageGetContents(state->rs_vm_buffer);
> +    map[mapByte] |= (flags << mapOffset);
> +}

I think it's a bad idea to copy this many VM implementation details into
rewriteheap.c.


> +/*
> + * Update rs_all_visible and rs_all_frozen flags according to the tuple.  We
> + * use simplified check assuming that HeapTupleSatisfiesVacuum() should already
> + * set tuple hint bits.
> + */
> +static void
> +rewrite_update_vm_flags(RewriteState state, HeapTuple tuple)
> +{
> +    TransactionId    xmin;
> +
> +    if (!state->rs_all_visible)
> +        return;
> +
> +    if (!HeapTupleHeaderXminCommitted(tuple->t_data))
> +    {
> +        state->rs_all_visible = false;
> +        state->rs_all_frozen = false;
> +        return;
> +    }
> +
> +    xmin = HeapTupleHeaderGetXmin(tuple->t_data);
> +    if (!TransactionIdPrecedes(xmin, state->rs_oldest_xmin))
> +    {
> +        state->rs_all_visible = false;
> +        state->rs_all_frozen = false;
> +        return;
> +    }
> +
> +    if (!(tuple->t_data->t_infomask & HEAP_XMAX_INVALID) &&
> +        !HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_data->t_infomask))
> +    {
> +        state->rs_all_visible = false;
> +        state->rs_all_frozen = false;
> +        return;
> +    }
> +
> +    if (!state->rs_all_frozen)
> +        return;
> +
> +    if (heap_tuple_needs_eventual_freeze(tuple->t_data))
> +        state->rs_all_frozen = false;
> +}

I don't think we should have yet another copy of logic determining
visibility. It has repeatedly proven hard to get right in the past, and
it'll not get better by having yet another copy.  Especially not because
we've basically already done a HTSV (via
heapam_relation_copy_for_cluster), and we're now redoing a poor man's
version of it.


Greetings,

Andres Freund



Re: Write visibility map during CLUSTER/VACUUM FULL

От
Michael Paquier
Дата:
On Fri, Sep 20, 2019 at 01:05:06PM -0700, Andres Freund wrote:
> I don't think we should have yet another copy of logic determining
> visibility. It has repeatedly proven hard to get right in the past, and
> it'll not get better by having yet another copy.  Especially not because
> we've basically already done a HTSV (via
> heapam_relation_copy_for_cluster), and we're now redoing a poor man's
> version of it.

These comments are unanswered for more than 2 months, so I am marking
this entry as returned with feedback.  Please feel free to change if
you think that's not adapted.
--
Michael

Вложения

Re: Write visibility map during CLUSTER/VACUUM FULL

От
Anna Akenteva
Дата:
On 2019-11-29 05:32, Michael Paquier wrote:
> These comments are unanswered for more than 2 months, so I am marking
> this entry as returned with feedback.

I'd like to revive this patch. To make further work easier, attaching a 
rebased version of the latest patch by Alexander.

As for having yet another copy of logic determining visibility: the 
visibility check was mainly taken from heap_page_is_all_visible(), so I 
refactored the code to make sure that logic is not duplicated. The 
updated patch is attached too.

-- 
Anna Akenteva
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: Write visibility map during CLUSTER/VACUUM FULL

От
vignesh C
Дата:
On Mon, Jun 28, 2021 at 1:52 PM Anna Akenteva <a.akenteva@postgrespro.ru> wrote:
>
> On 2019-11-29 05:32, Michael Paquier wrote:
> > These comments are unanswered for more than 2 months, so I am marking
> > this entry as returned with feedback.
>
> I'd like to revive this patch. To make further work easier, attaching a
> rebased version of the latest patch by Alexander.
>
> As for having yet another copy of logic determining visibility: the
> visibility check was mainly taken from heap_page_is_all_visible(), so I
> refactored the code to make sure that logic is not duplicated. The
> updated patch is attached too.

The patch does not apply on Head anymore, could you rebase and post a
patch. I'm changing the status to "Waiting for Author".

Regards,
Vignesh



Re: Write visibility map during CLUSTER/VACUUM FULL

От
Justin Pryzby
Дата:
On Mon, Jun 28, 2021 at 11:22:01AM +0300, Anna Akenteva wrote:
> On 2019-11-29 05:32, Michael Paquier wrote:
> > These comments are unanswered for more than 2 months, so I am marking
> > this entry as returned with feedback.
> 
> I'd like to revive this patch. To make further work easier, attaching a
> rebased version of the latest patch by Alexander.
> 
> As for having yet another copy of logic determining visibility: the
> visibility check was mainly taken from heap_page_is_all_visible(), so I
> refactored the code to make sure that logic is not duplicated. The updated
> patch is attached too.

I agree that it's strange that VACUUM(FREEZE) freezes tuples but not VM bits,
nor page-level PD_ALL_VISIBLE (which is implied by all frozen).  After FULL
runs (taking an exclusive lock on the table), it's necessary to then vacuum the
table again to get what's intended.

Rebased on f10f0ae420ee62400876ab34dca2c09c20dcd030.

And rephrased Anna's two independent/alternate patches as a 2nd patch on top of
the 1st, as that helps me to read it and reduces its total size.

I noticed in testing the patch that autovacuum is still hitting the relation
shortly after vacuum full.  This is due to n_ins_since_autovacuum, which is new
in pg13.  I don't know how to address that (or even if it should be addressed
at all).

Also, pg_class.relallvisible is not set unless vacuum/analyze is run again
(which is mitigated by the n_ins behavior above).  It seems like this might be
important: an plan which uses index-only scan might be missed in favour of
something else until autovacuum runs (it will use cost-based delay, and might
not be scheduled immediately, could be interrupted, or even diasbled).

I'm testing like this:
CREATE EXTENSION IF NOT EXISTS pg_visibility ; DROP TABLE t; CREATE TABLE t AS SELECT generate_series(1,99999); VACUUM
FULLt; ANALYZE t; SELECT n_ins_since_vacuum, n_tup_upd, n_tup_del, n_tup_hot_upd FROM pg_stat_all_tables WHERE
relname='t';SELECT relpages, relallvisible FROM pg_class WHERE oid='t'::regclass; SELECT COUNT(1), COUNT(1)FILTER(WHERE
all_visible='t')allvis, COUNT(1)FILTER(WHERE all_frozen='t') allfrz, COUNT(1)FILTER(WHERE pd_all_visible='t') allvispd
FROMpg_visibility('t');
 

-- 
Justin

Вложения

Re: Write visibility map during CLUSTER/VACUUM FULL

От
Daniel Gustafsson
Дата:
> On 30 Aug 2021, at 02:26, Justin Pryzby <pryzby@telsasoft.com> wrote:

> Rebased on f10f0ae420ee62400876ab34dca2c09c20dcd030.

This patch no longer applies, please submit a rebased version.

--
Daniel Gustafsson        https://vmware.com/




Re: Write visibility map during CLUSTER/VACUUM FULL

От
Justin Pryzby
Дата:
On Sun, Aug 29, 2021 at 07:26:42PM -0500, Justin Pryzby wrote:
> On Mon, Jun 28, 2021 at 11:22:01AM +0300, Anna Akenteva wrote:
> > On 2019-11-29 05:32, Michael Paquier wrote:
> > > These comments are unanswered for more than 2 months, so I am marking
> > > this entry as returned with feedback.
> > 
> > I'd like to revive this patch. To make further work easier, attaching a
> > rebased version of the latest patch by Alexander.
> > 
> > As for having yet another copy of logic determining visibility: the
> > visibility check was mainly taken from heap_page_is_all_visible(), so I
> > refactored the code to make sure that logic is not duplicated. The updated
> > patch is attached too.
> 
> I agree that it's strange that VACUUM(FREEZE) freezes tuples but not VM bits,
> nor page-level PD_ALL_VISIBLE (which is implied by all frozen).  After FULL
> runs (taking an exclusive lock on the table), it's necessary to then vacuum the
> table again to get what's intended.
> 
> Rebased on f10f0ae420ee62400876ab34dca2c09c20dcd030.

Rebased this patch again

Alexander, are you planning on working on this patch ?

Otherwise, Anna, would you want to "own" the patch ?

> And rephrased Anna's two independent/alternate patches as a 2nd patch on top of
> the 1st, as that helps me to read it and reduces its total size.
> 
> I noticed in testing the patch that autovacuum is still hitting the relation
> shortly after vacuum full.  This is due to n_ins_since_autovacuum, which is new
> in pg13.  I don't know how to address that (or even if it should be addressed
> at all).
> 
> Also, pg_class.relallvisible is not set unless vacuum/analyze is run again
> (which is mitigated by the n_ins behavior above).  It seems like this might be
> important: an plan which uses index-only scan might be missed in favour of
> something else until autovacuum runs (it will use cost-based delay, and might
> not be scheduled immediately, could be interrupted, or even diasbled).
> 
> I'm testing like this:
> CREATE EXTENSION IF NOT EXISTS pg_visibility ; DROP TABLE t; CREATE TABLE t AS SELECT generate_series(1,99999);
VACUUMFULL t; ANALYZE t; SELECT n_ins_since_vacuum, n_tup_upd, n_tup_del, n_tup_hot_upd FROM pg_stat_all_tables WHERE
relname='t';SELECT relpages, relallvisible FROM pg_class WHERE oid='t'::regclass; SELECT COUNT(1), COUNT(1)FILTER(WHERE
all_visible='t')allvis, COUNT(1)FILTER(WHERE all_frozen='t') allfrz, COUNT(1)FILTER(WHERE pd_all_visible='t') allvispd
FROMpg_visibility('t'); 

Вложения

Re: Write visibility map during CLUSTER/VACUUM FULL

От
Justin Pryzby
Дата:
On Mon, Nov 15, 2021 at 04:38:56PM -0600, Justin Pryzby wrote:
> On Sun, Aug 29, 2021 at 07:26:42PM -0500, Justin Pryzby wrote:
> > On Mon, Jun 28, 2021 at 11:22:01AM +0300, Anna Akenteva wrote:
> > > On 2019-11-29 05:32, Michael Paquier wrote:
> > > > These comments are unanswered for more than 2 months, so I am marking
> > > > this entry as returned with feedback.
> > > 
> > > I'd like to revive this patch. To make further work easier, attaching a
> > > rebased version of the latest patch by Alexander.
> > > 
> > > As for having yet another copy of logic determining visibility: the
> > > visibility check was mainly taken from heap_page_is_all_visible(), so I
> > > refactored the code to make sure that logic is not duplicated. The updated
> > > patch is attached too.
> > 
> > I agree that it's strange that VACUUM(FREEZE) freezes tuples but not VM bits,
> > nor page-level PD_ALL_VISIBLE (which is implied by all frozen).  After FULL
> > runs (taking an exclusive lock on the table), it's necessary to then vacuum the
> > table again to get what's intended.
> > 
> > Rebased on f10f0ae420ee62400876ab34dca2c09c20dcd030.
> 
> Rebased this patch again
> 
> Alexander, are you planning on working on this patch ?
> 
> Otherwise, Anna, would you want to "own" the patch ?

Rebased on 8e1fae193864527c931a704bd7908e4fbc983f5c.

Would someone step up to "own" this patch ?

If not, its CF entry may need to be closed (there's no status for "needs
author").

Вложения

Re: Write visibility map during CLUSTER/VACUUM FULL

От
Justin Pryzby
Дата:
On Sun, Dec 26, 2021 at 08:59:31PM -0600, Justin Pryzby wrote:
> Rebased on 8e1fae193864527c931a704bd7908e4fbc983f5c.
> 
> Would someone step up to "own" this patch ?
> 
> If not, its CF entry may need to be closed (there's no status for "needs
> author").

I'm planning to close this patch until someone can shepherd it.

-- 
Justin



Re: Write visibility map during CLUSTER/VACUUM FULL

От
Jacob Champion
Дата:
Justin Pryzby <pryzby@telsasoft.com> wrote:
> I'm planning to close this patch until someone can shepherd it.

I've marked it RwF for now.

--Jacob