Discussion: HOT line pointer bloat and PageRepairFragmentation


HOT line pointer bloat and PageRepairFragmentation

From
"Pavan Deolasee"
Date:
We know that HOT can cause line pointer bloat because of redirect-dead
line pointers. In the worst case there could be MaxHeapTuplesPerPage
redirect-dead line pointers in a page. VACUUM can reclaim these line
pointers and mark them ~LP_USED (what is now called LP_UNUSED).
But we don't reclaim the space used by unused line pointers while
repairing page fragmentation, and hence we would never be able to
remove the line pointer bloat completely. Fundamentally, we should
be able to reclaim the unused line pointers at the end of the lp
array (i.e. unused line pointers immediately adjacent to pd_lower).

I had earlier tried to repair the bloat by reclaiming the space used
by LP_UNUSED line pointers at the end of the array. But it doesn't work
well with VACUUM FULL, which tracks unused line pointers for moving
tuples. It's not that we cannot fix that issue, but I am reluctant to spend
time on that right now because many of us feel that VACUUM FULL is
near its EOL.

How about passing a boolean to PageRepairFragmentation to
command it to reclaim unused line pointers? We pass "true" at all
places except in the VACUUM FULL code path. IOW we reclaim unused
line pointers in defragmentation and LAZY VACUUM. We would need
to WAL-log this information in xl_heap_clean so that we redo the same
during recovery. I have a patch ready since I had already implemented
this a few weeks back.

Comments?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com
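
The proposal above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: the function name `repair_line_pointers`, the string states, and the use of a Python list for the line pointer array are all assumptions made for illustration. It shows why only unused slots at the end of the array can be given back.

```python
def repair_line_pointers(line_pointers, reclaim_unused):
    """Toy model of the proposed boolean: optionally trim trailing UNUSED slots.

    `line_pointers` is a list of state strings; index 0 stands for offset 1.
    Hypothetical helper, not PostgreSQL's PageRepairFragmentation.
    """
    result = list(line_pointers)  # leave the caller's list untouched
    if reclaim_unused:
        # Only the tail can go: dropping a middle slot would renumber the
        # slots after it and invalidate TIDs that point at them.
        while result and result[-1] == "UNUSED":
            result.pop()
    return result


page = ["NORMAL", "UNUSED", "NORMAL", "UNUSED", "UNUSED"]
trimmed = repair_line_pointers(page, reclaim_unused=True)     # flag set
untouched = repair_line_pointers(page, reclaim_unused=False)  # flag clear
```

In this model the unused slot at offset 2 survives even with the flag set, because a live slot follows it.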

Re: HOT line pointer bloat and PageRepairFragmentation

From
Tom Lane
Date:
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> How about passing a boolean to PageRepairFragmentation to
> command it to reclaim unused line pointers ?

The difficulty with this is having to be 100% confident that noplace in
the system tries to dereference a TID without checking that the line
number (offset) is within range.  At one time that was demonstrably
not so.  I think we've cleaned up most if not all such places, but
I wouldn't want to swear to it.

I'm not convinced it's worth taking any risk for.
        regards, tom lane
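
The kind of range check described above can be sketched in a toy model. The helper name `fetch_line_pointer` and the list representation are hypothetical; the only fact carried over from PostgreSQL is that line pointer offsets are 1-based. The point is that once the array can shrink, an offset remembered earlier may lie past its end.

```python
def fetch_line_pointer(line_pointers, offset):
    """Return the state at 1-based `offset`, or None when it is out of range."""
    if offset < 1 or offset > len(line_pointers):
        return None  # stale or bogus offset: refuse instead of reading past the array
    return line_pointers[offset - 1]


before = ["NORMAL", "NORMAL", "UNUSED", "UNUSED"]
after = before[:2]   # the array once the trailing UNUSED slots are reclaimed
stale_offset = 4     # an offset remembered from before the truncation

seen_before = fetch_line_pointer(before, stale_offset)  # "UNUSED"
seen_after = fetch_line_pointer(after, stale_offset)    # None
```

A caller that skipped the check would index past the end of the shortened array, which is the risk raised in this message.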


Re: HOT line pointer bloat and PageRepairFragmentation

From
"Pavan Deolasee"
Date:


On 9/13/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:


> The difficulty with this is having to be 100% confident that noplace in
> the system tries to dereference a TID without checking that the line
> number (offset) is within range.  At one time that was demonstrably
> not so.  I think we've cleaned up most if not all such places, but
> I wouldn't want to swear to it.


If there are such places, aren't we already in trouble? An unused
line pointer can be reused for an unrelated tuple. Dereferencing the TID
can cause data corruption, can't it? If you want, I can do
a quick search for all callers of PageGetItemId, confirm that
the offset is checked, and add any missing checks.

In normal circumstances, line pointer bloat should not occur. But in
some typical cases it may cause unrepairable damage. For example:

CREATE TABLE test (a int, b char(200));
CREATE UNIQUE INDEX testindx ON test(a);
INSERT INTO test VALUES (1, 'foo');

Now, if we repeatedly update the tuple so that each update is a
COLD update, we would bloat the page with redirect-dead line pointers.

Any other ideas for recovering from this?

Thanks,
Pavan


--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com
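
The scenario in this message can be simulated with a toy model. Everything here is an assumption made for illustration: the function names, the string states, that each new row version lands on the same page, and that the old version is pruned to a dead line pointer right after every update. The sketch only shows how the array grows and that marking slots unused does not shorten it.

```python
def cold_update_and_prune(slots):
    """One COLD update followed by pruning of the old version (toy model)."""
    live = slots.index("NORMAL")  # the single live row version
    if "UNUSED" in slots:
        slots[slots.index("UNUSED")] = "NORMAL"  # reuse a free slot
    else:
        slots.append("NORMAL")                   # otherwise grow the array
    slots[live] = "DEAD"  # storage is reclaimed, the line pointer stays


def vacuum(slots):
    """Mark every dead line pointer unused; the array keeps its length."""
    for i, state in enumerate(slots):
        if state == "DEAD":
            slots[i] = "UNUSED"


slots = ["NORMAL"]
for _ in range(5):
    cold_update_and_prune(slots)
after_updates = list(slots)  # five DEAD slots followed by the live one
vacuum(slots)
after_vacuum = list(slots)   # five UNUSED slots, still six entries in total
```

In this model the live version sits in the last slot after the updates, so nothing at the tail is reclaimable until later updates move it to a lower slot.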

Re: HOT line pointer bloat and PageRepairFragmentation

From
"Zeugswetter Andreas ADI SD"
Date:
> CREATE TABLE test (a int, b char(200));
> CREATE UNIQUE INDEX testindx ON test(a);
> INSERT INTO test VALUES (1, 'foo');
>
> Now, if we repeatedly update the tuple so that each update is a
> COLD update, we would bloat the page with redirect-dead line pointers.

Um, sorry for not understanding, but why would a COLD update produce a
redirect-dead line pointer (and not two LP_NORMAL ones) ?

Andreas


Re: HOT line pointer bloat and PageRepairFragmentation

From
"Pavan Deolasee"
Date:


On 9/13/07, Zeugswetter Andreas ADI SD <Andreas.Zeugswetter@s-itsolutions.at> wrote:

> > CREATE TABLE test (a int, b char(200));
> > CREATE UNIQUE INDEX testindx ON test(a);
> > INSERT INTO test VALUES (1, 'foo');
> >
> > Now, if we repeatedly update the tuple so that each update is a
> > COLD update, we would bloat the page with redirect-dead line pointers.
>
> Um, sorry for not understanding, but why would a COLD update produce a
> redirect-dead line pointer (and not two LP_NORMAL ones) ?


The COLD updated (old) tuple would be pruned to dead line pointer
once the tuple becomes DEAD. Normally that would let us reuse the
tuple storage for other purposes. We do the same for DELETEd tuples.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

Re: HOT line pointer bloat and PageRepairFragmentation

From
"Zeugswetter Andreas ADI SD"
Date:
> The COLD updated (old) tuple would be pruned to dead line pointer
> once the tuple becomes DEAD. Normally that would let us reuse the
> tuple storage for other purposes. We do the same for DELETEd tuples.

Oh, I thought only pruned tuples from HOT chains can produce a
"redirect dead" line pointer.

This looks like a problem, since we might end up with a page filled with
LP_DEAD slots, that all have no visibility info and can thus not be
cleaned by vacuum.

Maybe PageRepairFragmentation when called from HOT should prune less
aggressively, e.g. prune until a max of 1/2 the available slots are
LP_DEAD, and not prune the rest.

Andreas


Re: HOT line pointer bloat and PageRepairFragmentation

From
"Pavan Deolasee"
Date:


On 9/13/07, Zeugswetter Andreas ADI SD <Andreas.Zeugswetter@s-itsolutions.at> wrote:

> > The COLD updated (old) tuple would be pruned to dead line pointer
> > once the tuple becomes DEAD. Normally that would let us reuse the
> > tuple storage for other purposes. We do the same for DELETEd tuples.
>
> Oh, I thought only pruned tuples from HOT chains can produce a
> "redirect dead" line pointer.
>
> This looks like a problem, since we might end up with a page filled with
> LP_DEAD slots, that all have no visibility info and can thus not be
> cleaned by vacuum.


It has nothing to do with visibility info. We already know the tuple is DEAD,
and that's why its line pointer is LP_DEAD.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

Re: HOT line pointer bloat and PageRepairFragmentation

From
Tom Lane
Date:
"Zeugswetter Andreas ADI SD" <Andreas.Zeugswetter@s-itsolutions.at> writes:
> ...This looks like a problem, since we might end up with a page filled with
> LP_DEAD slots, that all have no visibility info and can thus not be
> cleaned by vacuum.

No, it's the other way round: an LP_DEAD item pointer can *always* be
cleaned by VACUUM.  It would not have become LP_DEAD unless someone had
confirmed that the pointed-to tuple was no longer visible to anyone.

The only reason we have LP_DEAD at all is that we don't want HOT pruning
to be required to remove the index entries that link to the item pointer.
        regards, tom lane
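
The division of labour described in this message can be sketched as a toy model. The names `hot_prune` and `vacuum`, the dict of slot states, and the set standing in for index entries are all hypothetical; the sketch only mirrors the ordering stated above: pruning leaves index entries alone, and VACUUM removes them before freeing the slot.

```python
def hot_prune(slots, offset):
    """Pruning frees the tuple but keeps the slot; index entries are untouched."""
    slots[offset] = "DEAD"


def vacuum(slots, index_entries):
    """Remove index entries for dead slots first, then mark the slots unused."""
    dead = {off for off, state in slots.items() if state == "DEAD"}
    index_entries.difference_update(dead)  # step 1: nothing points at them now
    for off in dead:
        slots[off] = "UNUSED"              # step 2: the slot may be reused


slots = {1: "NORMAL", 2: "NORMAL"}  # offset -> line pointer state
index_entries = {1, 2}              # offsets that an index still references

hot_prune(slots, 1)
state_after_prune = (slots[1], 1 in index_entries)   # still referenced

vacuum(slots, index_entries)
state_after_vacuum = (slots[1], 1 in index_entries)  # no longer referenced
```

Until VACUUM runs, the dead slot cannot be reused in this model, because an index entry still points at it.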