Обсуждение: Error "vacuum pg_proc"

Поиск
Список
Период
Сортировка

Error "vacuum pg_proc"

От
Mateus Cordeiro Inssa
Дата:
  Hi,
 I got this error vacuuming pg_proc:

ERROR:  _bt_endpoint: leftmost page (20) has not leftmost flag
 It's postgresql 6.5.3 (Linux).
 []'s

Mateus Cordeiro Inssa
---------------------
Linux User: 76186  Kernel: 2.3.34
ICQ (Licq): 15243895
---------------------
mateus@ifnet.com.br
mateus@cwb.fnn.net

Fri Dec 24 09:55:14 EDT 1999


Re: [HACKERS] Error "vacuum pg_proc"

От
Tom Lane
Дата:
Mateus Cordeiro Inssa <mateus@ifnet.com.br> writes:
>   I got this error vacuuming pg_proc:
> ERROR:  _bt_endpoint: leftmost page (20) has not leftmost flag

Hmm, I wonder if this could be yet another manifestation of the problems
that btree indexes have with oversized key values.  Do you have any
procedures with long definitions?  "Long" in this context means over
about 4K.  If you're not sure, tryselect proname from pg_proc where length(prosrc) > 4000;

If you do, try breaking them up into smaller procedures.  You might have
to dump and rebuild the database to get rid of the corruption in
pg_proc's index, though.

The reason this is an issue is that btree wants to be able to store at
least two index tuples per disk page, so it has problems with indexing
values over half-a-page-less-overhead.  I'm not sure exactly what the
critical size is, but it is somewhere around 4000 bytes.  And pg_proc
has an index on the prosrc field.

The prosrc index is actually completely unnecessary, so we've removed
it for 7.0.  Work is in progress to fix the tuple-size problem as well,
but that will probably take longer.
        regards, tom lane


Re: [HACKERS] Error "vacuum pg_proc"

От
Tom Lane
Дата:
I wrote:
> Mateus Cordeiro Inssa <mateus@ifnet.com.br> writes:
>> I got this error vacuuming pg_proc:
>> ERROR:  _bt_endpoint: leftmost page (20) has not leftmost flag

> Hmm, I wonder if this could be yet another manifestation of the problems
> that btree indexes have with oversized key values.  Do you have any
> procedures with long definitions?  "Long" in this context means over
> about 4K.

I spent some more time looking into this, and found out that actually
the safe upper limit for btree index entries is not ~ BLCKSZ/2, but
~ BLCKSZ/3.  btree needs to be able to insert two items on every page,
but it *also* keeps an extra item (the "high key") on non-rightmost
pages.  So if any item exceeds one-third the available space, you run
a risk of failure depending on what page it ends up on and what else
is on that same page.

It turns out that for an index on a single text column, the maximum
safe text length is 2700 bytes.  So the correct check for dangerous
procedure definitions in 6.5.* is
select proname from pg_proc where length(prosrc) > 2700;

I have committed a check for dangerously long index entries into both
current sources and REL6_5 branch.  The patch is attached if you want to
add it to your installation right away.  Note that this only defends
against oversize new entries; if you've already got oversize index
entries, you still have trouble waiting to happen...
        regards, tom lane

*** src/backend/access/nbtree/nbtinsert.c.orig    Wed Sep  1 13:54:00 1999
--- src/backend/access/nbtree/nbtinsert.c    Sun Dec 26 15:44:15 1999
***************
*** 268,273 ****
--- 268,285 ----                                          * consistent */      /*
+      * Check whether the item can fit on a btree page at all.
+      * (Eventually, we ought to try to apply TOAST methods if not.)
+      * We actually need to be able to fit three items on every page,
+      * so restrict any one item to 1/3 the per-page available space.
+      * Note that at this point, itemsz doesn't include the ItemId.
+      */
+     if (itemsz > (PageGetPageSize(page)-sizeof(PageHeaderData)-MAXALIGN(sizeof(BTPageOpaqueData)))/3 -
sizeof(ItemIdData))
+         elog(ERROR, "btree: index item size %d exceeds maximum %d",
+              itemsz,
+              (PageGetPageSize(page)-sizeof(PageHeaderData)-MAXALIGN(sizeof(BTPageOpaqueData)))/3 -
sizeof(ItemIdData));
+ 
+     /*      * If we have to insert item on the leftmost page which is the first      * page in the chain of
duplicatesthen: 1. if scankey == hikey (i.e.      * - new duplicate item) then insert it here; 2. if scankey < hikey
 


Re: [HACKERS] Error "vacuum pg_proc"

От
Bruce Momjian
Дата:
> I spent some more time looking into this, and found out that actually
> the safe upper limit for btree index entries is not ~ BLCKSZ/2, but
> ~ BLCKSZ/3.  btree needs to be able to insert two items on every page,
> but it *also* keeps an extra item (the "high key") on non-rightmost
> pages.  So if any item exceeds one-third the available space, you run
> a risk of failure depending on what page it ends up on and what else
> is on that same page.
> 
> It turns out that for an index on a single text column, the maximum
> safe text length is 2700 bytes.  So the correct check for dangerous
> procedure definitions in 6.5.* is

This is another argument to try and get long tuples into 7.0.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Error "vacuum pg_proc"

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> It turns out that for an index on a single text column, the maximum
>> safe text length is 2700 bytes.

> This is another argument to try and get long tuples into 7.0.

I think Jan might have enough on his plate already without trying to
TOAST the index code along with the plain-table code.  But if he can
get it done, great!

One thing this does bring up is that the maximum safe tuple length is
dependent on the index or table type.  The toaster's API had better
be designed accordingly...
        regards, tom lane


Re: [HACKERS] Error "vacuum pg_proc"

От
Bruce Momjian
Дата:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> It turns out that for an index on a single text column, the maximum
> >> safe text length is 2700 bytes.
> 
> > This is another argument to try and get long tuples into 7.0.
> 
> I think Jan might have enough on his plate already without trying to
> TOAST the index code along with the plain-table code.  But if he can
> get it done, great!
> 
> One thing this does bring up is that the maximum safe tuple length is
> dependent on the index or table type.  The toaster's API had better
> be designed accordingly...

In talking to Jan, the index code will make use of the toast entries
automatically.  He said the heap_insert will do any toasting, and after
that the tuple already has any toast pointers, and that gets inserted
into the index.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Error "vacuum pg_proc"

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> One thing this does bring up is that the maximum safe tuple length is
>> dependent on the index or table type.  The toaster's API had better
>> be designed accordingly...

> In talking to Jan, the index code will make use of the toast entries
> automatically.  He said the heap_insert will do any toasting, and after
> that the tuple already has any toast pointers, and that gets inserted
> into the index.

If that's his plan, then it's broken by design.  Toasting a complete
tuple cannot be the basis for toasting index entries related to the
tuple, because (a) the index entries will typically use only some of the
fields appearing in the tuple; (b) index entries have different length
limits than tuples do; (c) indexes might be created after the original
table is.  Heck, index *types* might be created after the original table
is.  If index toasting is dependent on toasting of the main table, the
only way it can work is to toast every varlena attribute down to a
prechosen maximum length that Jan hopes will satisfy every index type,
now or hereafter --- no matter whether the column in question has or
ever will have an index of any type.

And that'll still crash and burn for multicolumn indexes.

I thought the plan was to toast indexes independently of the main
table, ie, an index would have its own toast-table and its own
storage of oversize attributes --- where the *index* decides what
is oversize, not the main table.

If main tables and indexes point at the same toast-table entries,
I think VACUUM will become rather an interesting problem, too...
although maybe that could be worked around if VACUUM destroys indexes
and rebuilds them from scratch.
        regards, tom lane


Re: [HACKERS] Error "vacuum pg_proc"

От
Tom Lane
Дата:
I wrote:
> And that'll still crash and burn for multicolumn indexes.

Not to mention functional indexes, which typically store values that
don't appear in the referenced tuple at all.

Basically, indexes have to have their own toasters.  There's no other
way.
        regards, tom lane


Re: [HACKERS] Error "vacuum pg_proc"

От
Bruce Momjian
Дата:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> One thing this does bring up is that the maximum safe tuple length is
> >> dependent on the index or table type.  The toaster's API had better
> >> be designed accordingly...
> 
> > In talking to Jan, the index code will make use of the toast entries
> > automatically.  He said the heap_insert will do any toasting, and after
> > that the tuple already has any toast pointers, and that gets inserted
> > into the index.
> 
> If that's his plan, then it's broken by design.  Toasting a complete
> tuple cannot be the basis for toasting index entries related to the
> tuple, because (a) the index entries will typically use only some of the
> fields appearing in the tuple; (b) index entries have different length
> limits than tuples do; (c) indexes might be created after the original

Yes, I see your point.  This could be quite complicated.  He is
targeting BLKSZ and indexes are BLKSZ/3.  He could toast any single
attribute of size >= BLKSZ/3, but as you mentioned multi-column
indexes would be a problem.  I wonder if he could create the TOAST
entries and modify the heap tuple during index creation?

Wow, this clearly is going to be messy.  Maybe he is going to have to
have a separate TOAST table for the index.  If he got fancy, he could
use the heap TOAST table as needed, and have a secondary index TOAST for
cases the tuple needs further TOASTS for indexes.

It shows we haven't thought through all this yet.




--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Error "vacuum pg_proc"

От
Mateus Cordeiro Inssa
Дата:
Tom Lane writes:> Mateus Cordeiro Inssa <mateus@ifnet.com.br> writes:> >   I got this error vacuuming pg_proc:> >
ERROR: _bt_endpoint: leftmost page (20) has not leftmost flag> > Hmm, I wonder if this could be yet another
manifestationof the problems> that btree indexes have with oversized key values.  Do you have any> procedures with long
definitions? "Long" in this context means over> about 4K.  If you're not sure, try>     select proname from pg_proc
wherelength(prosrc) > 4000;
 
 Yes, I have some functions from 3k to 5k.
> If you do, try breaking them up into smaller procedures.  You might have> to dump and rebuild the database to get rid
ofthe corruption in> pg_proc's index, though.
 
 Ok.
> The prosrc index is actually completely unnecessary, so we've removed> it for 7.0.  Work is in progress to fix the
tuple-sizeproblem as well,> but that will probably take longer.
 
 Oh, I would ask why there was this index. I had problems with it
since version 6.4.
 I'd like to suggest the creation of a new command: ALTER FUNCTION. I 
use pltcl to program in the server, so, no need for checking the
function code. The problems with pg_proc always occurred to me when
changing functions: DROP/CREATE. This command would do just an update on
prosrc field (that doesn't have index anymore).
 []'s

Mateus Cordeiro Inssa
---------------------
Linux User: 76186  Kernel: 2.3.34
ICQ (Licq): 15243895
---------------------
mateus@ifnet.com.br
mateus@cwb.fnn.net

Mon Dec 27 11:15:41 EDT 1999
.


Index toasting (was: Re: [HACKERS] Error "vacuum pg_proc")

От
Jan Wieck
Дата:
Tom Lane wrote:

> I wrote:
> > And that'll still crash and burn for multicolumn indexes.
>
> Not to mention functional indexes, which typically store values that
> don't appear in the referenced tuple at all.
>
> Basically, indexes have to have their own toasters.  There's no other
> way.
    You're right.
    I think it's best to delay index toasting until we have some    experience with normal, main tuple attribute
toasting.It'd be    nice if the solution had covered huge values to be indexed    automatically (what it doesn't any
more).But I think most    people can live with a database, that cannot index huge    values, but is capable to store
andretrieve them for now.
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #





Re: Index toasting (was: Re: [HACKERS] Error "vacuum pg_proc")

От
Don Baccus
Дата:
At 07:25 PM 1/3/00 +0100, Jan Wieck wrote:

>     I think it's best to delay index toasting until we have some
>     experience with normal, main tuple attribute toasting. It'd be
>     nice if the solution had covered huge values to be indexed
>     automatically (what it doesn't any more). But I think most
>     people can live with a database, that cannot index huge
>     values, but is capable to store and retrieve them for now.

>From my personal POV, I would certainly agree with this.  The stuff
I'm working on is probably pretty typical, using an integer key to
identify rows which contain large chunks of text, photographs, etc.
I'm perfectly happy not being able to index the big chunks, and
suspect a large percentage of users would feel the same.




- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.