Re: right sibling is not next child

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: right sibling is not next child
Дата
Msg-id 3355.1144873721@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: right sibling is not next child  ("Peter Brant" <Peter.Brant@wicourts.gov>)
Список pgsql-bugs
"Peter Brant" <Peter.Brant@wicourts.gov> writes:
> One thing that seems strange to me is that the original crash on
> Thursday failed on Panel_pkey, but my "vacuum analyze verbose" on a copy
> of the crashed database failed on MaintCode /
> pg_statistic_relid_att_index.

I can't find anything particularly wrong with
pg_statistic_relid_att_index.  There are two index entries for many of
the key values, but that's unsurprising if a database-wide vacuum
analyze had been done or in progress --- just-obsoleted pg_statistic
rows wouldn't have been vacuumed away yet.  I'm inclined to think that
this problem is a side-effect of the fact that you had to use
pg_resetxlog; there was probably an update to pg_statistic that got
lost.  If you want to continue experimenting with this database,
I'd suggest TRUNCATE'ing pg_statistic and rebuilding it via another
VACUUM ANALYZE run.

> I'll send over pg_statistic_relid_att_index and Panel_pkey.  Showing
> the keys to Panel_pkey is no problem.

My little index printing tool shows these entries in Panel_pkey at btree
level zero:

...
40    2006    RES032706     CH0327RES         high key on page 606
40    2006    RES032706     RES032706         383    11
40    2006    RES032706     RES032706         high key on page 608
40    2006    RES040306     CC0403RES         507    14
40    2006    RES040306     CCC0403RES        551    1
40    2006    RES040306     CCC0403RES        high key on page 601
40    2006    RES040306     RES040306         500    1
40    2006    RES040306     RES040306         high key on page 640
40    2006    RES040306     RES040306         high key on page 636
40    2006    RES040306     RES040306         high key on page 635
41    0001    2000POOL      0001              159    3
41    0001    2000POOL      0002              159    4
41    0001    2000POOL      0003              159    5
...

(The first four columns are the key values of this index; the last two
are the pointed-to heap tuple's page/line location.  High keys are
printed after any data keys on the page.)  The down-links at level one
look like:

...
40    2006    RES032706     CCC0327RES        606    1
40    2006    RES032706     CH0327RES         608    1
40    2006    RES032706     RES032706         601    1
40    2006    RES040306     CCC0403RES        640    1
40    2006    RES040306     RES040306         635    1
40    2006    RES040306     RES040306         636    1
40    2006    RES040306     RES040306         635    1
40    2006    RES040306     RES040306         629    1
41    0405    0405          0105              166    1
...

This is fairly interesting because we've got three pages with the same
boundary key.  The bogus entry for page 635 has been inserted where
you'd expect it to get inserted if the insertion were being done on the
basis of key comparison.  (We usually insert a new entry in front of any
ones with the same key.)  But we never do insertions on non-leaf pages
by key comparison!  Upper-level entries are only made during page
splits, by _bt_insert_parent, and that always works by locating the
down-link to the page just split and adding the new entry just after it.

One thing I realize after seeing this is that the index corruption might
be of long standing: the way that the btree search algorithms work, no
search would ever have descended directly to 635 or 636, but instead to
640 and then traverse right from there.  So you'd not have noticed any
malfunction, until the time came to try to delete the page.  That means
we shouldn't assume that the problem was created recently.

Now that we know the problem is associated with keys '40 2006 RES040306
RES040306', can you check your logs and see if anything interesting
happened around the time those keys were being inserted?  (Note: if
Panel_pkey really is a primary key, ie unique, it might seem odd for
there to be several pages worth of identical entries --- this would
imply that the same record had been updated several hundred times
between two vacuums.)

I still kinda like the theory that the extra key got in there because
btree_xlog_cleanup fired inappropriately ... mainly because I can't
think of any other theories ... but I'm still baffled about the details.
Anyone have any ideas?

            regards, tom lane

В списке pgsql-bugs по дате отправления:

Предыдущее
От: "Peter Brant"
Дата:
Сообщение: Re: right sibling is not next child
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: contrib/intarray/_int_gist.c