Re: [HACKERS] unique index violation after pg_upgrade to PG10

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: [HACKERS] unique index violation after pg_upgrade to PG10
Дата
Msg-id CAH2-Wz=0s6zxoj4JEL+2dvXo4QyE4gif3OHYVQ-TBPhyR0BMCQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] unique index violation after pg_upgrade to PG10  (Justin Pryzby <pryzby@telsasoft.com>)
Список pgsql-hackers
On Tue, Oct 24, 2017 at 10:20 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> I think you must have compared these:

Yes, I did. My mistake.

> On Tue, Oct 24, 2017 at 03:11:44PM -0500, Justin Pryzby wrote:
>> ts=# SELECT * FROM bt_page_items(get_raw_page('sites_idx', 1));
>>
>> itemoffset | 48
>> ctid       | (1,37)
>> itemlen    | 32
>> nulls      | f
>> vars       | t
>> data       | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 0b 31 31 31 31 00 00 00 00 00 00
> ...
>> itemoffset | 37
>> ctid       | (0,97)
>> itemlen    | 24
>> nulls      | f
>> vars       | t
>> data       | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 03 00 00
>
> ..but note those are both items in sites_idx (48 and 37, which I seem to have
> pasted out of order)..  I included both because I'm not confident I know what
> the "index tid=(1,37)" referred to, but I gather it means item at offset=37
> (and not item with ctid=(1,37).)
>
> | [pryzbyj@database amcheck]$ psql --port 5678 ts -c "SELECT bt_index_check('sites_idx'::regclass::oid,
heapallindexed=>True)"
> | ERROR:  high key invariant violated for index "sites_idx"
> | DETAIL:  Index tid=(1,37) points to heap tid=(0,97) page lsn=0/0.

This means that the item at (1,37) in the index is not <= the high
key, which is located at (1,1). (The high key comes first, despite
being an upper bound rather than a lower bound, per pageinspect docs.)

I find it suspicious that the page lsn is 0 here, btw.

> ts=# SELECT * FROM page_header(get_raw_page('sites_idx', 1));
>  lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
> -----+----------+-------+-------+-------+---------+----------+---------+-----------
>  0/0 |        0 |     0 |   872 |  1696 |    8176 |     8192 |       4 |         0
>
> Here is its heap page:
>
> ts=# SELECT * FROM heap_page_items(get_raw_page('sites', 0)) WHERE lp=97;
>  lp | lp_off | lp_flags | lp_len | t_xmin |  t_xmax  | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff |
t_bits     | t_oid |                   t_data
 
>
----+--------+----------+--------+--------+----------+----------+--------+-------------+------------+--------+------------------+-------+--------------------------------------------
>  97 |    968 |        1 |     52 |  21269 | 33567444 |        0 | (3,27) |        8204 |       2307 |     32 |
1110100000010000|       | \x700000001b4352434c4d542d43454d5330030303
 
>
> Which I see ends with 0303 vs 0000..

Looks like I was accidentally right, then -- the heap and index do differ.

You might try this tool I published recently, to get a better sense of
details like this:

https://github.com/petergeoghegan/pg_hexedit

(Don't do so with a data directory that you cannot afford to corrupt
again, though!)

> Maybe this is relevant ?
> ts=# SELECT * FROM heap_page_items(get_raw_page('sites', 3)) WHERE lp=27;
>  lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits |
t_oid| t_data
 
>
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------
>  27 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |
     |
 

This looks like an LP_DEAD item.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] Current int & float overflow checking is slow.
Следующее
От: Andres Freund
Дата:
Сообщение: Re: [HACKERS] Current int & float overflow checking is slow.