Обсуждение: BUG #16619: Amcheck detects corruption in hstore' btree index (ver 2)

Поиск
Список
Период
Сортировка

BUG #16619: Amcheck detects corruption in hstore' btree index (ver 2)

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      16619
Logged by:          Andrew Bille
Email address:      andrewbille@gmail.com
PostgreSQL version: 13beta3
Operating system:   Ubuntu-20.04
Description:

Hello all.
Index corruption detected after upgrade from v 9.6/10:

mkdir v10
cd v10
git clone https://github.com/postgres/postgres.git ./
git checkout REL_10_STABLE
 ./configure >/dev/null && make -j8 >/dev/null && make -j8 contrib
>/dev/null
echo "SELECT 1;" >> contrib/hstore/sql/hstore.sql
make check -C contrib/hstore
cd ..
mkdir v13
cd v13
git clone https://github.com/postgres/postgres.git ./
git checkout REL_13_STABLE
./configure >/dev/null && make -j8 >/dev/null && make -j8 contrib
>/dev/null
echo "EXTRA_INSTALL = contrib/amcheck contrib/pageinspect"
>>contrib/hstore/Makefile
make check -C contrib/hstore
export LD_LIBRARY_PATH=tmp_install/usr/local/pgsql/lib
tmp_install/usr/local/pgsql/bin/initdb -D /tmp/db
tmp_install/usr/local/pgsql/bin/pg_upgrade -d
"../v10/contrib/hstore/tmp_check/data" -D /tmp/db -b
"../v10/tmp_install/usr/local/pgsql/bin" -B
tmp_install/usr/local/pgsql/bin
tmp_install/usr/local/pgsql/bin/pg_ctl -D /tmp/db -l l.log start
tmp_install/usr/local/pgsql/bin/psql -dcontrib_regression -c "create
extension if not exists amcheck;"
tmp_install/usr/local/pgsql/bin/psql -dcontrib_regression -c "SELECT
bt_index_parent_check('hidx', true);"

I get the following error:
ERROR:  mismatch between parent key and child high key in index "hidx"
DETAIL:  Target block=3 child block=1 target page lsn=0/16D5758.

(The same procedure with v11/12 doesn't produce the error.)

Thank you.


Re: BUG #16619: Amcheck detects corruption in hstore' btree index (ver 2)

От
Anastasia Lubennikova
Дата:
On 16.09.2020 13:34, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference:      16619
> Logged by:          Andrew Bille
> Email address:      andrewbille@gmail.com
> PostgreSQL version: 13beta3
> Operating system:   Ubuntu-20.04
> Description:
>
> Hello all.
> Index corruption detected after upgrade from v 9.6/10:
>
> mkdir v10
> cd v10
> git clone https://github.com/postgres/postgres.git ./
> git checkout REL_10_STABLE
>   ./configure >/dev/null && make -j8 >/dev/null && make -j8 contrib
>> /dev/null
> echo "SELECT 1;" >> contrib/hstore/sql/hstore.sql
> make check -C contrib/hstore
> cd ..
> mkdir v13
> cd v13
> git clone https://github.com/postgres/postgres.git ./
> git checkout REL_13_STABLE
> ./configure >/dev/null && make -j8 >/dev/null && make -j8 contrib
>> /dev/null
> echo "EXTRA_INSTALL = contrib/amcheck contrib/pageinspect"
>>> contrib/hstore/Makefile
> make check -C contrib/hstore
> export LD_LIBRARY_PATH=tmp_install/usr/local/pgsql/lib
> tmp_install/usr/local/pgsql/bin/initdb -D /tmp/db
> tmp_install/usr/local/pgsql/bin/pg_upgrade -d
> "../v10/contrib/hstore/tmp_check/data" -D /tmp/db -b
> "../v10/tmp_install/usr/local/pgsql/bin" -B
> tmp_install/usr/local/pgsql/bin
> tmp_install/usr/local/pgsql/bin/pg_ctl -D /tmp/db -l l.log start
> tmp_install/usr/local/pgsql/bin/psql -dcontrib_regression -c "create
> extension if not exists amcheck;"
> tmp_install/usr/local/pgsql/bin/psql -dcontrib_regression -c "SELECT
> bt_index_parent_check('hidx', true);"
>
> I get the following error:
> ERROR:  mismatch between parent key and child high key in index "hidx"
> DETAIL:  Target block=3 child block=1 target page lsn=0/16D5758.
>
> (The same procedure with v11/12 doesn't produce the error.)
>
> Thank you.

Good catch.

This check fails at v13 with any upgraded index that have BTREE_VERSION 2.

I think it is an amcheck problem. This check was introduced by commit 
d114cc5387.

The bt_pivot_tuple_identical() function excludes block number from 
comparison:

/*
  * Check if two tuples are binary identical except the block number.  So,
  * this function is capable to compare pivot keys on different levels.
  */
static bool
bt_pivot_tuple_identical(highkey, itup)

But it compares offset number. Which is fine for versions > 2, because 
since covering were introduced we store natts in this field and it is 
always initialized.

  * We store the number of columns present inside pivot tuples by abusing
  * their t_tid offset field, since pivot tuples never need to store a real
  * offset (pivot tuples generally store a downlink in t_tid, though).

For v2 indexes pivot tuple's offset can contain any random number which 
will lead to bt_pivot_tuple_identical() failure.

The fix is pretty simple - only compare data part of the tuples. I think 
we can skip offnum check for all index versions to keep the code simple.

I also noticed a typo in a comment and added minor correction to the patch.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Вложения

Re: BUG #16619: Amcheck detects corruption in hstore' btree index (ver 2)

От
Peter Geoghegan
Дата:
On Wed, Sep 16, 2020 at 7:24 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
> For v2 indexes pivot tuple's offset can contain any random number which
> will lead to bt_pivot_tuple_identical() failure.
>
> The fix is pretty simple - only compare data part of the tuples. I think
> we can skip offnum check for all index versions to keep the code simple.

I pushed a tweaked version of this patch. It seemed worth preserving
the inclusion of offset number in the bt_pivot_tuple_identical()
memcmp() comparison. The field contains important metadata that really
should match. Also, I made sure to compare t_info field on all indexes
(including pg_upgrade'd indexes) for the same reason.

We already account for heapkeyspace difference in other similar
routines, such as invariant_l_offset().  It's not hard to do this here
as well.

Thanks
-- 
Peter Geoghegan



Re: BUG #16619: Amcheck detects corruption in hstore' btree index (ver 2)

От
Alexander Korotkov
Дата:
Hi, Peter!

On Wed, Sep 16, 2020 at 8:47 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Wed, Sep 16, 2020 at 7:24 AM Anastasia Lubennikova
> <a.lubennikova@postgrespro.ru> wrote:
> > For v2 indexes pivot tuple's offset can contain any random number which
> > will lead to bt_pivot_tuple_identical() failure.
> >
> > The fix is pretty simple - only compare data part of the tuples. I think
> > we can skip offnum check for all index versions to keep the code simple.
>
> I pushed a tweaked version of this patch. It seemed worth preserving
> the inclusion of offset number in the bt_pivot_tuple_identical()
> memcmp() comparison. The field contains important metadata that really
> should match. Also, I made sure to compare t_info field on all indexes
> (including pg_upgrade'd indexes) for the same reason.
>
> We already account for heapkeyspace difference in other similar
> routines, such as invariant_l_offset().  It's not hard to do this here
> as well.

I was going to tweak this patch in a similar way, but made it in
advance.  Your commit looks good to me, thanks!

------
Regards,
Alexander Korotkov



Re: BUG #16619: Amcheck detects corruption in hstore' btree index (ver 2)

От
Peter Geoghegan
Дата:
Hi Alexander,

On Wed, Sep 16, 2020 at 2:56 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> I was going to tweak this patch in a similar way, but made it in
> advance.  Your commit looks good to me, thanks!

I would have left it to you, but we're pretty close to releasing 13,
and I didn't want to risk missing the deadline.

FWIW, the reason that Andrew did not see the same problem when
pg_upgrade'ing from Postgres 11 - Postgres 13 is that the INCLUDE
index feature simplified _bt_insert_parent() for all indexes (not just
indexes with non-key columns). That function used to set the offset
number of the pivot tuple it inserts into the parent to P_HIKEY (even
though that was always unnecessary and unhelpful -- maybe you remember
our discussion about this exact thing in early 2018). The pre-11
behavior of _bt_insert_parent() broke the assumption that each pivot
tuple's contents (after the downlink/block number) must always have
the same value in the matching pivot tuple one level up.

FWIW, Anastasia was right to say that the INCLUDE index feature
prevented such an inconsistency from arising -- it was v11 that
removed the silly set-P_HIKEY thing (not 12 or 13). However, the only
practical way for amcheck to know for sure that the index can never
have been affected by the P_HIKEY thing is to check that the index is
a heapkeyspace index (i.e. an index built by Postgres 12 or 13).

(Maybe this is obvious, but I thought I'd say it just in case, for the
archives.)

Thanks
-- 
Peter Geoghegan