Thread: [HACKERS] "inconsistent page found" with checksum and wal_consistency_checking enabled


[HACKERS] "inconsistent page found" with checksum and wal_consistency_checking enabled

From
Ashwin Agrawal
Date:

Currently, the page checksum is not masked by the page masking routines used by the wal_consistency_checking facility. So, when running `make installcheck` with data checksums enabled and wal_consistency_checking='all', it easily and consistently FATALs with "inconsistent page found".

If anything needs to be masked on a page for the WAL consistency check to pass, the checksum is definitely not going to match either, and hence must be masked as well. Attaching a patch to fix this; installcheck passes with checksums enabled and wal_consistency_checking='all' with the fix.

I combined the checksum masking with the LSN masking, as it seems logical to keep them together: the LSN is masked in all the cases so far, and the same is needed for the checksum.
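
To make the idea concrete, here is a minimal sketch of masking the LSN and the checksum together before comparing two page images. It is not the attached patch: the page layout, the offsets, and every `sketch_*` / `SKETCH_*` name are assumptions made up for illustration, and PostgreSQL's real page header is different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * All names below are hypothetical. The layout is a simplified stand-in,
 * not PostgreSQL's real page header.
 */
#define SKETCH_PAGE_SIZE       8192
#define SKETCH_LSN_OFFSET      0    /* assumed: 8-byte LSN at the page start */
#define SKETCH_CHECKSUM_OFFSET 8    /* assumed: 2-byte checksum right after */
#define SKETCH_MASK_MARKER     0    /* value written over masked bytes */

/* Overwrite only the LSN: the checksum is left as-is. */
static void
sketch_mask_lsn(unsigned char *page)
{
    memset(page + SKETCH_LSN_OFFSET, SKETCH_MASK_MARKER, sizeof(uint64_t));
}

/* Overwrite the LSN and the checksum in one step. */
static void
sketch_mask_lsn_and_checksum(unsigned char *page)
{
    sketch_mask_lsn(page);
    memset(page + SKETCH_CHECKSUM_OFFSET, SKETCH_MASK_MARKER,
           sizeof(uint16_t));
}

/* Compare masked copies of two pages; the caller's pages are not modified. */
static bool
sketch_pages_consistent(const unsigned char *primary,
                        const unsigned char *replayed,
                        bool mask_checksum)
{
    unsigned char a[SKETCH_PAGE_SIZE];
    unsigned char b[SKETCH_PAGE_SIZE];

    assert(primary != NULL && replayed != NULL);
    memcpy(a, primary, SKETCH_PAGE_SIZE);
    memcpy(b, replayed, SKETCH_PAGE_SIZE);

    if (mask_checksum)
    {
        sketch_mask_lsn_and_checksum(a);
        sketch_mask_lsn_and_checksum(b);
    }
    else
    {
        sketch_mask_lsn(a);
        sketch_mask_lsn(b);
    }

    return memcmp(a, b, SKETCH_PAGE_SIZE) == 0;
}
```

With two pages whose payload is identical but whose LSN and stored checksum differ, `sketch_pages_consistent(a, b, false)` reports a mismatch, while passing `true` lets the comparison succeed; a real difference in the payload is still detected either way.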

Thank You,
Ashwin Agrawal
Attachments

Re: [HACKERS] "inconsistent page found" with checksum and wal_consistency_checking enabled

From
Michael Paquier
Date:
On Wed, Sep 20, 2017 at 5:23 AM, Ashwin Agrawal <aagrawal@pivotal.io> wrote:
> Currently, page checksum is not masked by Page masking routines used by
> wal_consistency_checking facility. So, when running `make installcheck` with
> data checksum enabled and wal_consistency_checking='all', it easily and
> consistently FATALs with "inconsistent page found".

Indeed. This had better be fixed before PG10 is out. I am adding an open item.

> If anything needs to be masked on Page to perform / pass wal consistency
> checking, definitely checksum is not going to match and hence must be masked
> as well. Attaching patch to fix the same, installcheck passes with checksums
> enabled and wal_consistency_checking='all' with the fix.
>
> Clubbed to perform the masking with lsn as it sounds logical to have them
> together, as lsn is masked in all the cases so far and such is needed for
> checksum as well.

Agreed.
 * In consistency checks, the LSN of the two pages compared will likely be
- * different because of concurrent operations when the WAL is generated
- * and the state of the page when WAL is applied.
+ * different because of concurrent operations when the WAL is generated and
+ * the state of the page when WAL is applied. Also, mask out checksum as
+ * masking anything else on page means checksum is not going to match as well.
  */
Nit: use "the LSN and the checksum" instead of "the LSN".
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "inconsistent page found" with checksum and wal_consistency_checking enabled

From
Kuntal Ghosh
Date:
On Wed, Sep 20, 2017 at 10:22 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Sep 20, 2017 at 5:23 AM, Ashwin Agrawal <aagrawal@pivotal.io> wrote:
>> Currently, page checksum is not masked by Page masking routines used by
>> wal_consistency_checking facility. So, when running `make installcheck` with
>> data checksum enabled and wal_consistency_checking='all', it easily and
>> consistently FATALs with "inconsistent page found".
>
> Indeed. This had better be fixed before PG10 is out. I am adding an open item.
>
Oops and surprised! How come we missed that previously? If page lsns
are different, checksums will be different as well. Anyhow, nice catch
and thanks for the patch.
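
The point that differing LSNs imply differing checksums can be illustrated with a small sketch. It assumes the checksum is computed over the whole page, LSN included, with only the checksum field itself left out. The Fletcher-16 routine, the offsets, and all `demo_*` / `DEMO_*` names are assumptions chosen for brevity; this is not PostgreSQL's checksum algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simplified layout, not PostgreSQL's real page header. */
#define DEMO_PAGE_SIZE       8192
#define DEMO_LSN_OFFSET      0    /* assumed: 8-byte LSN at the page start */
#define DEMO_CHECKSUM_OFFSET 8    /* assumed: 2-byte checksum right after */

/*
 * Fletcher-16 over the page, skipping the two bytes of the checksum field.
 * Used here only because it is short; it is NOT the algorithm PostgreSQL
 * uses for data checksums.
 */
static uint16_t
demo_page_checksum(const unsigned char *page)
{
    uint16_t sum1 = 0;
    uint16_t sum2 = 0;

    for (size_t i = 0; i < DEMO_PAGE_SIZE; i++)
    {
        if (i == DEMO_CHECKSUM_OFFSET || i == DEMO_CHECKSUM_OFFSET + 1)
            continue;
        sum1 = (uint16_t) ((sum1 + page[i]) % 255);
        sum2 = (uint16_t) ((sum2 + sum1) % 255);
    }
    return (uint16_t) ((sum2 << 8) | sum1);
}

/* Stamp an LSN into the page, then refresh the stored checksum. */
static void
demo_set_lsn(unsigned char *page, uint64_t lsn)
{
    uint16_t checksum;

    assert(page != NULL);
    memcpy(page + DEMO_LSN_OFFSET, &lsn, sizeof(lsn));
    checksum = demo_page_checksum(page);
    memcpy(page + DEMO_CHECKSUM_OFFSET, &checksum, sizeof(checksum));
}
```

Because the LSN bytes feed into the checksum, two pages with the same payload but different LSNs generally end up with different stored checksums, so masking the LSN alone still leaves a differing field in the header.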


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] "inconsistent page found" with checksum and wal_consistency_checking enabled

From
Michael Paquier
Date:
On Wed, Sep 20, 2017 at 2:26 PM, Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:
> On Wed, Sep 20, 2017 at 10:22 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Wed, Sep 20, 2017 at 5:23 AM, Ashwin Agrawal <aagrawal@pivotal.io> wrote:
>>> Currently, page checksum is not masked by Page masking routines used by
>>> wal_consistency_checking facility. So, when running `make installcheck` with
>>> data checksum enabled and wal_consistency_checking='all', it easily and
>>> consistently FATALs with "inconsistent page found".
>>
>> Indeed. This had better be fixed before PG10 is out. I am adding an open item.
>>
> Oops and surprised! How come we missed that previously? If page lsns
> are different, checksums will be different as well. Anyhow, nice catch
> and thanks for the patch.

That happens. We have really covered maaany points during many rounds
of reviews, still I am not surprised to see one or two things that
fell through the cracks.
-- 
Michael



Re: [HACKERS] "inconsistent page found" with checksum and wal_consistency_checking enabled

From
Kuntal Ghosh
Date:
On Wed, Sep 20, 2017 at 11:05 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Sep 20, 2017 at 2:26 PM, Kuntal Ghosh
> <kuntalghosh.2007@gmail.com> wrote:
>> On Wed, Sep 20, 2017 at 10:22 AM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> On Wed, Sep 20, 2017 at 5:23 AM, Ashwin Agrawal <aagrawal@pivotal.io> wrote:
>>>> Currently, page checksum is not masked by Page masking routines used by
>>>> wal_consistency_checking facility. So, when running `make installcheck` with
>>>> data checksum enabled and wal_consistency_checking='all', it easily and
>>>> consistently FATALs with "inconsistent page found".
>>>
>>> Indeed. This had better be fixed before PG10 is out. I am adding an open item.
>>>
>> Oops and surprised! How come we missed that previously? If page lsns
>> are different, checksums will be different as well. Anyhow, nice catch
>> and thanks for the patch.
>
> That happens. We have really covered maaany points during many rounds
> of reviews, still I am not surprised to see one or two things that
> fell through the cracks.
Yup, that's true. :-)


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com



[HACKERS] Re: "inconsistent page found" with checksum and wal_consistency_checking enabled

From
Noah Misch
Date:
On Wed, Sep 20, 2017 at 01:52:15PM +0900, Michael Paquier wrote:
> On Wed, Sep 20, 2017 at 5:23 AM, Ashwin Agrawal <aagrawal@pivotal.io> wrote:
> > Currently, page checksum is not masked by Page masking routines used by
> > wal_consistency_checking facility. So, when running `make installcheck` with
> > data checksum enabled and wal_consistency_checking='all', it easily and
> > consistently FATALs with "inconsistent page found".
> 
> Indeed. This had better be fixed before PG10 is out. I am adding an open item.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Robert,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com



Re: [HACKERS] "inconsistent page found" with checksum and wal_consistency_checking enabled

From
Robert Haas
Date:
On Wed, Sep 20, 2017 at 12:52 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Indeed. This had better be fixed before PG10 is out. I am adding an open item.

This seems a little hyperbolic to me.  Sure, it's a new bug in v10,
and sure, it's always better to fix bugs sooner rather than later, but
there's nothing particularly serious or urgent about this bug as
compared to any other one.

I've committed the proposed patch to master and REL_10_STABLE.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

