> On Jun 11, 2020, at 11:35 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Jun 12, 2020 at 12:40 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
>>
>>
>>
>>> On Jun 11, 2020, at 9:14 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>>>
>>> I have just browsed through the patch and the idea is quite
>>> interesting. I think we can expand it to check that whether the flags
>>> set in the infomask are sane or not w.r.t other flags and xid status.
>>> Some examples are
>>>
>>> - If HEAP_XMAX_LOCK_ONLY is set in infomask then HEAP_KEYS_UPDATED
>>> should not be set in new_infomask2.
>>> - If HEAP_XMIN(XMAX)_COMMITTED is set in the infomask then can we
>>> actually cross verify the transaction status from the CLOG and check
>>> whether is matching the hint bit or not.
>>>
>>> While browsing through the code I could not find that we are doing
>>> this kind of check, ignore if we are already checking this.
>>
>> Thanks for taking a look!
>>
>> Having both of those bits set simultaneously appears to fall into a different category than what I wrote
verify_heapam.cto detect.
>
> Ok
>
>
>> It doesn't violate any assertion in the backend, nor does it cause
>> the code to crash. (At least, I don't immediately see how it does
>> either of those things.) At first glance it appears invalid to have
>> those bits both set simultaneously, but I'm hesitant to enforce that
>> without good reason. If it is a good thing to enforce, should we also
>> change the backend code to Assert?
>
> Yeah, it may not hit assert or crash but it could lead to a wrong
> result. But I agree that it could be an assertion in the backend
> code.
For v7, I've added an assertion for this. Per heap/README.tuplock, "We currently never set the HEAP_XMAX_COMMITTED
whenthe HEAP_XMAX_IS_MULTI bit is set." I added an assertion for that, too. Both new assertions are in
RelationPutHeapTuple(). I'm not sure if that is the best place to put the assertion, but I am confident that the
assertionneeds to only check tuples destined for disk, as in memory tuples can and do violate the assertion.
Also for v7, I've updated contrib/amcheck to report these two conditions as corruption.
> What about the other check, like hint bit is saying the
> transaction is committed but actually as per the clog the status is
> something else. I think in general processing it is hard to check
> such things in backend no? because if the hint bit is set saying that
> the transaction is committed then we will directly check its
> visibility with the snapshot. I think a corruption checker may be a
> good tool for catching such anomalies.
I already made some design changes to this patch to avoid taking the CLogTruncationLock too often. I'm happy to
incorporatethis idea, but perhaps you could provide a design on how to do it without all the extra locking? If not, I
cantry to get this into v8 as an optional check, so users can turn it on at their discretion. Having the check enabled
bydefault is probably a non-starter.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company