Re: Remove HeapTuple and Buffer dependency for predicate lockingfunctions

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Remove HeapTuple and Buffer dependency for predicate lockingfunctions
Дата
Msg-id a361cfc9-c0dc-d9fa-faab-23335a8a7d44@iki.fi
обсуждение исходный текст
Ответ на Re: Remove HeapTuple and Buffer dependency for predicate locking functions  (Thomas Munro <thomas.munro@gmail.com>)
Ответы Re: Remove HeapTuple and Buffer dependency for predicate locking functions  (Thomas Munro <thomas.munro@gmail.com>)
Список pgsql-hackers
On 06/08/2019 07:20, Thomas Munro wrote:
> On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres@anarazel.de> wrote:
>> On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
>>> 1.  Commit dafaa3efb75 "Implement genuine serializable isolation
>>> level." (2011) locked the root tuple, because it set t_self to *tid.
>>> Possibly due to confusion about the effect of the preceding line
>>> ItemPointerSetOffsetNumber(tid, offnum).
>>>
>>> 2.  Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
>>> fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
>>> offnum).
>>
>> Hm. It's not at all sure that it's ok to report the non-root tuple tid
>> here. I.e. I'm fairly sure there was a reason to not just set it to the
>> actual tid. I think I might have written that up on the list at some
>> point. Let me dig in memory and list. Obviously possible that that was
>> also obsoleted by parallel changes.
> 
> Adding Heikki and Kevin.
> 
> I haven't found your earlier discussion about that yet, but would be
> keen to read it if you can find it.   I wondered if your argument
> might have had something to do with heap pruning, but I can't find a
> problem there.  It's not as though the TID of any visible tuples
> change, it's just that some dead stuff goes away and the root line
> pointer is changed to LP_REDIRECT if it wasn't already.
> 
> As for the argument for locking the tuple we emit rather than the HOT
> root, I think the questions are: (1) How exactly do we get away with
> locking only one version in a regular non-HOT update chain?  (2) Is it
> OK to do that with a HOT root?
> 
> The answer to the first question is given in README-SSI[1].
> Unfortunately it doesn't discuss HOT directly, but I suspect the
> answer is no, HOT is not special here.  By my reading, it relies on
> the version you lock being the version visible to your snapshot, which
> is important because later updates have to touch that tuple to write
> the next version.  That doesn't apply to some arbitrarily older tuple
> that happens to be a HOT root.  Concretely, heap_update() does
> CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
> only going to produce a rw conflict if T1 took an SIREAD on precisely
> the version T2 locks in that path, not some arbitrarily older version
> that happens to be a HOT root.  A HOT root might never be considered
> again by concurrent writers, no?

Your analysis is spot on. Thanks for the clear write-up!

>>> This must be in want of an isolation test, but I haven't yet tried to
>>> get my head around how to write a test that would show the difference.
>>
>> Indeed.
> 
> One practical problem is that the only way to reach
> PredicateLockTuple() is from an index scan, and the index scan locks
> the index page (or the whole index, depending on
> rd_indam->ampredlocks).  So I think if you want to see a serialization
> anomaly you'll need multiple indexes (so that index page locks don't
> hide the problem), a long enough HOT chain and then probably several
> transactions to be able to miss a cycle that should be picked up by
> the logic in [1].  I'm out of steam for this problem today though.

I had some steam, and wrote a spec that reproduces this bug. It wasn't 
actually that hard to reproduce, fortunately. Or unfortunately; people 
might well be hitting it in production. I used the "freezetest.spec" 
from the 2013 thread as the starting point, and added one UPDATE to the 
initialization, so that the test starts with an already HOT-updated 
tuple. It should throw a serialization error, but on current master, it 
does not. After applying your fix.txt, it does.

Your fix.txt seems correct. For clarity, I'd prefer moving things around 
a bit, though, so that the t_self is set earlier in the function, at the 
same place where the other HeapTuple fields are set. And set blkno and 
offnum together, in one ItemPointerSet call. With that, I'm not sure we 
need such a verbose comment explaining why t_self needs to be updated 
but I kept it for now.

Attached is a patch that contains your fix.txt with the changes for 
clarity mentioned above, and an isolationtester test case.

PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self 
to the returned tuple version, updating *tid is redundant. And the call 
in heapam_index_fetch_tuple() wouldn't need to do 
"bslot->base.tupdata.t_self = *tid".

- Heikki

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Langote
Дата:
Сообщение: Re: Fix a typo in add_partial_path
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Update to DocBook 4.5