Re: [HACKERS] Restrict concurrent update/delete with UPDATE ofpartition key

Поиск
Список
Период
Сортировка
От amul sul
Тема Re: [HACKERS] Restrict concurrent update/delete with UPDATE ofpartition key
Дата
Msg-id CAAJ_b96mw5xn5oSQgxpgn5dWFRs1j7OebpHRmXkdSNY+70yYEw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Restrict concurrent update/delete with UPDATE ofpartition key  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hi Andres,

Thanks for your time and the review comments/suggestions.

On Tue, Mar 6, 2018 at 4:53 AM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2018-02-13 12:41:26 +0530, amul sul wrote:
>> From 08c8c7ece7d9411e70a780dbeed89d81419db6b6 Mon Sep 17 00:00:00 2001
>> From: Amul Sul <sulamul@gmail.com>
>> Date: Tue, 13 Feb 2018 12:37:52 +0530
>> Subject: [PATCH 1/2] Invalidate ip_blkid v5
[....]
>
> Very nice and instructive to keep this in a submission's commit message.
>

Thank you.

>
>
>> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
>> index 8a846e7dba..f4560ee9cb 100644
>> --- a/src/backend/access/heap/heapam.c
>> +++ b/src/backend/access/heap/heapam.c
>> @@ -3037,6 +3037,8 @@ xmax_infomask_changed(uint16 new_infomask, uint16 old_infomask)
>>   *   crosscheck - if not InvalidSnapshot, also check tuple against this
>>   *   wait - true if should wait for any conflicting update to commit/abort
>>   *   hufd - output parameter, filled in failure cases (see below)
>> + *   row_moved - true iff the tuple is being moved to another partition
>> + *                           table due to an update of partition key. Otherwise, false.
>>   *
>
> I don't think 'row_moved' is a good variable name for this. Moving a row
> in our heap format can mean a lot of things. Maybe 'to_other_part' or
> 'changing_part'?
>

Okay, renamed to  'changing_part' in the attached version.

>
>> +     /*
>> +      * Sets a block identifier to the InvalidBlockNumber to indicate such an
>> +      * update being moved tuple to another partition.
>> +      */
>> +     if (row_moved)
>> +             BlockIdSet(&((tp.t_data->t_ctid).ip_blkid), InvalidBlockNumber);
>
> The parens here are set in a bit werid way. I assume that's from copying
> it out of ItemPointerSet()?  Why aren't you just using ItemPointerSetBlockNumber()?
>
>
> I think it'd be better if we followed the example of specultive inserts
> and created an equivalent of HeapTupleHeaderSetSpeculativeToken. That'd
> be a heck of a lot easier to grep for...
>

Added HeapTupleHeaderValidBlockNumber, HeapTupleHeaderSetBlockNumber and
ItemPointerValidBlockNumber macro, but not exactly same as the
HeapTupleHeaderSetSpeculativeToken. Do let me know your thoughts/suggestions.

>
>> @@ -9314,6 +9323,18 @@ heap_mask(char *pagedata, BlockNumber blkno)
>>                        */
>>                       if (HeapTupleHeaderIsSpeculative(page_htup))
>>                               ItemPointerSet(&page_htup->t_ctid, blkno, off);
>> +
>> +                     /*
>> +                      * For a deleted tuple, a block identifier is set to the
>
> I think this 'the' is superflous.
>

Fixed in the attached version.

>
>> +                      * InvalidBlockNumber to indicate that the tuple has been moved to
>> +                      * another partition due to an update of partition key.
>
> But I think it should be 'the partition key'.
>

Fixed in the attached version.

>
>> +                      * Like speculative tuple, to ignore any inconsistency set block
>> +                      * identifier to current block number.
>
> This doesn't quite parse.
>

Tried to explain a little bit more, any help or suggestion to improve it
further will be appreciated.

>
>> +                      */
>> +                     if (!BlockNumberIsValid(
>> +                                     BlockIdGetBlockNumber((&((page_htup->t_ctid).ip_blkid)))))
>> +                             BlockIdSet(&((page_htup->t_ctid).ip_blkid), blkno);
>>               }
>
> That formatting looks wrong. I think it should be replaced by a macro
> like mentioned above.
>

Used HeapTupleHeaderValidBlockNumber & HeapTupleHeaderSetBlockNumber
macro in the attached version.

>
>>               /*
>> diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
>> index 160d941c00..a770531e14 100644
>> --- a/src/backend/commands/trigger.c
>> +++ b/src/backend/commands/trigger.c
>> @@ -3071,6 +3071,11 @@ ltrmark:;
>>                                       ereport(ERROR,
>>                                                       (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
>>                                                        errmsg("could not serialize access due to concurrent
update")));
>> +                             if (!BlockNumberIsValid(BlockIdGetBlockNumber(&((hufd.ctid).ip_blkid))))
>> +                                     ereport(ERROR,
>> +                                                     (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>> +                                                      errmsg("tuple to be locked was already moved to another
partitiondue to concurrent update")));
 
>
> Yes, given that we repeat this in multiple places, I *definitely* want
> to see this wrapped in a macro with a descriptive name.
>

Used ItemPointerValidBlockNumber macro all such places.

>
>> diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
>> index 7961b4be6a..b07b7092de 100644
>> --- a/src/backend/executor/nodeLockRows.c
>> +++ b/src/backend/executor/nodeLockRows.c
>> @@ -218,6 +218,11 @@ lnext:
>>                                       ereport(ERROR,
>>                                                       (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
>>                                                        errmsg("could not serialize access due to concurrent
update")));
>> +                             if (!BlockNumberIsValid(BlockIdGetBlockNumber(&((hufd.ctid).ip_blkid))))
>> +                                     ereport(ERROR,
>> +                                                     (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>> +                                                      errmsg("tuple to be locked was already moved to another
partitiondue to concurrent update")));
 
>> +
>
> Why are we using ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE rather than
> ERRCODE_T_R_SERIALIZATION_FAILURE?  A lot of frameworks have builtin
> logic to retry serialization failures, and this kind of thing is going
> to resolved by retrying, no?
>

No change, any comments on Amit's response[1]

>
>> diff --git a/src/test/isolation/expected/partition-key-update-1.out
b/src/test/isolation/expected/partition-key-update-1.out
>> new file mode 100644
>
> I'd like to see tests that show various interactions with ON CONFLICT.
>

I've added isolation test for ON CONFLICT DO NOTHING case only, ON CONFLICT DO
UPDATE is yet to support for a partitioned table[2].  But one can we do that
with update row movement if can test ON CONFLICT DO UPDATE on the leaf table,
like attached TRIAL-on-conflict-do-update-wip.patch, thoughts?

In addition, I have added invalid block number check at the few places, as
discussed in [3]. Also, added check in heap_lock_updated_tuple,
rewrite_heap_tuple & EvalPlanQualFetch where ItemPointerEquals() used
to conclude tuple has been updated/deleted, but yet to figure out the
way to hit this changes manually, so that marking the patch as wip.

Regards,
Amul

1] https://postgr.es/m/CAA4eK1KFfm4PBbshNSikdru3Qpt8hUoKQWtBYjdVE2R7U9f6iA@mail.gmail.com
2] https://postgr.es/m/20180228004602.cwdyralmg5ejdqkq@alvherre.pgsql
3] https://postgr.es/m/CAAJ_b97BBkRWFowGRs9VNzFykoK0ikGB1yYEsWfYK8xR5enSrw@mail.gmail.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: Protect syscache from bloating with negative cache entries
Следующее
От: amul sul
Дата:
Сообщение: Re: [HACKERS] Restrict concurrent update/delete with UPDATE ofpartition key