Обсуждение: Slotification of partition tuple conversion
Hi, In [1] , it was shown that the partition tuples are needlessly formed and deformed during tuple conversion (do_convert_tuple), when the same operation can be done using tuple slots. This is because the input slot might already have a deformed tuple. Attached is a patch tup_convert.patch that does the conversion directly using input and output tuple slots. This patch is to be applied on an essential patch posted by Amit Langote in [1] that arranges for dedicated per-partition slots rather than changing tupdesc of a single slot. I have rebased that patch from [1] and attached it here ( Allocate-dedicated-slots-of-partition-tuple_amitlan_rebased.patch). In tup_convert.patch, wherever ConvertPartitionTupleSlot() and do_convert_tuple() are used to convert partition tuples, I have replaced them by a new function ConvertTupleSlot(). ConvertPartitionTupleSlot() is exclusively for conversion of partition-to-parent and vice versa, so I got rid of this. do_convert_tuple() seems to be used for tuples belonging to non-partition tables as well, so I have left such calls untouched. May be we can "slotify" these tuple conversions later as a separate task. [1] https://www.postgresql.org/message-id/e4f9d743-cd4b-efb0-7574-da21d86a7f36%40lab.ntt.co.jp -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Вложения
On 17 August 2018 at 21:47, Amit Khandekar <amitdkhan.pg@gmail.com> wrote: > Attached is a patch tup_convert.patch that does the conversion > directly using input and output tuple slots. This patch is to be > applied on an essential patch posted by Amit Langote in [1] that > arranges for dedicated per-partition slots rather than changing > tupdesc of a single slot. I have rebased that patch from [1] and > attached it here ( > Allocate-dedicated-slots-of-partition-tuple_amitlan_rebased.patch). Amit Langote has posted a new version of the Allocate-dedicated-slots-of-partition-tuple_amitlan_rebased.patch at [1] . Attached is version v2 of the tup_convert.patch, that is rebased over that patch. [1] https://www.postgresql.org/message-id/attachment/64354/v2-0001-Allocate-dedicated-slots-of-partition-tuple-conve.patch -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Вложения
On 2018/08/20 20:15, Amit Khandekar wrote: > On 17 August 2018 at 21:47, Amit Khandekar <amitdkhan.pg@gmail.com> wrote: > >> Attached is a patch tup_convert.patch that does the conversion >> directly using input and output tuple slots. This patch is to be >> applied on an essential patch posted by Amit Langote in [1] that >> arranges for dedicated per-partition slots rather than changing >> tupdesc of a single slot. I have rebased that patch from [1] and >> attached it here ( >> Allocate-dedicated-slots-of-partition-tuple_amitlan_rebased.patch). > > Amit Langote has posted a new version of the > Allocate-dedicated-slots-of-partition-tuple_amitlan_rebased.patch at > [1] . Attached is version v2 of the tup_convert.patch, that is rebased > over that patch. Thanks. Here are some comments on the patch: +ConvertTupleSlot Might be a good idea to call this ConvertSlotTuple? + /* + * Get the tuple in the per-tuple context. Also, we cannot call + * ExecMaterializeSlot(), otherwise the tuple will get freed + * while storing the next tuple. + */ + oldcontext = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate)); + tuple = ExecCopySlotTuple(slot); + MemoryContextSwitchTo(oldcontext); The comment about why we cannot call ExecMaterializeSlot is a bit unclear. Maybe, the comment doesn't need to mention that? Instead, expand a bit more on why the context switch here or how it interacts with recently tuple buffering (0d5f05cde01)? Seeing that all the sites in the partitioning code that previously called do_convert_tuple() now call ConvertTupleSlot, I wonder if we don't need a full TupleConversionMap, just the attribute map in it. We don't need the input/output Datum and bool arrays in it, because we'd be using the ones from input and output slots of ConvertTupleSlot. So, can we replace all instances of TupleConversionMap in the partitioning code and data structures by AttributeNumber arrays. Also, how about putting ConvertTupleSlot in execPartition.c and exporting it in execPartition.h, instead of tupconvert.c and tupconvert.h, resp.? Thanks, Amit
On 21 August 2018 at 08:12, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > Here are some comments on the patch: Thanks for the review. > > +ConvertTupleSlot > > Might be a good idea to call this ConvertSlotTuple? I thought the name 'ConvertTupleSlot' emphasizes the fact that we are operating rather on a slot without having to touch the tuple wherever possible. Hence I chose the name. But I am open to suggestions if there are more votes against this. > > + /* > + * Get the tuple in the per-tuple context. Also, we > cannot call > + * ExecMaterializeSlot(), otherwise the tuple will get freed > + * while storing the next tuple. > + */ > + oldcontext = > MemoryContextSwitchTo(GetPerTupleMemoryContext(estate)); > + tuple = ExecCopySlotTuple(slot); > + MemoryContextSwitchTo(oldcontext); > > The comment about why we cannot call ExecMaterializeSlot is a bit unclear. > Maybe, the comment doesn't need to mention that? Instead, expand a bit > more on why the context switch here or how it interacts with recently > tuple buffering (0d5f05cde01)? Done : - * Get the tuple in the per-tuple context. Also, we cannot call - * ExecMaterializeSlot(), otherwise the tuple will get freed - * while storing the next tuple. + * Get the tuple in the per-tuple context, so that it will be + * freed after each batch insert. Specifically, we could not call ExecMaterializeSlot() because it sets tts_shouldFree to true which we don't want for batch inserts. > > Seeing that all the sites in the partitioning code that previously called > do_convert_tuple() now call ConvertTupleSlot, I wonder if we don't need a > full TupleConversionMap, just the attribute map in it. We don't need the > input/output Datum and bool arrays in it, because we'd be using the ones > from input and output slots of ConvertTupleSlot. So, can we replace all > instances of TupleConversionMap in the partitioning code and data > structures by AttributeNumber arrays. Yeah, I earlier thought I could get rid of do_convert_tuple() altogether. But there are places where we currently deal with tuples rather than slots. And here, we could not replace do_convert_tuple() unless we slotify the surrounding code similar to how you did in the Allocate-dedicated-slots patch. E.g. ExecEvalConvertRowtype() and AfterTriggerSaveEvent() deals with tuples so the do_convert_tuple() calls there couldn't be replaced. So I think we can work towards what you have in mind in form of incremental patches. > > Also, how about putting ConvertTupleSlot in execPartition.c and exporting > it in execPartition.h, instead of tupconvert.c and tupconvert.h, resp.? I think the goal of ConvertTupleSlot() is to eventually replace do_convert_tuple() as well, so it would look more of a general conversion rather than specific for partitions. Hence I think it's better to have it in tupconvert.c . -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Вложения
Hi Amit, Thanks for the updated patch and sorry I couldn't reply sooner. On 2018/08/21 16:18, Amit Khandekar wrote: > On 21 August 2018 at 08:12, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Here are some comments on the patch: > > Thanks for the review. > >> >> +ConvertTupleSlot >> >> Might be a good idea to call this ConvertSlotTuple? > > I thought the name 'ConvertTupleSlot' emphasizes the fact that we are > operating rather on a slot without having to touch the tuple wherever > possible. Hence I chose the name. But I am open to suggestions if > there are more votes against this. Hmm, okay. The verb here is "to convert", which still applies to a tuple, the only difference is that the new interface accepts and returns TupleTableSlot, instead of HeapTuple. >> + /* >> + * Get the tuple in the per-tuple context. Also, we >> cannot call >> + * ExecMaterializeSlot(), otherwise the tuple will get freed >> + * while storing the next tuple. >> + */ >> + oldcontext = >> MemoryContextSwitchTo(GetPerTupleMemoryContext(estate)); >> + tuple = ExecCopySlotTuple(slot); >> + MemoryContextSwitchTo(oldcontext); >> >> The comment about why we cannot call ExecMaterializeSlot is a bit unclear. >> Maybe, the comment doesn't need to mention that? Instead, expand a bit >> more on why the context switch here or how it interacts with recently >> tuple buffering (0d5f05cde01)? > > Done : > > - * Get the tuple in the per-tuple context. Also, we cannot call > - * ExecMaterializeSlot(), otherwise the tuple will get freed > - * while storing the next tuple. > + * Get the tuple in the per-tuple context, so that it will be > + * freed after each batch insert. > > Specifically, we could not call ExecMaterializeSlot() because it sets > tts_shouldFree to true which we don't want for batch inserts. Thanks. By the way, I was a bit confused by an existing comment just above this patch's change, which says: /* * We might need to convert from the parent rowtype to the * partition rowtype. Don't free the already stored tuple as it * may still be required for a multi-insert batch. */ I wasn't sure what "Don't free the already stored tuple" here meant, until I saw a hunk from 0d5f05cde ("Allow multi-inserts during COPY into a partitioned table") that introduced it: @@ -2691,12 +2861,14 @@ CopyFrom(CopyState cstate) /* * We might need to convert from the parent rowtype to the - * partition rowtype. + * partition rowtype. Don't free the already stored tuple as it + * may still be required for a multi-insert batch. */ tuple = ConvertPartitionTupleSlot(proute->parent_child_tupconv_maps[leaf_part_index], tuple, proute->partition_tuple_slot, - &slot); + &slot, + false); So the "already stored tuple" means a previous tuple that may have been stored in 'proute->partition_tuple_slot', which shouldn't be freed in ConvertPartitionTupleSlot (via ExecStoreTuple), because it might also have been stored in the batch insert tuple array. Now, because with ConvertTupleSlot, there is no worry about freeing an existing tuple, because we never perform ExecStoreTuple on proute->partition_tuple_slot (or one of the slots in it if we consider my patch at [1] which converts it to an array). All I see applied to those slots is ExecStoreVirtualTuple() in ConvertTupleSlot(), and in some cases, ExecCopySlotTuple() in the caller of ConvertTupleSlot(). So, I think that that sentence is obsolete with your patch. >> Seeing that all the sites in the partitioning code that previously called >> do_convert_tuple() now call ConvertTupleSlot, I wonder if we don't need a >> full TupleConversionMap, just the attribute map in it. We don't need the >> input/output Datum and bool arrays in it, because we'd be using the ones >> from input and output slots of ConvertTupleSlot. So, can we replace all >> instances of TupleConversionMap in the partitioning code and data >> structures by AttributeNumber arrays. > > Yeah, I earlier thought I could get rid of do_convert_tuple() > altogether. But there are places where we currently deal with tuples > rather than slots. And here, we could not replace do_convert_tuple() > unless we slotify the surrounding code similar to how you did in the > Allocate-dedicated-slots patch. E.g. ExecEvalConvertRowtype() and > AfterTriggerSaveEvent() deals with tuples so the do_convert_tuple() > calls there couldn't be replaced. > > So I think we can work towards what you have in mind in form of > incremental patches. I was saying, because we no longer use do_convert_tuple for "partitioning-related" tuple conversions, we could get rid of TupleConversionMaps in the partitioning-specific PartitionTupleRouting structure. Also, since ConvertTupleSlot itself just uses the attrMap field of TupleConversionMap, I was wondering if its signature should contain AttrNumber * instead of TupleConversionMap *? Maybe if we get around to getting rid of do_convert_tuple altogether, that would also mean getting rid of the TupleConversionMap struct, because its various tuple data related arrays would be redundant, because ConvertTupleSlot has input and output TupleTableSlots, which have space for that information. >> Also, how about putting ConvertTupleSlot in execPartition.c and exporting >> it in execPartition.h, instead of tupconvert.c and tupconvert.h, resp.? > > I think the goal of ConvertTupleSlot() is to eventually replace > do_convert_tuple() as well, so it would look more of a general > conversion rather than specific for partitions. Hence I think it's > better to have it in tupconvert.c. Okay, maybe that makes sense for any future developments. Thanks, Amit [1] https://www.postgresql.org/message-id/6505cc8c-a8e4-986e-82d3-6106877ecd3a%40lab.ntt.co.jp
On 3 September 2018 at 12:14, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > Hi Amit, > > Thanks for the updated patch and sorry I couldn't reply sooner. > > On 2018/08/21 16:18, Amit Khandekar wrote: >> On 21 August 2018 at 08:12, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >>> Here are some comments on the patch: >> >> Thanks for the review. >> >>> >>> +ConvertTupleSlot >>> >>> Might be a good idea to call this ConvertSlotTuple? >> >> I thought the name 'ConvertTupleSlot' emphasizes the fact that we are >> operating rather on a slot without having to touch the tuple wherever >> possible. Hence I chose the name. But I am open to suggestions if >> there are more votes against this. > > Hmm, okay. > > The verb here is "to convert", which still applies to a tuple, the only > difference is that the new interface accepts and returns TupleTableSlot, > instead of HeapTuple. Yeah, in that sense you are right. Let's see others' suggestions. For now I haven't changed it. > @@ -2691,12 +2861,14 @@ CopyFrom(CopyState cstate) > > /* > * We might need to convert from the parent rowtype to the > - * partition rowtype. > + * partition rowtype. Don't free the already stored tuple as it > + * may still be required for a multi-insert batch. > */ > tuple = > ConvertPartitionTupleSlot(proute->parent_child_tupconv_maps[leaf_part_index], > tuple, > proute->partition_tuple_slot, > - &slot); > + &slot, > + false); > > So the "already stored tuple" means a previous tuple that may have been > stored in 'proute->partition_tuple_slot', which shouldn't be freed in > ConvertPartitionTupleSlot (via ExecStoreTuple), because it might also have > been stored in the batch insert tuple array. > > Now, because with ConvertTupleSlot, there is no worry about freeing an > existing tuple, because we never perform ExecStoreTuple on > proute->partition_tuple_slot (or one of the slots in it if we consider my > patch at [1] which converts it to an array). All I see applied to those > slots is ExecStoreVirtualTuple() in ConvertTupleSlot(), and in some cases, > ExecCopySlotTuple() in the caller of ConvertTupleSlot(). > > So, I think that that sentence is obsolete with your patch. Right. Removed the "Don't free the already stored tuple" part. > I was saying, because we no longer use do_convert_tuple for > "partitioning-related" tuple conversions, we could get rid of > TupleConversionMaps in the partitioning-specific PartitionTupleRouting > structure. We use child_parent_tupconv_maps in AfterTriggerSaveEvent() where we call do_convert_tuple() with the map. We pass parent_child_tupconv_maps to adjust_partition_tlist(), which requires the map->outdesc. So there are these places where still we can't get rid of TupleConversionMaps even for partition-related tuples. Regarding adjust_partition_tlist(), we can perhaps pass the outdesc from somewhere else rather than map->outdesc, making it possible to use AttrNumber map rather than TupleConversionMap. But it needs some investigation. Overall, I think both the above maps should be worked on as a separate item, and not put in this patch that focusses on ConvertTupleSlot(). I have changed the PartitionDispatch->tupmap type to AttrNumber[], though. This was possible because we are using the tupmap->attrMap and no other fields. > > Also, since ConvertTupleSlot itself just uses the attrMap field of > TupleConversionMap, I was wondering if its signature should contain > AttrNumber * instead of TupleConversionMap *? Done. > > Maybe if we get around to getting rid of do_convert_tuple altogether, that > would also mean getting rid of the TupleConversionMap struct, because its > various tuple data related arrays would be redundant, because > ConvertTupleSlot has input and output TupleTableSlots, which have space > for that information. Yeah agree. We will be working towards that. -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Вложения
Hi Amit, Could you rebase this patch, it doesn't apply anymore. Greetings, Andres Freund
On Wed, 26 Sep 2018 at 03:33, Andres Freund <andres@anarazel.de> wrote: > > Hi Amit, > > Could you rebase this patch, it doesn't apply anymore. Thanks for informing. Attached are both mine and Amit Langote's patch rebased and attached ... -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Вложения
On 2018/09/28 19:06, Amit Khandekar wrote: > On Wed, 26 Sep 2018 at 03:33, Andres Freund <andres@anarazel.de> wrote: >> >> Hi Amit, >> >> Could you rebase this patch, it doesn't apply anymore. > > Thanks for informing. Attached are both mine and Amit Langote's patch > rebased and attached ... Thanks Amit for also taking care of the other one. I don't have time today, but will try to take a look on Monday. Regards, Amit
Hi, On 2018-09-28 15:36:00 +0530, Amit Khandekar wrote: > On Wed, 26 Sep 2018 at 03:33, Andres Freund <andres@anarazel.de> wrote: > > > > Hi Amit, > > > > Could you rebase this patch, it doesn't apply anymore. > > Thanks for informing. Attached are both mine and Amit Langote's patch > rebased and attached ... I wasn't quite happy yet with that patch. - ConvertTupleSlot seems like a too generic name, it's very unclear it's related to tuple mapping, rather than something internal to slots. I went for execute_attr_map_slot (and renamed do_convert_tuple to execute_attr_map_tuple, to match). I'd welcome a better name. - I disliked inheritence_tupconv_map, it doesn't seem very clear why this is named inheritence_* (typo aside). I went for convert_tuples_by_name_map_if_req() - while I think this sounds too much like it converts tuples itself it should be renamed with the rest of the convert_tuples_by_* routines. I'd welcome a better name. - Combined the two patches, they seemed to largely affect related code I'm pretty tired right now, so I'm sure the patch, as attached, contains a few flubs. I'll try to get this committed tomorrow morning PST. - Andres
Вложения
Hi, I looked at the patch. Some comments. On 2018/10/02 16:35, Andres Freund wrote: > I wasn't quite happy yet with that patch. > > - ConvertTupleSlot seems like a too generic name, it's very unclear it's > related to tuple mapping, rather than something internal to slots. I > went for execute_attr_map_slot (and renamed do_convert_tuple to > execute_attr_map_tuple, to match). > > I'd welcome a better name. do_convert_tuple -> convert_tuple_using_map ConvertTuplSlot -> convert_tuple_using_map_slot ? > - I disliked inheritence_tupconv_map, it doesn't seem very clear why > this is named inheritence_* (typo aside). I went for > convert_tuples_by_name_map_if_req() - while I think this sounds > too much like it converts tuples itself it should be renamed with the > rest of the convert_tuples_by_* routines. > > I'd welcome a better name. Agree about doing something about the convert_* names. A comment near the beginning of tupconvert.c says all of these convert_* routines are meant as conversion "setup" routines: /* * The conversion setup routines have the following common API: So, maybe: convert_tuples_by_position -> get_conversion_map_by_position convert_tuples_by_name -> get_conversion_map_by_name convert_tuples_by_name_map -> get_attr_map_for_conversion convert_tuples_by_name_map_if_req -> get_attr_map_for_conversion_if_req > - Combined the two patches, they seemed to largely affect related code Hmm yeah, the code and data structures that my patch changed are only related to the cases which involve tuple conversion. I noticed the comment at the top of tupconvert.c requires updating: * of dropped columns. There is some overlap of functionality with the * executor's "junkfilter" routines, but these functions work on bare * HeapTuples rather than TupleTableSlots. Now we have functions that manipulate TupleTableSlots. Thanks, Amit
On 2018-10-02 18:35:29 +0900, Amit Langote wrote: > Hi, > > I looked at the patch. Some comments. > > On 2018/10/02 16:35, Andres Freund wrote: > > I wasn't quite happy yet with that patch. > > > > - ConvertTupleSlot seems like a too generic name, it's very unclear it's > > related to tuple mapping, rather than something internal to slots. I > > went for execute_attr_map_slot (and renamed do_convert_tuple to > > execute_attr_map_tuple, to match). > > > > I'd welcome a better name. > > do_convert_tuple -> convert_tuple_using_map > ConvertTuplSlot -> convert_tuple_using_map_slot I think my name is more descriptive, referencing the attr map seems better to me. > > - I disliked inheritence_tupconv_map, it doesn't seem very clear why > > this is named inheritence_* (typo aside). I went for > > convert_tuples_by_name_map_if_req() - while I think this sounds > > too much like it converts tuples itself it should be renamed with the > > rest of the convert_tuples_by_* routines. > > > > I'd welcome a better name. > > Agree about doing something about the convert_* names. A comment near the > beginning of tupconvert.c says all of these convert_* routines are meant > as conversion "setup" routines: I'll try to tackle that in a separate commit. > /* > * The conversion setup routines have the following common API: > > So, maybe: > > convert_tuples_by_position -> get_conversion_map_by_position > convert_tuples_by_name -> get_conversion_map_by_name > convert_tuples_by_name_map -> get_attr_map_for_conversion > convert_tuples_by_name_map_if_req -> get_attr_map_for_conversion_if_req s/get/build/? I'm also not a fan of the separation of attr and conversion map to signal the difference between tuples and slots being converted... > I noticed the comment at the top of tupconvert.c requires updating: > > * of dropped columns. There is some overlap of functionality with the > * executor's "junkfilter" routines, but these functions work on bare > * HeapTuples rather than TupleTableSlots. > > Now we have functions that manipulate TupleTableSlots. Heh. I think I'll just drop the entire sentence - I don't really think there's much of an overlap to junkfilters these days. Greetings, Andres Freund
On 2018-10-02 11:02:37 -0700, Andres Freund wrote: > On 2018-10-02 18:35:29 +0900, Amit Langote wrote: > > Hi, > > > > I looked at the patch. Some comments. > > > > On 2018/10/02 16:35, Andres Freund wrote: > > > I wasn't quite happy yet with that patch. > > > > > > - ConvertTupleSlot seems like a too generic name, it's very unclear it's > > > related to tuple mapping, rather than something internal to slots. I > > > went for execute_attr_map_slot (and renamed do_convert_tuple to > > > execute_attr_map_tuple, to match). > > > > > > I'd welcome a better name. > > > > do_convert_tuple -> convert_tuple_using_map > > ConvertTuplSlot -> convert_tuple_using_map_slot > > I think my name is more descriptive, referencing the attr map seems > better to me. > > > > > - I disliked inheritence_tupconv_map, it doesn't seem very clear why > > > this is named inheritence_* (typo aside). I went for > > > convert_tuples_by_name_map_if_req() - while I think this sounds > > > too much like it converts tuples itself it should be renamed with the > > > rest of the convert_tuples_by_* routines. > > > > > > I'd welcome a better name. > > > > Agree about doing something about the convert_* names. A comment near the > > beginning of tupconvert.c says all of these convert_* routines are meant > > as conversion "setup" routines: > > I'll try to tackle that in a separate commit. > > > > /* > > * The conversion setup routines have the following common API: > > > > So, maybe: > > > > convert_tuples_by_position -> get_conversion_map_by_position > > convert_tuples_by_name -> get_conversion_map_by_name > > convert_tuples_by_name_map -> get_attr_map_for_conversion > > convert_tuples_by_name_map_if_req -> get_attr_map_for_conversion_if_req > > s/get/build/? I'm also not a fan of the separation of attr and > conversion map to signal the difference between tuples and slots being > converted... > > > > I noticed the comment at the top of tupconvert.c requires updating: > > > > * of dropped columns. There is some overlap of functionality with the > > * executor's "junkfilter" routines, but these functions work on bare > > * HeapTuples rather than TupleTableSlots. > > > > Now we have functions that manipulate TupleTableSlots. > > Heh. I think I'll just drop the entire sentence - I don't really think > there's much of an overlap to junkfilters these days. I've pushed this now. We can discuss about the other renaming and then I'll commit that separately. Greetings, Andres Freund