Re: [HACKERS] Runtime Partition Pruning

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: [HACKERS] Runtime Partition Pruning
Дата
Msg-id 56f5af24-4909-26a0-5913-a43ab58ea07b@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: [HACKERS] Runtime Partition Pruning  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers
Hi David,

On 2018/04/06 12:27, David Rowley wrote:
> (sending my reply in parts for concurrency)
> 
> On 6 April 2018 at 14:39, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> I think we can save some space here by not having the pointers stored
>> here.  Instead of passing the pointer itself in the recursive calls to
>> find_subplans_for_extparams_recurse, et al, just pass the entire array and
>> an offset to use for the given recursive invocation.
> 
> hmm, where would those offsets be stored?  I don't want to have to do
> any linear searching to determine the offset, which is why I just
> stored the pointer to the flat array. It seems very efficient to me to
> do this. Remember that this pruning can occur per tuple in cases like
> parameterized nested loops.
> 
> Are you worried about memory consumption? It's one pointer per
> partition. I imagine there's lots more allocated for DML on a
> partitioned table as it needs to store maps to map attribute numbers.
> 
> Or are you thinking the saving of storing an array of 32-bit int
> values is better than the array of probably 64-bit pointers? So
> requires half the space?

Yeah, just copy it from the PartitionPruneInfo like you're doing for
subnodeindex:

   memcpy(partrelprune->subpartindex, pinfo->subpartindex,
          sizeof(int) * pinfo->nparts);

Instead I see ExecSetupPartitionPruning is now doing this:

  /*
   * Setup the PartitionedRelPruning's subpartprune so that we can
   * quickly find sub-PartitionedRelPruning details for any
   * sub-partitioned tables that this partitioned table contains.
   * We need to be able to find these quickly during our recursive
   * search to find all matching subnodes.
   */
   for (j = 0; j < pinfo->nparts; j++)
   {
       int         subpartidx = pinfo->subpartindex[j];

       Assert(subpartidx < list_length(partitionpruneinfo));

       if (subpartidx >= 0)
           partrelprune->subpartprune[j] = &partrelprunes[subpartidx];
       else
           partrelprune->subpartprune[j] = NULL;
   }


With that in place, pass the index/offset instead of the pointer to the
next recursive invocation of find_subplans_*, along with the array
containing all PartitionedRelPruning's.

So, where you have in each of find_subplans_*:

  if (partrelprune->subnodeindex[i] >= 0)
      *validsubplans = bms_add_member(*validsubplans,
                                      partrelprune->subnodeindex[i]);
  else if (partrelprune->subpartprune[i] != NULL)
      find_subplans_for_allparams_recurse(partrelprune->subpartprune[i],
                                          validsubplans);

I'm proposing that you do:

  if (partrelprune->subnodeindex[i] >= 0)
      *validsubplans = bms_add_member(*validsubplans,
                                      partrelprune->subnodeindex[i]);
  else if (partrelprune->subpartindex[i] >= 0)
      find_subplans_for_allparams_recurse(all_partrelprunes,
                                          partrelprune->subpartindex[i],
                                          validsubplans);

And at the top of each of  find_subplans_*:

  ParitionedRelPruning *partrelprune = all_partrelprunes[offset];

where the signature is:

  static void
  find_subplans_for_allparams_recurse(
                                 PartitionRelPruning **all_partrelprune,
                                 int offset,
                                 Bitmapset **validsubplans)

The all_partrelprune above refers to the flat array in PartitionPruning.
On the first call from ExecFindMatchingSubPlans, et al, you'd pass 0 for
offset to start pruning with the root parent's PartitionedRelPruning.  All
the values contained in subnodeindex and subpartindex are indexes into the
global array (whole-tree that is) anyway and that fact would be more
apparent if we use this code structure, imho.

>> If you look at ExecFindPartition used for tuple routing, you may see that
>> there no recursion at all.  Similarly find_subplans_for_extparams_recurse,
>> et al might be able to avoid recursion if similar tricks are used.
> 
> That seems pretty different. That's looking for a single node in a
> tree, so just is following a single path from the root, it never has
> to go back up a level and look down any other paths.
> 
> What we need for the find_subplans_for_extparams_recurse is to find
> all nodes in the entire tree which match the given clause. Doing this
> without recursion would require some sort of stack so we can go back
> up a level and search again down another branch.  There are ways
> around this without using recursion, sure, but I don't think any of
> them will be quite as convenient and simple. The best I can think of
> is to palloc some stack manually and use some depth_level to track
> which element to use.  An actual stack seems more simple. I can't
> quite think of a good way to know in advance how many elements we'd
> need to palloc.

Hmm, yeah.  I just remembered that I had to give up suggesting this a
while back on this thread.  So, okay, you don't need to do anything about
this.

>> Finally about having two different functions for different sets of Params:
>> can't we have just named find_subplans_for_params_recurse() and use the
>> appropriate one based on the value of some parameter?  I can't help but
>> notice the duplication of code.
> 
> I had decided not to make this one function previously as I didn't
> really want to add unnecessary branching in the code. After
> implementing it, it does not look as bad as I thought.

You could at the top of, say, find_subplans_for_params_recurse(..., bool
extparam):

Bitmapset *params = extparam
                        ? partrelprune->extparams
                        : partrelprune->allparams;

ISTT, find_subplans_for_extparams_recurse and
find_subplans_for_allparams_recurse contain the exact same code except
these two lines in the two functions, respectively:

    if (!bms_is_empty(partrelprune->extparams))
    {
        context->safeparams = partrelprune->extparams;

    if (!bms_is_empty(partrelprune->allparams))
    {
        context->safeparams = partrelprune->allparams;

I didn't suggest the same for say ExecFindInitialMatchingSubPlans and
ExecFindMatchingSubPlans because they seem to be at least a bit different
in their missions.  IIUC, the former has to do index value adjustment
(those in subnodeindex and subpartindex) after having discovered a new set
of matching partitions in the tree after pruning with external params at
Append/MergeAppend startup and those params won't change after that point.

Thanks,
Amit



В списке pgsql-hackers по дате отправления:

Предыдущее
От: John Naylor
Дата:
Сообщение: Re: WIP: a way forward on bootstrap data
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: PATCH: Configurable file mode mask