Re: [HACKERS] Runtime Partition Pruning

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: [HACKERS] Runtime Partition Pruning
Дата
Msg-id CAKU4AWqGNCAt4rGb7ZKKzPgfRZfp4DfO8fNicXWzKi9YdnBcaA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Runtime Partition Pruning  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers


On Tue, Oct 13, 2020 at 3:48 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Thu, Oct 8, 2020 at 8:25 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> On Wed, Oct 7, 2020 at 7:00 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
> > I can understand Robert's idea now,  he intended to resolve both the
> > "initial-partition-prune" case and "runtime partition prune" case while I always think
> > about the former case (Amit reminded me about that at the beginning, but I just
> > wake up right now..)
> >
> > With the Robert's idea,  I think we can do some hack at create_append_path,
> > we can figure out the pruneinfo (it was done at create_append_plan now) at that
> > stage, and then did a fix_append_path with such pruneinfo.  We need to fix not
> > only the cost of AppendPath,  but also rows of AppendPath, For example:
> > SELECT .. FROM t1, t2, p where t1.a = p.partkey and t1.b = t2.b, if the
> > plan is:
> > Nest Loop
> >    Nest Loop
> >       t1
> >       Append (p)
> >    t2
> >
> > Then the rows of Append (p) will be very important.
> >
> > For this idea,  how to estimate how many partitions/rows can be pruned is a key
> > part.  Robert said "I was thinking, rather, that if we know for example that
> > we've doing pruning on partition_column = $1, then we know that only
> > one partition will match.  That's probably a common case.  If we've
> > got partition_column > $1, we could assume that, say, 75% of the
> > partitions would match.  partition_column BETWEEN $1 and $2 is
> > probably a bit more selective, so maybe we assume 50% of the
> > partitions would match.".   I think we can't say the 75% or 50% is a good
> > number,  but the keypoint may be "partition_column = $1" is a common
> > case.  And for the others case, we probably don't make it worse.
> >
> > I think we need to do something here, any thoughts? Personally I'm more
> > like this idea now.
>
> Yes, I think we have to do something on those lines. But I am
> wondering why our stats machinery is failing to do that already. For
> equality I think it's straight forward. But even for other operators
> we should use the statistics from the partitioned table to estimate
> the selectivity and then adjust number of scanned partitions = (number
> of partitions * fraction of rows scanned). That might be better than
> 50% or 75%.

This is an interesting idea, that is, the idea of consulting parent
relation's stats to guess at the number of partitions that might be
scanned.

However, we don't currently consult an inheritance parent relation's
stats at all during "base" relation planning, which is why you don't
see the parent relation's statistics reflected in the row count and
costs assigned to its (Append at al) paths.  The hard-coded rule is to
sum up children's rows and their paths' costs; see cost_append().

My thinking on this was that we'd just extend that hard-coded rule to
take into account that run-time pruning, if applicable in a given
plan, will cause some of those child paths to be discarded.  Maybe
Andy is headed in that direction?


Yes, that's what I am trying to do.  The minimum code in my mind is:

create_append_path(...)
{

   double run_time_prune_est;

   if (enable_partition_prune && .. )
       List *  partition_prune_step_ops =  cal_prune_step_ops(rel,  partitioned_rels);
       run_time_prune_est = estimate_runtime_prune_ratio(partition_prune_step_ops);
}
// adjust the rows/costs of AppendPath based on the  run_time_prune_est.

The difference between '=' and '<' will be handled in function 
estimate_runtime_prune_ratio. Since I still don't make my mind runnable now,  
some data structures may  change, but the overall idea is something like above. 


--
Best Regards
Andy Fan

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: archive status ".ready" files may be created too early
Следующее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: Transactions involving multiple postgres foreign servers, take 2