Re: [HACKERS] Runtime Partition Pruning

From	Andy Fan
Subject	Re: [HACKERS] Runtime Partition Pruning
Date
Msg-id	CAKU4AWoqn2Fej7D_2ZCJKqdbOmXiizu3q1xQk5Qz=uAikck0uA@mail.gmail.com
Whole thread / Raw message
In response to	Re: [HACKERS] Runtime Partition Pruning  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses	Re: [HACKERS] Runtime Partition Pruning  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List	pgsql-hackers
Hi Ashutosh:

On Thu, Oct 8, 2020 at 7:25 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Wed, Oct 7, 2020 at 7:00 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
>
>
> On Wed, Oct 7, 2020 at 5:05 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>>
>>
>>
>> On Sun, Oct 4, 2020 at 3:10 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> Now, in my experience, the current system for custom plans vs. generic
>>>> plans doesn't approach the problem in this way at all, and in my
>>>> experience that results in some pretty terrible behavior.  It will do
>>>> things like form a custom plan every time because the estimated cost
>>>> of the custom plan is lower than the estimated cost of the generic
>>>> plan even though the two plans are structurally identical; only the
>>>> estimates differ.  It will waste gobs of CPU cycles by replanning a
>>>> primary key lookup 5 times just on the off chance that a lookup on the
>>>> primary key index isn't the best option.  But this patch isn't going
>>>> to fix any of that.  The best we can probably do is try to adjust the
>>>> costing for Append paths in some way that reflects the costs and
>>>> benefits of pruning.  I'm tentatively in favor of trying to do
>>>> something modest in that area, but I don't have a detailed proposal.
>>>>
>>>
>>> I just realized this issue recently and reported it at [1], then Amit
>>> pointed me to the discussion here, so I would like to continue the
>>> topic here.
>>>
>>> I think we can split this into two issues.  One is initial partition
>>> pruning, which may happen in the custom-plan case only and caused the
>>> issue above.  The other happens in "run-time" partition pruning; I
>>> admit that is an important issue to resolve as well, but it looks
>>> harder.  So I think we can fix the first one first.
>>>
>>> ... When we compute the cost of a
>>> generic plan, we can reduce the cost based on such information.
>>
>>
>> This approach doesn't work: after the initial partition pruning, not
>> only should the cost of the Append node be reduced, the costs of other
>> plan nodes should be reduced as well [1].
>>
>> However, I think if we can use partition-pruning information from a
>> custom plan at the cost_append_path stage, the issue can be fixed.  If
>> so, the idea is similar to David's idea in [2]; however, Robert didn't
>> agree with it [2].  Can anyone elaborate on this objection?  For
>> partkey > $1 or BETWEEN cases, some real results from the past are
>> probably better than hard-coded assumptions, IMO.
>
>
> I understand Robert's idea now: he intended to resolve both the
> "initial partition prune" case and the "runtime partition prune" case,
> while I was always thinking about the former (Amit reminded me of that
> at the beginning, but it only clicked for me just now).
>
> With Robert's idea, I think we can do some hacking in
> create_append_path: figure out the pruneinfo at that stage (it is done
> in create_append_plan now), and then do a fix_append_path with that
> pruneinfo.  We need to fix not only the cost of the AppendPath but also
> its row count.  For example:
> SELECT .. FROM t1, t2, p WHERE t1.a = p.partkey AND t1.b = t2.b, if the
> plan is:
> Nested Loop
>    Nested Loop
>       t1
>       Append (p)
>    t2
>
> Then the row count of Append (p) will be very important.
>
> For this idea, the key part is how to estimate how many partitions/rows
> can be pruned.  Robert said: "I was thinking, rather, that if we know
> for example that we've doing pruning on partition_column = $1, then we
> know that only one partition will match.  That's probably a common
> case.  If we've got partition_column > $1, we could assume that, say,
> 75% of the partitions would match.  partition_column BETWEEN $1 and $2
> is probably a bit more selective, so maybe we assume 50% of the
> partitions would match."  I don't think we can say that 75% or 50% is a
> good number, but the key point may be that "partition_column = $1" is a
> common case.  And for the other cases, we probably shouldn't make
> things worse.
>
> I think we need to do something here; any thoughts?  Personally, I like
> this idea more now.

Yes, I think we have to do something along those lines. But I am
wondering why our stats machinery fails to do that already. For
equality I think it's straightforward. But even for other operators
we should use the statistics from the partitioned table to estimate
the selectivity and then adjust the number of scanned partitions =
(number of partitions * fraction of rows scanned). That might be
better than 50% or 75%.
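
The arithmetic described above can be sketched as follows. This is an
illustrative Python sketch only, not PostgreSQL source code, and the
function name is invented for the illustration:

```python
# Hypothetical sketch (not PostgreSQL code): estimate how many
# partitions an Append will scan from the clause's selectivity,
# computed against the partitions' statistics.

def estimated_scanned_partitions(n_partitions: int, selectivity: float) -> int:
    """scanned partitions = n_partitions * fraction of rows scanned,
    clamped to [1, n_partitions]: any clause that can return rows
    must touch at least one partition."""
    est = round(n_partitions * selectivity)
    return max(1, min(n_partitions, est))

# For 8 partitions: an equality clause with selectivity 1/8 yields
# 1 scanned partition; a range clause with selectivity 0.6 yields 5.
```

The clamp is the interesting design point: unlike a flat 50%/75%
assumption, the estimate tracks the clause's actual selectivity but can
never drop below one partition.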

 
Sorry for the late reply!  Suppose we have a partitioned table defined
like this:
p
- p1
- p2

When you talk about "the statistics from the partitioned table", do you
mean the statistics from p or from p1/p2?  I just confirmed there are no
statistics for p (at least pg_class.reltuples = -1), so I think you are
talking about p1/p2.

Here we are talking about partkey = $1 or partkey = RunTimeValue.
Even though the value can hit only one partition, we don't know the
exact value, so we treat all the partitions equally; nothing looks
wrong with the partition-level estimation.  However, when we cost the
Append path, we need to account for the fact that just one of them can
be hit, and we need to do something there.  Both AppendPath->rows and
AppendPath->total_cost should be adjusted (that doesn't happen now).
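
To make the intended adjustment concrete, here is a hedged sketch
(illustrative Python with invented names; the real change would live in
the planner's C code) of scaling an Append's estimates when initial
pruning on partkey = $1 guarantees a single surviving partition:

```python
# Hypothetical illustration (not PostgreSQL code): scale an Append
# path's row count and total cost by the fraction of partitions
# expected to survive initial pruning, assuming partitions of
# roughly equal size and scan cost.

def adjust_append_estimates(rows: float, total_cost: float,
                            n_partitions: int, n_surviving: int):
    frac = n_surviving / n_partitions
    return rows * frac, total_cost * frac

# partkey = $1 over 10 equal partitions: only 1 survives, so a
# 10000-row, cost-1000 Append is adjusted to 1000 rows, cost 100.
```

The equal-size assumption is of course a simplification; with skewed
partitions the per-partition estimates would have to be summed instead.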

--
Best Regards
Andy Fan

In pgsql-hackers by send date:

Previous
From: "tsunakawa.takay@fujitsu.com"
Date:
Message: RE: Transactions involving multiple postgres foreign servers, take 2
Next
From: Andy Fan
Date:
Message: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey