Re: Problem with default partition pruning

Поиск

Список

Период

Сортировка

От	Thibaut
Тема	Re: Problem with default partition pruning
Дата	14 марта 2019 г. 17:10:53
Msg-id	a4968068-6401-7a9c-8bd4-6a3bc9164a86@dalibo.com обсуждение исходный текст
Ответ на	RE: Problem with default partition pruning ("Imai, Yoshikazu" <imai.yoshikazu@jp.fujitsu.com>)
Ответы	RE: Problem with default partition pruning
Список	pgsql-hackers

Дерево обсуждения

Le 28/02/2019 à 09:26, Imai, Yoshikazu a écrit :
> Hosoya-san
>
> On Wed, Feb 27, 2019 at 6:51 AM, Yuzuko Hosoya wrote:
>>> From: Amit Langote [mailto:Langote_Amit_f8@lab.ntt.co.jp]
>>> Sent: Wednesday, February 27, 2019 11:22 AM
>>>
>>> Hosoya-san,
>>>
>>> On 2019/02/22 17:14, Yuzuko Hosoya wrote:
>>>> Hi,
>>>>
>>>> I found the bug of default partition pruning when executing a range
>> query.
>>>> -----
>>>> postgres=# create table test1(id int, val text) partition by range
>>>> (id); postgres=# create table test1_1 partition of test1 for values
>>>> from (0) to (100); postgres=# create table test1_2 partition of
>>>> test1 for values from (150) to (200); postgres=# create table
>>>> test1_def partition of test1 default;
>>>>
>>>> postgres=# explain select * from test1 where id > 0 and id < 30;
>>>>                            QUERY PLAN
>>>> ----------------------------------------------------------------
>>>>  Append  (cost=0.00..11.83 rows=59 width=11)
>>>>    ->  Seq Scan on test1_1  (cost=0.00..5.00 rows=58 width=11)
>>>>          Filter: ((id > 0) AND (id < 30))
>>>>    ->  Seq Scan on test1_def  (cost=0.00..6.53 rows=1 width=12)
>>>>          Filter: ((id > 0) AND (id < 30))
>>>> (5 rows)
>>>>
>>>> There is no need to scan the default partition, but it's scanned.
>>>> -----
>>>>
>>>> In the current implement, whether the default partition is scanned
>>>> or not is determined according to each condition of given WHERE
>>>> clause at get_matching_range_bounds().  In this example,
>>>> scan_default is set true according to id > 0 because id >= 200
>>>> matches the default partition.  Similarly, according to id < 30,
>> scan_default is set true.
>>>> Then, these results are combined according to AND/OR at
>> perform_pruning_combine_step().
>>>> In this case, final result's scan_default is set true.
>>>>
>>>> The modifications I made are as follows:
>>>> - get_matching_range_bounds() determines only offsets of range bounds
>>>>   according to each condition
>>>> - These results are combined at perform_pruning_combine_step()
>>>> - Whether the default partition is scanned or not is determined at
>>>>   get_matching_partitions()
>>>>
>>>> Attached the patch.  Any feedback is greatly appreciated.
>>> Thank you for reporting.  Can you please add this to March CF in Bugs
>>> category so as not to lose
>> track
>>> of this?
>>>
>>> I will try to send review comments soon.
>>>
>> Thank you for your reply.  I added this to March CF.
> I tested with simple use case and I confirmed it works correctly like below.
>
> In case using between clause:
> postgres=# create table test1(id int, val text) partition by range (id);
> postgres=# create table test1_1 partition of test1 for values from (0) to (100);
> postgres=# create table test1_2 partition of test1 for values from (150) to (200);
> postgres=# create table test1_def partition of test1 default;
>
> [HEAD]
> postgres=# explain analyze select * from test1 where id between 0 and 50;
>                                                 QUERY PLAN
> -----------------------------------------------------------------------------------------------------------
>  Append  (cost=0.00..58.16 rows=12 width=36) (actual time=0.008..0.008 rows=0 loops=1)
>    ->  Seq Scan on test1_1  (cost=0.00..29.05 rows=6 width=36) (actual time=0.005..0.005 rows=0 loops=1)
>          Filter: ((id >= 0) AND (id <= 50))
>    ->  Seq Scan on test1_def  (cost=0.00..29.05 rows=6 width=36) (actual time=0.002..0.002 rows=0 loops=1)
>          Filter: ((id >= 0) AND (id <= 50))
>
>
> [patched]
> postgres=# explain analyze select * from test1 where id between 0 and 50;
>                                                QUERY PLAN
> ---------------------------------------------------------------------------------------------------------
>  Append  (cost=0.00..29.08 rows=6 width=36) (actual time=0.006..0.006 rows=0 loops=1)
>    ->  Seq Scan on test1_1  (cost=0.00..29.05 rows=6 width=36) (actual time=0.004..0.005 rows=0 loops=1)
>          Filter: ((id >= 0) AND (id <= 50))
>
>
>
> I considered about another use case. If default partition contains rows whose id = 300 and then we add another
partitionwhich have constraints like id >= 300 and id < 400, I thought we won't scan the rows anymore. But I noticed we
simplycan't add such a partition. 
>
> postgres=# insert into test1 values (300);
> INSERT 0 1
> postgres=# create table test1_3 partition of test1 for values from (300) to (400);
> ERROR:  updated partition constraint for default partition "test1_def" would be violated by some row
>
>
> So I haven't come up with bad cases so far :)
>
> --
> Yoshikazu Imai

Hello Yoshikazu-San,

I tested your patch using some sub-partitions and found a possible problem.

I create a new partitioned partition test1_3 with 2 sub-partitions :

-------------------------

create table test1_3 partition of test1 for values from (200) to (400)
partition by range (id);
create table test1_3_1 partition of test1_3 for values from (200) to (250);
create table test1_3_2 partition of test1_3 for values from (250) to (350);

# explain select * from test1 where (id > 0 and id < 30);
                          QUERY PLAN                          
---------------------------------------------------------------
 Append  (cost=0.00..29.08 rows=6 width=36)
   ->  Seq Scan on test1_1  (cost=0.00..29.05 rows=6 width=36)
         Filter: ((id > 0) AND (id < 30))
(3 rows)

# explain select * from test1 where (id > 220 and id < 230);
                           QUERY PLAN                           
-----------------------------------------------------------------
 Append  (cost=0.00..29.08 rows=6 width=36)
   ->  Seq Scan on test1_3_1  (cost=0.00..29.05 rows=6 width=36)
         Filter: ((id > 220) AND (id < 230))
(3 rows)

# explain select * from test1
where (id > 0 and id < 30) or (id > 220 and id < 230);
                                QUERY PLAN                                
---------------------------------------------------------------------------
 Append  (cost=0.00..106.40 rows=39 width=36)
   ->  Seq Scan on test1_1  (cost=0.00..35.40 rows=13 width=36)
         Filter: (((id > 0) AND (id < 30)) OR ((id > 220) AND (id < 230)))
   ->  Seq Scan on test1_3_1  (cost=0.00..35.40 rows=13 width=36)
         Filter: (((id > 0) AND (id < 30)) OR ((id > 220) AND (id < 230)))
   ->  Seq Scan on test1_3_2  (cost=0.00..35.40 rows=13 width=36)
         Filter: (((id > 0) AND (id < 30)) OR ((id > 220) AND (id < 230)))
(7 rows)

-----------------

Partition pruning is functioning when only the sub-partition is
required. When both the partition and the sub-partition is required,
there is no pruning on the sub-partition.

Cordialement,

--
Thibaut Madelaine
Dalibo

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Robert Haas
Дата: 14 марта 2019 г., 17:03:34
Сообщение: Re: Timeout parameters

Следующее

От: Amit Kapila
Дата: 14 марта 2019 г., 17:16:41
Сообщение: Re: WIP: Avoid creation of the free space map for small tables

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Problem with default partition pruning

Предыдущее

Следующее