BUG #16624: Query Optimizer - Performance bug related to predicate simplification

Поиск
Список
Период
Сортировка
От PG Bug reporting form
Тема BUG #16624: Query Optimizer - Performance bug related to predicate simplification
Дата
Msg-id 16624-1375434d97659203@postgresql.org
обсуждение исходный текст
Ответы Re: BUG #16624: Query Optimizer - Performance bug related to predicate simplification  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
The following bug has been logged on the website:

Bug reference:      16624
Logged by:          XINYU LIU
Email address:      XINYULIU@UMICH.EDU
PostgreSQL version: 13rc1
Operating system:   Ubuntu 20.04
Description:

Hello,

We are developing a tool for automatically finding performance bugs in
PostgreSQL. Our key insight is that given a pair of semantic equivalent
queries, a robust DBMS should return the same result within a similar
execution time. Significant time difference suggests a potential performance
bug in the DBMS.

We are sharing a pair of TPC-H queries that exhibit a potential performance
bug in this report:

First query:
SELECT "ps_suppkey" 
FROM   "partsupp" 
WHERE  "ps_partkey" = 1486; 

Second query:
SELECT "ps_suppkey" 
FROM   "partsupp" 
WHERE  "ps_partkey" + 1486 = 2972; 

[Actual Behavior]
We executed both queries on the TPC-H benchmark of scale factor 5: the first
query takes only 1.059 millisecond, while the second query takes 247.176
millisecond. We think the time difference results from different plans
selected.

[Query Execution Plan]
* First query:
                                                         QUERY PLAN
                                                

-----------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using partsupp_pkey on partsupp  (cost=0.43..4.59 rows=9
width=4) (actual time=0.692..0.694 rows=4 loops=1)
   Index Cond: (ps_partkey = 1486)
   Heap Fetches: 0
 Planning Time: 4.748 ms
 Execution Time: 1.059 ms
(5 rows)

* Second query:
                                                                      QUERY
PLAN


-------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.43..91865.94 rows=19994 width=4) (actual
time=2.032..246.821 rows=4 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Index Only Scan using partsupp_pkey on partsupp
(cost=0.43..88866.54 rows=8331 width=4) (actual time=159.371..240.012 rows=1
loops=3)
         Filter: ((ps_partkey + 1486) = 2972)
         Rows Removed by Filter: 1333332
         Heap Fetches: 0
 Planning Time: 4.556 ms
 Execution Time: 247.176 ms
(9 rows)


[Expected Behavior]
I would have expected the DBMS to run these two queries with similar
execution time, given that they both have the same semantics. Notably, the
execution time difference between these two queries will grow significantly
when the size of the database grows. On the TPC-H benchmark of scale factor
100, the first query takes 1.9 millisecond, while the second query takes 83
seconds.

[Test Environment]
Ubuntu 20.04 machine "Linux panda 5.4.0-40-generic #44-Ubuntu SMP Tue Jun 23
00:01:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux"
PostgreSQL v13 beta3
Database: TPC-H benchmark (with scale factor 5)

[Steps for reproducing our observations]
    
* Download the dataset from the link:
https://drive.google.com/file/d/13rFa1BNDi4e2RmXBn-yEQkcqt6lsBu1c/view?usp=sharing
* Set up TPC-H benchmark
tar xzvf tpch5_postgresql.tar.gz
cd tpch5_postgresql
db=tpch5
createdb $db
psql -d $db < dss.ddl
for i in `ls *.tbl`
do
    echo $i
    name=`echo $i|cut -d'.' -f1`
    psql -d $db -c "COPY $name FROM '`pwd`/$i' DELIMITER '|' ENCODING
'LATIN1';"
done
psql -d $db < dss_postgres.ri
* Execute the queries


В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #16623: JSON select query result is getting differed when we change DB version
Следующее
От: Tom Lane
Дата:
Сообщение: Re: BUG #16624: Query Optimizer - Performance bug related to predicate simplification