9.6 Beta 2 Performance Regression on Recursive CTE

Поиск
Список
Период
Сортировка
От Nicholson, Brad (Toronto, ON, CA)
Тема 9.6 Beta 2 Performance Regression on Recursive CTE
Дата
Msg-id CS1PR84MB0214C7D911E405CB83EE188AD7380@CS1PR84MB0214.NAMPRD84.PROD.OUTLOOK.COM
обсуждение исходный текст
Ответы Re: 9.6 Beta 2 Performance Regression on Recursive CTE  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
Hi,

I'm seeing a performance regression on 9.6 Beta 2 compared to 9.5.3.  The query is question is a recursive query on
graphdata stored as an adjacency list.  While this is example is fairly contrived, it mimics the behavior I am seeing
onreal data with more realistic queries.  The example below uses the same data set, same DB configuration, same query
planand same host.  But the 9.6 beta 2 query runs in 26 seconds and 9.5.3 runs in 13.4 seconds.  This pattern holds up
onrepeated runs.  

This seems to be related to the citext data type.  If I re-run the test below with varchar as the data type,
performanceis similar between 9.5 and 9.6. 

Both are running on CentOS 6.6 using the PG community RPMs.

Version details:

                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit


                                                    version
-------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6beta2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit



Relevant config options (same on both)

cpu_index_tuple_cost=0.005
cpu_operator_cost=0.0025
cpu_tuple_cost=0.05
effective_cache_size=4GB
random_page_cost=4
seq_page_cost=1
shared_buffers=2GB
temp_buffers=8MB
work_mem=4MB

Very basic table structure:

\d adj_edge_citext
Table "public.adj_edge_citext"
 Column |  Type  | Modifiers
--------+--------+-----------
 parent | citext |
 child  | citext |

Import the Graph of Thrones data set from: https://www.macalester.edu/~abeverid/data/stormofswords.csv (ignore the
weightcolumn). 

Run VACUUM ANALYZE on the table, and then run the following:

EXPLAIN (analyze, buffers, verbose) WITH RECURSIVE d AS (
                SELECT
                    a.parent,
                    a.child,
                    1 AS LEVEL,
                    ARRAY[a.parent] AS path,
                    FALSE AS CYCLE
                FROM
                    adj_edge_citext a
                WHERE a.parent in (SELECT distinct(parent) FROM adj_edge_citext)
                UNION ALL
                SELECT
                   a1.parent,
                   a1.child,
                   d.LEVEL+1,
                   path || a1.parent,
                  a1.parent=ANY(path)
                FROM
                    adj_edge_citext a1
                JOIN d ON d.child=a1.parent
                WHERE
                     NOT CYCLE
                     )
                SELECT
                    parent,
                    child,
                    level,
                    path
                FROM d;


9.5.3:
https://explain.depesz.com/s/HOO7
                                                                     QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on d  (cost=9766.98..12900.18 rows=31332 width=100) (actual time=0.810..13023.860 rows=1427997 loops=1)
   Output: d.parent, d.child, d.level, d.path
   Buffers: shared hit=6, temp read=31714 written=64240
   CTE d
     ->  Recursive Union  (cost=31.31..9766.98 rows=31332 width=48) (actual time=0.805..9613.617 rows=1427997 loops=1)
           Buffers: shared hit=6, temp read=31714 written=31701
           ->  Hash Semi Join  (cost=31.31..69.83 rows=352 width=12) (actual time=0.803..1.879 rows=352 loops=1)
                 Output: a.parent, a.child, 1, ARRAY[a.parent], false
                 Hash Cond: (a.parent = adj_edge_citext.parent)
                 Buffers: shared hit=4
                 ->  Seq Scan on public.adj_edge_citext a  (cost=0.00..19.60 rows=352 width=12) (actual
time=0.021..0.177rows=352 loops=1) 
                       Output: a.parent, a.child
                       Buffers: shared hit=2
                 ->  Hash  (cost=27.58..27.58 rows=71 width=6) (actual time=0.760..0.760 rows=71 loops=1)
                       Output: adj_edge_citext.parent
                       Buckets: 1024  Batches: 1  Memory Usage: 11kB
                       Buffers: shared hit=2
                       ->  HashAggregate  (cost=20.48..24.03 rows=71 width=6) (actual time=0.638..0.681 rows=71
loops=1)
                             Output: adj_edge_citext.parent
                             Group Key: adj_edge_citext.parent
                             Buffers: shared hit=2
                             ->  Seq Scan on public.adj_edge_citext  (cost=0.00..19.60 rows=352 width=6) (actual
time=0.010..0.168rows=352 loops=1) 
                                   Output: adj_edge_citext.parent
                                   Buffers: shared hit=2
           ->  Hash Join  (cost=38.08..656.40 rows=3098 width=48) (actual time=0.054..303.806 rows=62072 loops=23)
                 Output: a1.parent, a1.child, (d_1.level + 1), (d_1.path || a1.parent), (a1.parent = ANY (d_1.path))
                 Hash Cond: (d_1.child = a1.parent)
                 Buffers: shared hit=2, temp read=31714 written=13
                 ->  WorkTable Scan on d d_1  (cost=0.00..352.00 rows=1760 width=68) (actual time=0.013..35.953
rows=62087loops=23) 
                       Output: d_1.parent, d_1.child, d_1.level, d_1.path, d_1.cycle
                       Filter: (NOT d_1.cycle)
                       Buffers: temp read=31714 written=13
                 ->  Hash  (cost=19.60..19.60 rows=352 width=12) (actual time=0.496..0.496 rows=352 loops=1)
                       Output: a1.parent, a1.child
                       Buckets: 1024  Batches: 1  Memory Usage: 24kB
                       Buffers: shared hit=2
                       ->  Seq Scan on public.adj_edge_citext a1  (cost=0.00..19.60 rows=352 width=12) (actual
time=0.009..0.204rows=352 loops=1) 
                             Output: a1.parent, a1.child
                             Buffers: shared hit=2
 Planning time: 0.543 ms
 Execution time: 13415.280 ms

9.6 Beta 2:
https://explain.depesz.com/s/R7oR
                                                                     QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on d  (cost=9766.98..12900.18 rows=31332 width=100) (actual time=1.737..25688.068 rows=1427997 loops=1)
   Output: d.parent, d.child, d.level, d.path
   Buffers: shared hit=6, temp read=31714 written=64240
   CTE d
     ->  Recursive Union  (cost=31.31..9766.98 rows=31332 width=49) (actual time=1.731..22178.446 rows=1427997 loops=1)
           Buffers: shared hit=6, temp read=31714 written=31701
           ->  Hash Semi Join  (cost=31.31..69.83 rows=352 width=49) (actual time=1.728..3.862 rows=352 loops=1)
                 Output: a.parent, a.child, 1, ARRAY[a.parent], false
                 Hash Cond: (a.parent = adj_edge_citext.parent)
                 Buffers: shared hit=4
                 ->  Seq Scan on index.adj_edge_citext a  (cost=0.00..19.60 rows=352 width=12) (actual
time=0.023..0.183rows=352 loops=1) 
                       Output: a.parent, a.child
                       Buffers: shared hit=2
                 ->  Hash  (cost=27.58..27.58 rows=71 width=6) (actual time=1.678..1.678 rows=71 loops=1)
                       Output: adj_edge_citext.parent
                       Buckets: 1024  Batches: 1  Memory Usage: 11kB
                       Buffers: shared hit=2
                       ->  HashAggregate  (cost=20.48..24.03 rows=71 width=6) (actual time=1.485..1.527 rows=71
loops=1)
                             Output: adj_edge_citext.parent
                             Group Key: adj_edge_citext.parent
                             Buffers: shared hit=2
                             ->  Seq Scan on index.adj_edge_citext  (cost=0.00..19.60 rows=352 width=6) (actual
time=0.010..0.177rows=352 loops=1) 
                                   Output: adj_edge_citext.parent, adj_edge_citext.child
                                   Buffers: shared hit=2
           ->  Hash Join  (cost=38.08..656.40 rows=3098 width=49) (actual time=0.095..845.508 rows=62072 loops=23)
                 Output: a1.parent, a1.child, (d_1.level + 1), (d_1.path || a1.parent), (a1.parent = ANY (d_1.path))
                 Hash Cond: (d_1.child = a1.parent)
                 Buffers: shared hit=2, temp read=31714 written=13
                 ->  WorkTable Scan on d d_1  (cost=0.00..352.00 rows=1760 width=68) (actual time=0.021..41.653
rows=62087loops=23) 
                       Output: d_1.parent, d_1.child, d_1.level, d_1.path, d_1.cycle
                       Filter: (NOT d_1.cycle)
                       Buffers: temp read=31714 written=13
                 ->  Hash  (cost=19.60..19.60 rows=352 width=12) (actual time=0.757..0.757 rows=352 loops=1)
                       Output: a1.parent, a1.child
                       Buckets: 1024  Batches: 1  Memory Usage: 24kB
                       Buffers: shared hit=2
                       ->  Seq Scan on index.adj_edge_citext a1  (cost=0.00..19.60 rows=352 width=12) (actual
time=0.012..0.166rows=352 loops=1) 
                             Output: a1.parent, a1.child
                             Buffers: shared hit=2
 Planning time: 0.785 ms
 Execution time: 26062.927 ms

As an aside, the cost mis-estimates are very common with these types of queries on graph data, and frequently result in
theplanner making bad choices which result in queries taking seconds instead of milliseconds.  I am aware of the whole
CTEoptimization fence issue, but it is a pretty big problem for graph data in Postgres.                 


В списке pgsql-general по дате отправления:

Предыдущее
От: Craig Boyd
Дата:
Сообщение: Re: psql connection option: statement_timeout
Следующее
От: Martín Marqués
Дата:
Сообщение: Re: Unable to recovery due missing wal_file