BUG #16824: Planner chooses poor path on query with Merge Join and pagination
От | PG Bug reporting form |
---|---|
Тема | BUG #16824: Planner chooses poor path on query with Merge Join and pagination |
Дата | |
Msg-id | 16824-cac06100763db5ae@postgresql.org обсуждение исходный текст |
Ответы |
Re: BUG #16824: Planner chooses poor path on query with Merge Join and pagination
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 16824 Logged by: Jacob Mitash Email address: jacobmitash@gmail.com PostgreSQL version: 13.1 Operating system: Alpine Linux (Docker container) Description: Hi, I have been working with this relatively simple query that is used to iterate over a large subset of rows across two tables, one page at a time. The query was originally using an OFFSET/LIMIT to execute the paging, which you might guess performs poorly in the later pages. I changed the query to do pagination using a combination of ORDER BY id, WHERE id > id_from_prev_page, and LIMIT page_size. I'm not sure if that strategy has a name, but I'll refer to it as "ID pagination". My dataset involves just two tables, described by this DDL: ``` -- DDL: create table sub ( -- 4.3M records subscription_id varchar(255) -- UUIDs not null constraint pk_sub primary key, subscription_status varchar(255) -- (ACTIVE, PENDING, or CANCELLED) ); create table sub_item ( -- 4.4M records, mostly 1:1 with subscriptions subscription_item_id varchar(255) -- UUIDs not null constraint pk_sub_item primary key, subscription_id varchar(255) -- FK to sub constraint fk_sub_item_sub_id references sub, item_ref varchar(255) -- 25 character strings, mostly the same ); ``` And here's the query I've been using to select pages while testing performance. The literal value of the subscription_id is to represent grabbing a page in the middle of the data. ``` -- Query: Grab a page from the middle EXPLAIN ANALYZE SELECT s.subscription_id, * FROM sub s JOIN sub_item si ON s.subscription_id = si.subscription_id AND si.item_ref = '1001107599999910222001000' WHERE s.subscription_status = 'ACTIVE' AND s.SUBSCRIPTION_ID > '7ca1cf6b-2452-4d1b-bd03-1ba68b63b528' ORDER BY s.subscription_id LIMIT 50; ``` My understanding of this ID pagination is that it should be very quick as it needs only find the ID in the index, and then scan the next 50 entries. However, the execution time is slow: ``` Limit (cost=2.41..325.76 rows=50 width=179) (actual time=10336.095..10340.389 rows=50 loops=1) -> Merge Join (cost=2.41..765832.96 rows=118421 width=179) (actual time=10336.086..10339.900 rows=50 loops=1) Merge Cond: ((s.subscription_id)::text = (si.subscription_id)::text) -> Index Scan using pk_sub on sub s (cost=0.56..270648.21 rows=632913 width=45) (actual time=0.025..1.643 rows=90 loops=1) Index Cond: ((subscription_id)::text > '7ca1cf6b-2452-4d1b-bd03-1ba68b63b528'::text) Filter: ((subscription_status)::text = 'ACTIVE'::text) Rows Removed by Filter: 211 -> Index Scan using idx_sub_item_sub_id_btree on sub_item si (cost=0.56..490385.99 rows=813151 width=98) (actual time=0.016..8260.586 rows=393058 loops=1) Filter: ((item_ref)::text = '1001107599999910222001000'::text) Rows Removed by Filter: 1742475 Planning Time: 0.309 ms Execution Time: 10340.691 ms ``` Just fiddling around, I saw that performance significantly improved when I ran: SET enable_mergejoin = OFF; ``` Limit (cost=1.11..446.97 rows=50 width=179) (actual time=0.150..4.351 rows=50 loops=1) -> Nested Loop (cost=1.11..1055964.59 rows=118421 width=179) (actual time=0.140..3.850 rows=50 loops=1) -> Index Scan using pk_sub on sub s (cost=0.56..270648.21 rows=632913 width=45) (actual time=0.064..0.898 rows=90 loops=1) Index Cond: ((subscription_id)::text > '7ca1cf6b-2452-4d1b-bd03-1ba68b63b528'::text) Filter: ((subscription_status)::text = 'ACTIVE'::text) Rows Removed by Filter: 211 -> Index Scan using idx_sub_item_sub_id_btree on sub_item si (cost=0.56..1.23 rows=1 width=98) (actual time=0.013..0.016 rows=1 loops=90) Index Cond: ((subscription_id)::text = (s.subscription_id)::text) Filter: ((item_ref)::text = '1001107599999910222001000'::text) Rows Removed by Filter: 0 Planning Time: 0.196 ms Execution Time: 4.623 ms ``` I originally asked about this on the DBA StackExchange. I had two questions to which I'll summarize here: 1. Why is the Merge Join performing so slowly? It seems to be because the planner doesn't recognize that it can apply the subscription_id index condition on the inner table. If I explicitly tell it: "AND si.subscription_id > '7ca1...'", then it applies an index condition and is almost instant. 2. Why does the planner believe that Merge Join (as-is) is optimal here? I don't have an answer for this. Potentially a bug related to the previous answer? So I think there's two things going on here that may or may not be classified as bugs: 1. Using Merge Join, the planner does not apply an index condition to the inner table that could be implied from the clause on the outer table. 2. In the same scenario, the planner's cost estimation is grossly different from the actual execution time.
В списке pgsql-bugs по дате отправления:
Следующее
От: Tom LaneДата:
Сообщение: Re: BUG #16824: Planner chooses poor path on query with Merge Join and pagination