Thread: [PATCH] Add a guc parameter to control limit clause adjust path cost.


[PATCH] Add a guc parameter to control limit clause adjust path cost.

From:
"Haiyang Li"
Date:
Hi all,
When a query contains LIMIT/OFFSET clauses, the optimizer uses this
information to influence path generation. For example, it may consider
the path with the lowest startup cost and adjust path costs
accordingly.

In most cases these behaviors are beneficial and result in a better
plan. However, there are scenarios where they lead to a bad plan. One
typical case is a query that contains both ORDER BY and LIMIT clauses,
where the optimizer chooses the wrong index scan. PostgreSQL makes very
optimistic predictions about a path that uses an ordered index to
satisfy the row count requirement, and when the actual data distribution
differs from the prediction, it may generate a bad plan. One case
provided by Lukas [1] hits this issue, and another case I ran into
recently is:

```
-- bad plan
                                                                                  QUERY PLAN                                                                                   
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=1.00..611.58 rows=200 width=419) (actual time=3848.321..3867.253 rows=200 loops=1)
  Buffers: shared hit=5957080 (main=5957080 vm=0 fsm=0) read=1 (main=1 vm=0 fsm=0)
  I/O Timings: shared/local read=0.518
  Read Timings: total=0.534 ms buffer alloc=0.010 read io=0.519 page replay=0.002
  ->  Nested Loop  (cost=1.00..1556085.69 rows=509704 width=419) (actual time=3848.320..3867.231 rows=200 loops=1)
        Buffers: shared hit=5957080 (main=5957080 vm=0 fsm=0) read=1 (main=1 vm=0 fsm=0)
        I/O Timings: shared/local read=0.518
        Read Timings: total=0.534 ms buffer alloc=0.010 read io=0.519 page replay=0.002
        ->  Index Scan using "idx_outDetail_channel_id" on pmc_outcome_detail d  (cost=0.43..369798.26 rows=641998 width=411) (actual time=0.042..758.725 rows=1172313 loops=1)
              Index Cond: ((channel_id = ANY ('{7000,2000,4000}'::numeric[])) AND (channel_id = '4000'::numeric))
              Filter: ((voucher_file_name IS NULL) AND (payment_type = ANY ('{7,8}'::numeric[])) AND (status = '3'::numeric))
              Rows Removed by Filter: 586208
              Buffers: shared hit=95516 (main=95516 vm=0 fsm=0)
        ->  Index Scan using pmc_outcome_pkey on pmc_outcome o  (cost=0.56..1.85 rows=1 width=18) (actual time=0.002..0.002 rows=0 loops=1172313)
              Index Cond: (id = d.outcome_id)
              Filter: (user_type = '1'::numeric)
              Rows Removed by Filter: 1
              Buffers: shared hit=5861564 (main=5861564 vm=0 fsm=0) read=1 (main=1 vm=0 fsm=0)
              I/O Timings: shared/local read=0.518
              Read Timings: total=0.534 ms buffer alloc=0.010 read io=0.519 page replay=0.002
Planning:
  Buffers: shared hit=20 (main=16 vm=4 fsm=0)
Planning Time: 0.269 ms
Execution Time: 3867.322 ms
(24 rows)
-- good plan
                                                                                QUERY PLAN                                                                               
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=15964.44..595103.67 rows=200 width=419) (actual time=809.243..818.407 rows=200 loops=1)
  Buffers: shared hit=5957257 (main=5957257 vm=0 fsm=0) read=5 (main=5 vm=0 fsm=0)
  I/O Timings: shared/local read=1.219
  Read Timings: total=1.243 ms buffer alloc=0.015 read io=1.219 page replay=0.004
  ->  Gather  (cost=15964.44..595103.67 rows=509704 width=419) (actual time=809.242..818.381 rows=200 loops=1)
        Workers Planned: 6
        Workers Launched: 6
        Buffers: shared hit=5957257 (main=5957257 vm=0 fsm=0) read=5 (main=5 vm=0 fsm=0)
        I/O Timings: shared/local read=1.219
        Read Timings: total=1.243 ms buffer alloc=0.015 read io=1.219 page replay=0.004
        ->  Nested Loop  (cost=14964.44..543133.27 rows=84951 width=419) (actual time=801.757..803.199 rows=31 loops=7)
              Buffers: shared hit=5957257 (main=5957257 vm=0 fsm=0) read=5 (main=5 vm=0 fsm=0)
              I/O Timings: shared/local read=1.219
              Read Timings: total=1.243 ms buffer alloc=0.015 read io=1.219 page replay=0.004
              ->  Parallel Bitmap Heap Scan on pmc_outcome_detail d  (cost=14963.87..345418.09 rows=107000 width=411) (actual time=107.837..210.957 rows=167477 loops=7)
                    Recheck Cond: ((channel_id = ANY ('{7000,2000,4000}'::numeric[])) AND (channel_id = '4000'::numeric))
                    Filter: ((voucher_file_name IS NULL) AND (payment_type = ANY ('{7,8}'::numeric[])) AND (status = '3'::numeric))
                    Rows Removed by Filter: 83745
                    Heap Blocks: exact=13247
                    Buffers: shared hit=95541 (main=95541 vm=0 fsm=0) read=5 (main=5 vm=0 fsm=0)
                    I/O Timings: shared/local read=1.219
                    Read Timings: total=1.243 ms buffer alloc=0.015 read io=1.219 page replay=0.004
                    ->  Bitmap Index Scan on "idx_outDetail_channel_id"  (cost=0.00..14803.38 rows=1072436 width=0) (actual time=101.764..101.764 rows=1820498 loops=1)
                          Index Cond: ((channel_id = ANY ('{7000,2000,4000}'::numeric[])) AND (channel_id = '4000'::numeric))
                          Buffers: shared hit=6016 (main=6016 vm=0 fsm=0) read=5 (main=5 vm=0 fsm=0)
                          I/O Timings: shared/local read=1.219
                          Read Timings: total=1.243 ms buffer alloc=0.015 read io=1.219 page replay=0.004
              ->  Index Scan using pmc_outcome_pkey on pmc_outcome o  (cost=0.56..1.85 rows=1 width=18) (actual time=0.003..0.003 rows=0 loops=1172342)
                    Index Cond: (id = d.outcome_id)
                    Filter: (user_type = '1'::numeric)
                    Rows Removed by Filter: 1
                    Buffers: shared hit=5861716 (main=5861716 vm=0 fsm=0)
Planning:
  Buffers: shared hit=20 (main=16 vm=4 fsm=0)
Planning Time: 0.299 ms
Execution Time: 818.499 ms
(36 rows)
```
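To make the cost arithmetic concrete, here is a rough sketch (Python, purely
illustrative; the real computation lives in the planner's C code) of how the
Limit node's cost in the bad plan above is derived from its input path. The
planner linearly interpolates between the input's startup and total cost,
assuming the matching rows are spread uniformly through the input:

```python
# Sketch of the LIMIT cost interpolation, using the numbers from the
# "bad plan" above.  The planner assumes matching rows are uniformly
# distributed, so fetching 200 of an estimated 509704 rows is charged
# at a 200/509704 fraction of the run cost -- the optimistic assumption
# that backfires when matches cluster late in the index order.

startup_cost = 1.00          # nested loop startup cost
total_cost = 1556085.69      # nested loop total cost (all rows)
input_rows = 509704          # estimated rows out of the nested loop
limit_count = 200            # LIMIT 200

fraction = limit_count / input_rows
limit_cost = startup_cost + (total_cost - startup_cost) * fraction
print(f"{limit_cost:.2f}")   # ~611.58, matching the Limit node above
```

With this discount the ordered index scan looks very cheap (611.58 vs. the
bitmap plan's 595103.67), which is why the planner prefers it even though it
ends up scanning 1.17M rows.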

In such cases, users do not have a good way to force the optimizer to behave differently.

To address this, I have added a GUC parameter that controls whether the
optimizer uses LIMIT clauses to adjust path costs, providing a way to
intervene when needed. We have also considered similar scenarios, such
as MIN/MAX. Note that the optimizer still uses LIMIT clauses to adjust
the estimated row counts, as that is always reasonable. The patch is
attached (no test cases added yet).

Any thoughts?

Regards,
Haiyang Li

[1] https://pganalyze.com/blog/5mins-postgres-planner-order-by-limit


Attachments

Re: [PATCH] Add a guc parameter to control limit clause adjust path cost.

From:
Tom Lane
Date:
"Haiyang Li" <mohen.lhy@alibaba-inc.com> writes:
> To address this, I have added a GUC parameter to control the
> optimizer's use of LIMIT clauses for adjusting the cost of paths,
> providing a means to force intervention. We have also considered similar
> scenarios, such as MIN/MAX. Note that we still retain the optimizer's
> use of LIMIT clauses to adjust the number of rows, as this is always
> reasonable. The patch is provided in attachment(not add test case yet).
> Any thoughts?

I think this is a pretty bad solution as given.  A global GUC switch
is an extremely blunt instrument and hard to use in production without
breaking cases you didn't want to break.  The proposed patch seems to
have that problem in spades, as it looks like you've turned off
*every* place that has any consideration of LIMIT effects without
concern for whether that place is known to have problems, and
furthermore done so at the greatest possible distance from where the
estimates actually get made.

We have discussed fixes with a bit more finesse, such as adjusting
LIMIT cost estimates with the understanding that the tuples being
sought may not be uniformly distributed in the input data (which is
usually the ultimate cause of such plans performing badly).  Nobody's
carried such ideas through to a complete proposal as yet, though.

            regards, tom lane



Re: [PATCH] Add a guc parameter to control limit clause adjust path cost.

From:
Bryan Green
Date:
On 11/2/2025 11:28 AM, Haiyang Li wrote:
> When a query contains LIMIT/OFFSET clauses, the optimizer uses this
> information to influence path generation. [...]
> PostgreSQL makes very optimistic predictions about the path that uses an
> ordered index to satisfy the row count requirement. When the actual scenario
> differs from the prediction, PostgreSQL may generate a bad plan. [...]
> In such cases, users do not have a good way to force the optimizer to
> behave differently.

I am new to this area, but isn't the problem the filter selectivity
estimation?  The filter

> Filter: ((voucher_file_name IS NULL) AND (payment_type = ANY ('{7,8}'::numeric[])) AND (status = '3'::numeric))
> Rows Removed by Filter: 586208

removes the most rows.  The optimizer shouldn't have selected the index
scan path even with the LIMIT present.  Maybe the benefit of the LIMIT
should be adjusted downward based on the filter selectivity?  Calculate
an effective limit by dividing the LIMIT by the total filter selectivity?
Forgive me if I am completely off base with this discussion.
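The idea above can be sketched roughly as follows (Python; the selectivity
value is invented for illustration, and this is a heuristic, not how the
planner currently computes anything): treat the filter selectivity as a
discount on the LIMIT's benefit, so the cost estimate charges for the rows
the scan must fetch rather than only the rows it returns.

```python
# Hypothetical sketch of discounting the LIMIT by filter selectivity.
# If a filter passes only 1% of fetched rows, then satisfying LIMIT 200
# requires fetching about 20000 rows, so any LIMIT-based cost discount
# should be computed from the effective limit, not the raw one.

limit_count = 200
filter_selectivity = 0.01          # assumed: fraction of rows passing the filter

effective_limit = limit_count / filter_selectivity
print(effective_limit)             # 20000.0 rows must actually be scanned
```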

Bryan Green



Re: [PATCH] Add a guc parameter to control limit clause adjust path cost.

From:
Tom Lane
Date:
Bryan Green <dbryan.green@gmail.com> writes:
> I am new to this area, but isn't the problem the filter selectivity
> estimation?  The filter Filter: ((voucher_file_name IS NULL) AND
> (payment_type =
>> ANY ('{7,8}'::numeric[])) AND (status = '3'::numeric))
>> Rows Removed by Filter: 586208
>> Buffers: shared hit=95516 (main=95516 vm=0 fsm=0)
> removes the most rows.  The optimizer shouldn't have selected the index
> scan path even with LIMIT existing.

I hadn't looked closely at the two plans being compared, but yeah,
they *both* suck.  Why is it choosing a nestloop join when it knows
the outside will yield many many rows?  The "good" plan is better
only because it decided to parallelize the outer table scan, which
moves the goalposts not very far and frankly seems like a chance
result anyway.  I wonder if we are looking at the behavior of a
planner that's been locally modified in other ways, or is using
strange cost settings.

> Maybe the benefit of the LIMIT
> should be adjusted downward based on the filter selectivity?

Yeah, ideas similar to that are what I was alluding to in my
other response.  The typical issue is that the filter clause's
selectivity is underestimated (so that there are fewer matching
rows than we thought) and/or the matching rows preferentially
appear closer to the end of the index scan than the beginning.
I'd be happier with a proposal that would inflate the LIMIT's
minimum cost by some factor based on the likelihood of that
sort of error, rather than just kneecapping the calculation
altogether.
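The direction described above might look something like this sketch (Python;
the risk factor and its blending rule are invented purely for illustration,
not a concrete proposal): keep the LIMIT discount, but inflate the discounted
cost as the estimated risk of non-uniform row placement grows.

```python
# Hypothetical sketch: inflate the LIMIT-discounted cost by a risk
# factor instead of disabling the discount outright.  risk=0 keeps
# today's optimistic estimate; risk=1 charges the full input cost.

def limited_cost(startup, total, input_rows, limit_count, risk):
    fraction = min(1.0, limit_count / input_rows)
    optimistic = startup + (total - startup) * fraction
    pessimistic = total
    # blend between the optimistic and worst-case estimates
    return optimistic + (pessimistic - optimistic) * risk

# With the bad plan's numbers, even a modest risk assumption makes
# the ordered index scan look far less attractive than 611.58.
cautious = limited_cost(1.00, 1556085.69, 509704, 200, risk=0.25)
print(f"{cautious:.2f}")
```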

            regards, tom lane



Re: [PATCH] Add a guc parameter to control limit clause adjust path cost.

From:
ocean_li_996
Date:
There were some formatting display issues with my previous email reply, 
so I’m using another email account to send this message.

"Tom Lane" <tgl@sss.pgh.pa.us> Sun, 02 Nov 2025 12:44:13 -0500 wrote:

> I think this is a pretty bad solution as given.  A global GUC switch
> is an extremely blunt instrument and hard to use in production without
> breaking cases you didn't want to break.  
Yeah.  Disabling this globally in a production environment is not advisable.
A more acceptable approach would be to use the pg_hint_plan extension
and apply hints only to the specific SQL statements that exhibit the issue,
for example:
```
/*+set(enable_limit_adjust_cost off) */ SELECT xxx;
```

> The proposed patch seems to
> have that problem in spades, as it looks like you've turned off
> *every* place that has any consideration of LIMIT effects without
> concern for whether that place is known to have problems, and
> furthermore done so at the greatest possible distance from where the
> estimates actually get made.
The patch works by making the cost estimator behave as if the subquery had
no LIMIT clause, except for the row-count estimation performed at the Limit
node itself.


> We have discussed fixes with a bit more finesse, such as adjusting
> LIMIT cost estimates with the understanding that the tuples being
> sought may not be uniformly distributed in the input data (which is
> usually the ultimate cause of such plans performing badly).  Nobody's
> carried such ideas through to a complete proposal as yet, though.
I have also considered more finessed solutions, but haven't yet found one
that is appropriate. It seems this issue has been discussed for quite some
time. Thus, I proposed adding a dedicated GUC. This solution may not be
elegant, but it is straightforward and it works.


regards,
Haiyang Li