Обсуждение: Singleton SELECT inside cursor loop

Поиск
Список
Период
Сортировка

Singleton SELECT inside cursor loop

От
Ron
Дата:
AWS RDS Postgresql 12.11

It runs quickly, no matter what the fetched cursor values are, as long as 
the cursor returns less than 8 rows.  After that, the singleton SELECT's 
performance tanks.

We've got plenty of other procedures which have a similar "Singleton SELECT 
inside a cursor" design (many even query the same table that the problematic 
query does), but only this procedure's singleton SELECT tanks if the cursor 
returns more than 8 rows.

It only runs once a month, so we've only just noticed the problem.

Why would that be?  (We recently updated from 12.10 to 12.11, if that's 
relevant.)

-- 
Angular momentum makes the world go 'round.



Example code Re: Singleton SELECT inside cursor loop

От
Ron
Дата:
I extracted the singleton into a function, and that helped me isolate the 
problem.

Note how quickly it runs the first five times, but takes 780x longer the 
sixth time I run it.  Exiting psql and entering again causes the same 
slowness the sixth time it's run.

sides=> select strans.TASK001785639(1, 507650, '2022-08-01', '2022-08-31');
task001785639
---------------
            498
(1 row)

sides=> select strans.TASK001785639(1, 507650, '2022-08-01', '2022-08-31');
task001785639
---------------
            498
(1 row)

sides=> \timing
Timing is on.

sides=> select strans.TASK001785639(1, 507650, '2022-08-01', '2022-08-31');
task001785639
---------------
            498
(1 row)

Time: 457.325 ms

sides=> select strans.TASK001785639(1, 507650, '2022-08-01', '2022-08-31');
task001785639
---------------
            498
(1 row)

Time: 459.724 ms

sides=> select strans.TASK001785639(1, 507650, '2022-08-01', '2022-08-31');
task001785639
---------------
            498
(1 row)

Time: 461.155 ms

sides=> select strans.TASK001785639(1, 507650, '2022-08-01', '2022-08-31');
task001785639
---------------
            498
(1 row)

Time: 358426.159 ms (05:58.426)


On 10/1/22 01:29, Ron wrote:
> AWS RDS Postgresql 12.11
>
> It runs quickly, no matter what the fetched cursor values are, as long as 
> the cursor returns less than 8 rows.  After that, the singleton SELECT's 
> performance tanks.
>
> We've got plenty of other procedures which have a similar "Singleton 
> SELECT inside a cursor" design (many even query the same table that the 
> problematic query does), but only this procedure's singleton SELECT tanks 
> if the cursor returns more than 8 rows.
>
> It only runs once a month, so we've only just noticed the problem.
>
> Why would that be?  (We recently updated from 12.10 to 12.11, if that's 
> relevant.)
>

-- 
Angular momentum makes the world go 'round.
Вложения

Re: Example code Re: Singleton SELECT inside cursor loop

От
"Peter J. Holzer"
Дата:
On 2022-10-01 02:22:22 -0500, Ron wrote:
> I extracted the singleton into a function, and that helped me isolate the
> problem.

I don't know what a singleton select is (duckduckgo tells me it has
something to do with DB2 and/or COBOL, or maybe it's a whisky), but:


> Note how quickly it runs the first five times, but takes 780x longer the
> sixth time I run it.  Exiting psql and entering again causes the same
> slowness the sixth time it's run.

PostgresSQL evaluates the exact plan the first 5 times it encounters the
same prepared query. After that it switches to a generic plan if it
thinks that the generic plan isn't worse than the specialized plans.

If the plan suddenly gets worse after 5 executions, you've probably run
into a case where the generic plan is worse although the cost computed
by the planner isn't.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Вложения

Re: Example code Re: Singleton SELECT inside cursor loop

От
Christoph Moench-Tegeder
Дата:
## Ron (ronljohnsonjr@gmail.com):

> Note how quickly it runs the first five times, but takes 780x longer the 
> sixth time I run it.  Exiting psql and entering again causes the same 
> slowness the sixth time it's run.

Tanks at the sixth time? That rings a bell: "The current rule for this
is that the first five executions are done with custom plans[...]" from
https://www.postgresql.org/docs/12/sql-prepare.html
And your functions create prepared statements under the hood:
https://www.postgresql.org/docs/12/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING
I guess you should visit
https://www.postgresql.org/docs/12/runtime-config-query.html#GUC-PLAN-CACHE_MODE
and try if plan_cache_mode = force_custom_plan helps here.

Regards,
Christoph

-- 
Spare Space



Re: Example code Re: Singleton SELECT inside cursor loop

От
Ron
Дата:
On 10/1/22 07:20, Peter J. Holzer wrote:
> On 2022-10-01 02:22:22 -0500, Ron wrote:
>> I extracted the singleton into a function, and that helped me isolate the
>> problem.
> I don't know what a singleton select is (duckduckgo tells me it has
> something to do with DB2 and/or COBOL, or maybe it's a whisky), but:

A query which returns one row.  Typically SELECT COUNT/MAX/etc without a 
GROUP BY clause.

-- 
Angular momentum makes the world go 'round.



Re: Example code Re: Singleton SELECT inside cursor loop

От
Ron
Дата:
On 10/1/22 07:21, Christoph Moench-Tegeder wrote:
> ## Ron (ronljohnsonjr@gmail.com):
>
>> Note how quickly it runs the first five times, but takes 780x longer the
>> sixth time I run it.  Exiting psql and entering again causes the same
>> slowness the sixth time it's run.
> Tanks at the sixth time? That rings a bell: "The current rule for this
> is that the first five executions are done with custom plans[...]" from
> https://www.postgresql.org/docs/12/sql-prepare.html

The question then is "why am I just now seeing the problem?"  We've been 
using v12 for two years, and it just happened.

The only recent change is that I upgraded it from RDS 12.10 to 12.11 a 
couple of weeks ago.

> And your functions create prepared statements under the hood:
> https://www.postgresql.org/docs/12/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING

The same thing happens when I put the SELECT in a prepared statement, so 
that seems the likely cause.

> I guess you should visit
> https://www.postgresql.org/docs/12/runtime-config-query.html#GUC-PLAN-CACHE_MODE
> and try if plan_cache_mode = force_custom_plan helps here.

That solved the problem.  Thank you!

-- 
Angular momentum makes the world go 'round.



Re: Example code Re: Singleton SELECT inside cursor loop

От
Christoph Moench-Tegeder
Дата:
## Ron (ronljohnsonjr@gmail.com):

> The question then is "why am I just now seeing the problem?"  We've been 
> using v12 for two years, and it just happened.
> 
> The only recent change is that I upgraded it from RDS 12.10 to 12.11 a 
> couple of weeks ago.

That's correlation, but no proof for causality.
Now that you've confirmed that you have indeed a mismatch between generic
and custom plan, you could compare those (EXPLAIN EXECUTE) and see
where the generic plan goes wrong. Otherwise, prime suspects are bad
statistics (Were stats accurate before the update? How about now?
Ran VACUUM ANALYZE recently? If not, does it improve matters?)
If would help if you could compare execution plans with plans from
before the update, but I guess you didn't document those?

Regards,
Christoph

-- 
Spare Space



Re: Example code Re: Singleton SELECT inside cursor loop

От
Ron
Дата:
On 10/1/22 14:54, Christoph Moench-Tegeder wrote:
> ## Ron (ronljohnsonjr@gmail.com):
>
>> The question then is "why am I just now seeing the problem?"  We've been
>> using v12 for two years, and it just happened.
>>
>> The only recent change is that I upgraded it from RDS 12.10 to 12.11 a
>> couple of weeks ago.
> That's correlation, but no proof for causality.

Right.  But it is an important change which happened between job runs (the 
22nd of each month).

> Now that you've confirmed that you have indeed a mismatch between generic
> and custom plan, you could compare those (EXPLAIN EXECUTE) and see
> where the generic plan goes wrong.

I'll rerun the EXPLAIN EXECUTE with and without "plan_cache_mode = 
force_custom_plan", and attach them in a reply some time soon.

>   Otherwise, prime suspects are bad
> statistics (Were stats accurate before the update? How about now?
> Ran VACUUM ANALYZE recently?

VACUUM yesterday at 04:00, and ANALYZE at 09:00, both from a cron job.  
Manually ran ANALYZE last night during the dig into the problem.

>   If not, does it improve matters?)

Nope.  Not one bit.

> If would help if you could compare execution plans with plans from
> before the update, but I guess you didn't document those?

So, since ANALYZE did not help.

-- 
Angular momentum makes the world go 'round.



Re: Example code Re: Singleton SELECT inside cursor loop

От
Ron
Дата:
On 10/1/22 15:42, Ron wrote:
On 10/1/22 14:54, Christoph Moench-Tegeder wrote:
## Ron (ronljohnsonjr@gmail.com):

The question then is "why am I just now seeing the problem?"  We've been 
using v12 for two years, and it just happened.

The only recent change is that I upgraded it from RDS 12.10 to 12.11 a 
couple of weeks ago.
That's correlation, but no proof for causality.

Right.  But it is an important change which happened between job runs (the 22nd of each month).

Now that you've confirmed that you have indeed a mismatch between generic
and custom plan, you could compare those (EXPLAIN EXECUTE) and see
where the generic plan goes wrong.

I'll rerun the EXPLAIN EXECUTE with and without "plan_cache_mode = force_custom_plan", and attach them in a reply some time soon.

Attached are explain outputs from: , and then

TASK001785639_explain_output_custom.log: a "first five" fast execution
TASK001785639_explain_output_generic.log: "the sixth" (generic) plan when it took 6 minutes.

Next are similar plans except where default_statistic_target = 1000, and the table is reanalyzed.  It didn't help with this query.

TASK001785639_explain_output_custom_def_stats_1000.log
TASK001785639_explain_output_generic_def_stats_1000.log

Finally is the explain output from "plan_cache_mode = force_custom_plan":
TASK001785639_explain_output_force_custom_def_stats_1000.log

According to meld diff, custom_def_stats_1000 and force_custom_def_stats_1000 have surprisingly similar plans (the only difference being that in the forced custom plan, 2 workers were launched, and so they filtered out some rows.  Execution time was about 460ms in both.


[snip]
If would help if you could compare execution plans with plans from
before the update, but I guess you didn't document those?

So, since ANALYZE did not help.

EDIT: No, since ANALYZE did not help.

--
Angular momentum makes the world go 'round.
Вложения