Обсуждение: Performance issue after migration from 9.4 to 15

Поиск
Список
Период
Сортировка

Performance issue after migration from 9.4 to 15

От
Christian Schröder
Дата:
Hi list,
We have recently migrated from our old PostgreSQL 9.4 server to a PostgreSQL 15 server. Even though the new machine has
moreresources, we see a considerable decrease in the performance of some of our heavier queries, and I have no idea
whereI should start tuning. ? 

Old database: PostgreSQL 9.4.26 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-17),
64-bit
New database: PostgreSQL 15.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit

I tested the same query against the old and the new database.

Old: https://explain.dalibo.com/plan/b7d7gab73f6c7274#plan/node/21
New: https://explain.dalibo.com/plan/84h16cg1f19266be#plan/node/34

Particularly interesting are the sequential scans. In the old plan, we have node #21, which took 32 seconds. Almost all
ofthe time goes into actual I/O. In the new plan, the corresponding node is #34. It took 55 seconds, but less than 1
secondwas actual I/O (because most of the data was already in memory). Why did this step still take about twice the
time?
There is another Seq Scan of the same table. Node #10 in the old plan took 3 seconds, whereas the corresponding node
#21in the new plan took more than 2 minutes (of which less than 2 seconds was actual I/O). 

Am I misreading the plans? If not, I have no idea why the sequential scans take so much longer in the new database,
eventhough the I/O is even faster than before. 

The configuration was left almost unchanged, with only some adjustments due to changes between the versions. As far as
Ican tell, none of these changes is performance related. 

Can anybody give me a hint into which direction I should investigate further?

Thanks,
Christian



RE: Performance issue after migration from 9.4 to 15

От
Christian Schröder
Дата:
Hi,
I will definitely give it a try; however, I do not understand how this could explain the observed performance changes.
Asfar as I understand, an outdated/non-optimized index would not be relevant in a sequential scan, and outdated
statisticscould cause a non-optimal plan to be chosen, but they should not affect the performance of a single
sequentialscan. 

Best,
Christian


From: Inzamam Shafiq <inzamam.shafiq@hotmail.com>
Sent: Tuesday, April 11, 2023 1:25 PM
To: Christian Schröder <Christian.Schroeder@solvians.com>
Subject: Re: Performance issue after migration from 9.4 to 15

Hi,

How did you upgrade the databse? if it is done by pg_upgrade then I think you have to do the following,

Try re-indexing the database
Update/analyze statistics

Regards,

Inzamam Shafiq
Sr. DBA
________________________________________
From: Christian Schröder <mailto:Christian.Schroeder@solvians.com>
Sent: Tuesday, April 11, 2023 3:58 PM
To: mailto:pgsql-general@lists.postgresql.org <mailto:pgsql-general@lists.postgresql.org>
Subject: Performance issue after migration from 9.4 to 15

Hi list,
We have recently migrated from our old PostgreSQL 9.4 server to a PostgreSQL 15 server. Even though the new machine has
moreresources, we see a considerable decrease in the performance of some of our heavier queries, and I have no idea
whereI should start tuning. ? 

Old database: PostgreSQL 9.4.26 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-17),
64-bit
New database: PostgreSQL 15.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit

I tested the same query against the old and the new database.

Old: https://explain.dalibo.com/plan/b7d7gab73f6c7274#plan/node/21
New: https://explain.dalibo.com/plan/84h16cg1f19266be#plan/node/34

Particularly interesting are the sequential scans. In the old plan, we have node #21, which took 32 seconds. Almost all
ofthe time goes into actual I/O. In the new plan, the corresponding node is #34. It took 55 seconds, but less than 1
secondwas actual I/O (because most of the data was already in memory). Why did this step still take about twice the
time?
There is another Seq Scan of the same table. Node #10 in the old plan took 3 seconds, whereas the corresponding node
#21in the new plan took more than 2 minutes (of which less than 2 seconds was actual I/O). 

Am I misreading the plans? If not, I have no idea why the sequential scans take so much longer in the new database,
eventhough the I/O is even faster than before. 

The configuration was left almost unchanged, with only some adjustments due to changes between the versions. As far as
Ican tell, none of these changes is performance related. 

Can anybody give me a hint into which direction I should investigate further?

Thanks,
Christian




Re: Performance issue after migration from 9.4 to 15

От
David Rowley
Дата:
On Tue, 11 Apr 2023 at 23:03, Christian Schröder
<Christian.Schroeder@solvians.com> wrote:
> We have recently migrated from our old PostgreSQL 9.4 server to a PostgreSQL 15 server. Even though the new machine
hasmore resources, we see a considerable decrease in the performance of some of our heavier queries, and I have no idea
whereI should start tuning. ? 

Using pg_upgrade? Did you run ANALYZE? If not then you may be
suffering from lack of statistics leading to bad plans.

> Old: https://explain.dalibo.com/plan/b7d7gab73f6c7274#plan/node/21
> New: https://explain.dalibo.com/plan/84h16cg1f19266be#plan/node/34

I'm not well versed in looking at JSON output plans. You might get a
better overall response posting to https://explain.depesz.com/.
Personally, I always look for raw output, and at least with depesz,
that's text-based.

> Particularly interesting are the sequential scans. In the old plan, we have node #21, which took 32 seconds. Almost
allof the time goes into actual I/O. In the new plan, the corresponding node is #34. It took 55 seconds, but less than
1second was actual I/O (because most of the data was already in memory). Why did this step still take about twice the
time?

Perhaps your 15 server is under more load than 9.4 due to all
concurrent plans being slower from bad statistics? Load averages might
be a good indicator. (I assume the server is busy due to the "Workers
Launched": 0)

> Am I misreading the plans? If not, I have no idea why the sequential scans take so much longer in the new database,
eventhough the I/O is even faster than before. 

Looks that way to me too.

> Can anybody give me a hint into which direction I should investigate further?

Probably just run ANALYZE on the database in question.

David



Re: Performance issue after migration from 9.4 to 15

От
Tom Lane
Дата:
David Rowley <dgrowleyml@gmail.com> writes:
> On Tue, 11 Apr 2023 at 23:03, Christian Schröder
> <Christian.Schroeder@solvians.com> wrote:
>> Particularly interesting are the sequential scans. In the old plan, we have node #21, which took 32 seconds. Almost
allof the time goes into actual I/O. In the new plan, the corresponding node is #34. It took 55 seconds, but less than
1second was actual I/O (because most of the data was already in memory). Why did this step still take about twice the
time?

> Perhaps your 15 server is under more load than 9.4 due to all
> concurrent plans being slower from bad statistics? Load averages might
> be a good indicator. (I assume the server is busy due to the "Workers
> Launched": 0)

I think the extra time is due to useless overhead from trying and failing
to parallelize: the leader has to do all the work, but there's probably
overhead added anyway.  9.4 of course knew nothing of parallelism.

My guess is that the OP is trying to run with a large number of backends
and has not raised the max number of parallel workers to match.  It
does look like the stats might need updating (since 9.4's rowcount
estimate is OK and 15's less so) but that is not why we see "Workers
planned: 2, Workers launched: 0".  Either provision enough parallel
workers to fix that, or disable parallelism.

            regards, tom lane



RE: Performance issue after migration from 9.4 to 15

От
Christian Schröder
Дата:
Hi all,
Thanks for all your tips!
I have meanwhile run "ANALYZE" on all tables. Either that was as important as everybody tells me 😉 or it is just a
coincidence(e.g., fewer other queries on the database). In any case, the performance has indeed improved considerably.
Iwill observe the situation tomorrow in the morning, when there is typically more load on the system.
 
I also noticed that, after analyzing, the sequential scan node shows fewer columns in the "output" section
(https://explain.dalibo.com/plan/b8bfa5a3d2dc33bc#plan/node/18).I am not sure if this has an impact on the performance
ofthis particular operation.
 
I will also follow the suggestion and increase the number of parallel workers. The new machine has enough cores, so
thisshould be possible.
 

Best,
Christian

-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us> 
Sent: Tuesday, April 11, 2023 4:09 PM
To: David Rowley <dgrowleyml@gmail.com>
Cc: Christian Schröder <Christian.Schroeder@solvians.com>; pgsql-general@lists.postgresql.org
Subject: Re: Performance issue after migration from 9.4 to 15

David Rowley <dgrowleyml@gmail.com> writes:
> On Tue, 11 Apr 2023 at 23:03, Christian Schröder 
> <Christian.Schroeder@solvians.com> wrote:
>> Particularly interesting are the sequential scans. In the old plan, we have node #21, which took 32 seconds. Almost
allof the time goes into actual I/O. In the new plan, the corresponding node is #34. It took 55 seconds, but less than
1second was actual I/O (because most of the data was already in memory). Why did this step still take about twice the
time?

> Perhaps your 15 server is under more load than 9.4 due to all 
> concurrent plans being slower from bad statistics? Load averages might 
> be a good indicator. (I assume the server is busy due to the "Workers
> Launched": 0)

I think the extra time is due to useless overhead from trying and failing to parallelize: the leader has to do all the
work,but there's probably overhead added anyway.  9.4 of course knew nothing of parallelism.
 

My guess is that the OP is trying to run with a large number of backends and has not raised the max number of parallel
workersto match.  It does look like the stats might need updating (since 9.4's rowcount estimate is OK and 15's less
so)but that is not why we see "Workers
 
planned: 2, Workers launched: 0".  Either provision enough parallel workers to fix that, or disable parallelism.

            regards, tom lane

Re: Performance issue after migration from 9.4 to 15

От
Adrian Klaver
Дата:
On 4/11/23 07:41, Christian Schröder wrote:
> Hi all,
> Thanks for all your tips!
> I have meanwhile run "ANALYZE" on all tables. Either that was as important as everybody tells me 😉 or it is just a
coincidence(e.g., fewer other queries on the database). 
 

It is:

https://www.postgresql.org/docs/current/pgupgrade.html


Statistics

Because optimizer statistics are not transferred by pg_upgrade, you will 
be instructed to run a command to regenerate that information at the end 
of the upgrade. You might need to set connection parameters to match 
your new cluster.




-- 
Adrian Klaver
adrian.klaver@aklaver.com