BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
От | PG Bug reporting form |
---|---|
Тема | BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue) |
Дата | |
Msg-id | 17717-6c50eb1c7d23a886@postgresql.org обсуждение исходный текст |
Ответы |
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
|
Список | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 17717 Logged by: Gunnar L Email address: postgresql@taljaren.se PostgreSQL version: 15.0 Operating system: Ubuntu Linux Description: We have observed a significant slowdown in vacuumdb performance between different versions of postgresql. And possibly also a memory issue. We run a specific data model, where each customer has its own schema with its own set of tables. Each database server hosts 16 databases, each containing around 250 customer schemas. Due to postgres creating a new file for each database object, we end up with around 5 million files on each database server. This may or may not be related to the issue we're seeing (new algorithms with new time complexity?) We upgraded from postgresql 9.5 to postgresql 13, and noticed a significant slowdown in how vacuumdb performs. Before, we could run a vacuumdb -a -z each night, taking around 2 hours to complete. After the upgrade, we see a constant 100% CPU utilization during the vacuumdb process (almost no I/O activity), and vacuumdb cannot complete within a reasonable time. We're able to vacuum about 3-4 databases each night. We are able to recreate this issue, using a simple bash script to generate a similar setup. From local testing, here are our findings: Concerning speed: * Version 9.5, 10, 11 are fast (9.5 slower than 10 and 11) * Version 12, 13, 14 are very, very slow * Version 15 is faster (a lot faster than 12,13,14) but not nearly as fast as 10 or 11. Concerning memory usage: * Version 15 is using a lot more shared memory OR it might not be releasing it properly after vacuuming a db. These are the timings for vacuuming the 16 dbs. Version Seconds Completed ------------------------------ 9.5 412 16/16 10 178 16/16 11 166 16/16 12 8319 1/16 or 2/16 (manually aborted) 13 18853 3/16 or 4/16 (manually aborted) 14 16857 3/16 or 4/16 (manually aborted) 15 617 1/16 (crashed!) 15 4158 6/16 (crashed! --shm-size=256mb) 15 9500 16/16 (--shm-size=4096mb) The timing of the only successful run for postgres 15 is somewhat flaky, since the machine was suspended for about 1-1.5 hours so 9500 is only an estimate, but the first run (1 db completed in 10 minutes) gives that it is faster than 12-14 but slower than 10 and 11 (3 minutes to complete everything) The following describes our setup This is the script (called setup.sh) we’re using to populate the databases (we give a port number as parameter) ##### start of setup.sh export PGPASSWORD=mysecretpassword PORT=$1 echo ""> tables_$PORT.sql for schema in `seq -w 1 250`; do echo "create schema schema$schema;" >> tables_$PORT.sql for table in `seq -w 1 500`; do echo "create table schema$schema.table$table (id int);" >> tables_$PORT.sql done done echo "Setting up db: 01" createdb -h localhost -U postgres -p $PORT db01 psql -q -h localhost -U postgres -p $PORT db01 -f tables_$PORT.sql # This seems to be the fastest way to create the databases for db in `seq -w 2 16`; do echo "Setting up db: $db" createdb -h localhost -U postgres -p $PORT --template db01 db$db done ####### end of setup.sh To execute a test for a particular postgres version (in this example PG 9.5), we run the following. It will setup PG 9.5 on port 15432. docker run --rm --name pg95 -e POSTGRES_PASSWORD=mysecretpassword -p 15432:5432 -d postgres:9.5 ./setup.sh 15432 date; time docker exec -it pg95 bash -c "vacuumdb -a -z -U postgres"; date (The date commands are added to keep track of when tasks were started). Here are complete set of commands and output and comments (We use different ports for different versions of PG) date; time docker exec -it pg95 bash -c "vacuumdb -a -z -U postgres"; date (The date commands since it takes some time to run) time docker exec -it pg95 bash -c "vacuumdb -a -z -U postgres" vacuumdb: vacuuming database "db01" …<snip>... vacuumdb: vacuuming database "db16" vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" real 6m52,070s user 0m0,048s sys 0m0,029s time docker exec -it pg10 bash -c "vacuumdb -a -z -U postgres" vacuumdb: vacuuming database "db01" …<snip>... vacuumdb: vacuuming database "db16" vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" real 2m58,354s user 0m0,043s sys 0m0,013s time docker exec -it pg11 bash -c "vacuumdb -a -z -U postgres" vacuumdb: vacuuming database "db01" …<snip>... vacuumdb: vacuuming database "db16" vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" real 2m46,181s user 0m0,047s sys 0m0,012s date; time docker exec -it pg12 bash -c "vacuumdb -a -z -U postgres"; date lör 10 dec 2022 18:57:43 CET vacuumdb: vacuuming database "db01" vacuumdb: vacuuming database "db02" ^CCancel request sent vacuumdb: error: vacuuming of table "schema241.table177" in database "db02" failed: ERROR: canceling statement due to user request real 138m39,600s user 0m0,177s sys 0m0,418s lör 10 dec 2022 21:16:22 CET date;time docker exec -it pg13 bash -c "vacuumdb -a -z -U postgres" lör 10 dec 2022 07:22:32 CET vacuumdb: vacuuming database "db01" vacuumdb: vacuuming database "db02" vacuumdb: vacuuming database "db03" vacuumdb: vacuuming database "db04" ^CCancel request sent real 314m13,172s user 0m0,551s sys 0m0,663s lör 10 dec 2022 12:37:03 CET date;time docker exec -it pg14 bash -c "vacuumdb -a -z -U postgres"; date lör 10 dec 2022 14:15:37 CET vacuumdb: vacuuming database "db01" vacuumdb: vacuuming database "db02" vacuumdb: vacuuming database "db03" vacuumdb: vacuuming database "db04" ^CCancel request sent real 280m57,172s user 0m0,586s sys 0m0,559s lör 10 dec 2022 18:56:34 CET date;time docker exec -it pg15 bash -c "vacuumdb -a -z -U postgres"; date lör 10 dec 2022 12:50:25 CET vacuumdb: vacuuming database "db01" vacuumdb: vacuuming database "db02" vacuumdb: error: processing of database "db02" failed: ERROR: could not resize shared memory segment "/PostgreSQL.2952321776" to 27894720 bytes: No space left on device real 10m17,913s user 0m0,030s sys 0m0,049s lör 10 dec 2022 13:00:43 CET # it was faster, but we need to extend shared memory to make it work docker run --rm --name pg15 --shm-size=256mb -e POSTGRES_PASSWORD=mysecretpassword -p 55555:5432 -d postgres:15 date;time docker exec -it pg15 bash -c "vacuumdb -a -z -U postgres"; date mån 12 dec 2022 08:56:17 CET vacuumdb: vacuuming database "db01" …<snip>... vacuumdb: vacuuming database "db07" vacuumdb: error: processing of database "db07" failed: ERROR: could not resize shared memory segment "/PostgreSQL.1003084622" to 27894720 bytes: No space left on device real 69m18,345s user 0m0,217s sys 0m0,086s mån 12 dec 2022 10:05:36 CET docker run --rm --name pg15 --shm-size=4096mb -e POSTGRES_PASSWORD=mysecretpassword -p 55555:5432 -d postgres:15 date;time docker exec -it pg15 bash -c "vacuumdb -a -z -U postgres"; date mån 12 dec 2022 11:16:11 CET vacuumdb: vacuuming database "db01" …<snip>... vacuumdb: vacuuming database "db16" vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" real 232m46,168s user 0m0,227s sys 0m0,467s mån 12 dec 2022 15:08:57 CET Here is the hardware that was used AMD Ryzen 7 PRO 5850U with Radeon Graphics 8 Cores, 16 threads $ free total used free shared buff/cache available Mem: 28562376 5549716 752624 1088488 22260036 21499752 Swap: 999420 325792 673628 Disk: NVMe device, Samsung SSD 980 1TB
В списке pgsql-bugs по дате отправления:
Предыдущее
От: PG Bug reporting formДата:
Сообщение: BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG
Следующее
От: Bowen ShiДата:
Сообщение: Re: BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG