Thread: VACUUM FULL ANALYSE hanging
Hi guys,
I am having problems freeing disk space after a massive delete operation on a table that had approximately 80 million records. I ran the following command, after setting the vacuum memory to approximately a gigabyte:
SET vacuum_mem TO 1024000
VACUUM FULL ANALYSE VERBOSE oltp.requests
Here is what I get:
There were 34221203 unused item pointers.
Total free space (including removable row versions) is 8969616624 bytes.
1129827 pages are or will become empty, including 0 at the end of the table.
1337307 pages containing 8964065020 free bytes are potential move destinations.
CPU 16.03s/9.12u sec elapsed 219.47 sec.
INFO: index "requests_pkey" now contains 20075362 row versions in 327419 pages
DETAIL: 8211835 index row versions were removed.
217782 index pages have been deleted, 217782 are currently reusable.
CPU 5.38s/12.53u sec elapsed 100.80 sec.
INFO: index "idx_oltp_requests_access_date" now contains 20075362 row versions in 491725 pages
DETAIL: 8211835 index row versions were removed.
426501 index pages have been deleted, 426501 are currently reusable.
CPU 14.96s/13.47u sec elapsed 200.91 sec.
INFO: index "idx_oltp_requests_access_time" now contains 20075362 row versions in 343915 pages
DETAIL: 8211835 index row versions were removed.
213612 index pages have been deleted, 213612 are currently reusable.
CPU 6.32s/14.03u sec elapsed 111.24 sec.
INFO: index "idx_oltp_requests_referer" now contains 20075362 row versions in 470822 pages
DETAIL: 8211835 index row versions were removed.
376873 index pages have been deleted, 376873 are currently reusable.
CPU 18.85s/17.18u sec elapsed 265.25 sec.
INFO: index "idx_oltp_requests_session" now contains 20075362 row versions in 611141 pages
DETAIL: 8211835 index row versions were removed.
478827 index pages have been deleted, 478827 are currently reusable.
CPU 16.83s/14.33u sec elapsed 258.47 sec.
INFO: index "idx_oltp_requests_status_code" now contains 20075362 row versions in 690337 pages
DETAIL: 8211835 index row versions were removed.
600953 index pages have been deleted, 600953 are currently reusable.
CPU 34.37s/24.44u sec elapsed 297.21 sec.
INFO: index "idx_oltp_requests_url" now contains 20075362 row versions in 336075 pages
DETAIL: 8211835 index row versions were removed.
73821 index pages have been deleted, 73821 are currently reusable.
CPU 17.06s/28.14u sec elapsed 319.16 sec.
But here, the command simply hangs.
The table description is:
htminer=> \d oltp.requests
Table "oltp.requests"
Column | Type | Modifiers
--------------------+-----------------------------+--------------------
id_request | integer | not null
id_session | integer |
access_time | timestamp(0) with time zone | not null
request_method | numeric(2,0) | not null default 1
http_version_major | numeric(1,0) | not null default 1
http_version_minor | numeric(1,0) | not null default 1
status_code | numeric(3,0) | not null
bytes | integer |
time_taken | numeric(3,0) |
id_url | integer | not null
id_referer | integer |
content_language | character(2) |
dwell_time | smallint | not null default 1
request_type | numeric(1,0) |
Indexes:
"requests_pkey" PRIMARY KEY, btree (id_request), tablespace "htminer_oltp"
"idx_oltp_requests_access_date" btree (date_trunc('day'::text, timezone('UTC'::text, access_time))), tablespace "htminer_oltp"
"idx_oltp_requests_access_time" btree (access_time), tablespace "htminer_oltp"
"idx_oltp_requests_referer" btree (id_referer), tablespace "htminer_oltp"
"idx_oltp_requests_session" btree (id_session, status_code), tablespace "htminer_oltp"
"idx_oltp_requests_status_code" btree (status_code), tablespace "htminer_oltp"
"idx_oltp_requests_url" btree (id_url), tablespace "htminer_oltp"
Check constraints:
"requests_bytes_check" CHECK (bytes >= 0)
"requests_dwell_time_check" CHECK (dwell_time >= 0)
"requests_id_request_check" CHECK (id_request > 0)
"requests_request_method_check" CHECK (request_method > 0::numeric)
"requests_request_type_check" CHECK (request_type > 0::numeric)
Foreign-key constraints:
"requests_id_referer_fkey" FOREIGN KEY (id_referer) REFERENCES oltp.urls(id_url) ON UPDATE CASCADE ON DELETE CASCADE
"requests_id_session_fkey" FOREIGN KEY (id_session) REFERENCES oltp.sessions(id_session) ON UPDATE CASCADE ON DELETE CASCADE
"requests_id_url_fkey" FOREIGN KEY (id_url) REFERENCES oltp.urls(id_url) ON UPDATE CASCADE ON DELETE CASCADE
Tablespace: "htminer_oltp"
Should I drop the indexes and re-create them?
Thanks,
Gabriele
--
Gabriele Bartolini - Istruttore Informatico - Comune di Prato
Sistema Informativo - Servizi di E-Government e Open-Source
I've noticed in my own experiments and experience with VACUUM FULL that it tries to reindex all the indexes to compress them. While a good idea, this unfortunately takes a *long* time.

You should check two things. First, the server CPU usage should be high (~100% if on a single core). Second, check the contents of the pg_locks view. It should hold exclusive locks on all the indexes (though it's keyed by oid, so you might have to check pg_class or something else to get the actual table/index name). If it's truly hanging or deadlocked, the locks won't be granted, and the CPU usage will be low.

In my experience, doing a dump/restore was far faster, but that method creates downtime (e.g. a 10 GB database took 2 hours to restore, while reindexing/vacuuming full was still on the same table after 4 hours). If anyone can shed some light on why reindexing/vacuuming full takes so long, I'd like to know.

Peter

Gabriele Bartolini wrote:
> [original message and VACUUM VERBOSE output quoted above; snipped]
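As a concrete form of the pg_locks check described above, a query along these lines can surface blocked lock requests (a sketch against the 8.x-era pg_locks layout; it needs a live server and suitable privileges):

```sql
-- Any lock request that has not yet been granted shows up here;
-- a truly blocked or deadlocked VACUUM would appear with granted = f.
SELECT pid, locktype, relation, mode, granted
FROM pg_locks
WHERE NOT granted;
```

An empty result means every lock has been granted and the backend is most likely just working slowly, not waiting.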
On Wed, 2007-05-02 at 05:05, Gabriele Bartolini wrote:
> I am having problems with freeing disk space after a massive delete
> operation on a table that had approximately 80 million records. [...]
> But here, the command simply hangs.

Maybe it's hanging, maybe it's just taking a really long time to finish and get back to you. What do vmstat / top / iostat say about the machine during this "hang"? If the I/O or CPU usage is high, then it's not hung, it's just taking a while.
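An in-database complement to those OS-level tools, for when shell access to the server is not available (this query is not from the thread; the column names follow the 8.x-era pg_stat_activity and were renamed in later releases):

```sql
-- Check whether the VACUUM backend is still registered and what it reports;
-- on 8.x the columns are procpid/current_query, on 9.2+ they are pid/query.
SELECT procpid, current_query, query_start
FROM pg_stat_activity
WHERE current_query LIKE 'VACUUM%';
```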
Peter Koczan wrote:
> I've noticed in my own experiments and experiences with VACUUM FULL
> that it tries to reindex all the indexes to compress them. While a
> good idea, this unfortunately takes a *long* time.

Huh, this is not an accurate description of what happens. VACUUM FULL tries to keep indexes up to date, but it doesn't *reindex* them (the way a REINDEX command would). There was another thread about VACUUM FULL in which Tom Lane described exactly what happens to indexes.

> In my experiences, doing a dump/restore was far faster, but this method
> creates downtime.

It has been reported that CLUSTER is also faster than VACUUM FULL, and it leaves the indexes more compact than VACUUM FULL to boot. So it may be the best option. It is also easier to do than a dump/restore.

--
Alvaro Herrera                      http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
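For the table in this thread, the CLUSTER route would look something like the following (illustrative only; the choice of index is arbitrary here, and note that CLUSTER also takes an exclusive lock and needs enough free disk space for a complete copy of the table):

```sql
-- Rewrite oltp.requests in primary-key order, which rebuilds every index.
-- 8.x syntax is "CLUSTER indexname ON tablename"; newer releases use
-- "CLUSTER tablename USING indexname" instead.
CLUSTER requests_pkey ON oltp.requests;

-- Refresh planner statistics afterwards, since CLUSTER does not analyze.
ANALYZE oltp.requests;
```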
Hi guys,

thanks for your answers. Unfortunately I have no superuser privileges on that server, but I looked into it closely with our administrator today. I launched the VACUUM FULL ANALYSE command last night on a table, and it is still running. Now I want to let it finish and see how long it took; it is a test I want to do.

The process is not idle; we have 'straced' it, and here is the output:

postgres  9178  9.2 10.9 527384 443476 ?  Ds  May03  75:37  \_ postgres: htminer htminer [local] VACUUM

> strace -i -p 9178
Process 9178 attached - interrupt to quit
[ 339e0c6902] lseek(55, 1067171840, SEEK_SET) = 1067171840
[ 339e0b9302] write(55, "\317\0\0\0h\37\353\331\1\0\0\0l\1\220\32\360\37\3 \340"..., 8192) = 8192
[ 339e0c6902] lseek(31, 31145984, SEEK_SET) = 31145984
[ 339e0b9272] read(31, "m\0\0\0\320\22V\36\1\0\0\0\240\5\310\t\360\37\3 \340\237"..., 8192) = 8192
[ 339e0c6902] lseek(31, 31309824, SEEK_SET) = 31309824
[ 339e0b9272] read(31, "\317\0\0\0\310\26\353\331\1\0\0\0\350\4\240\f\360\37\3"..., 8192) = 8192
[ 339e0c6902] lseek(32, 667787264, SEEK_SET) = 667787264
[ 339e0b9272] read(32, "\301\0\0\0PI\t\27\1\0\0\0\34\0\330\37\360\37\3 \350\237"..., 8192) = 8192
[ 339e0c6902] lseek(34, 157900800, SEEK_SET) = 157900800
[ 339e0b9272] read(34, "\317\0\0\0P\3725\327\1\0\0\0\260\5\210\t\360\37\3 \350"..., 8192) = 8192

If you are interested - and it may be useful for comparison - I will let you know the results.
The tables are locked (I looked at pg_locks):

htminer=> select * from pg_locks where pid = 9178;
   locktype    | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction | pid  |        mode         | granted
---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+------+---------------------+---------
 relation      |    16415 |   199381 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199381 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   199385 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199385 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 transactionid |          |          |      |       |      64899160 |         |       |          |    64899160 | 9178 | ExclusiveLock       | t
 relation      |    16415 |   199142 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199142 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   199382 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199382 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   199383 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199383 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   199386 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199386 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   198716 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   199384 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199384 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t

All the locks are granted, so it should be all right. I am waiting for the results. Thank you for your nice answers.
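The bare relation oids in that output can be mapped back to table and index names via pg_class; a sketch (the pid 9178 is the one from this thread and will differ on any other system):

```sql
-- Resolve the vacuum backend's locked relation oids to names.
SELECT l.relation, c.relname, l.mode, l.granted
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
WHERE l.pid = 9178
ORDER BY c.relname, l.mode;
```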
Ciao,
Gabriele

--
Gabriele Bartolini - Istruttore Informatico - Comune di Prato
Sistema Informativo - Servizi di E-Government e Open-Source
g.bartolini@comune.prato.it - www.comune.prato.it - www.htminer.it

-----Original message-----
From: Scott Marlowe [mailto:smarlowe@g2switchworks.com]
Sent: Wednesday, 2 May 2007, 17:41
To: Gabriele Bartolini
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] VACUUM FULL ANALYSE hanging

> [Scott Marlowe's reply quoted above; snipped]