Thread: VACUUM FULL ANALYSE hanging

VACUUM FULL ANALYSE hanging

From:
"Gabriele Bartolini"
Date:
Hi guys,
 
   I am having problems with freeing disk space after a massive delete operation on a table that had approximately 80 million records. I ran the following command, after setting vacuum memory to approximately one gigabyte:
 
SET vacuum_mem TO 1024000
VACUUM FULL ANALYSE VERBOSE oltp.requests
 
Here is what I get:
 
There were 34221203 unused item pointers.
Total free space (including removable row versions) is 8969616624 bytes.
1129827 pages are or will become empty, including 0 at the end of the table.
1337307 pages containing 8964065020 free bytes are potential move destinations.
CPU 16.03s/9.12u sec elapsed 219.47 sec.
INFO:  index "requests_pkey" now contains 20075362 row versions in 327419 pages
DETAIL:  8211835 index row versions were removed.
217782 index pages have been deleted, 217782 are currently reusable.
CPU 5.38s/12.53u sec elapsed 100.80 sec.
INFO:  index "idx_oltp_requests_access_date" now contains 20075362 row versions in 491725 pages
DETAIL:  8211835 index row versions were removed.
426501 index pages have been deleted, 426501 are currently reusable.
CPU 14.96s/13.47u sec elapsed 200.91 sec.
INFO:  index "idx_oltp_requests_access_time" now contains 20075362 row versions in 343915 pages
DETAIL:  8211835 index row versions were removed.
213612 index pages have been deleted, 213612 are currently reusable.
CPU 6.32s/14.03u sec elapsed 111.24 sec.
INFO:  index "idx_oltp_requests_referer" now contains 20075362 row versions in 470822 pages
DETAIL:  8211835 index row versions were removed.
376873 index pages have been deleted, 376873 are currently reusable.
CPU 18.85s/17.18u sec elapsed 265.25 sec.
INFO:  index "idx_oltp_requests_session" now contains 20075362 row versions in 611141 pages
DETAIL:  8211835 index row versions were removed.
478827 index pages have been deleted, 478827 are currently reusable.
CPU 16.83s/14.33u sec elapsed 258.47 sec.
INFO:  index "idx_oltp_requests_status_code" now contains 20075362 row versions in 690337 pages
DETAIL:  8211835 index row versions were removed.
600953 index pages have been deleted, 600953 are currently reusable.
CPU 34.37s/24.44u sec elapsed 297.21 sec.
INFO:  index "idx_oltp_requests_url" now contains 20075362 row versions in 336075 pages
DETAIL:  8211835 index row versions were removed.
73821 index pages have been deleted, 73821 are currently reusable.
CPU 17.06s/28.14u sec elapsed 319.16 sec.
   But here, the command simply hangs.
 
   The table description is:
 
htminer=> \d oltp.requests
                         Table "oltp.requests"
       Column       |            Type             |      Modifiers
--------------------+-----------------------------+--------------------
 id_request         | integer                     | not null
 id_session         | integer                     |
 access_time        | timestamp(0) with time zone | not null
 request_method     | numeric(2,0)                | not null default 1
 http_version_major | numeric(1,0)                | not null default 1
 http_version_minor | numeric(1,0)                | not null default 1
 status_code        | numeric(3,0)                | not null
 bytes              | integer                     |
 time_taken         | numeric(3,0)                |
 id_url             | integer                     | not null
 id_referer         | integer                     |
 content_language   | character(2)                |
 dwell_time         | smallint                    | not null default 1
 request_type       | numeric(1,0)                |
Indexes:
    "requests_pkey" PRIMARY KEY, btree (id_request), tablespace "htminer_oltp"
    "idx_oltp_requests_access_date" btree (date_trunc('day'::text, timezone('UTC'::text, access_time))), tablespace "htminer_oltp"
    "idx_oltp_requests_access_time" btree (access_time), tablespace "htminer_oltp"
    "idx_oltp_requests_referer" btree (id_referer), tablespace "htminer_oltp"
    "idx_oltp_requests_session" btree (id_session, status_code), tablespace "htminer_oltp"
    "idx_oltp_requests_status_code" btree (status_code), tablespace "htminer_oltp"
    "idx_oltp_requests_url" btree (id_url), tablespace "htminer_oltp"
Check constraints:
    "requests_bytes_check" CHECK (bytes >= 0)
    "requests_dwell_time_check" CHECK (dwell_time >= 0)
    "requests_id_request_check" CHECK (id_request > 0)
    "requests_request_method_check" CHECK (request_method > 0::numeric)
    "requests_request_type_check" CHECK (request_type > 0::numeric)
Foreign-key constraints:
    "requests_id_referer_fkey" FOREIGN KEY (id_referer) REFERENCES oltp.urls(id_url) ON UPDATE CASCADE ON DELETE CASCADE
    "requests_id_session_fkey" FOREIGN KEY (id_session) REFERENCES oltp.sessions(id_session) ON UPDATE CASCADE ON DELETE CASCADE
    "requests_id_url_fkey" FOREIGN KEY (id_url) REFERENCES oltp.urls(id_url) ON UPDATE CASCADE ON DELETE CASCADE
Tablespace: "htminer_oltp"

Should I drop the indexes and re-create them?
 
Thanks,
Gabriele
 
--
Gabriele Bartolini - Istruttore Informatico - Comune di Prato
Sistema Informativo - Servizi di E-Government e Open-Source
 

Re: VACUUM FULL ANALYSE hanging

From:
Peter Koczan
Date:
I've noticed in my own experiments and experience with VACUUM FULL that
it tries to reindex all the indexes to compress them. While that is a
good idea, it unfortunately takes a *long* time.

You should check two things. First, the server CPU usage should be high
(~100% if on a single core). Second, check the contents of the pg_locks
view. It should hold exclusive locks on all the indexes (though pg_locks
reports relations by OID, so you may have to look them up in pg_class to
get the actual table/index names). If it's truly hanging/deadlocking,
the locks won't be granted, and the CPU usage will be low.
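
For example, a query along these lines (just a sketch; the pid value is
a placeholder for whatever backend is running the VACUUM) maps the OIDs
back to names and shows whether each lock was granted:

  -- Resolve the relation OIDs held by one backend to table/index names.
  -- Replace 12345 with the pid of the VACUUM backend.
  SELECT c.relname, l.mode, l.granted
    FROM pg_locks l
    JOIN pg_class c ON c.oid = l.relation
   WHERE l.pid = 12345
   ORDER BY c.relname;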

In my experience, doing a dump/restore was far faster, but this method
means downtime (e.g. a 10 GB database took 2 hours to restore, while
reindexing/vacuuming full was still working on the same table after 4
hours).

If anyone can shed some light on why reindexing/vacuuming full takes so
long, I'd like to know.

Peter

Gabriele Bartolini wrote:
> Hi guys,
>
>    I am having problems with freeing disk space after a massive delete
> operation on a table that had approximately 80 million records. [...]
>
>    But here, the command simply hangs.
>
> Should I drop the indexes and re-create them?


Re: VACUUM FULL ANALYSE hanging

From:
Scott Marlowe
Date:
On Wed, 2007-05-02 at 05:05, Gabriele Bartolini wrote:
> Hi guys,
>
>    I am having problems with freeing disk space after a massive delete
> operation on a table that had approximately 80 million record. I
> ran the following command, by setting the vacuum memory to
> approximately a GigaByte:
>
> SET vacuum_mem TO 1024000
> VACUUM FULL ANALYSE VERBOSE oltp.requests
>
> Here is what I get:

SNIP

> DETAIL:  8211835 index row versions were removed.
> 73821 index pages have been deleted, 73821 are currently reusable.
> CPU 17.06s/28.14u sec elapsed 319.16 sec.
>
>    But here, the command simply hangs.
>
>    The table description is:

Maybe it's hanging, maybe it's just taking a really long time to finish
and get back to you.

What do vmstat / top / iostat say about the machine during this
"hang"?

If the I/O or CPU usage is high, then it's not hung, it's just taking a
while.
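
If you can't get at the OS tools, a rough in-database check is possible
too (just a sketch, not a substitute for vmstat/iostat; on the 8.x
releases current at the time the pid column is called procpid, and the
query text only shows up if stats_command_string is on):

  -- Confirm the VACUUM backend is still there and when its command
  -- started; pair this with a pg_locks check that nothing is ungranted.
  SELECT procpid, query_start, current_query
    FROM pg_stat_activity
   WHERE current_query LIKE 'VACUUM%';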

Re: VACUUM FULL ANALYSE hanging

From:
Alvaro Herrera
Date:
Peter Koczan wrote:
> I've noticed in my own experiments and experiences with VACUUM FULL that
> it tries to reindex all the indexes to compress them. While a good idea,
> this unfortunately takes a *long* time.

Huh, this is not an accurate description of what happens.  VACUUM FULL
tries to keep indexes up to date, but it doesn't *reindex* them (like a
REINDEX command would do).

There was another thread about VACUUM FULL in which Tom Lane described
exactly what happened to indexes.

> In my experiences, doing a dump/restore was far faster, but this method
> creates downtime. (e.g. a 10 GB database took 2 hours to restore, while
> reindexing/vacuuming full was still on the same table after 4 hours).

It has been reported that CLUSTER is also faster than VACUUM FULL, and
it leaves the indexes more compact than VACUUM FULL to boot.  So it may
be the best option.  It is also easier to do than dump/restore.
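
For the table in this thread, a minimal sketch of that approach (using
the pre-8.3 "CLUSTER index ON table" syntax and the primary key as the
ordering index; like VACUUM FULL it takes an exclusive lock, and it
needs roughly the table's size in free disk space for the new copy)
would be:

  -- Rewrite oltp.requests in requests_pkey order; this rebuilds the
  -- table and all of its indexes, reclaiming dead space in one pass.
  CLUSTER requests_pkey ON oltp.requests;

  -- CLUSTER does not update planner statistics, so analyse afterwards.
  ANALYSE oltp.requests;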

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: VACUUM FULL ANALYSE hanging

From:
"Gabriele Bartolini"
Date:
Hi guys,

   Thanks for your answers. Unfortunately I have no superuser privileges on that server, but I looked into it
closely with our administrator today. I launched the VACUUM FULL ANALYSE command last night on a table, and it is
still running.

   Now, I want to let it finish and see how long it takes; it is a test I want to do. The process is not idle - we
have straced it, and here is the output:

postgres  9178  9.2 10.9 527384 443476 ?     Ds   May03  75:37  \_ postgres: htminer htminer [local] VACUUM

> strace -i -p 9178
Process 9178 attached - interrupt to quit
[      339e0c6902] lseek(55, 1067171840, SEEK_SET) = 1067171840
[      339e0b9302] write(55, "\317\0\0\0h\37\353\331\1\0\0\0l\1\220\32\360\37\3 \340"..., 8192) = 8192
[      339e0c6902] lseek(31, 31145984, SEEK_SET) = 31145984
[      339e0b9272] read(31, "m\0\0\0\320\22V\36\1\0\0\0\240\5\310\t\360\37\3 \340\237"..., 8192) = 8192
[      339e0c6902] lseek(31, 31309824, SEEK_SET) = 31309824
[      339e0b9272] read(31, "\317\0\0\0\310\26\353\331\1\0\0\0\350\4\240\f\360\37\3"..., 8192) = 8192
[      339e0c6902] lseek(32, 667787264, SEEK_SET) = 667787264
[      339e0b9272] read(32, "\301\0\0\0PI\t\27\1\0\0\0\34\0\330\37\360\37\3 \350\237"..., 8192) = 8192
[      339e0c6902] lseek(34, 157900800, SEEK_SET) = 157900800
[      339e0b9272] read(34, "\317\0\0\0P\3725\327\1\0\0\0\260\5\210\t\360\37\3 \350"..., 8192) = 8192

   If you are interested - and it may be useful for comparison - I will let you know the results.

   The tables are locked (I looked at pg_locks):

htminer=> select * from pg_locks where pid = 9178;
   locktype    | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction | pid  |        mode         | granted
---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+------+---------------------+---------
 relation      |    16415 |   199381 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199381 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   199385 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199385 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 transactionid |          |          |      |       |      64899160 |         |       |          |    64899160 | 9178 | ExclusiveLock       | t
 relation      |    16415 |   199142 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199142 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   199382 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199382 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   199383 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199383 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   199386 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199386 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   198716 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t
 relation      |    16415 |   199384 |      |       |               |         |       |          |    64899160 | 9178 | RowExclusiveLock    | t
 relation      |    16415 |   199384 |      |       |               |         |       |          |    64899160 | 9178 | AccessExclusiveLock | t

   It all looks alright. I am waiting for the results. Thank you for your helpful answers.

Ciao,
Gabriele

--
Gabriele Bartolini - Istruttore Informatico - Comune di Prato
Sistema Informativo - Servizi di E-Government e Open-Source
g.bartolini@comune.prato.it - www.comune.prato.it - www.htminer.it
