Обсуждение: Vacuum very big table - how the full vacuum works in background/internally?

Поиск
Список
Период
Сортировка

Vacuum very big table - how the full vacuum works in background/internally?

От
Durumdara
Дата:
Hello!

We stored some binaries in largeobjects.
Because of that the table size now 80 GB.
We deleted 80% of records (lo_unlink), and autovacuum reclaimed space for new elements. So the table don't grow anymore, but we have to get more space in this server.

We can delete 99% of these records, but for really reclaim free space in HDD we need to run full vacuum.

For this operation we need to know how the PGSQL vacuum works in the background.

Some of admins said to us that:
a.) It copies the table fully (minium 66 GB space needed).
b.) Then it deletes the unneeded data.
In this case we need extra empty space in a temporary period, and more time (the copy of 66 GB could be slow in SSD too).

The DBISAM/ElevateDB, ZIP file deletion, VirtualBOX VDI Compact works as:
a.) It locks the original file/table.
b.) Copy remaining elements to new (first empty) file.
c.) Then it removes old file, and use new.
In this case we need only very limited empty space (3-4 GB), and the operation is much faster (because of less HDD operation).

Please help me, how the PGSQL full vacuum works internally? (1., 2. case, or something else)

How we (and the clients) prepare to this operation?
We must know it to avoid disk out problems, and too much off-time.

Thank you for your help!

Best regards
   dd

Re: Vacuum very big table - how the full vacuum works in background/internally?

От
Luca Ferrari
Дата:
On Thu, Oct 17, 2019 at 5:10 PM Durumdara <durumdara@gmail.com> wrote:
> Please help me, how the PGSQL full vacuum works internally? (1., 2. case, or something else)
>

As far as I know a VACUUM FULL will rewrite the whole table without
inserting (and therefore deleting later) not visible tuples. I would
be quite surprised of it duplicating the table and removing after the
tuples.
Surely it is a very invasive command that locks the table and requires I/O.

> How we (and the clients) prepare to this operation?
> We must know it to avoid disk out problems, and too much off-time.

The best guess is that you are going to need almost the double of the
table size. Since you said that autovacuum is preventing the table to
grow, it could mean all the reclaimed space has been consumed by other
tuples, so I'm not sure vacuum full can provide you space.
Have you looked at pg_stat_user_tables to see the dead tuples fraction?

Luca



Re: Vacuum very big table - how the full vacuum works in background/internally?

От
Michael Lewis
Дата:
We must know it to avoid disk out problems, and too much off-time.

You may be interested in this extension- https://github.com/reorg/pg_repack

Re: Vacuum very big table - how the full vacuum works in background/internally?

От
Durumdara
Дата:
Hello!

We solved it. The VACUUM full finished fast on pg_largeobject, because we deleted 98% of big largeobject (ours) before.
And it worked as zip deletion - it created a new file and copied only living records, which was fast (3 GB vs. 80 GB).

Thanks

dd

Luca Ferrari <fluca1978@gmail.com> ezt írta (időpont: 2019. okt. 17., Cs, 17:43):
On Thu, Oct 17, 2019 at 5:10 PM Durumdara <durumdara@gmail.com> wrote:
> Please help me, how the PGSQL full vacuum works internally? (1., 2. case, or something else)
>

As far as I know a VACUUM FULL will rewrite the whole table without
inserting (and therefore deleting later) not visible tuples. I would
be quite surprised of it duplicating the table and removing after the
tuples.
Surely it is a very invasive command that locks the table and requires I/O.

> How we (and the clients) prepare to this operation?
> We must know it to avoid disk out problems, and too much off-time.

The best guess is that you are going to need almost the double of the
table size. Since you said that autovacuum is preventing the table to
grow, it could mean all the reclaimed space has been consumed by other
tuples, so I'm not sure vacuum full can provide you space.
Have you looked at pg_stat_user_tables to see the dead tuples fraction?

Luca