Обсуждение: vacuumlo

Поиск
Список
Период
Сортировка

vacuumlo

От
Ian Dauncey
Дата:

Hi All,

 

I need to run the vacuumlo command against our production database.

Being a PostgresQL database utility, it should be 100% safe to run and should not delete/drop active data.

 

I have run it in our QA environment with success, but now they are having a few application issues and I have told them that the issues cannot be related to the vacuumlo utility as it is a PostgresQL utility.

 

 

Has anyone had issues before running this utility?

 

Regards

Ian



Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.

Re: vacuumlo

От
Julien Rouhaud
Дата:
Hi,

On Tue, Aug 17, 2021 at 7:52 PM Ian Dauncey <Ian.Dauncey@bankzero.co.za> wrote:
>
> I need to run the vacuumlo command against our production database.
>
> Being a PostgresQL database utility, it should be 100% safe to run and should not delete/drop active data.

It's safe as long as you're aware of what this tool is doing.  As
mentioned in https://www.postgresql.org/docs/current/vacuumlo.html:

> vacuumlo is a simple utility program that will remove any “orphaned” large objects from a PostgreSQL database. An
orphanedlarge object (LO) is considered to be any LO whose OID does not appear in any oid or lo data column of the
database.

So:

> I have run it in our QA environment with success, but now they are having a few application issues and I have told
themthat the issues cannot be related to the vacuumlo utility as it is a PostgresQL utility. 

The most likely explanation is that your database somehow has large
object that are not referenced in an "oid" or "lo" column.  If that's
the case, vacuumlo will delete some of your data, as you didn't you
your part of the contract required to use that tool, which is to
properly reference large objects reference.



RE: vacuumlo

От
Ian Dauncey
Дата:

Hi,

 

Just an update on my vacuumlo issue.

 

I did run the vacuumlo  against the pg_largeobject table without any issues but afterwards I ran a vacuum full against this table which caused lots of issues.

Because the vacuum full takes an exclusive lock (which was my first mistake as I did not stop the applications accessing the database) on the table I had all the applications hanging. The next issue was it started writing out WAL logs and in the end the file system which housed the Wal logs filled up causing the vacuum to fail.

Now the issue I have here is that the vacuum full created a temporary table , and when it crashed this temporary table did not get deleted. I did rerun the vacuum full against the pg_largeobject table (and yes, I did stop all the applications first). It did complete successfully but it did not drop the previous temporary table. This table is taking close to 100 Gig of disk space.

 

If I backup and restore the database onto a different server this temporary table does not get restored.

My question here is.

  1. How do I get rid of this temporary table without a backup and restore as this is our Prod system?
  2. Is there a way of finding out the name of this temp table and matching it up to files on disk?

 

Any help will be appreciated

 

Regards

Ian.

 

From: Julien Rouhaud <rjuju123@gmail.com>
Sent: Tuesday, 17 August 2021 14:18
To: Ian Dauncey <Ian.Dauncey@bankzero.co.za>
Cc: pgsql-admin@lists.postgresql.org
Subject: Re: vacuumlo

 

External email - treat with caution

Hi,

On Tue, Aug 17, 2021 at 7:52 PM Ian Dauncey <Ian.Dauncey@bankzero.co.za> wrote:
>
> I need to run the vacuumlo command against our production database.
>
> Being a PostgresQL database utility, it should be 100% safe to run and should not delete/drop active data.

It's safe as long as you're aware of what this tool is doing. As
mentioned in https://www.postgresql.org/docs/current/vacuumlo.html

> vacuumlo is a simple utility program that will remove any “orphaned” large objects from a PostgreSQL database. An orphaned large object (LO) is considered to be any LO whose OID does not appear in any oid or lo data column of the database.

So:

> I have run it in our QA environment with success, but now they are having a few application issues and I have told them that the issues cannot be related to the vacuumlo utility as it is a PostgresQL utility.

The most likely explanation is that your database somehow has large
object that are not referenced in an "oid" or "lo" column. If that's
the case, vacuumlo will delete some of your data, as you didn't you
your part of the contract required to use that tool, which is to
properly reference large objects reference.



Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.

Re: vacuumlo

От
Laurenz Albe
Дата:
On Mon, 2021-08-30 at 15:08 +0000, Ian Dauncey wrote: 
> I did run the vacuumlo  against the pg_largeobject table without any issues but afterwards
> I ran a vacuum full against this table which caused lots of issues.
> Because the vacuum full takes an exclusive lock (which was my first mistake as I
> did not stop the applications accessing the database) on the table I had all the
> applications hanging.

Right.

> The next issue was it started writing out WAL logs and in the end the file system
> which housed the Wal logs filled up causing the vacuum to fail.
> Now the issue I have here is that the vacuum full created a temporary table , and when
> it crashed this temporary table did not get deleted. I did rerun the vacuum full against
> the pg_largeobject table (and yes, I did stop all the applications first).
> It did complete successfully but it did not drop the previous temporary table.
> This table is taking close to 100 Gig of disk space.

VACUUM (FULL) will write a compact copy of the table, then delete the old one,
so it temporarily needs more space.  It may be a good idea to put "pg_wal" into
a different file system, so that WAL space cannot run out even if the data files
fill the file system.  Then PostgreSQL won't crash, and the files will be removed.
 
> If I backup and restore the database onto a different server this temporary table does not get restored.
> My question here is.
>    1. How do I get rid of this temporary table without a backup and restore as this is our Prod system?

You cannot :^(
If you know PostgreSQL well, and you know what you are doing, you might be able
to identify those files and manually delete them.  But that's definitely not
something for the casual user: delete the wrong file, and you have to restore
the backup.

>    2. Is there a way of finding out the name of this temp table and matching it up to files on disk?

No, unfortunately not.
You'd have to identify the files that do *not* belong to any table, and that
is tricky.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: vacuumlo

От
Mladen Gogala
Дата:

That can be verified by checking RELFILENODE of the table. VACUUM FULL changes RELFILENODE:

mgogala=# select relfilenode from pg_class where relname='emp';
 relfilenode
-------------
       52178
(1 row)

mgogala=# vacuum full emp;
VACUUM
mgogala=# select relfilenode from pg_class where relname='emp';
 relfilenode
-------------
       52182
(1 row)



According to unsubstantiated rumors from https://www.postgresql.org/docs/13/catalog-pg-class.html, the definition of RELFILENODE column is as follows:

relfilenode oid

Name of the on-disk file of this relation; zero means this is a “mapped” relation whose disk file name is determined by low-level state

Change of the RELFILENODE means that the relation has got a new file. And that means that vacuum full rewrites the table, just as Laurenz has said. Note that OID itself is NOT changed.

On 8/30/21 1:15 PM, Laurenz Albe wrote:
VACUUM (FULL) will write a compact copy of the table, then delete the old one,
so it temporarily needs more space.  It may be a good idea to put "pg_wal" into
a different file system, so that WAL space cannot run out even if the data files
fill the file system.  Then PostgreSQL won't crash, and the files will be removed.
-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

Re: vacuumlo

От
Laurenz Albe
Дата:
On Mon, 2021-08-30 at 13:38 -0400, Mladen Gogala wrote:
> According to unsubstantiated rumors from https://www.postgresql.org/docs/13/catalog-pg-class.html,
> the definition of RELFILENODE column is as follows:
>
>  relfilenode oid
>
> Name of the on-disk file of this relation; zero means this is a “mapped” relation whose disk
> file name is determined by low-level state
> Change of the RELFILENODE means that the relation has got a new file. And that means that vacuum full
> rewrites the table, just as Laurenz has said. Note that OID itself is NOT changed.

Please don't top-post on these lists.

I am not sure what you mean with "unsubstantiated".

Note that this will be 0 for certain crucial system tables (essentially those that have to be accessed
*before* regular metadata queries can be executed).  So it is better to use the "pg_relation_filenode()"
function to find the file that belongs to a certain table.

You can use that to find which files do *not* belong to any table, index, sequence or
materialized view and then delete those.  But I would think twice before manually deleting
files in the data directory.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: vacuumlo

От
Atul Kumar
Дата:
Hi,

You may restart the postgres services. That temp file will be deleted automatically then.


Regards 
Atul








On Monday, August 30, 2021, Ian Dauncey <Ian.Dauncey@bankzero.co.za> wrote:

Hi,

 

Just an update on my vacuumlo issue.

 

I did run the vacuumlo  against the pg_largeobject table without any issues but afterwards I ran a vacuum full against this table which caused lots of issues.

Because the vacuum full takes an exclusive lock (which was my first mistake as I did not stop the applications accessing the database) on the table I had all the applications hanging. The next issue was it started writing out WAL logs and in the end the file system which housed the Wal logs filled up causing the vacuum to fail.

Now the issue I have here is that the vacuum full created a temporary table , and when it crashed this temporary table did not get deleted. I did rerun the vacuum full against the pg_largeobject table (and yes, I did stop all the applications first). It did complete successfully but it did not drop the previous temporary table. This table is taking close to 100 Gig of disk space.

 

If I backup and restore the database onto a different server this temporary table does not get restored.

My question here is.

  1. How do I get rid of this temporary table without a backup and restore as this is our Prod system?
  2. Is there a way of finding out the name of this temp table and matching it up to files on disk?

 

Any help will be appreciated

 

Regards

Ian.

 

From: Julien Rouhaud <rjuju123@gmail.com>
Sent: Tuesday, 17 August 2021 14:18
To: Ian Dauncey <Ian.Dauncey@bankzero.co.za>
Cc: pgsql-admin@lists.postgresql.org
Subject: Re: vacuumlo

 

External email - treat with caution

Hi,

On Tue, Aug 17, 2021 at 7:52 PM Ian Dauncey <Ian.Dauncey@bankzero.co.za> wrote:
>
> I need to run the vacuumlo command against our production database.
>
> Being a PostgresQL database utility, it should be 100% safe to run and should not delete/drop active data.

It's safe as long as you're aware of what this tool is doing. As
mentioned in https://www.postgresql.org/docs/current/vacuumlo.html

> vacuumlo is a simple utility program that will remove any “orphaned” large objects from a PostgreSQL database. An orphaned large object (LO) is considered to be any LO whose OID does not appear in any oid or lo data column of the database.

So:

> I have run it in our QA environment with success, but now they are having a few application issues and I have told them that the issues cannot be related to the vacuumlo utility as it is a PostgresQL utility.

The most likely explanation is that your database somehow has large
object that are not referenced in an "oid" or "lo" column. If that's
the case, vacuumlo will delete some of your data, as you didn't you
your part of the contract required to use that tool, which is to
properly reference large objects reference.



Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.

Re: vacuumlo

От
Laurenz Albe
Дата:
On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote:
> You may restart the postgres services. That temp file will be deleted automatically then.

No - orphaned files left behind from a crash will not be removed during a restart.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: vacuumlo

От
Atul Kumar
Дата:
Ok, but after restart it is safe to remove such files.





On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote:
> You may restart the postgres services. That temp file will be deleted automatically then.

No - orphaned files left behind from a crash will not be removed during a restart.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

Re: vacuumlo

От
Laurenz Albe
Дата:
On Tue, 2021-08-31 at 11:15 +0530, Atul Kumar wrote:
> On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> 
> > On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote:
> > You may restart the postgres services. That temp file will be deleted automatically then.
> > 
> > No - orphaned files left behind from a crash will not be removed during a restart.
>
> Ok, but after restart it is safe to remove such files.

It is always save to remove those files.
The problem is to identify them.
Remove the wrong file, and your cluster is toast.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: vacuumlo

От
Atul Kumar
Дата:
Hi,

Don’t you think it as a drawback of postgres that such temporary files don’t get deleted automatically once the transaction gets committed or rolled back .

Could you suggest a good permanent solution for such case.



Regards 
Atul






On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2021-08-31 at 11:15 +0530, Atul Kumar wrote:
> On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> > On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote:
> > You may restart the postgres services. That temp file will be deleted automatically then.
> >
> > No - orphaned files left behind from a crash will not be removed during a restart.
>
> Ok, but after restart it is safe to remove such files.

It is always save to remove those files.
The problem is to identify them.
Remove the wrong file, and your cluster is toast.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

Re: vacuumlo

От
Shaish Guni
Дата:
You are right 


On Thu, Sep 2, 2021 at 10:56 AM Atul Kumar <akumar14871@gmail.com> wrote:
Hi,

Don’t you think it as a drawback of postgres that such temporary files don’t get deleted automatically once the transaction gets committed or rolled back .

Could you suggest a good permanent solution for such case.



Regards 
Atul






On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2021-08-31 at 11:15 +0530, Atul Kumar wrote:
> On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> > On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote:
> > You may restart the postgres services. That temp file will be deleted automatically then.
> >
> > No - orphaned files left behind from a crash will not be removed during a restart.
>
> Ok, but after restart it is safe to remove such files.

It is always save to remove those files.
The problem is to identify them.
Remove the wrong file, and your cluster is toast.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

--
Shaish Guni

Re: vacuumlo

От
Guillaume Lelarge
Дата:
Le jeu. 2 sept. 2021 à 07:11, Atul Kumar <akumar14871@gmail.com> a écrit :
Hi,

Don’t you think it as a drawback of postgres that such temporary files don’t get deleted automatically once the transaction gets committed or rolled back .


They are automatically removed in case of a proper commit or rollback. They aren't in case of a crash.

Could you suggest a good permanent solution for such case.


You should try https://github.com/bdrouvot/pg_orphaned to find the list of orphaned files. It does a pretty good job at it.


--
Guillaume.

RE: vacuumlo

От
Ian Dauncey
Дата:

Morning.

 

Thanks for all the replies.

 

What I did to remove these files was to backup of the DB, drop the DB and then I restored the DB.

 

Regards

Ian

 

 

From: Guillaume Lelarge <guillaume@lelarge.info>
Sent: Thursday, 02 September 2021 09:33
To: Atul Kumar <akumar14871@gmail.com>
Cc: Laurenz Albe <laurenz.albe@cybertec.at>; Ian Dauncey <Ian.Dauncey@bankzero.co.za>; pgsql-admin@lists.postgresql.org
Subject: Re: vacuumlo

 

External email - treat with caution

Le jeu. 2 sept. 2021 à 07:11, Atul Kumar <akumar14871@gmail.com> a écrit :

Hi,

 

Don’t you think it as a drawback of postgres that such temporary files don’t get deleted automatically once the transaction gets committed or rolled back .

 

 

They are automatically removed in case of a proper commit or rollback. They aren't in case of a crash.

 

Could you suggest a good permanent solution for such case.

 

 

You should try https://github.com/bdrouvot/pg_orphaned to find the list of orphaned files. It does a pretty good job at it.



--

Guillaume.



Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.