Thread: pg_xlog


pg_xlog

From:
Albert Shih
Date:
Hi all

How can I clean up the pg_xlog directory?

I have 710 GB of those files.

I have archive_mode on, archiving to a directory outside pg_xlog; I have something like

archive_mode = on               # allows archiving to be done
archive_command = 'cp -i %p /databases/Archives/WAL/%f </dev/null'

so can I just do

    postgresql stop
    rm -f pg_xlog/*
    postgresql start

Regards.

JAS
--
Albert SHIH
SIO batiment 15
Observatoire de Paris Meudon
5 Place Jules Janssen
92195 Meudon Cedex
Heure local/Local time:
Thu 11 Feb 2010 23:06:32 CET

Re: pg_xlog

From:
"Kevin Grittner"
Date:
Albert Shih <Albert.Shih@obspm.fr> wrote:

> How can I clean up the pg_xlog directory?
>
> I have 710 GB of those files.
>
> I have archive_mode on, archiving to a directory outside pg_xlog; I have something like
>
> archive_mode = on               # allows archiving to be done
> archive_command = 'cp -i %p /databases/Archives/WAL/%f </dev/null'

It will keep each WAL file until your archive command, which is
supposed to copy it, completes with an exit code of zero.  It sounds
like that's not happening.  What does that destination directory
look like?  What messages are you seeing in your log files?
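
For anyone hitting the same thing, a quick check along these lines usually shows why archiving is failing (the log path and the postgres OS user name below are assumptions; the last line is the safer archive_command form from the PostgreSQL documentation):

    # Is the archive directory there, and writable by the user the server runs as?
    ls -ld /databases/Archives/WAL
    sudo -u postgres touch /databases/Archives/WAL/write_test && echo writable

    # Failed attempts are logged by the server; the log location below is only
    # an example:
    grep -i 'archive command failed' /var/log/postgresql/postgresql.log | tail

    # Safer documented form: refuses to overwrite an existing file and returns
    # non-zero, so the server keeps the segment and retries:
    # archive_command = 'test ! -f /databases/Archives/WAL/%f && cp %p /databases/Archives/WAL/%f'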

> so can I just do
>
>     postgresql stop
>     rm -f pg_xlog/*
>     postgresql start

No.

-Kevin

Re: pg_xlog

From:
"Joshua D. Drake"
Date:
On Thu, 2010-02-11 at 23:09 +0100, Albert Shih wrote:
> Hi all
>
> How can I clean up the pg_xlog directory?
>
> I have 710 GB of those files.
>
> I have archive_mode on, archiving to a directory outside pg_xlog; I have something like
>
> archive_mode = on               # allows archiving to be done
> archive_command = 'cp -i %p /databases/Archives/WAL/%f </dev/null'
>
> so can I just do
>
>     postgresql stop
>     rm -f pg_xlog/*

No.

>     postgresql start
>
> Regards.

I strongly suggest using something like walmgr or pitrtools to manage
this. walmgr is part of Skytools; pitrtools can be found here:

https://projects.commandprompt.com/public/pitrtools

Joshua D. Drake


--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Respect is earned, not gained through arbitrary and repetitive use of Mr. or Sir.

Re: pg_xlog

From:
Albert Shih
Date:
On 11/02/2010 at 16:20:26 -0600, Kevin Grittner wrote:
> Albert Shih <Albert.Shih@obspm.fr> wrote:
>
> > How can I clean up the pg_xlog directory?
> >
> > I have 710 GB of those files.
> >
> > I have archive_mode on, archiving to a directory outside pg_xlog; I have something like
> >
> > archive_mode = on               # allows archiving to be done
> > archive_command = 'cp -i %p /databases/Archives/WAL/%f </dev/null'
>
> It will keep each WAL file until your archive command, which is
> supposed to copy it, completes with an exit code of zero.  It sounds
> like that's not happening.  What does that destination directory
> look like?  What messages are you seeing in your log files?
>
Thanks for the help.

In fact you're right, there was a problem with the copy (wrong owner of the
/WAL directory), so the WAL files were not being copied. That's why I got ... 45000 files
in pg_xlog.
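
For the archives, the fix plus a quick way to confirm the backlog is draining (assuming the server runs as the postgres OS user and the commands are run from the data directory):

    # Give the archive directory back to the user the server runs as:
    chown -R postgres:postgres /databases/Archives/WAL

    # Segments still waiting to be archived have .ready markers here; the
    # count should start dropping once archive_command succeeds again:
    ls pg_xlog/archive_status/*.ready | wc -l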

Thanks for your help.


Regards.

JAS
--
Albert SHIH
SIO batiment 15
Observatoire de Paris Meudon
5 Place Jules Janssen
92195 Meudon Cedex
Telephone: 01 45 07 76 26 / 06 86 69 95 71
Heure local/Local time:
Thu 11 Feb 2010 23:42:11 CET


PG_DUMP backup

From:
Renato Oliveira
Date:
Dear all,

I have a server running 8.2.4 with a database 170GB in size.
Currently I am backing it up using pg_dump and it takes around 28 hours, sadly.
I was asked to check the newly created DUMP file against the live database and compare records.

I personally cannot see an easy or quick way of doing this, or even the point in doing so.
I am already restoring the full database to a separate server and no errors were reported.

How far can I trust pg_dump? Can I trust that a restore of the full DUMP to a separate server will be good enough?
The time it takes for me to:
1 - Back it up
2 - Transfer the dump (12.7GB compressed) to the office across the internet
3 - Decompress the full dump locally, which will be a 105GB raw file
4 - Restore the full dump to a test server

It is easily a week to do all of that, and by the time I have finished, if a problem has developed with my live
server, what good will that test be? How relevant will it be? How helpful?

My question is:
1 - Is there a more efficient way of backing up such a large database, using pg_dump or any other tool?
2 - Is there an easy way to compare the live database with the DUMP file just created?
3 - If I restore the database to a separate server, is there a point in doing such a check, especially if it is going to
take such a long time to even start doing it?

Idea:
Use pg_dump to split the output into smaller usable chunks, which could be restored one at a time; is that possible?

PS I know about PITR but I can't implement it yet as I am still figuring out certain things with the restore process.

I would really appreciate some help, please.

Thank you very much in advance

Renato



Renato Oliveira
Systems Administrator
e-mail: renato.oliveira@grant.co.uk

Tel: +44 (0)1763 260811
Fax: +44 (0)1763 262410
http://www.grant.co.uk/

Grant Instruments (Cambridge) Ltd

Company registered in England, registration number 658133

Registered office address:
29 Station Road,
Shepreth,
CAMBS SG8 6GB
UK

Re: PG_DUMP backup

From:
Josh Kupershmidt
Date:
On Feb 12, 2010, at 4:58 AM, Renato Oliveira wrote:

> Dear all,
>
> I have a server running 8.2.4 with a database 170GB in size.
> Currently I am backing it up using pg_dump and it takes around 28 hours, sadly.

That's suspiciously slow for a pg_dump alone. I have a ~168 GB database which gets pg_dumped nightly, taking about 2.5
hours, all on 2+ year-old commodity hardware.

> I was asked to check and compare the newly created DUMP file to the live database and compare records.
>

If you really must run this comparison, maybe you can check out "pg_comparator" (I think you would restore first, then
use pg_comparator to run the diffs). However, it sounds like your assignment really is more about making sure that your
backup server is functional and ready to take over if the master dies. There are easier, and better, ways to establish
this than doing a row-by-row comparison of your backup and live server.

> I personally cannot see an easy or quick way of doing this, or even the point in doing so.
> I am already restoring the full database to a separate server and no errors were reported.
>

There's probably a much easier way of ensuring the validity of your backup server without running this diff, but
that'll of course depend on your environment and your boss' wishes.

> My question is:
> 1 - Is there a more efficient way of backing up such a large database, using pg_dump or any other tool?

The only other ways, besides PITR which you rule out, are documented here, but I doubt you'll like them:
http://developer.postgresql.org/pgdocs/postgres/backup-file.html

> 2 - Is there an easy way to compare the live database with the DUMP file just created?

Take another dump, and compare the two dumps? This borders on absurdity, of course.

> Idea:
> Use pg_dump to split the output into smaller usable chunks, which could be restored one at a time; is that possible?

You can dump a table at a time, or a few at a time, using pg_dump --table=... I doubt this will speed the restore up,
though. If you can upgrade to 8.4, or upgrade the backup server to 8.4, your pg_restore should be faster with parallel
restores.
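
As a rough illustration (database and file names below are made up): a custom-format dump written to a seekable file is what lets an 8.4 pg_restore load it with several parallel jobs, and the biggest tables can be dumped separately if needed:

    # Custom-format, compressed dump:
    pg_dump -Fc -Z5 -f /backups/mydb.dump mydb

    # On an 8.4 box, restore with several parallel jobs:
    pg_restore -j 4 -d mydb /backups/mydb.dump

    # A single large table can also be dumped on its own:
    pg_dump -Fc --table=big_table -f /backups/big_table.dump mydb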

Also, I would look into tuning your backup server to make pg_restore as fast as possible. See e.g.
http://wiki.postgresql.org/wiki/Bulk_Loading_and_Restores
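
The sort of temporary, restore-side settings usually suggested for this look roughly like the following (values are illustrative guesses, not recommendations for any particular box, and should be reverted once the restore is done):

    # postgresql.conf on the restore-side server, for the duration of the load only:
    maintenance_work_mem = 512MB    # faster index builds
    checkpoint_segments = 32        # fewer checkpoints during the load
    fsync = off                     # only acceptable while rebuilding a throwaway copy
    full_page_writes = off
    autovacuum = off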


Josh

Re: PG_DUMP backup

From:
Renato Oliveira
Date:
Josh,
That is great, thank you very much.

I really appreciate your reply

Thank you

Renato



Renato Oliveira
Systems Administrator
e-mail: renato.oliveira@grant.co.uk


Re: PG_DUMP backup

From:
"Tomeh, Husam"
Date:
Backing up 170GB in 28 hours definitely doesn't sound right, and I'm
almost certain it has nothing to do with pg_dump, but rather with your
hardware, i.e. server, disks, etc. With 170GB, a backup should be done
in a couple of hours, in my opinion. It seems to be more like a system
resource issue.

Regards,
      Husam



Re: PG_DUMP backup

From:
Renato Oliveira
Date:
Hi Husam,

I know the problem is not pg_dump. It is a combination of things.
The hardware is old and the configuration is terrible:
CPU: AMD Opteron(tm) Processor 246
RAM: 8GB
Disks: single 300GB volume with 70GB free
Swap: 1GB

There may also be some problems with the database design and with how the application works.

What I am trying to do is gather some ideas on how I can improve the backup time so we can at least back up daily.
Once we have improved certain things internally, I am sure we will upgrade this server.

How are you guys backing up your servers?
Using pg_dump, PITR, or a combination?

I have an idea of what hardware I should have to run a 176GB database, but what sort of configuration are you guys
running?

Thank you very much

Best regards

Renato





Renato Oliveira
Systems Administrator
e-mail: renato.oliveira@grant.co.uk


Re: PG_DUMP backup

From:
Renato Oliveira
Date:
Josh,

Thank you again for the links you sent me, much appreciated.
I was actually thinking of that approach to back up the server; here is the idea, maybe one of you guys can tell me if
it would work.

Backup idea, please pick holes:
Create a script which, at the beginning, does:
        SELECT pg_start_backup('label');
                Then tar -cf backup.tar /usr/local/pgsql/data
                Restore the tar file to a secondary DB server
                rsync -av /usr/local/pgsql/data to the remote machine
        SELECT pg_stop_backup();

I think rsync will sync the files, and as it can copy deltas it will keep the files in sync; will this work?
I must bring the backup window to below 24 hours so I can back up daily.
The server hardware is old and we will not change it right now.

Do you guys think I have any hope of achieving this? ;-)
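
For what it's worth, a minimal sketch of that script (the standby host name and the postgres user name are placeholders, and the WAL files archived between the two calls have to be kept as well for the copy to be usable):

    #!/bin/sh
    # Rough sketch only; adjust paths and names.
    psql -U postgres -c "SELECT pg_start_backup('nightly');"

    # rsync only ships changed data, so runs after the first one are much
    # shorter than a full tar; pg_xlog is excluded because the WAL needed
    # for recovery comes from the archive instead.
    rsync -a --delete --exclude=pg_xlog \
          /usr/local/pgsql/data/ standby:/usr/local/pgsql/data/

    psql -U postgres -c "SELECT pg_stop_backup();"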

Thank you very much

I would welcome ideas and any help.

Really appreciated.

Renato




Renato Oliveira
Systems Administrator
e-mail: renato.oliveira@grant.co.uk



Altering a column (increasing size) in Postgres takes long time

From:
"Tomeh, Husam"
Date:
We have a huge table with a hundred million records. We need to increase
the size of an existing varchar column, and the ALTER has been
running for more than 12 hours.


We're using: ALTER TABLE Property ALTER COLUMN "situs-number" TYPE
varchar(30);


The index on that column has been dropped before issuing the above
statement.


1)       Can you explain what's happening internally that makes this a
very long process? Does the table get re-created?


2)       Assuming the ALTER statement finishes successfully, and if I
didn't drop the index (on that column), do I have to rebuild the index?
Does the index get invalidated just by altering the indexed column?


3)       Some folks referred to directly updating Postgres internal
tables (pg_attribute), which takes seconds to make the column change
happen. How safe is this, and could it potentially cause any corruption?


SET SESSION AUTHORIZATION 'postgres';

BEGIN;

update pg_attribute
   set atttypmod = 21 + 4
 where attrelid = 'property'::regclass
   and attname = 'situs-number';

update pg_attribute
   set atttypmod = 21 + 4
 where attrelid = 'interim-refresh'::regclass
   and attname = 'situs-number';

update pg_attribute
   set atttypmod = 21 + 4
 where attrelid = 'interim-drefresh'::regclass
   and attname = 'situs-number';

update pg_attribute
   set atttypmod = 21 + 4
 where attrelid = 'interim-update-property'::regclass
   and attname = 'situs-number';

update pg_attribute
   set atttypmod = 21 + 4
 where attrelid = 'ix-property-address'::regclass
   and attname = 'situs-number';

RESET SESSION AUTHORIZATION;



4)       Is there a more practical and safe method to alter a huge
table within a reasonable amount of time?


Please advise. Your help is much appreciated.

We're running Postgres 8.3.7 on RedHat Enterprise AS 4.7 on HP585DL.


Regards,
     Husam


Re: Altering a column (increasing size) in Postgres takes long time

From:
Jaime Casanova
Date:
On Thu, Feb 25, 2010 at 6:04 PM, Tomeh, Husam <HTomeh@facorelogic.com> wrote:
> We have a huge table with a hundred million records. We need to increase the
> size of an existing varchar column, and the ALTER has been running for
> more than 12 hours.
>
>
>
> We’re using: ALTER TABLE Property ALTER COLUMN "situs-number" TYPE
> varchar(30);
>

Wouldn't it be better to have the field be of type text and not worry
about the length, using a CHECK constraint only if really necessary?
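
A hypothetical sketch of that approach (table, column and constraint names are made up or borrowed from the original post): keep the column as text and put the length rule in a constraint, so relaxing it later is a constraint swap rather than a table rewrite:

    CREATE TABLE property_example (
        "situs-number" text
            CONSTRAINT situs_number_len CHECK (length("situs-number") <= 21)
    );

    -- Later, to allow 30 characters: drop and re-add the constraint. This
    -- scans the table to validate existing rows, but does not rewrite it
    -- or touch the indexes.
    ALTER TABLE property_example DROP CONSTRAINT situs_number_len;
    ALTER TABLE property_example
        ADD CONSTRAINT situs_number_len CHECK (length("situs-number") <= 30);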

>
> 1)       Can you explain what's happening internally that makes this a very
> long process? Does the table get re-created?
>

Yes, and all of its indexes get rebuilt, not just the one you dropped.
And the FKs rechecked? (I don't think so, but I can't remember right now.)

>
>
> 2)       Assuming the ALTER statement finishes successfully, and if I didn't
> drop the index (on that column), do I have to rebuild the index? Does the
> index get invalidated just by altering the indexed column?
>

it gets rebuilt

>
>
> 3)       Some folks referred to directly updating Postgres internal tables
> (pg_attribute), which takes seconds to make the column change happen. How
> safe is this, and could it potentially cause any corruption?
>

no, that's insane

>
> 4)       Is there a more practical and safe method to alter a huge table
> within a reasonable amount of time?
>
>

use text fields instead of varchar(n)

--
Sincerely,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157

Re: Altering a column (increasing size) in Postgres takes long time

From:
"Kevin Grittner"
Date:
Jaime Casanova <jcasanov@systemguards.com.ec> wrote:
> Tomeh, Husam <HTomeh@facorelogic.com> wrote:
>> We have a huge table with a hundred million records. We need to
>> increase the size of an existing varchar column, and the ALTER
>> has been running for more than 12 hours.

>> 3) Some folks referred to directly updating Postgres internal
>> tables (pg_attribute), which takes seconds to make the column
>> change happen. How safe is this, and could it potentially cause
>> any corruption?
>>
>
> no, that's insane

Not really; but it is something to approach with great caution.
With this approach you don't need to drop or rebuild the index, and
it's all done, as you say, in a matter of seconds.

Thoughts:

*  Don't over-generalize the technique.  Going from a varchar(n) to
a varchar(larger-n) is safe.  Most changes aren't.

*  Test, test, test.  Copy your schema to a test database.  Look at
the pg_attribute row.  Use ALTER TABLE to make the change.  Then
look at it again.  Restore the schema to the starting point and try
it with a direct update as a database superuser.  Write a query to
SELECT the row which will be updated using table name and column
name (since oid might not match between your copy and the real
database), then modify the SELECT to get to your UPDATE.  Confirm
that it made exactly the right change to the right row.  If you can
arrange a copy of the complete database, or some reasonable test
facsimile, test there; then make sure your application works as
expected.

*  Have a good backup.  Confirm that it can actually be restored;
otherwise you'll be doing this trapeze act without a net.

We've done this successfully with large production databases, but
we've been very careful.  If you're not, you could corrupt your
database.
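
A minimal sketch of that test sequence, run as a database superuser against a copy of the schema (names are the ones from the original post; atttypmod for varchar is the declared length plus the 4-byte header, so varchar(30) is 34):

    -- Find the row first, by table and column name:
    SELECT attrelid::regclass, attname, atttypmod
      FROM pg_attribute
     WHERE attrelid = 'property'::regclass
       AND attname  = 'situs-number';

    BEGIN;

    UPDATE pg_attribute
       SET atttypmod = 30 + 4
     WHERE attrelid = 'property'::regclass
       AND attname  = 'situs-number';

    -- Re-run the SELECT above and COMMIT only if exactly one row changed
    -- exactly as expected; otherwise ROLLBACK.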

Insane, no.  If it doesn't make you nervous enough to take great
care -- well, *that* would be insane.

-Kevin