Обсуждение: pg_dump vs pg_basebackup

Поиск
Список
Период
Сортировка

pg_dump vs pg_basebackup

От
gianfranco caca
Дата:
Hai,

Can anyone tell me the difference and performance between pgdump and pg_basebackup if I want to backup a large database.

Thanks

Re: pg_dump vs pg_basebackup

От
Ilya Kosmodemiansky
Дата:
Hi gianfranco,


How exactly large is your database and how heavy is a workload on it?
Usually if you have more than ~200Gb, better to use pg_basebackup
because pg_dump will take too long time. And please take in mind, that
pg_dump makes dump, which is  actually not the same thing as a backup.

Best regards,
Ilya

On Tue, Mar 25, 2014 at 5:45 AM, gianfranco caca <limpcaca@yahoo.com> wrote:
> Hai,
>
> Can anyone tell me the difference and performance between pgdump and
> pg_basebackup if I want to backup a large database.
>
> Thanks



--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


Re: pg_dump vs pg_basebackup

От
gianfranco caca
Дата:
Hai ilya,

Thanks for the respond. The database is estimated over 100gb and the workload will be high. Can we use a pg_basebackup with pitr to restore based on transaction time?

Thanks


On Tuesday, 25 March 2014, 15:13, Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com> wrote:
Hi gianfranco,


How exactly large is your database and how heavy is a workload on it?
Usually if you have more than ~200Gb, better to use pg_basebackup
because pg_dump will take too long time. And please take in mind, that
pg_dump makes dump, which is  actually not the same thing as a backup.

Best regards,
Ilya

On Tue, Mar 25, 2014 at 5:45 AM, gianfranco caca <limpcaca@yahoo.com> wrote:
> Hai,
>
> Can anyone tell me the difference and performance between pgdump and
> pg_basebackup if I want to backup a large database.
>
> Thanks




--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



Re: pg_dump vs pg_basebackup

От
Ilya Kosmodemiansky
Дата:
Yes, you need to set recovery_target_time in your recovery.conf while
performing recovery
(http://www.postgresql.org/docs/9.3/static/recovery-target-settings.html).
That could be a tricky thing - depends on that exactly you need. All
those transactions, which were not committed at given timestamp, will
be rollbacked, so read url above carefully.

On Tue, Mar 25, 2014 at 8:19 AM, gianfranco caca <limpcaca@yahoo.com> wrote:
> Hai ilya,
>
> Thanks for the respond. The database is estimated over 100gb and the
> workload will be high. Can we use a pg_basebackup with pitr to restore based
> on transaction time?
>
> Thanks
>
>
> On Tuesday, 25 March 2014, 15:13, Ilya Kosmodemiansky
> <ilya.kosmodemiansky@postgresql-consulting.com> wrote:
> Hi gianfranco,
>
>
> How exactly large is your database and how heavy is a workload on it?
> Usually if you have more than ~200Gb, better to use pg_basebackup
> because pg_dump will take too long time. And please take in mind, that
> pg_dump makes dump, which is  actually not the same thing as a backup.
>
> Best regards,
> Ilya
>
> On Tue, Mar 25, 2014 at 5:45 AM, gianfranco caca <limpcaca@yahoo.com> wrote:
>> Hai,
>>
>> Can anyone tell me the difference and performance between pgdump and
>> pg_basebackup if I want to backup a large database.
>>
>> Thanks
>
>
>
>
> --
> Ilya Kosmodemiansky,
>
> PostgreSQL-Consulting.com
> tel. +14084142500
> cell. +4915144336040
> ik@postgresql-consulting.com
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
>
>



--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


Re: pg_dump vs pg_basebackup

От
David Johnston
Дата:
gianfranco caca wrote
> Hai,
>
> Can anyone tell me the difference and performance between pgdump and
> pg_basebackup if I want to backup a large database.
>
> Thanks

Yes.  And many of their words have been written down in the documentation in
a chapter named "Backup and Restore".  Do you have a specific question about
what is written there?

I'll add that comparing the performance of both is relatively meaningless.
You need to understand how each works then choose the correct tool for your
situation.

Lastly, you should actually do both, on a development database, and measure
the time and effort while practicing both routines (backup and restoring)
yourself.

David J.




--
View this message in context: http://postgresql.1045698.n5.nabble.com/pg-dump-vs-pg-basebackup-tp5797351p5797364.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Re: pg_dump vs pg_basebackup

От
Claudio Freire
Дата:
On Tue, Mar 25, 2014 at 4:39 AM, David Johnston <polobo@yahoo.com> wrote:
>> Hai,
>>
>> Can anyone tell me the difference and performance between pgdump and
>> pg_basebackup if I want to backup a large database.
>>
>> Thanks
>
> Yes.  And many of their words have been written down in the documentation in
> a chapter named "Backup and Restore".  Do you have a specific question about
> what is written there?
>
> I'll add that comparing the performance of both is relatively meaningless.
> You need to understand how each works then choose the correct tool for your
> situation.


I don't know if meaningless is the right word here. I have a ~450G
database, and the difference is quite meaningful to me, as it is
measured in days.

The difference being, pg_basebackup is dumber and using it is harder,
but its performance is only limited by sequential I/O capacity (which
is usually quite high). It is also used in conjunction with PITR to
get not only that, but also incremental backups, which is something
you really want for big databass. pg_dump, on the other hand, will
only do full dumps and it will be limited both by I/O and CPU power,
because the reformatting involved in making a dump is considerable. In
my experience, a base backup takes hours, while a dump takes days.


Re: pg_dump vs pg_basebackup

От
"Joshua D. Drake"
Дата:
On 03/25/2014 05:05 AM, Claudio Freire wrote:
>
> On Tue, Mar 25, 2014 at 4:39 AM, David Johnston <polobo@yahoo.com> wrote:
>>> Hai,
>>>
>>> Can anyone tell me the difference and performance between pgdump and
>>> pg_basebackup if I want to backup a large database.
>>>

Honestly,

Neither is particularly good at backing up large databases. I would look
into PITR with rsync.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
Political Correctness is for cowards.


Re: pg_dump vs pg_basebackup

От
desmodemone
Дата:



2014-03-25 15:56 GMT+01:00 Joshua D. Drake <jd@commandprompt.com>:

On 03/25/2014 05:05 AM, Claudio Freire wrote:

On Tue, Mar 25, 2014 at 4:39 AM, David Johnston <polobo@yahoo.com> wrote:
Hai,

Can anyone tell me the difference and performance between pgdump and
pg_basebackup if I want to backup a large database.


Honestly,

Neither is particularly good at backing up large databases. I would look into PITR with rsync.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
Political Correctness is for cowards.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


For large database it's possible also to consider , also, to change database status in backup mode and after take a snapshoot and returning to normal mode, saving also all archive after you finish the backup.

With that snapshoot you could easy  mount it and restore on another machine or open in readonly mode (hot standby and after do a logical dump ) , a lot of storage have these capabilities and also filesystem or volume manager.

I think these is the fater  option you have.

Mat Dba

Re: pg_dump vs pg_basebackup

От
Ilya Kosmodemiansky
Дата:
Joshua,

that is really good point: an alternative is to use pg_basebackup
through ssh tunnel with compression, but rsync is much simpler.

On Tue, Mar 25, 2014 at 3:56 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>
> On 03/25/2014 05:05 AM, Claudio Freire wrote:
>>
>>
>> On Tue, Mar 25, 2014 at 4:39 AM, David Johnston <polobo@yahoo.com> wrote:
>>>>
>>>> Hai,
>>>>
>>>> Can anyone tell me the difference and performance between pgdump and
>>>> pg_basebackup if I want to backup a large database.
>>>>
>
> Honestly,
>
> Neither is particularly good at backing up large databases. I would look
> into PITR with rsync.
>
> JD
>
>
> --
> Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
> PostgreSQL Support, Training, Professional Services and Development
> High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
> Political Correctness is for cowards.
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance



--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


Re: pg_dump vs pg_basebackup

От
Magnus Hagander
Дата:
I would say that's the one thing that rsync is *not*. pg_basebackup takes care of a lot of things under the hood. rsync is a lot more complicated, in particular in failure scenarios, since you have to manually deal with pg_start/stop_backup().

There are definitely reasons you'd prefer rsync over pg_basebackup, but I don't believe simplicity is one of them.

//Magnus


On Tue, Mar 25, 2014 at 4:18 PM, Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com> wrote:
Joshua,

that is really good point: an alternative is to use pg_basebackup
through ssh tunnel with compression, but rsync is much simpler.

On Tue, Mar 25, 2014 at 3:56 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>
> On 03/25/2014 05:05 AM, Claudio Freire wrote:
>>
>>
>> On Tue, Mar 25, 2014 at 4:39 AM, David Johnston <polobo@yahoo.com> wrote:
>>>>
>>>> Hai,
>>>>
>>>> Can anyone tell me the difference and performance between pgdump and
>>>> pg_basebackup if I want to backup a large database.
>>>>
>
> Honestly,
>
> Neither is particularly good at backing up large databases. I would look
> into PITR with rsync.
>
> JD
>
>
> --
> Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
> PostgreSQL Support, Training, Professional Services and Development
> High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
> Political Correctness is for cowards.
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance



--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: pg_dump vs pg_basebackup

От
"Joshua D. Drake"
Дата:
On 03/25/2014 08:18 AM, Ilya Kosmodemiansky wrote:
>
> Joshua,
>
> that is really good point: an alternative is to use pg_basebackup
> through ssh tunnel with compression, but rsync is much simpler.

Or rsync over ssh. The advantage is that you can create backups that
don't have to be restored, just started. You can also use the
differential portions of rsync to do it multiple times a day without
much issue.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
Political Correctness is for cowards.


Re: pg_dump vs pg_basebackup

От
Ilya Kosmodemiansky
Дата:
Joshua,

On Tue, Mar 25, 2014 at 4:22 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
The advantage is that you can create backups that don't
> have to be restored, just started. You can also use the differential
> portions of rsync to do it multiple times a day without much issue.

Are you sure, that it is a nice idea on a database with heavy write workload?

And also Im not sure, that differential backups using rsync will be
recoverable, if you have actually meant that.

>
>
> JD
>
> --
> Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
> PostgreSQL Support, Training, Professional Services and Development
> High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
> Political Correctness is for cowards.



--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


Re: pg_dump vs pg_basebackup

От
Ilya Kosmodemiansky
Дата:
Magnus,

That is correct, but I'am afraid that such all-in-one functionality
also hides from one how backup really works. Probably such sort of
knowledge is so essential for a DBA, that it is better to learn both
methods, at least to be able to choose correctly? But maybe it is a
rhetorical question.

On Tue, Mar 25, 2014 at 4:21 PM, Magnus Hagander <magnus@hagander.net> wrote:
> I would say that's the one thing that rsync is *not*. pg_basebackup takes
> care of a lot of things under the hood. rsync is a lot more complicated, in
> particular in failure scenarios, since you have to manually deal with
> pg_start/stop_backup().
>
> There are definitely reasons you'd prefer rsync over pg_basebackup, but I
> don't believe simplicity is one of them.
>
> //Magnus
>
>
> On Tue, Mar 25, 2014 at 4:18 PM, Ilya Kosmodemiansky
> <ilya.kosmodemiansky@postgresql-consulting.com> wrote:
>>
>> Joshua,
>>
>> that is really good point: an alternative is to use pg_basebackup
>> through ssh tunnel with compression, but rsync is much simpler.
>>
>> On Tue, Mar 25, 2014 at 3:56 PM, Joshua D. Drake <jd@commandprompt.com>
>> wrote:
>> >
>> > On 03/25/2014 05:05 AM, Claudio Freire wrote:
>> >>
>> >>
>> >> On Tue, Mar 25, 2014 at 4:39 AM, David Johnston <polobo@yahoo.com>
>> >> wrote:
>> >>>>
>> >>>> Hai,
>> >>>>
>> >>>> Can anyone tell me the difference and performance between pgdump and
>> >>>> pg_basebackup if I want to backup a large database.
>> >>>>
>> >
>> > Honestly,
>> >
>> > Neither is particularly good at backing up large databases. I would look
>> > into PITR with rsync.
>> >
>> > JD
>> >
>> >
>> > --
>> > Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
>> > PostgreSQL Support, Training, Professional Services and Development
>> > High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
>> > Political Correctness is for cowards.
>> >
>> >
>> >
>> > --
>> > Sent via pgsql-performance mailing list
>> > (pgsql-performance@postgresql.org)
>> > To make changes to your subscription:
>> > http://www.postgresql.org/mailpref/pgsql-performance
>>
>>
>>
>> --
>> Ilya Kosmodemiansky,
>>
>> PostgreSQL-Consulting.com
>> tel. +14084142500
>> cell. +4915144336040
>> ik@postgresql-consulting.com
>>
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>
>
>
>
> --
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/



--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


Re: pg_dump vs pg_basebackup

От
Magnus Hagander
Дата:
Oh, I agree it's good that you should know both methods. I only disagree with that the choice of rsync be made with the argument of simplicity. Simplicity is one of the main reasons to choose the *other* method (pg_basebackup), and the rsync method is for more advanced usecases. But it's definitely good to know both!

//Magnus


On Tue, Mar 25, 2014 at 4:37 PM, Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com> wrote:
Magnus,

That is correct, but I'am afraid that such all-in-one functionality
also hides from one how backup really works. Probably such sort of
knowledge is so essential for a DBA, that it is better to learn both
methods, at least to be able to choose correctly? But maybe it is a
rhetorical question.

On Tue, Mar 25, 2014 at 4:21 PM, Magnus Hagander <magnus@hagander.net> wrote:
> I would say that's the one thing that rsync is *not*. pg_basebackup takes
> care of a lot of things under the hood. rsync is a lot more complicated, in
> particular in failure scenarios, since you have to manually deal with
> pg_start/stop_backup().
>
> There are definitely reasons you'd prefer rsync over pg_basebackup, but I
> don't believe simplicity is one of them.
>
> //Magnus
>
>
> On Tue, Mar 25, 2014 at 4:18 PM, Ilya Kosmodemiansky
> <ilya.kosmodemiansky@postgresql-consulting.com> wrote:
>>
>> Joshua,
>>
>> that is really good point: an alternative is to use pg_basebackup
>> through ssh tunnel with compression, but rsync is much simpler.
>>
>> On Tue, Mar 25, 2014 at 3:56 PM, Joshua D. Drake <jd@commandprompt.com>
>> wrote:
>> >
>> > On 03/25/2014 05:05 AM, Claudio Freire wrote:
>> >>
>> >>
>> >> On Tue, Mar 25, 2014 at 4:39 AM, David Johnston <polobo@yahoo.com>
>> >> wrote:
>> >>>>
>> >>>> Hai,
>> >>>>
>> >>>> Can anyone tell me the difference and performance between pgdump and
>> >>>> pg_basebackup if I want to backup a large database.
>> >>>>
>> >
>> > Honestly,
>> >
>> > Neither is particularly good at backing up large databases. I would look
>> > into PITR with rsync.
>> >
>> > JD
>> >
>> >
>> > --
>> > Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
>> > PostgreSQL Support, Training, Professional Services and Development
>> > High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
>> > Political Correctness is for cowards.
>> >
>> >
>> >
>> > --
>> > Sent via pgsql-performance mailing list
>> > (pgsql-performance@postgresql.org)
>> > To make changes to your subscription:
>> > http://www.postgresql.org/mailpref/pgsql-performance
>>
>>
>>
>> --
>> Ilya Kosmodemiansky,
>>
>> PostgreSQL-Consulting.com
>> tel. +14084142500
>> cell. +4915144336040
>> ik@postgresql-consulting.com
>>
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>
>
>
>
> --
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/



--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com



--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: pg_dump vs pg_basebackup

От
Ilya Kosmodemiansky
Дата:
OK, agreed. Ive got your point;-)

On Tue, Mar 25, 2014 at 4:40 PM, Magnus Hagander <magnus@hagander.net> wrote:
> Oh, I agree it's good that you should know both methods. I only disagree
> with that the choice of rsync be made with the argument of simplicity.
> Simplicity is one of the main reasons to choose the *other* method
> (pg_basebackup), and the rsync method is for more advanced usecases. But
> it's definitely good to know both!
>
> //Magnus
>
>
>
> On Tue, Mar 25, 2014 at 4:37 PM, Ilya Kosmodemiansky
> <ilya.kosmodemiansky@postgresql-consulting.com> wrote:
>>
>> Magnus,
>>
>> That is correct, but I'am afraid that such all-in-one functionality
>> also hides from one how backup really works. Probably such sort of
>> knowledge is so essential for a DBA, that it is better to learn both
>> methods, at least to be able to choose correctly? But maybe it is a
>> rhetorical question.
>>
>> On Tue, Mar 25, 2014 at 4:21 PM, Magnus Hagander <magnus@hagander.net>
>> wrote:
>> > I would say that's the one thing that rsync is *not*. pg_basebackup
>> > takes
>> > care of a lot of things under the hood. rsync is a lot more complicated,
>> > in
>> > particular in failure scenarios, since you have to manually deal with
>> > pg_start/stop_backup().
>> >
>> > There are definitely reasons you'd prefer rsync over pg_basebackup, but
>> > I
>> > don't believe simplicity is one of them.
>> >
>> > //Magnus
>> >
>> >
>> > On Tue, Mar 25, 2014 at 4:18 PM, Ilya Kosmodemiansky
>> > <ilya.kosmodemiansky@postgresql-consulting.com> wrote:
>> >>
>> >> Joshua,
>> >>
>> >> that is really good point: an alternative is to use pg_basebackup
>> >> through ssh tunnel with compression, but rsync is much simpler.
>> >>
>> >> On Tue, Mar 25, 2014 at 3:56 PM, Joshua D. Drake <jd@commandprompt.com>
>> >> wrote:
>> >> >
>> >> > On 03/25/2014 05:05 AM, Claudio Freire wrote:
>> >> >>
>> >> >>
>> >> >> On Tue, Mar 25, 2014 at 4:39 AM, David Johnston <polobo@yahoo.com>
>> >> >> wrote:
>> >> >>>>
>> >> >>>> Hai,
>> >> >>>>
>> >> >>>> Can anyone tell me the difference and performance between pgdump
>> >> >>>> and
>> >> >>>> pg_basebackup if I want to backup a large database.
>> >> >>>>
>> >> >
>> >> > Honestly,
>> >> >
>> >> > Neither is particularly good at backing up large databases. I would
>> >> > look
>> >> > into PITR with rsync.
>> >> >
>> >> > JD
>> >> >
>> >> >
>> >> > --
>> >> > Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
>> >> > PostgreSQL Support, Training, Professional Services and Development
>> >> > High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
>> >> > Political Correctness is for cowards.
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Sent via pgsql-performance mailing list
>> >> > (pgsql-performance@postgresql.org)
>> >> > To make changes to your subscription:
>> >> > http://www.postgresql.org/mailpref/pgsql-performance
>> >>
>> >>
>> >>
>> >> --
>> >> Ilya Kosmodemiansky,
>> >>
>> >> PostgreSQL-Consulting.com
>> >> tel. +14084142500
>> >> cell. +4915144336040
>> >> ik@postgresql-consulting.com
>> >>
>> >>
>> >> --
>> >> Sent via pgsql-performance mailing list
>> >> (pgsql-performance@postgresql.org)
>> >> To make changes to your subscription:
>> >> http://www.postgresql.org/mailpref/pgsql-performance
>> >
>> >
>> >
>> >
>> > --
>> >  Magnus Hagander
>> >  Me: http://www.hagander.net/
>> >  Work: http://www.redpill-linpro.com/
>>
>>
>>
>> --
>> Ilya Kosmodemiansky,
>>
>> PostgreSQL-Consulting.com
>> tel. +14084142500
>> cell. +4915144336040
>> ik@postgresql-consulting.com
>
>
>
>
> --
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/



--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


Re: pg_dump vs pg_basebackup

От
"Graeme B. Bell"
Дата:
Postgresql rsync backups require the DB to be shutdown during the 'second' rsync.

1. rsync the DB onto the backup filesystem  (produces e.g. 95-99.99% consistent DB on the backup filesystem)
2. shut down the DB
3. rsync the shut down DB onto the backup filesystem    (synchronises the last few files to make the DB consistent, and
isusually very fast) 
4. start the DB up again

Is there any way to notify postgres to pause transactions (and note that they should be restarted), and flush out write
buffersetc, instead of doing a full shutdown?  
e.g. so that the second rsync call would bring the backup filesystem's representation of the DB into a recoverable
statewithout needing to shutdown the production DB completely.  

G

On 25 Mar 2014, at 16:29, Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com> wrote:

> Joshua,
>
> On Tue, Mar 25, 2014 at 4:22 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> The advantage is that you can create backups that don't
>> have to be restored, just started. You can also use the differential
>> portions of rsync to do it multiple times a day without much issue.
>
> Are you sure, that it is a nice idea on a database with heavy write workload?
>
> And also Im not sure, that differential backups using rsync will be
> recoverable, if you have actually meant that.
>
>>
>>
>> JD
>>
>> --
>> Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
>> PostgreSQL Support, Training, Professional Services and Development
>> High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
>> Political Correctness is for cowards.
>
>
>
> --
> Ilya Kosmodemiansky,
>
> PostgreSQL-Consulting.com
> tel. +14084142500
> cell. +4915144336040
> ik@postgresql-consulting.com
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance



Re: pg_dump vs pg_basebackup

От
"Joshua D. Drake"
Дата:
On 03/25/2014 08:21 AM, Magnus Hagander wrote:
> I would say that's the one thing that rsync is *not*. pg_basebackup
> takes care of a lot of things under the hood. rsync is a lot more
> complicated, in particular in failure scenarios, since you have to
> manually deal with pg_start/stop_backup().
>
> There are definitely reasons you'd prefer rsync over pg_basebackup, but
> I don't believe simplicity is one of them.
>
> //Magnus

Good God man... since when do you top post!

Well there are tools that use rsync to solve those issues :P. We even
have one that does multi-threaded rsync so you can pull many Terabytes
in very little time (relatively).

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
Political Correctness is for cowards.


Re: pg_dump vs pg_basebackup

От
Alan Hodgson
Дата:
On Tuesday, March 25, 2014 03:48:07 PM Graeme B. Bell wrote:
> Postgresql rsync backups require the DB to be shutdown during the 'second'
> rsync.
>
> 1. rsync the DB onto the backup filesystem  (produces e.g. 95-99.99%
> consistent DB on the backup filesystem) 2. shut down the DB
> 3. rsync the shut down DB onto the backup filesystem    (synchronises the
> last few files to make the DB consistent, and is usually very fast) 4.
> start the DB up again
>
> Is there any way to notify postgres to pause transactions (and note that
> they should be restarted), and flush out write buffers etc, instead of
> doing a full shutdown? e.g. so that the second rsync call would bring the
> backup filesystem's representation of the DB into a recoverable state
> without needing to shutdown the production DB completely.
>

You use pg_start_backup() before rsync, and pg_stop_backup() after. And keep
all your WAL log files. No need to pause transactions; whatever happens during
the rsync just gets replayed during recovery (as I understand it). You do need
to do a PITR restore to make use of this rsync copy.

That's basically what pg_basebackup does, I believe (I haven't used it, I only
do rsyncs).



Re: pg_dump vs pg_basebackup

От
Claudio Freire
Дата:
On Tue, Mar 25, 2014 at 12:22 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> On 03/25/2014 08:18 AM, Ilya Kosmodemiansky wrote:
>>
>>
>> Joshua,
>>
>> that is really good point: an alternative is to use pg_basebackup
>> through ssh tunnel with compression, but rsync is much simpler.
>
>
> Or rsync over ssh. The advantage is that you can create backups that don't
> have to be restored, just started. You can also use the differential
> portions of rsync to do it multiple times a day without much issue.


rsync's delta transfer isn't relly very effective with postgres. You
don't save any I/O, just network traffic, and in general the
bottleneck is I/O (unless you have a monster I/O subsys or a snail of
a network one).

There were some musing about making delta transfer more efficient in
pg in hackers, but I don't think anything tangible came out of that,
so it's basically equivalent to a full transfer. The only reason to
leverage rsync's delta transfer would be to decrease the time between
pg_start_backup and pg_stop_backup, which could only matter if you're
low on WAL space, but the reduction, in my experience, isn't stellar.