Обсуждение: Incremental Refresh - PostgreSQL 9.2

Поиск
Список
Период
Сортировка

Incremental Refresh - PostgreSQL 9.2

От
"drum.lucas@gmail.com"
Дата:
Hi all,

I've got a Master server and a Test Server.

Master server has a 1.7 TB
Test server has a full copy from master - 1.7 TB.

The copy was made using pg_dump

I.E: /usr/pgsql-9.2/bin/pg_dump --exclude-table-data='junk.*' --format=custom geoop_live > bigdump.sql

Question:
Now that the test server has a full copy from master, how could I just do a incremental refreshing once a month?

Instead copying all the DB (1.7GB), just copy what has been changed?
Is that possible?

Cheers

Re: Incremental Refresh - PostgreSQL 9.2

От
"David G. Johnston"
Дата:
On Tue, Feb 23, 2016 at 2:59 PM, drum.lucas@gmail.com <drum.lucas@gmail.com> wrote:
Hi all,

I've got a Master server and a Test Server.

Master server has a 1.7 TB
Test server has a full copy from master - 1.7 TB.

The copy was made using pg_dump

I.E: /usr/pgsql-9.2/bin/pg_dump --exclude-table-data='junk.*' --format=custom geoop_live > bigdump.sql

Question:
Now that the test server has a full copy from master, how could I just do a incremental refreshing once a month?

Instead copying all the DB (1.7GB), just copy what has been changed?
Is that possible?


​You cannot accomplish incremental updates if your choice of tool is pg_dump.  The only way to do a true incremental would be to rely upon something like rsync on the data directory while the master server is offline.

So, assuming that is not an option, you should probably look at the various true replication solutions that are available and see which of those might serve as an alternative.

​David J.

Re: Incremental Refresh - PostgreSQL 9.2

От
"drum.lucas@gmail.com"
Дата:

​You cannot accomplish incremental updates if your choice of tool is pg_dump.  The only way to do a true incremental would be to rely upon something like rsync on the data directory while the master server is offline.

So, assuming that is not an option, you should probably look at the various true replication solutions that are available and see which of those might serve as an alternative.

​David J.


The point is: I don't want a slave server - I want a full copy from my MASTER server.

So replication wouldn't be a nice solution...


Can I do a rsync from /data folder from my SLAVE into my TEST server?
To do that, both postgres process should be stopped, right?

Re: Incremental Refresh - PostgreSQL 9.2

От
Scott Whitney
Дата:
Are you using it for just select queries, or is this a full 100% test (insert/update/delete) test environment?

I have both. In my case they are geographically separate. Production server at data center. Replicated "just in case," also at data center. At HQ, I replicate that for DR purposes. INTERNALLY (gigabit backend), I have another internal test server that I refresh FROM THE REPLICATED SERVER AT HQ on a regular basis. I use the replicated server to query stuff (select) and dump databases to the interaction test server regularly for full testing.


-------- Original message --------
From: "David G. Johnston" <david.g.johnston@gmail.com>
Date: 02/23/2016 4:16 PM (GMT-06:00)
To: drum.lucas@gmail.com
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Incremental Refresh - PostgreSQL 9.2

On Tue, Feb 23, 2016 at 2:59 PM, drum.lucas@gmail.com <drum.lucas@gmail.com> wrote:
Hi all,

I've got a Master server and a Test Server.

Master server has a 1.7 TB
Test server has a full copy from master - 1.7 TB.

The copy was made using pg_dump

I.E: /usr/pgsql-9.2/bin/pg_dump --exclude-table-data='junk.*' --format=custom geoop_live > bigdump.sql

Question:
Now that the test server has a full copy from master, how could I just do a incremental refreshing once a month?

Instead copying all the DB (1.7GB), just copy what has been changed?
Is that possible?


​You cannot accomplish incremental updates if your choice of tool is pg_dump.  The only way to do a true incremental would be to rely upon something like rsync on the data directory while the master server is offline.

So, assuming that is not an option, you should probably look at the various true replication solutions that are available and see which of those might serve as an alternative.

​David J.



Journyx, Inc.
7600 Burnet Road #300
Austin, TX 78757
www.journyx.com

p 512.834.8888 
f 512-834-8858 

Do you receive our promotional emails? You can subscribe or unsubscribe to those emails at http://go.journyx.com/emailPreference/e/4932/714/ 

Re: Incremental Refresh - PostgreSQL 9.2

От
"drum.lucas@gmail.com"
Дата:


On 24 February 2016 at 11:42, Scott Whitney <scott@journyx.com> wrote:
Are you using it for just select queries, or is this a full 100% test (insert/update/delete) test environment?

I have both. In my case they are geographically separate. Production server at data center. Replicated "just in case," also at data center. At HQ, I replicate that for DR purposes. INTERNALLY (gigabit backend), I have another internal test server that I refresh FROM THE REPLICATED SERVER AT HQ on a regular basis. I use the replicated server to query stuff (select) and dump databases to the interaction test server regularly for full testing.



That's what I'm looking for:


Re: Incremental Refresh - PostgreSQL 9.2

От
Scott Whitney
Дата:
Please do not refer me (or the list) to a URL when you are asking for help. Be clear and concise. If you have a URI that helps you, please be specific in the places you do not understand.

-------- Original message --------
From: drum.lucas@gmail.com
Date: 02/23/2016 6:46 PM (GMT-06:00)
To: Scott Whitney <scott@journyx.com>
Cc: "David G. Johnston" <david.g.johnston@gmail.com>, pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Incremental Refresh - PostgreSQL 9.2



On 24 February 2016 at 11:42, Scott Whitney <scott@journyx.com> wrote:
Are you using it for just select queries, or is this a full 100% test (insert/update/delete) test environment?

I have both. In my case they are geographically separate. Production server at data center. Replicated "just in case," also at data center. At HQ, I replicate that for DR purposes. INTERNALLY (gigabit backend), I have another internal test server that I refresh FROM THE REPLICATED SERVER AT HQ on a regular basis. I use the replicated server to query stuff (select) and dump databases to the interaction test server regularly for full testing.



That's what I'm looking for:




Journyx, Inc.
7600 Burnet Road #300
Austin, TX 78757
www.journyx.com

p 512.834.8888 
f 512-834-8858 

Do you receive our promotional emails? You can subscribe or unsubscribe to those emails at http://go.journyx.com/emailPreference/e/4932/714/ 

Re: Incremental Refresh - PostgreSQL 9.2

От
"drum.lucas@gmail.com"
Дата:


On 24 February 2016 at 14:23, Scott Whitney <scott@journyx.com> wrote:
Please do not refer me (or the list) to a URL when you are asking for help. Be clear and concise. If you have a URI that helps you, please be specific in the places you do not understand.

Scott,

I've asked for some tips about how could I do incremental backups in Postgres WITHOUT stopping it.

As I haven't had an expected answer, I decided to post here what I've found.
So in the future others with the same question may be helped.

And at any time I referred you or the list to a URL

Lucas

Re: Incremental Refresh - PostgreSQL 9.2

От
"drum.lucas@gmail.com"
Дата:


On 24 February 2016 at 14:34, drum.lucas@gmail.com <drum.lucas@gmail.com> wrote:


On 24 February 2016 at 14:23, Scott Whitney <scott@journyx.com> wrote:
Please do not refer me (or the list) to a URL when you are asking for help. Be clear and concise. If you have a URI that helps you, please be specific in the places you do not understand.

Scott,

I've asked for some tips about how could I do incremental backups in Postgres WITHOUT stopping it.

As I haven't had an expected answer, I decided to post here what I've found.
So in the future others with the same question may be helped.

I haven't referred you or the list to a URL********

Lucas

Re: Incremental Refresh - PostgreSQL 9.2

От
Keith
Дата:

On Tue, Feb 23, 2016 at 8:36 PM, drum.lucas@gmail.com <drum.lucas@gmail.com> wrote:


On 24 February 2016 at 14:34, drum.lucas@gmail.com <drum.lucas@gmail.com> wrote:


On 24 February 2016 at 14:23, Scott Whitney <scott@journyx.com> wrote:
Please do not refer me (or the list) to a URL when you are asking for help. Be clear and concise. If you have a URI that helps you, please be specific in the places you do not understand.

Scott,

I've asked for some tips about how could I do incremental backups in Postgres WITHOUT stopping it.

As I haven't had an expected answer, I decided to post here what I've found.
So in the future others with the same question may be helped.

I haven't referred you or the list to a URL********

Lucas


What you originally asked for and what you're referring to in that link are two very different things. An incremental backup using the methods in that link is possible without any third-party tools (look into pg_basebackup & pg_receivexlogs which both come with 9.2+), it's just that the third party tools can possibly make it slightly easier and provide some additional options

However, once you restore that backup and bring it up as a standalone database instance, you can no longer use the WAL stream from the original instance to "replay" those changes. This is also the case with any sort of rsync'ing of the data files from one instance to another.

You may be better off looking into a logical replication solution to copy your data to a testing system if it needs to be a standalone system. That is not currently built into Postgres, but it's on track for possible inclusion with 9.6. In the mean time look into tools such as Mimeo or Bucardo. 


Keith

Re: Incremental Refresh - PostgreSQL 9.2

От
Matheus de Oliveira
Дата:

On Tue, Feb 23, 2016 at 6:59 PM, drum.lucas@gmail.com <drum.lucas@gmail.com> wrote:
I.E: /usr/pgsql-9.2/bin/pg_dump --exclude-table-data='junk.*' --format=custom geoop_live > bigdump.sql

How are restoring it? Have tried using pg_restore with "-j" or "--jobs" option? This tend to make the restore process way faster. And as you should be doing logical backups routinely, you can simple use those.

In fact, if you are on 9.3+, you can also use directory format on pg_dump to have -j option there also.

Now that the test server has a full copy from master, how could I just do a incremental refreshing once a month?

Besides the already mentioned options, I see two more, but both are based on physical backup of the primary dabase, and not a logical backup (by using pg_dump):

Option 1: you can use rsync to copy from primary to test database, while on the first run it will take awhile, the next runs might be way faster *if* good part of the files haven't changed. Problem with rsync is that you either need to have the primary database completely shtdown or you need to use --checksum option of rsync (it is not safe otherwise and it makes a read the entire data set, but transfer only the changed files). Another option is to use rsync from an standby, streaming replica, database. If you choose to use rsync with --checksum, you need to do somthing in the lines of:

    a) shutdown test database
    b) on primary: psql -c "SELECT pg_start_backup('rsync-test');"
    c) on any node: rsync -azvP --checksum <primary>:/path/to/pgdata <test>:/path/to/pgata
    d) on primary: psql -c "SELECT pg_stop_backup();"
    e) copy archivelogs to test database and setup recovery.conf with restore_command at least
    f) start test database

You need to check how much rsync --checksum really buy you, sometimes it is just faster to copy everything (but then pg_dump/pg_restore is probably fast enough either), but, if the nodes are geographic apart and most of the files haven't changed, then --checksum is probably faster.

Option 2: use pg_rewind [1], if you can afford to save every WAL segment generated since each "refresh".
Best regards,
--
Matheus de Oliveira


Re: Incremental Refresh - PostgreSQL 9.2

От
Stephen Frost
Дата:
David,

* David G. Johnston (david.g.johnston@gmail.com) wrote:
> On Tue, Feb 23, 2016 at 2:59 PM, drum.lucas@gmail.com <drum.lucas@gmail.com>
> wrote:
> > Question:
> > Now that the test server has a full copy from master, how could I just do
> > a incremental refreshing once a month?
> >
> > Instead copying all the DB (1.7GB), just copy what has been changed?
> > Is that possible?
> >
> >
> ​You cannot accomplish incremental updates if your choice of tool is
> pg_dump.

Agreed.

> The only way to do a true incremental would be to rely upon
> something like rsync on the data directory while the master server is
> offline.

pgBackRest actually provides incremental backups and restores and I've
used it for exactly this kind of setup previously.

Thanks!

Stephen

Вложения

Re: Incremental Refresh - PostgreSQL 9.2

От
Stephen Frost
Дата:
Keith,

* Keith (keith@keithf4.com) wrote:
> However, once you restore that backup and bring it up as a standalone
> database instance, you can no longer use the WAL stream from the original
> instance to "replay" those changes. This is also the case with any sort of
> rsync'ing of the data files from one instance to another.

Right, but if you don't mind throwing away the changes which were made
to the restored server (the test server) then you can use rsync (if the
source database is shut down) to re-sync the restored server without
having to re-copy everything (I'd strongly recommend using the CRC
option to rsync).  The same is possible with pgBackRest, which allows
you to keep the source database online (pgBackRest will check that the
hashes match between the backup and the restored system when doing
incremental restores, there's no need to explicitly ask for that as
there is with rsync).

> You may be better off looking into a logical replication solution to copy
> your data to a testing system if it needs to be a standalone system. That
> is not currently built into Postgres, but it's on track for possible
> inclusion with 9.6. In the mean time look into tools such as Mimeo or
> Bucardo.
>
> https://github.com/omniti-labs/mimeo
> https://bucardo.org/wiki/Bucardo

Right, and there's also Slony.  These are useful if the requirement is
that changes to the test/restored server be persisted.

Thanks!

Stephen

Вложения