Обсуждение: standby parallel backup

Поиск
Список
Период
Сортировка

standby parallel backup

От
Aniko Belim
Дата:
Hi,

We are currently backing up our PostgreSQL 9.4 standby databases with pg_dump using file method. Now we need to change the method to using directory backup with parallel option to make the backup faster. 

But we get the following error:
ERROR:  cannot assign TransactionIds during recovery
STATEMENT:  SELECT pg_export_snapshot()

I assume that the problem occurs because it want to use export_snapshot() to ensure that all parallel connections of the dump will see the same snapshot, but can't because the standby is read-only.  

Is there a workaround for this problem? 
We found the --no-synchronized-snapshots option of pg_dump for a possible solution, which allows the backup to succeed, but I'm concerned about the consistency of the backup. I did some test with it, and it looks like I only need to ensure that no changes made while the connections of the parallel threads occur, not during the whole backup (which could cause a large lag in replication). Would it be safe to pause and resume replication only for that few seconds?

Or is there any other good solution for backing up hot standby with pg_dump using the parallel directory method while this bug fixed? (We found this error reported in 2013, and it looks like isn't fixed yet.)

Thank you!
Aniko

Re: standby parallel backup

От
John Scalia
Дата:
Is there some reason that you can't use pg_basebackup? It's much faster than pg_dump, like a couple of orders of
magnitude.It also allows you to specify how many threads to want 
to speed it up.

On 3/13/2015 5:26 PM, Aniko Belim wrote:
> Hi,
>
> We are currently backing up our PostgreSQL 9.4 standby databases with pg_dump using file method. Now we need to
changethe method to using directory backup with parallel option 
> to make the backup faster.
>
> But we get the following error:
> ERROR:  cannot assign TransactionIds during recovery
> STATEMENT:  SELECT pg_export_snapshot()
>
> I assume that the problem occurs because it want to use export_snapshot() to ensure that all parallel connections of
thedump will see the same snapshot, but can't because the 
> standby is read-only.
>
> Is there a workaround for this problem?
> We found the --no-synchronized-snapshots option of pg_dump for a possible solution, which allows the backup to
succeed,but I'm concerned about the consistency of the backup. I 
> did some test with it, and it looks like I only need to ensure that no changes made while the connections of the
parallelthreads occur, not during the whole backup (which could 
> cause a large lag in replication). Would it be safe to pause and resume replication only for that few seconds?
>
> Or is there any other good solution for backing up hot standby with pg_dump using the parallel directory method while
thisbug fixed? (We found this error reported in 2013, and 
> it looks like isn't fixed yet.)
>
> Thank you!
> Aniko
>



Re: standby parallel backup

От
Aniko Belim
Дата:
Hi,

We need to backup individual databases, sometimes just schemas. That's why we use pg_dump.

Aniko

On Fri, Mar 13, 2015 at 5:24 PM, John Scalia <jayknowsunix@gmail.com> wrote:
Is there some reason that you can't use pg_basebackup? It's much faster than pg_dump, like a couple of orders of magnitude. It also allows you to specify how many threads to want
to speed it up.

On 3/13/2015 5:26 PM, Aniko Belim wrote:
> Hi,
>
> We are currently backing up our PostgreSQL 9.4 standby databases with pg_dump using file method. Now we need to change the method to using directory backup with parallel option
> to make the backup faster.
>
> But we get the following error:
> ERROR:  cannot assign TransactionIds during recovery
> STATEMENT:  SELECT pg_export_snapshot()
>
> I assume that the problem occurs because it want to use export_snapshot() to ensure that all parallel connections of the dump will see the same snapshot, but can't because the
> standby is read-only.
>
> Is there a workaround for this problem?
> We found the --no-synchronized-snapshots option of pg_dump for a possible solution, which allows the backup to succeed, but I'm concerned about the consistency of the backup. I
> did some test with it, and it looks like I only need to ensure that no changes made while the connections of the parallel threads occur, not during the whole backup (which could
> cause a large lag in replication). Would it be safe to pause and resume replication only for that few seconds?
>
> Or is there any other good solution for backing up hot standby with pg_dump using the parallel directory method while this bug fixed? (We found this error reported in 2013, and
> it looks like isn't fixed yet.)
>
> Thank you!
> Aniko
>


Re: standby parallel backup

От
Simon Riggs
Дата:
On 13 March 2015 at 21:26, Aniko Belim <belim.aniko@gmail.com> wrote:

> Or is there any other good solution for backing up hot standby with pg_dump
> using the parallel directory method while this bug fixed? (We found this
> error reported in 2013, and it looks like isn't fixed yet.)

The ERROR message isn't a bug, it is an unimplemented feature.

Regrettably the developer that did the original work did not include
this requirement in his design. I wish that they had.

Sometime there is some confusion over the way things work with
PostgreSQL. There is no central development team working on features,
they are mostly donated by various companies, or worked on by
contributors employed by companies. So the way it works is you either
implement it yourself, or find/pay someone that can develop and
support that for you. Many contributors started by "scratching their
own itch", as its often described, that can be interesting work. You
may already have a company contact if you are working with one of the
professional companies around PostgreSQL. If so, please remind them
this is not a bug.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services