Обсуждение: [HACKERS] recovery_target_time = 'now' is not an error but still impracticalsetting

Поиск
Список
Период
Сортировка

[HACKERS] recovery_target_time = 'now' is not an error but still impracticalsetting

От
Piotr Stefaniak
Дата:
First I'll describe my setup just to give you some context. If anyone
would like to discuss my ideas or propose their own ideas for
discussion, let's do so on -ADMIN or -GENERAL.

I have multiple production database clusters which I want to make
backups of. Restoring from plain dumps takes too long, so I made an
almost typical continuous archiving setup. The unusual assumption in
this case is that the standbys are all on a single machine and they are
not always running. There are multiple $PGDATA directories on the
backups machine, but only one postmaster running in standby mode,
replaying archived WAL files from each master. When it's finished
replaying them for one $PGDATA, it'll move to another. That way they all
will be sufficiently up to date while not requiring resources needed for
N replicas running all the time on a single machine. This of course
requires that the standbys are never promoted, never change the
timeline, etc. - they need to be able to keep replaying WAL files from
the masters.

I've achieved what I wanted essentially by setting standby_mode = on and
restore_command = 'cp /archivedir/%f "%p" || { pg_ctl -D . stop && false
; }', but I was looking for a more elegant solution. Which brings us to
the topic.

One thing I tried was a combination of recovery_target_action =
'shutdown' and recovery_target_time = 'now'. The result is surprising,
because then the standby tries to set the target to year 2000. That's
because recovery_target_time depends on timestamptz_in(), the result of
which can depend on GetCurrentTransactionStartTimestamp(). But at that
point there isn't any transaction yet. Which is why I'm getting
"starting point-in-time recovery to 2000-01-01 01:00:00+01".

At the very least, I think timestamptz_in() should either complain about
being called outside of transaction or return the expected value,
because returning year 2000 is unuseful at best. I would also like to
become able to do what I'm doing in a less hacky way (assuming there
isn't one already but I may be wrong), perhaps once there is a new
'furthest' setting for recovery_target or when recovery_target_time =
'now' works as I expected.

Re: [HACKERS] recovery_target_time = 'now' is not an error but stillimpractical setting

От
Robert Haas
Дата:
On Tue, Aug 15, 2017 at 10:37 AM, Piotr Stefaniak
<postgres@piotr-stefaniak.me> wrote:
> One thing I tried was a combination of recovery_target_action =
> 'shutdown' and recovery_target_time = 'now'. The result is surprising,
> because then the standby tries to set the target to year 2000.

Seems like this is just a plain old bug.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] recovery_target_time = 'now' is not an error but stillimpractical setting

От
Michael Paquier
Дата:
On Tue, Aug 15, 2017 at 11:37 PM, Piotr Stefaniak
<postgres@piotr-stefaniak.me> wrote:
> At the very least, I think timestamptz_in() should either complain about
> being called outside of transaction or return the expected value,
> because returning year 2000 is unuseful at best. I would also like to
> become able to do what I'm doing in a less hacky way (assuming there
> isn't one already but I may be wrong), perhaps once there is a new
> 'furthest' setting for recovery_target or when recovery_target_time =
> 'now' works as I expected.

Hm. I think that the most simple solution here would be to change
GetCurrentDateTime() and GetCurrentTimeUsec() so as they use
GetCurrentTimestamp() instead of the transaction-level equivalents if
those code paths are invoked outside of a transaction. Any code using
those routines would get stupid timestamps anyway if they try to use
keywords like 'today', 'now' or 'tomorrow' so that does not sound like
a bad change to me.

And yes, that's clearly a bug. Nice discovery.
-- 
Michael



Re: [HACKERS] recovery_target_time = 'now' is not an error but still impractical setting

От
Tom Lane
Дата:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Tue, Aug 15, 2017 at 11:37 PM, Piotr Stefaniak
> <postgres@piotr-stefaniak.me> wrote:
>> At the very least, I think timestamptz_in() should either complain about
>> being called outside of transaction or return the expected value,
>> because returning year 2000 is unuseful at best. I would also like to
>> become able to do what I'm doing in a less hacky way (assuming there
>> isn't one already but I may be wrong), perhaps once there is a new
>> 'furthest' setting for recovery_target or when recovery_target_time =
>> 'now' works as I expected.

> Hm. I think that the most simple solution here would be to change
> GetCurrentDateTime() and GetCurrentTimeUsec() so as they use
> GetCurrentTimestamp() instead of the transaction-level equivalents if
> those code paths are invoked outside of a transaction.

-1 ... every datatype I/O function is entitled to assume it's being
invoked inside a transaction.  I do not think we should break that
on a case-by-case basis.  So using timestamptz_in directly in xlog.c
was a bad idea, and we need to rethink that.

The semantic problem here is that if "now" is defined to mean start
of current transaction --- which it is --- then that's meaningless
outside a transaction.  Redefining it as "start of current transaction,
unless we're not inside in a transaction, in which case something else"
isn't really an improvement.  I think we'd be much better off throwing
an error for this case.

Not sure offhand how to mechanize that given the current code
structure.  We don't want to duplicate all the infrastructure
used by timestamptz_in, certainly ... but how much of that
code is depending in some way on being inside a transaction?
I doubt that this issue with "now" is the only soft spot.
        regards, tom lane



Re: [HACKERS] recovery_target_time = 'now' is not an error but stillimpractical setting

От
Michael Paquier
Дата:
On Thu, Aug 17, 2017 at 1:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> -1 ... every datatype I/O function is entitled to assume it's being
> invoked inside a transaction.  I do not think we should break that
> on a case-by-case basis.  So using timestamptz_in directly in xlog.c
> was a bad idea, and we need to rethink that.

[evil mode]
Why not just calling SetCurrentStatementStartTimestamp() before
parsing the timestamp? A bit hacky, but I think that it would work.
[/evil mode]

> Not sure offhand how to mechanize that given the current code
> structure.  We don't want to duplicate all the infrastructure
> used by timestamptz_in, certainly ... but how much of that
> code is depending in some way on being inside a transaction?
> I doubt that this issue with "now" is the only soft spot.

tomorrow, today, etc are other things.
-- 
Michael



Re: [HACKERS] recovery_target_time = 'now' is not an error but stillimpractical setting

От
Simon Riggs
Дата:
On 15 August 2017 at 15:37, Piotr Stefaniak <postgres@piotr-stefaniak.me> wrote:

> One thing I tried was a combination of recovery_target_action =
> 'shutdown' and recovery_target_time = 'now'. The result is surprising

Indeed, special timestamp values were never considered in the design,
so I'm not surprised they don't work and have never been tested.

Your suggestion of "furthest" is already the default behaviour.

Why are you using 'now'? Why would you want to pick a randomly
selected end time?

recovery_target_time = 'allballs' sounds fun for recovering corrupted databases

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] recovery_target_time = 'now' is not an error but stillimpractical setting

От
Michael Paquier
Дата:
On Thu, Aug 17, 2017 at 6:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 15 August 2017 at 15:37, Piotr Stefaniak <postgres@piotr-stefaniak.me> wrote:
>
>> One thing I tried was a combination of recovery_target_action =
>> 'shutdown' and recovery_target_time = 'now'. The result is surprising
>
> Indeed, special timestamp values were never considered in the design,
> so I'm not surprised they don't work and have never been tested.

We could always use a TRY/CATCH block and add an error in
GetCurrentDateTime and GetCurrentTimeUsec if they are called out of a
transaction context. Rather-safe-than-sorry.

> Your suggestion of "furthest" is already the default behaviour.
>
> Why are you using 'now'? Why would you want to pick a randomly
> selected end time?

"now" is not much interesting, targets in the past are more, like
'yesterday'. This could create back an instance back to the beginning
of the previous day, simplifying scripts creating recovery.conf a bit,
even if that's not much simplification as we are talking about
creating a timestamp string.

> recovery_target_time = 'allballs' sounds fun for recovering corrupted databases

This one would not work :)
=# select timestamptz_in('allballs', 0, 0);
ERROR:  22007: invalid input syntax for type timestamp with time zone:
"allballs"
LOCATION:  DateTimeParseError, datetime.c:3800
Still that would be fun.
-- 
Michael



Re: [HACKERS] recovery_target_time = 'now' is not an error but stillimpractical setting

От
Simon Riggs
Дата:
On 18 August 2017 at 07:30, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Thu, Aug 17, 2017 at 6:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 15 August 2017 at 15:37, Piotr Stefaniak <postgres@piotr-stefaniak.me> wrote:
>>
>>> One thing I tried was a combination of recovery_target_action =
>>> 'shutdown' and recovery_target_time = 'now'. The result is surprising
>>
>> Indeed, special timestamp values were never considered in the design,
>> so I'm not surprised they don't work and have never been tested.
>
> We could always use a TRY/CATCH block and add an error in
> GetCurrentDateTime and GetCurrentTimeUsec if they are called out of a
> transaction context. Rather-safe-than-sorry.
>
>> Your suggestion of "furthest" is already the default behaviour.
>>
>> Why are you using 'now'? Why would you want to pick a randomly
>> selected end time?
>
> "now" is not much interesting, targets in the past are more, like
> 'yesterday'. This could create back an instance back to the beginning
> of the previous day, simplifying scripts creating recovery.conf a bit,
> even if that's not much simplification as we are talking about
> creating a timestamp string.

I can't see any value in allowing imprecise and effective random timestamps.

ISTM if we care, it would be better to simply exclude the 7 named
timestamps prior to their being sent, as in the attached patch.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] recovery_target_time = 'now' is not an error but stillimpractical setting

От
Piotr Stefaniak
Дата:
On 2017-08-17 11:24, Simon Riggs wrote:
> Your suggestion of "furthest" is already the default behaviour.
> 
> Why are you using 'now'? Why would you want to pick a randomly
> selected end time?

The idea in both cases was to stop the recovery in order to prevent the
standby from selecting new timeline. I want to be able to continue the
recovery from future WAL files.  "Furthest" really meant "as far as
possible without completing recovery".

Re: [HACKERS] recovery_target_time = 'now' is not an error but stillimpractical setting

От
Robert Haas
Дата:
On Fri, Aug 18, 2017 at 4:39 AM, Piotr Stefaniak
<postgres@piotr-stefaniak.me> wrote:
> On 2017-08-17 11:24, Simon Riggs wrote:
>> Your suggestion of "furthest" is already the default behaviour.
>>
>> Why are you using 'now'? Why would you want to pick a randomly
>> selected end time?
>
> The idea in both cases was to stop the recovery in order to prevent the
> standby from selecting new timeline. I want to be able to continue the
> recovery from future WAL files.  "Furthest" really meant "as far as
> possible without completing recovery".

Can you use recovery_target_action='shutdown' instead?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] recovery_target_time = 'now' is not an error but stillimpractical setting

От
Piotr Stefaniak
Дата:
On 2017-08-18 20:51, Robert Haas wrote:
> On Fri, Aug 18, 2017 at 4:39 AM, Piotr Stefaniak
> <postgres@piotr-stefaniak.me> wrote:
>> On 2017-08-17 11:24, Simon Riggs wrote:
>>> Your suggestion of "furthest" is already the default behaviour.
>>>
>>> Why are you using 'now'? Why would you want to pick a randomly
>>> selected end time?
>>
>> The idea in both cases was to stop the recovery in order to prevent the
>> standby from selecting new timeline. I want to be able to continue the
>> recovery from future WAL files.  "Furthest" really meant "as far as
>> possible without completing recovery".
> 
> Can you use recovery_target_action='shutdown' instead?

The thing I've tried was a combination of recovery_target_action =
'shutdown' and recovery_target_time = 'now'. Do you mean
recovery_target_action = 'shutdown' and everything else unset (default)?
That leads to the standby selecting new timeline anyway. Adding
standby_mode = on prevents that, but then there is no shutdown. Am I
missing something?

The only way I've managed to get recovery_target_action = 'shutdown' to
work as I needed was to follow advice by Matthijs and Christoph to
combine recovery_target_action with either setting recovery_target_time
to "now" spelled in the usual, non-special way or setting
recovery_target_xid to the latest transaction ID pg_xlogdump could find.
You could also try recovery_target_lsn, but I don't have that in 9.4. I
don't like that line of thought though, so for now I'll use the
restore_command hack I mentioned in the first message of this thread.