Thread: pg_resetwal --next-transaction-id may cause database failed to restart.

pg_resetwal --next-transaction-id may cause database failed to restart.

From
"movead.li@highgo.ca"
Date:
Hello hackers,

When I use the pg_resetwal tool to skip some transaction IDs, I run into a problem:
the tool accepts any transaction ID I pass with the '-x' option, but the database
may then fail to restart because it cannot read a file under $PGDATA/pg_xact.  For
example, if the 'NextXID' in a database is 1000 and you pass '-x 32769', the database
fails to restart.
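
A minimal sketch of where the two transaction IDs from this example live in pg_xact. It assumes a default build (8 kB blocks, two status bits per transaction, 32 pages per segment file); the function name clog_location is made up for illustration. Under those assumptions, 32769 falls on the second page of the same segment file, a page that a cluster at NextXID 1000 has presumably never written.

```python
# Assumed defaults: 8 kB blocks, 2 status bits per transaction
# (4 transactions per byte), 32 pages per segment file.
BLCKSZ = 8192
XACTS_PER_BYTE = 4
XACTS_PER_PAGE = BLCKSZ * XACTS_PER_BYTE   # 32768
PAGES_PER_SEGMENT = 32


def clog_location(xid):
    """Return (segment file name, page number inside that segment) for xid."""
    page = xid // XACTS_PER_PAGE
    segment = page // PAGES_PER_SEGMENT
    return f"{segment:04X}", page % PAGES_PER_SEGMENT


print(clog_location(1000))    # ('0000', 0)
print(clog_location(32769))   # ('0000', 1)
```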

I read the pg_resetwal documentation, which tells the user to pick a 'safe value', but I
think pg_resetwal should report an unsafe value and refuse to do the reset, rather than
leaving the problem hidden until the user restarts the database.

I wrote an initial patch to limit the input. It now accepts a transaction ID in two cases:
1. The transaction ID is on the same CLOG page as the 'NextXID' in pg_control.
2. The transaction ID is right at the end of a CLOG page.
Limiting the input this way ensures the database can restart successfully.
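
The two rules above can be sketched as a small predicate. This is not the patch itself: the function names are hypothetical, the page size assumes a default 8 kB build, and rule 2 is read here as "the value is the first slot of a new page", which is only one possible reading of "right at the end of a CLOG page".

```python
BLCKSZ = 8192
XACTS_PER_PAGE = BLCKSZ * 4   # 2 status bits per transaction (assumed default)


def same_clog_page(a, b):
    """Rule 1: both transaction IDs fall on the same CLOG page."""
    return a // XACTS_PER_PAGE == b // XACTS_PER_PAGE


def on_page_boundary(xid):
    """Rule 2, under the ASSUMED reading: the previous page is exactly
    full, so xid is the first slot of a page."""
    return xid % XACTS_PER_PAGE == 0


def is_safe_next_xid(current_next_xid, requested_xid):
    return (same_clog_page(current_next_xid, requested_xid)
            or on_page_boundary(requested_xid))


print(is_safe_next_xid(1000, 2000))    # True  (same page)
print(is_safe_next_xid(1000, 32769))   # False (the example from this mail)
print(is_safe_next_xid(1000, 32768))   # True  (page boundary)
```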

The same situation applies to the multixact and multixact-offset options, and I made
the same change for them in the patch.

Do you think this is an issue?


Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
Attachments

Re: pg_resetwal --next-transaction-id may cause database failed to restart.

From
Alvaro Herrera
Date:
On 2020-Jun-22, movead.li@highgo.ca wrote:

> Hello hackers,
> 
> When I use the pg_resetwal tool to skip some transaction IDs, I run into a problem:
> the tool accepts any transaction ID I pass with the '-x' option, but the database
> may then fail to restart because it cannot read a file under $PGDATA/pg_xact.  For
> example, if the 'NextXID' in a database is 1000 and you pass '-x 32769', the database
> fails to restart.

Yeah, the normal workaround is to create the necessary file manually in
order to let the system start after such an operation; they are
sometimes necessary to enable testing weird cases with wraparound and
such.  So a total rejection to work for these cases would be unhelpful
precisely for the scenario that those switches were intended to serve.
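
A rough sketch of the manual workaround described above: make sure the pg_xact page holding the new transaction ID exists, zero-filled like a freshly extended page. It assumes a default build (8 kB blocks, 32 pages per segment) and a stopped server; zero_fill_clog_page is a hypothetical helper, and the demo runs against a temporary directory rather than a real data directory.

```python
import os
import tempfile

BLCKSZ = 8192
XACTS_PER_PAGE = BLCKSZ * 4     # assumed default: 2 status bits per transaction
PAGES_PER_SEGMENT = 32


def zero_fill_clog_page(pg_xact_dir, xid):
    """Create or extend the segment file so the page holding xid exists."""
    page = xid // XACTS_PER_PAGE
    path = os.path.join(pg_xact_dir, f"{page // PAGES_PER_SEGMENT:04X}")
    needed_size = (page % PAGES_PER_SEGMENT + 1) * BLCKSZ
    current_size = os.path.getsize(path) if os.path.exists(path) else 0
    # Append mode creates a missing file and never truncates existing data.
    with open(path, "ab") as f:
        f.write(b"\0" * max(0, needed_size - current_size))
    return path


with tempfile.TemporaryDirectory() as demo_dir:
    # Stand-in for a cluster that has only written its first CLOG page.
    with open(os.path.join(demo_dir, "0000"), "wb") as f:
        f.write(b"\x55" * BLCKSZ)
    segment = zero_fill_clog_page(demo_dir, 32769)
    print(os.path.getsize(segment))   # 16384
```

Existing bytes are left untouched; only the missing tail is appended, so rerunning the helper is harmless.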

Maybe a better answer is to have a new switch in postmaster that creates
any needed files (incl. producing associated WAL etc); so you'd run
pg_resetwal -x some-value
postmaster --create-special-stuff
then start your server and off you go.

Now maybe this is too much complication for a mechanism that really
isn't for general consumption anyway.  I mean, if you're using
pg_resetwal, you're already playing with fire.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_resetwal --next-transaction-id may cause database failed to restart.

From
"movead.li@highgo.ca"
Date:

>Yeah, the normal workaround is to create the necessary file manually in
>order to let the system start after such an operation; they are
>sometimes necessary to enable testing weird cases with wraparound and
>such.  So a total rejection to work for these cases would be unhelpful
>precisely for the scenario that those switches were intended to serve.
I think these words should appear in the pg_resetwal documentation if we decide
to do nothing about this issue.

>Maybe a better answer is to have a new switch in postmaster that creates
>any needed files (incl. producing associated WAL etc); so you'd run
>pg_resetwal -x some-value
>postmaster --create-special-stuff
>then start your server and off you go.
The documentation already reads as if only a safe value is allowed, so I think it's better
to enforce that rule and add an option to force writing an unsafe value if necessary.
 
>Now maybe this is too much complication for a mechanism that really
>isn't for general consumption anyway.  I mean, if you're using
>pg_resetwal, you're already playing with fire.
Yes, that's true; I have always heard people say "You'd better not use pg_resetwal".
But the tool is part of the PG code, so it's better to improve it than to do nothing.


Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: pg_resetwal --next-transaction-id may cause database failed to restart.

From
Alvaro Herrera
Date:
On 2020-Jun-24, movead.li@highgo.ca wrote:

> >Maybe a better answer is to have a new switch in postmaster that creates
> >any needed files (incl. producing associated WAL etc); so you'd run
> >pg_resetwal -x some-value
> >postmaster --create-special-stuff
> >then start your server and off you go.
>
> The documentation already reads as if only a safe value is allowed, so I think it's better
> to enforce that rule and add an option to force writing an unsafe value if necessary.

ISTM that a reasonable compromise is that if you use -x (or -c, -m, -O)
and the input value is outside the range supported by existing files,
then it's a fatal error; unless you use --force, which turns it into
just a warning.
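
One way to picture "outside the range supported by existing files" for the -x case is a check that the pg_xact page for the requested value is already on disk. This is only a sketch: clog_page_exists is a hypothetical helper, the constants assume a default 8 kB build, and the other options (-c, -m, -O) would need their own directories and page layouts, which are not modeled here.

```python
import os
import tempfile

BLCKSZ = 8192
XACTS_PER_PAGE = BLCKSZ * 4     # assumed default: 2 status bits per transaction
PAGES_PER_SEGMENT = 32


def clog_page_exists(pg_xact_dir, xid):
    """True if the segment file for xid exists and is long enough
    to contain the page that xid falls on."""
    page = xid // XACTS_PER_PAGE
    path = os.path.join(pg_xact_dir, f"{page // PAGES_PER_SEGMENT:04X}")
    if not os.path.isfile(path):
        return False
    return os.path.getsize(path) >= (page % PAGES_PER_SEGMENT + 1) * BLCKSZ


with tempfile.TemporaryDirectory() as demo_dir:
    # Stand-in for a young cluster: one segment file holding one page.
    with open(os.path.join(demo_dir, "0000"), "wb") as f:
        f.write(b"\0" * BLCKSZ)
    print(clog_page_exists(demo_dir, 1000))    # True
    print(clog_page_exists(demo_dir, 32769))   # False
```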

> >Now maybe this is too much complication for a mechanism that really
> >isn't for general consumption anyway.  I mean, if you're using
> >pg_resetwal, you're already playing with fire.
> Yes, that's true; I have always heard people say "You'd better not use pg_resetwal".
> But the tool is part of the PG code, so it's better to improve it than to do nothing.

Sure.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_resetwal --next-transaction-id may cause database failed to restart.

From
"movead.li@highgo.ca"
Date:

>ISTM that a reasonable compromise is that if you use -x (or -c, -m, -O)
>and the input value is outside the range supported by existing files,
>then it's a fatal error; unless you use --force, which turns it into
>just a warning.
I do not think '--force' is a good choice, so I added a '--test, -t' option to
force writing an unsafe value to pg_control.
Do you think this is an acceptable approach?



Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
Attachments

Re: pg_resetwal --next-transaction-id may cause database failed to restart.

From
Alvaro Herrera
Date:
On 2020-Jul-07, movead.li@highgo.ca wrote:

> >ISTM that a reasonable compromise is that if you use -x (or -c, -m, -O)
> >and the input value is outside the range supported by existing files,
> >then it's a fatal error; unless you use --force, which turns it into
> >just a warning.
>
> I do not think '--force' is a good choice, so I added a '--test, -t' option to
> force writing an unsafe value to pg_control.
> Do you think this is an acceptable approach?

The rationale for this interface is unclear to me.  Please explain what
happens in each case?

In my proposal, we'd have:

* Bad value, no --force:
  - program raises error, no work done.
* Bad value with --force:
  - program raises warning but changes anyway.
* Good value, no --force:
  - program changes value without saying anything
* Good value with --force:
  - same

The rationale for this interface is convenient knowledgeable access: the
DBA runs the program with value X, and if the value is good, then
they're done.  If the program raises an error, DBA has a choice: either
run with --force because they know what they're doing, or don't do
anything because they know that they would make a mess.
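
The four cases listed above reduce to a two-input decision. A small sketch with hypothetical names (Outcome, decide), covering only the proposed behavior and not any existing pg_resetwal code:

```python
from enum import Enum


class Outcome(Enum):
    ERROR = "error, no work done"
    WARN_AND_APPLY = "warning, value changed anyway"
    APPLY = "value changed without comment"


def decide(value_is_good, force):
    """Map (value quality, --force) to the behavior proposed in this mail."""
    if value_is_good:
        return Outcome.APPLY            # --force makes no difference here
    return Outcome.WARN_AND_APPLY if force else Outcome.ERROR


for good in (False, True):
    for force in (False, True):
        print(good, force, decide(good, force).value)
```

Only the bad-value row depends on --force, which is why the switch costs nothing for a DBA who passes a good value.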

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_resetwal --next-transaction-id may cause database failed to restart.

From
"movead.li@highgo.ca"
Date:

>The rationale for this interface is unclear to me.  Please explain what
>happens in each case?
>In my proposal, we'd have:
>* Bad value, no --force:
>  - program raises error, no work done.
>* Bad value with --force:
>  - program raises warning but changes anyway.
>* Good value, no --force:
>  - program changes value without saying anything
>* Good value with --force:
>  - same
You have listed all the cases. You may be right that it needs to raise a warning
when forcing a bad value to be written; the patch is missing that.
For now I use '--test' in the patch rather than '--force'; this may need
more research and discussion.

>The rationale for this interface is convenient knowledgeable access: the
>DBA runs the program with value X, and if the value is good, then
>they're done.  If the program raises an error, DBA has a choice: either
>run with --force because they know what they're doing, or don't do
>anything because they know that they would make a mess.
Yes, that's it. In addition, the error raised can tell the DBA to enter a good
value.


Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: pg_resetwal --next-transaction-id may cause database failed to restart.

From
Robert Haas
Date:
On Wed, Jun 24, 2020 at 11:04 AM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> ISTM that a reasonable compromise is that if you use -x (or -c, -m, -O)
> and the input value is outside the range supported by existing files,
> then it's a fatal error; unless you use --force, which turns it into
> just a warning.

One potential problem is that you might be using --force for some
other reason and end up forcing this, too. But maybe that's OK.

Perhaps we should consider the idea of having pg_resetwal create the
relevant clog file and zero-fill it, if it doesn't exist already,
rather than leaving that to the DBA or the postmaster binary to do
it. It seems like that is what people would want to happen in this
situation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_resetwal --next-transaction-id may cause database failed to restart.

From
"movead.li@highgo.ca"
Date:

>> ISTM that a reasonable compromise is that if you use -x (or -c, -m, -O)
>> and the input value is outside the range supported by existing files,
>> then it's a fatal error; unless you use --force, which turns it into
>> just a warning.
 
>One potential problem is that you might be using --force for some
>other reason and end up forcing this, too. But maybe that's OK.
Yes, that's true, so I tried adding a new option to control this behavior; you
can see it in the patch attached to my previous mail.
 
>Perhaps we should consider the idea of having pg_resetwal create the
>relevant clog file and zero-fill it, if it doesn't exist already,
>rather than leaving that to the DBA or the postmaster binary to do
>it. It seems like that is what people would want to happen in this
>situation.
I have considered this idea, but I think it produces files that are not under the
postmaster's control, so I thought it might be unacceptable and gave up on it.

In the case where we force an unsafe value to be written, I think we can create or
extend the related files. If you have any further ideas, I can work out a new
patch.


Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: pg_resetwal --next-transaction-id may cause database failed to restart.

From
Alvaro Herrera
Date:
On 2020-Jul-09, movead.li@highgo.ca wrote:

> >> ISTM that a reasonable compromise is that if you use -x (or -c, -m, -O)
> >> and the input value is outside the range supported by existing files,
> >> then it's a fatal error; unless you use --force, which turns it into
> >> just a warning.
>  
> >One potential problem is that you might be using --force for some
> >other reason and end up forcing this, too. But maybe that's OK.
> Yes, that's true, so I tried adding a new option to control this behavior; you
> can see it in the patch attached to my previous mail.

It may be OK actually; if you're doing multiple dangerous changes, you'd
use --dry-run beforehand ... No?  (It's what *I* would do, for sure.)
Which in turn suggests that it would be good to ensure that --dry-run
*also* emits a warning (not an error, so that any other warnings can
also be thrown and the user gets the full picture).
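
A sketch of why a warning serves --dry-run better than an error: a fatal error stops at the first problem, while warnings let the check continue so that every questionable option is reported. The function name validate and the message wording are made up for illustration.

```python
def validate(unsafe_options, force=False, dry_run=False):
    """Return (ok_to_continue, messages).

    unsafe_options: option names (e.g. "-x") whose values failed a
    range check performed elsewhere.
    """
    messages = []
    for opt in unsafe_options:
        if force or dry_run:
            # Downgraded to a warning: keep going so the user sees
            # every problem, not just the first one.
            messages.append(f"warning: value for {opt} is outside existing files")
        else:
            messages.append(f"error: value for {opt} is outside existing files")
            return False, messages      # fatal: stop at the first problem
    return True, messages


print(validate(["-x", "-m"], dry_run=True))
print(validate(["-x", "-m"]))
```

With --dry-run both options are reported; without it, only the first one is, and the run stops there.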

I think adding multiple different --force switches makes the UI more
complex for little added value.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_resetwal --next-transaction-id may cause database failed to restart.

From
"movead.li@highgo.ca"
Date:

>It may be OK actually; if you're doing multiple dangerous changes, you'd
>use --dry-run beforehand ... No?  (It's what *I* would do, for sure.)
>Which in turn suggests that it would be good to ensure that --dry-run
>*also* emits a warning (not an error, so that any other warnings can
>also be thrown and the user gets the full picture).
Yes, that's true. I have changed the patch so that we get a warning rather than
an error when the --dry-run option is given.
I also reworked the code so that it reads more clearly.

>I think adding multiple different --force switches makes the UI more
>complex for little added value.
Yes, I feel the same way, but I can't convince myself to use --force
for this, because --force is meant for when something is wrong with the
pg_control file. We can listen to other hackers' proposals.


Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
Attachments