Discussion: [HACKERS] Proposal: pg_rewind to skip config files

[HACKERS] Proposal: pg_rewind to skip config files

From
Chris Travers
Date:
In some experiments with pg_rewind and repmgr I noticed that local testing is complicated by the fact that pg_rewind appears to copy configuration files from the source to the target directory.

I would propose to make a modest patch to exclude postgresql.conf, pg_hba.conf, and pg_ident.conf from the file tree traversal.

Any feedback before I create a proof of concept?

--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Sokolov Yura
Date:
On 2017-09-04 11:53, Chris Travers wrote:
> In some experiments with pg_rewind and rep mgr I noticed that local
> testing is complicated by the fact that pg_rewind appears to copy
> configuration files from the source to target directory.
> 
> I would propose to make a modest patch to exclude postgresql.conf,
> pg_hba.conf, and pg_ident.conf from the file tree traversal.
> 
> Any feedback before I create a proof of concept?
> 
> --
> 
> Best Regards,
> Chris Travers
> Database Administrator
> 
> Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com [1]
> Saarbrücker Straße 37a, 10405 Berlin
> 
> 
> 
> Links:
> ------
> [1] http://www.adjust.com/

We also had a production issue with pg_rewind, which copied huge textual
logs from pg_log (20GB each, because statements were logged for
statistics). It would be convenient to be able to tell pg_rewind not to
copy logs either.

-- 
Sokolov Yura
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company



Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Michael Paquier
Date:
On Mon, Sep 4, 2017 at 5:53 PM, Chris Travers <chris.travers@adjust.com> wrote:
> In some experiments with pg_rewind and rep mgr I noticed that local testing
> is complicated by the fact that pg_rewind appears to copy configuration
> files from the source to target directory.
>
> I would propose to make a modest patch to exclude postgresql.conf,
> pg_hba.conf, and pg_ident.conf from the file tree traversal.
>
> Any feedback before I create a proof of concept?

A simple idea would be to pass as a parameter a regex on which we
check files to skip when scanning the directory of the target remotely
or locally. This needs to be used with care though, it would be easy
to corrupt an instance.
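
To make the idea concrete, here is a minimal sketch in Python (pg_rewind itself is C, and it has no such option today). The function name and both patterns are hypothetical, chosen only for illustration; the second pattern shows how easily an overly broad expression would skip files the instance needs.

```python
import re


def make_skip_filter(pattern):
    """Build a predicate that is True for relative paths matching `pattern`."""
    regex = re.compile(pattern)
    return lambda relpath: regex.search(relpath) is not None


# Hypothetical pattern: skip anything under pg_log/ and any file ending in .conf.
skip = make_skip_filter(r"^pg_log/|\.conf$")

# An overly broad pattern would also skip WAL segments and other required files.
too_broad = make_skip_filter(r"^pg_")
```

With these definitions, skip("postgresql.conf") is true and skip("base/1/1259") is false, while too_broad("pg_wal/000000010000000000000001") is also true, which is the corruption risk mentioned above.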
-- 
Michael



Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Michael Paquier
Date:
On Mon, Sep 4, 2017 at 7:21 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> A simple idea would be to pass as a parameter a regex on which we
> check files to skip when scanning the directory of the target remotely
> or locally. This needs to be used with care though, it would be easy
> to corrupt an instance.

I actually shortcut that with a strategy similar to base backups: logs
are on another partition, log_directory uses an absolute path, and
PGDATA has no reference to the log path.
-- 
Michael



Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Chris Travers
Date:
Ok so I have a proof of concept patch here.

This is a proof of concept only.  It does not change documentation or the like.

The purpose of the patch is discussion on the "do we want this" side.

The patch is fairly trivial but I have not added test cases or changed docs yet.

Intention of the patch:
pg_rewind is an important backbone tool for recovering data directories following a switchover.  However, it is currently overinclusive as to what it copies.  This patch excludes any file ending in "serverlog", ".conf", or ".log" because these are never directly related to rewinding the timeline and add a great deal of complexity to switchovers.
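
As a rough illustration of the rule described above (not the patch itself, which is C code inside pg_rewind), the check amounts to a suffix test on each file name. The function and constant names below are hypothetical.

```python
# Suffixes named in the proposal; everything else here is illustrative only.
EXCLUDED_SUFFIXES = ("serverlog", ".conf", ".log")


def should_skip(relpath):
    """Return True if the path ends in one of the excluded suffixes."""
    return relpath.endswith(EXCLUDED_SUFFIXES)


def filter_file_list(relpaths):
    """Keep only the paths that would still be copied."""
    return [p for p in relpaths if not should_skip(p)]
```

For example, filter_file_list(["recovery.conf", "global/pg_control"]) keeps only "global/pg_control".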

.conf files are excluded for two major reasons.  The first is that often we may want to put postgresql.conf and other files in the data directory, and if these are changed during a switchover, that can change, for example, the port the database is running on, or other things that can break production or testing environments.  This is usually a problem with testing environments, but it could happen with production environments as well.

A much larger concern with .conf files though is the recovery.conf.  This file MUST be put in the data directory, and it helps determine the replication topology regarding cascading replication and the like.  If you rewind from an upstream replica, you suddenly change the replication topology and that can have wide-ranging impacts.

I think we are much better off insisting that .conf files be copied separately, because the scenarios where you want to do so are more limited and the concern is often separate from rewinding the timeline itself.

The second major exclusion covers files ending in "serverlog" and ".log".  I can find no good reason why server logs from the source should *ever* clobber those on the target.  If you do this, you lose historical information relating to problems and introduce management issues.


Backwards-incompatibility scenarios:
If somehow one has a workflow that depends on copying .conf files, this would break that.  I cannot think of any cases where anyone would actually want to do this but that doesn't mean they aren't out there.  If people really want to, then they need to copy the configuration files they want separately.

Next Steps:

If people like this idea I will add test cases and edit documentation as appropriate.

--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Attachments

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Chris Travers
Date:


On Mon, Sep 4, 2017 at 12:23 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, Sep 4, 2017 at 7:21 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > A simple idea would be to pass as a parameter a regex on which we
> > check files to skip when scanning the directory of the target remotely
> > or locally. This needs to be used with care though, it would be easy
> > to corrupt an instance.
>
> I actually shortcut that with a strategy similar to base backups: logs
> are on another partition, log_directory uses an absolute path, and
> PGDATA has no reference to the log path.

Yeah, it is quite possible to move all of these out of the data directory, but bad things can happen when you accidentally copy configuration or logs over those on the target. Expecting that all environments will be properly set up to avoid these problems is not always a sane assumption.

So, consequently, I think it would be good to fix this in the tool.  The fundamental question is whether there is any reason someone would actually want to copy config files over.

--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Alvaro Herrera
Date:
Chris Travers wrote:
> On Mon, Sep 4, 2017 at 12:23 PM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
> 
> > On Mon, Sep 4, 2017 at 7:21 PM, Michael Paquier
> > <michael.paquier@gmail.com> wrote:
> > > A simple idea would be to pass as a parameter a regex on which we
> > > check files to skip when scanning the directory of the target remotely
> > > or locally. This needs to be used with care though, it would be easy
> > > to corrupt an instance.
> >
> > I actually shortcut that with a strategy similar to base backups: logs
> > are on another partition, log_directory uses an absolute path, and
> > PGDATA has no reference to the log path.
> 
> Yeah, it is quite possible to move all these out of the data directory, but
> bad things can happen when you accidentally copy configuration or logs over
> those on the target and expecting that all environments will be properly
> set up to avoid these problems is not always a sane assumption.

I agree that operationally it's better if these files weren't in PGDATA
to start with, but from a customer support perspective, things are
frequently not already set up like that, so failing to support that
scenario is a loser.

I wonder how portable fnmatch() is in practice (which we don't currently
use anywhere).  A shell glob seems a more natural interface to me for
this than a regular expression.
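
For comparison with the regex idea, a glob-based skip list could look roughly like the following Python sketch. It uses Python's fnmatch module rather than the C fnmatch() discussed here, and the pattern list and function name are hypothetical. One difference to keep in mind: Python's `*` also matches `/`, whereas C fnmatch() only refuses to do so when FNM_PATHNAME is set.

```python
from fnmatch import fnmatchcase

# Hypothetical skip list; nothing like this exists in pg_rewind today.
SKIP_GLOBS = ["*.conf", "*.log", "pg_log/*"]


def matches_any(relpath, globs=SKIP_GLOBS):
    """Return True if the relative path matches at least one shell-style glob."""
    return any(fnmatchcase(relpath, g) for g in globs)
```

Here matches_any("pg_hba.conf") and matches_any("pg_log/postgresql.log") are true, while matches_any("base/1/1259") is false.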

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Chris Travers
Date:


On Mon, Sep 4, 2017 at 3:38 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Chris Travers wrote:
> > On Mon, Sep 4, 2017 at 12:23 PM, Michael Paquier <michael.paquier@gmail.com>
> > wrote:
> >
> > > On Mon, Sep 4, 2017 at 7:21 PM, Michael Paquier
> > > <michael.paquier@gmail.com> wrote:
> > > > A simple idea would be to pass as a parameter a regex on which we
> > > > check files to skip when scanning the directory of the target remotely
> > > > or locally. This needs to be used with care though, it would be easy
> > > > to corrupt an instance.
> > >
> > > I actually shortcut that with a strategy similar to base backups: logs
> > > are on another partition, log_directory uses an absolute path, and
> > > PGDATA has no reference to the log path.
> >
> > Yeah, it is quite possible to move all these out of the data directory, but
> > bad things can happen when you accidentally copy configuration or logs over
> > those on the target and expecting that all environments will be properly
> > set up to avoid these problems is not always a sane assumption.
>
> I agree that operationally it's better if these files weren't in PGDATA
> to start with, but from a customer support perspective, things are
> frequently not already setup like that, so failing to support that
> scenario is a loser.
>
> I wonder how portable fnmatch() is in practice (which we don't currently
> use anywhere).  A shell glob seems a more natural interface to me for
> this than a regular expression.

I think the simplest solution for now is to skip any files ending in .conf, .log, and serverlog.

In the long run, it would be nice to change pg_rewind from an opt-out approach to an approach of processing only the subdirectories we know are important.

It is worth noting further that if you rewind in the wrong way in a cascading replication environment, you can accidentally change your replication topology by clobbering recovery.conf with the one from another replica. There is no way to ensure that this file is not in the data directory, since it MUST be put there.

Best Wishes,
Chris Travers
 


--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Vladimir Borodin
Date:

On Sep 5, 2017, at 12:31, Chris Travers <chris.travers@adjust.com> wrote:

> I think the simplest solution for now is to skip any files ending in .conf, .log, and serverlog.

Why don't you want to solve the problem once and for all? It is a bit harder to get consensus on how to do it, but there seems to be no reason to settle for a temporary solution here.

For example, in archive_command we put WALs for archiving from pg_xlog/pg_wal into another directory inside PGDATA, and then another cron task does the real archiving. This directory ideally should be skipped by pg_rewind, but it would not be handled by the proposed change.

> Long run, it would be nice to change pg_rewind from an opt-out approach to an approach of processing the subdirectories we know are important.

While it is definitely an awful idea, the user can easily put something strange (e.g. logs) into any important directory in PGDATA (e.g. into base or pg_wal). Or how, for example, should pg_replslot be handled (I asked about it a couple of years ago [1])? It seems that a glob/regexp for things to skip is a more universal solution.

[1] https://www.postgresql.org/message-id/flat/8DDCCC9D-450D-4CA2-8CF6-40B382F1F699%40simply.name



--
May the force be with you…

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Michael Paquier
Date:
On Tue, Sep 5, 2017 at 7:54 PM, Vladimir Borodin <root@simply.name> wrote:
> On Sep 5, 2017, at 12:31, Chris Travers <chris.travers@adjust.com>
> wrote:
>
> I think the simplest solution for now is to skip any files ending in .conf,
> .log, and serverlog.

This is not a portable solution. Users can include configuration files
with the names they want. So the current patch as proposed is
definitely not worth it.

> For example, in archive_command we put WALs for archiving from
> pg_xlog/pg_wal into another directory inside PGDATA and than another cron
> task makes real archiving. This directory ideally should be skipped by
> pg_rewind, but it would not be handled by proposed change.

I would be curious to follow the reasoning for such a two-phase
archiving (You basically want to push it in two places, no? But why
not just use pg_receivexlog then?). This is complicated to handle from
the point of view of availability and backup reliability + durability.

> While it is definitely an awful idea the user can easily put something
> strange (i.e. logs) to any important directory in PGDATA (i.e. into base or
> pg_wal). Or how for example pg_replslot should be handled (I asked about it
> a couple of years ago [1])? It seems that a glob/regexp for things to skip
> is a more universal solution.
>
> [1]
> https://www.postgresql.org/message-id/flat/8DDCCC9D-450D-4CA2-8CF6-40B382F1F699%40simply.name

Well, keeping the code simple is not always a bad thing. Logs are an
example that can be easily countered, as well as archives in your
case.
--
Michael



Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Michael Paquier
Date:
On Mon, Sep 4, 2017 at 10:38 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> I wonder how portable fnmatch() is in practice (which we don't currently
> use anywhere).  A shell glob seems a more natural interface to me for
> this than a regular expression.

On Windows you could use roughly PathMatchSpecEx, but it does not seem
that all the wildcards of fnmatch are available there.
-- 
Michael



Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Chris Travers
Date:


On Tue, Sep 5, 2017 at 12:54 PM, Vladimir Borodin <root@simply.name> wrote:
> On Sep 5, 2017, at 12:31, Chris Travers <chris.travers@adjust.com> wrote:
>
> > I think the simplest solution for now is to skip any files ending in .conf, .log, and serverlog.
>
> Why don't you want to solve the problem once? It is a bit harder to get consensus on a way how to do it, but it seems that there are no reasons to make temporary solution here.
>
> For example, in archive_command we put WALs for archiving from pg_xlog/pg_wal into another directory inside PGDATA and then another cron task makes real archiving. This directory ideally should be skipped by pg_rewind, but it would not be handled by proposed change.

Ok, let's back up a bit in terms of what I see as the proper long-term fix.  Simple code, by the way, is important, but at least as important are programs which solve simple, well-defined problems.  The current state is:

1.  pg_rewind makes *no guarantee* as to whether differences in logs, config files, etc. are clobbered.  They may be (if a rewind is needed) or may not be (if the timelines haven't diverged).  Therefore the behavior of these sorts of files when pg_rewind is invoked is not really very well defined. That's a fairly big issue in an operational environment.

2.  There are files which *may* be copied (i.e. are copied if the timelines have diverged) and which *may* have side effects on replication topology, WAL archiving, etc.  Replication slots are a good example.

The problem I think pg_rewind should solve is "give me a consistent data environment from the timeline on that server."  I would think that access to the xlog/clog files would indeed be relevant to that.  If I were rewriting the application now I would include those.  Just because something can be handled separately doesn't mean it should be, and I would prefer not to assume that archiving is properly set up and working.

> > Long run, it would be nice to change pg_rewind from an opt-out approach to an approach of processing the subdirectories we know are important.
>
> While it is definitely an awful idea the user can easily put something strange (i.e. logs) to any important directory in PGDATA (i.e. into base or pg_wal). Or how for example pg_replslot should be handled (I asked about it a couple of years ago [1])? It seems that a glob/regexp for things to skip is a more universal solution.

I am not convinced it is a universal solution unless you take an arbitrary number of regexes to check and loop through checking all of them.  Then the chance of getting something catastrophically wrong in a critical environment goes way up and you may end up in an inconsistent state at the end.

Simple code is good.  A program that solves simple problems reliably (and in simple ways) is better.

The problem I see is that pg_rewind gets incorporated into other tools which don't really provide the user with before or after hooks, and therefore it isn't really fair to say, for example, that repmgr has the responsibility to copy server logs out if present, or to make sure that configuration files are not in the directory.

The universal solution is to only touch the files that we know are needed and therefore work simply and reliably in a demanding environment.
 



--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Vladimir Borodin
Date:

On Sep 5, 2017, at 14:04, Michael Paquier <michael.paquier@gmail.com> wrote:

> > For example, in archive_command we put WALs for archiving from
> > pg_xlog/pg_wal into another directory inside PGDATA and then another cron
> > task makes real archiving. This directory ideally should be skipped by
> > pg_rewind, but it would not be handled by proposed change.
>
> I would be curious to follow the reasoning for such a two-phase
> archiving (You basically want to push it in two places, no? But why
> not just use pg_receivexlog then?). This is complicated to handle from
> the point of view of availability and backup reliability + durability.

We do compress WALs and send them over the network. Doing it via archive_command in a single thread is sometimes slower than the rate at which new WALs are written under heavy load.

--
May the force be with you…

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Chris Travers
Date:


On Tue, Sep 5, 2017 at 1:04 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Tue, Sep 5, 2017 at 7:54 PM, Vladimir Borodin <root@simply.name> wrote:
> > On Sep 5, 2017, at 12:31, Chris Travers <chris.travers@adjust.com>
> > wrote:
> >
> > I think the simplest solution for now is to skip any files ending in .conf,
> > .log, and serverlog.
>
> This is not a portable solution. Users can include configuration files
> with the names they want. So the current patch as proposed is
> definitely not something worth it.

Actually that is exactly why I think the long-term solution is to figure out what we need to copy and not copy anything we don't recognise.

That means the following directories as far as I can see:
 * base
 * global
 * pg_xlog/pg_wal
 * pg_clog/pg_xact
 * pg_commit_ts
 * pg_twophase
 * pg_snapshots?

Are there any other directories I am missing?
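
The opt-in approach with the directory list above could be sketched like this in Python. The set contains exactly the directories listed (pg_snapshots included despite the question mark); the names KNOWN_DIRS and is_known are hypothetical, and the list is not claimed to be complete.

```python
from pathlib import PurePosixPath

# Directories from the list above, in both their old and new spellings.
KNOWN_DIRS = {
    "base", "global",
    "pg_xlog", "pg_wal",
    "pg_clog", "pg_xact",
    "pg_commit_ts", "pg_twophase",
    "pg_snapshots",  # marked as uncertain in the list above
}


def is_known(relpath):
    """True only for files that live under one of the recognised directories."""
    parts = PurePosixPath(relpath).parts
    return len(parts) > 1 and parts[0] in KNOWN_DIRS
```

Under this rule, top-level files such as postgresql.conf or recovery.conf, and anything under pg_replslot, are never touched, because they are simply not recognised.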


At any rate, I think the current state makes it very difficult to test rewind adequately, and extremely difficult to use it in a non-trivial environment: you have to handle replication slots, configuration files, and so forth yourself, and you have to be aware that these *may* or *may not* be clobbered by a rewind, so you need some way of applying another set of files following a rewind.

If nothing else, we ought to *at least* special-case recovery.conf, postgresql.auto.conf, and pg_replslot, because these are always located in the data directory and should never be clobbered.
 


--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Chris Travers
Date:


On Tue, Sep 5, 2017 at 2:40 PM, Vladimir Borodin <root@simply.name> wrote:
> On Sep 5, 2017, at 14:04, Michael Paquier <michael.paquier@gmail.com> wrote:
>
> > > For example, in archive_command we put WALs for archiving from
> > > pg_xlog/pg_wal into another directory inside PGDATA and then another cron
> > > task makes real archiving. This directory ideally should be skipped by
> > > pg_rewind, but it would not be handled by proposed change.
> >
> > I would be curious to follow the reasoning for such a two-phase
> > archiving (You basically want to push it in two places, no? But why
> > not just use pg_receivexlog then?). This is complicated to handle from
> > the point of view of availability and backup reliability + durability.
>
> We do compress WALs and send them over network. Doing it via archive_command in single thread is sometimes slower than new WALs are written under heavy load.

How would this work when it comes to rewinding against a file directory?

--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Michael Paquier
Date:
On Tue, Sep 5, 2017 at 9:40 PM, Vladimir Borodin <root@simply.name> wrote:
> We do compress WALs and send them over network. Doing it via archive_command
> in single thread is sometimes slower than new WALs are written under heavy
> load.

Ah, yeah, true. I do use pg_receivexlog --compress for that locally
and do a bulk copy of only the compressed WALs needed, when needed...
So there is a guarantee that completed segments are durable locally,
which is very useful. You should definitely avoid putting that in
PGDATA though, the same counts for tablespaces within PGDATA for
example.
-- 
Michael



Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Vladimir Borodin
Date:

On Sep 5, 2017, at 15:42, Chris Travers <chris.travers@adjust.com> wrote:

> On Tue, Sep 5, 2017 at 2:40 PM, Vladimir Borodin <root@simply.name> wrote:
>
> > On Sep 5, 2017, at 14:04, Michael Paquier <michael.paquier@gmail.com> wrote:
> >
> > > > For example, in archive_command we put WALs for archiving from
> > > > pg_xlog/pg_wal into another directory inside PGDATA and then another cron
> > > > task makes real archiving. This directory ideally should be skipped by
> > > > pg_rewind, but it would not be handled by proposed change.
> > >
> > > I would be curious to follow the reasoning for such a two-phase
> > > archiving (You basically want to push it in two places, no? But why
> > > not just use pg_receivexlog then?). This is complicated to handle from
> > > the point of view of availability and backup reliability + durability.
> >
> > We do compress WALs and send them over network. Doing it via archive_command in single thread is sometimes slower than new WALs are written under heavy load.
>
> How would this work when it comes to rewinding against a file directory?

Very badly, of course. Sometimes we get 'could not remove file "/var/lib/postgresql/9.6/data/wals/00000001000000C3000000C6": No such file or directory' while running pg_rewind ($PGDATA/wals is a directory where archive_command copies WALs). That's why I want to solve the initial problem. Both proposed solutions (using only needed files and skipping files through glob/regex) are fine for me, but not the initial patch.

--
May the force be with you…

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Vladimir Borodin
Date:

On Sep 5, 2017, at 15:48, Michael Paquier <michael.paquier@gmail.com> wrote:

> On Tue, Sep 5, 2017 at 9:40 PM, Vladimir Borodin <root@simply.name> wrote:
> > We do compress WALs and send them over network. Doing it via archive_command
> > in single thread is sometimes slower than new WALs are written under heavy
> > load.
>
> Ah, yeah, true. I do use pg_receivexlog --compress for that locally
> and do a bulk copy of only the compressed WALs needed, when needed...
> So there is a guarantee that completed segments are durable locally,
> which is very useful.

It seems that the --compress option appeared only in PostgreSQL 10, which is not ready for production yet. BTW, I assume that pg_receivexlog is single-threaded too? So it may still be the bottleneck when 3-5 WALs per second are written.

> You should definitely avoid putting that in
> PGDATA though, the same counts for tablespaces within PGDATA for
> example.

I would love to, but there might be some problems with archiving, and in many cases the only partition with enough space to accumulate WALs is the partition for PGDATA.

--
May the force be with you…

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Chris Travers
Date:
One more side to this which is relevant to other discussions.

If I am rewinding back to before when a table was created, the current algorithm as well as any proposed algorithms will delete the reference to the relfilenode in the catalogs but not the file itself.  I don't see how an undo subsystem would fix this.

Is this a reason to revisit the idea of a pg_fsck utility that could be run immediately after a rewind?

--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Alvaro Herrera
Date:
After reading this discussion, I agree that pg_rewind needs to become
smarter in order to be fully useful in production environments; right
now there are many warts and corner cases that did not seem to have been
considered during the initial development (which I think is all right,
taking into account that its mere concept was complicated enough; so we
need not put any blame on the original developers, rather the contrary).

I think we need to make the program simple to use (i.e. not have the
user write shell globs for the common log file naming patterns) while
remaining powerful (i.e. not forcibly copy any files that do not match
hardcoded globs).  Is the current dry-run mode enough to give the user
peace of mind regarding what would be done in terms of testing globs
etc?  If not, maybe the debug mode needs to be improved (for instance,
have it report the file size for each file that would be copied;
otherwise you may not notice it's going to copy the 20GB log file until
it's too late ...)
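
As an illustration of the kind of report suggested above, the following Python sketch lists each file that would be copied together with its size, largest first, without copying anything. The function name and arguments are hypothetical and do not reflect pg_rewind's actual dry-run or debug output.

```python
import os


def dry_run_report(source_root, relpaths):
    """Return (relpath, size_in_bytes) pairs, largest first; nothing is copied."""
    report = []
    for rel in relpaths:
        size = os.path.getsize(os.path.join(source_root, rel))
        report.append((rel, size))
    return sorted(report, key=lambda item: item[1], reverse=True)
```

Sorting by size puts something like a 20GB log file at the top of the list, where it is hard to miss.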

Now, in order for any of this to happen, there will need to be a
champion to define what the missing pieces are, write all those patches
and see them through the usual (cumbersome, slow) procedure.  Are you,
Chris, willing to take the lead on that?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: pg_rewind to skip config files

From
Chris Travers
Date:


On Thu, Sep 7, 2017 at 2:47 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> After reading this discussion, I agree that pg_rewind needs to become
> smarter in order to be fully useful in production environments; right
> now there are many warts and corner cases that did not seem to have been
> considered during the initial development (which I think is all right,
> taking into account that its mere concept was complicated enough; so we
> need not put any blame on the original developers, rather the contrary).

Agreed with this assessment.  And as a solution to the problem of "base backups take too long to take and transfer", the approach and the corner cases make a lot of sense.

> I think we need to make the program simple to use (i.e. not have the
> user write shell globs for the common log file naming patterns) while
> remaining powerful (i.e. not forcibly copy any files that do not match
> hardcoded globs).

I would add that well-defined tasks are a key aspect of powerful software in my view, and here the well-defined task is to restore data states to a particular usable timeline point taken from another system.  If that is handled well, that opens up new uses and makes some problems that are difficult right now much easier to solve.

> Is the current dry-run mode enough to give the user
> peace of mind regarding what would be done in terms of testing globs
> etc?  If not, maybe the debug mode needs to be improved (for instance,
> have it report the file size for each file that would be copied;
> otherwise you may not notice it's going to copy the 20GB log file until
> it's too late ...)

The dry run facility solves one problem in one circumstance, namely a manually invoked run of the software along with the question of "will this actually rewind?"  I suppose software developers might be able to use it to back up and restore things that are to be clobbered (but is anyone on the software development side likely to?).  I don't see anything in that corner that can be improved without over-engineering the solution.

There are two reasons I am skeptical that a dry-run mode will ever be "enough."

The first is that pg_rewind is often integrated into auto-failover/failback tools and the chance of running it in a dry-run mode before it is automatically triggered is more or less nil.  These are also the cases where you won't notice it does something bad until much later.

The second is that there are at least some corner cases we may need to define as outside the responsibility of pg_rewind.  The one that comes to mind here is if I am rewinding back past the creation of a small table.  I don't see an easy or safe way to address that from inside pg_rewind without a lot of complication.  It might be better to have a dedicated tool for that.

> Now, in order for any of this to happen, there will need to be a
> champion to define what the missing pieces are, write all those patches
> and see them through the usual (cumbersome, slow) procedure.  Are you,
> Chris, willing to take the lead on that?

Yeah. I think the first step would be a list of the corner cases and a proposal for how I think it should work.  Then maybe a roadmap of patches, and then submitting them as they become complete.

--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin