Re: Return codes for archive and restore commands

From: Oleg Bartunov
Subject: Re: Return codes for archive and restore commands
Msg-id: CAF4Au4wfPYwaYvU9rkABDpDyWvi296hVrEPu8eBmTNTdVkE6LQ@mail.gmail.com
In response to: Re: Return codes for archive and restore commands  (Stephen Frost <sfrost@snowman.net>)
List: pgsql-docs
On Thu, Nov 29, 2018 at 5:40 AM Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Michael Paquier (michael@paquier.xyz) wrote:
> > On Wed, Nov 28, 2018 at 11:00:31AM +0000, PG Doc comments form wrote:
> > > For the archive command:
> > > <=128 There are no errors in the PostgreSQL log (no messages with severity
> > > equal to or higher than ERROR). First come 3 LOG messages about the failure,
> > > then a WARNING about it and a pause of 1 minute; then the cycle repeats.
> > > >=129 A FATAL error appears in the PostgreSQL log, with a message about stopping
> > > the archiver process, but not the database. Repeated after roughly 16 seconds.
> >
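The behaviour reported above can be restated as a small classification sketch. This is a hypothetical Python illustration of the observations in this thread, not PostgreSQL's implementation; the function name `classify_archive_status` and the returned labels are assumptions made up for the example.

```python
def classify_archive_status(exit_status: int) -> str:
    """Map an archive_command exit status to the log behaviour reported above.

    Hypothetical helper: it only restates the observations in this thread.
    """
    if exit_status == 0:
        # A zero exit status means the command succeeded.
        return "success"
    if exit_status <= 128:
        # Reported: LOG messages, then a WARNING, and a retry after a pause.
        return "LOG, retry"
    # Reported: FATAL, the archiver process stops (not the database).
    return "FATAL, archiver stops"


for status in (0, 1, 127, 128, 129, 143):
    print(status, "->", classify_archive_status(status))
```

The boundary at 128 follows the report; a given PostgreSQL version may draw the line elsewhere, so check pgarch.c for the release in use.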
> > This code has been around for some time, and comes from this commit:
> > commit: 3ad0728c817bf8abd2c76bd11d856967509b307c
> > author: Tom Lane <tgl@sss.pgh.pa.us>
> > date: Tue, 21 Nov 2006 20:59:53 +0000
> > committer: Tom Lane <tgl@sss.pgh.pa.us>
> > date: Tue, 21 Nov 2006 20:59:53 +0000
> > On systems that have setsid(2) (which should be just about everything except
> > Windows), arrange for each postmaster child process to be its own process
> > group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole
> > process group not only the direct child process.  This provides saner behavior
> > for archive and recovery scripts; in particular, it's possible to shut down a
> > warm-standby recovery server using "pg_ctl stop -m immediate", since delivery
> > of SIGQUIT to the startup subprocess will result in killing the waiting
> > recovery_command.  Also, this makes Query Cancel and statement_timeout apply
> > to scripts being run from backends via system().  (There is no support in the
> > core backend for that, but it's widely done using untrusted PLs.)  Per gripe
> > from Stephen Harris and subsequent discussion.
> >
> > The relevant code is pgarch_archiveXlog() in pgarch.c, in particular this
> > comment:
> > * Per the Single Unix Spec, shells report exit status > 128 when a
> > * called command died on a signal.
> >
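The convention in the quoted comment can be observed directly. The following is a minimal sketch, assuming a POSIX system with `sh` on the PATH: an inner shell kills itself with SIGTERM, and the outer shell passes the resulting status on as its own exit code. The variable names are illustrative only.

```python
import signal
import subprocess

# The inner shell sends SIGTERM to itself ($$ is its own PID, because the
# single quotes stop the outer shell from expanding it). The outer shell
# then exits with the status it observed for the inner command.
result = subprocess.run(
    ["sh", "-c", "sh -c 'kill -TERM $$'; exit $?"],
    check=False,
)
shell_status = result.returncode

print("exit status reported by the shell:", shell_status)
print("128 + SIGTERM would be:", 128 + int(signal.SIGTERM))
```

With common shells such as bash and dash the reported status equals 128 plus the signal number; the Single Unix Spec only requires it to be greater than 128.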
> > > In this case PostgreSQL tries to conform to the rules for return codes of a Unix
> > > shell. A Unix shell returns 126 for "command not executable", 127
> > > for "command not found", and 128 + the signal number if the application
> > > was terminated by an uncaught signal.
> >
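The 126 and 127 cases mentioned above can be reproduced with a short sketch, again assuming a POSIX system with `sh` available. The command name `no_such_archive_tool_for_demo` and the helper `shell_exit_status` are hypothetical and exist only for this illustration.

```python
import os
import shlex
import subprocess
import tempfile


def shell_exit_status(command: str) -> int:
    """Run a command string through sh and return the shell's exit status."""
    return subprocess.run(
        ["sh", "-c", command],
        stderr=subprocess.DEVNULL,  # hide the shell's own error message
        check=False,
    ).returncode


# A command name that is assumed not to exist on PATH.
not_found = shell_exit_status("no_such_archive_tool_for_demo")

# A regular file with no execute bit set for anyone.
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as handle:
    handle.write("#!/bin/sh\nexit 0\n")
os.chmod(handle.name, 0o644)
try:
    not_executable = shell_exit_status(shlex.quote(handle.name))
finally:
    os.remove(handle.name)

print("command not found:", not_found)
print("command not executable:", not_executable)
```

Both values stay below the 128 threshold discussed in this thread, which is why a mistyped archive_command shows up as repeated LOG and WARNING messages rather than a FATAL error in the behaviour reported here.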
> > If you were to rewrite those paragraphs or make them more precise, how
> > would you actually shape your suggestions?  I personally quite like the
> > current formulations, but I am rather used to it to be honest.
>
> This is another example, at least imv, of why we really need to move
> away from archive_command as an interface for doing WAL archiving.

+1

>
> Having discussed this quite a bit lately with David Steele and Magnus,
> it's pretty clear that we need to completely rip out how this works
> today and rewrite it based around an extension model where a background
> worker can start up and essentially take the place of the archiver
> process, with flexibility to jump forward through the WAL stream,
> communicate clearly with other processes, handle failure to do so
> gracefully based on the specific cases, etc.
>
> We could then possibly write an extension to be included that mimics
> what archive_command does today, but imv we should immediately consider
> it deprecated and encourage people to move off of it.
>
> Thanks!
>
> Stephen



-- 
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

