Обсуждение: pg_upgrade should truncate/remove its logs before running

Поиск
Список
Период
Сортировка

pg_upgrade should truncate/remove its logs before running

От
Justin Pryzby
Дата:
I have seen this numerous times but had not dug into it, until now.

If pg_upgrade fails and is re-run, it appends to its logfiles, which is
confusing since, if it fails again, it then looks like the original error
recurred and wasn't fixed.  The "append" behavior dates back to 717f6d608.

I think it should either truncate the logfiles, or error early if any of the
files exist.  Or it could put all its output files into a newly-created
subdirectory.  Or this message could be output to the per-db logfiles, and not
just the static ones:
| "pg_upgrade run on %s".

For the per-db logfiels with OIDs in their name, changing open() from "append"
mode to truncate mode doesn't work, since they're written to in parallel.
They have to be removed/truncated in advance.

This is one possible fix.  You can test its effect by deliberately breaking one
of the calls to exec_progs(), like this.

-  "\"%s/pg_restore\" %s %s --exit-on-error --verbose "
+  "\"%s/pg_restore\" %s %s --exit-on-error --verboose "

Вложения

Re: pg_upgrade should truncate/remove its logs before running

От
Bruce Momjian
Дата:
On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
> I have seen this numerous times but had not dug into it, until now.
> 
> If pg_upgrade fails and is re-run, it appends to its logfiles, which is
> confusing since, if it fails again, it then looks like the original error
> recurred and wasn't fixed.  The "append" behavior dates back to 717f6d608.
> 
> I think it should either truncate the logfiles, or error early if any of the
> files exist.  Or it could put all its output files into a newly-created
> subdirectory.  Or this message could be output to the per-db logfiles, and not
> just the static ones:
> | "pg_upgrade run on %s".
> 
> For the per-db logfiels with OIDs in their name, changing open() from "append"
> mode to truncate mode doesn't work, since they're written to in parallel.
> They have to be removed/truncated in advance.
> 
> This is one possible fix.  You can test its effect by deliberately breaking one
> of the calls to exec_progs(), like this.
> 
> -  "\"%s/pg_restore\" %s %s --exit-on-error --verbose "
> +  "\"%s/pg_restore\" %s %s --exit-on-error --verboose "

Uh, the database server doesn't erase its logs on crash/failure, so why
should pg_upgrade do that?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: pg_upgrade should truncate/remove its logs before running

От
Justin Pryzby
Дата:
On Wed, Dec 15, 2021 at 04:09:16PM -0500, Bruce Momjian wrote:
> On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
> > I have seen this numerous times but had not dug into it, until now.
> > 
> > If pg_upgrade fails and is re-run, it appends to its logfiles, which is
> > confusing since, if it fails again, it then looks like the original error
> > recurred and wasn't fixed.  The "append" behavior dates back to 717f6d608.
> > 
> > I think it should either truncate the logfiles, or error early if any of the
> > files exist.  Or it could put all its output files into a newly-created
> > subdirectory.  Or this message could be output to the per-db logfiles, and not
> > just the static ones:
> > | "pg_upgrade run on %s".
> > 
> > For the per-db logfiels with OIDs in their name, changing open() from "append"
> > mode to truncate mode doesn't work, since they're written to in parallel.
> > They have to be removed/truncated in advance.
> > 
> > This is one possible fix.  You can test its effect by deliberately breaking one
> > of the calls to exec_progs(), like this.
> > 
> > -  "\"%s/pg_restore\" %s %s --exit-on-error --verbose "
> > +  "\"%s/pg_restore\" %s %s --exit-on-error --verboose "
> 
> Uh, the database server doesn't erase its logs on crash/failure, so why
> should pg_upgrade do that?

To avoid the presence of irrelevant errors from the previous invocation of
pg_upgrade.

Maybe you would prefer one of my other ideas , like "put all its output files
into a newly-created subdirectory" ?

-- 
Justin



Re: pg_upgrade should truncate/remove its logs before running

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
>> If pg_upgrade fails and is re-run, it appends to its logfiles, which is
>> confusing since, if it fails again, it then looks like the original error
>> recurred and wasn't fixed.  The "append" behavior dates back to 717f6d608.

> Uh, the database server doesn't erase its logs on crash/failure, so why
> should pg_upgrade do that?

The server emits enough information so that it's not confusing:
there are timestamps, and there's an identifiable startup line.
pg_upgrade does neither.  If you don't want to truncate as
Justin suggests, you should do that instead.

Personally I like the idea of making a timestamped subdirectory
and dropping all the files in that, because the thing that most
annoys *me* about pg_upgrade is the litter it leaves behind in
$CWD.  A subdirectory would make it far easier to mop up the mess.

            regards, tom lane



Re: pg_upgrade should truncate/remove its logs before running

От
Andrew Dunstan
Дата:
On 12/15/21 16:23, Bruce Momjian wrote:
> On Wed, Dec 15, 2021 at 04:17:23PM -0500, Tom Lane wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>>> On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
>>>> If pg_upgrade fails and is re-run, it appends to its logfiles, which is
>>>> confusing since, if it fails again, it then looks like the original error
>>>> recurred and wasn't fixed.  The "append" behavior dates back to 717f6d608.
>>> Uh, the database server doesn't erase its logs on crash/failure, so why
>>> should pg_upgrade do that?
>> The server emits enough information so that it's not confusing:
>> there are timestamps, and there's an identifiable startup line.
>> pg_upgrade does neither.  If you don't want to truncate as
>> Justin suggests, you should do that instead.
>>
>> Personally I like the idea of making a timestamped subdirectory
>> and dropping all the files in that, because the thing that most
>> annoys *me* about pg_upgrade is the litter it leaves behind in
>> $CWD.  A subdirectory would make it far easier to mop up the mess.
> Yes, lot of litter.  Putting it in a subdirectory makes a lot of sense.
> Justin, do you want to work on that patch, since you had an earlier
> version to fix this?
>



The directory name needs to be predictable somehow, or maybe optionally
set as a parameter. Having just a timestamped directory name would make
life annoying for a poor buildfarm maintainer. Also, please don't change
anything before I have a chance to adjust the buildfarm code to what is
going to be done.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: pg_upgrade should truncate/remove its logs before running

От
Michael Paquier
Дата:
On Wed, Dec 15, 2021 at 04:13:10PM -0600, Justin Pryzby wrote:
> On Wed, Dec 15, 2021 at 05:04:54PM -0500, Andrew Dunstan wrote:
>> The directory name needs to be predictable somehow, or maybe optionally
>> set as a parameter. Having just a timestamped directory name would make
>> life annoying for a poor buildfarm maintainer. Also, please don't change
>> anything before I have a chance to adjust the buildfarm code to what is
>> going to be done.
>
> Feel free to suggest the desirable behavior.
> It could write to pg_upgrade.log/* and refuse to run if the dir already exists.

Andrew's point looks rather sensible to me.  So, this stuff should
have a predictable name (pg_upgrade.log, pg_upgrade_log or upgrade_log
would be fine).  But I would also add an option to be able to define a
custom log path.  The latter would be useful for the regression tests
so as everything gets could get redirected to a path already filtered
out.
--
Michael

Вложения

Re: pg_upgrade should truncate/remove its logs before running

От
Peter Eisentraut
Дата:
On 16.12.21 02:39, Michael Paquier wrote:
> On Wed, Dec 15, 2021 at 04:13:10PM -0600, Justin Pryzby wrote:
>> On Wed, Dec 15, 2021 at 05:04:54PM -0500, Andrew Dunstan wrote:
>>> The directory name needs to be predictable somehow, or maybe optionally
>>> set as a parameter. Having just a timestamped directory name would make
>>> life annoying for a poor buildfarm maintainer. Also, please don't change
>>> anything before I have a chance to adjust the buildfarm code to what is
>>> going to be done.
>>
>> Feel free to suggest the desirable behavior.
>> It could write to pg_upgrade.log/* and refuse to run if the dir already exists.
> 
> Andrew's point looks rather sensible to me.  So, this stuff should
> have a predictable name (pg_upgrade.log, pg_upgrade_log or upgrade_log
> would be fine).  But I would also add an option to be able to define a
> custom log path.  The latter would be useful for the regression tests
> so as everything gets could get redirected to a path already filtered
> out.

Could we make it write just one log file?  Is having multiple log files 
better?



Re: pg_upgrade should truncate/remove its logs before running

От
Daniel Gustafsson
Дата:
> On 16 Dec 2021, at 12:11, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:

> Could we make it write just one log file?  Is having multiple log files better?

Having individual <checkname>.txt files from checks with additional information
on how to handle the error are quite convenient when writing wrappers around
pg_upgrade (speaking from experience of having written multiple pg_upgraade
frontends).  Parsing a single logfile is more work, and will break existing
scripts.

I'm in favor of a predictable by default logpath, with a parameter to override,
as mentioned upthread.

--
Daniel Gustafsson        https://vmware.com/




Re: pg_upgrade should truncate/remove its logs before running

От
Peter Eisentraut
Дата:
On 19.01.22 09:13, Michael Paquier wrote:
> - Renaming of the option from --logdir to --outputdir, as this does
> not include only logs.  That matches also better with default value
> assigned in previous patches, aka pg_upgrade_output.d.

I'm afraid that is too easily confused with the target directory. 
Generally, a tool processes data from input to output or from source to 
target or something like that, whereas a log is more clearly something 
separate from this main processing stream.  The desired "output" of 
pg_upgrade is the upgraded cluster, after all.

A wildcard idea is to put the log output into the target cluster.



Re: pg_upgrade should truncate/remove its logs before running

От
Andrew Dunstan
Дата:
On 1/28/22 08:42, Michael Paquier wrote:
> On Wed, Jan 26, 2022 at 11:00:28AM +0900, Michael Paquier wrote:
>> Bleh.  This would point to the old data directory, so this needs to be
>> "$self->{pgsql}/src/bin/pg_upgrade/tmp_check/data/pg_upgrade_output.d/log/*.log"
>> to point to the upgraded cluster.
> Please note that I have sent a patch to merge this change in the
> buildfarm code.  Comments are welcome.



I have committed this. But it will take time to get every buildfarm own
to upgrade. I will try to make a new release ASAP.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: pg_upgrade should truncate/remove its logs before running

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> So, it took me some time to get back to this thread, and looked at it
> for the last couple of days...  The buildfarm client v14 has been
> released on the 29th of January, which means that we are good to go.

As already mentioned, there's been no notice to buildfarm owners ...
so has Andrew actually made a release?

            regards, tom lane



Re: pg_upgrade should truncate/remove its logs before running

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Sun, Feb 06, 2022 at 01:58:21AM -0500, Tom Lane wrote:
>> As already mentioned, there's been no notice to buildfarm owners ...
>> so has Andrew actually made a release?

> There has been one as of 8 days ago:
> https://github.com/PGBuildFarm/client-code/releases

[ scrapes buildfarm logs ... ]

Not even Andrew's own buildfarm critters are using it, so
permit me leave to doubt that he thinks it's fully baked.

Andrew?

            regards, tom lane



Re: pg_upgrade should truncate/remove its logs before running

От
Andrew Dunstan
Дата:
On 2/6/22 02:17, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> On Sun, Feb 06, 2022 at 01:58:21AM -0500, Tom Lane wrote:
>>> As already mentioned, there's been no notice to buildfarm owners ...
>>> so has Andrew actually made a release?
>> There has been one as of 8 days ago:
>> https://github.com/PGBuildFarm/client-code/releases
> [ scrapes buildfarm logs ... ]
>
> Not even Andrew's own buildfarm critters are using it, so
> permit me leave to doubt that he thinks it's fully baked.
>
> Andrew?
>
>             


*sigh* Sometimes I have a mind like a sieve. I prepped the release a few
days ago and meant to come back the next morning and send out emails
announcing it, as well as rolling it out to my animals, and got diverted
so that didn't happen and it slipped my mind. I'll go and do those
things now.

But the commit really shouldn't have happened until we know that most
buildfarm owners have installed it. It should have waited wait not just
for the release but for widespread deployment. Otherwise we will just
lose any logging for an error that might appear.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: pg_upgrade should truncate/remove its logs before running

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Sun, Feb 06, 2022 at 08:32:59AM -0500, Andrew Dunstan wrote:
>> But the commit really shouldn't have happened until we know that most
>> buildfarm owners have installed it. It should have waited wait not just
>> for the release but for widespread deployment. Otherwise we will just
>> lose any logging for an error that might appear.

> Would it be better if I just revert the change for now then and do it
> again in one/two weeks?

I don't see a need to revert it.

I note, though, that there's still not been any email to the buildfarm
owners list about this update.

            regards, tom lane



Re: pg_upgrade should truncate/remove its logs before running

От
Andrew Dunstan
Дата:

> On Feb 6, 2022, at 7:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Michael Paquier <michael@paquier.xyz> writes:
>>> On Sun, Feb 06, 2022 at 08:32:59AM -0500, Andrew Dunstan wrote:
>>> But the commit really shouldn't have happened until we know that most
>>> buildfarm owners have installed it. It should have waited wait not just
>>> for the release but for widespread deployment. Otherwise we will just
>>> lose any logging for an error that might appear.
>
>> Would it be better if I just revert the change for now then and do it
>> again in one/two weeks?
>
> I don't see a need to revert it.
>
> I note, though, that there's still not been any email to the buildfarm
> owners list about this update.
>
>

It’s stuck in moderation

Cheers

Andrew


Re: pg_upgrade should truncate/remove its logs before running

От
Andrew Dunstan
Дата:
On 2/6/22 19:39, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> On Sun, Feb 06, 2022 at 08:32:59AM -0500, Andrew Dunstan wrote:
>>> But the commit really shouldn't have happened until we know that most
>>> buildfarm owners have installed it. It should have waited wait not just
>>> for the release but for widespread deployment. Otherwise we will just
>>> lose any logging for an error that might appear.
>> Would it be better if I just revert the change for now then and do it
>> again in one/two weeks?
> I don't see a need to revert it.
>
> I note, though, that there's still not been any email to the buildfarm
> owners list about this update.
>
>             


The announcement was held up in list moderation for 20 hours or so.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com