Обсуждение: Patch to implement pg_current_logfile() function

Поиск
Список
Период
Сортировка

Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Hi,

Here is a patch that is supposed to solve the remaining problem to
find the current log file used by the log collector after a rotation.
There is lot of external command to try to find this information but
it seems useful to have an internal function to retrieve the name
of the current log file from the log collector.

There is a corresponding item in the TODO list at "Administration"
section. The original thread can be reach at the following link
http://archives.postgresql.org/pgsql-general/2008-11/msg00418.php
The goal is to provide a way to query the log collector subprocess
to determine the name of the currently active log file.

This patch implements the pg_current_logfile() function that can be
used as follow. The function returns NULL when logging_collector
is not active and outputs a warning.

postgres=# \pset null *
postgres=# SELECT pg_current_logfile();
WARNING:  current log can not be reported because log collection is not
active
 pg_current_logfile
--------------------
 *
(1 line)

So a better query should be:

postgres=# SELECT CASE WHEN current_setting('logging_collector')='on'
        THEN pg_current_logfile()
        ELSE current_setting('log_destination')
    END;
 current_setting
-----------------
 syslog
(1 line)

Same query with log collection active and, for example,
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'

postgres=# SELECT CASE WHEN current_setting('logging_collector')='on'
                THEN pg_current_logfile()
                ELSE current_setting('log_destination')
        END;
             current_setting
-----------------------------------------
 pg_log/postgresql-2016-03-09_152827.log
(1 line)

Then after a log rotation:

postgres=# SELECT pg_rotate_logfile();
 pg_rotate_logfile
-------------------
 t
(1 line)

postgres=# select pg_current_logfile();
           pg_current_logfile
-----------------------------------------
 pg_log/postgresql-2016-03-09_152908.log
(1 line)

I choose to allow the log collector to write his current log file name
into the lock file 'postmaster.pid'. This allow simple access to this
information through system commands, for example:

postgres@W230ST:~$ tail -n1 /usr/local/pgql-devel/data/postmaster.pid
pg_log/postgresql-2016-03-09_152908.log

Log filename is written at the 8th line position when log collection
is active and all other information have been written to lock file.

The function pg_current_logfile() use in SQL mode read the lock file
to report the information.

I don't know if there's any limitation on using postmaster.pid file to
do that but it seems to me a bit weird to log this information to an
other file. My first attempt was to use a dedicated file and save it
to global/pg_current_logfile or pg_stat_tmp/pg_current_logfile but I
think it is better to use the postmaster.pid file for that. I also
though about a communication protocol or notification with the log
collector subprocess to query and retrieve the name of the currently
active log file. But obviously, it would be too much work for just this
simple function and I can't see any other feature that need such a
work.

Any though? Should I add this patch to the commit fest? If the use
of the postmater.pid file is a problem I can easily modify the patch
to use an alternate file.

Best regards,

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Wed, Mar 9, 2016 at 12:32 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
> I choose to allow the log collector to write his current log file name
> into the lock file 'postmaster.pid'. This allow simple access to this
> information through system commands, for example:
>
> postgres@W230ST:~$ tail -n1 /usr/local/pgql-devel/data/postmaster.pid
> pg_log/postgresql-2016-03-09_152908.log
>
> Log filename is written at the 8th line position when log collection
> is active and all other information have been written to lock file.
>
> The function pg_current_logfile() use in SQL mode read the lock file
> to report the information.
>
> I don't know if there's any limitation on using postmaster.pid file to
> do that but it seems to me a bit weird to log this information to an
> other file. My first attempt was to use a dedicated file and save it
> to global/pg_current_logfile or pg_stat_tmp/pg_current_logfile but I
> think it is better to use the postmaster.pid file for that.

Gosh, why?  Piggybacking this on a file written for a specific purpose
by a different process seems like making life very hard for yourself,
and almost certainly a recipe for bugs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Patch to implement pg_current_logfile() function

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Mar 9, 2016 at 12:32 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
>> I choose to allow the log collector to write his current log file name
>> into the lock file 'postmaster.pid'.

> Gosh, why?  Piggybacking this on a file written for a specific purpose
> by a different process seems like making life very hard for yourself,
> and almost certainly a recipe for bugs.

That's a *complete* nonstarter.  postmaster.pid has to be written by the
postmaster process and nobody else.

It's a particularly bad choice for the syslogger, which will exist
fractionally longer than the postmaster, and thus might be trying to write
into the file after the postmaster has removed it.
        regards, tom lane



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 10/03/2016 16:26, Tom Lane a écrit :
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Mar 9, 2016 at 12:32 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
>>> I choose to allow the log collector to write his current log file name
>>> into the lock file 'postmaster.pid'.
>> Gosh, why?  Piggybacking this on a file written for a specific purpose
>> by a different process seems like making life very hard for yourself,
>> and almost certainly a recipe for bugs.
> That's a *complete* nonstarter.  postmaster.pid has to be written by the
> postmaster process and nobody else.
>
> It's a particularly bad choice for the syslogger, which will exist
> fractionally longer than the postmaster, and thus might be trying to write
> into the file after the postmaster has removed it.

I was thinking that the lock file was removed after all and that
postmaster was waiting for all childs die before removing it, but ok I
know why this was not my first choice, this was a very bad idea :-)
Then, should I have to use an alternate file to store the information or
implement a bidirectional communication with the syslogger? The last
solution looks like really too much for this very simple feature. With
an alternate file what is the best place where it has to be written? All
places have their special use, the global/ directory?

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: Patch to implement pg_current_logfile() function

От
Tom Lane
Дата:
Gilles Darold <gilles.darold@dalibo.com> writes:
> Then, should I have to use an alternate file to store the information or
> implement a bidirectional communication with the syslogger?

I'd just define a new single-purpose file $PGDATA/log_file_name
or some such.
        regards, tom lane



Re: Patch to implement pg_current_logfile() function

От
"Shulgin, Oleksandr"
Дата:
On Thu, Mar 10, 2016 at 9:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Gilles Darold <gilles.darold@dalibo.com> writes:
> Then, should I have to use an alternate file to store the information or
> implement a bidirectional communication with the syslogger?

I'd just define a new single-purpose file $PGDATA/log_file_name
or some such.

Would it make sense to have it as a symlink instead?

--
Alex

Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 11/03/2016 10:49, Shulgin, Oleksandr a écrit :
On Thu, Mar 10, 2016 at 9:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Gilles Darold <gilles.darold@dalibo.com> writes:
> Then, should I have to use an alternate file to store the information or
> implement a bidirectional communication with the syslogger?

I'd just define a new single-purpose file $PGDATA/log_file_name
or some such.

Would it make sense to have it as a symlink instead?

The only cons I see is that it can be more "difficult" with some language to gather the real path, but do we really need it? There is also little time where the symlink doesn't exist, this is when it needs to be removed before being recreated to point to the new log file. If your external script try to reopen the log file at this moment it will complain that file doesn't exists and looping until the file exists is probably a bad idea.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Re: Patch to implement pg_current_logfile() function

От
Tom Lane
Дата:
Gilles Darold <gilles.darold@dalibo.com> writes:
> Le 11/03/2016 10:49, Shulgin, Oleksandr a écrit :
>> Would it make sense to have it as a symlink instead?

> The only cons I see is that it can be more "difficult" with some
> language to gather the real path, but do we really need it? There is
> also little time where the symlink doesn't exist, this is when it needs
> to be removed before being recreated to point to the new log file.

Yeah, a symlink is impossible to update atomically (on most platforms
anyway).  Plain file containing the pathname seems better.

Another point is that we might not necessarily want *only* the pathname in
there.  postmaster.pid has accreted more stuff over time, and this file
might too.  I can see wanting the syslogger PID in there, for example,
so that onlookers can detect a totally stale file.  (Not proposing this
right now, just pointing out that it's a conceivable future feature.)
        regards, tom lane



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 11/03/2016 15:22, Tom Lane a écrit :
> Gilles Darold <gilles.darold@dalibo.com> writes:
>> Le 11/03/2016 10:49, Shulgin, Oleksandr a écrit :
>>> Would it make sense to have it as a symlink instead?
>> The only cons I see is that it can be more "difficult" with some
>> language to gather the real path, but do we really need it? There is
>> also little time where the symlink doesn't exist, this is when it needs
>> to be removed before being recreated to point to the new log file.
> Yeah, a symlink is impossible to update atomically (on most platforms
> anyway).  Plain file containing the pathname seems better.
>
> Another point is that we might not necessarily want *only* the pathname in
> there.  postmaster.pid has accreted more stuff over time, and this file
> might too.  I can see wanting the syslogger PID in there, for example,
> so that onlookers can detect a totally stale file.  (Not proposing this
> right now, just pointing out that it's a conceivable future feature.)
>
>             regards, tom lane

Here is the patch rewritten to use alternate file
$PGDATA/pg_log_filename to store the current log filename used by
syslogger. All examples used in the first mail of this thread work the
exact same way. If there's no other remarks, I will add the patch to the
next commit fest.


Here are some additional examples with this feature.

To obtain the filling percentage of the log file when log_rotation_size
is used:

postgres=# SELECT pg_current_logfile(), (select setting::int*1000 from
pg_settings where name='log_rotation_size'), a.size size,
((a.size*100)/(select setting::int*1000 from pg_settings where
name='log_rotation_size')) percent_used FROM
pg_stat_file(pg_current_logfile())
a(size,access,modification,change,creation,isdir);

-[ RECORD 1 ]------+----------------------------------------
pg_current_logfile | pg_log/postgresql-2016-03-11_160817.log
log_rotation_size  | 10240000
size               | 1246000
percent_used       | 12

This can help to know if the file is near to be rotated. Or if you use
time based rotation:

postgres=# select pg_current_logfile(), (select setting::int*60 from
pg_settings where name='log_rotation_age')
log_rotation_age,a.access,a.modification, (((extract(epoch from
a.modification) - extract(epoch from a.access)) * 100) / (select
setting::int*60 from pg_settings where name='log_rotation_age'))
percent_used FROM pg_stat_file(pg_current_logfile())
a(size,access,modification,change,creation,isdir);
-[ RECORD 1 ]------+----------------------------------------
pg_current_logfile | pg_log/postgresql-2016-03-11_162143.log
log_rotation_age   | 3600
access             | 2016-03-11 16:21:43+01
modification       | 2016-03-11 16:33:12+01
percent_used       | 19.1388888888889

Best regards,

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Sat, Mar 12, 2016 at 12:46 AM, Gilles Darold
<gilles.darold@dalibo.com> wrote:
> Here is the patch rewritten to use alternate file
> $PGDATA/pg_log_filename to store the current log filename used by
> syslogger. All examples used in the first mail of this thread work the
> exact same way. If there's no other remarks, I will add the patch to the
> next commit fest.

Please be sure to register this patch to the next CF:
https://commitfest.postgresql.org/10/
Things are hot enough with 9.6, so this will only be considered in 9.7.
-- 
Michael



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 22/03/2016 14:44, Michael Paquier a écrit :
On Sat, Mar 12, 2016 at 12:46 AM, Gilles Darold
<gilles.darold@dalibo.com> wrote:
Here is the patch rewritten to use alternate file
$PGDATA/pg_log_filename to store the current log filename used by
syslogger. All examples used in the first mail of this thread work the
exact same way. If there's no other remarks, I will add the patch to the
next commit fest.
Please be sure to register this patch to the next CF:
https://commitfest.postgresql.org/10/
Things are hot enough with 9.6, so this will only be considered in 9.7.

Thanks for the reminder, here is the v3 of the patch after a deeper review and testing. It is now registered to the next commit fest under the System Administration topic.

Fixes in this patch are:

- Output file have been renamed as PGDATA/pg_log_file

- Log level of the warning when logging collector is not active has been changed to NOTICE
postgres=# select pg_current_logfile();
NOTICE:  current log can not be reported, log collection is not active
 pg_current_logfile
--------------------

(1 row)
- Log level for file access errors in function store_current_log_filename() of file src/backend/postmaster/syslogger.c has been set to WARNING, using ERROR level forced the backend to stop with a FATAL error.

- Add information about file PGDATA/pg_log_file in storage file layout of doc/src/sgml/storage.sgml


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org
Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Gilles,

On Wed, 23 Mar 2016 23:22:26 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Thanks for the reminder, here is the v3 of the patch after a deeper
> review and testing. It is now registered to the next commit fest under
> the System Administration topic.

I am going to try reviewing your patch.  I don't feel entirely
confident, but should be able to provide at least some help.

I've not yet even tried building it, but the the first thing I notice
is that you're going to need to use pgrename() of src/port/dirmod.c
in order to get an atomic update of the pg_log_file file.

I believe this is the right approach here.  Any other program
will always see a fully-formed file content.

The typical way this works is: you make a new file with new
content, then rename the new file to the old file name.
This makes the new file name go away and the old file
content go away and, effectively, replaces
the content of your file with new content.

You'll want to look at other places where pg uses pgrename()
to see what sort of name you should give to the new file.
If it was me, I'd just stick a dot in front, calling it
".pg_log_file" but we want to be consistent with existing
practice.

I'd imagine that you'd create the new file in the same directly
as pg_log_file, that being the usual way to ensure that both
files are in the same file system (a requirement).  But when
you're looking at other uses of pgrename() in the existing code
it wouldn't hurt to see check what that code is doing regards
placement of the new file in the file system.

If you have any guidance/scripts/whatever which support
testing your patch please send it my way.  Thanks.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 6 Apr 2016 22:26:13 -0500
"Karl O. Pinc" <kop@meme.com> wrote:
> On Wed, 23 Mar 2016 23:22:26 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
> 
> > Thanks for the reminder, here is the v3 of the patch after a deeper
> > review and testing. It is now registered to the next commit fest
> > under the System Administration topic. 

> I've not yet even tried building it, 

Ok.  I've built it (but not tested it).  Some comments.

The docs don't build.  You are missing a "<row"> at line
15197 of func.sgml.  (Patched against current HEAD.)

Is there any rationale for the placement of your function
in that documentation table?  I don't see any organizing principle
to the table so am wondering where it might best fit.
(http://www.postgresql.org/docs/devel/static/functions-info.html)

Perhaps you want to write?:
  <para>   <function>pg_current_logfile</function> returns the name of the  current log file used by the logging
collector,as  <type>text</type>. Log collection must be active or the  return value is undefined.  </para>
 

(Removed "a" before "text", and added "or..." to the end.)

Unless you want to define the return value to be NULL when
log collection is not active.  This might be cleaner.
I say this because I'm tempted to add "This can be tested
for with current_setting('logging_collector')='on'." to
the end of the paragraph.  But adding such text means
that the "logging_collector" setting is documented in multiple
places, in some sense, and such redundancy is best
avoided when possible.

I don't see a problem with defining the return value to be
NULL -- so long as it's defined to be NULL when there is
no current log file.  This would be different from defining
it to be NULL when log collection is off.  Although not
very different it should be clear that using pg_currrent_logfile()
to test whether log collection is on is not a good practice.
Perhaps some text like?:
  <para>   <function>pg_current_logfile</function> returns the name of the  current log file used by the logging
collector,as  <type>text</type>. <literal>NULL</literal> is returned  and a <literal>NOTICE</literal> raised when no
logfile exists.  </para>
 

(I'm going to have to think some more about the raising of the
notice, and of the other error handling in your code.
I've not paid it any attention and error handling is always
problematic.)

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 6 Apr 2016 23:37:09 -0500
"Karl O. Pinc" <kop@meme.com> wrote:

> On Wed, 6 Apr 2016 22:26:13 -0500
> "Karl O. Pinc" <kop@meme.com> wrote:
> > On Wed, 23 Mar 2016 23:22:26 +0100
> > Gilles Darold <gilles.darold@dalibo.com> wrote:
> >   
> > > Thanks for the reminder, here is the v3 of the patch after a
> > > deeper review and testing. It is now registered to the next
> > > commit fest under the System Administration topic.   

> Ok.  I've built it (but not tested it).  Some comments.

The logic in src/backend/postmaster/syslogger.c
looks good to me.  (The log file is created before you
create the pg_curr_log file, so the worst thing to happen
is that the user gets the old log file, never a non-existent
file.)

In two places in src/backend/postmaster/syslogger.c,
before you call store_current_log_filename()
you have a comment that's more than 80 characters
long.  Please fix.  (But see below for more.)
(http://www.postgresql.org/docs/devel/static/source-format.html)

Given the name store_current_log_filename() the code
comment is not particularly useful.  You might
consider removing the comment.  You might also
consider changing store_currrent_log_filename()
to something like write_pg_log_file().  A little
shorter, and maybe more descriptive.  Maybe enough
so that you can get rid of the comment.

(Are there other functions that create similar files?
Maybe there is a naming convention you can follow?)

(I like comments, but the pg coding style uses fewer
of them than I might use.  Hence my notes above.)

Also, your patch seems to be using spaces, not tabs.
You want tabs.
See the formatting url above.  (I forget whether
the docs use spaces or tabs.  Check the code.)

Other thoughts:

You're going to have to do something for MS Windows
EOL conventions like in logfile_open() of
src/backend/postmaster/syslogger.  You can't just
use a "\n".

The initial pg_log_file file is created by the postmaster.
Subsequent work is done by a logging subprocess.
Are there any permission implications?

src/backend/postmaster/syslogger.c expects to see
fopen() fail with ENFILE and EMFILE.  What will you
do if you get these?  Can you do the same thing
that the log rotation code does and try to update
pg_log_file the next time something logs?  You'd
have to set a flag somewhere and test (in the regular
logging code) since presumably
the next time something is logged the log rotation
code (where all your code is) would no longer be
executed.  This would leave the user with a "stale"
pg_log_file, but I'm not sure what else to do.

Regards the ereport() log level for when you can't
create pg_log_file at all, WARNING seems a bit severe
since LOG is used when the log stops rotating for some
reason.  But there does not seem to be anything else that
fits so WARNING is ok with me.  (See: include/utils/elog.h)

(I'd use LOG if you have to defer the updating of
pg_log_file.  But logging at all here could be problematic.
You wouldn't want to trigger more even more logging
and some sort of tight little loop.  It might be best
_not_ to log in this case.)

Have you given any thought as to when logfile_rotate()
is called?  Since logfile_rotate() itself logs with ereport(),
it would _seem_ safe to ereport() from within your
store_current_log_filename(), called from within
logfile_rotate().  All the same, it wouldn't hurt to be
sure that calling ereport() from within your code
can't trigger another invocation of logfile_rotate()
leading to recursive awfulness.

The indentation of the ereport(), in the part that
continues over multiple lines, in store_current_log_filename()
seems too much.  I think the continued lines should line up with
the "W" in WARNING.

This is what I see at the moment.  I'll wait for replies
before looking further and writing more.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 7 Apr 2016 01:13:51 -0500
"Karl O. Pinc" <kop@meme.com> wrote:

> On Wed, 6 Apr 2016 23:37:09 -0500
> "Karl O. Pinc" <kop@meme.com> wrote:
> 
> > On Wed, 6 Apr 2016 22:26:13 -0500
> > "Karl O. Pinc" <kop@meme.com> wrote:  
> > > On Wed, 23 Mar 2016 23:22:26 +0100
> > > Gilles Darold <gilles.darold@dalibo.com> wrote:
> > >     
> > > > Thanks for the reminder, here is the v3 of the patch after a
> > > > deeper review and testing. It is now registered to the next
> > > > commit fest under the System Administration topic.    

> This is what I see at the moment.  I'll wait for replies
> before looking further and writing more.

Er, one more thing.  Isn't it true that in logfile_rotate()
you only need to call store_current_log_filename() when
logfile_open() is called with a "w" mode, and never need to
call it when logfile_open() is called with an "a" mode?


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 07/04/2016 08:30, Karl O. Pinc a écrit :
> On Thu, 7 Apr 2016 01:13:51 -0500
> "Karl O. Pinc" <kop@meme.com> wrote:
>
>> On Wed, 6 Apr 2016 23:37:09 -0500
>> "Karl O. Pinc" <kop@meme.com> wrote:
>>
>>> On Wed, 6 Apr 2016 22:26:13 -0500
>>> "Karl O. Pinc" <kop@meme.com> wrote:
>>>> On Wed, 23 Mar 2016 23:22:26 +0100
>>>> Gilles Darold <gilles.darold@dalibo.com> wrote:
>>>>
>>>>> Thanks for the reminder, here is the v3 of the patch after a
>>>>> deeper review and testing. It is now registered to the next
>>>>> commit fest under the System Administration topic.
>> This is what I see at the moment.  I'll wait for replies
>> before looking further and writing more.
> Er, one more thing.  Isn't it true that in logfile_rotate()
> you only need to call store_current_log_filename() when
> logfile_open() is called with a "w" mode, and never need to
> call it when logfile_open() is called with an "a" mode?
>
>
> Karl <kop@meme.com>
> Free Software:  "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein
>
>

Hi Karl,

Thank you very much for the patch review and please apologies this too
long response delay. I was traveling since end of April and totally
forgotten this patch. I have applied all your useful feedbacks on the
patch and attached a new one (v4) to this email :

    - Fix the missing <row> in doc/src/sgml/func.sgml
    - Rewrite pg_current_logfile descritption as suggested
    - Remove comment in src/backend/postmaster/syslogger.c
    - Use pgrename to first write the filename to a temporary file
before overriding the last log file.
    - Rename store_current_log_filename() into logfile_writename() -
logfile_getname is used to retrieve the filename.
    - Use logfile_open() to enable CRLF line endings on Windows
    - Change log level for when it can't create pg_log_file, from
WARNING to LOG
    - Remove NOTICE message when the syslogger is not enabled, the
function return null.

On other questions:

> "src/backend/postmaster/syslogger.c expects to see fopen() fail with
ENFILE and EMFILE.  What will you do if you get these?"

    - Nothing, if the problem occurs during the log rotate call, then
log rotation file is disabled so logfile_writename() will not be called.
Case where the problem occurs between the rotation call and the
logfile_writename() call is possible but I don't think that it will be
useful to try again. In this last case log filename will be updated
during next rotation.

> "Have you given any thought as to when logfile_rotate() is called?
Since logfile_rotate() itself logs with ereport(), it would _seem_ safe
to ereport() from within your store_current_log_filename(), called from
within logfile_rotate()."

    - Other part of logfile_rotate() use ereport() so I guess it is safe
to use it.

> "The indentation of the ereport(), in the part that continues over
multiple lines"

    - This was my first thought but seeing what is done in the other
call to ereport() I think I have done it the right way.

> "Isn't it true that in logfile_rotate() you only need to call
store_current_log_filename() when logfile_open() is called with a "w"
mode, and never need to call it when logfile_open() is called with an
"a" mode?"

    - No, it is false, append mode is used when logfile_rotate used an
existing file but the filename still change.


Best regards,

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Tue, 28 Jun 2016 11:06:24 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Thank you very much for the patch review and please apologies this too
> long response delay. I was traveling since end of April and totally
> forgotten this patch. I have applied all your useful feedbacks on the
> patch and attached a new one (v4) to this email :

Hi Gilles,

My schedule is really full at the moment.  I'll get to this
when I have a bit of time.  If you get anxious please write
and I'll see about making faster progress.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Sat, Jul 2, 2016 at 8:56 AM, Karl O. Pinc <kop@meme.com> wrote:
> On Tue, 28 Jun 2016 11:06:24 +0200
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> Thank you very much for the patch review and please apologies this too
>> long response delay. I was traveling since end of April and totally
>> forgotten this patch. I have applied all your useful feedbacks on the
>> patch and attached a new one (v4) to this email :
>
> My schedule is really full at the moment.  I'll get to this
> when I have a bit of time.  If you get anxious please write
> and I'll see about making faster progress.

Moved to next CF, the patch still applies. Karl, you have registered
to review this patch a couple of months back but nothing happened. I
have removed your name for now. If you have time, don't hesitate to
come back to it.
-- 
Michael



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Mon, 3 Oct 2016 13:35:16 +0900
Michael Paquier <michael.paquier@gmail.com> wrote:
> 
> Moved to next CF, the patch still applies. Karl, you have registered
> to review this patch a couple of months back but nothing happened. I
> have removed your name for now. If you have time, don't hesitate to
> come back to it.

Right.  Hope to get back to it soon.  Won't register until I'm ready
to look at it.


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Christoph Berg
Дата:
Hi Gilles,

I've just tried v4 of the patch. The OID you picked for
pg_current_logfile doesn't work anymore, but after increasing it
randomly by 10000, it compiles. I like the added functionality,
especially that "select pg_read_file(pg_current_logfile());" just
works.

What bugs me is the new file "pg_log_file" in PGDATA. It clutters the
directory listing. I wouldn't know where else to put it, but you might
want to cross-check that with the thread that is trying to reshuffle
the directory layout to make it easier to exclude files from backups.
(Should this file be part of backups?)

It's probably correct to leave the file around on shutdown (given it's
still a correct pointer). But there might be a case for removing it on
startup if logging_collector isn't active anymore.

Also, pg_log_file is tab-completion-unfriendly, it conflicts with
pg_log/. Maybe name it current_logfile?

Another thing that might possibly be improved is csv logging:

# select pg_read_file(pg_current_logfile());                        pg_read_file                          
───────────────────────────────────────────────────────────────LOG:  ending log output to stderr
   ↵HINT:  Future log output will go to log destination "csvlog".↵
 

-rw-------  1 cbe staff 1011 Okt  3 15:06 postgresql-2016-10-03_150602.csv
-rw-------  1 cbe staff   96 Okt  3 15:06 postgresql-2016-10-03_150602.log

... though it's unclear what to do if both stderr and csvlog are
selected.

Possibly NULL should be returned if only "syslog" is selected.
(Maybe remove pg_log_file once 'HINT:  Future log output will go to
log destination "syslog".' is logged?)

Christoph



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 03/10/2016 à 16:09, Christoph Berg a écrit :
> Hi Gilles,
>
> I've just tried v4 of the patch. The OID you picked for
> pg_current_logfile doesn't work anymore, but after increasing it
> randomly by 10000, it compiles. I like the added functionality,
> especially that "select pg_read_file(pg_current_logfile());" just
> works.

I've changed the OID and some other things in this v5 of the patch, see
bellow.

> What bugs me is the new file "pg_log_file" in PGDATA. It clutters the
> directory listing. I wouldn't know where else to put it, but you might
> want to cross-check that with the thread that is trying to reshuffle
> the directory layout to make it easier to exclude files from backups.
> (Should this file be part of backups?)
>
> It's probably correct to leave the file around on shutdown (given it's
> still a correct pointer). But there might be a case for removing it on
> startup if logging_collector isn't active anymore.

The file has been renamed into current_logfile and is now removed at
startup if logging_collector is not active. The file can be excluded
from a backup but otherwise if it is restored it will be removed or
overridden at startup. Perhaps the file can give a useful information in
a backup to know the last log file active at backup time, but not sure
it has any interest.

I'm not sure which thread is talking about reshuffling the directory
layout, please give me a pointer if this is not the thread talking about
renaming of pg_xlog and pg_clog. In the future if we have a directory to
store files that must be excluded from backup or status files, it will
be easy to move this file here. I will follow such a change.

> Also, pg_log_file is tab-completion-unfriendly, it conflicts with
> pg_log/. Maybe name it current_logfile?
Right, done.

> Another thing that might possibly be improved is csv logging:
>
> # select pg_read_file(pg_current_logfile());
>                          pg_read_file
> ───────────────────────────────────────────────────────────────
>  LOG:  ending log output to stderr                            ↵
>  HINT:  Future log output will go to log destination "csvlog".↵
>
> -rw-------  1 cbe staff 1011 Okt  3 15:06 postgresql-2016-10-03_150602.csv
> -rw-------  1 cbe staff   96 Okt  3 15:06 postgresql-2016-10-03_150602.log
>
> ... though it's unclear what to do if both stderr and csvlog are
> selected.
>
> Possibly NULL should be returned if only "syslog" is selected.
> (Maybe remove pg_log_file once 'HINT:  Future log output will go to
> log destination "syslog".' is logged?)

I've totally missed that we can have log_destination set to stderr and
csvlog at the same time, so pg_current_logfile() might return two
filenames in this case. I've changed the function to return a setof
record to report the last stderr or csv log file or both. One another
major change is that the current log filename list is also updated after
a configuration reload and not just after a startup or a log rotation.
So in the case of you are switching from stderr to  csvlog or both you
will see immediately the change in current_logfile instead of waiting
for the next log rotation.

  * log_destination set to csvlog only:

postgres=# select * from pg_current_logfile();
          pg_current_logfile
---------------------------------------
 pg_log/postgresql-2016-10-07_1646.csv
(1 row)

* log_destination set to stderr only:

postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

postgres=# select * from pg_current_logfile();
          pg_current_logfile
---------------------------------------
 pg_log/postgresql-2016-10-07_1647.log
(1 row)

* log_destination set to both stderr,csvlog:

postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

postgres=# select * from pg_current_logfile();
          pg_current_logfile
---------------------------------------
 pg_log/postgresql-2016-10-07_1648.log
 pg_log/postgresql-2016-10-07_1648.csv
(2 rows)

When logging_collector is disabled, this function return null.

As the return type has changed to a setof, the query to read the file
need to be change too:

postgres=# SELECT pg_read_file(log) FROM pg_current_logfile() b(log);

pg_read_file

--------------------------------------------------------------------------------------------------------------------------------
LOG:  duration: 0.182 ms  statement: select  pg_read_file(log) from
pg_current_logfile() file(log);+

(1 row)

I can change the return type to a single text[] if that's looks better.

Thanks

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
Christoph Berg
Дата:
Hi Gilles,

Re: Gilles Darold 2016-10-07 <0731a353-8d2f-0f2f-fcd5-fde77114cb7e@dalibo.com>
> > What bugs me is the new file "pg_log_file" in PGDATA. It clutters the
> > directory listing. I wouldn't know where else to put it, but you might
> > want to cross-check that with the thread that is trying to reshuffle
> > the directory layout to make it easier to exclude files from backups.
> > (Should this file be part of backups?)
> >
> > It's probably correct to leave the file around on shutdown (given it's
> > still a correct pointer). But there might be a case for removing it on
> > startup if logging_collector isn't active anymore.
>
> The file has been renamed into current_logfile and is now removed at
> startup if logging_collector is not active. The file can be excluded
> from a backup but otherwise if it is restored it will be removed or
> overridden at startup. Perhaps the file can give a useful information in
> a backup to know the last log file active at backup time, but not sure
> it has any interest.
>
> I'm not sure which thread is talking about reshuffling the directory
> layout, please give me a pointer if this is not the thread talking about
> renaming of pg_xlog and pg_clog. In the future if we have a directory to
> store files that must be excluded from backup or status files, it will
> be easy to move this file here. I will follow such a change.

I meant "exclude from backup", but let's not worry here because it
will be very easy to move the file to this location once it exists.

Thanks for renaming the file, the new name makes the purpose of the
file very clear.

I'm still not very happy about the file being in PGDATA (modulo the
move mentioned above) - how about the idea of putting it into global/?
That location isn't ideal either, but maybe a better place for now?
(pg_control is a similar write-by-rename file in there as well, so
we'd only be constantly rewriting one subdirectory instead of two.)

> > # select pg_read_file(pg_current_logfile());
> >                          pg_read_file
> > ───────────────────────────────────────────────────────────────
> >  LOG:  ending log output to stderr                            ↵
> >  HINT:  Future log output will go to log destination "csvlog".↵
> >
> > -rw-------  1 cbe staff 1011 Okt  3 15:06 postgresql-2016-10-03_150602.csv
> > -rw-------  1 cbe staff   96 Okt  3 15:06 postgresql-2016-10-03_150602.log
> >
> > ... though it's unclear what to do if both stderr and csvlog are
> > selected.
> >
> > Possibly NULL should be returned if only "syslog" is selected.
> > (Maybe remove pg_log_file once 'HINT:  Future log output will go to
> > log destination "syslog".' is logged?)
>
> I've totally missed that we can have log_destination set to stderr and
> csvlog at the same time, so pg_current_logfile() might return two
> filenames in this case. I've changed the function to return a setof
> record to report the last stderr or csv log file or both. One another
> major change is that the current log filename list is also updated after
> a configuration reload and not just after a startup or a log rotation.
> So in the case of you are switching from stderr to  csvlog or both you
> will see immediately the change in current_logfile instead of waiting
> for the next log rotation.

I'm unsure if having two lines of output there is helpful. In the case
where both log destinations are active, pg_read_file(pg_current_logfile())
prints the log contents twice. If I want a specific format, I'll have
to use limit and/or offset which seems fragile.

A better design might be to return two columns instead:

postgres=# select * from pg_current_logfile();            stderr                    |              csvlog
---------------------------------------+---------------------------------------pg_log/postgresql-2016-10-07_1646.log |
pg_log/postgresql-2016-10-07_1646.csv

("stderr" is not a nice column name, but it would at least match the
log_destination setting.)

The on-disk format in current_logfile could then be to always write
two lines, and leave the first or second line empty for the disabled
format.

The downside of that idea is that "pg_read_file(pg_current_logfile())"
won't work, so I'm not sure this change is an improvement in
usability.

(The alternative could be to return an extra column:

postgres=# select * from pg_current_logfile(); type  |     logfile
---------------------------------------stderr | pg_log/postgresql-2016-10-07_1646.logcsvlog |
pg_log/postgresql-2016-10-07_1646.csv

... but this makes (interactive) querying even harder, though it
would scale better to any future added log formats/channels.)

> I can change the return type to a single text[] if that's looks better.

Arrays would be worse, I think.


There's a bug in the SRF code, on rm'ing current_logfile, invoking
pg_current_logfile() crashes the server.

Christoph



Re: Patch to implement pg_current_logfile() function

От
Christoph Berg
Дата:
Re: To Gilles Darold 2016-10-14 <20161014125406.albrfj3qldiwjnrr@msg.df7cb.de>
> A better design might be to return two columns instead:
> 
> postgres=# select * from pg_current_logfile();
>              stderr                    |              csvlog
> ---------------------------------------+---------------------------------------
>  pg_log/postgresql-2016-10-07_1646.log | pg_log/postgresql-2016-10-07_1646.csv

> (The alternative could be to return an extra column:
> 
> postgres=# select * from pg_current_logfile();
>   type  |     logfile
> ---------------------------------------
>  stderr | pg_log/postgresql-2016-10-07_1646.log
>  csvlog | pg_log/postgresql-2016-10-07_1646.csv

Usability-wise it might be better to have pg_current_logfile() just
return the name of the text log (and possibly a HINT that there's a
csv log if stderr is disabled), and have a second function
pg_current_csvlog() return the csv log name.

The choice which design is better will probably depend on if we think
these functions are meant for interactive use (-> 2 functions), or for
automated use (-> 2 columns). My guess would be that interactive use
is more important here.

Christoph



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
<div class="moz-cite-prefix">Le 14/10/2016 à 18:50, Christoph Berg a écrit :<br /></div><blockquote
cite="mid:20161014165007.bzbuzeklsvcr3swh@msg.df7cb.de"type="cite"><pre wrap="">Re: To Gilles Darold 2016-10-14 <a
class="moz-txt-link-rfc2396E"
href="mailto:20161014125406.albrfj3qldiwjnrr@msg.df7cb.de"><20161014125406.albrfj3qldiwjnrr@msg.df7cb.de></a>
</pre><blockquote type="cite"><pre wrap="">A better design might be to return two columns instead:

postgres=# select * from pg_current_logfile();            stderr                    |              csvlog
---------------------------------------+---------------------------------------pg_log/postgresql-2016-10-07_1646.log |
pg_log/postgresql-2016-10-07_1646.csv
</pre></blockquote><blockquote type="cite"><pre wrap="">(The alternative could be to return an extra column:

postgres=# select * from pg_current_logfile(); type  |     logfile
---------------------------------------stderr | pg_log/postgresql-2016-10-07_1646.logcsvlog |
pg_log/postgresql-2016-10-07_1646.csv
</pre></blockquote><pre wrap="">Usability-wise it might be better to have pg_current_logfile() just
return the name of the text log (and possibly a HINT that there's a
csv log if stderr is disabled), and have a second function
pg_current_csvlog() return the csv log name.

The choice which design is better will probably depend on if we think
these functions are meant for interactive use (-> 2 functions), or for
automated use (-> 2 columns). My guess would be that interactive use
is more important here.
</pre></blockquote><br /><p>Agree, the usability of the current version is really a pain. What I've thought is that the
functioncould return the csv log in all case when csvlog is set in log_destination and the stderr log file when csvlog
isnot defined. We can also have a parametrer to ask for a specific log format, like pg_current_logfile('csv') or
pg_current_logfile('stderr').<preclass="moz-signature" cols="72">-- 
 
Gilles Darold
Consultant PostgreSQL
<a class="moz-txt-link-freetext" href="http://dalibo.com">http://dalibo.com</a> - <a class="moz-txt-link-freetext"
href="http://dalibo.org">http://dalibo.org</a>
</pre>

Re: Patch to implement pg_current_logfile() function

От
Christoph Berg
Дата:
Re: Gilles Darold 2016-10-14 <be9571dc-ae7c-63d4-6750-d7243cb5fbc0@dalibo.com>
> Agree, the usability of the current version is really a pain. What I've
> thought is that the function could return the csv log in all case when
> csvlog is set in log_destination and the stderr log file when csvlog is
> not defined. We can also have a parametrer to ask for a specific log
> format, like pg_current_logfile('csv') or pg_current_logfile('stderr').

Good idea to add an optional parameter.

pg_current_logfile(type text DEFAULT 'auto')

pg_current_logfile()
pg_current_logfile('stderr')
pg_current_logfile('csvlog')
pg_current_logfile('auto')

I think I'd prefer the stderr variant by default when both variants
are configured.

Christoph



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 14/10/2016 à 20:45, Christoph Berg a écrit :
> Re: Gilles Darold 2016-10-14 <be9571dc-ae7c-63d4-6750-d7243cb5fbc0@dalibo.com>
>> Agree, the usability of the current version is really a pain. What I've
>> thought is that the function could return the csv log in all case when
>> csvlog is set in log_destination and the stderr log file when csvlog is
>> not defined. We can also have a parametrer to ask for a specific log
>> format, like pg_current_logfile('csv') or pg_current_logfile('stderr').
> Good idea to add an optional parameter.
>
> pg_current_logfile(type text DEFAULT 'auto')
>
> pg_current_logfile()
> pg_current_logfile('stderr')
> pg_current_logfile('csvlog')
> pg_current_logfile('auto')
>
> I think I'd prefer the stderr variant by default when both variants
> are configured.
>
> Christoph
>
>

Here is the v6 of the patch, here is the description of the
pg_current_logfile() function, I have tried to keep thing as simple as
possible:

pg_current_logfile( [ destination text ] )
-----------------------------------------------------

Returns the name of the current log file used by the logging collector,
as text.
Log collection must be active or the return value is undefined. When
csvlog is
used as log destination, the csv filename is returned, when it is set to
stderr,
the stderr filename is returned. When both are used, it returns the
stderr filename.

There is an optional parameter of type text to determines the log
filename to
return following the log format, values can be 'csvlog' or 'stderr'.
When the log
format asked is not used as log destination then the return value is
undefined.

Example of use when log_destination = 'csvlog,stderr'

postgres=# select pg_current_logfile();
           pg_current_logfile
-----------------------------------------
 pg_log/postgresql-2016-10-18_121326.log
(1 row)

postgres=# select pg_current_logfile('csvlog');

           pg_current_logfile
-----------------------------------------
 pg_log/postgresql-2016-10-18_121326.csv
(1 row)

postgres=# select pg_current_logfile('stderr');
           pg_current_logfile
-----------------------------------------
 pg_log/postgresql-2016-10-18_121326.log
(1 row)

postgres=# select pg_current_logfile('csv');
ERROR:  log format csv not supported, possible values are stderr or csvlog

postgres=# select pg_read_file(pg_current_logfile());

pg_read_file
---------------------------------------------------------------------------------------------------
 2016-10-18 14:06:30.576 CEST [8113] ERROR:  invalid input syntax for
integer: "A" at character 10+
 2016-10-18 14:06:30.576 CEST [8113] STATEMENT:  select
1='A';                                    +

(1 row)

I've also fixed the crash when current_logfile is removed by hand.


--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Tue, 18 Oct 2016 14:18:36 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Here is the v6 of the patch, here is the description of the
> pg_current_logfile() function, I have tried to keep thing as simple as
> possible:
>
> pg_current_logfile( [ destination text ] )
> -----------------------------------------------------

Attached is a doc patch to apply on top of your v6 patch.

Changes are, roughly:

Put pg_log_file in alphabetical order in the file name listing
and section body.

Put pg_current_logfile in alphabetical order in the function
name listing and section body.

1 argument functions don't seem to have a parameter name
when listed in documentation tables, just a data type,
so I got rid of the parameter name for pg_current_logfile().

Cleaned up the wording and provided more detail.

Added hyperlink to log_destination config parameter.

Added markup to various system values.  The markup does
not stand out very well in the html docs, but that's a different
issue and should be fixed by changing the markup processing.
(I used the markup found in the log_destination()
config parameter docs.)

pg_current_logfile does not seem related to pg_listening_channels
or pg_notification_queue_usage so I moved the latter 2 index
entries closer to their text.


I also have a couple of preliminary comments.

It seems like the code is testing for whether the
logfile is a CSV file by looking for a '.csv' suffix
on the end of the file name.  This seems a little fragile.
Shouldn't it be looking at something set internally when the
log_destination config declaration is parsed?

Since pg_log_file may contain only one line, and that
line may be either the filename of the csv log file or
the file name of the stderr file name it's impossible
to tell whether that single file is in csv or stderr
format.  I suppose it might be possible based on file
name suffix, but Unix expressly ignores file name
extensions and it seems unwise to force dependence
on them.  Perhaps each line could begin with
the "type" of the file, either 'csv' or 'stderr'
followed by a space and the file name?

In other words,
as long as you're making the content of pg_log_file
a data structure that contains more than just a single
file name you may as well make that data structure
something well-defined, easily parseable in shell, extensible,
and informative.

Hope to provide more feedback soon.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Tue, 25 Oct 2016 22:30:48 -0500
"Karl O. Pinc" <kop@meme.com> wrote:

> Hope to provide more feedback soon.

Before I forget:

"make check" fails, due to oid issues with pg_current_logfile().

You're writing Unix eol characters into pg_log_file.  (I think.)
Does this matter on MS Windows?  (I'm not up on MS Windows,
and haven't put any thought into this at all.  But didn't
want to forget about the potential issue.)

Now that pg_log_file contains multiple lines shouldn't
it be called pg_log_files?

In the docs, other functions that take optional arguments
show up as multiple rows.  Attached is a new version of
my patch to the v6 patch which fixes this and supplies
a slightly better short description of pg_log_filename().

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Tue, 25 Oct 2016 22:53:41 -0500
"Karl O. Pinc" <kop@meme.com> wrote:

> On Tue, 25 Oct 2016 22:30:48 -0500
> "Karl O. Pinc" <kop@meme.com> wrote:
>
> > Hope to provide more feedback soon.

Er, attached is yet another doc patch to the v6 patch.
Sorry about that.

Changes pg_current_logfile() detailed documentation.

Instead of saying that return values are undefined
I've documented that NULL is returned.  And
reworded, and shortened, my previous wording.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Another new version of a doc patch to the v6 patch.

More better English.  *sigh*

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Tue, 25 Oct 2016 22:30:48 -0500
"Karl O. Pinc" <kop@meme.com> wrote:

> Since pg_log_file may contain only one line, and that
> line may be either the filename of the csv log file or
> the file name of the stderr file name it's impossible
> to tell whether that single file is in csv or stderr
> format.  I suppose it might be possible based on file
> name suffix, but Unix expressly ignores file name
> extensions and it seems unwise to force dependence
> on them.  Perhaps each line could begin with
> the "type" of the file, either 'csv' or 'stderr'
> followed by a space and the file name?  
> 
> In other words,
> as long as you're making the content of pg_log_file
> a data structure that contains more than just a single
> file name you may as well make that data structure
> something well-defined, easily parseable in shell, extensible,
> and informative.

While you're at it, it wouldn't hurt to provide another
function that tells you the format of the file returned
by pg_current_logfile(), since pg_current_logfile()
called without arguments could return either a stderr
formatted file or a csvlog formatted file.

Or leave it for the future.  Just a thought.


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 26/10/2016 à 05:30, Karl O. Pinc a écrit :
On Tue, 18 Oct 2016 14:18:36 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

Here is the v6 of the patch, here is the description of the
pg_current_logfile() function, I have tried to keep thing as simple as
possible:

pg_current_logfile( [ destination text ] )
-----------------------------------------------------
Attached is a doc patch to apply on top of your v6 patch.

Thanks a lot for the documentation fixes, I've also patched some of your changes, see v7 of the patch and explanations bellow.

Changes are, roughly:

Put pg_log_file in alphabetical order in the file name listing
and section body.

This file is now named current_logfile, I have changed the named in the documentation, especially in storage.sgml. Sorry for missing that in my v6 patch.

One other change in documentation is the explanation of values stored in this file. This is a path: log_directory/log_filename, and no more the log file name only. This will help to get full path of the log at system command level. This is also the value returned by function the pg_current_logfile() to be able to read file directly with a simple:

     SELECT pg_read_file(pg_current_logfile());

It can be also useful if the file is included in a backup to find the last log corresponding to the backup.


I also have a couple of preliminary comments.

It seems like the code is testing for whether the
logfile is a CSV file by looking for a '.csv' suffix
on the end of the file name.  This seems a little fragile.
Shouldn't it be looking at something set internally when the
log_destination config declaration is parsed?

Right, even if syslogger.c always adds the .csv suffix to the log file. This method would failed if the log file was using this extension even for a stderr format. It has been fixed in the v7 patch to no more use the extension suffix.

Since pg_log_file may contain only one line, and that
line may be either the filename of the csv log file or
the file name of the stderr file name it's impossible
to tell whether that single file is in csv or stderr
format.  I suppose it might be possible based on file
name suffix, but Unix expressly ignores file name
extensions and it seems unwise to force dependence
on them.  Perhaps each line could begin with
the "type" of the file, either 'csv' or 'stderr'
followed by a space and the file name? 

The current_logfile may contain 2 lines, one for csvlog and an other for stderr when they are both defined in log_destination. As said above, the csvlog file will always have the .csv suffix, I guess it is enough to now the format but I agree that it will not be true in all cases. To keep things simple I prefer to only register the path to the log file, external tools can easily detect the format or can ask for the path to a specific log format using SELECT pg_current_logfile('stderr'); for example. This is my point of view, but if there's a majority to add the log format into the current_logfile why not.

I have copy/paste here your other comments to limit the number of messages:

> You're writing Unix eol characters into pg_log_file.  (I think.)
> Does this matter on MS Windows?  (I'm not up on MS Windows,
> and haven't put any thought into this at all.  But didn't
> want to forget about the potential issue.)

This doesn't matter as the file is opened in text mode (see logfile_open() in syslogger.c) and use of LF in fprintf() will be automatically converted to CRLF on MS Windows.

> While you're at it, it wouldn't hurt to provide another
> function that tells you the format of the file returned
> by pg_current_logfile(), since pg_current_logfile()
> called without arguments could return either a stderr
> formatted file or a csvlog formatted file.

The log format can be check using something like:

postgres=# SELECT pg_current_logfile(), pg_current_logfile('csvlog');
           pg_current_logfile            |           pg_current_logfile           
-----------------------------------------+-----------------------------------------
 pg_log/postgresql-2016-10-26_231700.log | pg_log/postgresql-2016-10-26_231700.csv
(1 row)

postgres=# SELECT CASE WHEN (pg_current_logfile() = pg_current_logfile('stderr')) THEN 'stderr' ELSE 'csvlog' END AS log_format;
 log_format
------------
 stderr
(1 row)

but if we want we can have an other internal function with a call like:

    SELECT pg_log_format(pg_current_logfile());

that will return stderr or csvlog but I'm not sure this is necessary.


Thanks for your review, let me know if there's any thing to adapt.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org
Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 27 Oct 2016 00:31:56 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Thanks a lot for the documentation fixes, I've also patched some of
> your changes, see v7 of the patch and explanations bellow.

Thanks.  Sorry if I've not kept up on your latest decisions.

> > Put pg_log_file in alphabetical order in the file name listing
> > and section body.  
> 
> This file is now named current_logfile, I have changed the named in
> the documentation, especially in storage.sgml. 

Since it now can contain multiple pathnames perhaps the name of the
file should be "current_logfiles"?  Seems more descriptive.

> One other change in documentation is the explanation of values stored
> in this file. This is a path: log_directory/log_filename, and no more
> the log file name only. This will help to get full path of the log at
> system command level. This is also the value returned by function the
> pg_current_logfile() to be able to read file directly with a simple:
> 
>      SELECT pg_read_file(pg_current_logfile());

Sounds good.

> > I also have a couple of preliminary comments.

> > Since pg_log_file may contain only one line, and that
> > line may be either the filename of the csv log file or
> > the file name of the stderr file name it's impossible
> > to tell whether that single file is in csv or stderr
> > format.  I suppose it might be possible based on file
> > name suffix, but Unix expressly ignores file name
> > extensions and it seems unwise to force dependence
> > on them.  Perhaps each line could begin with
> > the "type" of the file, either 'csv' or 'stderr'
> > followed by a space and the file name?   
> 
> The current_logfile may contain 2 lines, one for csvlog and an other
> for stderr when they are both defined in log_destination. As said
> above, the csvlog file will always have the .csv suffix, I guess it
> is enough to now the format but I agree that it will not be true in
> all cases. To keep things simple I prefer to only register the path
> to the log file, external tools can easily detect the format or can
> ask for the path to a specific log format using SELECT
> pg_current_logfile('stderr'); for example. This is my point of view,
> but if there's a majority to add the log format into the
> current_logfile why not.

Let me explain my reasoning:  The current_logfile file's location
is not typically going to change.  The cluster gets created and
current_logfile then has a known location.  

Having a known location in the filesystem for current_logfile is sort 
of the point of current_logfile.  A script knows where to find it,
to look at it's content and find out where to get the log file
it needs.  Execing a psql process or otherwise creating a database
connection, just in order to execute "SELECT
pg_current_logfile('stderr');" to discover where the current logs
are, is _way_ more complicated and error prone that reading
a file's content.

But what if current_logfile contains only a single line?  What
sort of file format does the logfile have?  If you don't know
you can't process the logfile content.

When there's multiple lines in current_logfile your script might
be looking for a logfile in a particular format.  How is the
script supposed to know the file format of each logfile listed?
Relying on a '.csv' file name extension is adequate, but
only under a number of assumptions.  What if PG adds another
logfile format that's different from stderr; how would this
new format be distinguished from stderr?  What if MS Excel
changes the CSV format (yet again) or a non MS Excel CSV
format is supported by PG, so that PG supports
multiple CSV formats; how would the different CSV
file formats be distinguished from each other.  Without
going on and on, there's good reasons why Unix does not
rely on filename extensions to determine the format of file
content.  Sure, scripts could always do "something", like
exec the "file" program, to discover, or at least guess
at, the format of the log file.  Or we could have PG
be forever dependent upon using filename extensions
when introducing new logfile formats.  There's always
kludges which will handle new circumstances.  But why
go there?  PG knows the format of the files it's writing into
current_logfile. 

My argument is strongly related to the "Explicit is better
than implicit" philosophy of design.
It's not hard to cleanly expose the logfile format to the 
reader of current_logfile; guessing would never be required
no matter the future of PG.
"current_logfile" must then be parsed, but if you _are_ going
to expose the file format then putting it into the same file
as the logfile pathnames is the way to go.

"format <space> path" is as easy to parse as looking for 
a ".csv" suffix, and a lot more clear and future proof.  (Even 
works with spaces in the "path", in shell, using the "read"
builtin.)  It does mean that every user of current_logfile _must_ 
parse.  If you don't put an explicit format into the file content
readers who already know what file format they're going
to get don't have to parse.  Making these readers parse
does not seem like too high a price.

What it comes down to is I don't buy the adequacy of the
".csv" suffix test and think that "keeping things simple" now
is a recipe for future breakage, or at least significant future
complication and confusion when it come to processing logfile 
content.

Regards the data structure to use to expose the file format
I can't vouch that "format path" is most future-proof.
It's what I came up with on the spur of the moment.
Something like: "format <format>: path <path>",
where ":" is the field separator and each data element is
tagged, would still be parseable by the shell "read" built-in
so long as the path comes last. I don't really care about 
the exact data structure but I do think the file format
meta-information should be included.

> I have copy/paste here your other comments to limit the number of
> messages:

Thanks.  I was not very disciplined and wrote a lot of emails.

> > You're writing Unix eol characters into pg_log_file.  (I think.)
> > Does this matter on MS Windows?  (I'm not up on MS Windows,
> > and haven't put any thought into this at all.  But didn't
> > want to forget about the potential issue.)  
> 
> This doesn't matter as the file is opened in text mode (see
> logfile_open() in syslogger.c) and use of LF in fprintf() will be
> automatically converted to CRLF on MS Windows.

This I should have recalled.  Thanks for the explanation.

> > While you're at it, it wouldn't hurt to provide another
> > function that tells you the format of the file returned
> > by pg_current_logfile(), since pg_current_logfile()
> > called without arguments could return either a stderr
> > formatted file or a csvlog formatted file.  
> 
> The log format can be check using something like:
> 
> postgres=# SELECT pg_current_logfile(), pg_current_logfile('csvlog');
>            pg_current_logfile            |          
> pg_current_logfile           
> -----------------------------------------+-----------------------------------------
>  pg_log/postgresql-2016-10-26_231700.log |
> pg_log/postgresql-2016-10-26_231700.csv
> (1 row)
> 
> postgres=# SELECT CASE WHEN (pg_current_logfile() =
> pg_current_logfile('stderr')) THEN 'stderr' ELSE 'csvlog' END AS
> log_format; log_format
> ------------
>  stderr
> (1 row)
> 
> but if we want we can have an other internal function with a call
> like:
> 
>     SELECT pg_log_format(pg_current_logfile());
> 
> that will return stderr or csvlog but I'm not sure this is necessary.

Why not just: SELECT pg_log_format(); since you only ever need to
know what log format is returned by pg_current_logfile() when called
without arguments?  Otherwise you already know the log format because
you asked for something specific.

My thoughts are as follows:  Either you know the log format because
you configured the cluster or you don't.  If you don't know the log
format having the log file is halfway useless.  You can do something
like back it up, but you can't ever look at it's contents (in some
sense) without knowing what data structure you're looking at.

Therefore pg_current_logfile() without any arguments is, in the sense
of any sort of automated processing of the logfile content, useless.
You must know the format of the returned logfile.  Anybody
who's doing automated log file processing, on a cluster which they
did not configure, must write a "pg_log_format()" equivalent
function.  (Probably, as you say, by using the pg_current_logfile(text)
function.)  Not everybody's going to be doing this sort of logfile
processing but I'd think it'd be common enough that a pg_log_format()
function would be in some demand.

In a lot of ways this argument is related to the one above.  I see
a need for automated processing of logfiles in arbitrarily configured
PG clusters.

> Thanks for your review, let me know if there's any thing to adapt.

I'll look over your v7 patch very soon, I hope.  If I have documentation
edits I'll probably send these first.  Knowing what the patch is
supposed to do helps me when reading the code.  :)

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Christoph Berg
Дата:
Re: Karl O. Pinc 2016-10-27 <20161026222513.74cd3def@slate.meme.com>
> Since it now can contain multiple pathnames perhaps the name of the
> file should be "current_logfiles"?  Seems more descriptive.

Not sure about that, most often it would contain one logfile only, and
catering for that seems fair to me. (But see the next comment below)

> But what if current_logfile contains only a single line?  What
> sort of file format does the logfile have?  If you don't know
> you can't process the logfile content.
> 
> When there's multiple lines in current_logfile your script might
> be looking for a logfile in a particular format.  How is the
> script supposed to know the file format of each logfile listed?

My idea here would be to always write out two lines, the first for
stderr, the second for csvlog, and leave the unused one empty. That's
easy to parse from shell scripts.

> Regards the data structure to use to expose the file format
> I can't vouch that "format path" is most future-proof.
> It's what I came up with on the spur of the moment.
> Something like: "format <format>: path <path>",
> where ":" is the field separator and each data element is
> tagged, would still be parseable by the shell "read" built-in
> so long as the path comes last. I don't really care about 
> the exact data structure but I do think the file format
> meta-information should be included.

I guess that depends on how likely we think new log formats would be
added in the future. My guess would be that it's rather unlikely, so
going with simple file format makes sense.

> Why not just: SELECT pg_log_format(); since you only ever need to
> know what log format is returned by pg_current_logfile() when called
> without arguments?  Otherwise you already know the log format because
> you asked for something specific.

That function already exists: "show log_destination;"

> Therefore pg_current_logfile() without any arguments is, in the sense
> of any sort of automated processing of the logfile content, useless.

The function without arguments is very useful for interactive use,
which is the primary point of this patch in my opinion.

Christoph



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 27 Oct 2016 11:07:43 +0200
Christoph Berg <myon@debian.org> wrote:

> Re: Karl O. Pinc 2016-10-27 <20161026222513.74cd3def@slate.meme.com>

> > But what if current_logfile contains only a single line?  What
> > sort of file format does the logfile have?  If you don't know
> > you can't process the logfile content.
> > 
> > When there's multiple lines in current_logfile your script might
> > be looking for a logfile in a particular format.  How is the
> > script supposed to know the file format of each logfile listed?  
> 
> My idea here would be to always write out two lines, the first for
> stderr, the second for csvlog, and leave the unused one empty. That's
> easy to parse from shell scripts.

That'd work.

> 
> > Regards the data structure to use to expose the file format
> > I can't vouch that "format path" is most future-proof.
> > It's what I came up with on the spur of the moment.
> > Something like: "format <format>: path <path>",
> > where ":" is the field separator and each data element is
> > tagged, would still be parseable by the shell "read" built-in
> > so long as the path comes last. I don't really care about 
> > the exact data structure but I do think the file format
> > meta-information should be included.  
> 
> I guess that depends on how likely we think new log formats would be
> added in the future. My guess would be that it's rather unlikely, so
> going with simple file format makes sense.

Agreed.  I can't see adding any more meta-information other than
file format.

I'm partial to "format <space> path" over just line number, because
it's more explicit.  Either way works.

> > Why not just: SELECT pg_log_format();

> That function already exists: "show log_destination;"

Great, done!  :)

> > Therefore pg_current_logfile() without any arguments is, in the
> > sense of any sort of automated processing of the logfile content,
> > useless.  
> 
> The function without arguments is very useful for interactive use,
> which is the primary point of this patch in my opinion.

Your comment makes me wonder if pg_current_logfile(), without
arguments, should instead be "SHOW current_logfile;".

I think not, but have no rationale.  Could this be a good idea?

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 27/10/2016 à 14:14, Karl O. Pinc a écrit :
> On Thu, 27 Oct 2016 11:07:43 +0200
> Christoph Berg <myon@debian.org> wrote:
>
>> Re: Karl O. Pinc 2016-10-27 <20161026222513.74cd3def@slate.meme.com>
>>> But what if current_logfile contains only a single line?  What
>>> sort of file format does the logfile have?  If you don't know
>>> you can't process the logfile content.
>>>
>>> When there's multiple lines in current_logfile your script might
>>> be looking for a logfile in a particular format.  How is the
>>> script supposed to know the file format of each logfile listed?
>> My idea here would be to always write out two lines, the first for
>> stderr, the second for csvlog, and leave the unused one empty. That's
>> easy to parse from shell scripts.
> That'd work.

I don't think we have to store two lines when just one of the log format
is enabled. If you grep for csvlog in the file and it is not present you
will just know that csvlog is not enabled. If the log format is present
but without the path I think it adds useless information.

The current v8 of the patch only register "format<space>path" for the
enabled log destination, the description of the current_logfiles file
have been change to:

"
A file recording the current log file(s) used by the syslogger and its
log format, stderr or csvlog. Each line of the file is a space separated
list of two elements: the log format and the full path to the log file
including the value of log_directory. The log format must be present in
log_destination to be listed in the file. This file is present only when
logging_collector is activated.
"

The file have been renamed current_logfile*s*


> Agreed.  I can't see adding any more meta-information other than
> file format.
>
> I'm partial to "format <space> path" over just line number, because
> it's more explicit.  Either way works.

This is the format used. Ex:

~$ cat /usr/local/postgresql/data/current_logfile
stderr pg_log/postgresql-2016-10-27_185100.log
csvlog pg_log/postgresql-2016-10-27_185100.csv


> Your comment makes me wonder if pg_current_logfile(), without
> arguments, should instead be "SHOW current_logfile;".
>
> I think not, but have no rationale.  Could this be a good idea?
>

-1, SHOW is used to display run-time parameters values in our case this
is log_destination + log_directory + log_filename. current_logfile is a
filename not a parameter name.

Thanks again for your reviews.

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 27 Oct 2016 19:03:11 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> >> Re: Karl O. Pinc 2016-10-27
> >> <20161026222513.74cd3def@slate.meme.com>  

> > Your comment makes me wonder if pg_current_logfile(), without
> > arguments, should instead be "SHOW current_logfile;".

> -1, SHOW is used to display run-time parameters values

Another interface to consider might be a system catalog:

SELECT * from postgres.pg_current_logfile;

format | path
-------+-------------------
syslog | /some/where/log
cvslog | /some/where/log.csv

(2 rows)

Maybe good if the goal is "interactive use".  Seems like
overkill to me, but thought I'd present the idea
anyway.



Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Christoph Berg
Дата:
Re: Gilles Darold 2016-10-27 <ee7efb7d-fd41-4dfd-29ed-6d05044d1cbc@dalibo.com>
> > I'm partial to "format <space> path" over just line number, because
> > it's more explicit.  Either way works.
> 
> This is the format used. Ex:
> 
> ~$ cat /usr/local/postgresql/data/current_logfile
> stderr pg_log/postgresql-2016-10-27_185100.log
> csvlog pg_log/postgresql-2016-10-27_185100.csv

On a closer look, I like this better than my "always two lines"
suggestion. +1.

> -1, SHOW is used to display run-time parameters values in our case this
> is log_destination + log_directory + log_filename. current_logfile is a
> filename not a parameter name.

I'm not sure if it's even possible to put non-GUC information in
there. It might also be a performance problem, if pg_current_logfiles
had to read-in for every "select * from pg_settings".

Re: Karl O. Pinc 2016-10-27 <20161027121141.6bd95efd@slate.meme.com>
> Another interface to consider might be a system catalog:
> 
> SELECT * from postgres.pg_current_logfile;
> 
> format | path
> -------+-------------------
> syslog | /some/where/log
> cvslog | /some/where/log.csv
> 
> (2 rows)
> 
> Maybe good if the goal is "interactive use".  Seems like
> overkill to me, but thought I'd present the idea
> anyway.

We were discussing exactly that idea upthread before concluding that a
function with a single return value is much easier to use.

Christoph



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 27 Oct 2016 19:57:18 +0200
Christoph Berg <myon@debian.org> wrote:

> Re: Karl O. Pinc 2016-10-27 <20161027121141.6bd95efd@slate.meme.com>
> > SELECT * from postgres.pg_current_logfile;

> We were discussing exactly that idea upthread before concluding that a
> function with a single return value is much easier to use.

Apologies for raising the dead this Halloween season.  :)

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 27 Oct 2016 19:03:11 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> The current v8 of the patch

Attached is a doc patch for your v8 patch.

Added <row>, so the docs would build.

Added markup of "system values".


Hope to look at code soon!


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 27 Oct 2016 16:18:02 -0500
"Karl O. Pinc" <kop@meme.com> wrote:

> On Thu, 27 Oct 2016 19:03:11 +0200
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
> > The current v8 of the patch
>
> Attached is a doc patch for your v8 patch.
>
> Added <row>, so the docs would build.
>
> Added markup of "system values".

Drat.  The attached patch of my last email was against master.
New attached patch is to apply on top of the v8 patch.  Sorry.


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 27 Oct 2016 19:03:11 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> The current v8 of the patch

Attached is a patch to the v8 version of your patch.

It rewords some of the comments in the code.  Take the hunks
or leave them as you wish.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Gilles,

On Thu, 27 Oct 2016 19:03:11 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> The current v8 of the patch

For your consideration.

Attached is a patch to apply to v8 of your patch.

I moved the call to logfile_writename() in write_syslogger_file()
into the open_csvlogfile().  That's where the new filename
is generated and it makes sense to record the new name
upon generation.

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 27 Oct 2016 22:09:41 -0500
"Karl O. Pinc" <kop@meme.com> wrote:
> On Thu, 27 Oct 2016 19:03:11 +0200
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
> > The current v8 of the patch
>
> For your consideration.
>
> Attached is a patch to apply to v8 of your patch.
>
> I moved the call to logfile_writename() in write_syslogger_file()
> into the open_csvlogfile().  That's where the new filename
> is generated and it makes sense to record the new name
> upon generation.

Er.  Actually attached the patch this time.

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 27 Oct 2016 19:03:11 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> The current v8 of the patch

Perhaps instead of the define CURRENT_LOG_FILENAME
a better name for the symbol would be LOG_METAINFO_FILE?

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 28/10/2016 à 05:09, Karl O. Pinc a écrit :
> Hi Gilles,
>
> On Thu, 27 Oct 2016 19:03:11 +0200
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> The current v8 of the patch
> For your consideration.
>
> Attached is a patch to apply to v8 of your patch.
>
> I moved the call to logfile_writename() in write_syslogger_file()
> into the open_csvlogfile().  That's where the new filename
> is generated and it makes sense to record the new name
> upon generation.

Yes, it make sense. The last change to documentation, comment and this
move have been included in the v9 of the patch, attached here.

Thanks a lot.

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Fri, 28 Oct 2016 10:03:37 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> ...
> the v9 of the patch, attached here.

I notice that there are a number of user-supplied GUC
values for log_destination that are repeatedly used,
both in the GUC code and in your patch.  These are
presently written as hardcoded strings.

Attached are 2 patches which abstract the values a
user is supposed to supply for log_destination.


patch_pg_current_logfile-v9.diff.guc_values-part1
This applies to both master HEAD and on top of your v9
patch.  It abstracts the user-supplied values within
the GUC code.

patch_pg_current_logfile-v9.diff.guc_values-part2
This applies on top of your v9 patch.

I couldn't find a good place to put the newly defined symbols
in the existing code so the part1 patch creates
src/include/utils/guc_values.h.  Someone who knows
the code better than me would be better able to judge
if making a new .h file is a good idea.  Likewise, I presume
that a "GUCV_" prefix for the new symbols is good, but this
too could use review.

The odd part about the part1 patch is that GUCV_EVENTLOG
is never used anywhere but in src/backend/utils/misc/guc.c.
But it is used twice there and it seemed like as long as
I was doing the rest of the log_destination values I should
abstract eventlog too.

If we use these patches I propose that we keep the
part1 patch and submit it separately to the committers.
Seems like it'd be easier to review/commit when the changes to
existing code are kept separate from new code.

> Thanks a lot.

Thank you also for considering my ideas.  :)

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Fri, 28 Oct 2016 10:03:37 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> ...
> v9 of the patch, attached here.

Attached are 2 more documentation patchs to apply on
top of your v9 patch.


patch_pg_current_logfile-v9.diff.doc_current_logfiles

Explains the current_logfiles file in the
narrative documentation.  It's not like I want
to toot our horn here.  I'm afraid that otherwise
no one will notice the feature.


patch_pg_current_logfile-v9.diff.doc_indexes

Fixes an index entry and add more.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 29/10/2016 à 14:38, Karl O. Pinc a écrit :
> On Fri, 28 Oct 2016 10:03:37 +0200
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> ...
>> v9 of the patch, attached here.
> Attached are 2 more documentation patchs to apply on
> top of your v9 patch.
>
>
> patch_pg_current_logfile-v9.diff.doc_current_logfiles
>
> Explains the current_logfiles file in the
> narrative documentation.  It's not like I want
> to toot our horn here.  I'm afraid that otherwise
> no one will notice the feature.
>
>
> patch_pg_current_logfile-v9.diff.doc_indexes
>
> Fixes an index entry and add more.
>
> Regards,
>
> Karl <kop@meme.com>
> Free Software:  "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein

The attached v10 of the current_logfiles patch include your last changes
on documentation but not the patch on v9 about the user-supplied GUC
value. I think the v10 path is ready for committers and that the
additional patch to add src/include/utils/guc_values.h to define user
GUC values is something that need to be taken outside this one. Imo,
thoses GUC values (stderr, csvlog) are not expected to change so often
to require a global definition, but why not, if committers think this
must be done I can add it to a v11 patch.

Best regards,

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Sat, 29 Oct 2016 22:00:08 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> The attached v10 of the current_logfiles patch include your last
> changes on documentation but not the patch on v9 about the
> user-supplied GUC value. I think the v10 path is ready for committers
> and that the additional patch to add src/include/utils/guc_values.h
> to define user GUC values is something that need to be taken outside
> this one. Imo, thoses GUC values (stderr, csvlog) are not expected to
> change so often to require a global definition, but why not, if
> committers think this must be done I can add it to a v11 patch.

I agree that the guc_values.h patches should be submitted separately
to the committers, when your patch is submitted.  Creating symbols
for these values is a matter of coding style they should resolve.
(IMO it's not whether the values will change, it's whether
someone reading the code can read the letters "stdout" and know
to what they refer and where to find related usage elsewhere in
the code.)

I'll keep up the guc_values.h patches and have them ready for
submission along with your patch.

I don't think your patch is quite ready for submission to
the committers.

Attached is a patch to be applied on top of your v10 patch
which does basic fixup to logfile_writename().

I have some questions about logfile_writename():

 Why does the logfile_open() call fail silently?
 Why not use ereport() here?  (With a WARNING level.)

 Why do the ereport() invocations all have a LOG level?
 You're not recovering and the errors are unexpected
 so I'd think WARNING would be the right level.
 (I previously, IIRC, suggested LOG level -- only if
 you are retrying and recovering from an error.)


Have you given any thought to my proposal to change
CURRENT_LOG_FILENAME to LOG_METAINFO_FILE?


I'm not sure I've looked at every bit of your patch
yet.  I won't have much time to do more real work
until after Tuesday morning.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Tue, 28 Jun 2016 11:06:24 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Le 07/04/2016 08:30, Karl O. Pinc a écrit :

> > "src/backend/postmaster/syslogger.c expects to see fopen() fail
> > with
> ENFILE and EMFILE.  What will you do if you get these?"
>
>     - Nothing, if the problem occurs during the log rotate call, then
> log rotation file is disabled so logfile_writename() will not be
> called. Case where the problem occurs between the rotation call and
> the logfile_writename() call is possible but I don't think that it
> will be useful to try again. In this last case log filename will be
> updated during next rotation.

The case I'm interested in is when the rotation call succeeds but
you get ENFILE or EMFILE in logfile_writename() and current_logfiles
is not updated.

This looks like an ugly problem that only happens
sporadically under load.  If I've set log
rotation to, say, 1 week, and I'm post-processing my logs based
on the current_logfiles content, and the logs rotate but
current_logfiles is not updated, then I lose a week of log
post-processing.

I'm experimenting with some code that retries writing current_logfiles,
if it failed due to ENFILE or EMFILE, the next time a log message
is written.  If that's too often it'd be easy enough to add
a backoff counter that retries progressively less frequently
based on a "clock tick" per log message write.

However, per my last email, it'll be Tuesday before I really get
back to this.   Let me know if, instead, you want to jump in
and write the code, have a better idea, think this is a really
stupid approach, or have other reasons why I should abandon
this plan.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Gilles,

On Sat, 29 Oct 2016 22:00:08 +0200
Gilles Darold <gilles.darold@dalibo.com> wrote:

> The attached v10 of the current_logfiles patch

Attached is a patch to apply on top of the v10 patch.

It updates current_logfiles only once per log rotation.
I see no reason to open and write the file twice
if both csvlog and stderr logging is happening.

I have 2 more questions.

I don't understand why you're calling
logfile_writename(last_file_name, last_csv_file_name);
in the SIGHUP code.  Presumably you've already
written the old logfile names to current_logfiles.
On SIGHUP you want to write the new names, not
the old ones.  But the point of the SIGHUP code
is to look for changes in logfile writing and then
call logfile_rotate() to make those changes.  And
logfile_rotate() calls logfile_writename() as appropriate.
Shouldn't the logfile_writename call in the SIGHUP
code be eliminated?

Second, and I've not paid really close attention here,
you're calling logfile_writename() with last_file_name
and last_csv_file_name in a number of places.  Are you
sure that last_file_name and last_csv_file_name is reset
to the empty string if stderr/csvlog logging is turned
off and the config file re-read?  You might be recording
that logging is going somewhere that's been turned off
by a config change.  I've not noticed last_file_name and
(especially) last_csv_file_name getting reset to the empty
string.

FYI, ultimately, in order to re-try writes to current_logfiles
after ENFILE and EMFILE failure, I'm thinking that I'll probably
wind up with logfile_writename() taking no arguments.  It will
always write last_file_name and last_csv_file_name.  Then it's
a matter of making sure that these variables contain accurate
values.  It would be helpful to let me know if you have any
insight regards config file re-read and resetting of these
variables before I dive into writing code which retries writes to
current_logfiles.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 30/10/2016 à 08:04, Karl O. Pinc a écrit :
> On Sat, 29 Oct 2016 22:00:08 +0200
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> The attached v10 of the current_logfiles patch include your last
>> changes on documentation but not the patch on v9 about the
>> user-supplied GUC value. I think the v10 path is ready for committers
>> and that the additional patch to add src/include/utils/guc_values.h
>> to define user GUC values is something that need to be taken outside
>> this one. Imo, thoses GUC values (stderr, csvlog) are not expected to
>> change so often to require a global definition, but why not, if
>> committers think this must be done I can add it to a v11 patch.
> I agree that the guc_values.h patches should be submitted separately
> to the committers, when your patch is submitted.  Creating symbols
> for these values is a matter of coding style they should resolve.
> (IMO it's not whether the values will change, it's whether
> someone reading the code can read the letters "stdout" and know
> to what they refer and where to find related usage elsewhere in
> the code.)
>
> I'll keep up the guc_values.h patches and have them ready for
> submission along with your patch.
>
> I don't think your patch is quite ready for submission to
> the committers.
>
> Attached is a patch to be applied on top of your v10 patch
> which does basic fixup to logfile_writename().

Attached patch v11 include your patch.

>
> I have some questions about logfile_writename():
>
>  Why does the logfile_open() call fail silently?
>  Why not use ereport() here?  (With a WARNING level.)

logfile_open() "fail silently" in logfile_writename(), like in other
parts of syslogger.c where it is called, because this is a function()
that already report a message when an error occurs ("could not open log
file..."). I think I have already replied to this question.

>  Why do the ereport() invocations all have a LOG level?
>  You're not recovering and the errors are unexpected
>  so I'd think WARNING would be the right level.
>  (I previously, IIRC, suggested LOG level -- only if
>  you are retrying and recovering from an error.)

If you look deeper in syslogger.c, all messages are reported at LOG
level. I guess the reason is to not call syslogger repeatedly. I think I
have already replied to this question in the thread too.

> Have you given any thought to my proposal to change
> CURRENT_LOG_FILENAME to LOG_METAINFO_FILE?
Yes, I don't think the information logged in this file are kind of meta
information and CURRENT_LOG_FILENAME seems obvious.



--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 31/10/2016 à 04:35, Karl O. Pinc a écrit :
> Hi Gilles,
>
> On Sat, 29 Oct 2016 22:00:08 +0200
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> The attached v10 of the current_logfiles patch 
> Attached is a patch to apply on top of the v10 patch.
>
> It updates current_logfiles only once per log rotation.
> I see no reason to open and write the file twice
> if both csvlog and stderr logging is happening.

I do not take the effort to do that because log rotation is not
something that occurs too often and I don't want to change the
conditional "time_based_rotation" code lines in logfile_rotate() for
readability. My though was also that on high load, log rotation is
automatically disabled so logfile_writename() is not called and there
will be no additional I/O. But why not, if commiters want to save this
additional I/O, this patch can be applied.
> I have 2 more questions.
>
> I don't understand why you're calling 
> logfile_writename(last_file_name, last_csv_file_name);
> in the SIGHUP code.  Presumably you've already
> written the old logfile names to current_logfiles.
> On SIGHUP you want to write the new names, not
> the old ones.  But the point of the SIGHUP code
> is to look for changes in logfile writing and then
> call logfile_rotate() to make those changes.  And
> logfile_rotate() calls logfile_writename() as appropriate.
> Shouldn't the logfile_writename call in the SIGHUP
> code be eliminated?

No, when you change the log_destination and reload configuration you
have to refresh the content of current_logfiles, in this case no new
rotation have been done and you have to write last_file_name and/or
last_csv_file_name.

> Second, and I've not paid really close attention here,
> you're calling logfile_writename() with last_file_name
> and last_csv_file_name in a number of places.  Are you
> sure that last_file_name and last_csv_file_name is reset
> to the empty string if stderr/csvlog logging is turned
> off and the config file re-read?  You might be recording
> that logging is going somewhere that's been turned off
> by a config change.  I've not noticed last_file_name and
> (especially) last_csv_file_name getting reset to the empty
> string.
In the patch I do not take care if the value of last_file_name and
last_csv_file_name have been reseted or not because following the
log_destination they are written or not to current_logfiles. When they
are written, they always contain the current log filename because the
call to logfile_writename() always appears after their assignment to the
new rotated filename.

> FYI, ultimately, in order to re-try writes to current_logfiles
> after ENFILE and EMFILE failure, I'm thinking that I'll probably
> wind up with logfile_writename() taking no arguments.  It will
> always write last_file_name and last_csv_file_name.  Then it's
> a matter of making sure that these variables contain accurate
> values.  It would be helpful to let me know if you have any
> insight regards config file re-read and resetting of these
> variables before I dive into writing code which retries writes to
> current_logfiles.

As explain above, last_file_name and last_csv_file_name always contains
the last log file name, then even in case of logfile_writename()
repeated failure. Those variables might have been updated in case of log
rotation occurs before a new call to logfile_writename(). In case of
ENFILE and EMFILE failure, log rotation is disabled and
logfile_writename() not called, last_file_name and last_csv_file_name
will still contain the last log file name.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: Patch to implement pg_current_logfile() function

От
Vik Fearing
Дата:
On 03/09/2016 07:32 PM, Gilles Darold wrote:
> This patch implements the pg_current_logfile() function that can be
> used as follow. The function returns NULL when logging_collector
> is not active and outputs a warning.

I'm really late to this discussion, and I apologize for that; but I'm
wondering why we're doing all this through some random file on disk.

Why not just use the stats collector and have everything we'd want in a
pg_stat_logger view just like we have for pg_stat_archiver and others?
That makes the most sense to me.

We could then also count the number of rotations per time/size and whatnot.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: Patch to implement pg_current_logfile() function

От
Tom Lane
Дата:
Vik Fearing <vik@2ndquadrant.fr> writes:
> I'm really late to this discussion, and I apologize for that; but I'm
> wondering why we're doing all this through some random file on disk.

Well, the log collector is intentionally not connected to very much.

> Why not just use the stats collector and have everything we'd want in a
> pg_stat_logger view just like we have for pg_stat_archiver and others?

This would, at minimum, require the log collector process to not start
until after the stats collector (so that it could be connected to the
stats collector's input socket).  But perhaps more to the point, it
establishes the stats collector as infrastructure for the log collector,
which seems pretty backwards.  It's not totally out of the question that
that could result in a deadlock --- stats collector trying to write to
the log while log collector is trying to write to the stats socket,
and both getting blocked.  Also, this doesn't seem like the sort of
info that people would be okay with having the stats collector drop
under load.

I have to agree that the intermediate disk file is kind of ugly.
But passing this info through the stats collector is not better.
        regards, tom lane



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Mon, 31 Oct 2016 09:26:27 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Le 30/10/2016 à 08:04, Karl O. Pinc a écrit :

> Attached patch v11 include your patch.
>
> >
> > I have some questions about logfile_writename():
> >
> >  Why does the logfile_open() call fail silently?

> logfile_open() "fail silently" in logfile_writename(), like in other
> parts of syslogger.c where it is called, because this is a function()
> that already report a message when an error occurs ("could not open
> log file..."). I think I have already replied to this question.

Please apply the attached patch on top of your v11 patch.
It simulates a logfile_open() failure.  Upon simulated failure you do
not get a "currrent_logfile" file, and neither do you get any
indication of any problems anywhere in the logs.  It's failing
silently.

To test I create a cluster, start the server, and look for
current_logfile and at the logs.

(I finally got around to writing down the process I use to install
and run a patched server, instead of just poking it with a
stick until it works every time I get back to hacking pg.
I'd be happy to share my process with you
if you're interested.  If you cannot reproduce my results please
share with me your procedure for cluster creation and runtime testing
so I can see why I find a problem and you don't.  Thank you.)

I don't expect to have a lot of time to work on pg in the next
36 hours.  After that I hope to push this through to completion.
I did want to get something back to you now, hence this email.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 02/11/2016 à 03:51, Karl O. Pinc a écrit :
> On Mon, 31 Oct 2016 09:26:27 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> Le 30/10/2016 à 08:04, Karl O. Pinc a écrit :
>> Attached patch v11 include your patch.
>>
>>> I have some questions about logfile_writename():
>>>
>>>  Why does the logfile_open() call fail silently?
>> logfile_open() "fail silently" in logfile_writename(), like in other
>> parts of syslogger.c where it is called, because this is a function()
>> that already report a message when an error occurs ("could not open
>> log file..."). I think I have already replied to this question.
> Please apply the attached patch on top of your v11 patch.
> It simulates a logfile_open() failure.  Upon simulated failure you do
> not get a "currrent_logfile" file, and neither do you get any
> indication of any problems anywhere in the logs.  It's failing
> silently.

Please have a look at line  1137 on HEAD of syslogger.c you will see
that in case of failure function logfile_open() report a FATAL or LOG
error with message:
   errmsg("could not open log file \"%s\": %m", filename);

I don't think adding more error messages after this has any interest
this is why logfile_writename() just return without doing anything.
Other call to logfile_open() have the exact same behavior.

For a real example, create the temporary file and change its system
privileges:
  touch /usr/local/postgresql/data/current_logfiles.tmp  chmod 400 /usr/local/postgresql/data/current_logfiles.tmp

So in this case of failure it prints:
   2016-11-02 09:56:00.791 CET [6321] LOG:  could not open log file
"current_logfiles.tmp": Permission denied

I don't understand what you are expecting more? I agree that "open log"
can confuse a little but this is also a file where we log the current
log file name so I can live with that.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 2 Nov 2016 10:07:34 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Please have a look at line  1137 on HEAD of syslogger.c you will see
> that in case of failure function logfile_open() report a FATAL or LOG
> error with message:
> 
>     errmsg("could not open log file \"%s\": %m", filename);

Ok.  Thanks.  Sorry for the confusion.

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 2 Nov 2016 07:55:45 -0500
"Karl O. Pinc" <kop@meme.com> wrote:

> On Wed, 2 Nov 2016 10:07:34 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
> 
> > Please have a look at line  1137 on HEAD of syslogger.c 

> Ok.  Thanks.  Sorry for the confusion.

And yes, we did talk about this before.  I should have remembered.

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Wed, Oct 26, 2016 at 11:25 PM, Karl O. Pinc <kop@meme.com> wrote:
> What it comes down to is I don't buy the adequacy of the
> ".csv" suffix test and think that "keeping things simple" now
> is a recipe for future breakage, or at least significant future
> complication and confusion when it come to processing logfile
> content.

Sounds like a plausible argument (although I haven't looked at the code).

> My thoughts are as follows:  Either you know the log format because
> you configured the cluster or you don't.  If you don't know the log
> format having the log file is halfway useless.  You can do something
> like back it up, but you can't ever look at it's contents (in some
> sense) without knowing what data structure you're looking at.
>
> Therefore pg_current_logfile() without any arguments is, in the sense
> of any sort of automated processing of the logfile content, useless.

Yeah, but it's not useless in general.  I've certainly had cases where
somebody gave me access to their PostgreSQL cluster to try to fix a
problem, and I'm struggling to find the log files.  Being able to ask
the PostgreSQL cluster "where are all of the files to which you are
logging?" sounds helpful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 03/11/2016 à 16:15, Robert Haas a écrit :
> On Wed, Oct 26, 2016 at 11:25 PM, Karl O. Pinc <kop@meme.com> wrote:
>> What it comes down to is I don't buy the adequacy of the
>> ".csv" suffix test and think that "keeping things simple" now
>> is a recipe for future breakage, or at least significant future
>> complication and confusion when it come to processing logfile
>> content.
> Sounds like a plausible argument (although I haven't looked at the code).

Yes, the current v11 patch doesn't rely on any extension anymore,
instead the log format is now stored with the log file name.

>> My thoughts are as follows:  Either you know the log format because
>> you configured the cluster or you don't.  If you don't know the log
>> format having the log file is halfway useless.  You can do something
>> like back it up, but you can't ever look at it's contents (in some
>> sense) without knowing what data structure you're looking at.
>>
>> Therefore pg_current_logfile() without any arguments is, in the sense
>> of any sort of automated processing of the logfile content, useless.
> Yeah, but it's not useless in general.  I've certainly had cases where
> somebody gave me access to their PostgreSQL cluster to try to fix a
> problem, and I'm struggling to find the log files.  Being able to ask
> the PostgreSQL cluster "where are all of the files to which you are
> logging?" sounds helpful.
>

+1

Current implementation always returns the log file in use by the logging
collector when pg_current_logfile() is called without arguments. If
stderr and csvlog are both enabled in log_destination, then it will
return the stderr log. If you need to specifically ask for the csvlog
file you can give it with as pg_current_logfile parameter.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Mon, 31 Oct 2016 09:26:27 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Le 30/10/2016 à 08:04, Karl O. Pinc a écrit :

> > Have you given any thought to my proposal to change
> > CURRENT_LOG_FILENAME to LOG_METAINFO_FILE?
> Yes, I don't think the information logged in this file are kind of
> meta information and CURRENT_LOG_FILENAME seems obvious.

To me, the CURRENT_LOG_FILENAME symbol should contain the name
of the current log file.  It does not.  The CURRENT_LOG_FILENAME symbol
holds the name of the file which itself contains the name of
the log file(s) being written, plus the log file
structure of each log file.

IMO, the name of the log files being written, as well as
the type of data structure written into each log file,
are meta-information about the logging data.  So maybe
the right name is LOG_METAINFO_DATAFILE.

If you're not happy with making this change that's fine.
If not, I'd like to make mention of the symbol name to
the committers.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 04/11/2016 à 00:34, Karl O. Pinc a écrit :
> On Mon, 31 Oct 2016 09:26:27 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> Le 30/10/2016 à 08:04, Karl O. Pinc a écrit :
>>> Have you given any thought to my proposal to change
>>> CURRENT_LOG_FILENAME to LOG_METAINFO_FILE?  
>> Yes, I don't think the information logged in this file are kind of
>> meta information and CURRENT_LOG_FILENAME seems obvious.
> To me, the CURRENT_LOG_FILENAME symbol should contain the name 
> of the current log file.  It does not.  The CURRENT_LOG_FILENAME symbol
> holds the name of the file which itself contains the name of 
> the log file(s) being written, plus the log file
> structure of each log file.
>
> IMO, the name of the log files being written, as well as
> the type of data structure written into each log file,
> are meta-information about the logging data.  So maybe
> the right name is LOG_METAINFO_DATAFILE.
>
> If you're not happy with making this change that's fine.
> If not, I'd like to make mention of the symbol name to
> the committers.

If it need to be changed I would prefer something like CURRENT_LOG_INFO,
but this is not really important. Please mention it, and the committer
will choose to change it or not.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 3 Nov 2016 18:34:50 -0500
"Karl O. Pinc" <kop@meme.com> wrote:

> On Mon, 31 Oct 2016 09:26:27 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
> > Le 30/10/2016 à 08:04, Karl O. Pinc a écrit :
>
> > > Have you given any thought to my proposal to change
> > > CURRENT_LOG_FILENAME to LOG_METAINFO_FILE?
> > Yes, I don't think the information logged in this file are kind of
> > meta information and CURRENT_LOG_FILENAME seems obvious.

> ... maybe the right name is LOG_METAINFO_DATAFILE.

CURRENT_LOGFILES_FILENAME is good, but a bit long.

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Mon, 31 Oct 2016 10:19:18 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Le 31/10/2016 à 04:35, Karl O. Pinc a écrit :

> > Attached is a patch to apply on top of the v10 patch.
> >
> > It updates current_logfiles only once per log rotation.
> > I see no reason to open and write the file twice
> > if both csvlog and stderr logging is happening.
>
> I do not take the effort to do that because log rotation is not
> something that occurs too often and I don't want to change the
> conditional "time_based_rotation" code lines in logfile_rotate() for
> readability. My though was also that on high load, log rotation is
> automatically disabled so logfile_writename() is not called and there
> will be no additional I/O. But why not, if commiters want to save this
> additional I/O, this patch can be applied.

Ok.  You didn't put this into your v11 patch so it can be submitted to
the committers as a separate patch.

> > I don't understand why you're calling
> > logfile_writename(last_file_name, last_csv_file_name);
> > in the SIGHUP code.  Presumably you've already
> > written the old logfile names to current_logfiles.
> > On SIGHUP you want to write the new names, not
> > the old ones.  But the point of the SIGHUP code
> > is to look for changes in logfile writing and then
> > call logfile_rotate() to make those changes.  And
> > logfile_rotate() calls logfile_writename() as appropriate.
> > Shouldn't the logfile_writename call in the SIGHUP
> > code be eliminated?
>
> No, when you change the log_destination and reload configuration you
> have to refresh the content of current_logfiles, in this case no new
> rotation have been done and you have to write last_file_name and/or
> last_csv_file_name.

I don't understand.  Please explain what's wrong with the
picture I have of how logging operates on receipt of SIGHUP.
Here is my understanding:

The system is running normally, current_logfiles exists and contains
the log file paths of the logs presently being written into.  These
paths end with the file names in last_file_name and/or
last_csv_file_name.  (I'm assuming throughout my description here that
log_destination is writing to the file system at all.)

A SIGHUP arrives.  The signal handler, sigHupHandler(), sets the
got_SIGHUP flag.  Nothing else happens.

The logging process wakes up due to the signal.
Either there's also log data or there's not.  If there's
not:

The logging process goes through it's processing loop and finds,
at line 305 of syslogger.c, got_SIGHUP to be true.
Then it proceeds to do a bunch of assignment statements.
If it finds that the log directory or logfile name changed
it requests immediate log file rotation.  It does this by
setting the request_rotation flag.  If log_destination
or logging_collector changed request_rotation is not set.

Then, your patch adds a call to logfile_writename().
But nothing has touched the last_file_name or last_csv_file_name
variables.  You rewrite into current_logfiles what's
already in current_logfiles.


If there is log data in the pipe on SIGHUP
and it's csv log data then if there's a csv
log file open that's the file that's written to.
last_csv_file_name does not change so current_logfiles
does not need to be re-written.  If there is no csv
log file open then open_csvlogfile() is called
and it calls logfile_writename().  No need to
call logfile_writename() again when processing the
SIGHUP.

If there is log data in the pipe on SIGHUP
and it's stderr log data then the currently open
stderr log file is written to.  This is already
in current_logfiles so no need to call logfile_writename().


Looking at what happens after your call to logfile_writename()
in SysLoggerMain() there's no changes to the log files
without calling logfile_writename.

If there's stderr
log messages in the pipe these get written to the currently
open stderr log file until log rotation changes the file
name.  This either happens immediately because
request_rotation was set, or it waits.

If there's csv log messages in the pipe then these are either
written to the currently open log file or, when no
csv log file is open, open_csvlogfile() calls logfile_writename().
Either way, no need to re-write current_logfiles until
log rotation.

I'll respond to the rest of this email later, hopefully later
today.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 04/11/2016 à 14:17, Karl O. Pinc a écrit :
> On Mon, 31 Oct 2016 10:19:18 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> Le 31/10/2016 à 04:35, Karl O. Pinc a écrit :
>>> Attached is a patch to apply on top of the v10 patch.
>>>
>>> It updates current_logfiles only once per log rotation.
>>> I see no reason to open and write the file twice
>>> if both csvlog and stderr logging is happening.
>> I do not take the effort to do that because log rotation is not
>> something that occurs too often and I don't want to change the
>> conditional "time_based_rotation" code lines in logfile_rotate() for
>> readability. My though was also that on high load, log rotation is
>> automatically disabled so logfile_writename() is not called and there
>> will be no additional I/O. But why not, if commiters want to save this
>> additional I/O, this patch can be applied.
> Ok.  You didn't put this into your v11 patch so it can be submitted to
> the committers as a separate patch.
>
>>> I don't understand why you're calling
>>> logfile_writename(last_file_name, last_csv_file_name);
>>> in the SIGHUP code.  Presumably you've already
>>> written the old logfile names to current_logfiles.
>>> On SIGHUP you want to write the new names, not
>>> the old ones.  But the point of the SIGHUP code
>>> is to look for changes in logfile writing and then
>>> call logfile_rotate() to make those changes.  And
>>> logfile_rotate() calls logfile_writename() as appropriate.
>>> Shouldn't the logfile_writename call in the SIGHUP
>>> code be eliminated?
>> No, when you change the log_destination and reload configuration you
>> have to refresh the content of current_logfiles, in this case no new
>> rotation have been done and you have to write last_file_name and/or
>> last_csv_file_name.
> I don't understand.  Please explain what's wrong with the
> picture I have of how logging operates on receipt of SIGHUP.
> Here is my understanding:
>
> The system is running normally, current_logfiles exists and contains
> the log file paths of the logs presently being written into.  These
> paths end with the file names in last_file_name and/or
> last_csv_file_name.  (I'm assuming throughout my description here that
> log_destination is writing to the file system at all.)
Yes, also if log_collector is on and log_destination is not stderr or
csvlog, current_logfiles exists too but it is emtpy.
When log_collector is off this file doesn't exists.


> A SIGHUP arrives.  The signal handler, sigHupHandler(), sets the
> got_SIGHUP flag.  Nothing else happens.
>
> The logging process wakes up due to the signal.
> Either there's also log data or there's not.  If there's
> not:
>
> The logging process goes through it's processing loop and finds,
> at line 305 of syslogger.c, got_SIGHUP to be true.
> Then it proceeds to do a bunch of assignment statements.
> If it finds that the log directory or logfile name changed
> it requests immediate log file rotation.  It does this by
> setting the request_rotation flag.  If log_destination
> or logging_collector changed request_rotation is not set.
>
> Then, your patch adds a call to logfile_writename().
> But nothing has touched the last_file_name or last_csv_file_name
> variables.  You rewrite into current_logfiles what's
> already in current_logfiles.

Your picture is good, you just miss the case where we just change
log_destination. In this case, syslogger add/change log file destination
but rotation_requested is false. If write to current_logfiles is
conditional to this flag, it will never reflect the file change until
next rotation, see why next answer bellow.


If log_destination remain unchanged I agree that I am rewriting the same
information, but I don't think that this kind of sporadic I/O is enough
to append code to store the old log_destination value and check if there
is a change like what is done with Log_directory. This is my opinion,
but may be I'm wrong to consider that those isolated and not critical
I/O doesn't justify this work.


> If there is log data in the pipe on SIGHUP
> and it's csv log data then if there's a csv
> log file open that's the file that's written to.
> last_csv_file_name does not change so current_logfiles
> does not need to be re-written.  If there is no csv
> log file open then open_csvlogfile() is called
> and it calls logfile_writename().  No need to
> call logfile_writename() again when processing the
> SIGHUP.

Yes, but the problem here is that logfile_writename() doesn't write the
change because the test (Log_destination & LOG_DESTINATION_CSVLOG)
returns false, Log_destination is set after the file is successfully
created. This make me though that the call of logfile_writename() into
function open_csvlogfile() can be removed, thanks for pointing me that.
I attached a v12 patch with some comment fix in the call to
logfile_writename().


Regards,

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Fri, 4 Nov 2016 16:58:45 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> I attached a v12 patch

Attached is a comment patch which improves the comment
describing CURRENT_LOG_FILENAME.  It's been bugging me.
I should have made this change long ago when I looked
at all the other code comments but neglected to.

The comment now matches other documentation.

This applies on top of your v12 patch.

I'm thinking through the v12 patch and email.
I'm in general agreement but want to make sure
I really understand all the code paths.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 04/11/2016 à 21:07, Karl O. Pinc a écrit :
> On Fri, 4 Nov 2016 16:58:45 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> I attached a v12 patch
> Attached is a comment patch which improves the comment
> describing CURRENT_LOG_FILENAME.  It's been bugging me.
> I should have made this change long ago when I looked
> at all the other code comments but neglected to.
>
> The comment now matches other documentation.
>
> This applies on top of your v12 patch.

Here is the v13 of the patch, it applies your last change in comment and
some other changes:

  - I've reverted the patch removing the call to logfile_writename in
open_csvlogfile function, it is required to write log filename at
startup when log_destination is set to csvlog. I could not find a better
place right now, but will try to see if we can avoid the call to
logfile_writename().
  - Do not write current_logfiles when log_collector is activated but
log_destination doesn't contained stderr or csvlog. This was creating an
empty file that can confuse the user.
  - I've changed the comment in doc/src/sgml/storage.sgml by addind the
following sentence: "This file is present only when logging_collector is
activated and when at least one of stderr or csvlog value is present in
log_destination."


--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Thu, Nov 3, 2016 at 7:34 PM, Karl O. Pinc <kop@meme.com> wrote:
> To me, the CURRENT_LOG_FILENAME symbol should contain the name
> of the current log file.  It does not.  The CURRENT_LOG_FILENAME symbol
> holds the name of the file which itself contains the name of
> the log file(s) being written, plus the log file
> structure of each log file.

I agree that this is confusing.

> IMO, the name of the log files being written, as well as
> the type of data structure written into each log file,
> are meta-information about the logging data.  So maybe
> the right name is LOG_METAINFO_DATAFILE.

+1 for something like this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 09/11/2016 à 02:51, Robert Haas a écrit :
> On Thu, Nov 3, 2016 at 7:34 PM, Karl O. Pinc <kop@meme.com> wrote:
>> To me, the CURRENT_LOG_FILENAME symbol should contain the name
>> of the current log file.  It does not.  The CURRENT_LOG_FILENAME symbol
>> holds the name of the file which itself contains the name of
>> the log file(s) being written, plus the log file
>> structure of each log file.
> I agree that this is confusing.
>
>> IMO, the name of the log files being written, as well as
>> the type of data structure written into each log file,
>> are meta-information about the logging data.  So maybe
>> the right name is LOG_METAINFO_DATAFILE.
> +1 for something like this.
>

Ok, it will be changed it in next patch.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Mon, 7 Nov 2016 23:29:28 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:


> Here is the v13 of the patch,

>   - I've reverted the patch removing the call to logfile_writename in
> open_csvlogfile function, it is required to write log filename at
> startup when log_destination is set to csvlog. I could not find a
> better place right now, but will try to see if we can avoid the call
> to logfile_writename().

Why is calling logfile_writename() in open_csvlogfile() a problem?

I've not thought too deeply but this is what's in my mind.

The only issue that leaps to mind is overall code simplicity.  But
because csv files are opened "on-demand", only when there's something
in the queue to get written to a csv file, it's not clear how to avoid
calling logfile_writename() "on-demand" as well.  To do otherwise might
risk putting the name of a csv logfile into current_logfiles when that
csv log file does not yet exist.  Or you could wait, and have a csv
log file in existance but not have it reflected in the content of
current_logfiles for some time period.  But why?

If there is a problem in code clarity it might be that the csv log files
are not created until there's a csv log entry in the queue.  Change this
and other confusion, if there really is any, goes away.

>   - Do not write current_logfiles when log_collector is activated but
> log_destination doesn't contained stderr or csvlog. This was creating
> an empty file that can confuse the user.

Good catch.


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Sat, 12 Nov 2016 12:59:46 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> On Mon, 7 Nov 2016 23:29:28 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
> > Here is the v13 of the patch,

> >   - Do not write current_logfiles when log_collector is activated
> > but log_destination doesn't contained stderr or csvlog. This was
> > creating an empty file that can confuse the user.
>
> Good catch.

I don't see a good reason to go past 80 characters on a line here.
Patch attached to be applied on top of v13.

Whether to write current_logfiles only when there's a stderr or csvlog
seems dependent up on whether the current_logfiles file is for human
or program use.  If for humans, don't write it unless it has content.
If for programs, why make the program always have to check to see
if the file exists before reading it?  Failing to check is just one
more cause of bugs.

A casual read of the v13 code does not show me that the current_logfiles
file is deleted when the config file is changed so that stderr and
csvlog is no longer output and the user sends a SIGHUP.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Mon, 7 Nov 2016 23:29:28 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Here is the v13 of the patch,

Just to keep things up to date...  Attached are 3 re-worked patches
that I see submitting along with the pg_current_logfile patch.
They apply on top of v13.

(I still have one more I'm working on to ensure that current_logfiles
always has valid content should an attempt to open it returns ENFILE
or EMFILE.)


patch_pg_current_logfile-v13.diff.writeonce

Writes the current_logfiles file only once when rotating both syslog
and csvlog logfiles.

This patch pushes bool types onto the stack instead of using int.
This is a practice which is not optimal given traditional instruction
sets and compilers.  I used bool types out of clarity.  If the
compiler does not optimize the problem away it won't impact
performance anyway because the code is not run that often.
If you find this ugly change the bools to ints.

I believe this patch also fixes a bug where, when both syslog and
csvlog are on and the syslog is rotated due to size but the csvlog is
not rotated (for whatever reason), then the syslog filename fails to
be updated in the current_logfiles file content.


patch_pg_current_logfile-v13.diff.abstract_guc_part1

Creates symbols for some GUC values which appear in multiple
places in the code.


patch_pg_current_logfile-v13.diff.abstract_guc_part2

Uses symbols for some GUC values in the pg_current_logfile
patch.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Mon, 7 Nov 2016 23:29:28 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Here is the v13 of the patch, ...

Attached is a doc patch to apply on top of v13.

It adds a lot more detail regards just what is
in the current_logfiles file and when it's
in current_logfiles.  I'd like review both for
language and accuracy, but I'd also like to
confirm that we really want the documented
behavior regards what's there at backend startup
v.s. what's there during normal runtime.  Seems
ok the way it is to me but...

This patch is also starting to put a lot of information
inside a graphical table.  I'm fine with this but
it's a bit of a departure from the other cells of
the table so maybe somebody else has a better
suggestion.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Gilles,

On Sun, 30 Oct 2016 02:04:59 -0500
"Karl O. Pinc" <kop@meme.com> wrote:

> Attached is a patch to be applied on top of your v10 patch
> which does basic fixup to logfile_writename().

I'm looking at the v13 patch and don't see a change I submitted
with a patch to v10.  You wrote:
   snprintf(tempfn, sizeof(tempfn), "%s",                       CURRENT_LOG_FILENAME);   strcat(tempfn, ".tmp");

I patched to:
   snprintf(tempfn, sizeof(tempfn), "%s.tmp",                       CURRENT_LOG_FILENAME);

As long as you're doing a snprintf() there's no point
in "risking" a buffer overflow by a subsequent strcat().
(Not that you're likely to ever get a buffer overflow.)
And why make two calls instead of 1?  That's what's
in my head.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 16/11/2016 à 17:38, Karl O. Pinc a écrit :
> On Mon, 7 Nov 2016 23:29:28 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> Here is the v13 of the patch, ...
> Attached is a doc patch to apply on top of v13.
>
> It adds a lot more detail regards just what is
> in the current_logfiles file and when it's
> in current_logfiles.  I'd like review both for
> language and accuracy, but I'd also like to
> confirm that we really want the documented
> behavior regards what's there at backend startup
> v.s. what's there during normal runtime.  Seems
> ok the way it is to me but...
>
> This patch is also starting to put a lot of information
> inside a graphical table.  I'm fine with this but
> it's a bit of a departure from the other cells of
> the table so maybe somebody else has a better
> suggestion.
>
> Regards,
>
> Karl <kop@meme.com>
> Free Software:  "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein
>

All patches you've submitted on tha v13 patch have been applied and are
present in attached v14 of the patch. I have not included the patches
about GUC changes because I'm not sure that adding a new file
(include/utils/guc_values.h) just for that will be accepted or that it
will not require a more global work to add other GUC values. However
perhaps this patch can be submitted separately if the decision is not
taken here.

Regards,

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Sat, 19 Nov 2016 12:58:47 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> All patches you've submitted on tha v13 patch have been applied and
> are present in attached v14 of the patch. I have not included the
> patches about GUC changes because I'm not sure that adding a new file
> (include/utils/guc_values.h) just for that will be accepted or that it
> will not require a more global work to add other GUC values. However
> perhaps this patch can be submitted separately if the decision is not
> taken here.

Understood.  I've a couple of other patches that do
a little cleanup on master that I'd also like to submit
along with your patch.  This on the theory that
the maintainers will be looking at this code anyway
because your patch touches it.  All this can be submitted
for their review at once.  My approach is to be minimally invasive on
a per-patch basis (i.e. your patch) but add small patches
that make existing code "better" without touching
functionality.  (Deleting unnecessary statements, etc.)
The overall goal being a better code base.

I've found what I think is another bug, where a 
long-stale filename could be written to current_logfiles.
I've coded a patch.

I also have a patch to ensure that current_logfiles never
"skips" a logfile due to ENFILE or ENFILE in
logfile_writename() (only).

I'm done with all the coding and need to refactor on top
of your v14 patch and on top of each other.  I'll send all
patches back to you for your review.  (Unless you've
some objection.)  I may not get to this until Monday morning.

I haven't throughly tested the ENFILE/ENFILE patch.
I was planning on doing that while you looked over the
code.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Gilles,

On Tue, 15 Nov 2016 15:15:52 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> > On Mon, 7 Nov 2016 23:29:28 +0100
> > Gilles Darold <gilles.darold@dalibo.com> wrote:  

> > >   - Do not write current_logfiles when log_collector is activated
> > > but log_destination doesn't contained stderr or csvlog. This was
> > > creating an empty file that can confuse the user. 

> Whether to write current_logfiles only when there's a stderr or csvlog
> seems dependent up on whether the current_logfiles file is for human
> or program use.  If for humans, don't write it unless it has content.
> If for programs, why make the program always have to check to see
> if the file exists before reading it?  Failing to check is just one
> more cause of bugs.

What are your thoughts on this?  I'm leaning toward current_logfiles
being for programs, not people.  So it should be present whenever
logging_collector is on.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 19/11/2016 à 16:22, Karl O. Pinc a écrit :
> Hi Gilles,
>
> On Tue, 15 Nov 2016 15:15:52 -0600
> "Karl O. Pinc" <kop@meme.com> wrote:
>
>>> On Mon, 7 Nov 2016 23:29:28 +0100
>>> Gilles Darold <gilles.darold@dalibo.com> wrote:  
>>>>   - Do not write current_logfiles when log_collector is activated
>>>> but log_destination doesn't contained stderr or csvlog. This was
>>>> creating an empty file that can confuse the user. 
>> Whether to write current_logfiles only when there's a stderr or csvlog
>> seems dependent up on whether the current_logfiles file is for human
>> or program use.  If for humans, don't write it unless it has content.
>> If for programs, why make the program always have to check to see
>> if the file exists before reading it?  Failing to check is just one
>> more cause of bugs.
> What are your thoughts on this?  I'm leaning toward current_logfiles
> being for programs, not people.  So it should be present whenever
> logging_collector is on.

My though is that it is better to not have an empty file even if
log_collector is on. Programs can not be confused but human yes, if the
file is present but empty, someone may want to check why this file is
empty. Also having a file containing two lines with just the log format
without path is worst for confusion than having an empty file.

In other words, from a program point of view, to gather last log
filename, existence of the file or not doesn't make any difference. This
is the responsibility of the programer to cover all cases. In a non
expert user point of view, there will always the question: why this file
is empty? If the file is not present, the question will stay: how I to
get current log filename?


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Sat, 19 Nov 2016 18:59:49 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Le 19/11/2016 à 16:22, Karl O. Pinc a écrit :
> > Hi Gilles,
> >
> > On Tue, 15 Nov 2016 15:15:52 -0600
> > "Karl O. Pinc" <kop@meme.com> wrote:
> >
> >>> On Mon, 7 Nov 2016 23:29:28 +0100
> >>> Gilles Darold <gilles.darold@dalibo.com> wrote:
> >>>>   - Do not write current_logfiles when log_collector is activated
> >>>> but log_destination doesn't contained stderr or csvlog. This was
> >>>> creating an empty file that can confuse the user.
> >> Whether to write current_logfiles only when there's a stderr or
> >> csvlog seems dependent up on whether the current_logfiles file is
> >> for human or program use.  If for humans, don't write it unless it
> >> has content. If for programs, why make the program always have to
> >> check to see if the file exists before reading it?  Failing to
> >> check is just one more cause of bugs.
> > What are your thoughts on this?  I'm leaning toward current_logfiles
> > being for programs, not people.  So it should be present whenever
> > logging_collector is on.
>
> My though is that it is better to not have an empty file even if
> log_collector is on.

Thanks.   I wrote to be sure that you'd considered my thoughts.

I don't see a point in further discussion.  I may submit a separate
patch to the maintainers when we're ready to send them code so they can
see how the code looks with current_logfiles always in existence.

Further thoughts below.  No need to read them or respond.

> Programs can not be confused but human yes, if
> the file is present but empty, someone may want to check why this
> file is empty.

IMO if they want to know why it's empty they could read the
docs.

> Also having a file containing two lines with just the
> log format without path is worst for confusion than having an empty
> file.

I agree that it makes no sense to list log formats without paths.
> In other words, from a program point of view, to gather last log
> filename, existence of the file or not doesn't make any difference.
> This is the responsibility of the programer to cover all cases. In a
> non expert user point of view, there will always the question: why
> this file is empty? If the file is not present, the question will
> stay: how I to get current log filename?

As a non-expert looking at the default data_directory (etc.) almost
nothing makes sense.  I see the non-expert using the
pg_current_logfile() function from within Postgres.

I also see a lot of non-experts writing program to get log data and
then getting errors when the config changes and current_logfile goes
away.  Not having data in a opened file generally means a program
does nothing. an appropriate action when there are no log files
in the filesystem.  But not accounting for a non-existent file leads
to crashes.

Regards,


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Sat, Nov 19, 2016 at 8:51 AM, Karl O. Pinc <kop@meme.com> wrote:
> On Sat, 19 Nov 2016 12:58:47 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> All patches you've submitted on tha v13 patch have been applied and
>> are present in attached v14 of the patch. I have not included the
>> patches about GUC changes because I'm not sure that adding a new file
>> (include/utils/guc_values.h) just for that will be accepted or that it
>> will not require a more global work to add other GUC values. However
>> perhaps this patch can be submitted separately if the decision is not
>> taken here.
>
> Understood.  I've a couple of other patches that do
> a little cleanup on master that I'd also like to submit
> along with your patch.  This on the theory that
> the maintainers will be looking at this code anyway
> because your patch touches it.  All this can be submitted
> for their review at once.  My approach is to be minimally invasive on
> a per-patch basis (i.e. your patch) but add small patches
> that make existing code "better" without touching
> functionality.  (Deleting unnecessary statements, etc.)
> The overall goal being a better code base.

It would really be much better to submit anything that's not
absolutely necessary for the new feature as a separate patch, rather
than bundling things together.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Mon, 21 Nov 2016 13:17:17 -0500
Robert Haas <robertmhaas@gmail.com> wrote:

> > I've a couple of other patches that do
> > a little cleanup on master that I'd also like to submit
> > along with your patch. 

> It would really be much better to submit anything that's not
> absolutely necessary for the new feature as a separate patch, rather
> than bundling things together.

My plan is separate patches, one email.

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Gilles,

On Sat, 19 Nov 2016 12:58:47 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> ... attached v14 of the patch.

Attached are patches for your consideration and review.
(Including your latest v14 patch for completeness.)

Some of the attached patches (like the GUC symbol
patch you've seen before) are marked to be submitted
as separate patches to the maintainers when we send
them code for review.  These need looking over by
somebody, I hope you, before they get sent on so
please comment on these if you're not going to look
at them or if you see a problem with them.   (Or if
you like them.  :)  Thanks.

I also have comments at the bottom regards problems
I see but haven't patched.

---

patch_pg_current_logfile-v14.diff

Applies on top of master.

The current patch.

---

patch_pg_current_logfile-v14.diff.startup_docs

For consideration of inclusion in "main" patch.

Applies on top of patch_pg_current_logfile-v14.diff

A documentation fix.

(Part of) my previous doc patch was wrong; current_logfiles is not
unconditionally written on startup.

---

patch_pg_current_logfile-v14.diff.bool_to_int

Do not include in "main" patch, submit to maintainers separately.

Applies on top of patch_pg_current_logfile-v14.diff

The bool types on the stack in logfile_rotate() are
my work.  Bools on the stack don't make sense as far
as hardware goes, so the compiler's optimizer should change
them to int.  This patch changes the bools to ints
should that be to someone's taste.

---

patch_pg_current_logfile-v14.diff.logdest_change

For consideration of inclusion in "main" patch.

Applies on top of patch_pg_current_logfile-v14.diff.

Fixes a bug where, when log_destination changes and the config
file is reloaded, a no-longer-used logfile path may be written
to the current_logfiles file.  The chain of events would be
as follows:

1) PG configured with csvlog in log_destination. Logs are written.

This makes last_csv_file_name non-NULL.


2) PG config changed to remove csvlog from log_destination
and SIGHUP sent.

This removes the csvlog path from current_logfiles but does not
make last_csv_file_name NULL.  last_csv_file_name has the old
path in it.


3) PG configured to add csvlog back to log_destination and
SIGHUP sent.

When csvlogging is turned back on, the stale csv path is written
to current_logfiles.  This is overwritten as soon as some csv logs
are written because the new csv logfile must be opened, but this is
still a problem.  Even if it happens to be impossible at the moment
to get past step 2 to step 3 without having some logs written it seems
a lot more robust to manually "expire" the last*_file_name variable
content when log_destination changes.


So what the patch does is "scrub" the "last_*file_name" variables
when the config file changes.

FWIW, I moved the logfile_writename() call toward the top of the
if block just to keep all the code which sorta-kinda involves
the old_log_destination variable together.

---

patch_pg_current_logfile-v14.diff.logdest_change-part2

Do not include in "main" patch, submit to maintainers separately.

Applies on top of patch_pg_current_logfile-v14.diff.logdest_change

Adds a PGLogDestination typedef.  A separate patch since this may be
overkill and more about coding style than anything else.

---

And now, a series of patches to fix the problem where, at least
potentially, logfile_writename() gets a ENFILE or EMFILE and
therefore a log file path does not ever get written to the
current_logfiles file.  The point of this patch series is to retry until
the content of current_logfiles is correct.

All of these are for consideration of inclusion in "main" patch.


patch_pg_current_logfile-v14.diff.retry_current_logfiles-part1

Applies on top of patch_pg_current_logfile-v14.diff.logdest_change

A documentation patch.  Notes that (even with retrying) the
current_logfiles content might not be right.


patch_pg_current_logfile-v14.diff.retry_current_logfiles-part2

Applies on top of
patch_pg_current_logfile-v14.diff.retry_current_logfiles-part1

Remove arguments from logfile_writename().  Use static storage
instead.  We always update last_file_name and last_csv_file_name
whenever logfile_writename() is called so there's not much of
a change here.



patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3

Applies on top of
patch_pg_current_logfile-v14.diff.retry_current_logfiles-part2

Re-try the write of current_logfiles should it fail because the
system is too busy.

---

patch_pg_current_logfile-v14.diff.backoff

Do not include in "main" patch, submit to maintainers separately.

Applies on top of
patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3

Introduces a backoff when retrying to write the current_logfiles file,
because retrying when failure is due to a busy system only makes the
system more busy.  This may well be over-engineering but I thought
I'd present the code and let more experienced people decide.

I have yet to really test this patch.

---

The remaining patches have to do with coding style and code clarity.

---

patch_pg_current_logfile-v14.diff.deletion

For consideration of inclusion in "main" patch, otherwise submit to
maintainers separately.

Applies on top of
patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3

I find the code to be more understandable if the deletion
of the current_logfiles file is separated from the updating
of current_logfiles.  It's not that without this patch
that unnecessary code execution is inefficient, it's that
the unnecessary code execution makes it hard (for me) to think about
what's happening where and why.

---

patch_pg_current_logfile-v14.diff.conditional

For consideration of inclusion in "main" patch, otherwise submit to
maintainers separately.

Applies on top of patch_pg_current_logfile-v14.diff.deletion
(Could be changed to apply on top of
patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3.)

Only change the current_logfiles content on SIGHUP when the content needs
to be changed.

As above, I understand the code more readily when statements are only
executed when they need to be executed.

My analysis here is as follows:

There are 9 cases, the combination of 2 sets of 3 possibilities.

 syslog is removed from log_destination
 the presence or absence of syslog in log_destination is unchanged
 syslog is added to log_destination

 the same 3 cases only regards csvlog

In the no-change case we don't need to do anything

If either syslog or csvlog are added to log_destination then
the current_logfiles file will be updated on rotation or, in
the case of adding a csvlog file, on file open.

It is only the removal cases that require the current_logfiles
file be changed, and that's what the patch does.

---

patch_pg_current_logfile-v14.diff.codecomment

Do not include in "main" patch, submit to maintainers separately.

Applies on top of
patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3

Add a comment to the code explaining how an unusual case can come to
pass.

---

patch_pg_current_logfile-v14.diff.cleanup_rotate

Do not include in "main" patch, submit to maintainers separately.

Applies on top of
patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3

Removes a net 10 lines of code, eliminating unnecessary if statements.
(Refactoring aside, pfree() does nothing when given a NULL pointer.)

It might be a little cleaner to move the declarations of "filename" and
"csvfilename" into the if blocks that use them, but I didn't do this.

---

A series of patches which creates symbols for some GUC values which
appear in multiple places in the code.

Do not include in "main" patch, submit to maintainers separately.


patch_pg_current_logfile-v14.diff.abstract_guc_part1

Applies on top of
patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3
(And also applies on top of master branch, or really, any of the
other patches presented in this email.)

Changes code in master branch to create symbols for some GUC values.


patch_pg_current_logfile-v14.diff.abstract_guc_part2

Applies on top of patch_pg_current_logfile-v14.diff.abstract_guc_part1
(and also applies on top of patch_pg_current_logfile-v14.diff, or
probably any of the other patches in this email)

Changes code in this patch series to use symbols for some GUC values.

---

I've just started to look at pg_current_logfile() in
src/backend/utils/adt/misc.c.  I believe I've noticed a couple
of problems.  I probably won't have time to write patches until
next week so I thought I'd simply report what I see and
let you review what I've got so far.

It looks like under some conditions (typically errors) you
can return an uninitialized value in log_filename.  The
answer here is probably to write :

  log_filename[0] = 0;

somewhere toward the top of the function.  And (as now written)
you also need to put the same statement after the error that
reports "unexpected line format in file %s" to be sure that
in that case you're not going to return a pointer to the C
NULL value.

But also, you can't use strtok() to parse lbuffer because
the path you're returning can contain a space.  (I suspect using
strstr() to parse on the the first space you find is the way
to go.  You can use the same trick that strtok() uses of sticking
a 0 byte in in place of the space to get the log format string.)

---

I'm going to put together a patch, to be sent separately to the
maintainers when we ask them for final review, which shows what the
docs/code looks like when current_logfiles always exists and can be an
empty file.  And argue for that.  But I've not gotten to this yet.

The good news is that I don't see anything else in syslogger.c to
comment on or patch.  :-)

Regards,


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 23 Nov 2016 03:21:05 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> But also, you can't use strtok() to parse lbuffer because
> the path you're returning can contain a space. 

Maybe on the 2nd call to strtok() you could pass ""
as the 2nd argument?  That'd be a little wonky but
the man page does not say you can't have an empty
set of delimiters.  On the other hand strtok() is
not a perfect choice, you don't want to "collapse"
adjacent delimiters in the parsed string or ignore
leading spaces.  I'd prefer a strstr() solution.

Or strchr() even better.

Regards.

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Tom Lane
Дата:
"Karl O. Pinc" <kop@meme.com> writes:
> Maybe on the 2nd call to strtok() you could pass ""
> as the 2nd argument?  That'd be a little wonky but
> the man page does not say you can't have an empty
> set of delimiters.  On the other hand strtok() is
> not a perfect choice, you don't want to "collapse"
> adjacent delimiters in the parsed string or ignore
> leading spaces.  I'd prefer a strstr() solution.

I'd stay away from strtok() no matter what.  The process-wide static
state it implies is dangerous: if you use it, you're betting that
you aren't interrupting some caller's use, nor will any callee decide
to use it.  In a system as large as Postgres, that's a bad bet, or
would be if we didn't discourage use of strtok() pretty hard.

As far as I can find, there are exactly two users of strtok() in
the backend, and they're already playing with fire because one
calls the other (look in utils/misc/tzparser.c).  I don't want the
hazard to get any larger.
        regards, tom lane



Re: Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Wed, Nov 23, 2016 at 4:21 AM, Karl O. Pinc <kop@meme.com> wrote:
> patch_pg_current_logfile-v14.diff.bool_to_int
>
> Do not include in "main" patch, submit to maintainers separately.
>
> Applies on top of patch_pg_current_logfile-v14.diff
>
> The bool types on the stack in logfile_rotate() are
> my work.  Bools on the stack don't make sense as far
> as hardware goes, so the compiler's optimizer should change
> them to int.  This patch changes the bools to ints
> should that be to someone's taste.

That does not seem like a good idea from here.  Even if it buys some
microscopic speedup, in a place like this it won't matter.  It's more
important that the code be clear, and using an int where you really
intend a bool isn't an improvement as far as clarity goes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Patch to implement pg_current_logfile() function

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Nov 23, 2016 at 4:21 AM, Karl O. Pinc <kop@meme.com> wrote:
>> The bool types on the stack in logfile_rotate() are
>> my work.  Bools on the stack don't make sense as far
>> as hardware goes, so the compiler's optimizer should change
>> them to int.  This patch changes the bools to ints
>> should that be to someone's taste.

> That does not seem like a good idea from here.  Even if it buys some
> microscopic speedup, in a place like this it won't matter.  It's more
> important that the code be clear, and using an int where you really
> intend a bool isn't an improvement as far as clarity goes.

Not to mention that the "bools on the stack don't make sense" premise
is bogus anyway.
        regards, tom lane



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 23/11/2016 à 16:26, Tom Lane a écrit :
> "Karl O. Pinc" <kop@meme.com> writes:
>> Maybe on the 2nd call to strtok() you could pass ""
>> as the 2nd argument?  That'd be a little wonky but
>> the man page does not say you can't have an empty
>> set of delimiters.  On the other hand strtok() is
>> not a perfect choice, you don't want to "collapse"
>> adjacent delimiters in the parsed string or ignore
>> leading spaces.  I'd prefer a strstr() solution.
> I'd stay away from strtok() no matter what.  The process-wide static
> state it implies is dangerous: if you use it, you're betting that
> you aren't interrupting some caller's use, nor will any callee decide
> to use it.  In a system as large as Postgres, that's a bad bet, or
> would be if we didn't discourage use of strtok() pretty hard.
>
> As far as I can find, there are exactly two users of strtok() in
> the backend, and they're already playing with fire because one
> calls the other (look in utils/misc/tzparser.c).  I don't want the
> hazard to get any larger.
>
>             regards, tom lane

Understood, I will remove and replace this call from the patch and more
generally from my programming.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 23 Nov 2016 10:39:28 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
> > On Wed, Nov 23, 2016 at 4:21 AM, Karl O. Pinc <kop@meme.com>
> > wrote:
> >> The bool types on the stack in logfile_rotate() are
> >> my work.  Bools on the stack don't make sense as far
> >> as hardware goes, so the compiler's optimizer should change
> >> them to int.  This patch changes the bools to ints
> >> should that be to someone's taste.
>
> > That does not seem like a good idea from here.  Even if it buys some
> > microscopic speedup, in a place like this it won't matter.  It's
> > more important that the code be clear, and using an int where you
> > really intend a bool isn't an improvement as far as clarity goes.
>
> Not to mention that the "bools on the stack don't make sense" premise
> is bogus anyway.

Thanks.  I don't think I've paid attention since 8088 days, just always
used bools.  We'll drop that patch then.

But this reminds me.  I should have used a bool return value.

Attached is a version 2 of
patch_pg_current_logfile-v14.diff.deletion-v2


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 23 Nov 2016 03:21:05 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> On Sat, 19 Nov 2016 12:58:47 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
> > ... attached v14 of the patch.

> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3

> Re-try the write of current_logfiles should it fail because the
> system is too busy.

Attached are 2 more doc patchs.

---

patch_pg_current_logfile-v14.diff.pg_ctl_and_more_docs

Applies on top of
patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3

Mentions pg_ctl and how using it might affect whether logfile
paths are captured in the current_logfiles file.  I think
the default RedHat packaging captures logs this way.  The
Debian default is to ship with logging_collector=off
and to rely on pg_ctrl via pg_ctlcluster to manage
log writing.  Both RH and Debian then pass stderr to systemd
and that does log management.  I don't
recall other distros' practice.  If the default distro
packaging means that current_logfiles/pg_current_logfile()
"don't work" this could be an issue to address in the
documentation.  Certainly the systemd reliance on stderr
capture and it's subsuming of logging functionality could
also be an issue.

Cross reference the current_logfiles file in the pg_current_logfile()
docs.

Marks up stderr appropriately.

Adds index entries.

---

patch_pg_current_logfile-v14.diff.doc_linux_default

Applies on top of
patch_pg_current_logfile-v14.diff.pg_ctl_and_more_docs

Mentions, in the body of the docs, defaults and their impact on
current_logfiles and pg_current_logfiles.  It seems appropriate
to mention this in the main documentation and in the overall context
of logging.

Adds index entries.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 23 Nov 2016 23:08:18 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> On Wed, 23 Nov 2016 03:21:05 -0600
> "Karl O. Pinc" <kop@meme.com> wrote:
>
> > On Sat, 19 Nov 2016 12:58:47 +0100
> > Gilles Darold <gilles.darold@dalibo.com> wrote:
> >
> > > ... attached v14 of the patch.
>
> > patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3
>
> > Re-try the write of current_logfiles should it fail because the
> > system is too busy.

> ---
>
> patch_pg_current_logfile-v14.diff.doc_linux_default
>
> Applies on top of
> patch_pg_current_logfile-v14.diff.pg_ctl_and_more_docs
>
> Mentions, in the body of the docs, defaults and their impact on
> current_logfiles and pg_current_logfiles.  It seems appropriate
> to mention this in the main documentation and in the overall context
> of logging.

Attached is version 2 of this patch.  Adds an index entry.

patch_pg_current_logfile-v14.diff.doc_linux_default-v2

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
 top of
> p
Вложения

Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Hi Karl,

I've attached the v15 of the patch that applies your changes/fixes and
remove the call to strtok().

I've not applied patch patch_pg_current_logfile-v14.diff.backoff to
prevent constant call of logfile_writename() on a busy system (errno =
ENFILE | EMFILE). I think this can be done quite simply by testing if
log rotate is still enabled. This is possible because function
logfile_rotate() is already testing if  errno = ENFILE | EMFILE and in
this case rotation_disabled is set to true. So the following test should
do the work:

               if (log_metainfo_stale && !rotation_disabled)
                       logfile_writename();

This is included in v15 patch.

I've also not added patches
patch_pg_current_logfile-v14.diff.conditional,
patch_pg_current_logfile-v14.diff.logdest_change and
patch_pg_current_logfile-v14.diff.logdest_change-part2 because the case
your are trying to fix with lot of code can not appears. You said that
when csv logging is turned back on, the stale csv path is written to
current_logfiles. This is not the case, last_csv_file_name is
immediately changed when the log file is open, see open_csvlogfile().
After running your use case I was not able to reproduce the bug you are
describing.

Again, patches patch_pg_current_logfile-v14.diff.doc_linux_default-v2
have not been included because I don't see any reason to talk especially
about systemd. If you talk about systemd you must talk about other
stderr handler by all systems. IMO saying that current_logfile is
present only if logging_collector is enabled and log_destination include
stderr or/and csvlog is enough, no need to talk about systemd and
behavior of Linux distributions.

Regards

Le 23/11/2016 à 10:21, Karl O. Pinc a écrit :
> Hi Gilles,
>
> On Sat, 19 Nov 2016 12:58:47 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> ... attached v14 of the patch.
> Attached are patches for your consideration and review.
> (Including your latest v14 patch for completeness.)
>
> Some of the attached patches (like the GUC symbol
> patch you've seen before) are marked to be submitted
> as separate patches to the maintainers when we send
> them code for review.  These need looking over by
> somebody, I hope you, before they get sent on so
> please comment on these if you're not going to look
> at them or if you see a problem with them.   (Or if
> you like them.  :)  Thanks.
>
> I also have comments at the bottom regards problems
> I see but haven't patched.
>
> ---
>
> patch_pg_current_logfile-v14.diff
>
> Applies on top of master.
>
> The current patch.
>
> ---
>
> patch_pg_current_logfile-v14.diff.startup_docs
>
> For consideration of inclusion in "main" patch.
>
> Applies on top of patch_pg_current_logfile-v14.diff
>
> A documentation fix.
>
> (Part of) my previous doc patch was wrong; current_logfiles is not
> unconditionally written on startup.
>
> ---
>
> patch_pg_current_logfile-v14.diff.bool_to_int
>
> Do not include in "main" patch, submit to maintainers separately.
>
> Applies on top of patch_pg_current_logfile-v14.diff
>
> The bool types on the stack in logfile_rotate() are
> my work.  Bools on the stack don't make sense as far
> as hardware goes, so the compiler's optimizer should change
> them to int.  This patch changes the bools to ints
> should that be to someone's taste.
>
> ---
>
> patch_pg_current_logfile-v14.diff.logdest_change
>
> For consideration of inclusion in "main" patch.
>
> Applies on top of patch_pg_current_logfile-v14.diff.
>
> Fixes a bug where, when log_destination changes and the config
> file is reloaded, a no-longer-used logfile path may be written
> to the current_logfiles file.  The chain of events would be
> as follows:
>
> 1) PG configured with csvlog in log_destination. Logs are written.
>
> This makes last_csv_file_name non-NULL.
>
>
> 2) PG config changed to remove csvlog from log_destination
> and SIGHUP sent.
>
> This removes the csvlog path from current_logfiles but does not
> make last_csv_file_name NULL.  last_csv_file_name has the old
> path in it.
>
>
> 3) PG configured to add csvlog back to log_destination and
> SIGHUP sent.
>
> When csvlogging is turned back on, the stale csv path is written
> to current_logfiles.  This is overwritten as soon as some csv logs
> are written because the new csv logfile must be opened, but this is
> still a problem.  Even if it happens to be impossible at the moment
> to get past step 2 to step 3 without having some logs written it seems
> a lot more robust to manually "expire" the last*_file_name variable
> content when log_destination changes.
>
>
> So what the patch does is "scrub" the "last_*file_name" variables
> when the config file changes.
>
> FWIW, I moved the logfile_writename() call toward the top of the
> if block just to keep all the code which sorta-kinda involves
> the old_log_destination variable together.
>
> ---
>
> patch_pg_current_logfile-v14.diff.logdest_change-part2
>
> Do not include in "main" patch, submit to maintainers separately.
>
> Applies on top of patch_pg_current_logfile-v14.diff.logdest_change
>
> Adds a PGLogDestination typedef.  A separate patch since this may be
> overkill and more about coding style than anything else.
>
> ---
>
> And now, a series of patches to fix the problem where, at least
> potentially, logfile_writename() gets a ENFILE or EMFILE and
> therefore a log file path does not ever get written to the
> current_logfiles file.  The point of this patch series is to retry until
> the content of current_logfiles is correct.
>
> All of these are for consideration of inclusion in "main" patch.
>
>
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part1
>
> Applies on top of patch_pg_current_logfile-v14.diff.logdest_change
>
> A documentation patch.  Notes that (even with retrying) the
> current_logfiles content might not be right.
>
>
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part2
>
> Applies on top of
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part1
>
> Remove arguments from logfile_writename().  Use static storage
> instead.  We always update last_file_name and last_csv_file_name
> whenever logfile_writename() is called so there's not much of
> a change here.
>
>
>
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3
>
> Applies on top of
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part2
>
> Re-try the write of current_logfiles should it fail because the
> system is too busy.
>
> ---
>
> patch_pg_current_logfile-v14.diff.backoff
>
> Do not include in "main" patch, submit to maintainers separately.
>
> Applies on top of
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3
>
> Introduces a backoff when retrying to write the current_logfiles file,
> because retrying when failure is due to a busy system only makes the
> system more busy.  This may well be over-engineering but I thought
> I'd present the code and let more experienced people decide.
>
> I have yet to really test this patch.
>
> ---
>
> The remaining patches have to do with coding style and code clarity.
>
> ---
>
> patch_pg_current_logfile-v14.diff.deletion
>
> For consideration of inclusion in "main" patch, otherwise submit to
> maintainers separately.
>
> Applies on top of
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3
>
> I find the code to be more understandable if the deletion
> of the current_logfiles file is separated from the updating
> of current_logfiles.  It's not that without this patch
> that unnecessary code execution is inefficient, it's that
> the unnecessary code execution makes it hard (for me) to think about
> what's happening where and why.
>
> ---
>
> patch_pg_current_logfile-v14.diff.conditional
>
> For consideration of inclusion in "main" patch, otherwise submit to
> maintainers separately.
>
> Applies on top of patch_pg_current_logfile-v14.diff.deletion
> (Could be changed to apply on top of
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3.)
>
> Only change the current_logfiles content on SIGHUP when the content needs
> to be changed.
>
> As above, I understand the code more readily when statements are only
> executed when they need to be executed.
>
> My analysis here is as follows:
>
> There are 9 cases, the combination of 2 sets of 3 possibilities.
>
>  syslog is removed from log_destination
>  the presence or absence of syslog in log_destination is unchanged
>  syslog is added to log_destination
>
>  the same 3 cases only regards csvlog
>
> In the no-change case we don't need to do anything
>
> If either syslog or csvlog are added to log_destination then
> the current_logfiles file will be updated on rotation or, in
> the case of adding a csvlog file, on file open.
>
> It is only the removal cases that require the current_logfiles
> file be changed, and that's what the patch does.
>
> ---
>
> patch_pg_current_logfile-v14.diff.codecomment
>
> Do not include in "main" patch, submit to maintainers separately.
>
> Applies on top of
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3
>
> Add a comment to the code explaining how an unusual case can come to
> pass.
>
> ---
>
> patch_pg_current_logfile-v14.diff.cleanup_rotate
>
> Do not include in "main" patch, submit to maintainers separately.
>
> Applies on top of
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3
>
> Removes a net 10 lines of code, eliminating unnecessary if statements.
> (Refactoring aside, pfree() does nothing when given a NULL pointer.)
>
> It might be a little cleaner to move the declarations of "filename" and
> "csvfilename" into the if blocks that use them, but I didn't do this.
>
> ---
>
> A series of patches which creates symbols for some GUC values which
> appear in multiple places in the code.
>
> Do not include in "main" patch, submit to maintainers separately.
>
>
> patch_pg_current_logfile-v14.diff.abstract_guc_part1
>
> Applies on top of
> patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3
> (And also applies on top of master branch, or really, any of the
> other patches presented in this email.)
>
> Changes code in master branch to create symbols for some GUC values.
>
>
> patch_pg_current_logfile-v14.diff.abstract_guc_part2
>
> Applies on top of patch_pg_current_logfile-v14.diff.abstract_guc_part1
> (and also applies on top of patch_pg_current_logfile-v14.diff, or
> probably any of the other patches in this email)
>
> Changes code in this patch series to use symbols for some GUC values.
>
> ---
>
> I've just started to look at pg_current_logfile() in
> src/backend/utils/adt/misc.c.  I believe I've noticed a couple
> of problems.  I probably won't have time to write patches until
> next week so I thought I'd simply report what I see and
> let you review what I've got so far.
>
> It looks like under some conditions (typically errors) you
> can return an uninitialized value in log_filename.  The
> answer here is probably to write :
>
>   log_filename[0] = 0;
>
> somewhere toward the top of the function.  And (as now written)
> you also need to put the same statement after the error that
> reports "unexpected line format in file %s" to be sure that
> in that case you're not going to return a pointer to the C
> NULL value.
>
> But also, you can't use strtok() to parse lbuffer because
> the path you're returning can contain a space.  (I suspect using
> strstr() to parse on the the first space you find is the way
> to go.  You can use the same trick that strtok() uses of sticking
> a 0 byte in in place of the space to get the log format string.)
>
> ---
>
> I'm going to put together a patch, to be sent separately to the
> maintainers when we ask them for final review, which shows what the
> docs/code looks like when current_logfiles always exists and can be an
> empty file.  And argue for that.  But I've not gotten to this yet.
>
> The good news is that I don't see anything else in syslogger.c to
> comment on or patch.  :-)
>
> Regards,
>
>
> Karl <kop@meme.com>
> Free Software:  "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein


--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Sun, 27 Nov 2016 21:54:46 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Again, patches patch_pg_current_logfile-v14.diff.doc_linux_default-v2
> have not been included because I don't see any reason to talk
> especially about systemd. If you talk about systemd you must talk
> about other stderr handler by all systems. IMO saying that
> current_logfile is present only if logging_collector is enabled and
> log_destination include stderr or/and csvlog is enough, no need to
> talk about systemd and behavior of Linux distributions.

Fair enough.  And I'd sooner not talk about systemd or other such
specifics too.

The concern I'm attempting to address is that the patch touts
the current_logfiles file in the section on logging without
reservation.  But anyone compiling from source or using most
pre-built Linux binaries won't have the file. (And the
pg_current_logfiles() function won't work either.)

Maybe the answer is to  change
   <para>When logs are written to the file-system ...

to
   <para>When the <productname>PostgreSQL</productname> backend   database server write logs to the file-system (which
isoften   not the default configuration) ...
 

?

Or something.  This also seems verbose, yet incomplete because
although it mentions that the default is to not have the backend
write logs it does not say anything about where to look to change
the default.

The thing is, on most default setups you do get logs written to the
filesystem, but they are not written by the PG backend.

I'll let you (or anyone else who might be concerned) move this forward
because I don't seem to have a good answer.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Gilles,

On Sun, 27 Nov 2016 21:54:46 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> I've attached the v15 of the patch that applies your changes/fixes and
> remove the call to strtok().

I like the idea of replacing the strtok() call with sscanf()
but I see problems.

It won't work if log_filename begins with a space because
(the docs say) that a space in the sscanf() format represents
any amount of whitespace.

As written, there's a potential buffer overrun in log_format.
You could fix this by making log_format as large as lbuffer,
but this seems clunky.

Regards,


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 28/11/2016 à 18:54, Karl O. Pinc a écrit :
> Hi Gilles,
>
> On Sun, 27 Nov 2016 21:54:46 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> I've attached the v15 of the patch that applies your changes/fixes and
>> remove the call to strtok().
> I like the idea of replacing the strtok() call with sscanf()
> but I see problems.
>
> It won't work if log_filename begins with a space because
> (the docs say) that a space in the sscanf() format represents
> any amount of whitespace.
Right

> As written, there's a potential buffer overrun in log_format.
> You could fix this by making log_format as large as lbuffer,
> but this seems clunky.

Sorry, Isee that I forgot to apply the control in number of character
read by sscanf.

The problem about white space make me though that the use of sscanf is
not the right solution, I will rewrite this part but I can not do that
before the end of the week.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Gilles,

On Sun, 27 Nov 2016 21:54:46 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> I've attached the v15 of the patch that applies your changes/fixes ...

Per the following:

On Mon, 21 Nov 2016 13:17:17 -0500
Robert Haas <robertmhaas@gmail.com> wrote:

> It would really be much better to submit anything that's not
> absolutely necessary for the new feature as a separate patch, rather
> than bundling things together.

The following patch should really be submitted (along with
your patch) as a separate and independent patch:

patch_pg_current_logfile-v14.diff.cleanup_rotate

It introduces no new functionality and only cleans up existing code.
The idea is to give the maintainers only one thing to consider
at a time.  It can be looked at along with your patch since it
touches the same code but, like the GUC symbol patch, it's
not a necessary part of your patch's functionality.

At present patch_pg_current_logfile-v14.diff.cleanup_rotate is bundled
in with the v15 patch.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Sun, 27 Nov 2016 21:54:46 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> I've attached the v15 of the patch 

> I've not applied patch patch_pg_current_logfile-v14.diff.backoff to
> prevent constant call of logfile_writename() on a busy system (errno =
> ENFILE | EMFILE).

I don't think it should be applied and included in the basic
functionality patch in any case. I think it needs to be submitted as a
separate patch along with the basic functionality patch.  Backing off
the retry of the current_logfiles write could be overly fancy and
simply not needed.

> I think this can be done quite simply by testing if
> log rotate is still enabled. This is possible because function
> logfile_rotate() is already testing if  errno = ENFILE | EMFILE and in
> this case rotation_disabled is set to true. So the following test
> should do the work:
> 
>                if (log_metainfo_stale && !rotation_disabled)
>                        logfile_writename();
> 
> This is included in v15 patch.

I don't see this helping much, if at all.

First, it's not clear just when rotation_disabled can be false
when log_metainfo_stale is true.  The typical execution
path is for logfile_writename() to be called after rotate_logfile()
has already set rotataion_disabled to true.   logfile_writename()
is the only place setting log_metainfo_stale to true and
rotate_logfile() the only place settig rotation_disabled to false.

While it's possible
that log_metainfo_stale might be set to true when logfile_writename()
is called from within open_csvlogfile(), this does not help with the
stderr case.  IMO better to just test for log_metaifo_stale at the
code snippet above.

Second, the whole point of retrying the logfile_writename() call is
to be sure that the current_logfiles file is updated before the logs
rotate.  Waiting until logfile rotation is enabled defeats the purpose.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: Patch to implement pg_current_logfile() function

От
Haribabu Kommi
Дата:


On Fri, Dec 2, 2016 at 12:08 PM, Karl O. Pinc <kop@meme.com> wrote:
On Sun, 27 Nov 2016 21:54:46 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> I've attached the v15 of the patch

> I've not applied patch patch_pg_current_logfile-v14.diff.backoff to
> prevent constant call of logfile_writename() on a busy system (errno =
> ENFILE | EMFILE).

I don't think it should be applied and included in the basic
functionality patch in any case. I think it needs to be submitted as a
separate patch along with the basic functionality patch.  Backing off
the retry of the current_logfiles write could be overly fancy and
simply not needed.

> I think this can be done quite simply by testing if
> log rotate is still enabled. This is possible because function
> logfile_rotate() is already testing if  errno = ENFILE | EMFILE and in
> this case rotation_disabled is set to true. So the following test
> should do the work:
>
>                if (log_metainfo_stale && !rotation_disabled)
>                        logfile_writename();
>
> This is included in v15 patch.

I don't see this helping much, if at all.

First, it's not clear just when rotation_disabled can be false
when log_metainfo_stale is true.  The typical execution
path is for logfile_writename() to be called after rotate_logfile()
has already set rotataion_disabled to true.   logfile_writename()
is the only place setting log_metainfo_stale to true and
rotate_logfile() the only place settig rotation_disabled to false.

While it's possible
that log_metainfo_stale might be set to true when logfile_writename()
is called from within open_csvlogfile(), this does not help with the
stderr case.  IMO better to just test for log_metaifo_stale at the
code snippet above.

Second, the whole point of retrying the logfile_writename() call is
to be sure that the current_logfiles file is updated before the logs
rotate.  Waiting until logfile rotation is enabled defeats the purpose.

Moved to next CF with "needs review" status.

Regards,
Hari Babu
Fujitsu Australia

Re: Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 02/12/2016 à 02:08, Karl O. Pinc a écrit :
> On Sun, 27 Nov 2016 21:54:46 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> I've attached the v15 of the patch
>> I've not applied patch patch_pg_current_logfile-v14.diff.backoff to
>> prevent constant call of logfile_writename() on a busy system (errno =
>> ENFILE | EMFILE).
> I don't think it should be applied and included in the basic
> functionality patch in any case. I think it needs to be submitted as a
> separate patch along with the basic functionality patch.  Backing off
> the retry of the current_logfiles write could be overly fancy and
> simply not needed.
>
>> I think this can be done quite simply by testing if
>> log rotate is still enabled. This is possible because function
>> logfile_rotate() is already testing if  errno = ENFILE | EMFILE and in
>> this case rotation_disabled is set to true. So the following test
>> should do the work:
>>
>>                if (log_metainfo_stale && !rotation_disabled)
>>                        logfile_writename();
>>
>> This is included in v15 patch.
> I don't see this helping much, if at all.
>
> First, it's not clear just when rotation_disabled can be false
> when log_metainfo_stale is true.  The typical execution
> path is for logfile_writename() to be called after rotate_logfile()
> has already set rotataion_disabled to true.   logfile_writename()
> is the only place setting log_metainfo_stale to true and
> rotate_logfile() the only place settig rotation_disabled to false.
>
> While it's possible
> that log_metainfo_stale might be set to true when logfile_writename()
> is called from within open_csvlogfile(), this does not help with the
> stderr case.  IMO better to just test for log_metaifo_stale at the
> code snippet above.
>
> Second, the whole point of retrying the logfile_writename() call is
> to be sure that the current_logfiles file is updated before the logs
> rotate.  Waiting until logfile rotation is enabled defeats the purpose.

Ok, sorry I've misunderstood your previous post. Current v16
attached patch removed your change about log_meta_info
stale and fix the use of sscanf to read the file.

It seems that all fixes have been included in this patch.

Regards

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


Вложения

Re: Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Tue, Dec 6, 2016 at 6:11 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
> It seems that all fixes have been included in this patch.

Yikes.  This patch has certainly had a lot of review, but it has lots
of basic inconsistencies with PostgreSQL coding style, which one would
think somebody would have noticed by now, and there are multiple
places where the logic seems to do in a complicated way something that
could be done significantly more simply.  I don't have any objection
to the basic idea of this patch, but it's got to look and feel like
the surrounding PostgreSQL code.  And not be unnecessarily
complicated.

Detailed comments below:

The SGML doesn't match the surrounding style - for example, <para>
typically appears on a line by itself.

+    if (!Logging_collector) {

Formatting.

rm_log_metainfo() could be merged into logfile_writename(), since
they're always called in conjunction and in the same pattern.  The
function is poorly named; it should be something like
update_metainfo_datafile().

+        if (errno == ENFILE || errno == EMFILE)
+            ereport(LOG,
+                (errmsg("system is too busy to write logfile meta
info, %s will be updated on next rotation (or use SIGHUP to retry)",
LOG_METAINFO_DATAFILE)));

This seems like a totally unprincipled reaction to a purely arbitrary
subset of errors.  EMFILE or ENFILE doesn't represent a general "too
busy" condition, and logfile_open() has already logged the real error.

+    snprintf(tempfn, sizeof(tempfn), "%s.tmp", LOG_METAINFO_DATAFILE);

You don't really need to use snprintf() here.  You could #define
LOG_METAINFO_DATAFILE_TMP LOG_METAINFO_DATAFILE ".tmp" and compute
this at compile time instead of runtime.

+    if (PG_NARGS() == 1) {
+        fmt = PG_ARGISNULL(0) ? NULL : PG_GETARG_TEXT_PP(0);
+        if (fmt != NULL) {
+            logfmt = text_to_cstring(fmt);
+            if ( (strcmp(logfmt, "stderr") != 0) &&
+                (strcmp(logfmt, "csvlog") != 0) ) {
+                ereport(ERROR,
+        (errmsg("log format %s not supported, possible values are
stderr or csvlog", logfmt)));
+                PG_RETURN_NULL();
+            }
+        }
+    } else {
+        logfmt = NULL;
+    }

Formatting.  This is unlike PostgreSQL style in multiple ways,
including cuddled braces, extra parentheses and spaces, and use of
braces around a single-line block.  Also, this could be written in a
much less contorted way, like:

if (PG_NARGS() == 0 || PG_ARGISNULL(0))   logfmt = NULL;
else
{   logfmt = text_to_cstring(PG_GETARG_TEXT_PP(0));   if (strcmp(logfmt, "stderr") != 0 && strcmp(logfmt, "csvlog") !=
0)       ereport(...);
 
}

Also, the ereport() needs an errcode(), not just an errmsg().
Otherwise it defaults to ERRCODE_INTERNAL_ERROR, which is not correct.

+    /* Check if current log file is present */
+    if (stat(LOG_METAINFO_DATAFILE, &stat_buf) != 0)
+        PG_RETURN_NULL();

Useless test.  The code that follows catches this case anyway and
handles it the same way.  Which is good, because this has an inherent
race condition. The previous if (!Logging_collector) test sems fairly
useless too; unless there's a bug in your syslogger.c code, the file
won't exist anyway, so we'll return NULL for that reason with no extra
code needed here.

+    /*
+    * Read the file to gather current log filename(s) registered
+    * by the syslogger.
+    */

Whitespace isn't right.

+    while (fgets(lbuffer, sizeof(lbuffer), fd) != NULL) {
+        char    log_format[10];
+        int     i = 0, space_pos = 0;
+
+        /* Check for a read error. */
+        if (ferror(fd)) {

More coding style issues.

+            ereport(ERROR,
+                (errcode_for_file_access(),
+                errmsg("could not read file \"%s\": %m",
LOG_METAINFO_DATAFILE)));
+            FreeFile(fd);
+            break;

ereport(ERROR) doesn't return, so the code that follows is pointless.

+        if (strchr(lbuffer, '\n') != NULL)
+            *strchr(lbuffer, '\n') = '\0';

Probably worth adding a local variable instead of doing this twice.
Local variables are cheap, and the code would be more readable.

+            if ((space_pos == 0) && (isspace(lbuffer[i]) != 0))

Too many parentheses.  Also, I think the whole loop in which this is
contained could be eliminated entirely.  Just search for the first ' '
character with strchr(); you don't need to are about isspace() because
you know the code that writes this file uses ' ' specifically.
Overwrite it with '\0'.  And then use a pointer to lbuffer for the log
format and a pointer to lbuffer + strchr_result + 1 for the pathname.

+        if ((space_pos != (int)strlen("stderr")) &&
+            (space_pos != (int)strlen("csvlog")))
+        {
+            ereport(ERROR,
+                (errmsg("unexpected line format in file %s",
LOG_METAINFO_DATAFILE)));
+            break;
+        }

I think this is pointless.  Validating the length of the log format
but not the content seems kind of silly, and why do either?  The only
way it's going to do the wrong thing is if the file contents are
corrupt, and that should only happen if we have a bug.  Which we
probably won't, and if we do we'll fix it and then we won't, and this
will just be deadweight -- and one more thing to update if we ever add
support for another log format.

+    /* Return null when csvlog is requested but we have a stderr log */
+    if ( (logfmt != NULL) && (strcmp(logfmt, "csvlog") == 0)
+            && !(Log_destination & LOG_DESTINATION_CSVLOG) )
+        PG_RETURN_NULL();
+
+    /* Return null when stderr is requested but we have a csv log */
+    if ( (logfmt != NULL) && (strcmp(logfmt, "stderr") == 0)
+            && !(Log_destination & LOG_DESTINATION_STDERR) )
+        PG_RETURN_NULL();

Seems complicated.  Instead, you could declare char *result = NULL.
When you find the row you're looking for, set result and break out of
the loop.  Then here you can just do if (result == NULL)
PG_RETURN_NULL() and this complexity goes away.

+/* in backend/utils/adt/misc.c and backend/postmaster/syslogger.c */
+/*
+ * Name of file holding the paths, names, and types of the files where current
+ * log messages are written, when the log collector is enabled.  Useful
+ * outside of Postgres when finding the name of the current log file in the
+ * case of time-based log rotation.
+ */
+#define LOG_METAINFO_DATAFILE  "current_logfiles"

syslogger.h seems more appropriate.  miscadmin.h is quite widely
included and it would be better not to bloat it with things that
aren't widely needed.  It makes sense to use it for widely-used stuff
that has no other obvious home, but since this is clearly a creature
of the syslogger, it could logically go in syslogger.h.

This might not be exhaustive but I think fixing these things would get
us much closer to a committable patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Patch to implement pg_current_logfile() function

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> Yikes.  This patch has certainly had a lot of review, but it has lots
> of basic inconsistencies with PostgreSQL coding style, which one would
> think somebody would have noticed by now, and there are multiple
> places where the logic seems to do in a complicated way something that
> could be done significantly more simply.  I don't have any objection
> to the basic idea of this patch, but it's got to look and feel like
> the surrounding PostgreSQL code.  And not be unnecessarily
> complicated.

A lot of the code-formatting issues could probably be fixed by running it
through pgindent.  Also, when you are starting from code that was written
with little regard for Postgres layout conventions, it's a really good
idea to see what pgindent will do to it, because it might not look very
nice afterwards.
        regards, tom lane



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hello Robert,

On Wed, 7 Dec 2016 18:52:24 -0500
Robert Haas <robertmhaas@gmail.com> wrote:

> On Tue, Dec 6, 2016 at 6:11 PM, Gilles Darold
> <gilles.darold@dalibo.com> wrote:
> > It seems that all fixes have been included in this patch.

I read this and knew that I hadn't finished review, but didn't
immediately respond because I thought the patch had to be
marked "ready for committer" on commitfest.postgresql.org
before the committers would spend a lot of time on it.

I don't have the time to fully respond, or I'd be able to
attach new code, but want to comment before anybody else
spends a lot of time on this patch.

> Yikes.  This patch has certainly had a lot of review, but it has lots
> of basic inconsistencies with PostgreSQL coding style, which one would
> think somebody would have noticed by now,

Yes and no.  I don't really know what the coding style is and
rather than have to make multiple corrections to code that might
get weeded out and discarded I've been focusing on getting the logic
right.  Really, I've not put a thought to it except for trying to write
what I write like what's there, and sticking to < 80 chars per line.

There has been lots of review.
The only truly effective way I've found to communicate regards
the pg_current_logfiles patch has been to write code and provide
detailed test cases.  I could be wrong, but unless I submit
code I don't feel like I've been understood.
I just haven't been interested in re-formatting
somebody else's code before I think the code really works.

>  I don't have any objection
> to the basic idea of this patch, but it's got to look and feel like
> the surrounding PostgreSQL code.  And not be unnecessarily
> complicated.

I've been introducing some complication, but I hope it's necessary.
To keep the review process simple my plan has been to submit
multiple patches.  One with the basics and more on top of that
that introduce complication that can be considered and accepted or
rejected.  So I send emails with multiple patches, some that I think
need to go into the core patch and others to be kept separate.
But, although I try, this does not seem to have been communicated
because I keep getting emails back that contain only a single patch.

Maybe something's wrong with my review process but I don't know
how to fix it.

If you think a single patch is the way to go I can open up
separate tickets at commitfest.postgresql.org.  But this seems
like overkill because all the patches touch the same code.

The extreme case is the attached "cleanup_rotate" patch.
(Which applies to v14 of this patch.)  It's nothing but
a little bit of tiding up of the master branch, but does
touch code that's already being modified so it seems
like the committers would want to look at it at the same
time as when reviewing the pg_current_logfile patch.
Once you've looked at the pg_current_logfile patch
you've already looked at and modified code in the function
the "cleanup_rotate" patch touches.

But the "cleanup_rotate" patch got included
in the v15 version of the patch and is also in v16.
I'm expecting to submit it as a separate patch
along with the pg_current_logfile patch and tried
to be very very clear about this.  It
really has nothing to do with the pg_current_logfile
functionality.  (And is the only "separate" patch
which really has nothing to do with the pg_current_logfile
functionality.)

More examples of separate patches are below.  Any guidance
would be appreciated.

>
> Detailed comments below:

> rm_log_metainfo() could be merged into logfile_writename(), since
> they're always called in conjunction and in the same pattern.

This would be true, if it weren't for the attached
"retry_current_logfiles" patches that were applied in
v15 of the patch and removed from v16 due to some
mis-communication I don't understand.

(But the attached patches apply on top of the the v14 patch.
I don't have time to refactor them now.)

FYI.  The point of the "retry_current_logfiles"
patch series is to handle the case
where logfile_writename gets a ENFILE or EMFILE.
When this happens the current_logfiles file can
"skip" and not contain the name(s) of the current
log file for an entire log rotation.  To miss, say,
a month of logs because the logs only rotate monthly
and your log processing is based on reading the
current_logfiles file sounds like a problem.

What I was attempting to communicate in my email response
to the v15 patch is that the (attached, but applies to the
v14 patch again) "backoff" patch should be submitted
as a separate patch.  It seems over-complicated, but
exists in case a committer is worried about re-trying
writes on a system that's busy enough to generate ENFILE
or EMFILE errors.


> +        if (errno == ENFILE || errno == EMFILE)
> +            ereport(LOG,
> +                (errmsg("system is too busy to write logfile meta
> info, %s will be updated on next rotation (or use SIGHUP to retry)",
> LOG_METAINFO_DATAFILE)));
>
> This seems like a totally unprincipled reaction to a purely arbitrary
> subset of errors.  EMFILE or ENFILE doesn't represent a general "too
> busy" condition, and logfile_open() has already logged the real error.

This is my doing.  I wrote "too busy" because that's what the existing
code comments say that ENFILE and EMFILE are indicative of.

The point of the code is to report not just the real error, but what
effect the real error has -- that the current_logfiles file did not
get updated in a timely fashion.  Maybe this isn't necessary, especially
if the write of current_logfiles gets retried on failure.  Or maybe
the right thing to do is to give logfile_open() an argument that
let's it elaborate it's error message.  Any guidance here would
be appreciated.

Thanks for the review and I'll try to read over and digest the details
before submitting another email with patches.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Wed, Dec 7, 2016 at 11:02 PM, Karl O. Pinc <kop@meme.com> wrote:
> I read this and knew that I hadn't finished review, but didn't
> immediately respond because I thought the patch had to be
> marked "ready for committer" on commitfest.postgresql.org
> before the committers would spend a lot of time on it.

Generally that's true, but I was trying to be helpful and trying also
to move this toward some kind of conclusion.

> I've been introducing some complication, but I hope it's necessary.
> To keep the review process simple my plan has been to submit
> multiple patches.  One with the basics and more on top of that
> that introduce complication that can be considered and accepted or
> rejected.  So I send emails with multiple patches, some that I think
> need to go into the core patch and others to be kept separate.
> But, although I try, this does not seem to have been communicated
> because I keep getting emails back that contain only a single patch.
>
> Maybe something's wrong with my review process but I don't know
> how to fix it.

It is, of course, not my job to decide who is at fault in whatever may
or may not have gone wrong during the reviewing process.  It is not
only not my job, but would be unproductive, since the goal here is for
people to contribute more, not less.  I have done enough review in
this community to have experienced the problem of people who say that
they took your suggestions when in fact they didn't, or who randomly
de-improve things in future patch versions that were more correct in
earlier versions.  I agree that on occasions when that happens, it is
indeed very frustrating.  Whether that's happened in this case, I
don't know.  I do think that the blizzard of small patches that you've
submitted in some of your emails may possibly be a bit overwhelming.
We shouldn't really need a stack of a dozen or more patches to
implement one small feature.  Declarative partitioning only had 7.
Why does this need more than twice that number?

> The extreme case is the attached "cleanup_rotate" patch.
> (Which applies to v14 of this patch.)  It's nothing but
> a little bit of tiding up of the master branch, but does
> touch code that's already being modified so it seems
> like the committers would want to look at it at the same
> time as when reviewing the pg_current_logfile patch.
> Once you've looked at the pg_current_logfile patch
> you've already looked at and modified code in the function
> the "cleanup_rotate" patch touches.

I took a look at that patch just now and I guess I don't really see
the point.  I mean, it will save a few cycles, and that's not nothing,
but it's not much, either.  I don't see a reason to complicate the
basic feature patch by entangling it with such a change - it just
makes it harder to get the actual feature committed.  And I kind of
wonder if the time it will take to review and commit that patch is
even worth it.  There must be hundreds of places in our code base
where you could do stuff like that, but only a few of those save
enough cycles to be worth bothering with.  To be fair, if that patch
were submitted independently of the rest of this and nobody objected,
I'd probably commit it; I mean, why not?  But this thread is now over
100 emails long without reaching a happy conclusion, and one good way
to expedite the path to a happy conclusion is to drop all of the
nonessential bits.  And that change certainly qualifies as
nonessential.

> FYI.  The point of the "retry_current_logfiles"
> patch series is to handle the case
> where logfile_writename gets a ENFILE or EMFILE.
> When this happens the current_logfiles file can
> "skip" and not contain the name(s) of the current
> log file for an entire log rotation.  To miss, say,
> a month of logs because the logs only rotate monthly
> and your log processing is based on reading the
> current_logfiles file sounds like a problem.

I think if you are trying to enumerate the names of your logfiles by
any sort of polling mechanism, rather than by seeing what files show
up in your logging directory, you are doing it wrong.  So, I don't see
that this matters at all.

> The point of the code is to report not just the real error, but what
> effect the real error has -- that the current_logfiles file did not
> get updated in a timely fashion.  Maybe this isn't necessary, especially
> if the write of current_logfiles gets retried on failure.  Or maybe
> the right thing to do is to give logfile_open() an argument that
> let's it elaborate it's error message.  Any guidance here would
> be appreciated.

Generally speaking, I would say that it's the user's job to decide
what the impact of an error is; our job is only to tell them what
happened.  There are occasional exceptions; for example:

rhaas=# select ablance from pgbench_accounts;
ERROR:  column "ablance" does not exist
LINE 1: select ablance from pgbench_accounts;              ^
HINT:  Perhaps you meant to reference the column "pgbench_accounts.abalance".

The HINT -- which you'll note is part of the same ereport() rather
than being separately logged -- has been judged a helpful response to
a particularly common kind of mistake.  But in general it's not
necessary to elaborate on possible reasons for the error; it suffices
to report it.  One of the important reasons for this is that our
guesses about what probably caused the problem can easily turn out to
be WRONG, and a misleading HINT is worse than no hint because it
confuses some users and annoys others.  For example, I'm still on the
warpath about this:

rhaas=# select 1;
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly   This probably means the server terminated abnormally   before or while
processingthe request.
 

The hint is essentially alleging that the server crashed even though,
as you can see from the FATAL, it really was just shut down cleanly.
So users panic even though an experienced user can see that the hint
is obviously garbage in this particular case; that sucks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 8 Dec 2016 11:27:34 -0500
Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Dec 7, 2016 at 11:02 PM, Karl O. Pinc <kop@meme.com> wrote:
> > I read this and knew that I hadn't finished review, but didn't
> > immediately respond because I thought the patch had to be
> > marked "ready for committer" on commitfest.postgresql.org
> > before the committers would spend a lot of time on it.
>
> Generally that's true, but I was trying to be helpful and trying also
> to move this toward some kind of conclusion.

It has been very helpful, particularly your last reply.  And I think
there's now a clear path forward.

(I'm not looking for any replies to the below, although of course would
welcome whatever you've got to say.)

> It is, of course, not my job to decide who is at fault in whatever may
> or may not have gone wrong during the reviewing process.

Understood.  And I'm not looking for any sort of judgment or
attempting to pass judgment.  It has been frustrating though
and your email provided an opportunity to vent, and to
provide some feedback, of whatever quality, to the review
process.

It's been that much more frustrating because I don't really
care one way or another about the feature.  I was just trying
to build up credit reviewing somebody else's patch, and instead
probably did the opposite with all the thrashing.  :)

>  I do think that the blizzard of small patches that you've
> submitted in some of your emails may possibly be a bit overwhelming.
> We shouldn't really need a stack of a dozen or more patches to
> implement one small feature.  Declarative partitioning only had 7.
> Why does this need more than twice that number?

I'm trying to break up changes into patches which are as small
as possible to make them more easily understood by providing
a guide for the reader/reviewer.  So rather than
a single patch I'd make, say, 3.  One for documentation to describe
the change.  Another which does whatever refactoring is necessary
to the existing code, but which otherwise does not introduce any
functional changes.  And another which adds the new code which makes
use of the refactoring.  At each stage the code should continue
to work without bugs.  The other party can then decide to incorporate
the patchs into the main patch, which is itself another attached
patch.

Do this for several unrelated changes to the main patch and the patches
add up.

Mostly I wouldn't expect various patches to be reviewed by the
committers, they'd get incorporated into the main patch.

This is how I break down the work when I look at code changes.
What's it do?  What does it change/break in the existing code?
How does the new stuff work?  But this process has not worked
here so I guess I'll stop.

But.... I expect you will see at least 3 patches submitted for
committer review.  I see a number of hardcoded constants,
now that the main patch adds additional code, that can
be made into symbols to, IMO, improve code clarity.
Guiles points out that this is an issue of coding style
and might be considered unnecessary complication.
So they are not in the main patch.  They are attached
(applied to v14 of the main patch; really, the first applies
to the master PG branch and the 2nd to v14 of the
pg_current_logfiles patch) if you want to look
at them now and provide some guidance as to whether they
should be dropped or included in the patch.

> > The extreme case is the attached "cleanup_rotate" patch.
> > (Which applies to v14 of this patch.)  It's nothing but
> > a little bit of tiding up of the master branch,

> I took a look at that patch just now and I guess I don't really see
> the point.

The point would be to make the code easier to read.  Saving cycles
is hardly ever worthwhile in my opinion.  The whole point
of code is to be read by people and be understandable.
If it's not understood it's useless going forward.

Once I've gone to the effort to understand something
written, that I'm going to be responsible for maintaining, I like to
see it written as clearly as possible.  In my experience
if you don't do this then little confusions multiply and
eventually the whole thing turns to mud.

The trade-off, as you say, is the overhead involved
in running minor changes through the production
process.  I figure this can be minimized by bundling
such changes with related substantial changes.

Again, it's not working so I'll drop it.

> > FYI.  The point of the "retry_current_logfiles"
> > patch series is to handle the case
> > where logfile_writename gets a ENFILE or EMFILE.
> > When this happens the current_logfiles file can
> > "skip" and not contain the name(s) of the current
> > log file for an entire log rotation.  To miss, say,
> > a month of logs because the logs only rotate monthly
> > and your log processing is based on reading the
> > current_logfiles file sounds like a problem.
>
> I think if you are trying to enumerate the names of your logfiles by
> any sort of polling mechanism, rather than by seeing what files show
> up in your logging directory, you are doing it wrong.  So, I don't see
> that this matters at all.

Ok.  This simplifies things substantially.

It is worth noting that the PG pg_current_logfile() function
is dependent upon the content of the current_logfiles file.  So
if the file is wrong the function will also give wrong answers.
But I don't care if you don't.

The logging code is quite anal about dropping information.
It's very helpful to know that (so long as limitations are
documented I presume) the related meta-information functionality
of the current_logfiles file and the pg_current_logfile() function
need only be best-effort.

> > The point of the code is to report not just the real error, but what
> > effect the real error has

> Generally speaking, I would say that it's the user's job to decide
> what the impact of an error is; our job is only to tell them what
> happened.  There are occasional exceptions; for example:
> >
> > rhaas=# select ablance from pgbench_accounts;
> > ERROR:  column "ablance" does not exist
> > LINE 1: select ablance from pgbench_accounts;
> >                ^
> > HINT:  Perhaps you meant to reference the column
> > "pgbench_accounts.abalance".
> >
> > The HINT -- which you'll note is part of the same ereport() rather
> > than being separately logged

Yeah.  :)

> -- has been judged a helpful response
> > to a particularly common kind of mistake.  But in general it's not
> > necessary to elaborate on possible reasons for the error; it
> > suffices to report it.  One of the important reasons for this is
> > that our guesses about what probably caused the problem can easily
> > turn out to be WRONG, and a misleading HINT is worse than no hint
> > because it confuses some users and annoys others.

Ok.  And agreed.  And I'll let the error reporting in the patch
which started this thread drop.

I'm not attempting to argue here; but to provide some relief
for what comes into my brain.

Which is: I do think there is
a distinction between trying to guess what problem the user
needs to fix and reporting what part of PG is failing.  In
other words just passing through the error gotten from the
OS regards some file or another does not give the user
who knows nothing about PG internals useful information.
There does need to be some interpretation of what the impact
is to the user.

A related thought is this.  There are 3 abstraction levels of
concern.  The level beneath PG, the PG level, and the level
of the user's application.  When PG detects an error from either
"above" or "below" its job is to describe the problem from
its own perspective.  HINTs are attempts to cross the boundary
into the application's perspective.

Thanks for the reply.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 8 Dec 2016 11:27:34 -0500
Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Dec 7, 2016 at 11:02 PM, Karl O. Pinc <kop@meme.com> wrote:
> > I read this and knew that I hadn't finished review, but didn't
> > immediately respond because I thought the patch had to be
> > marked "ready for committer" on commitfest.postgresql.org
> > before the committers would spend a lot of time on it.
>
> Generally that's true, but I was trying to be helpful and trying also
> to move this toward some kind of conclusion.

It has been very helpful, particularly your last reply.  And I think
there's now a clear path forward.

(I'm not looking for any replies to the below, although of course would
welcome whatever you've got to say.)

> It is, of course, not my job to decide who is at fault in whatever may
> or may not have gone wrong during the reviewing process.

Understood.  And I'm not looking for any sort of judgment or
attempting to pass judgment.  It has been frustrating though
and your email provided an opportunity to vent, and to
provide some feedback, of whatever quality, to the review
process.

It's been that much more frustrating because I don't really
care one way or another about the feature.  I was just trying
to build up credit reviewing somebody else's patch, and instead
probably did the opposite with all the thrashing.  :)

>  I do think that the blizzard of small patches that you've
> submitted in some of your emails may possibly be a bit overwhelming.
> We shouldn't really need a stack of a dozen or more patches to
> implement one small feature.  Declarative partitioning only had 7.
> Why does this need more than twice that number?

I'm trying to break up changes into patches which are as small
as possible to make them more easily understood by providing
a guide for the reader/reviewer.  So rather than
a single patch I'd make, say, 3.  One for documentation to describe
the change.  Another which does whatever refactoring is necessary
to the existing code, but which otherwise does not introduce any
functional changes.  And another which adds the new code which makes
use of the refactoring.  At each stage the code should continue
to work without bugs.  The other party can then decide to incorporate
the patchs into the main patch, which is itself another attached
patch.

Do this for several unrelated changes to the main patch and the patches
add up.

Mostly I wouldn't expect various patches to be reviewed by the
committers, they'd get incorporated into the main patch.

This is how I break down the work when I look at code changes.
What's it do?  What does it change/break in the existing code?
How does the new stuff work?  But this process has not worked
here so I guess I'll stop.

But.... I expect you will see at least 3 patches submitted for
committer review.  I see a number of hardcoded constants,
now that the main patch adds additional code, that can
be made into symbols to, IMO, improve code clarity.
Guiles points out that this is an issue of coding style
and might be considered unnecessary complication.
So they are not in the main patch.  They are attached
(applied to v14 of the main patch; really, the first applies
to the master PG branch and the 2nd to v14 of the
pg_current_logfiles patch) if you want to look
at them now and provide some guidance as to whether they
should be dropped or included in the patch.

> > The extreme case is the attached "cleanup_rotate" patch.
> > (Which applies to v14 of this patch.)  It's nothing but
> > a little bit of tiding up of the master branch,

> I took a look at that patch just now and I guess I don't really see
> the point.

The point would be to make the code easier to read.  Saving cycles
is hardly ever worthwhile in my opinion.  The whole point
of code is to be read by people and be understandable.
If it's not understood it's useless going forward.

Once I've gone to the effort to understand something
written, that I'm going to be responsible for maintaining, I like to
see it written as clearly as possible.  In my experience
if you don't do this then little confusions multiply and
eventually the whole thing turns to mud.

The trade-off, as you say, is the overhead involved
in running minor changes through the production
process.  I figure this can be minimized by bundling
such changes with related substantial changes.

Again, it's not working so I'll drop it.

> > FYI.  The point of the "retry_current_logfiles"
> > patch series is to handle the case
> > where logfile_writename gets a ENFILE or EMFILE.
> > When this happens the current_logfiles file can
> > "skip" and not contain the name(s) of the current
> > log file for an entire log rotation.  To miss, say,
> > a month of logs because the logs only rotate monthly
> > and your log processing is based on reading the
> > current_logfiles file sounds like a problem.
>
> I think if you are trying to enumerate the names of your logfiles by
> any sort of polling mechanism, rather than by seeing what files show
> up in your logging directory, you are doing it wrong.  So, I don't see
> that this matters at all.

Ok.  This simplifies things substantially.

It is worth noting that the PG pg_current_logfile() function
is dependent upon the content of the current_logfiles file.  So
if the file is wrong the function will also give wrong answers.
But I don't care if you don't.

The logging code is quite anal about dropping information.
It's very helpful to know that (so long as limitations are
documented I presume) the related meta-information functionality
of the current_logfiles file and the pg_current_logfile() function
need only be best-effort.

> > The point of the code is to report not just the real error, but what
> > effect the real error has

> Generally speaking, I would say that it's the user's job to decide
> what the impact of an error is; our job is only to tell them what
> happened.  There are occasional exceptions; for example:
> >
> > rhaas=# select ablance from pgbench_accounts;
> > ERROR:  column "ablance" does not exist
> > LINE 1: select ablance from pgbench_accounts;
> >                ^
> > HINT:  Perhaps you meant to reference the column
> > "pgbench_accounts.abalance".
> >
> > The HINT -- which you'll note is part of the same ereport() rather
> > than being separately logged

Yeah.  :)

> -- has been judged a helpful response
> > to a particularly common kind of mistake.  But in general it's not
> > necessary to elaborate on possible reasons for the error; it
> > suffices to report it.  One of the important reasons for this is
> > that our guesses about what probably caused the problem can easily
> > turn out to be WRONG, and a misleading HINT is worse than no hint
> > because it confuses some users and annoys others.

Ok.  And agreed.  And I'll let the error reporting in the patch
which started this thread drop.

I'm not attempting to argue here; but to provide some relief
for what comes into my brain.

Which is: I do think there is
a distinction between trying to guess what problem the user
needs to fix and reporting what part of PG is failing.  In
other words just passing through the error gotten from the
OS regards some file or another does not give the user
who knows nothing about PG internals useful information.
There does need to be some interpretation of what the impact
is to the user.

A related thought is this.  There are 3 abstraction levels of
concern.  The level beneath PG, the PG level, and the level
of the user's application.  When PG detects an error from either
"above" or "below" its job is to describe the problem from
its own perspective.  HINTs are attempts to cross the boundary
into the application's perspective.

Thanks for the reply.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 08/12/2016 à 00:52, Robert Haas a écrit :
On Tue, Dec 6, 2016 at 6:11 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
It seems that all fixes have been included in this patch.
Yikes.  This patch has certainly had a lot of review, but it has lots
of basic inconsistencies with PostgreSQL coding style, which one would
think somebody would have noticed by now, and there are multiple
places where the logic seems to do in a complicated way something that
could be done significantly more simply.  I don't have any objection
to the basic idea of this patch, but it's got to look and feel like
the surrounding PostgreSQL code.  And not be unnecessarily
complicated.

Detailed comments below:

The SGML doesn't match the surrounding style - for example, <para>
typically appears on a line by itself.
Fixed.

+    if (!Logging_collector) {

Formatting.
Fixed.

rm_log_metainfo() could be merged into logfile_writename(), since
they're always called in conjunction and in the same pattern.  The
function is poorly named; it should be something like
update_metainfo_datafile().
Merge and logfile_writename() function renamed as suggested.

+        if (errno == ENFILE || errno == EMFILE)
+            ereport(LOG,
+                (errmsg("system is too busy to write logfile meta
info, %s will be updated on next rotation (or use SIGHUP to retry)",
LOG_METAINFO_DATAFILE)));

This seems like a totally unprincipled reaction to a purely arbitrary
subset of errors.  EMFILE or ENFILE doesn't represent a general "too
busy" condition, and logfile_open() has already logged the real error.
Removed.

+    snprintf(tempfn, sizeof(tempfn), "%s.tmp", LOG_METAINFO_DATAFILE);

You don't really need to use snprintf() here.  You could #define
LOG_METAINFO_DATAFILE_TMP LOG_METAINFO_DATAFILE ".tmp" and compute
this at compile time instead of runtime.
Done and added to syslogger.h

+    if (PG_NARGS() == 1) {
+        fmt = PG_ARGISNULL(0) ? NULL : PG_GETARG_TEXT_PP(0);
+        if (fmt != NULL) {
+            logfmt = text_to_cstring(fmt);
+            if ( (strcmp(logfmt, "stderr") != 0) &&
+                (strcmp(logfmt, "csvlog") != 0) ) {
+                ereport(ERROR,
+        (errmsg("log format %s not supported, possible values are
stderr or csvlog", logfmt)));
+                PG_RETURN_NULL();
+            }
+        }
+    } else {
+        logfmt = NULL;
+    }

Formatting.  This is unlike PostgreSQL style in multiple ways,
including cuddled braces, extra parentheses and spaces, and use of
braces around a single-line block.  Also, this could be written in a
much less contorted way, like:

if (PG_NARGS() == 0 || PG_ARGISNULL(0))   logfmt = NULL;
else
{   logfmt = text_to_cstring(PG_GETARG_TEXT_PP(0));   if (strcmp(logfmt, "stderr") != 0 && strcmp(logfmt, "csvlog") != 0)        ereport(...);
}
Applied.

Also, the ereport() needs an errcode(), not just an errmsg().
Otherwise it defaults to ERRCODE_INTERNAL_ERROR, which is not correct.
Add errcode(ERRCODE_INVALID_PARAMETER_VALUE) to the ereport call.

+    /* Check if current log file is present */
+    if (stat(LOG_METAINFO_DATAFILE, &stat_buf) != 0)
+        PG_RETURN_NULL();

Useless test.  The code that follows catches this case anyway and
handles it the same way.  Which is good, because this has an inherent
race condition. The previous if (!Logging_collector) test sems fairly
useless too; unless there's a bug in your syslogger.c code, the file
won't exist anyway, so we'll return NULL for that reason with no extra
code needed here.
That's right, both test have been removed.

+    /*
+    * Read the file to gather current log filename(s) registered
+    * by the syslogger.
+    */

Whitespace isn't right.

+    while (fgets(lbuffer, sizeof(lbuffer), fd) != NULL) {
+        char    log_format[10];
+        int     i = 0, space_pos = 0;
+
+        /* Check for a read error. */
+        if (ferror(fd)) {

More coding style issues.

+            ereport(ERROR,
+                (errcode_for_file_access(),
+                errmsg("could not read file \"%s\": %m",
LOG_METAINFO_DATAFILE)));
+            FreeFile(fd);
+            break;

ereport(ERROR) doesn't return, so the code that follows is pointless.
All issues above are fixed.

+        if (strchr(lbuffer, '\n') != NULL)
+            *strchr(lbuffer, '\n') = '\0';

Probably worth adding a local variable instead of doing this twice.
Local variables are cheap, and the code would be more readable.
Removed

+            if ((space_pos == 0) && (isspace(lbuffer[i]) != 0))

Too many parentheses.  Also, I think the whole loop in which this is
contained could be eliminated entirely.  Just search for the first ' '
character with strchr(); you don't need to are about isspace() because
you know the code that writes this file uses ' ' specifically.
Overwrite it with '\0'.  And then use a pointer to lbuffer for the log
format and a pointer to lbuffer + strchr_result + 1 for the pathname.
Rewritten, not exactly the way you describe but like this:

              /* extract log format and log file path from the line */
              log_filepath = strchr(lbuffer, ' ');
              log_filepath++;
              lbuffer[log_filepath-lbuffer-1] = '\0';
              log_format = lbuffer;
              *strchr(log_filepath, '\n') = '\0';

it was possible to not use the additional log_format variable and just keep the log format information in lbuffer, then use it the next code, but I have kept the log_format variable for readability.


+        if ((space_pos != (int)strlen("stderr")) &&
+            (space_pos != (int)strlen("csvlog")))
+        {
+            ereport(ERROR,
+                (errmsg("unexpected line format in file %s",
LOG_METAINFO_DATAFILE)));
+            break;
+        }

I think this is pointless.  Validating the length of the log format
but not the content seems kind of silly, and why do either?  The only
way it's going to do the wrong thing is if the file contents are
corrupt, and that should only happen if we have a bug.  Which we
probably won't, and if we do we'll fix it and then we won't, and this
will just be deadweight -- and one more thing to update if we ever add
support for another log format.
Removed.

+    /* Return null when csvlog is requested but we have a stderr log */
+    if ( (logfmt != NULL) && (strcmp(logfmt, "csvlog") == 0)
+            && !(Log_destination & LOG_DESTINATION_CSVLOG) )
+        PG_RETURN_NULL();
+
+    /* Return null when stderr is requested but we have a csv log */
+    if ( (logfmt != NULL) && (strcmp(logfmt, "stderr") == 0)
+            && !(Log_destination & LOG_DESTINATION_STDERR) )
+        PG_RETURN_NULL();

Seems complicated.  Instead, you could declare char *result = NULL.
When you find the row you're looking for, set result and break out of
the loop.  Then here you can just do if (result == NULL)
PG_RETURN_NULL() and this complexity goes away.
Fixed.

+/* in backend/utils/adt/misc.c and backend/postmaster/syslogger.c */
+/*
+ * Name of file holding the paths, names, and types of the files where current
+ * log messages are written, when the log collector is enabled.  Useful
+ * outside of Postgres when finding the name of the current log file in the
+ * case of time-based log rotation.
+ */
+#define LOG_METAINFO_DATAFILE  "current_logfiles"

syslogger.h seems more appropriate.  miscadmin.h is quite widely
included and it would be better not to bloat it with things that
aren't widely needed.  It makes sense to use it for widely-used stuff
that has no other obvious home, but since this is clearly a creature
of the syslogger, it could logically go in syslogger.h.
Moved to syslogger.h

This might not be exhaustive but I think fixing these things would get
us much closer to a committable patch.

I have also run the code through pgindent as Tom suggest to fix all pending coding style issues.  I've just change on line from pgindent work to prevent git apply to complain about a space in tabulation issue.  I don't know which is the best, let things like pgindent want to write it or prevent git to complain.  Attached is the v17 of the patch.

Thanks a lot for this synthetic review.

Regards

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org
Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Gilles,

On Fri, 9 Dec 2016 23:41:25 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

>               /* extract log format and log file path from the line */
>               log_filepath = strchr(lbuffer, ' ');
>               log_filepath++;
>               lbuffer[log_filepath-lbuffer-1] = '\0';
>               log_format = lbuffer;
>               *strchr(log_filepath, '\n') = '\0';

Instead I propose (code I have not actually executed):
...
char    lbuffer[MAXPGPATH];
char    *log_format = lbuffer;
...
   /* extract log format and log file path from the line */   log_filepath = strchr(lbuffer, ' ');  /* lbuffer ==
log_format*/   *log_filepath = '\0';                 /* terminate log_format */   log_filepath++;
/*start of file path */   log_filepath[strcspn(log_filepath, "\n")] = '\0';
 


My first thought was to follow your example and begin with   log_format = lbuffer;
But upon reflection I think changing the declaration of log_format
to use an initializer better expresses how storage is always shared.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Fri, 9 Dec 2016 23:36:12 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> Instead I propose (code I have not actually executed):
> ...
> char    lbuffer[MAXPGPATH];
> char    *log_format = lbuffer;
> ...
> 
>     /* extract log format and log file path from the line */
>     log_filepath = strchr(lbuffer, ' ');  /* lbuffer == log_format */
>     *log_filepath = '\0';                 /* terminate log_format */
>     log_filepath++;                       /* start of file path */
>     log_filepath[strcspn(log_filepath, "\n")] = '\0';

Er, I guess I prefer the more paranoid, just because who knows
what might have manged to somehow write the file that's read
into lbuffer:

...
char    lbuffer[MAXPGPATH];
char    *log_format = lbuffer;
...
   /* extract log format and log file path from the line */   if (log_filepath = strchr(lbuffer, ' ')) /* lbuffer ==
log_format*/       *log_filepath = '\0';                /* terminate log_format */   log_filepath++;
     /* start of file path */   log_filepath[strcspn(log_filepath, "\n")] = '\0';
 

The file read is, of course, normally written by postgres.  But possibly
writing to unintended memory locations, even virtual address NULL, does
not seem good.

Any feedback from more experienced PG developers as how to best handle
this case would be welcome.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Sat, 10 Dec 2016 19:41:21 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> On Fri, 9 Dec 2016 23:36:12 -0600
> "Karl O. Pinc" <kop@meme.com> wrote:
> 
> > Instead I propose (code I have not actually executed):
> > ...
> > char    lbuffer[MAXPGPATH];
> > char    *log_format = lbuffer;
> > ...
> > 
> >     /* extract log format and log file path from the line */
> >     log_filepath = strchr(lbuffer, ' ');  /* lbuffer == log_format
> > */ *log_filepath = '\0';                 /* terminate log_format */
> >     log_filepath++;                       /* start of file path */
> >     log_filepath[strcspn(log_filepath, "\n")] = '\0';  
> 
> Er, I guess I prefer the more paranoid, just because who knows
> what might have manged to somehow write the file that's read
> into lbuffer:
> 
> ...
> char    lbuffer[MAXPGPATH];
> char    *log_format = lbuffer;
> ...
> 
>     /* extract log format and log file path from the line */
>     if (log_filepath = strchr(lbuffer, ' ')) /* lbuffer == log_format
> */ *log_filepath = '\0';                /* terminate log_format */
>     log_filepath++;                          /* start of file path */
>     log_filepath[strcspn(log_filepath, "\n")] = '\0';

*sigh*


...
char    lbuffer[MAXPGPATH];
char    *log_format = lbuffer;
...
   /* extract log format and log file path from the line */   /* lbuffer == log_format, they share storage */   if
(log_filepath= strchr(lbuffer, ' '))       *log_filepath = '\0';   /* terminate log_format */   else   {       /*
Unknownformat, no space. Return NULL to caller. */       lbuffer[0] = '\0';       break;   }   log_filepath++;
   /* start of file path   */   log_filepath[strcspn(log_filepath, "\n")] = '\0';
 


> The file read is, of course, normally written by postgres.  But
> possibly writing to unintended memory locations, even virtual address
> NULL, does not seem good.
> 
> Any feedback from more experienced PG developers as how to best handle
> this case would be welcome.
> 
> Regards,
> 
> Karl <kop@meme.com>
> Free Software:  "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein
> 




Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 11/12/2016 à 04:38, Karl O. Pinc a écrit :
> On Sat, 10 Dec 2016 19:41:21 -0600
> "Karl O. Pinc" <kop@meme.com> wrote:
>
>> On Fri, 9 Dec 2016 23:36:12 -0600
>> "Karl O. Pinc" <kop@meme.com> wrote:
>>
>>> Instead I propose (code I have not actually executed):
>>> ...
>>> char    lbuffer[MAXPGPATH];
>>> char    *log_format = lbuffer;
>>> ...
>>>
>>>     /* extract log format and log file path from the line */
>>>     log_filepath = strchr(lbuffer, ' ');  /* lbuffer == log_format
>>> */ *log_filepath = '\0';                 /* terminate log_format */
>>>     log_filepath++;                       /* start of file path */
>>>     log_filepath[strcspn(log_filepath, "\n")] = '\0';  
>> Er, I guess I prefer the more paranoid, just because who knows
>> what might have manged to somehow write the file that's read
>> into lbuffer:
>>
>> ...
>> char    lbuffer[MAXPGPATH];
>> char    *log_format = lbuffer;
>> ...
>>
>>     /* extract log format and log file path from the line */
>>     if (log_filepath = strchr(lbuffer, ' ')) /* lbuffer == log_format
>> */ *log_filepath = '\0';                /* terminate log_format */
>>     log_filepath++;                          /* start of file path */
>>     log_filepath[strcspn(log_filepath, "\n")] = '\0';
> *sigh*
>
>
> ...
> char    lbuffer[MAXPGPATH];
> char    *log_format = lbuffer;
> ...
>
>     /* extract log format and log file path from the line */
>     /* lbuffer == log_format, they share storage */
>     if (log_filepath = strchr(lbuffer, ' '))
>         *log_filepath = '\0';   /* terminate log_format */
>     else
>     {
>         /* Unknown format, no space. Return NULL to caller. */
>         lbuffer[0] = '\0';
>         break;
>     }
>     log_filepath++;              /* start of file path   */
>     log_filepath[strcspn(log_filepath, "\n")] = '\0';
>

Applied in last version of the patch v18 as well as removing of an
unused variable.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Mon, Dec 12, 2016 at 4:31 AM, Gilles Darold <gilles.darold@dalibo.com> wrote:
> Applied in last version of the patch v18 as well as removing of an
> unused variable.

OK, this looks a lot closer to being committable.  But:

@@ -570,11 +583,13 @@ SysLogger_Start(void)     * a time-based rotation.     */    first_syslogger_file_time =
time(NULL);
-    filename = logfile_getname(first_syslogger_file_time, NULL);
+    last_file_name = logfile_getname(first_syslogger_file_time, NULL);
+
+    syslogFile = logfile_open(last_file_name, "a", false);

-    syslogFile = logfile_open(filename, "a", false);
+    update_metainfo_datafile();

-    pfree(filename);
+    pfree(last_file_name);
#ifdef EXEC_BACKEND    switch ((sysloggerPid = syslogger_forkexec()))

This leaves last_file_name pointing to free'd storage, which seems
like a recipe for bugs, because the next call to
update_metainfo_datafile() might try to follow that pointer.  I think
you need to make a clear decision about what the contract is for
last_file_name and last_csv_file_name.  Perhaps the idea is to always
keep them valid, but then you need to free the old value when
reassigning them and not free the current value.

The changes to logfile_rotate() appear to be mostly unrelated
refactoring which Karl was proposing for separate commit, not to be
incorporated into this patch.  Please remove whatever deltas are not
essential to the new feature being implemented.

misc.c no longer needs to add an include of <sys/stat.h>.  I hope, anyway.

+                     errmsg("log format not supported, possible
values are stderr or csvlog")));

This doesn't follow our message style guidelines.
https://www.postgresql.org/docs/devel/static/error-style-guide.html

Basically, a comma splice is not an acceptable way of joining together
two independent thoughts.  Obviously, people speak and write that way
informally all the time, but we try to be a bit rigorous about grammar
in our documentation and error messages.  I think the way to do this
would be something like errmsg("log format \"%s\" is not supported"),
errhint("The supported log formats are \"stderr\" and \"csvlog\".").

+            FreeFile(fd);
+            ereport(ERROR,
+                    (errcode_for_file_access(),
+                     errmsg("could not read file \"%s\": %m",
+                            LOG_METAINFO_DATAFILE)));

You don't need to FreeFile(fd) before ereport(ERROR).  See header
comments for AllocateFile().

+        /* report the entry corresponding to the requested format */
+        if (strcmp(logfmt, log_format) == 0)
+            break;
+    }
+    /* Close the current log filename file. */
+    if (FreeFile(fd))
+        ereport(ERROR,
+                (errcode_for_file_access(),
+                 errmsg("could not close file \"%s\": %m",
+                    LOG_METAINFO_DATAFILE)));
+
+    if (lbuffer[0] == '\0')
+        PG_RETURN_NULL();
+
+    /* Recheck requested log format against the one extracted from the file */
+    if (logfmt != NULL && (log_format == NULL ||
+                            strcmp(logfmt, log_format) != 0))
+        PG_RETURN_NULL();

Your email upthread claims that you fixed this per my suggestions, but
it doesn't look fixed to me.  That check to see whether lbuffer[0] ==
'\0' is either protecting against nothing at all (in which case it
could be removed) or it's protecting against some weirdness that can
only happen here because of the strange way this logic is written.
It's hard to understand all the cases here because we can exit the
loop either having found the entry we want or not, and there's no
clear indication of which one it is.  Why not change the if-statement
at the end of the loop like this:

if (logfmt == NULL || strcmp(logfmt, log_format) == 0)
{   FreeFile(fd);   PG_RETURN_TEXT_P(cstring_to_text(log_filepath));
}

Then after the loop, just:

FreeFile(fd);
PG_RETURN_NULL();

+  <para>
+  A file recording the log file(s) currently written to by the syslogger
+  and the file's log formats, <systemitem>stderr</>
+  or <systemitem>csvlog</>.  Each line of the file is a space separated list of
+  two elements: the log format and the full path to the log file including the
+  value of <xref linkend="guc-log-directory">.  The log format must be present
+  in <xref linkend="guc-log-destination"> to be listed in
+  <filename>current_logfiles</filename>.
+  </para>
+
+  <note>
+    <indexterm>
+      <primary><application>pg_ctl</application></primary>
+      <secondary>and <filename>current_logfiles</filename></secondary>
+    </indexterm>
+    <indexterm>
+      <primary><filename>stderr</filename></primary>
+      <secondary>and <filename>current_logfiles</filename></secondary>
+    </indexterm>
+    <indexterm>
+      <primary>log_destination configuration parameter</primary>
+      <secondary>and <filename>current_logfiles</filename></secondary>
+    </indexterm>
+
+    <para>
+    Although logs directed to <filename>stderr</filename> may be written
+    to the filesystem, when the writing of <filename>stderr</filename> is
+    managed outside of the <productname>PostgreSQL</productname> database
+    server the location of such files in the filesystem is not reflected in
+    the content of <filename>current_logfiles</filename>.  One such case is
+    when the <application>pg_ctl</application> command is used to start
+    the <command>postgres</command> database server, capture
+    the <filename>stderr</filename> output of the server, and direct it to a
+    file.
+    </para>
+
+    <para>
+    There are other notable situations related
+    to <filename>stderr</filename> logging.
+    Non-<productname>PostgreSQL</productname> log sources, such as 3rd party
+    libraries, which deliver error messages directly
+    to <filename>stderr</filename> are always logged
+    by <productname>PostgreSQL</productname>
+    to <filename>stderr</filename>.  <Filename>Stderr</Filename> is also the
+    destination for any incomplete log messages produced by
+    <productname>PostgreSQL</productname>.  When
+    <systemitem>stderr</systemitem> is not in
+    <xref linkend="guc-log-destination">,
+    <filename>current_logfiles</filename> does not not contain the name of the
+    file where these sorts of log messages are written.
+    </para>
+  </note>
+
+  <para>
+  The <filename>current_logfiles</filename> file
+  is present only when <xref linkend="guc-logging-collector"> is
+  activated and when at least one of <systemitem>stderr</> or
+  <systemitem>csvlog</> value is present in
+  <xref linkend="guc-log-destination">.
+  </para>
+ </entry>

This seems unnecessarily long-winded to me.  I would just remove the
entire <note>, which seems like a lengthy digression mostly on topics
not really related to the contents of current_logfiles.  The text
above and below make it clear enough that this file is only capturing
what the syslogger is doing and that it is based on
guc-log-destination; we don't need to recapitulate that a third and a
fourth time.  If the logging collector (syslogger) is turned on, it's
going to capture everything that pops out on stderr, because every
backend's stderr will have been redirected to the collector.  If it's
not, then the current_logfiles metafile won't be populated.  Other
things that might happen in other situations might need to be added to
the documentation someplace, but not in this patch and not here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Thu, Dec 8, 2016 at 4:34 PM, Karl O. Pinc <kop@meme.com> wrote:
>>  I do think that the blizzard of small patches that you've
>> submitted in some of your emails may possibly be a bit overwhelming.
>> We shouldn't really need a stack of a dozen or more patches to
>> implement one small feature.  Declarative partitioning only had 7.
>> Why does this need more than twice that number?
>
> I'm trying to break up changes into patches which are as small
> as possible to make them more easily understood by providing
> a guide for the reader/reviewer.  So rather than
> a single patch I'd make, say, 3.  One for documentation to describe
> the change.  Another which does whatever refactoring is necessary
> to the existing code, but which otherwise does not introduce any
> functional changes.  And another which adds the new code which makes
> use of the refactoring.  At each stage the code should continue
> to work without bugs.  The other party can then decide to incorporate
> the patchs into the main patch, which is itself another attached
> patch.

Sure, I don't think the intention is bad.  It didn't really work out
here, perhaps because the individual changes were too small.  I think
for small changes it's often more helpful to say "change X to be like
Y" than to provide a patch, or else it's worth combining several small
patches just to keep the number from getting too big.  I don't know; I
can't point to anything you really did wrong here; the process just
didn't seem to be converging.

> It is worth noting that the PG pg_current_logfile() function
> is dependent upon the content of the current_logfiles file.  So
> if the file is wrong the function will also give wrong answers.
> But I don't care if you don't.

Well, I want the current_logfiles file to be correct, but that doesn't
mean that it's reasonable to expect people to enumerate all the
logfiles that have ever existed by running it.  That would be fragile
for tons of reasons, like whether or not the server hits the
connection limit at the wrong time, whether network connectivity is
unimpaired, whether the cron job that's supposed to run it
periodically hiccups for some stupid reason, and many others.

> A related thought is this.  There are 3 abstraction levels of
> concern.  The level beneath PG, the PG level, and the level
> of the user's application.  When PG detects an error from either
> "above" or "below" its job is to describe the problem from
> its own perspective.  HINTs are attempts to cross the boundary
> into the application's perspective.

Yes, I agree.  EnterpriseDB occasionally get complaints from our
customers saying that PG is buggy because it reported an operating
system error.  No, really, if the operating system can't read the
file, that's not our fault!  Call your sysadmin!  I also agree with
your perspective on HINTs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Gilles,

I don't plan to be able to get back to this patch until late
this week or early next week.  My plan is to then go though
the whole thing and fix everything I can find.  If we're both
working at the same time this could lead to wasted effort
so I will write as soon as I start work and if you are working
at the same time I'll send smaller hunks.

By the by, my last email contained some stupidity in it's
code suggestion because it is structured like:

while if (...)   do something; else   break; do more...;

Stupid to have an if/else construct.

while if (!...)   break; do something; do more...;

Is cleaner.  Apologies if my coding out loud is confusing things.
I'll fix this in the next go-round if you don't get to it first.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Wed, Dec 14, 2016 at 12:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Dec 12, 2016 at 4:31 AM, Gilles Darold <gilles.darold@dalibo.com> wrote:
>> Applied in last version of the patch v18 as well as removing of an
>> unused variable.
>
> OK, this looks a lot closer to being committable.  But:
>
> [long review]

Gilles, could you update the patch based on this review from Robert? I
am marking the patch as "waiting on author" for the time being. I
looked a bit at the patch but did not notice anything wrong with it,
but let's see with a fresh version...
-- 
Michael



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 11/01/2017 à 08:39, Michael Paquier a écrit :
> On Wed, Dec 14, 2016 at 12:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Dec 12, 2016 at 4:31 AM, Gilles Darold <gilles.darold@dalibo.com> wrote:
>>> Applied in last version of the patch v18 as well as removing of an
>>> unused variable.
>> OK, this looks a lot closer to being committable.  But:
>>
>> [long review]
>  
> Gilles, could you update the patch based on this review from Robert? I
> am marking the patch as "waiting on author" for the time being. I
> looked a bit at the patch but did not notice anything wrong with it,
> but let's see with a fresh version...
Hi,

My bad, I was thinking that Karl has planned an update of the patch in
his last post, sorry for my misunderstanding.

Here is attached v19 of the patch that might fix all comment from
Robert's last review.

Best regards,

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Thu, 12 Jan 2017 13:14:28 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> My bad, I was thinking that Karl has planned an update of the patch in
> his last post, sorry for my misunderstanding.

I was, but have been swept along by events and not gotten to it.



Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Thu, Jan 12, 2017 at 9:55 PM, Karl O. Pinc <kop@meme.com> wrote:
> On Thu, 12 Jan 2017 13:14:28 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> My bad, I was thinking that Karl has planned an update of the patch in
>> his last post, sorry for my misunderstanding.
>
> I was, but have been swept along by events and not gotten to it.

No problem. Thanks for the update.
-- 
Michael



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Thu, Jan 12, 2017 at 9:14 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
> Le 11/01/2017 à 08:39, Michael Paquier a écrit :
>> On Wed, Dec 14, 2016 at 12:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Mon, Dec 12, 2016 at 4:31 AM, Gilles Darold <gilles.darold@dalibo.com> wrote:
>>>> Applied in last version of the patch v18 as well as removing of an
>>>> unused variable.
>>> OK, this looks a lot closer to being committable.  But:
>>>
>>> [long review]
>>
>> Gilles, could you update the patch based on this review from Robert? I
>> am marking the patch as "waiting on author" for the time being. I
>> looked a bit at the patch but did not notice anything wrong with it,
>> but let's see with a fresh version...
> Hi,
>
> My bad, I was thinking that Karl has planned an update of the patch in
> his last post, sorry for my misunderstanding.
>
> Here is attached v19 of the patch that might fix all comment from
> Robert's last review.

OK. I have been looking at this patch.

git diff --check is very unhappy, particularly because of
update_metainfo_datafile().

+    <para>
+    When logs are written to the file-system their paths, names, and
+    types are recorded in
+    the <xref linkend="storage-pgdata-current-logfiles"> file.  This provides
+    a convenient way to find and access log content without establishing a
+    database connection.
+    </para>
File system is used as a name here. In short, "file-system" -> "file
system". I think that it would be a good idea to show up the output of
this file. This could be reworded a bit:
"When logs are written to the file system, the <xref
linkend="storage-pgdata-current-logfiles"> file includes the location,
the name and the type of the logs currently in use. This provides a
convenient way to find the logs currently in used by the instance."

@@ -15441,6 +15441,19 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY
AS t(ls,n);      </row>
      <row>
+       <entry><literal><function>pg_current_logfile()</function></literal></entry>
+       <entry><type>text</type></entry>
+       <entry>primary log file name in use by the logging collector</entry>
+      </row>
+
+      <row>
+       <entry><literal><function>pg_current_logfile(<type>text</>)</function></literal></entry>
+       <entry><type>text</type></entry>
+       <entry>log file name, of log in the requested format, in use by the
+       logging collector</entry>
+      </row>
You could just use one line, using squared brackets for the additional
argument. The description looks imprecise, perhaps something like that
would be better: "Log file name currently in use by the logging
collector".

+/* in backend/utils/adt/misc.c and backend/postmaster/syslogger.c */
+/*
+ * Name of file holding the paths, names, and types of the files where current
+ * log messages are written, when the log collector is enabled.  Useful
+ * outside of Postgres when finding the name of the current log file in the
+ * case of time-based log rotation.
+ */
Not mentioning the file names here would be better. If this spreads in
the future updates would likely be forgotten. This paragraph could be
reworked: "configuration file saving meta-data information about the
log files currently in use by the system logging process".

--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -466,5 +466,4 @@ extern bool has_rolreplication(Oid roleid);/* in access/transam/xlog.c */extern bool
BackupInProgress(void);externvoid CancelBackup(void); 
-#endif   /* MISCADMIN_H */
Unrelated diff.

+           /*
+            * Force rewriting last log filename when reloading configuration,
+            * even if rotation_requested is false, log_destination may have
+            * been changed and we don't want to wait the next file rotation.
+            */
+           update_metainfo_datafile();
+       }
I know I am famous for being noisy on such things, but please be
careful with newline use..

+       else
+       {
+           /* Unknown format, no space. Return NULL to caller. */
+           lbuffer[0] = '\0';
+           break;
+       }
Hm. I would think that an ERROR is more useful to the user here.

Perhaps pg_current_logfile() should be marked as STRICT?

Would it make sense to have the 0-argument version be a SRF returning
rows of '(log_destination,location)'? We could just live with this
function I think, and let the caller filter by logging method.

+    is <literal>NULL</literal>.  When multiple logfiles exist, each in a
+    different format, <function>pg_current_logfile</function> called
s/logfiles/log files/

Surely the temporary file of current_logfiles should not be included
in base backups (look at basebackup.c). Documentation needs to reflect
that as well. Also, should we have current_logfiles in a base backup?
I would think no.

pg_current_logfile() can be run by *any* user. We had better revoke
its access in system_views.sql!

I have been playing with this patch a bit, and could not make the
system crash :(
--
Michael



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 13/01/2017 à 05:26, Michael Paquier a écrit :
> OK. I have been looking at this patch.
>
> git diff --check is very unhappy, particularly because of
> update_metainfo_datafile().
Sorry, fixed.

> +    <para>
> +    When logs are written to the file-system their paths, names, and
> +    types are recorded in
> +    the <xref linkend="storage-pgdata-current-logfiles"> file.  This provides
> +    a convenient way to find and access log content without establishing a
> +    database connection.
> +    </para>
> File system is used as a name here. In short, "file-system" -> "file
> system". I think that it would be a good idea to show up the output of
> this file. This could be reworded a bit:
> "When logs are written to the file system, the <xref
> linkend="storage-pgdata-current-logfiles"> file includes the location,
> the name and the type of the logs currently in use. This provides a
> convenient way to find the logs currently in used by the instance."
Changes applied. Output example added:

 <programlisting>
$ cat $PGDATA/current_logfiles
stderr pg_log/postgresql-2017-01-13_085812.log
</programlisting>


> @@ -15441,6 +15441,19 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY
> AS t(ls,n);
>        </row>
>
>        <row>
> +       <entry><literal><function>pg_current_logfile()</function></literal></entry>
> +       <entry><type>text</type></entry>
> +       <entry>primary log file name in use by the logging collector</entry>
> +      </row>
> +
> +      <row>
> +       <entry><literal><function>pg_current_logfile(<type>text</>)</function></literal></entry>
> +       <entry><type>text</type></entry>
> +       <entry>log file name, of log in the requested format, in use by the
> +       logging collector</entry>
> +      </row>
> You could just use one line, using squared brackets for the additional
> argument. The description looks imprecise, perhaps something like that
> would be better: "Log file name currently in use by the logging
> collector".
OK, back to a single row entry with optional parameter.

> +/* in backend/utils/adt/misc.c and backend/postmaster/syslogger.c */
> +/*
> + * Name of file holding the paths, names, and types of the files where current
> + * log messages are written, when the log collector is enabled.  Useful
> + * outside of Postgres when finding the name of the current log file in the
> + * case of time-based log rotation.
> + */
> Not mentioning the file names here would be better. If this spreads in
> the future updates would likely be forgotten. This paragraph could be
> reworked: "configuration file saving meta-data information about the
> log files currently in use by the system logging process".
Removed.

> --- a/src/include/miscadmin.h
> +++ b/src/include/miscadmin.h
> @@ -466,5 +466,4 @@ extern bool has_rolreplication(Oid roleid);
>  /* in access/transam/xlog.c */
>  extern bool BackupInProgress(void);
>  extern void CancelBackup(void);
> -
>  #endif   /* MISCADMIN_H */
> Unrelated diff.
Removed

> +           /*
> +            * Force rewriting last log filename when reloading configuration,
> +            * even if rotation_requested is false, log_destination may have
> +            * been changed and we don't want to wait the next file rotation.
> +            */
> +           update_metainfo_datafile();
> +
>         }
> I know I am famous for being noisy on such things, but please be
> careful with newline use..
That's ok for me, unnecessary newline removed.

> +       else
> +       {
> +           /* Unknown format, no space. Return NULL to caller. */
> +           lbuffer[0] = '\0';
> +           break;
> +       }
> Hm. I would think that an ERROR is more useful to the user here.
The problem addressed here might never happen unless file corruption but
if you know an error code that can be use in this case I will add the
error message. I can not find any error code usable in this case.

> Perhaps pg_current_logfile() should be marked as STRICT?
I don't think so, the log format parameter is optional, and passing NULL
might be like passing no parameter.

> Would it make sense to have the 0-argument version be a SRF returning
> rows of '(log_destination,location)'? We could just live with this
> function I think, and let the caller filter by logging method.
No, this case have already been eliminated during the previous review.

> +    is <literal>NULL</literal>.  When multiple logfiles exist, each in a
> +    different format, <function>pg_current_logfile</function> called
> s/logfiles/log files/.
Applied.

> Surely the temporary file of current_logfiles should not be included
> in base backups (look at basebackup.c). Documentation needs to reflect
> that as well. Also, should we have current_logfiles in a base backup?
> I would think no.
Done but can't find any documentation about the file exclusion, do you
have a pointer?

> pg_current_logfile() can be run by *any* user. We had better revoke
> its access in system_views.sql!
Why? There is no special information returned by this function unless
the path but it can be retrieve using show log_directory.

Attached is patch v20.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Fri, Jan 13, 2017 at 5:48 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
> Le 13/01/2017 à 05:26, Michael Paquier a écrit :
>> Surely the temporary file of current_logfiles should not be included
>> in base backups (look at basebackup.c). Documentation needs to reflect
>> that as well. Also, should we have current_logfiles in a base backup?
>> I would think no.
> Done but can't find any documentation about the file exclusion, do you
> have a pointer?

protocol.sgml, in the protocol-replication part. You could search for
the paragraph that contains postmaster.opts. There is no real harm in
including current_logfiles in base backups, but that's really in the
same bag as postmaster.opts or postmaster.pid, particularly if the log
file name has a timestamp.

>> pg_current_logfile() can be run by *any* user. We had better revoke
>> its access in system_views.sql!
> Why? There is no special information returned by this function unless
> the path but it can be retrieve using show log_directory.

log_directory could be an absolute path, and we surely don't want to
let normal users have a look at it. For example, "show log_directory"
can only be seen by superusers. As Stephen Frost is for a couple of
months (years?) on a holly war path against super-user checks in
system functions, I think that revoking the access to this function is
the best thing to do. It is as well easier to restrict first and
relax, the reverse is harder to justify from a compatibility point of
view.
--
Michael



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Kevin Grittner
Дата:
On Fri, Jan 13, 2017 at 7:09 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

> There is no real harm in including current_logfiles in base
> backups, but that's really in the same bag as postmaster.opts or
> postmaster.pid, particularly if the log file name has a
> timestamp.

I'm going to dispute that -- if postmaster.opts and postmaster.pid
are present when you restore, it takes away a level of insurance
against restoring a corrupted image of the database without knowing
it.  In particular, if the backup_label file is deleted (which
happens with alarmingly frequency), the startup code may think it
is dealing with a cluster that crashed rather than with a restore
of a backup.  This often leads to corruption (anything from
"database can't start" to subtle index corruption that isn't
noticed for months).  The presence of log files from the time of
the backup do not present a similar hazard.

So while I agree that there is no harm in including
current_logfiles in base backups, I object to the comparisons to
the more dangerous files.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Sat, Jan 14, 2017 at 1:43 AM, Kevin Grittner <kgrittn@gmail.com> wrote:
> On Fri, Jan 13, 2017 at 7:09 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> There is no real harm in including current_logfiles in base
>> backups, but that's really in the same bag as postmaster.opts or
>> postmaster.pid, particularly if the log file name has a
>> timestamp.
>
> I'm going to dispute that -- if postmaster.opts and postmaster.pid
> are present when you restore, it takes away a level of insurance
> against restoring a corrupted image of the database without knowing
> it.  In particular, if the backup_label file is deleted (which
> happens with alarmingly frequency), the startup code may think it
> is dealing with a cluster that crashed rather than with a restore
> of a backup.  This often leads to corruption (anything from
> "database can't start" to subtle index corruption that isn't
> noticed for months).  The presence of log files from the time of
> the backup do not present a similar hazard.
>
> So while I agree that there is no harm in including
> current_logfiles in base backups, I object to the comparisons to
> the more dangerous files.

Good point.
-- 
Michael



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 13/01/2017 à 14:09, Michael Paquier a écrit :
> On Fri, Jan 13, 2017 at 5:48 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
>> Le 13/01/2017 à 05:26, Michael Paquier a écrit :
>>> Surely the temporary file of current_logfiles should not be included
>>> in base backups (look at basebackup.c). Documentation needs to reflect
>>> that as well. Also, should we have current_logfiles in a base backup?
>>> I would think no.
>> Done but can't find any documentation about the file exclusion, do you
>> have a pointer?
> protocol.sgml, in the protocol-replication part. You could search for
> the paragraph that contains postmaster.opts. There is no real harm in
> including current_logfiles in base backups, but that's really in the
> same bag as postmaster.opts or postmaster.pid, particularly if the log
> file name has a timestamp.
Thanks for the pointer. After reading this part I think it might only
concern files that are critical at restore time. I still think that the
file might not be removed automatically from the backup. If it is
restored it has strictly no impact at all, it will be removed or
overwritten at startup. We can let the user choose to remove it from the
backup or not, the file can be an help to find the latest log file
related to a backup.

>
>>> pg_current_logfile() can be run by *any* user. We had better revoke
>>> its access in system_views.sql!
>> Why? There is no special information returned by this function unless
>> the path but it can be retrieve using show log_directory.
> log_directory could be an absolute path, and we surely don't want to
> let normal users have a look at it. For example, "show log_directory"
> can only be seen by superusers. As Stephen Frost is for a couple of
> months (years?) on a holly war path against super-user checks in
> system functions, I think that revoking the access to this function is
> the best thing to do. It is as well easier to restrict first and
> relax, the reverse is harder to justify from a compatibility point of
> view.

Right, I've append a REVOKE to the function in file
backend/catalog/system_views.sql, patch v21 reflect this change.

Thanks.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Sat, Jan 14, 2017 at 10:02 PM, Gilles Darold
<gilles.darold@dalibo.com> wrote:
> Le 13/01/2017 à 14:09, Michael Paquier a écrit :
>> On Fri, Jan 13, 2017 at 5:48 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
>>> Le 13/01/2017 à 05:26, Michael Paquier a écrit :
>>>> Surely the temporary file of current_logfiles should not be included
>>>> in base backups (look at basebackup.c). Documentation needs to reflect
>>>> that as well. Also, should we have current_logfiles in a base backup?
>>>> I would think no.
>>> Done but can't find any documentation about the file exclusion, do you
>>> have a pointer?
>> protocol.sgml, in the protocol-replication part. You could search for
>> the paragraph that contains postmaster.opts. There is no real harm in
>> including current_logfiles in base backups, but that's really in the
>> same bag as postmaster.opts or postmaster.pid, particularly if the log
>> file name has a timestamp.
>
> Thanks for the pointer. After reading this part I think it might only
> concern files that are critical at restore time. I still think that the
> file might not be removed automatically from the backup. If it is
> restored it has strictly no impact at all, it will be removed or
> overwritten at startup. We can let the user choose to remove it from the
> backup or not, the file can be an help to find the latest log file
> related to a backup.

OK, no problem for me. I can see that your patch does the right thing
to ignore the current_logfiles.tmp. Could you please update
t/010_pg_basebackup.pl and add this file to the list of files filled
with DONOTCOPY?

>>>> pg_current_logfile() can be run by *any* user. We had better revoke
>>>> its access in system_views.sql!
>>> Why? There is no special information returned by this function unless
>>> the path but it can be retrieve using show log_directory.
>> log_directory could be an absolute path, and we surely don't want to
>> let normal users have a look at it. For example, "show log_directory"
>> can only be seen by superusers. As Stephen Frost is for a couple of
>> months (years?) on a holly war path against super-user checks in
>> system functions, I think that revoking the access to this function is
>> the best thing to do. It is as well easier to restrict first and
>> relax, the reverse is harder to justify from a compatibility point of
>> view.
>
> Right, I've append a REVOKE to the function in file
> backend/catalog/system_views.sql, patch v21 reflect this change.

Thanks. That looks good.

+   {
+       /* Logging collector is not enabled. We don't know where messages are
+        * logged.  Remove outdated file holding the current log filenames.
+        */
+       unlink(LOG_METAINFO_DATAFILE);       return 0
This comment format is not postgres-like.

I think that's all I have for this patch.
--
Michael



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 15/01/2017 à 06:54, Michael Paquier a écrit :
> On Sat, Jan 14, 2017 at 10:02 PM, Gilles Darold
> <gilles.darold@dalibo.com> wrote:
>> Le 13/01/2017 à 14:09, Michael Paquier a écrit :
>>> On Fri, Jan 13, 2017 at 5:48 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
>>>> Le 13/01/2017 à 05:26, Michael Paquier a écrit :
>>>>> Surely the temporary file of current_logfiles should not be included
>>>>> in base backups (look at basebackup.c). Documentation needs to reflect
>>>>> that as well. Also, should we have current_logfiles in a base backup?
>>>>> I would think no.
>>>> Done but can't find any documentation about the file exclusion, do you
>>>> have a pointer?
>>> protocol.sgml, in the protocol-replication part. You could search for
>>> the paragraph that contains postmaster.opts. There is no real harm in
>>> including current_logfiles in base backups, but that's really in the
>>> same bag as postmaster.opts or postmaster.pid, particularly if the log
>>> file name has a timestamp.
>> Thanks for the pointer. After reading this part I think it might only
>> concern files that are critical at restore time. I still think that the
>> file might not be removed automatically from the backup. If it is
>> restored it has strictly no impact at all, it will be removed or
>> overwritten at startup. We can let the user choose to remove it from the
>> backup or not, the file can be an help to find the latest log file
>> related to a backup.
> OK, no problem for me. I can see that your patch does the right thing
> to ignore the current_logfiles.tmp. Could you please update
> t/010_pg_basebackup.pl and add this file to the list of files filled
> with DONOTCOPY?
Applied in attached patch v22.

>>>>> pg_current_logfile() can be run by *any* user. We had better revoke
>>>>> its access in system_views.sql!
>>>> Why? There is no special information returned by this function unless
>>>> the path but it can be retrieve using show log_directory.
>>> log_directory could be an absolute path, and we surely don't want to
>>> let normal users have a look at it. For example, "show log_directory"
>>> can only be seen by superusers. As Stephen Frost is for a couple of
>>> months (years?) on a holly war path against super-user checks in
>>> system functions, I think that revoking the access to this function is
>>> the best thing to do. It is as well easier to restrict first and
>>> relax, the reverse is harder to justify from a compatibility point of
>>> view.
>> Right, I've append a REVOKE to the function in file
>> backend/catalog/system_views.sql, patch v21 reflect this change.
> Thanks. That looks good.
>
> +   {
> +       /* Logging collector is not enabled. We don't know where messages are
> +        * logged.  Remove outdated file holding the current log filenames.
> +        */
> +       unlink(LOG_METAINFO_DATAFILE);
>         return 0
> This comment format is not postgres-like.

Fixed too.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Sun, 15 Jan 2017 14:54:44 +0900
Michael Paquier <michael.paquier@gmail.com> wrote:

> I think that's all I have for this patch.

I'd like to submit with it an addendum patch that
makes symbols out of some constants.  I'll see if I can
get that done soon.


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Sun, 15 Jan 2017 07:20:40 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> On Sun, 15 Jan 2017 14:54:44 +0900
> Michael Paquier <michael.paquier@gmail.com> wrote:
> 
> > I think that's all I have for this patch.  

> I'd like to submit with it an addendum patch that
> makes symbols out of some constants.  I'll see if I can
> get that done soon.

Attached is a version 23 of the patch.  The only change
is follow-through and cleanup of code posted in-line in emails.

  src/backend/utils/adt/misc.c | 13 ++++++-------
  1 file changed, 6 insertions(+), 7 deletions(-)

This includes making comments into full sentences.


I do have a question here regards code formatting.
The patch now contains:

    if (log_filepath == NULL)
    {
        /* Bad data. Avoid segfaults etc. and return NULL to caller. */
        break;
    }

I'm not sure how this fits in with PG coding style,
whether the {} should be removed or what.  I've looked
around and can't find an example of an if with a single
line then block and a comment.  Maybe this means that
I shouldn't be doing this, but if I put the comment above
the if then it will "run into" the comment block which
immediately precedes the above code which describes
a larger body of code.  So perhaps someone should look
at this and tell me how to improve it.


Attached also are 2 patches which abstract some hardcoded
constants into symbols.  Whether to apply them is a matter
of style and left to the maintainers, which is why these
are separate patches.  

The first patch changes only code on the master
branch, the 2nd patch changes the new code.


I have not looked further at the patch or checked
to see that all changes previously recommended have been
made.  Michael, if you're confident that everything is good
go ahead and move the patchs forward to the maintainers.  Otherwise
please let me know and I'll see about finding time
for further review.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Mon, Jan 16, 2017 at 1:14 PM, Karl O. Pinc <kop@meme.com> wrote:
> I do have a question here regards code formatting.
> The patch now contains:
>
>     if (log_filepath == NULL)
>     {
>         /* Bad data. Avoid segfaults etc. and return NULL to caller. */
>         break;
>     }
>
> I'm not sure how this fits in with PG coding style,
> whether the {} should be removed or what.  I've looked
> around and can't find an example of an if with a single
> line then block and a comment.  Maybe this means that
> I shouldn't be doing this, but if I put the comment above
> the if then it will "run into" the comment block which
> immediately precedes the above code which describes
> a larger body of code.  So perhaps someone should look
> at this and tell me how to improve it.

Including brackets in this case makes a more readable code. That's the
pattern respected the most in PG code. See for example
XLogInsertRecord():
    else
    {
        /*
         * This was an xlog-switch record, but the current insert location was
         * already exactly at the beginning of a segment, so there was no need
         * to do anything.
         */
    }

> Attached also are 2 patches which abstract some hardcoded
> constants into symbols.  Whether to apply them is a matter
> of style and left to the maintainers, which is why these
> are separate patches.

Making the strings csvlog, stderr and eventlog variables? Why not
because the patch presented here uses them in new places. Now it is
not like it is a huge maintenance burden to keep them as strings, so I
would personally not bother much.

> The first patch changes only code on the master
> branch, the 2nd patch changes the new code.

Thanks for keeping things separated.

> I have not looked further at the patch or checked
> to see that all changes previously recommended have been
> made.  Michael, if you're confident that everything is good
> go ahead and move the patchs forward to the maintainers.  Otherwise
> please let me know and I'll see about finding time
> for further review.

OK. I have done a round of hands-on review on this patch, finishing
with the attached. In the things I have done:
- Elimination of useless noise diff
- Fixes some typos (logile, log file log, etc.)
- Adjusted docs.
- Docs were overdoing in storage.sgml. Let's keep the description simple.
- Having a paragraph at the beginning of "Error Reporting and Logging"
in config.sgml does not look like a good idea to me. As the updates of
current_logfiles is associated with log_destination I'd rather see
this part added in the description of the GUC.
- Missed a (void) in the definition of update_metainfo_datafile().
- Moved update_metainfo_datafile() before the signal handler routines
in syslogger.c for clarity.
- The last "return" is useless in update_metainfo_datafile().
- In pg_current_logfile(), it is useless to call PG_RETURN_NULL after
emitting an ERROR message.
- Two calls to FreeFile(fd) have been forgotten in
pg_current_logfile(). In one case, errno needs to be saved beforehand
to be sure that the correct error string is generated for the user.
- A portion of 010_pg_basebackup.pl was not updated.
- Incorrect header ordering in basebackup.c.
- Decoding of CR and LF characters seem to work fine when
log_file_name include some.

With all those issues fixed, I finish with the attached, that I am
fine to pass down to a committer. I still think that this should be
only one function using a SRF with rows shaped as (type, path) for
simplicity, but as I am visibly outnumbered I won't insist further.
Also, I would rather see an ERROR returned to the user in case of bad
data in current_logfiles, I did not change that either as that's the
original intention of Gilles.
-- 
Michael

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:


On January 15, 2017 11:47:51 PM CST, Michael Paquier <michael.paquier@gmail.com> wrote:
On Mon, Jan 16, 2017 at 1:14 PM, Karl O. Pinc <kop@meme.com> wrote:
I do have a question here regards code formatting.
The patch now contains:

if (log_filepath == NULL)
{
/* Bad data. Avoid segfaults etc. and return NULL to caller. */
break;
}

I'm not sure how this fits in with PG coding style,
whether the {} should be removed or what. I've looked
around and can't find an example of an if with a single
line then block and a comment. Maybe this means that
I shouldn't be doing this, but if I put the comment above
the if then it will "run into" the comment block which
immediately precedes the above code which describes
a larger body of code. So perhaps someone should look
at this and tell me how to improve it.

Including brackets in this case makes a more readable code. That's the
pattern respected the most in PG code. See for example
XLogInsertRecord():
else
{
/*
* This was an xlog-switch record, but the current insert location was
* already exactly at the beginning of a segment, so there was no need
* to do anything.
*/
}

Attached also are 2 patches which abstract some hardcoded
constants into symbols. Whether to apply them is a matter
of style and left to the maintainers, which is why these
are separate patches.

Making the strings csvlog, stderr and eventlog variables? Why not
because the patch presented here uses them in new places. Now it is
not like it is a huge maintenance burden to keep them as strings, so I
would personally not bother much.

The first patch changes only code on the master
branch, the 2nd patch changes the new code.

Thanks for keeping things separated.

I have not looked further at the patch or checked
to see that all changes previously recommended have been
made. Michael, if you're confident that everything is good
go ahead and move the patchs forward to the maintainers. Otherwise
please let me know and I'll see about finding time
for further review.

OK. I have done a round of hands-on review on this patch, finishing
with the attached. In the things I have done:
- Elimination of useless noise diff
- Fixes some typos (logile, log file log, etc.)
- Adjusted docs.
- Docs were overdoing in storage.sgml. Let's keep the description simple.
- Having a paragraph at the beginning of "Error Reporting and Logging"
in config.sgml does not look like a good idea to me. As the updates of
current_logfiles is associated with log_destination I'd rather see
this part added in the description of the GUC.
- Missed a (void) in the definition of update_metainfo_datafile().
- Moved update_metainfo_datafile() before the signal handler routines
in syslogger.c for clarity.
- The last "return" is useless in update_metainfo_datafile().
- In pg_current_logfile(), it is useless to call PG_RETURN_NULL after
emitting an ERROR message.
- Two calls to FreeFile(fd) have been forgotten in
pg_current_logfile(). In one case, errno needs to be saved beforehand
to be sure that the correct error string is generated for the user.
- A portion of 010_pg_basebackup.pl was not updated.
- Incorrect header ordering in basebackup.c.
- Decoding of CR and LF characters seem to work fine when
log_file_name include some.

With all those issues fixed, I finish with the attached, that I am
fine to pass down to a committer. I still think that this should be
only one function using a SRF with rows shaped as (type, path) for
simplicity, but as I am visibly outnumbered I won't insist further.
Also, I would rather see an ERROR returned to the user in case of bad
data in current_logfiles, I did not change that either as that's the
original intention of Gilles.


Karl <kop@meme.com>
Free Software: "You don't pay back you pay forward."
-- Robert A. Heinlein

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:

On January 15, 2017 11:47:51 PM CST, Michael Paquier <michael.paquier@gmail.com> wrote:
>On Mon, Jan 16, 2017 at 1:14 PM, Karl O. Pinc <kop@meme.com> wrote:
>
>
>> Attached also are 2 patches which abstract some hardcoded
>> constants into symbols.  Whether to apply them is a matter
>> of style and left to the maintainers, which is why these
>> are separate patches.
>
>Making the strings csvlog, stderr and eventlog variables? Why not
>because the patch presented here uses them in new places. Now it is
>not like it is a huge maintenance burden to keep them as strings, so I
>would personally not bother much.

To my mind it's a matter readability. It is useful to be able to search for or identify quickly when reading, e. g.,
allthe places where the keyword stderr relates to output log destination and not some other more common use. The
downsideis it is more code... 


>OK. I have done a round of hands-on review on this patch, finishing
>with the attached. In the things I have done:

Thank you.

>With all those issues fixed, I finish with the attached, that I am
>fine to pass down to a committer. I still think that this should be
>only one function using a SRF with rows shaped as (type, path) for
>simplicity, but as I am visibly outnumbered I won't insist further.

That also makes a certain amount of sense to me but I can't say I have thought deeply about it. Haven't paid any
attentionto this issue and am fine letting things move forward as is. 

>Also, I would rather see an ERROR returned to the user in case of bad
>data in current_logfiles, I did not change that either as that's the
>original intention of Gilles.

I could be wrong but I seem to recall that Robert recommended against an error message. If there is bad data then
somethingis really wrong, up to some sort of an attack on the back end. Because this sort of thing simply shouldn't
happenit's enough in my opinion to guard against buffer overruns etc and just get on with life. If something goes
unexpectedlyand badly wrong with internal data structures in general there would have to be all kinds of additional
codeto cover every possible problem and that doesn't seem to make sense. 

Sorry about the previous email with empty content. My email client got away from me.


Karl <kop@meme.com>
Free  Software:  "You don't pay back you pay forward."               -- Robert A. Heinlein




Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Alvaro Herrera
Дата:
Karl O. Pinc wrote:

> I do have a question here regards code formatting.
> The patch now contains:
> 
>     if (log_filepath == NULL)
>     {
>         /* Bad data. Avoid segfaults etc. and return NULL to caller. */
>         break;
>     }
> 
> I'm not sure how this fits in with PG coding style,
> whether the {} should be removed or what.  I've looked
> around and can't find an example of an if with a single
> line then block and a comment.  Maybe this means that
> I shouldn't be doing this, but if I put the comment above
> the if then it will "run into" the comment block which
> immediately precedes the above code which describes
> a larger body of code.  So perhaps someone should look
> at this and tell me how to improve it.

I think this style is good.  Following the style guide to the letter
would lead to remove the braces and keep the comment where it is;
pgindent will correctly keep at its current indentation.  We do use that
style in a couple of places.  It looks a bit clunky.  In most places we
do keep those braces, for readability and future-proofing in case
somebody inadvertently introduces another statement to the "block".  We
don't automatically remove braces anyway.  (We used to do that, but
stopped a few years ago shortly after introducing PG_TRY).

Putting the comment outside (above) the "if" would be wrong, too; you'd
have to rephrase the comment in a conditional tense.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Tue, Jan 17, 2017 at 2:21 AM, Karl O. Pinc <kop@meme.com> wrote:
> On January 15, 2017 11:47:51 PM CST, Michael Paquier <michael.paquier@gmail.com> wrote:
>>On Mon, Jan 16, 2017 at 1:14 PM, Karl O. Pinc <kop@meme.com> wrote:
>>With all those issues fixed, I finish with the attached, that I am
>>fine to pass down to a committer. I still think that this should be
>>only one function using a SRF with rows shaped as (type, path) for
>>simplicity, but as I am visibly outnumbered I won't insist further.
>
> That also makes a certain amount of sense to me but I can't say I have thought deeply about it. Haven't paid any
attentionto this issue and am fine letting things move forward as is. 

Gilles, what's your opinion here? As the author that's your call at
the end. What the point here is would be to change
pg_current_logfiles() to something like that:
=# SELECT * FROM pg_current_logfiles();method | file
--------|--------stderr | pg_log/postgresql.logcsvlog | pg_log/postgresql.csv
And using this SRF users can filter the method with a WHERE clause.
And as a result the 1-arg version is removed. No rows are returned if
current_logfiles does not exist. I don't think there is much need for
a system view either.

>>Also, I would rather see an ERROR returned to the user in case of bad
>>data in current_logfiles, I did not change that either as that's the
>>original intention of Gilles.
>
> I could be wrong but I seem to recall that Robert recommended against an error message. If there is bad data then
somethingis really wrong, up to some sort of an attack on the back end. Because this sort of thing simply shouldn't
happenit's enough in my opinion to guard against buffer overruns etc and just get on with life. If something goes
unexpectedlyand badly wrong with internal data structures in general there would have to be all kinds of additional
codeto cover every possible problem and that doesn't seem to make sense. 

Hm... OK. At the same time not throwing at least a WARNING is
confusing, because a NULL result is returned as well even if the input
method is incorrect and even if the method is correct but it is not
present in current_logfiles. As the user is thought as a trusted user
as it has access to this function, I would think that being verbose on
the error handling, or at least warnings, would make things easier to
analyze.

> Sorry about the previous email with empty content. My email client got away from me.

No problem. That happens.
--
Michael



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Tue, 17 Jan 2017 11:22:45 +0900
Michael Paquier <michael.paquier@gmail.com> wrote:

> On Tue, Jan 17, 2017 at 2:21 AM, Karl O. Pinc <kop@meme.com> wrote:

> >>Also, I would rather see an ERROR returned to the user in case of
> >>bad data in current_logfiles, I did not change that either as
> >>that's the original intention of Gilles.  
> >
> > I could be wrong but I seem to recall that Robert recommended
> > against an error message. If there is bad data then something is
> > really wrong, up to some sort of an attack on the back end. Because
> > this sort of thing simply shouldn't happen it's enough in my
> > opinion to guard against buffer overruns etc and just get on with
> > life. If something goes unexpectedly and badly wrong with internal
> > data structures in general there would have to be all kinds of
> > additional code to cover every possible problem and that doesn't
> > seem to make sense.  
> 
> Hm... OK. At the same time not throwing at least a WARNING is
> confusing, because a NULL result is returned as well even if the input
> method is incorrect and even if the method is correct but it is not
> present in current_logfiles. As the user is thought as a trusted user
> as it has access to this function, I would think that being verbose on
> the error handling, or at least warnings, would make things easier to
> analyze.

These are all valid points.

In the context of reliability it's worth noting that
pg_current_logfile() results are inherently unreliable.
If the OS returns ENFILE or EMFILE when opening the current_logfiles
file (but not previously) the content, and so pg_current_logfile()
results, will be outdated until the next logfile rotation.
You wouldn't expect this to happen, but it might.  Which
is similar to the situation where the content of the current_logfiles
is corrupted; very unexpected but possible.


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Tue, 17 Jan 2017 07:58:43 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> On Tue, 17 Jan 2017 11:22:45 +0900
> Michael Paquier <michael.paquier@gmail.com> wrote:
> 
> > On Tue, Jan 17, 2017 at 2:21 AM, Karl O. Pinc <kop@meme.com>
> > wrote:  
> 
> > >>Also, I would rather see an ERROR returned to the user in case of
> > >>bad data in current_logfiles,

> > 
> > Hm... OK. At the same time not throwing at least a WARNING is
> > confusing

What I find a little disconcerting is that there'd be nothing in the
logs.



Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 17/01/2017 à 03:22, Michael Paquier a écrit :
> On Tue, Jan 17, 2017 at 2:21 AM, Karl O. Pinc <kop@meme.com> wrote:
>> On January 15, 2017 11:47:51 PM CST, Michael Paquier <michael.paquier@gmail.com> wrote:
>>> On Mon, Jan 16, 2017 at 1:14 PM, Karl O. Pinc <kop@meme.com> wrote:
>>> With all those issues fixed, I finish with the attached, that I am
>>> fine to pass down to a committer. I still think that this should be
>>> only one function using a SRF with rows shaped as (type, path) for
>>> simplicity, but as I am visibly outnumbered I won't insist further.
>> That also makes a certain amount of sense to me but I can't say I have thought deeply about it. Haven't paid any
attentionto this issue and am fine letting things move forward as is.
 
> Gilles, what's your opinion here? As the author that's your call at
> the end. What the point here is would be to change
> pg_current_logfiles() to something like that:
> =# SELECT * FROM pg_current_logfiles();
>  method | file
> --------|--------
>  stderr | pg_log/postgresql.log
>  csvlog | pg_log/postgresql.csv
> And using this SRF users can filter the method with a WHERE clause.
> And as a result the 1-arg version is removed. No rows are returned if
> current_logfiles does not exist. I don't think there is much need for
> a system view either.

This have already been discuted previoulsy in this thread, one of my
previous patch version has implemented this behavior but we decide that
what we really want is to be able to use the function using the
following simple query:
   SELECT pg_read_file(pg_current_logfiles());

and not something like
   SELECT pg_read_file(SELECT file FROM pg_current_logfiles() LIMIT 1);
or   SELECT pg_read_file(SELECT file FROM pg_current_logfiles() WHERE
method='stderr');

You can also obtain the "kind" of output from the SRF function using:
   SELECT pg_read_file('current_logfiles');


>>> Also, I would rather see an ERROR returned to the user in case of bad
>>> data in current_logfiles, I did not change that either as that's the
>>> original intention of Gilles.
>> I could be wrong but I seem to recall that Robert recommended against an error message. If there is bad data then
somethingis really wrong, up to some sort of an attack on the back end. Because this sort of thing simply shouldn't
happenit's enough in my opinion to guard against buffer overruns etc and just get on with life. If something goes
unexpectedlyand badly wrong with internal data structures in general there would have to be all kinds of additional
codeto cover every possible problem and that doesn't seem to make sense.
 
> Hm... OK. At the same time not throwing at least a WARNING is
> confusing, because a NULL result is returned as well even if the input
> method is incorrect and even if the method is correct but it is not
> present in current_logfiles. As the user is thought as a trusted user
> as it has access to this function, I would think that being verbose on
> the error handling, or at least warnings, would make things easier to
> analyze.

I'm not against adding a warning or error message here even if it may
never occurs, but we need a new error code as it seems to me that no
actual error code can be used.  


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Tue, 17 Jan 2017 19:06:22 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Le 17/01/2017 à 03:22, Michael Paquier a écrit :
> > On Tue, Jan 17, 2017 at 2:21 AM, Karl O. Pinc <kop@meme.com>
> > wrote:
> >> On January 15, 2017 11:47:51 PM CST, Michael Paquier
> >> <michael.paquier@gmail.com> wrote:


> >>> Also, I would rather see an ERROR returned to the user in case of
> >>> bad data in current_logfiles, I did not change that either as
> >>> that's the original intention of Gilles.

> I'm not against adding a warning or error message here even if it may
> never occurs, but we need a new error code as it seems to me that no
> actual error code can be used.

Seems to me the XX001 "data_corrupted" sub-category of
XX000 "internal_error" is appropriate.

https://www.postgresql.org/docs/devel/static/errcodes-appendix.html

As a dbadmin/sysadm I'd defiantly like to see something in the log if
you're going to raise anything user-side.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Gilles Darold
Дата:
Le 17/01/2017 à 19:58, Karl O. Pinc a écrit :
> On Tue, 17 Jan 2017 19:06:22 +0100
> Gilles Darold <gilles.darold@dalibo.com> wrote:
>
>> Le 17/01/2017 à 03:22, Michael Paquier a écrit :
>>> On Tue, Jan 17, 2017 at 2:21 AM, Karl O. Pinc <kop@meme.com>
>>> wrote:  
>>>> On January 15, 2017 11:47:51 PM CST, Michael Paquier
>>>> <michael.paquier@gmail.com> wrote:  
>
>>>>> Also, I would rather see an ERROR returned to the user in case of
>>>>> bad data in current_logfiles, I did not change that either as
>>>>> that's the original intention of Gilles.  
>> I'm not against adding a warning or error message here even if it may
>> never occurs, but we need a new error code as it seems to me that no
>> actual error code can be used.  
> Seems to me the XX001 "data_corrupted" sub-category of
> XX000 "internal_error" is appropriate.

I don't think so, this file doesn't contain any data and we must not
report such error in this case. Somethink like "bad/unknow file format"
would be better.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Tue, 17 Jan 2017 23:00:43 +0100
Gilles Darold <gilles.darold@dalibo.com> wrote:

> Le 17/01/2017 à 19:58, Karl O. Pinc a écrit :
> > On Tue, 17 Jan 2017 19:06:22 +0100
> > Gilles Darold <gilles.darold@dalibo.com> wrote:
> >
> >> Le 17/01/2017 à 03:22, Michael Paquier a écrit :
> >>> On Tue, Jan 17, 2017 at 2:21 AM, Karl O. Pinc <kop@meme.com>
> >>> wrote:
> >>>> On January 15, 2017 11:47:51 PM CST, Michael Paquier
> >>>> <michael.paquier@gmail.com> wrote:
> >
> >>>>> Also, I would rather see an ERROR returned to the user in case
> >>>>> of bad data in current_logfiles, I did not change that either as
> >>>>> that's the original intention of Gilles.
> >> I'm not against adding a warning or error message here even if it
> >> may never occurs, but we need a new error code as it seems to me
> >> that no actual error code can be used.
> > Seems to me the XX001 "data_corrupted" sub-category of
> > XX000 "internal_error" is appropriate.
>
> I don't think so, this file doesn't contain any data and we must not
> report such error in this case. Somethink like "bad/unknow file
> format" would be better.

Maybe.  It's not user-supplied data that's corrupted but it is
PG generated data which is generated for and supplied to the user.
I just looked at all uses of XX001 and it is true that they all
involve corruption of user-supplied data.

If you don't want to use XX001 use XX000.  It does not seem worth
making a new error code for just this one case.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Wed, Jan 18, 2017 at 3:06 AM, Gilles Darold <gilles.darold@dalibo.com> wrote:
> This have already been discuted previously in this thread, one of my
> previous patch version has implemented this behavior but we decide that
> what we really want is to be able to use the function using the
> following simple query:
>
>     SELECT pg_read_file(pg_current_logfiles());
>
> and not something like
>
>     SELECT pg_read_file(SELECT file FROM pg_current_logfiles() LIMIT 1);
> or
>     SELECT pg_read_file(SELECT file FROM pg_current_logfiles() WHERE
> method='stderr');
>
> You can also obtain the "kind" of output from the SRF function using:
>
>     SELECT pg_read_file('current_logfiles');

I don't really understand this argument as you can do that as well:
SELECT pg_read_file(file) FROM pg_current_logfiles WHERE method = 'stderr';

> I'm not against adding a warning or error message here even if it may
> never occurs, but we need a new error code as it seems to me that no
> actual error code can be used.

ERRCODE_INTERNAL_ERROR, no?
-- 
Michael



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Wed, Jan 18, 2017 at 7:36 AM, Karl O. Pinc <kop@meme.com> wrote:
> Maybe.  It's not user-supplied data that's corrupted but it is
> PG generated data which is generated for and supplied to the user.
> I just looked at all uses of XX001 and it is true that they all
> involve corruption of user-supplied data.
>
> If you don't want to use XX001 use XX000.  It does not seem worth
> making a new error code for just this one case.

Our ideas rather map here, ERRCODE_INTERNAL_ERROR would be adapted for
this situation. Do any of you want to give it a shot or should I?
-- 
Michael



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 18 Jan 2017 11:08:25 +0900
Michael Paquier <michael.paquier@gmail.com> wrote:

> Our ideas rather map here, ERRCODE_INTERNAL_ERROR would be adapted for
> this situation. Do any of you want to give it a shot or should I?

You're welcome to it.


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Wed, Jan 18, 2017 at 11:36 AM, Karl O. Pinc <kop@meme.com> wrote:
> On Wed, 18 Jan 2017 11:08:25 +0900
> Michael Paquier <michael.paquier@gmail.com> wrote:
>
>> Our ideas rather map here, ERRCODE_INTERNAL_ERROR would be adapted for
>> this situation. Do any of you want to give it a shot or should I?
>
> You're welcome to it.

What do you think about that?
+       log_filepath = strchr(lbuffer, ' ');
+       if (log_filepath == NULL)
+       {
+           /*
+            * Corrupted data found, return NULL to the caller and still
+            * inform him on the situation.
+            */
+           ereport(WARNING,
+                   (ERRCODE_INTERNAL_ERROR,
+                    errmsg("corrupted data found in \"%s\"",
+                           LOG_METAINFO_DATAFILE)));
+           break;
+       }

Recent commits 352a24a1 (not my fault) and e7b020f7 (my fault) are
creating conflicts, so a rebase was as well welcome.
-- 
Michael

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 18 Jan 2017 13:26:46 +0900
Michael Paquier <michael.paquier@gmail.com> wrote:

> On Wed, Jan 18, 2017 at 11:36 AM, Karl O. Pinc <kop@meme.com> wrote:
> > On Wed, 18 Jan 2017 11:08:25 +0900
> > Michael Paquier <michael.paquier@gmail.com> wrote:
> >  
> >> Our ideas rather map here, ERRCODE_INTERNAL_ERROR would be adapted
> >> for this situation. Do any of you want to give it a shot or should
> >> I?  

> What do you think about that?
> +       log_filepath = strchr(lbuffer, ' ');
> +       if (log_filepath == NULL)
> +       {
> +           /*
> +            * Corrupted data found, return NULL to the caller and
> still
> +            * inform him on the situation.
> +            */
> +           ereport(WARNING,
> +                   (ERRCODE_INTERNAL_ERROR,
> +                    errmsg("corrupted data found in \"%s\"",
> +                           LOG_METAINFO_DATAFILE)));
> +           break;
> +       }

You must write

  errcode(ERRCODE_INTERNAL_ERROR)

instead of just
  
  ERRCODE_INTERNAL_ERROR

If you don't you get back 01000 for the error code.
Test by using psql with '\set VERBOSITY verbose'.

v26 patch attached which fixes this.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 18 Jan 2017 10:11:20 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> You must write
>   errcode(ERRCODE_INTERNAL_ERROR)
> instead of just
>   ERRCODE_INTERNAL_ERROR
> 
> If you don't you get back 01000 for the error code.

> v26 patch attached which fixes this.

Attached are revised versions of the 2 (optional) patches which
make hardcoded constants into symbols to apply on
top of the v26 patch.


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
Hi Micheal,

On Wed, 18 Jan 2017 10:26:43 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> > v26 patch attached which fixes this.  

I was glancing over the changes to the documentation
you made between the v22 and v25 and from looking at the diffs 
it seems the format of the current_logfiles file content is no longer
documented.  Seems to me that the file format should
be documented if there's any intention that the end user
look at or otherwise use the file's content.

It's fine with me if the content of current_logfiles
is supposed to be internal to PG and not exposed
to the end user.  I'm writing to make sure that
this is a considered decision.

I also see that all the index entries in the docs
to the current_logfiles file were removed.  (And
I think maybe some index entries to various places
where pg_current_logfiles() was mentioned in the docs.)
I see no reason why we shouldn't have more rather
than fewer index entries in the docs.

I haven't made any sort of though review of your
changes to the docs but this jumped out at me
and I wanted to comment before the patches got
passed on to the committers.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 18 Jan 2017 11:08:23 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> Hi Micheal,
> 
> On Wed, 18 Jan 2017 10:26:43 -0600
> "Karl O. Pinc" <kop@meme.com> wrote:
> 
> > > v26 patch attached which fixes this.    
> 
> I was glancing over the changes to the documentation
> you made between the v22 and v25

If it were me I'd have the documentation mention
that the pg_current_logfiles() result is
supplied on a "best effort" basis and under
rare conditions may be outdated.   The
sentence in the pg_current_logfles() docs
which reads "pg_current_logfiles reflects the contents 
of the file current_logfiles." now carries little
meaning because the current_logfiles docs no
longer mention that the file content may be outdated.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Wed, Jan 18, 2017 at 12:56 PM, Karl O. Pinc <kop@meme.com> wrote:
> If it were me I'd have the documentation mention
> that the pg_current_logfiles() result is
> supplied on a "best effort" basis and under
> rare conditions may be outdated.   The
> sentence in the pg_current_logfles() docs
> which reads "pg_current_logfiles reflects the contents
> of the file current_logfiles." now carries little
> meaning because the current_logfiles docs no
> longer mention that the file content may be outdated.

Generally that's going to be true of any function or SQL statement
that returns data which can change.  The only way it isn't true in
some particular case is if a function has some kind of snapshot
semantics or returns with a lock on the underlying object held.  So it
doesn't seem particularly noteworthy here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Robert Haas
Дата:
On Wed, Jan 18, 2017 at 12:08 PM, Karl O. Pinc <kop@meme.com> wrote:
> Seems to me that the file format should
> be documented if there's any intention that the end user
> look at or otherwise use the file's content.
>
> It's fine with me if the content of current_logfiles
> is supposed to be internal to PG and not exposed
> to the end user.  I'm writing to make sure that
> this is a considered decision.

On the whole, documenting it seems better than documenting it,
provided there's a logical place to include it in the existing
documentation.

But, anyway, Michael shouldn't remove it without some explanation or discussion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 18 Jan 2017 15:52:36 -0500
Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Jan 18, 2017 at 12:08 PM, Karl O. Pinc <kop@meme.com> wrote:
> > Seems to me that the file format should
> > be documented if there's any intention that the end user
> > look at or otherwise use the file's content.
> >
> > It's fine with me if the content of current_logfiles
> > is supposed to be internal to PG and not exposed
> > to the end user.  I'm writing to make sure that
> > this is a considered decision.  
> 
> On the whole, documenting it seems better than documenting it,
> provided there's a logical place to include it in the existing
> documentation.
> 
> But, anyway, Michael shouldn't remove it without some explanation or
> discussion.

He explained that where it was looked clunky, it being
inside a table that otherwise has rows that are not tall.

And, it looks like I'm wrong.  The format is documented
by way of an example in section 19.8.1. Where To Log
under log_destination (string).

Sorry for the bother.

I would like to see index entries for "current_logfiles"
so this stuff is findable.

Regards.


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."                -- Robert A. Heinlein



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
"Karl O. Pinc"
Дата:
On Wed, 18 Jan 2017 15:08:09 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> I would like to see index entries for "current_logfiles"
> so this stuff is findable.

Attached is a v27 of the patch.

I polished some of the sentences in the documentation
and added index entries.

Also attached are a series of 2 patches to apply on top
of v27 which make symbols out of hardcoded constants.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Alvaro Herrera
Дата:
Karl O. Pinc wrote:

> @@ -511,10 +519,16 @@ int
>  SysLogger_Start(void)
>  {
>      pid_t        sysloggerPid;
> -    char       *filename;
>  
> +    /*
> +     * Logging collector is not enabled. We don't know where messages are
> +     * logged.  Remove outdated file holding the current log filenames.
> +     */
>      if (!Logging_collector)
> +    {
> +        unlink(LOG_METAINFO_DATAFILE);
>          return 0;
> +    }

I thought this part was odd -- I mean, why is SysLogger_Start() being
called if the collector is not enabled?  Turns out we do it and return
early if not enabled.  But not in all cases -- there is one callsite in
postmaster.c that avoids the call if the collector is disabled.  That
needs to be changed if we want this to work reliably.

I don't think the "abstract names" stuff is an improvement (just look at
the quoting mess in ConfigureNamesString).  I think we should do without
that; at least as part of this patch.  If you think there's code that
can get better because of the idea, let's see it.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Thu, Jan 19, 2017 at 6:08 AM, Karl O. Pinc <kop@meme.com> wrote:
> On Wed, 18 Jan 2017 15:52:36 -0500
> Robert Haas <robertmhaas@gmail.com> wrote:
>
>> On Wed, Jan 18, 2017 at 12:08 PM, Karl O. Pinc <kop@meme.com> wrote:
>> > Seems to me that the file format should
>> > be documented if there's any intention that the end user
>> > look at or otherwise use the file's content.
>> >
>> > It's fine with me if the content of current_logfiles
>> > is supposed to be internal to PG and not exposed
>> > to the end user.  I'm writing to make sure that
>> > this is a considered decision.
>>
>> On the whole, documenting it seems better than documenting it,
>> provided there's a logical place to include it in the existing
>> documentation.
>>
>> But, anyway, Michael shouldn't remove it without some explanation or
>> discussion.
>
> He explained that where it was looked clunky, it being
> inside a table that otherwise has rows that are not tall.
>
> And, it looks like I'm wrong.  The format is documented
> by way of an example in section 19.8.1. Where To Log
> under log_destination (string).
>
> Sorry for the bother.

Er, well. I kept the same detail verbosity in the docs...

> I would like to see index entries for "current_logfiles"
> so this stuff is findable.

Why not.
-- 
Michael



Re: [HACKERS] Patch to implement pg_current_logfile() function

От
Michael Paquier
Дата:
On Thu, Jan 19, 2017 at 6:56 AM, Karl O. Pinc <kop@meme.com> wrote:
> On Wed, 18 Jan 2017 15:08:09 -0600
> "Karl O. Pinc" <kop@meme.com> wrote:
>
>> I would like to see index entries for "current_logfiles"
>> so this stuff is findable.
>
> Attached is a v27 of the patch.
>
> I polished some of the sentences in the documentation
> and added index entries.
>
> Also attached are a series of 2 patches to apply on top
> of v27 which make symbols out of hardcoded constants.

+     <indexterm>
+       <primary>current_logfiles</primary>
+       <secondary>and the log_destination configuration parameter</secondary>
+     </indexterm>
I have been wondering about this portion a bit, and actually that's pretty nice.

So patch is marked as ready for committer, the version proposed here
not using SRFs per Gilles' input on the matter.
-- 
Michael