[HACKERS] CSV Logging questions

From Greg Stark
Subject [HACKERS] CSV Logging questions
Date
Msg-id CAM-w4HNEAL1JjOM25gMvT38dEmmsEd3tRwhDFACAiH+xd6m5Ww@mail.gmail.com
Replies Re: [HACKERS] CSV Logging questions  (David Fetter <david@fetter.org>)
List pgsql-hackers
I was just looking over the CSV logging code and have a few questions
about why things were done the way they were done.

1) Why do we gather a per-session log line number? Is it just to aid
people importing to avoid duplicate entries from partial files? Is
there some other purpose given that entries will already be sequential
in the csv file?

2) Why is the file error location conditional on log_error_verbosity?
Surely the whole point of a structured log is that you can log
everything and choose what to display later -- that is why csv logging
doesn't look at log_line_prefix to determine which other bits to
display. There's no added cost to including this information
unconditionally, and it's far from the largest piece of data being
logged either.

3) Similarly, I wonder if the statement should always be included even
when hide_stmt is set, so that users can write sensible queries against
the data even if it means duplicating data.

4) Why the session start time? Is this just so that <process_id,
session_start_time> uniquely identifies a session? Should we perhaps
generate a unique session identifier instead?
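
To make questions 1 and 4 concrete, here is a minimal sketch of how a
session identifier and a per-line key could be derived from the fields
already being logged. The helper names format_session_id and
format_line_key are hypothetical and exist only for this illustration.
The "hex start time, dot, hex pid" shape follows what the documentation
describes for the %c escape in log_line_prefix; the "/line" suffix is an
assumption made here to show how an importer could recognise a row it
has already loaded from a partial file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: format a session identifier as
 * hex(session start time) "." hex(process id).
 * Returns the number of characters snprintf would have written.
 */
static int
format_session_id(char *buf, size_t buflen, time_t session_start, int pid)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%lx.%x",
                    (unsigned long) session_start, (unsigned int) pid);
}

/*
 * Hypothetical helper: extend the session identifier with the
 * per-session line number, giving a key that stays the same when the
 * same log line is imported twice.
 */
static int
format_line_key(char *buf, size_t buflen, time_t session_start, int pid,
                long session_line_num)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%lx.%x/%ld",
                    (unsigned long) session_start, (unsigned int) pid,
                    session_line_num);
}
```

With a start time of 0x5a0b1c2d and a pid of 0x1234, format_session_id
produces "5a0b1c2d.1234", and format_line_key for line 7 produces
"5a0b1c2d.1234/7".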

The real reason I'm looking at this is that I'm looking at the
json_log plugin from Michael Paquier. It doesn't have the log line
numbers and I can't figure out whether this is something it should have
because I can't quite figure out why they exist in CSV files. I think
there are a few other fields that have been added in Postgres but are
missing from the JSON log because of version skew.

I'm wondering if we should abstract out the CSV format so that, instead
of using emit_log_hook, you would add a new format and it would specify
an "add_log_attribute(key,val)" hook, which would get called once per
log format. You could then have as many log formats as you want and be
sure they would all have the same data. That would also mean that the
timestamps would be in sync and we could probably eliminate the
occurrences of the wrong format appearing in the wrong logs.
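
A rough, standalone sketch of that idea follows. It is not PostgreSQL
code: only the name add_log_attribute comes from the proposal above,
while LogFormat, emit_log_attribute, end_log_record, csv_add, json_add
and json_end are hypothetical names invented for illustration. Each
attribute is gathered once and handed to every registered format, so
all formats see identical data. Quoting and escaping are deliberately
left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define OUT_SIZE 256

/* Hypothetical descriptor for one output format. */
typedef struct LogFormat
{
    const char *name;
    void      (*add_log_attribute) (struct LogFormat *fmt,
                                    const char *key, const char *val);
    void      (*end_record) (struct LogFormat *fmt);   /* may be NULL */
    int         nattrs;         /* attributes added so far */
    size_t      len;            /* bytes used in out */
    char        out[OUT_SIZE];  /* formatted record */
} LogFormat;

/* Append s to the record; silently drops s if it would not fit. */
static void
append(LogFormat *fmt, const char *s)
{
    size_t n = strlen(s);

    assert(fmt != NULL);
    if (fmt->len + n < OUT_SIZE)
    {
        memcpy(fmt->out + fmt->len, s, n + 1);
        fmt->len += n;
    }
}

/* CSV: values only, comma separated (no quoting in this sketch). */
static void
csv_add(LogFormat *fmt, const char *key, const char *val)
{
    (void) key;
    if (fmt->nattrs++ > 0)
        append(fmt, ",");
    append(fmt, val);
}

/* JSON: "key":"value" pairs (no escaping in this sketch). */
static void
json_add(LogFormat *fmt, const char *key, const char *val)
{
    append(fmt, fmt->nattrs++ > 0 ? "," : "{");
    append(fmt, "\"");
    append(fmt, key);
    append(fmt, "\":\"");
    append(fmt, val);
    append(fmt, "\"");
}

static void
json_end(LogFormat *fmt)
{
    append(fmt, "}");
}

/* Hand one attribute to every registered format. */
static void
emit_log_attribute(LogFormat *fmts, int nfmts,
                   const char *key, const char *val)
{
    for (int i = 0; i < nfmts; i++)
        fmts[i].add_log_attribute(&fmts[i], key, val);
}

/* Let each format close its record, if it needs to. */
static void
end_log_record(LogFormat *fmts, int nfmts)
{
    for (int i = 0; i < nfmts; i++)
        if (fmts[i].end_record != NULL)
            fmts[i].end_record(&fmts[i]);
}
```

Registering csv_add and json_add side by side and emitting the
attributes ("pid", "4242") and ("message", "hello") yields
4242,hello in the CSV record and {"pid":"4242","message":"hello"} in
the JSON record, both built from the same calls.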


-- 
greg


