Re: Redacting information from logs

From Andres Freund
Subject Re: Redacting information from logs
Msg-id 20190803224757.6egkzussvkswnymk@alap3.anarazel.de
In response to Redacting information from logs  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Redacting information from logs  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2019-07-30 11:54:55 -0700, Jeff Davis wrote:
> My proposal is:
>
>  * redact every '%s' in an ereport by having a special mode for
> snprintf.c (this is possible because we now own snprintf)

I'm extremely doubtful this is a sane approach. We use snprintf for a
heck of a lot of things. The likelihood of this having unintended
consequences seems high (consider an error being thrown while trying to
report another error message and such). Nor do I think that snprintf.c
is a good layer to perform redaction - it's too low level. It's used for
both frontend/backend. It's used for both non-error and error purposes.

I also don't think you're actually going to get that far with it -
there are plenty of places where we build up error messages without
using *printf, but with e.g. appendStringInfoString() instead.


> But I don't see a better solution. Right now, it's a pain to treat log
> files as sensitive things when there are so many ways they can help
> with smooth operations and so many tools available to analyze them.
> This proposal seems like a practical solution to enable better use of
> log files while protecting potentially-sensitive information.

I don't really see a low-effort way either. But I'm fairly certain that
this will cause at least as many problems as it'll help solve.

I think incrementally moving to messages where portions of information
are separated out (e.g. the things we'd inline with %s) is, although a
lengthy process, the better approach. It'll make richer output formats
possible, it'll allow for proper redaction, etc.

I.e. something very roughly like

ereport(ERROR,
        errmsg_rich("string with %{named}s references to %{parameter}s"),
        errparam("named", somevar),
        errparam("parameter", othervar, .redact = CONTEXT));

Which would allow us to annotate whether a specific parameter needs
to be redacted for certain contexts.
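To make the idea a bit more concrete, here is a minimal standalone sketch of
how "%{name}s" references could be expanded with per-parameter redaction.
Nothing in it is existing PostgreSQL API: RichParam, format_rich() and the
"<redacted>" placeholder are hypothetical names chosen for illustration, and
the redaction decision is reduced to a plain boolean rather than a context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for errparam(): a named value plus a redaction flag. */
typedef struct RichParam
{
    const char *name;
    const char *value;
    bool        redact;     /* true: hide the value in this output */
} RichParam;

/* Look up a parameter by name; returns NULL if there is no such parameter. */
static const RichParam *
find_param(const RichParam *params, size_t nparams,
           const char *name, size_t namelen)
{
    for (size_t i = 0; i < nparams; i++)
    {
        if (strlen(params[i].name) == namelen &&
            strncmp(params[i].name, name, namelen) == 0)
            return &params[i];
    }
    return NULL;
}

/* Append up to len bytes, truncating so the terminator always fits. */
static void
append(char *out, size_t outlen, size_t *pos, const char *s, size_t len)
{
    for (size_t i = 0; i < len && *pos + 1 < outlen; i++)
        out[(*pos)++] = s[i];
    out[*pos] = '\0';
}

/*
 * Expand "%{name}s" references in fmt into out. Redacted parameters are
 * replaced by "<redacted>"; references to unknown names are copied verbatim.
 */
static void
format_rich(char *out, size_t outlen, const char *fmt,
            const RichParam *params, size_t nparams)
{
    size_t      pos = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    while (*fmt != '\0')
    {
        const char *close;

        if (fmt[0] == '%' && fmt[1] == '{' &&
            (close = strchr(fmt + 2, '}')) != NULL && close[1] == 's')
        {
            const RichParam *p = find_param(params, nparams, fmt + 2,
                                            (size_t) (close - (fmt + 2)));

            if (p != NULL)
            {
                const char *text = p->redact ? "<redacted>" : p->value;

                append(out, outlen, &pos, text, strlen(text));
                fmt = close + 2;    /* skip past "}s" */
                continue;
            }
        }
        append(out, outlen, &pos, fmt, 1);
        fmt++;
    }
}
```

Because the values stay separate from the message template until the final
formatting step, the same report could be rendered redacted for one
destination and in full for another.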

I'd probably add an errredact(bool) to annotate whether a message needs
to be redacted, mostly so we can easily flag a lot of current messages
as OK. When not present, I'd redact the entire message when errmsg() is
being used, and redact nothing if errmsg_rich() is used, and none of the
parameters flag an error.
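The default described above could be sketched as a small decision function.
This is only one reading of the rule: the enum and function names are
hypothetical, and it only answers whether the message as a whole is redacted;
individually flagged parameters of an errmsg_rich() message would still be
handled per parameter.

```c
#include <stdbool.h>

/* Hypothetical tri-state: was errredact() given, and with which value? */
typedef enum RedactFlag
{
    REDACT_UNSET,               /* no errredact() in the ereport */
    REDACT_NO,                  /* errredact(false) */
    REDACT_YES                  /* errredact(true) */
} RedactFlag;

/* Hypothetical marker for which message function was used. */
typedef enum MessageKind
{
    MSG_PLAIN,                  /* errmsg() */
    MSG_RICH                    /* errmsg_rich() */
} MessageKind;

/* Decide whether the message as a whole has to be redacted. */
static bool
redact_whole_message(MessageKind kind, RedactFlag flag)
{
    /* An explicit errredact() annotation always wins. */
    if (flag != REDACT_UNSET)
        return flag == REDACT_YES;

    /* Otherwise: plain messages are redacted, rich ones are not. */
    return kind == MSG_PLAIN;
}
```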

That'd then also allow us to reference parameters that clients /
exception handlers may not see, e.g. the arguments to leakproof
functions. Which currently makes a lot of issues harder to debug,
because we don't get the values for e.g. overflows, input syntax errors
etc.

By allowing errparam()s to be specified that are not used in the error
message itself, we could provide more detail for people using richer
log outputs. I'd assume we'd fairly quickly have a logfmt/json logging
target/format.
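As a rough illustration of what such a richer output could look like, the
following standalone sketch renders a message plus all of its parameters,
including ones the message text never references, as one logfmt-style line.
LogParam and format_logfmt() are hypothetical names, the quoting rules are an
assumption (logfmt has no formal specification), and the parameter names are
assumed not to need escaping.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical named parameter attached to an error report. */
typedef struct LogParam
{
    const char *name;
    const char *value;
    bool        redact;
} LogParam;

/* Append one character, truncating so the terminator always fits. */
static void
put(char *out, size_t outlen, size_t *pos, char c)
{
    if (*pos + 1 < outlen)
    {
        out[(*pos)++] = c;
        out[*pos] = '\0';
    }
}

/* Append a string unchanged. */
static void
put_str(char *out, size_t outlen, size_t *pos, const char *s)
{
    for (; *s != '\0'; s++)
        put(out, outlen, pos, *s);
}

/* Append a double-quoted string, escaping quotes and backslashes. */
static void
put_quoted(char *out, size_t outlen, size_t *pos, const char *s)
{
    put(out, outlen, pos, '"');
    for (; *s != '\0'; s++)
    {
        if (*s == '"' || *s == '\\')
            put(out, outlen, pos, '\\');
        put(out, outlen, pos, *s);
    }
    put(out, outlen, pos, '"');
}

/* Render msg="..." followed by one name="value" pair per parameter. */
static void
format_logfmt(char *out, size_t outlen, const char *msg,
              const LogParam *params, size_t nparams)
{
    size_t      pos = 0;

    if (outlen == 0)
        return;
    out[0] = '\0';

    put_str(out, outlen, &pos, "msg=");
    put_quoted(out, outlen, &pos, msg);

    for (size_t i = 0; i < nparams; i++)
    {
        put(out, outlen, &pos, ' ');
        put_str(out, outlen, &pos, params[i].name);
        put(out, outlen, &pos, '=');
        put_quoted(out, outlen, &pos,
                   params[i].redact ? "<redacted>" : params[i].value);
    }
}
```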

Greetings,

Andres Freund


