Re: pg_stat_statements: Query normalisation may fail during stats reset

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: pg_stat_statements: Query normalisation may fail during stats reset
Дата
Msg-id 2104.1399401087@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: pg_stat_statements: Query normalisation may fail during stats reset  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: pg_stat_statements: Query normalisation may fail during stats reset
Список pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, May 6, 2014 at 12:26 PM, Michael Renner
>> when regularly collecting & resetting query information from
>> pg_stat_statements it’s possible to trigger a situation where unnormalised
>> queries are stored.
>> 
>> Is this something that should be fixed or is this intentional behavior? The
>> documentation doesn’t make any strong claims on the state of the query
>> string, so this might be debatable. [1]

> It sounds pretty wonky to me, but then, so does the behavior in the
> email to which you linked.

The source code says that "query strings are normalized on a best effort
basis", so perhaps we ought to say the same in the documentation.

It would be rather expensive to provide a guarantee of normalization:
basically, we'd have to compute the normalized query string during parsing
*even when the hashtable entry already exists*, and then store it
somewhere where it'd survive till ExecutorEnd (but, preferably, not be
leaked if we never get to ExecutorEnd; which makes this hard).  I think
most people would find that a bad tradeoff.

One cheap-and-dirty solution is to throw away the execution stats if we
get to the end and find the hash table entry no longer exists, rather than
make a new entry with a not-normalized string.  Not sure if that cure is
better than the disease or not.

Another thought, though it's not too relevant to this particular scenario
of intentional resets, is that we could raise the priority of entries
for statements-in-progress even further.  I notice for example that if
entry_alloc finds an existing hashtable entry, it does nothing to raise
the usage count of that entry.

>> This is a bit counterintuitive if you rely on the query to be normalised,
>> e.g. for privacy reasons where you don’t want to leak query constants like
>> password hashes or usernames.

The bigger picture here is that relying on query normalization for privacy
doesn't seem like a bright idea.  Consider making sure that
security-relevant values are passed as parameters rather than being
embedded in the query text in the first place.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: pg_shmem_allocations view
Следующее
От: Tom Lane
Дата:
Сообщение: Re: pg_shmem_allocations view