Re: [PATCH] optional cleaning queries stored in pg_stat_statements
От | Tom Lane |
---|---|
Тема | Re: [PATCH] optional cleaning queries stored in pg_stat_statements |
Дата | |
Msg-id | 2223.1320592101@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: [PATCH] optional cleaning queries stored in pg_stat_statements (Peter Geoghegan <peter@2ndquadrant.com>) |
Ответы |
Re: [PATCH] optional cleaning queries stored in
pg_stat_statements
("Tomas Vondra" <tv@fuzzy.cz>)
Re: [PATCH] optional cleaning queries stored in pg_stat_statements (Greg Smith <greg@2ndQuadrant.com>) Re: [PATCH] optional cleaning queries stored in pg_stat_statements (Peter Geoghegan <peter@2ndquadrant.com>) |
Список | pgsql-hackers |
Peter Geoghegan <peter@2ndquadrant.com> writes: > I'm a couple of days away from posting a much better principled > implementation of pg_stat_statements normalisation. To normalise, we > perform a selective serialisation of the query tree, which is hashed. That seems like an interesting approach, and definitely a lot less of a kluge than what Tomas suggested. Are you intending to hash the raw grammar output tree, the parse analysis result, the rewriter result, or the actual plan tree? I don't necessarily have a preconceived notion about which is best (I can think of pluses and minuses for each), but we should hear your explanation of which one you chose and what your reasoning was. > ... It also does things > like intelligently distinguishing between queries with different > limit/offset constant values, as these constants are deemed to be > differentiators of queries for our purposes. A guiding principal that > I've followed is that anything that could result in a different plan > is a differentiator of queries. This claim seems like bunk, unless you're hashing the plan tree, in which case it's tautological. As an example, switching from "where x < 1" to "where x < 2" could make the planner change plans, if there's a sufficiently large difference in the selectivity of the two cases (or even if there's a very small difference, but we're right at the tipping point where the estimated costs are equal). There's no way to know this in advance unless you care to duplicate all the planner cost estimation logic. And I don't think you actually want to classify "where x < 1" and "where x < 2" differently just because they *might* give different plans in corner cases. I'm not real sure whether it's better to classify on the basis of similar plans or similar original queries, anyway. This seems like something that could be open for debate about use-cases. > There will be additional infrastructure added to the parser to support > normalisation of query strings for the patch I'll be submitting (that > obviously won't be supported in the version that builds against > existing Postgres versions that I'll make available). Essentially, > I'll be adding a length field to certain nodes, This seems like a good way to get your patch rejected: adding overhead to the core system for the benefit of a feature that not everyone cares about is problematic. Why do you need it anyway? Surely it's possible to determine the length of a literal token after the fact. More generally, if you're hashing anything later than the raw grammar tree, I think that generating a suitable representation of the queries represented by a single hash entry is going to be problematic anyway. There could be significant differences --- much more than just the values of constants --- between query strings that end up being semantically the same. Or for that matter we could have identical query strings that wind up being considered different because of the action of search_path or other context. It might be that the path of least resistance is to document that we select one of the actual query strings "at random" to represent all the queries that went into the same hash entry, and not even bother with trying to strip out constants. The effort required to do that seems well out of proportion to the reward, if we can't do a perfect job of representing the common aspects of the queries. regards, tom lane
В списке pgsql-hackers по дате отправления:
Следующее
От: hubert depesz lubaczewskiДата:
Сообщение: Re: [GENERAL] Strange problem with create table as select * from table;