Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)
Дата
Msg-id CAKJS1f-wc9jX819vA09yPjk+tW+TiVa76pJ_JRn-dJFHfxqmTg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On 31 December 2015 at 01:24, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 12/30/2015 08:16 AM, David Rowley wrote:

I do strongly believe that we need to come up with something to
solve this problem. I already summarised my thoughts on the other
thread.

One approach that I don't see mentioned on any of the threads is plan caching, which allows reusing the plan across many query executions, hopefully amortizing the planning costs.

I'm sure implementing such caching is non-trivial and there are cases where it may not help, but perhaps it's not entirely futile (AFAIK it's used by some databases exactly to address the higher planning costs).

I imagine a single GUC enabling or disabling this (possibly not just globally but per session, user or database).

We already have some form of plan caching, although only for prepared statements within a single session - maybe that could be a good starting point? For example what if we only enabled those "expensive" optimizations for prepared statements, which are assumed to be executed multiple times? Of course, this may not be entirely true (say, PL/pgSQL uses prepared statements all the time).

Of course, the annoying consequence of this would be that the planning may get somewhat unpredictable - the plan will depend on whether the query was planned directly or as a prepared statement, or whether plan caching is enabled. However, the same mostly applies to solutions proposed in the other threads so far.

Personally I'd like to see automatic plan caching occur in PostgreSQL. There is a bit of a problem with it in regards to a query such as: select * from t where mySkewedColumn = 1;  where the value 1 appears 99% of the time. Initially we may plan to seqscan, where with other values we'd likely prefer to index scan. I imagine with my unique joins patch, that it could be expanded to test baserestrictinfos to see if they contain quals with a unique index.   This knowledge could later permit plan caching to occur on queries which are safe from having any skew in the results.  It might sound rather restrictive, but likely this would cover 99% of UPDATE and DELETE operations in an OLTP workload, and likely a good deal of the SELECTs too.  The safety of caching could be analyzed during planning, and a flag could be set somewhere, perhaps in PlannedStmt to mark if the plan is safe to cache. The planner() function could initially hash the query string and check if any cached plan exists with that hash, if not, plan the statement, and then check if the "safeToCache" flag is set, and if so, stick that plan into a hash table. Also plans with no baserestrictinfos could be "safeToCache" too.


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)
Следующее
От: David Rowley
Дата:
Сообщение: Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)