Re: [v9.5] Custom Plan API

Поиск
Список
Период
Сортировка
От Kouhei Kaigai
Тема Re: [v9.5] Custom Plan API
Дата
Msg-id 9A28C8860F777E439AA12E8AEA7694F8F9E93E@BPXM15GP.gisp.nec.co.jp
обсуждение исходный текст
Ответ на Re: [v9.5] Custom Plan API  (Simon Riggs <simon@2ndQuadrant.com>)
Ответы Re: [v9.5] Custom Plan API
Список pgsql-hackers
> On 7 May 2014 02:05, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > Prior to the development cycle towards v9.5, I'd like to reopen the
> > discussion of custom-plan interface. Even though we had lots of
> > discussion during the last three commit-fests, several issues are
> > still under discussion. So, I'd like to clarify direction of the
> > implementation, prior to the first commit-fest.
> >
> > (1) DDL support and system catalog
> >
> > Simon suggested that DDL command should be supported to track custom-
> > plan providers being installed, and to avoid nonsense hook calls if it
> > is an obvious case that custom-plan provider can help. It also makes
> > sense to give a chance to load extensions once installed.
> > (In the previous design, I assumed modules are loaded by LOAD command
> > or *_preload_libraries parameters).
> >
> > I tried to implement the following syntax:
> >
> >   CREATE CUSTOM PLAN <name> FOR (scan|join|any) HANDLER <func_name>;
> 
> Thank you for exploring that thought and leading the way on this research.
> I've been thinking about this also.
> 
> What I think we need is a declarative form that expresses the linkage between
> base table(s) and a related data structures that can be used to optimize
> a query, while still providing accurate results.
> 
> In other DBMS, we have concepts such as a JoinIndex or a MatView which allow
> some kind of lookaside behaviour. Just for clarity, a concrete example is
> Oracle's Materialized Views which can be set using ENABLE QUERY REWRITE
> so that the MatView can be used as an alternative path for a query. We do
> already have this concept in PostgreSQL, where an index can be used to
> perform an IndexOnlyScan rather than accessing the heap itself.
> 
> We have considerable evidence that the idea of alternate data structures
> results in performance gains.
> * KaiGai's work - https://wiki.postgresql.org/wiki/PGStrom
> * http://www.postgresql.org/message-id/52C59858.9090500@garret.ru
> * http://citusdata.github.io/cstore_fdw/
> * University of Manchester - exploring GPUs as part of the AXLE project
> * Barcelona SuperComputer Centre - exploring FPGAs, as part of the AXLE
> project
> * Some other authors have also cited gains using GPU technology in databases
> 
> So I would like to have a mechanism that provides a *generic* Lookaside
> for a table or foreign table.
> 
> Tom and Kevin have previously expressed that MatViews would represent a
> planning problem, in the general case. One way to solve that planning issue
> is to link structures directly together, in the same way that an index and
> a table are linked. We can then process the lookaside in the same way we
> handle a partial index - check prerequisites and if usable, calculate a
> cost for the alternate path.
> We need not add planning time other than to the tables that might benefit
> from that.
> 
> Roughly, I'm thinking of this...
> 
> CREATE LOOKASIDE ON foo
>    TO foo_mat_view;
> 
> and also this...
> 
> CREATE LOOKASIDE ON foo
>    TO foo_as_a_foreign_table   /* e.g. PGStrom */
> 
> This would allow the planner to consider alternate plans for foo_mv during
> set_plain_rel_pathlist() similarly to the way it considers index paths,
> in one of the common cases that the mat view covers just one table.
> 
> This concept is similar to ENABLE QUERY REWRITE in Oracle, but this thought
> goes much further, to include any generic user-defined data structure or
> foreign table.
> 
Let me clarify. This mechanism allows to add alternative scan/join paths
including built-in ones, not only custom enhanced plan/exec node, isn't it?
Probably, it is a variation of above proposition if we install a handler
function that proposes built-in path nodes towards the request for scan/join.

> Do we need this? For MVs, we *might* be able to deduce that the MV is
> rewritable for "foo", but that is not deducible for Foreign Tables, by
> current definition, so I prefer the explicit definition of objects that
> are linked - since doing this for indexes is already familiar to people.
> 
> Having an explicit linkage between data structures allows us to enhance
> an existing application by transaparently adding new structures, just as
> we already do with indexes. Specifically, that we allow more than one
> lookaside structure on any one table.
> 
Not only alternative data structure, alternative method to scan/join towards
same data structure is also important, isn't it?

> Forget the exact name, thats not important. But I think the requirements
> here are...
> 
> * Explicit definition that we are attaching an alternate path onto a table
> (conceptually similar to adding an index)
> 
I think the syntax allows "tables", not only a particular table.
It will inform the core planner this lookaside/customplan (name is not
important, anyway this feature...) can provide alternative path towards
the set of relations; being considered. So, it allows to reduce number of
function calls on planner stage.

> * Ability to check that the alternate path is viable (similar to the way
> we validate use of partial indexes prior to usage)
>     Checks on columns(SELECT), rows(WHERE), aggregations(GROUP)
> 
I never deny it... but do you think this feature from the initial version??

> * Ability to consider access cost for both normal table and alternate path
> (like an index) - this allows the alternate path to *not* be chosen when
> we are performing some operation that is sub-optimal (for whatever reason).
> 
It is an usual job of existing planner, isn't it?

> * There may be some need to define operator classes that are implemented
> via the alternate path
> 
> which works for single tables, but a later requirement would then be
> 
> * allows the join of one or more tables to be replaced with a single lookaside
> 
It's higher priority for me, and I guess it is same in MatView usage.

> Hopefully, we won't need a "Custom Plan" at all, just the ability to
> lookaside when useful.
> 
Probably, lookaside is a special case in the scenario that custom-plan can
provide. I also think it is an attractive use case if we can redirect
a particular complicated join into a MatView reference. So, it makes sense
to bundle a handler function to replace join by matview reference.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: New pg_lsn type doesn't have hash/btree opclasses
Следующее
От: Simon Riggs
Дата:
Сообщение: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers