Re: TABLESAMPLE patch is really in pretty sad shape

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: TABLESAMPLE patch is really in pretty sad shape
Дата
Msg-id CAB7nPqQk-SRhOUn612XMwL0DyXjBvRZ9gWnKbHb924TpN4=G1A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: TABLESAMPLE patch is really in pretty sad shape  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: TABLESAMPLE patch is really in pretty sad shape  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Mon, Jul 13, 2015 at 7:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> The two contrib modules this patch added are nowhere near fit for public
>> consumption.  They cannot clean up after themselves when dropped:
>> ...
>> Raw inserts into system catalogs just
>> aren't a sane thing to do in extensions.
>
> I had some thoughts about how we might fix that, without going to the
> rather tedious lengths of creating a complete set of DDL infrastructure
> for CREATE/DROP TABLESAMPLE METHOD.
>
> First off, the extension API designed for the tablesample patch is
> evidently modeled on the index AM API, which was not a particularly good
> precedent --- it's not a coincidence that index AMs can't be added or
> dropped on-the-fly.  Modeling a server internal API as a set of
> SQL-visible functions is problematic when the call signatures of those
> functions can't be accurately described by SQL datatypes, and it's rather
> pointless and inefficient when none of the functions in question is meant
> to be SQL-callable.  It's even less attractive if you don't think you've
> got a completely stable API spec, because adding, dropping, or changing
> signature of any one of the API functions then involves a pile of
> easy-to-mess-up catalog changes on top of the actually useful work.
> Not to mention then having to think about backwards compatibility of
> your CREATE command's arguments.
>
> We have a far better model to follow already, namely the foreign data
> wrapper API.  In that, there's a single SQL-visible function that returns
> a dummy datatype indicating that it's an FDW handler, and when called,
> it hands back a struct containing pointers to all the other functions
> that the particular wrapper needs to supply (and, if necessary, the struct
> could have non-function-pointer fields containing other info the core
> system might need to know about the handler).  We could similarly invent a
> pseudotype "tablesample_handler" and represent each tablesample method by
> a single SQL function returning tablesample_handler.  Any future changes
> in the API for tablesample handlers would then appear as changes in the C
> definition of the struct returned by the handler, which requires no
> SQL-visible thrashing, hence creates no headaches for pg_upgrade, and it
> is pretty easy to design it so that failure to update an external module
> that implements the API results in C compiler errors and/or sane fallback
> behavior.

That's elegant.

> Once we've done that, I think we don't even need a special catalog
> representing tablesample methods.  Given "FROM foo TABLESAMPLE
> bernoulli(...)", the parser could just look for a function bernoulli()
> returning tablesample_handler, and it's done.  The querytree would have an
> ordinary function dependency on that function, which would be sufficient
> to handle DROP dependency behaviors properly.  (On reflection, maybe
> better if it's "bernoulli(internal) returns tablesample_handler",
> so as to guarantee that such a function doesn't create a conflict with
> any user-defined function of the same name.)
>
> Thoughts?

Regarding the fact that those two contrib modules can be part of a
-contrib package and could be installed, nuking those two extensions
from the tree and preventing the creating of custom tablesample
methods looks like a correct course of things to do for 9.5.

When looking at this patch I was as well surprised that the BERNOUILLI
method can only be applied on a physical relation and was not able to
fire on a materialized set of tuples, say the result of a WITH clause
for example.

> PS: now that I've written this rant, I wonder why we don't redesign the
> index AM API along the same lines.  It probably doesn't matter much at
> the moment, but if we ever get serious about supporting index AM
> extensions, I think we ought to consider doing that.

Any API redesign looks to be clearly 9.6 work IMO at this point.
-- 
Michael



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [PATCH] SQL function to report log message
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: multivariate statistics / patch v7