Обсуждение: WIP: Rework access method interface

Поиск
Список
Период
Сортировка

WIP: Rework access method interface

От
Alexander Korotkov
Дата:
Hacker,

some time before I proposed patches implementing CREATE ACCESS METHOD.
As I get from comments to my patches and also from Tom's comment about AM interface in tablesampling thread – AM interface needs reworking. And AFAICS AM interface rework is essential to have CREATE ACCESS METHOD command.

This is why I'd like to show a WIP patch implementing AM interface rework. Patch is quite dirty yet, but I think it illustrated the idea quite clear. AM now have single parameter – handler function. This handler returns pointer to AmRoutine which have the same data as pg_am had before. Thus, API is organized very like FDW.

However, this patch appears to take more work than I expected. It have to do many changes spread among many files. Also, it appears not so easy to hide amsupport into AmRoutine, because it's needed for relcache. As a temporary solution it's duplicated in RelationData.

What do you think about this approach of AM interface rework?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-08-09 23:56, Alexander Korotkov wrote:
> Hacker,
>
> some time before I proposed patches implementing CREATE ACCESS METHOD.
> http://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com
> As I get from comments to my patches and also from Tom's comment about
> AM interface in tablesampling thread – AM interface needs reworking. And
> AFAICS AM interface rework is essential to have CREATE ACCESS METHOD
> command.
> http://www.postgresql.org/message-id/5438.1436740611@sss.pgh.pa.us

Cool, I was planning to take a stab at this myself when I have time, so 
I am glad you posted this.

> This is why I'd like to show a WIP patch implementing AM interface
> rework. Patch is quite dirty yet, but I think it illustrated the idea
> quite clear. AM now have single parameter – handler function. This
> handler returns pointer to AmRoutine which have the same data as pg_am
> had before. Thus, API is organized very like FDW.
>

I wonder if it would make sense to call it IndexAmRoutine instead in 
case we add other AMs (like the sequence am, or maybe column store if 
that's done as AM) in the future.

The patch should probably add the am_handler type which should be return 
type of the am handler function (see fdw_handler and tsm_handler types).

I also think that the CHECK_PROCEDUREs should be in the place of the 
original GET_REL_PROCEDUREs  (just after relation check). If the 
interface must exist we may as well check for it at the beginning and 
not after we did some other work which is useless without the interface.

> However, this patch appears to take more work than I expected. It have
> to do many changes spread among many files.

Yeah you need to change the definition and I/O handling of every AM 
function, but that's to be expected.

> Also, it appears not so easy
> to hide amsupport into AmRoutine, because it's needed for relcache. As a
> temporary solution it's duplicated in RelationData.
>

I don't understand this, there is already AmRoutine in RelationData, why 
the need for additional field for just amsupport?

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Mon, Aug 10, 2015 at 1:12 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-08-09 23:56, Alexander Korotkov wrote:
Hacker,

some time before I proposed patches implementing CREATE ACCESS METHOD.
http://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com
As I get from comments to my patches and also from Tom's comment about
AM interface in tablesampling thread – AM interface needs reworking. And
AFAICS AM interface rework is essential to have CREATE ACCESS METHOD
command.
http://www.postgresql.org/message-id/5438.1436740611@sss.pgh.pa.us

Cool, I was planning to take a stab at this myself when I have time, so I am glad you posted this.

This is why I'd like to show a WIP patch implementing AM interface
rework. Patch is quite dirty yet, but I think it illustrated the idea
quite clear. AM now have single parameter – handler function. This
handler returns pointer to AmRoutine which have the same data as pg_am
had before. Thus, API is organized very like FDW.


I wonder if it would make sense to call it IndexAmRoutine instead in case we add other AMs (like the sequence am, or maybe column store if that's done as AM) in the future.

Good point!
 
The patch should probably add the am_handler type which should be return type of the am handler function (see fdw_handler and tsm_handler types).

Sounds reasonable!
 
I also think that the CHECK_PROCEDUREs should be in the place of the original GET_REL_PROCEDUREs  (just after relation check). If the interface must exist we may as well check for it at the beginning and not after we did some other work which is useless without the interface.

Ok, good point too.

However, this patch appears to take more work than I expected. It have
to do many changes spread among many files.

Yeah you need to change the definition and I/O handling of every AM function, but that's to be expected.

Yes, this is why I decided to get some feedback on this stage of work.
 
Also, it appears not so easy
to hide amsupport into AmRoutine, because it's needed for relcache. As a
temporary solution it's duplicated in RelationData.

I don't understand this, there is already AmRoutine in RelationData, why the need for additional field for just amsupport?

We need amsupport in load_relcache_init_file() which reads "pg_internal.init". I'm not sure this is correct place to call am_handler. It should work in the case of built-in AM. But if AM is defined in the extension then we wouldn't be able to do catalog lookup for am_handler on this stage of initialization.

Another point is that amsupport and amstrategies are used for regression tests of opclasses and opfamilies. Thus, we probably can keep them in pg_am.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Mon, Aug 10, 2015 at 1:12 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
>> I don't understand this, there is already AmRoutine in RelationData, why
>> the need for additional field for just amsupport?

> We need amsupport in load_relcache_init_file() which reads
> "pg_internal.init". I'm not sure this is correct place to call am_handler.
> It should work in the case of built-in AM. But if AM is defined in the
> extension then we wouldn't be able to do catalog lookup for am_handler on
> this stage of initialization.

This is an issue we'll have to face before there's much hope of having
index AMs as extensions: how would you locate any extension function
without catalog access?  Storing raw function pointers in pg_internal.init
is not an answer in an ASLR world.

I think we can dodge the issue so far as pg_internal.init is concerned by
decreeing that system catalogs can only have indexes with built-in AMs.
Calling a built-in function doesn't require catalog access, so there
should be no problem with re-calling the handler function by OID during
load_relcache_init_file().

We could also have problems with WAL replay, though I think the consensus
there is that extension AMs have to use generic WAL records that don't
require any index-specific replay code.
        regards, tom lane



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Mon, Aug 10, 2015 at 5:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Mon, Aug 10, 2015 at 1:12 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
>> I don't understand this, there is already AmRoutine in RelationData, why
>> the need for additional field for just amsupport?

> We need amsupport in load_relcache_init_file() which reads
> "pg_internal.init". I'm not sure this is correct place to call am_handler.
> It should work in the case of built-in AM. But if AM is defined in the
> extension then we wouldn't be able to do catalog lookup for am_handler on
> this stage of initialization.

This is an issue we'll have to face before there's much hope of having
index AMs as extensions: how would you locate any extension function
without catalog access?  Storing raw function pointers in pg_internal.init
is not an answer in an ASLR world.

I think we can dodge the issue so far as pg_internal.init is concerned by
decreeing that system catalogs can only have indexes with built-in AMs.
Calling a built-in function doesn't require catalog access, so there
should be no problem with re-calling the handler function by OID during
load_relcache_init_file().

That should work, thanks! Also we can have SQL-visible functions to get amsupport and amstrategies and use them in the regression tests.
 
We could also have problems with WAL replay, though I think the consensus
there is that extension AMs have to use generic WAL records that don't
require any index-specific replay code.

Yes, I've already showed one version of generic WAL records. The main objecting against them was it's hard insure that composed WAL-record is correct.
Now I'm working on new version which would be much easier and safe to use.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-08-10 16:58, Alexander Korotkov wrote:
> On Mon, Aug 10, 2015 at 5:48 PM, Tom Lane <tgl@sss.pgh.pa.us
> <mailto:tgl@sss.pgh.pa.us>> wrote:
>
>     Alexander Korotkov <a.korotkov@postgrespro.ru
>     <mailto:a.korotkov@postgrespro.ru>> writes:
>     > On Mon, Aug 10, 2015 at 1:12 PM, Petr Jelinek <petr@2ndquadrant.com <mailto:petr@2ndquadrant.com>> wrote:
>     >> I don't understand this, there is already AmRoutine in RelationData, why
>     >> the need for additional field for just amsupport?
>
>     > We need amsupport in load_relcache_init_file() which reads
>     > "pg_internal.init". I'm not sure this is correct place to call am_handler.
>     > It should work in the case of built-in AM. But if AM is defined in the
>     > extension then we wouldn't be able to do catalog lookup for am_handler on
>     > this stage of initialization.
>
>     This is an issue we'll have to face before there's much hope of having
>     index AMs as extensions: how would you locate any extension function
>     without catalog access?  Storing raw function pointers in
>     pg_internal.init
>     is not an answer in an ASLR world.
>
>     I think we can dodge the issue so far as pg_internal.init is
>     concerned by
>     decreeing that system catalogs can only have indexes with built-in AMs.
>     Calling a built-in function doesn't require catalog access, so there
>     should be no problem with re-calling the handler function by OID during
>     load_relcache_init_file().
>
>
> That should work, thanks! Also we can have SQL-visible functions to get
> amsupport and amstrategies and use them in the regression tests.
>

SQL-visible functions would be preferable to storing it in pg_am as 
keeping the params in pg_am would limit the extensibility of pg_am itself.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Mon, Aug 10, 2015 at 6:36 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-08-10 16:58, Alexander Korotkov wrote:
That should work, thanks! Also we can have SQL-visible functions to get
amsupport and amstrategies and use them in the regression tests.


SQL-visible functions would be preferable to storing it in pg_am as keeping the params in pg_am would limit the extensibility of pg_am itself.

I actually meant just two particular functions (not per AM) which both get AM oid as argument. They should call amhandler and return amsupport and amstrategies correspondingly.
These functions can be used in regression tests to check opclass and opfamilies correctness.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Petr Jelinek <petr@2ndquadrant.com> writes:
> On 2015-08-10 16:58, Alexander Korotkov wrote:
>> That should work, thanks! Also we can have SQL-visible functions to get
>> amsupport and amstrategies and use them in the regression tests.

> SQL-visible functions would be preferable to storing it in pg_am as 
> keeping the params in pg_am would limit the extensibility of pg_am itself.

I don't see any particularly good reason to remove amsupport and
amstrategies from pg_am.  Those are closely tied to the other catalog
infrastructure for indexes (pg_amproc, pg_amop) which I don't think are
candidates for getting changed by this patch.

There are a couple of other pg_am columns, such as amstorage and
amcanorderbyop, which similarly bear on what's legal to appear in
related catalogs such as pg_opclass.  I'd be sort of inclined to
leave those in the catalog as well.  I do not see that exposing
a SQL function is better than exposing a catalog column; either
way, that property is SQL-visible.
        regards, tom lane



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Mon, Aug 10, 2015 at 6:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Petr Jelinek <petr@2ndquadrant.com> writes:
> On 2015-08-10 16:58, Alexander Korotkov wrote:
>> That should work, thanks! Also we can have SQL-visible functions to get
>> amsupport and amstrategies and use them in the regression tests.

> SQL-visible functions would be preferable to storing it in pg_am as
> keeping the params in pg_am would limit the extensibility of pg_am itself.

I don't see any particularly good reason to remove amsupport and
amstrategies from pg_am.  Those are closely tied to the other catalog
infrastructure for indexes (pg_amproc, pg_amop) which I don't think are
candidates for getting changed by this patch.

There are a couple of other pg_am columns, such as amstorage and
amcanorderbyop, which similarly bear on what's legal to appear in
related catalogs such as pg_opclass.  I'd be sort of inclined to
leave those in the catalog as well.  I do not see that exposing
a SQL function is better than exposing a catalog column; either
way, that property is SQL-visible.

That answers my question, thanks!

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-08-10 17:47, Tom Lane wrote:
> Petr Jelinek <petr@2ndquadrant.com> writes:
>> On 2015-08-10 16:58, Alexander Korotkov wrote:
>>> That should work, thanks! Also we can have SQL-visible functions to get
>>> amsupport and amstrategies and use them in the regression tests.
>
>> SQL-visible functions would be preferable to storing it in pg_am as
>> keeping the params in pg_am would limit the extensibility of pg_am itself.
>
> I don't see any particularly good reason to remove amsupport and
> amstrategies from pg_am.  Those are closely tied to the other catalog
> infrastructure for indexes (pg_amproc, pg_amop) which I don't think are
> candidates for getting changed by this patch.
>

Ok, in that case it seems unlikely that we'll be able to use pg_am for 
any other access methods besides indexes in the future.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Mon, Aug 10, 2015 at 6:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> There are a couple of other pg_am columns, such as amstorage and
>> amcanorderbyop, which similarly bear on what's legal to appear in
>> related catalogs such as pg_opclass.  I'd be sort of inclined to
>> leave those in the catalog as well.  I do not see that exposing
>> a SQL function is better than exposing a catalog column; either
>> way, that property is SQL-visible.

> That answers my question, thanks!

BTW, just to clarify: we can still have the desirable property that
CREATE INDEX ACCESS METHOD needs no parameters other than the AM handler
function name.  The AM code would still expose all of its properties
through the struct returned by the handler.  What is at issue here is
how information in that struct that needs to be available to SQL code
gets exposed.  We can do that either by exposing SQL functions to get
it, or by having CREATE INDEX ACCESS METHOD call the handler and then
copy some fields into the new pg_am row.  I'm suggesting that we should
do the latter for any fields that determine validity of pg_opclass,
pg_amop, etc entries associated with the AM type.  The AM could not
reasonably change its mind about such properties (within a major
release at least) so there is no harm in making a long-lived copy
in pg_am.  And we might as well not break SQL code unnecessarily
by removing fields that used to be there.

There may be some other AM properties that would be better to expose
through SQL functions because they could potentially change without
invalidating information stored in the other catalogs.  I'm not sure
whether there is any such data that needs to be accessible at SQL
level, though.
        regards, tom lane



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Petr Jelinek <petr@2ndquadrant.com> writes:
> On 2015-08-10 17:47, Tom Lane wrote:
>> I don't see any particularly good reason to remove amsupport and
>> amstrategies from pg_am.  Those are closely tied to the other catalog
>> infrastructure for indexes (pg_amproc, pg_amop) which I don't think are
>> candidates for getting changed by this patch.

> Ok, in that case it seems unlikely that we'll be able to use pg_am for 
> any other access methods besides indexes in the future.

I think that's likely for the best anyway; there are too many catalogs
that think a pg_am OID identifies an index AM.  Better to create other
catalogs for other types of AMs.
        regards, tom lane



Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
Tom Lane wrote:

> There are a couple of other pg_am columns, such as amstorage and
> amcanorderbyop, which similarly bear on what's legal to appear in
> related catalogs such as pg_opclass.  I'd be sort of inclined to
> leave those in the catalog as well.  I do not see that exposing
> a SQL function is better than exposing a catalog column; either
> way, that property is SQL-visible.

If we do that, it doesn't seem reasonable to use the same catalog for
other things such as sequence AM, right?  IMO it'd be better to keep the
catalog agnostic for exactly what each row is going to be an AM for.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-08-10 18:08, Tom Lane wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
>> On Mon, Aug 10, 2015 at 6:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> There are a couple of other pg_am columns, such as amstorage and
>>> amcanorderbyop, which similarly bear on what's legal to appear in
>>> related catalogs such as pg_opclass.  I'd be sort of inclined to
>>> leave those in the catalog as well.  I do not see that exposing
>>> a SQL function is better than exposing a catalog column; either
>>> way, that property is SQL-visible.
>
>> That answers my question, thanks!
>
> BTW, just to clarify: we can still have the desirable property that
> CREATE INDEX ACCESS METHOD needs no parameters other than the AM handler
> function name.  The AM code would still expose all of its properties
> through the struct returned by the handler.  What is at issue here is
> how information in that struct that needs to be available to SQL code
> gets exposed.  We can do that either by exposing SQL functions to get
> it, or by having CREATE INDEX ACCESS METHOD call the handler and then
> copy some fields into the new pg_am row.  I'm suggesting that we should
> do the latter for any fields that determine validity of pg_opclass,
> pg_amop, etc entries associated with the AM type.  The AM could not
> reasonably change its mind about such properties (within a major
> release at least) so there is no harm in making a long-lived copy
> in pg_am.  And we might as well not break SQL code unnecessarily
> by removing fields that used to be there.
>

That's definitely the case for built-in AMs but 3rd party ones won't 
necessarily follow PG release schedule/versioning and I can imagine 
change of for example amcanorderbyop between AM releases as the AM 
matures. This could probably be solved by some kind of invalidation 
mechanism (even if it's some additional SQL).

However I am not sure if using catalog as some sort of cache for 
function output is a good idea in general. IMHO it would be better to 
just have those options as part of CREATE and ALTER DDL for INDEX ACCESS 
METHODS if we store them in pg_am.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-08-10 18:16, Alvaro Herrera wrote:
> Tom Lane wrote:
>
>> There are a couple of other pg_am columns, such as amstorage and
>> amcanorderbyop, which similarly bear on what's legal to appear in
>> related catalogs such as pg_opclass.  I'd be sort of inclined to
>> leave those in the catalog as well.  I do not see that exposing
>> a SQL function is better than exposing a catalog column; either
>> way, that property is SQL-visible.
>
> If we do that, it doesn't seem reasonable to use the same catalog for
> other things such as sequence AM, right?  IMO it'd be better to keep the
> catalog agnostic for exactly what each row is going to be an AM for.
>

Yeah I said the same, the question is if we should have pg_am agnostic 
or just assume that it's index AM and let other AM types create separate 
catalogs. Tom seems to prefer the latter, I don't see problem with that, 
except that I would really hate to add more am related columns to 
pg_class. I would not mind having relam pointing to different AM catalog 
for different relkinds but dunno if that's ok for others (it's not 
really normalized design).

We could also keep pg_am agnostic and add pg_index_am for additional 
info about index AMs, but that would make this patch more invasive.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> Petr Jelinek <petr@2ndquadrant.com> writes:
> > On 2015-08-10 17:47, Tom Lane wrote:
> >> I don't see any particularly good reason to remove amsupport and
> >> amstrategies from pg_am.  Those are closely tied to the other catalog
> >> infrastructure for indexes (pg_amproc, pg_amop) which I don't think are
> >> candidates for getting changed by this patch.
> 
> > Ok, in that case it seems unlikely that we'll be able to use pg_am for 
> > any other access methods besides indexes in the future.
> 
> I think that's likely for the best anyway; there are too many catalogs
> that think a pg_am OID identifies an index AM.  Better to create other
> catalogs for other types of AMs.

That means we won't be able to reuse pg_class.relam as a pointer to the
AM-of-the-other-kind either.  I don't think this is the best course of
action.  We have the sequence AM patch that already reuses
pg_class.relam to point to pg_seqam.oid, but you objected to that on
relational theory grounds, which seemed reasonable to me.  The other
option is to create yet another pg_class column with an OID of some
other AM catalog, but this seems a waste.

FWIW the column store patch we're working on also wants to have its own
AM-like catalog.  In our current code we have a separate catalog
pg_colstore_am, but are eagerly waiting for the discussion on this to
settle so that we can just use pg_am and pg_class.relam instead.  (The
patch itself is not public yet since it's nowhere near usable, and this
detail is a pretty minor issue, but I thought reasonable to bring it up
here.  We're also waiting on upper-planner "path-ification" since it
seems likely that some code will collide there, anyway.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> I think that's likely for the best anyway; there are too many catalogs
>> that think a pg_am OID identifies an index AM.  Better to create other
>> catalogs for other types of AMs.

> That means we won't be able to reuse pg_class.relam as a pointer to the
> AM-of-the-other-kind either.

Hm.  So one way or the other we're going to end up violating relational
theory somewhere.  OK, I yield: let's say that pg_am has amname, amkind,
amhandler, and nothing else.  Then we will need SQL functions to expose
whatever information we think needs to be available to SQL code.
        regards, tom lane



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Mon, Aug 10, 2015 at 7:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> I think that's likely for the best anyway; there are too many catalogs
>> that think a pg_am OID identifies an index AM.  Better to create other
>> catalogs for other types of AMs.

> That means we won't be able to reuse pg_class.relam as a pointer to the
> AM-of-the-other-kind either.

Hm.  So one way or the other we're going to end up violating relational
theory somewhere.  OK, I yield: let's say that pg_am has amname, amkind,
amhandler, and nothing else.  Then we will need SQL functions to expose
whatever information we think needs to be available to SQL code.

There is second revision of this patch. Changes are so:

 * AmRoutine was renamed to IndexAmRoutine assuming there could be other access methods in the future.
 * amhandlers now return index_am_handler pseudotype.
 * CHECK_PROCEDUREs are now is the place of original GET_REL_PROCEDUREs.
 * amstrategies, amsupport, amcanorderbyop, amstorage, amkeytype are in both pg_am and IndexAmRoutine. Consistence of amhandler answer and pg_am tuple is checking.
 * Some comments and refactoring.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Mon, Aug 10, 2015 at 7:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hm.  So one way or the other we're going to end up violating relational
>> theory somewhere.  OK, I yield: let's say that pg_am has amname, amkind,
>> amhandler, and nothing else.  Then we will need SQL functions to expose
>> whatever information we think needs to be available to SQL code.

> There is second revision of this patch. Changes are so:

>  * AmRoutine was renamed to IndexAmRoutine assuming there could be other
> access methods in the future.
>  * amhandlers now return index_am_handler pseudotype.
>  * CHECK_PROCEDUREs are now is the place of original GET_REL_PROCEDUREs.
>  * amstrategies, amsupport, amcanorderbyop, amstorage, amkeytype are in
> both pg_am and IndexAmRoutine. Consistence of amhandler answer and pg_am
> tuple is checking.

[ scratches head... ]  I thought we'd just agreed we weren't going to keep
any of those pg_am columns?  If we keep them, we'll have to define what
they mean for sequence AMs etc.  ("Let them be NULL" would likely break
enough stuff that we might as well not have them.)
        regards, tom lane



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Mon, Aug 24, 2015 at 5:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Mon, Aug 10, 2015 at 7:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hm.  So one way or the other we're going to end up violating relational
>> theory somewhere.  OK, I yield: let's say that pg_am has amname, amkind,
>> amhandler, and nothing else.  Then we will need SQL functions to expose
>> whatever information we think needs to be available to SQL code.

> There is second revision of this patch. Changes are so:

>  * AmRoutine was renamed to IndexAmRoutine assuming there could be other
> access methods in the future.
>  * amhandlers now return index_am_handler pseudotype.
>  * CHECK_PROCEDUREs are now is the place of original GET_REL_PROCEDUREs.
>  * amstrategies, amsupport, amcanorderbyop, amstorage, amkeytype are in
> both pg_am and IndexAmRoutine. Consistence of amhandler answer and pg_am
> tuple is checking.

[ scratches head... ]  I thought we'd just agreed we weren't going to keep
any of those pg_am columns?  If we keep them, we'll have to define what
they mean for sequence AMs etc.  ("Let them be NULL" would likely break
enough stuff that we might as well not have them.)

From the previous discussion I see following options:
1) Non-index access methods don't reuse pg_class.relam nor pg_am. Fully relational compliant but complex catalog structure.
2) Non-index access methods reuse pg_class.relam but don't reuse pg_am. This violates relational theory because single column reference multiple tables.
3) Non-index access methods reuse both pg_class.relam and pg_am. This violates relational theory because we store different objects in the same table.

I'd say we already have precedent of #2. It's pg_depend which reference objects of arbitrary types.
In the #3 we really shouldn't keep any specific to index am in pg_am.

But what we assume to be an access method in general? For instance, we have foreign data wrappers which aren't access methods (but looks quite similar from some degree). 

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: WIP: Rework access method interface

От
Jim Nasby
Дата:
On 8/24/15 9:49 AM, Alexander Korotkov wrote:
> 2) Non-index access methods reuse pg_class.relam but don't reuse pg_am.
> This violates relational theory because single column reference multiple
> tables.
> 3) Non-index access methods reuse both pg_class.relam and pg_am. This
> violates relational theory because we store different objects in the
> same table.
>
> I'd say we already have precedent of #2. It's pg_depend which reference
> objects of arbitrary types.
> In the #3 we really shouldn't keep any specific to index am in pg_am.

In userspace, table inheritance handles this nicely. Stick a "type" 
field in the parent so you know what kind of entity each record is, 
along with all your common fields. Everything else is in the children, 
and code generally already knows which child table to hit or doesn't 
care about specifics and hits only the parent. Perhaps something similar 
could be made to work with a catalog table.

#2 seems like a twist on the same idea, except that there's fields in 
pg_class that tell you what the child is instead of a real parent table. 
Presumably we could still create a parent table even if the internals 
were going through pg_class.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
Jim Nasby wrote:
> On 8/24/15 9:49 AM, Alexander Korotkov wrote:
> >2) Non-index access methods reuse pg_class.relam but don't reuse pg_am.
> >This violates relational theory because single column reference multiple
> >tables.
> >3) Non-index access methods reuse both pg_class.relam and pg_am. This
> >violates relational theory because we store different objects in the
> >same table.
> >
> >I'd say we already have precedent of #2. It's pg_depend which reference
> >objects of arbitrary types.
> >In the #3 we really shouldn't keep any specific to index am in pg_am.

In my reading of the thread, we have a consensus for doing #3, and that
one gets my vote in any case.

> In userspace, table inheritance handles this nicely. Stick a "type" field in
> the parent so you know what kind of entity each record is, along with all
> your common fields.

Yeah, this pattern is not hugely common but it's definitely used in some
places.  In fact, I would think it is less of a violation of relational
theory than #2 -- because then relam is always a reference to pg_am,
instead of sometimes being a reference to some other catalog.  What's
stored in pg_am is not pg_class' concern; and I think calling pg_am a
catalog for "access methods" (in a generic way, not only indexes) is
sound.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Jim Nasby wrote:
>> On 8/24/15 9:49 AM, Alexander Korotkov wrote:
>>> 3) Non-index access methods reuse both pg_class.relam and pg_am. This
>>> violates relational theory because we store different objects in the
>>> same table.

> In my reading of the thread, we have a consensus for doing #3, and that
> one gets my vote in any case.

That's what I thought as well.

>> In userspace, table inheritance handles this nicely. Stick a "type" field in
>> the parent so you know what kind of entity each record is, along with all
>> your common fields.

> Yeah, this pattern is not hugely common but it's definitely used in some
> places.  In fact, I would think it is less of a violation of relational
> theory than #2 -- because then relam is always a reference to pg_am,
> instead of sometimes being a reference to some other catalog.  What's
> stored in pg_am is not pg_class' concern; and I think calling pg_am a
> catalog for "access methods" (in a generic way, not only indexes) is
> sound.

I'm good with this as long as all the things that get stored in pg_am
are things that pg_class.relam can legitimately reference.  If somebody
proposed adding an "access method" kind that was not a relation access
method, I'd probably push back on whether that should be in pg_am or
someplace else.

The fact that the subsidiary tables like pg_opclass no longer have
perfectly clean foreign keys to pg_am is a bit annoying, but that's better
than having pg_class.relam linking to multiple tables.  (Also, if we
really wanted to we could define the foreign key constraints as
multicolumn ones that also require a match on amkind.  So it's not *that*
broken.  We could easily add opr_sanity tests to verify proper matching.)
        regards, tom lane



Re: WIP: Rework access method interface

От
Jim Nasby
Дата:
On 8/25/15 10:56 AM, Tom Lane wrote:
> I'm good with this as long as all the things that get stored in pg_am
> are things that pg_class.relam can legitimately reference.  If somebody
> proposed adding an "access method" kind that was not a relation access
> method, I'd probably push back on whether that should be in pg_am or
> someplace else.

Would fields in pg_am be overloaded then? From a SQL standpoint it'd be 
much nicer to have child tables, though that could potentially be faked 
with views.

> The fact that the subsidiary tables like pg_opclass no longer have
> perfectly clean foreign keys to pg_am is a bit annoying, but that's better
> than having pg_class.relam linking to multiple tables.  (Also, if we
> really wanted to we could define the foreign key constraints as
> multicolumn ones that also require a match on amkind.  So it's not*that*
> broken.  We could easily add opr_sanity tests to verify proper matching.)

I have wished for something similar to CHECK constraints that operated 
on data in a *related* table (something we already have a foreign key 
to) for this purpose. In the past I've enforced it with triggers on both 
sides but writing those gets old after a while.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
Jim Nasby wrote:
> On 8/25/15 10:56 AM, Tom Lane wrote:
> >I'm good with this as long as all the things that get stored in pg_am
> >are things that pg_class.relam can legitimately reference.  If somebody
> >proposed adding an "access method" kind that was not a relation access
> >method, I'd probably push back on whether that should be in pg_am or
> >someplace else.
> 
> Would fields in pg_am be overloaded then? From a SQL standpoint it'd be much
> nicer to have child tables, though that could potentially be faked with
> views.

The whole point of this conversation is that we're getting rid of almost
all the columns in pg_am, leaving only an "amkind" column and a pointer
to a handler function.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> On 8/25/15 10:56 AM, Tom Lane wrote:
>> I'm good with this as long as all the things that get stored in pg_am
>> are things that pg_class.relam can legitimately reference.  If somebody
>> proposed adding an "access method" kind that was not a relation access
>> method, I'd probably push back on whether that should be in pg_am or
>> someplace else.

> Would fields in pg_am be overloaded then?

No, because the proposal was to reduce pg_am to just amname, amkind
(which would be something like 'i' or 's'), and amhandler.  Everything
specific to a particular type of access method would be shoved down to
the level of the C APIs.

> From a SQL standpoint it'd be 
> much nicer to have child tables, though that could potentially be faked 
> with views.

I've looked into having actual child tables in the system catalogs, and
I'm afraid that the pain-to-reward ratio doesn't look very good.
        regards, tom lane



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Tue, Aug 25, 2015 at 7:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> On 8/25/15 10:56 AM, Tom Lane wrote:
>> I'm good with this as long as all the things that get stored in pg_am
>> are things that pg_class.relam can legitimately reference.  If somebody
>> proposed adding an "access method" kind that was not a relation access
>> method, I'd probably push back on whether that should be in pg_am or
>> someplace else.

> Would fields in pg_am be overloaded then?

No, because the proposal was to reduce pg_am to just amname, amkind
(which would be something like 'i' or 's'), and amhandler.  Everything
specific to a particular type of access method would be shoved down to
the level of the C APIs.

OK. So, as we mentioned before, if we need to expose something of am parameters at SQL-level then we need to write special functions which would call amhandler and expose it.
Did we come to the agreement on this solution?

> From a SQL standpoint it'd be
> much nicer to have child tables, though that could potentially be faked
> with views.

I've looked into having actual child tables in the system catalogs, and
I'm afraid that the pain-to-reward ratio doesn't look very good.

Agree. Teach syscache about inheritance would be overengeneering for this problem.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> OK. So, as we mentioned before, if we need to expose something of am
> parameters at SQL-level then we need to write special functions which would
> call amhandler and expose it.
> Did we come to the agreement on this solution?

I think we were agreed that we should write functions to expose whatever
needs to be visible at SQL level.  I'm not sure that we had a consensus
on exactly which things need to be visible.

One thought here is that we might not want to just blindly duplicate
the existing pg_am behavior anyway.  For example, the main use of the
amstrategies column was to allow validation of pg_amop.amopstrategy
entries --- but in 4 out of the 6 existing AMs, knowledge of the AM alone
isn't sufficient information to determine the valid set of strategy
numbers anyway.  So inventing a "pg_amstrategies(am oid)" function seems
like it would be repeating a previous failure.  Not quite sure what to
do instead, though.  We could imagine something like "pg_amstrategies(am
oid, opclass oid)", but I don't see how to implement it without asking
opclasses to provide a validation function, which maybe is a change we
don't want to take on here.
        regards, tom lane



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Wed, Aug 26, 2015 at 6:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> OK. So, as we mentioned before, if we need to expose something of am
> parameters at SQL-level then we need to write special functions which would
> call amhandler and expose it.
> Did we come to the agreement on this solution?

I think we were agreed that we should write functions to expose whatever
needs to be visible at SQL level.  I'm not sure that we had a consensus
on exactly which things need to be visible.

One thought here is that we might not want to just blindly duplicate
the existing pg_am behavior anyway.  For example, the main use of the
amstrategies column was to allow validation of pg_amop.amopstrategy
entries --- but in 4 out of the 6 existing AMs, knowledge of the AM alone
isn't sufficient information to determine the valid set of strategy
numbers anyway.  So inventing a "pg_amstrategies(am oid)" function seems
like it would be repeating a previous failure.  Not quite sure what to
do instead, though.  We could imagine something like "pg_amstrategies(am
oid, opclass oid)", but I don't see how to implement it without asking
opclasses to provide a validation function, which maybe is a change we
don't want to take on here.

Could we add another function to access method interface which would validate opclass?
Am could validate this way not only strategies, but also supporting functions. For instance, in GIN, we now require opclass to specify at least one of consistent and triconsistent. ISTM I would be nice to let the access method check such conditions. Also, we would be able to check opclass correction on its creation. Now one can create opclass with missing support functions which doesn't work.
In the SQL-level we can create function which validates opclass using this new method. This function can be used in regression tests.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Wed, Aug 26, 2015 at 7:20 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Wed, Aug 26, 2015 at 6:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> OK. So, as we mentioned before, if we need to expose something of am
> parameters at SQL-level then we need to write special functions which would
> call amhandler and expose it.
> Did we come to the agreement on this solution?

I think we were agreed that we should write functions to expose whatever
needs to be visible at SQL level.  I'm not sure that we had a consensus
on exactly which things need to be visible.

One thought here is that we might not want to just blindly duplicate
the existing pg_am behavior anyway.  For example, the main use of the
amstrategies column was to allow validation of pg_amop.amopstrategy
entries --- but in 4 out of the 6 existing AMs, knowledge of the AM alone
isn't sufficient information to determine the valid set of strategy
numbers anyway.  So inventing a "pg_amstrategies(am oid)" function seems
like it would be repeating a previous failure.  Not quite sure what to
do instead, though.  We could imagine something like "pg_amstrategies(am
oid, opclass oid)", but I don't see how to implement it without asking
opclasses to provide a validation function, which maybe is a change we
don't want to take on here.

Could we add another function to access method interface which would validate opclass?
Am could validate this way not only strategies, but also supporting functions. For instance, in GIN, we now require opclass to specify at least one of consistent and triconsistent. ISTM I would be nice to let the access method check such conditions. Also, we would be able to check opclass correction on its creation. Now one can create opclass with missing support functions which doesn't work.
In the SQL-level we can create function which validates opclass using this new method. This function can be used in regression tests.

Should I try to implement such new access method function, say 'amvalidate'?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-08-27 15:15, Alexander Korotkov wrote:
> On Wed, Aug 26, 2015 at 7:20 PM, Alexander Korotkov
> <a.korotkov@postgrespro.ru <mailto:a.korotkov@postgrespro.ru>> wrote:
>
>     On Wed, Aug 26, 2015 at 6:50 PM, Tom Lane <tgl@sss.pgh.pa.us
>     <mailto:tgl@sss.pgh.pa.us>> wrote:
>
>         One thought here is that we might not want to just blindly duplicate
>         the existing pg_am behavior anyway.  For example, the main use
>         of the
>         amstrategies column was to allow validation of pg_amop.amopstrategy
>         entries --- but in 4 out of the 6 existing AMs, knowledge of the
>         AM alone
>         isn't sufficient information to determine the valid set of strategy
>         numbers anyway.  So inventing a "pg_amstrategies(am oid)"
>         function seems
>         like it would be repeating a previous failure.  Not quite sure
>         what to
>         do instead, though.  We could imagine something like
>         "pg_amstrategies(am
>         oid, opclass oid)", but I don't see how to implement it without
>         asking
>         opclasses to provide a validation function, which maybe is a
>         change we
>         don't want to take on here.
>
>
>     Could we add another function to access method interface which would
>     validate opclass?
>     Am could validate this way not only strategies, but also supporting
>     functions. For instance, in GIN, we now require opclass to specify
>     at least one of consistent and triconsistent. ISTM I would be nice
>     to let the access method check such conditions. Also, we would be
>     able to check opclass correction on its creation. Now one can create
>     opclass with missing support functions which doesn't work.
>     In the SQL-level we can create function which validates opclass
>     using this new method. This function can be used in regression tests.
>
>
> Should I try to implement such new access method function, say 'amvalidate'?
>

Makes sense to me to do that, should be probably optional though.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Mon, Aug 31, 2015 at 1:28 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-08-27 15:15, Alexander Korotkov wrote:
On Wed, Aug 26, 2015 at 7:20 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru <mailto:a.korotkov@postgrespro.ru>> wrote:
    Could we add another function to access method interface which would
    validate opclass?
    Am could validate this way not only strategies, but also supporting
    functions. For instance, in GIN, we now require opclass to specify
    at least one of consistent and triconsistent. ISTM I would be nice
    to let the access method check such conditions. Also, we would be
    able to check opclass correction on its creation. Now one can create
    opclass with missing support functions which doesn't work.
    In the SQL-level we can create function which validates opclass
    using this new method. This function can be used in regression tests.


Should I try to implement such new access method function, say 'amvalidate'?


Makes sense to me to do that, should be probably optional though.


Attached patch is implementing this. It doesn't pretend to be fully correct implementation, but it should be enough for proof the concept.
In this patch access method exposes another function: amvalidate. It takes data structure representing opclass and throws error if it finds it invalid.
This method is used on new opclass definition (alter operator family etc. are not yet implemented but planned). Also, there is SQL function validate_opclass(oid) which is used in regression tests.
Any thoughts?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-09-04 16:26, Alexander Korotkov wrote:
>
> Attached patch is implementing this. It doesn't pretend to be fully
> correct implementation, but it should be enough for proof the concept.
> In this patch access method exposes another function: amvalidate. It
> takes data structure representing opclass and throws error if it finds
> it invalid.
> This method is used on new opclass definition (alter operator family
> etc. are not yet implemented but planned). Also, there is SQL function
> validate_opclass(oid) which is used in regression tests.
> Any thoughts?
>

This is starting to look good.

However I don't like the naming differences between validate_opclass and 
amvalidate. If you expect that the current amvalidate will only be used 
for opclass validation then it should be renamed accordingly.

Also GetIndexAmRoutine should check the return type of the amhandler.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Mon, Sep 7, 2015 at 9:17 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-09-04 16:26, Alexander Korotkov wrote:

Attached patch is implementing this. It doesn't pretend to be fully
correct implementation, but it should be enough for proof the concept.
In this patch access method exposes another function: amvalidate. It
takes data structure representing opclass and throws error if it finds
it invalid.
This method is used on new opclass definition (alter operator family
etc. are not yet implemented but planned). Also, there is SQL function
validate_opclass(oid) which is used in regression tests.
Any thoughts?


This is starting to look good

Thanks!
 
However I don't like the naming differences between validate_opclass and amvalidate. If you expect that the current amvalidate will only be used for opclass validation then it should be renamed accordingly.

I'm not yet sure if we need separate validation of opfamilies.

Also GetIndexAmRoutine should check the return type of the amhandler.

Will be fixed.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-09-07 20:56, Alexander Korotkov wrote:
> On Mon, Sep 7, 2015 at 9:17 PM, Petr Jelinek <petr@2ndquadrant.com
>
>     However I don't like the naming differences between validate_opclass
>     and amvalidate. If you expect that the current amvalidate will only
>     be used for opclass validation then it should be renamed accordingly.
>
>
> I'm not yet sure if we need separate validation of opfamilies.
>

Well either the amvalidate or the validate_opclass should be renamed 
IMHO, depending on which way the checking goes (one interface for 
everything with generic name or multiple interfaces for multiple 
validations).


--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Mon, Sep 7, 2015 at 10:02 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-09-07 20:56, Alexander Korotkov wrote:
On Mon, Sep 7, 2015 at 9:17 PM, Petr Jelinek <petr@2ndquadrant.com

    However I don't like the naming differences between validate_opclass
    and amvalidate. If you expect that the current amvalidate will only
    be used for opclass validation then it should be renamed accordingly.


I'm not yet sure if we need separate validation of opfamilies.


Well either the amvalidate or the validate_opclass should be renamed IMHO, depending on which way the checking goes (one interface for everything with generic name or multiple interfaces for multiple validations).

Yes, I agree with you about naming.
I'm not sure about separate validation of opfamilies independent of its naming. I'd like to get any arguments/advises about it.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Mon, Sep 7, 2015 at 9:17 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-09-04 16:26, Alexander Korotkov wrote:

Attached patch is implementing this. It doesn't pretend to be fully
correct implementation, but it should be enough for proof the concept.
In this patch access method exposes another function: amvalidate. It
takes data structure representing opclass and throws error if it finds
it invalid.
This method is used on new opclass definition (alter operator family
etc. are not yet implemented but planned). Also, there is SQL function
validate_opclass(oid) which is used in regression tests.
Any thoughts?


This is starting to look good.

However I don't like the naming differences between validate_opclass and amvalidate. If you expect that the current amvalidate will only be used for opclass validation then it should be renamed accordingly.

validate_opclass was renamed to amvalidate.
 
Also GetIndexAmRoutine should check the return type of the amhandler.

Fixed.

See the attached patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 
Вложения

Re: WIP: Rework access method interface

От
Oleg Bartunov
Дата:


On Fri, Sep 11, 2015 at 4:22 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Mon, Sep 7, 2015 at 9:17 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-09-04 16:26, Alexander Korotkov wrote:

Attached patch is implementing this. It doesn't pretend to be fully
correct implementation, but it should be enough for proof the concept.
In this patch access method exposes another function: amvalidate. It
takes data structure representing opclass and throws error if it finds
it invalid.
This method is used on new opclass definition (alter operator family
etc. are not yet implemented but planned). Also, there is SQL function
validate_opclass(oid) which is used in regression tests.
Any thoughts?


This is starting to look good.

However I don't like the naming differences between validate_opclass and amvalidate. If you expect that the current amvalidate will only be used for opclass validation then it should be renamed accordingly.

validate_opclass was renamed to amvalidate.
 
Also GetIndexAmRoutine should check the return type of the amhandler.

Fixed.

See the attached patch.



Whhat I don't understand from this thread if  we should wait 2ndQuadrant for their sequence and column AMs or just start to work on committing it ? Alvaro, where are you ?

 
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-09-14 14:34, Oleg Bartunov wrote:
>
> Whhat I don't understand from this thread if  we should wait 2ndQuadrant
> for their sequence and column AMs or just start to work on committing it
> ? Alvaro, where are you ?
>

I don't see problems with this patch from the sequence am perspective. 
The next AM type will need to add code for different AM types but that's 
mostly it.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Teodor Sigaev
Дата:
> validate_opclass was renamed to amvalidate.

It seems to me, that amvalidate method of AM should get as argument only Oid of 
operator family. Layout and meaning of amproc/amop fields are differ for 
different AM and there isn't an AM which implements all possible features.

Actually, I'm a bit confused with follow piece of code (ginvalidate, for instance):
foreach(l, opclass->procedures)
{     ...     if (proc->lefttype != opclass->intype               || proc->righttype != opclass->intype)
continue;    ...
 

That is amproc could contain a row, which connected to some operator class but 
this fact will be missed this check and may be, never used or used wrongly.

Despite these observations, I think that this work is needed.
-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Wed, Sep 16, 2015 at 8:44 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
validate_opclass was renamed to amvalidate.

It seems to me, that amvalidate method of AM should get as argument only Oid of operator family. Layout and meaning of amproc/amop fields are differ for different AM and there isn't an AM which implements all possible features.

Actually, I'm a bit confused with follow piece of code (ginvalidate, for instance):
foreach(l, opclass->procedures)
{
     ...
     if (proc->lefttype != opclass->intype
               || proc->righttype != opclass->intype)
           continue;
     ...

That is amproc could contain a row, which connected to some operator class but this fact will be missed this check and may be, never used or used wrongly.

Despite these observations, I think that this work is needed.

After, further personal discussion with Teodor, we decided that amvalidate is out of scope for this patch.
It's not evident what should we validate in amvalidate and which way. I think if we need amvalidate it should be subject of separate patch.
The attached patch exposes index access method parameters to SQL using following fucntions:
 * get_am_param_oid(oid, text)
 * get_am_param_int(oid, text)
 * get_am_param_bool(oid, text)

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-09-18 14:58, Alexander Korotkov wrote:
> On Wed, Sep 16, 2015 at 8:44 PM, Teodor Sigaev <teodor@sigaev.ru
> <mailto:teodor@sigaev.ru>> wrote:
>
>         validate_opclass was renamed to amvalidate.
>
>
>     It seems to me, that amvalidate method of AM should get as argument
>     only Oid of operator family. Layout and meaning of amproc/amop
>     fields are differ for different AM and there isn't an AM which
>     implements all possible features.
>
> After, further personal discussion with Teodor, we decided that
> amvalidate is out of scope for this patch.
> It's not evident what should we validate in amvalidate and which way. I
> think if we need amvalidate it should be subject of separate patch.
> The attached patch exposes index access method parameters to SQL using
> following fucntions:
>   * get_am_param_oid(oid, text)
>   * get_am_param_int(oid, text)
>   * get_am_param_bool(oid, text)
>

Hmm, we might want these functons in any case (although I think just one 
function which would return all am params would be better).

But why is it not evident? We do the validations in regression tests, 
even if we just copy those then it's enough for a start.

If you mean that it's not obvious what are all the possible things that 
amvalidate should validate in the future, then yes, that will always be 
the case though. I think better solution would be to provide more 
granular validation interface in the C api (i.e. have amvalidateopclass 
etc). We can still have just one SQL exposed function called amvalidate 
which will call all those C interfaces that are supported by current 
version (I agree that those interfaces like amvalidateopclass should 
accept just Oid).

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Sun, Sep 20, 2015 at 5:02 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-09-18 14:58, Alexander Korotkov wrote:
On Wed, Sep 16, 2015 at 8:44 PM, Teodor Sigaev <teodor@sigaev.ru
<mailto:teodor@sigaev.ru>> wrote:

        validate_opclass was renamed to amvalidate.


    It seems to me, that amvalidate method of AM should get as argument
    only Oid of operator family. Layout and meaning of amproc/amop
    fields are differ for different AM and there isn't an AM which
    implements all possible features.

After, further personal discussion with Teodor, we decided that
amvalidate is out of scope for this patch.
It's not evident what should we validate in amvalidate and which way. I
think if we need amvalidate it should be subject of separate patch.
The attached patch exposes index access method parameters to SQL using
following fucntions:
  * get_am_param_oid(oid, text)
  * get_am_param_int(oid, text)
  * get_am_param_bool(oid, text)


Hmm, we might want these functons in any case (although I think just one function which would return all am params would be better).

But why is it not evident? We do the validations in regression tests, even if we just copy those then it's enough for a start

The reason is that those validations were used only in regression tests yet. It wasn't used for user-defined operator classes. User might define invalid opclass and then alter it to valid. Or user might upgrade opclass in two steps where intermediate step is invalid. This is why I think validating opclasses in CREATE/ALTER command is not evident since it changes user visible behavior and compatibility.
Simultaneously, implementing new API function just for regression tests doesn't seem to worth it for me.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 

Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Petr Jelinek <petr@2ndquadrant.com> writes:
> On 2015-09-18 14:58, Alexander Korotkov wrote:
>> After, further personal discussion with Teodor, we decided that
>> amvalidate is out of scope for this patch.
>> It's not evident what should we validate in amvalidate and which way. I
>> think if we need amvalidate it should be subject of separate patch.

> But why is it not evident? We do the validations in regression tests, 
> even if we just copy those then it's enough for a start.

I think the main reason this question is in-scope for this patch is
precisely the problem of what do we do about the regression tests.

I'm not in favor of exposing some SQL-level functions whose sole purpose
is to support those regression test queries, because while those queries
are very useful for detecting errors in handmade opclasses, they're hacks,
and always have been.  They don't work well (or at all, really) for
anything more than btree/hash cases.  It'd be better to expose amvalidate
functions, even if we don't yet have full infrastructure for them.
        regards, tom lane



Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-09-20 16:17, Alexander Korotkov wrote:
> On Sun, Sep 20, 2015 at 5:02 PM, Petr Jelinek <petr@2ndquadrant.com
>
>     Hmm, we might want these functons in any case (although I think just
>     one function which would return all am params would be better).
>
>     But why is it not evident? We do the validations in regression
>     tests, even if we just copy those then it's enough for a start
>
>
> The reason is that those validations were used only in regression tests
> yet. It wasn't used for user-defined operator classes. User might define
> invalid opclass and then alter it to valid. Or user might upgrade
> opclass in two steps where intermediate step is invalid. This is why I
> think validating opclasses in CREATE/ALTER command is not evident since
> it changes user visible behavior and compatibility.
> Simultaneously, implementing new API function just for regression tests
> doesn't seem to worth it for me.
>

I think it's ok to not do automatic validation during CREATE/ALTER just 
yet. And I also think it's much worse to implement a SQL API which 
exposes internals just for regression tests than having a C API just for 
regression tests. The reason for moving AM to C API was to have less of 
the internals exposed at SQL level afaik.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Sun, Sep 20, 2015 at 5:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Petr Jelinek <petr@2ndquadrant.com> writes:
> On 2015-09-18 14:58, Alexander Korotkov wrote:
>> After, further personal discussion with Teodor, we decided that
>> amvalidate is out of scope for this patch.
>> It's not evident what should we validate in amvalidate and which way. I
>> think if we need amvalidate it should be subject of separate patch.

> But why is it not evident? We do the validations in regression tests,
> even if we just copy those then it's enough for a start.

I think the main reason this question is in-scope for this patch is
precisely the problem of what do we do about the regression tests.

I'm not in favor of exposing some SQL-level functions whose sole purpose
is to support those regression test queries, because while those queries
are very useful for detecting errors in handmade opclasses, they're hacks,
and always have been.  They don't work well (or at all, really) for
anything more than btree/hash cases.  It'd be better to expose amvalidate
functions, even if we don't yet have full infrastructure for them.

I'm OK about continuing work on amvalidate if we can build consuensus on its design.
Could you give some feedback on amvalidate version of patch please?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: WIP: Rework access method interface

От
Teodor Sigaev
Дата:
> I'm OK about continuing work on amvalidate if we can build consuensus on its design.
> Could you give some feedback on amvalidate version of patch please?
> http://www.postgresql.org/message-id/CAPpHfds8ZyWenz9vW6tE5RZXboL1vU_wSW181vEq+mU+v1dsiw@mail.gmail.com

In attach a bit modified patch based on 4-th version, and if there are no strong
objections, I will commit it. Waiting this patch stops Alexander's development
on CREATE ACCESS METHOD and he needs to move forward.


--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Вложения

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-09-25 16:11, Teodor Sigaev wrote:
>> I'm OK about continuing work on amvalidate if we can build consuensus
>> on its design.
>> Could you give some feedback on amvalidate version of patch please?
>> http://www.postgresql.org/message-id/CAPpHfds8ZyWenz9vW6tE5RZXboL1vU_wSW181vEq+mU+v1dsiw@mail.gmail.com
>>
>
> In attach a bit modified patch based on 4-th version, and if there are
> no strong objections, I will commit it. Waiting this patch stops
> Alexander's development on CREATE ACCESS METHOD and he needs to move
> forward.
>

The code itself looks ok to me, but before we can think about committing 
this the documentation has to be updated.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Petr Jelinek <petr@2ndquadrant.com> writes:
> On 2015-09-25 16:11, Teodor Sigaev wrote:
>> In attach a bit modified patch based on 4-th version, and if there are
>> no strong objections, I will commit it. Waiting this patch stops
>> Alexander's development on CREATE ACCESS METHOD and he needs to move
>> forward.

> The code itself looks ok to me, but before we can think about committing 
> this the documentation has to be updated.

Yes.  Also, for a major change like this, it would be a good thing if
the documentation got a review from a native English speaker.  I will
volunteer to handle it if someone else does the first draft.
        regards, tom lane



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:

On Fri, Sep 25, 2015 at 6:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Petr Jelinek <petr@2ndquadrant.com> writes:
> On 2015-09-25 16:11, Teodor Sigaev wrote:
>> In attach a bit modified patch based on 4-th version, and if there are
>> no strong objections, I will commit it. Waiting this patch stops
>> Alexander's development on CREATE ACCESS METHOD and he needs to move
>> forward.

> The code itself looks ok to me, but before we can think about committing
> this the documentation has to be updated.

Yes.  Also, for a major change like this, it would be a good thing if
the documentation got a review from a native English speaker.  I will
volunteer to handle it if someone else does the first draft.

Great! I'll write this documentation.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
Teodor Sigaev wrote:
> >I'm OK about continuing work on amvalidate if we can build consuensus on its design.
> >Could you give some feedback on amvalidate version of patch please?
> >http://www.postgresql.org/message-id/CAPpHfds8ZyWenz9vW6tE5RZXboL1vU_wSW181vEq+mU+v1dsiw@mail.gmail.com
> 
> In attach a bit modified patch based on 4-th version, and if there are no
> strong objections, I will commit it. Waiting this patch stops Alexander's
> development on CREATE ACCESS METHOD and he needs to move forward.

I think the messages in ereport() need some work -- at the bare minimum,
do not uppercase the initial.  Also things such as whether operation
names such as "order by" ought to be uppercase or in quotes, for example
here:

> +            ereport(ERROR,
> +                    (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> +                     errmsg("BRIN doesn't support order by operators")));

I think the API of getOpFamilyInfo is a bit odd; is the caller expected
to fill intype and keytype always, and then it only sets the procs/opers
lists?  I wonder if it would be more sensible to have that routine
receive the pg_opclass tuple (or even the opclass OID) instead of the
opfamily OID.

I think "amapi.h" is not a great file name.  What about am_api.h?

I'm unsure about BRIN_NPROC.  Why did you set it to 15?  It's not
unthinkable that future opclass frameworks will have different numbers
of support procs.  For BRIN I'm thinking that we could add another
support proc which validates the opclass definition using the specific
framework; that way we will be able to check that the set of operators
defined are correct, etc (something that the current approach cannot
do).

I think another pgindent run would be good -- there's some broken
whitespace here and there.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-09-25 17:45, Alvaro Herrera wrote:
>
> I think the API of getOpFamilyInfo is a bit odd; is the caller expected
> to fill intype and keytype always, and then it only sets the procs/opers
> lists?  I wonder if it would be more sensible to have that routine
> receive the pg_opclass tuple (or even the opclass OID) instead of the
> opfamily OID.
>
> I think "amapi.h" is not a great file name.  What about am_api.h?
>

Well we have related fdwapi.h and tsmapi.h (and several unrelated *api.h 
which also don't use "_" in the name) so amapi.h seems fine to me.

> I'm unsure about BRIN_NPROC.  Why did you set it to 15?  It's not
> unthinkable that future opclass frameworks will have different numbers

The BRIN_NPROC should be probably defined in brin.c since it's only used 
for sizing local array variable in amvalidate and should be used to set 
amsupport in the init function as well then.

> of support procs.  For BRIN I'm thinking that we could add another
> support proc which validates the opclass definition using the specific
> framework; that way we will be able to check that the set of operators
> defined are correct, etc (something that the current approach cannot
> do).

As I said before in the thread I would prefer more granular approach to 
validation - have amvalidateopclass in the struct for the current 
functionality so that we can easily add more validators in the future. 
There can still be one amvalidate function exposed on SQL level that 
just calls all the amvalidate* functions that the am defines.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Amit Kapila
Дата:
On Fri, Sep 25, 2015 at 7:41 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
I'm OK about continuing work on amvalidate if we can build consuensus on its design.
Could you give some feedback on amvalidate version of patch please?
http://www.postgresql.org/message-id/CAPpHfds8ZyWenz9vW6tE5RZXboL1vU_wSW181vEq+mU+v1dsiw@mail.gmail.com

In attach a bit modified patch based on 4-th version, and if there are no strong objections, I will commit it. Waiting this patch stops Alexander's development on CREATE ACCESS METHOD and he needs to move forward.


1.
+InitIndexAmRoutine(Relation relation)
+{
+ IndexAmRoutine *result, *tmp;
+
+ result = (IndexAmRoutine *)MemoryContextAlloc(CacheMemoryContext,
+ sizeof(IndexAmRoutine));
+ tmp = (IndexAmRoutine *)DatumGetPointer(
+ OidFunctionCall0(relation->rd_am->amhandler));
+ memcpy(result, tmp, sizeof(IndexAmRoutine));
+ relation->amroutine = result;
+}

Is it appropriate to use CacheMemoryContext here?
Currently in load_relcache_init_file(), all the other information
like rd_indoption, rd_aminfo in Relation is allocated in indexcxt
which ofcourse in allocated in CacheMemoryContext, but still is
there a reason why this also shouldn't be allocated in same
context.

2.
- aform = (Form_pg_am) MemoryContextAlloc(CacheMemoryContext, sizeof *aform);
+ aform = (Form_pg_am)MemoryContextAlloc(CacheMemoryContext, sizeof *aform);

Spurious change.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Sat, Sep 26, 2015 at 12:55 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-09-25 17:45, Alvaro Herrera wrote:

I think the API of getOpFamilyInfo is a bit odd; is the caller expected
to fill intype and keytype always, and then it only sets the procs/opers
lists?  I wonder if it would be more sensible to have that routine
receive the pg_opclass tuple (or even the opclass OID) instead of the
opfamily OID.

I think "amapi.h" is not a great file name.  What about am_api.h?


Well we have related fdwapi.h and tsmapi.h (and several unrelated *api.h which also don't use "_" in the name) so amapi.h seems fine to me.

Makes sense. I leave it amapi.h.
 
I'm unsure about BRIN_NPROC.  Why did you set it to 15?  It's not
unthinkable that future opclass frameworks will have different numbers

The BRIN_NPROC should be probably defined in brin.c since it's only used for sizing local array variable in amvalidate and should be used to set amsupport in the init function as well then.

Problem is that in BRIN, access method use only first 4 support procedures. However, operator class can define more and call them from first 4 support procedures. That allows operator classes to re-use same support procedures and build additional levels of abstractions. I've declared BRIN_MANDATORY_NPROCS, and brinvalidate checks only their presence. We could check BRIN opclass more precisely, by introducing new support procedure for validation. I think it's a subject of a separate patch since it would change interface of BRIN.
 
of support procs.  For BRIN I'm thinking that we could add another
support proc which validates the opclass definition using the specific
framework; that way we will be able to check that the set of operators
defined are correct, etc (something that the current approach cannot
do).

As I said before in the thread I would prefer more granular approach to validation - have amvalidateopclass in the struct for the current functionality so that we can easily add more validators in the future. There can still be one amvalidate function exposed on SQL level that just calls all the amvalidate* functions that the am defines.

I agree about staying with one SQL-visible function.

Other changes:
 * Documentation reflects interface changes.
 * IndexAmRoutine moved from CacheMemoryContext to indexcxt.
 * Various minor formatting improvements.
 * Error messages are corrected.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 
Вложения

Re: WIP: Rework access method interface

От
Amit Kapila
Дата:
On Fri, Oct 2, 2015 at 8:14 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
>
>
> I agree about staying with one SQL-visible function.
>
> Other changes:
>  * Documentation reflects interface changes.
>  * IndexAmRoutine moved from CacheMemoryContext to indexcxt.
>  * Various minor formatting improvements.
>  * Error messages are corrected.
>

Few assorted comments:

1.
+  * Get IndexAmRoutine structure from access method oid.
+  */
+ IndexAmRoutine *
+ GetIndexAmRoutine(Oid 
amoid)
+ {
+ IndexAmRoutine *result;
+ HeapTuple tuple;
+ regproc
amhandler;
+ tuple = SearchSysCache1(AMOID, ObjectIdGetDatum(amoid));
+ if (!HeapTupleIsValid
(tuple))
+ elog(ERROR, "cache lookup failed for access method %u",
+  
amoid);
+ amhandler = ((Form_pg_am)GETSTRUCT(tuple))->amhandler;
+ if (!RegProcedureIsValid
(amhandler))
+ elog(ERROR, "invalid %u regproc", amhandler);


I have noticed that currently, the above kind of error is reported slightly
differently as in below code:
if (!RegProcedureIsValid(procOid)) \
elog(ERROR, "invalid %s regproc", CppAsString
(pname)); \

If you feel it is better to do the way as it is in current code, then you
can change accordingly.

2.
<para>
         Access methods that always return entries in the natural ordering
         of their data (such 
as btree) should set
!        <structname>pg_am</>.<structfield>amcanorder</> to true.
         Currently, such 
access methods must use btree-compatible strategy
         numbers for their equality and ordering operators.
  
      </para>
--- 545,551 ----
        <para>
         Access methods that always return entries in the natural 
ordering
         of their data (such as btree) should set
!        <structfield>amcanorder</> to true.
         
Currently, such access methods must use btree-compatible strategy
         numbers for their equality and 
ordering operators.


Isn't it better to use structure while referencing the field of it?

3.
!    Handler function must be written in a compiled language such as C, using
!    the version-1 interface.

Here, it is not completely clear, what do you refer to as version-1 interface.

4.
xindex.sgml
<title>Index Methods and Operator Classes</title>
..
It is possible to add a
   new index method by defining the required interface routines and
   then creating a row in <classname>pg_am</classname> &mdash; but that is
   beyond the scope of this chapter (see <xref linkend="indexam">).
  </para>

I think changing above to indicate something about handler function
could be useful.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-10-03 08:27, Amit Kapila wrote:
> On Fri, Oct 2, 2015 at 8:14 PM, Alexander Korotkov
> <a.korotkov@postgrespro.ru <mailto:a.korotkov@postgrespro.ru>> wrote:
>  >
>  >
>  > I agree about staying with one SQL-visible function.

Okay, this does not necessarily mean there should be only one validation 
function in the C struct though. I wonder if it would be more future 
proof to name the C interface as something else than the current generic 
amvalidate. Especially considering that it basically only does opclass 
validation at the moment (It's IMHO saner in terms of API evolution to 
expand the struct with more validator functions in the future compared 
to adding arguments to the existing function).

>
> Few assorted comments:
>
> 1.
> +  * Get IndexAmRoutine structure from access method oid.
> +  */
> + IndexAmRoutine *
> + GetIndexAmRoutine(Oid
> amoid)
> ...
> + if (!RegProcedureIsValid
> (amhandler))
> + elog(ERROR, "invalid %u regproc", amhandler);
>
> I have noticed that currently, the above kind of error is reported slightly
> differently as in below code:
> if (!RegProcedureIsValid(procOid)) \
> elog(ERROR, "invalid %s regproc", CppAsString
> (pname)); \
>
> If you feel it is better to do the way as it is in current code, then you
> can change accordingly.

It's completely different use-case from existing code. And tbh I think 
it should have completely different and more informative error message 
something in the style of "index access method %s does not have a 
handler" (see for example GetFdwRoutineByServerId or 
transformRangeTableSample how this is handled for similar cases currently).

This however brings another comment - I think it would be better if the 
GetIndexAmRoutine would be split into two interfaces. The 
GetIndexAmRoutine itself would accept the amhandler Oid and should just 
do the OidFunctionCall and then check the result is not NULL and 
possibly that it is an IndexAmRoutine node. And then all the
(IndexAmRoutine*)DatumGetPointer(!OidFunctionCall0(accessMethodForm->amhandler));
calls in the code should be replaced with calls to the GetIndexAmRoutine 
instead.

The other routine (let's call it GetIndexAmRoutineByAmId for example) 
would get IndexAmRoutine from amoid by looking up catalog, doing that 
validation of amhandler Oid/regproc and calling the GetIndexAmRoutine.

>
> 3.
> !    Handler function must be written in a compiled language such as C,
> using
> !    the version-1 interface.
>
> Here, it is not completely clear, what do you refer to as version-1
> interface.
>

This seems to be copy paste from fdw docs, if we decide this should be 
explained differently then it should be explained differently there as well.

> 4.
> xindex.sgml
> <title>Index Methods and Operator Classes</title>
> ..
> It is possible to add a
>     new index method by defining the required interface routines and
>     then creating a row in <classname>pg_am</classname> — but that is
>     beyond the scope of this chapter (see <xref linkend="indexam">).
>    </para>
>
> I think changing above to indicate something about handler function
> could be useful.
>

+1

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Andres Freund
Дата:
Hi,

this topic has seen a lot of activity/review. As development is ongoing
I'm moving the patch to the next CF.

Greetings,

Andres Freund



Re: WIP: Rework access method interface

От
Amit Kapila
Дата:
On Sat, Oct 3, 2015 at 5:07 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-10-03 08:27, Amit Kapila wrote:
On Fri, Oct 2, 2015 at 8:14 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru <mailto:a.korotkov@postgrespro.ru>> wrote:
 >
 >
 > I agree about staying with one SQL-visible function.

Okay, this does not necessarily mean there should be only one validation function in the C struct though. I wonder if it would be more future proof to name the C interface as something else than the current generic amvalidate. Especially considering that it basically only does opclass validation at the moment (It's IMHO saner in terms of API evolution to expand the struct with more validator functions in the future compared to adding arguments to the existing function).


I also agree with you that adding more arguments in future might
not be a good idea for exposed API.  I don't know how much improvement
we can get if we use structure and then keep on adding more members
to it based on future need, but atleast that way it will be less prone to
breakage.

I think adding multiple validator functions is another option, but that
also doesn't sound like a good way as it can pose difficulty in
understanding the right version of API to be used.  
 

Few assorted comments:

1.
+  * Get IndexAmRoutine structure from access method oid.
+  */
+ IndexAmRoutine *
+ GetIndexAmRoutine(Oid
amoid)
...
+ if (!RegProcedureIsValid
(amhandler))
+ elog(ERROR, "invalid %u regproc", amhandler);

I have noticed that currently, the above kind of error is reported slightly
differently as in below code:
if (!RegProcedureIsValid(procOid)) \
elog(ERROR, "invalid %s regproc", CppAsString
(pname)); \

If you feel it is better to do the way as it is in current code, then you
can change accordingly.

It's completely different use-case from existing code. And tbh I think it should have completely different and more informative error message something in the style of "index access method %s does not have a handler" (see for example GetFdwRoutineByServerId or transformRangeTableSample how this is handled for similar cases currently).


makes sense to me, but in that case isn't it better to use ereport
(as used in GetFdwRoutineByServerId()) rather than elog?
 
This however brings another comment - I think it would be better if the GetIndexAmRoutine would be split into two interfaces. The GetIndexAmRoutine itself would accept the amhandler Oid and should just do the OidFunctionCall and then check the result is not NULL and possibly that it is an IndexAmRoutine node. And then all the
(IndexAmRoutine*)DatumGetPointer(!OidFunctionCall0(accessMethodForm->amhandler));
calls in the code should be replaced with calls to the GetIndexAmRoutine instead.

The other routine (let's call it GetIndexAmRoutineByAmId for example) would get IndexAmRoutine from amoid by looking up catalog, doing that validation of amhandler Oid/regproc and calling the GetIndexAmRoutine.


+1, I think that will make this API's design closer to what we have
for corresponding FDW API.



With Regards,
Amit Kapila.

Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Sun, Oct 4, 2015 at 4:27 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Oct 3, 2015 at 5:07 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-10-03 08:27, Amit Kapila wrote:
On Fri, Oct 2, 2015 at 8:14 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru <mailto:a.korotkov@postgrespro.ru>> wrote:
 >
 >
 > I agree about staying with one SQL-visible function.

Okay, this does not necessarily mean there should be only one validation function in the C struct though. I wonder if it would be more future proof to name the C interface as something else than the current generic amvalidate. Especially considering that it basically only does opclass validation at the moment (It's IMHO saner in terms of API evolution to expand the struct with more validator functions in the future compared to adding arguments to the existing function).


I also agree with you that adding more arguments in future might
not be a good idea for exposed API.  I don't know how much improvement
we can get if we use structure and then keep on adding more members
to it based on future need, but atleast that way it will be less prone to
breakage.

I think adding multiple validator functions is another option, but that
also doesn't sound like a good way as it can pose difficulty in
understanding the right version of API to be used.  
 
I think the major priority is to keep compatibility. For now, user can easily define invalid opclass and he will just get the error runtime. Thus, the opclass validation looks like improvement which is not strictly needed. We can add new validator functions in the future but make them not required. Thus, old access method wouldn't loose compatibility from this.

 

Few assorted comments:

1.
+  * Get IndexAmRoutine structure from access method oid.
+  */
+ IndexAmRoutine *
+ GetIndexAmRoutine(Oid
amoid)
...
+ if (!RegProcedureIsValid
(amhandler))
+ elog(ERROR, "invalid %u regproc", amhandler);

I have noticed that currently, the above kind of error is reported slightly
differently as in below code:
if (!RegProcedureIsValid(procOid)) \
elog(ERROR, "invalid %s regproc", CppAsString
(pname)); \

If you feel it is better to do the way as it is in current code, then you
can change accordingly.

It's completely different use-case from existing code. And tbh I think it should have completely different and more informative error message something in the style of "index access method %s does not have a handler" (see for example GetFdwRoutineByServerId or transformRangeTableSample how this is handled for similar cases currently).


makes sense to me, but in that case isn't it better to use ereport
(as used in GetFdwRoutineByServerId()) rather than elog?

Changed to ereport.
  
This however brings another comment - I think it would be better if the GetIndexAmRoutine would be split into two interfaces. The GetIndexAmRoutine itself would accept the amhandler Oid and should just do the OidFunctionCall and then check the result is not NULL and possibly that it is an IndexAmRoutine node. And then all the
(IndexAmRoutine*)DatumGetPointer(!OidFunctionCall0(accessMethodForm->amhandler));
calls in the code should be replaced with calls to the GetIndexAmRoutine instead.

The other routine (let's call it GetIndexAmRoutineByAmId for example) would get IndexAmRoutine from amoid by looking up catalog, doing that validation of amhandler Oid/regproc and calling the GetIndexAmRoutine.


+1, I think that will make this API's design closer to what we have
for corresponding FDW API.

Good, I've changed interface.

2.
<para>
         Access methods that always return entries in the natural ordering
         of their data (such 
as btree) should set
!        <structname>pg_am</>.<structfield>amcanorder</> to true.
         Currently, such 
access methods must use btree-compatible strategy
         numbers for their equality and ordering operators.
  
      </para>
--- 545,551 ----
        <para>
         Access methods that always return entries in the natural 
ordering
         of their data (such as btree) should set
!        <structfield>amcanorder</> to true.
         
Currently, such access methods must use btree-compatible strategy
         numbers for their equality and 
ordering operators.

Isn't it better to use structure while referencing the field of it?

Done.
 
3.
!    Handler function must be written in a compiled language such as C, using
!    the version-1 interface.
Here, it is not completely clear, what do you refer to as version-1 interface.

As, Peter commented upthread it is the same in FDW and we should change both places if needed.
It refers to version 1 calling convention for C-function.
However, I'm not sure that it can't be version 0 calling convention. It probably could work, but nobody use it.
 
4.
xindex.sgml
<title>Index Methods and Operator Classes</title>
..
It is possible to add a
   new index method by defining the required interface routines and
   then creating a row in <classname>pg_am</classname> &mdash; but that is
   beyond the scope of this chapter (see <xref linkend="indexam">).
  </para>
I think changing above to indicate something about handler function
could be useful.

Done.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-10-05 19:25, Alexander Korotkov wrote:
> On Sun, Oct 4, 2015 at 4:27 PM, Amit Kapila <amit.kapila16@gmail.com
> <mailto:amit.kapila16@gmail.com>> wrote:
>
>     On Sat, Oct 3, 2015 at 5:07 PM, Petr Jelinek <petr@2ndquadrant.com
>     <mailto:petr@2ndquadrant.com>> wrote:
>
>         On 2015-10-03 08:27, Amit Kapila wrote:
>
>             On Fri, Oct 2, 2015 at 8:14 PM, Alexander Korotkov
>             <a.korotkov@postgrespro.ru
>             <mailto:a.korotkov@postgrespro.ru>
>             <mailto:a.korotkov@postgrespro.ru
>             <mailto:a.korotkov@postgrespro.ru>>> wrote:
>               >
>               >
>               > I agree about staying with one SQL-visible function.
>
>
>         Okay, this does not necessarily mean there should be only one
>         validation function in the C struct though. I wonder if it would
>         be more future proof to name the C interface as something else
>         than the current generic amvalidate. Especially considering that
>         it basically only does opclass validation at the moment (It's
>         IMHO saner in terms of API evolution to expand the struct with
>         more validator functions in the future compared to adding
>         arguments to the existing function).
>
>
>     I also agree with you that adding more arguments in future might
>     not be a good idea for exposed API.  I don't know how much improvement
>     we can get if we use structure and then keep on adding more members
>     to it based on future need, but atleast that way it will be less
>     prone to
>     breakage.
>
>     I think adding multiple validator functions is another option, but that
>     also doesn't sound like a good way as it can pose difficulty in
>     understanding the right version of API to be used.
>
> I think the major priority is to keep compatibility. For now, user can
> easily define invalid opclass and he will just get the error runtime.
> Thus, the opclass validation looks like improvement which is not
> strictly needed. We can add new validator functions in the future but
> make them not required. Thus, old access method wouldn't loose
> compatibility from this.

Yes that was what I was thinking as well. We don't want to break 
anything in this patch as it's mainly API change, but we want to have 
API that can potentially evolve. I think evolving the API by adding more 
interfaces in the *Routine struct so far worked well for the FDW for 
example and given that those structs are nodes, the individual pointers 
get initialized to NULL automatically so it's easy to add optional 
interfaces (like validators) without breaking anything. Besides, it's 
not unreasonable to expect that custom AM authors will have to check if 
their implementation is compatible with new major version.

But this is also why I don't think it's good idea to call the opclass 
validator just "amvalidate" in the IndexAmRoutine struct because it 
implies to be the only validate function we'll ever have.


Other than the above gripe and the following spurious change, the patch 
seems good to me now.

>  RelationInitIndexAccessInfo(Relation relation)
>  {
>         HeapTuple       tuple;
> -       Form_pg_am      aform;
>         Datum           indcollDatum;
>         Datum           indclassDatum;
>         Datum           indoptionDatum;
> @@ -1178,6 +1243,7 @@ RelationInitIndexAccessInfo(Relation relation)
>         MemoryContext oldcontext;
>         int                     natts;
>         uint16          amsupport;
> +       Form_pg_am      aform;


--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Sat, Oct 10, 2015 at 6:03 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-10-05 19:25, Alexander Korotkov wrote:
On Sun, Oct 4, 2015 at 4:27 PM, Amit Kapila <amit.kapila16@gmail.com
<mailto:amit.kapila16@gmail.com>> wrote:

    On Sat, Oct 3, 2015 at 5:07 PM, Petr Jelinek <petr@2ndquadrant.com
    <mailto:petr@2ndquadrant.com>> wrote:

        On 2015-10-03 08:27, Amit Kapila wrote:

            On Fri, Oct 2, 2015 at 8:14 PM, Alexander Korotkov
            <a.korotkov@postgrespro.ru
            <mailto:a.korotkov@postgrespro.ru>
            <mailto:a.korotkov@postgrespro.ru

            <mailto:a.korotkov@postgrespro.ru>>> wrote:
              >
              >
              > I agree about staying with one SQL-visible function.


        Okay, this does not necessarily mean there should be only one
        validation function in the C struct though. I wonder if it would
        be more future proof to name the C interface as something else
        than the current generic amvalidate. Especially considering that
        it basically only does opclass validation at the moment (It's
        IMHO saner in terms of API evolution to expand the struct with
        more validator functions in the future compared to adding
        arguments to the existing function).


    I also agree with you that adding more arguments in future might
    not be a good idea for exposed API.  I don't know how much improvement
    we can get if we use structure and then keep on adding more members
    to it based on future need, but atleast that way it will be less
    prone to
    breakage.

    I think adding multiple validator functions is another option, but that
    also doesn't sound like a good way as it can pose difficulty in
    understanding the right version of API to be used.

I think the major priority is to keep compatibility. For now, user can
easily define invalid opclass and he will just get the error runtime.
Thus, the opclass validation looks like improvement which is not
strictly needed. We can add new validator functions in the future but
make them not required. Thus, old access method wouldn't loose
compatibility from this.

Yes that was what I was thinking as well. We don't want to break anything in this patch as it's mainly API change, but we want to have API that can potentially evolve. I think evolving the API by adding more interfaces in the *Routine struct so far worked well for the FDW for example and given that those structs are nodes, the individual pointers get initialized to NULL automatically so it's easy to add optional interfaces (like validators) without breaking anything. Besides, it's not unreasonable to expect that custom AM authors will have to check if their implementation is compatible with new major version.

But this is also why I don't think it's good idea to call the opclass validator just "amvalidate" in the IndexAmRoutine struct because it implies to be the only validate function we'll ever have.


Other than the above gripe and the following spurious change, the patch seems good to me now.

 RelationInitIndexAccessInfo(Relation relation)
 {
        HeapTuple       tuple;
-       Form_pg_am      aform;
        Datum           indcollDatum;
        Datum           indclassDatum;
        Datum           indoptionDatum;
@@ -1178,6 +1243,7 @@ RelationInitIndexAccessInfo(Relation relation)
        MemoryContext oldcontext;
        int                     natts;
        uint16          amsupport;
+       Form_pg_am      aform;

Good catch, this change was reverted. 
Also, it was planned that documentation changes would be reviewed by native english speaker.
Tom, could you take a look at them?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-10-12 14:32, Alexander Korotkov wrote:
> Also, it was planned that documentation changes would be reviewed by
> native english speaker.

BTW I think this is ready for committer, except for the need to check 
docs by native speaker.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Petr Jelinek <petr@2ndquadrant.com> writes:
> On 2015-10-12 14:32, Alexander Korotkov wrote:
>> Also, it was planned that documentation changes would be reviewed by
>> native english speaker.

> BTW I think this is ready for committer, except for the need to check 
> docs by native speaker.

I'm working at Salesforce this week, but will take a look after I get
home.
        regards, tom lane



Re: WIP: Rework access method interface

От
Michael Paquier
Дата:
On Thu, Oct 22, 2015 at 8:37 AM, Petr Jelinek wrote:
> BTW I think this is ready for committer, except for the need to check docs
> by native speaker.

If so, could you update the entry of this patch accordingly?
https://commitfest.postgresql.org/6/353/
-- 
Michael



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> Tom, could you take a look at them?

I started to look at this today.  (Apologies for the delay, but I came
back from San Francisco with a nasty head cold, and wasn't really up to
doing anything complicated last week.)

The thing that jumps out at me right away is that the proposed new amapi.h
header has this:

+ #include "nodes/relation.h"

to support declaring amcostestimate_function.  This will result in
importing the planner's declarations into pretty much every part of the
executor, because not only is amapi.h #included by a lot of places, but
the proposed patch actually has execnodes.h including it, as well as some
other popular headers.  We might as well shove everything into postgres.h
and forget about header modularity altogether.

Probably the least messy way to fix this is to drop that #include and
instead use dummy declarations like "struct PlannerInfo;" and "struct
IndexPath;" here.  We could additionally dumb the amcostestimate
declaration down from using "Cost" and "Selectivity" to just saying
"double".

A different line of attack would be to try to divide amapi.h into
"executor" and "planner" parts, but I'm not sure I see how to make that
work.  Somewhere you gotta declare the struct of function pointers.

(Note: I realize that there's a precedent in fdwapi.h of including planner
headers into what's otherwise mostly an executor thing.  That one's not
quite as destructive to header modularity because it's not used in as many
places as amapi.h will be.  But maybe we should rethink that as well.)

Thoughts, better ideas?
        regards, tom lane



Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
Tom Lane wrote:

> Probably the least messy way to fix this is to drop that #include and
> instead use dummy declarations like "struct PlannerInfo;" and "struct
> IndexPath;" here.  We could additionally dumb the amcostestimate
> declaration down from using "Cost" and "Selectivity" to just saying
> "double".

I'm not a fan of this approach.  I'd rather split the executor headers
in two, a leanone with the typedefs only and another with the actual
struct definitions.  That way we have one very lean executor header that
can be included everywhere (in headers and .c files that only pass the
structs around), and a fat one that is only included by the executor .c
files (and the few extra .c files that need access to the struct
definitions).

This would be similar in spirit to the htup.h / htup_details.h split.

I think (almost?) all the headers that define nodes suffer from this
disease and could be cured in the same way.

> (Note: I realize that there's a precedent in fdwapi.h of including planner
> headers into what's otherwise mostly an executor thing.  That one's not
> quite as destructive to header modularity because it's not used in as many
> places as amapi.h will be.  But maybe we should rethink that as well.)

Yes, rethink++.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> Probably the least messy way to fix this is to drop that #include and
>> instead use dummy declarations like "struct PlannerInfo;" and "struct
>> IndexPath;" here.  We could additionally dumb the amcostestimate
>> declaration down from using "Cost" and "Selectivity" to just saying
>> "double".

> I'm not a fan of this approach.  I'd rather split the executor headers
> in two, a leanone with the typedefs only and another with the actual
> struct definitions.  That way we have one very lean executor header that
> can be included everywhere (in headers and .c files that only pass the
> structs around), and a fat one that is only included by the executor .c
> files (and the few extra .c files that need access to the struct
> definitions).

> This would be similar in spirit to the htup.h / htup_details.h split.
> I think (almost?) all the headers that define nodes suffer from this
> disease and could be cured in the same way.

I follow your reasoning, but I don't particularly want to make this
patch wait on a large and invasive refactoring of existing headers.

As a down payment on this problem, maybe we could invent a new planner
header that provides just enough info to support amapi.h and fdwapi.h;
it looks like this would be "typedef struct PlannerInfo PlannerInfo;",
likewise for RelOptInfo, ForeignPath, and IndexPath, and real declarations
of Cost and Selectivity.  Not sure what to name the new header.

Comments?
        regards, tom lane



Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
Tom Lane wrote:

> I follow your reasoning, but I don't particularly want to make this
> patch wait on a large and invasive refactoring of existing headers.

Sure.

> As a down payment on this problem, maybe we could invent a new planner
> header that provides just enough info to support amapi.h and fdwapi.h;
> it looks like this would be "typedef struct PlannerInfo PlannerInfo;",
> likewise for RelOptInfo, ForeignPath, and IndexPath, and real declarations
> of Cost and Selectivity.

Works for me, under the assumption that, down the road and without any
rush, we can shuffle some more stuff around to make this whole area a
bit cleaner.

> Not sure what to name the new header.

Yeah, this is always a problem for such patches :-(  I have no great
ideas ATM.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> As a down payment on this problem, maybe we could invent a new planner
>> header that provides just enough info to support amapi.h and fdwapi.h;
>> it looks like this would be "typedef struct PlannerInfo PlannerInfo;",
>> likewise for RelOptInfo, ForeignPath, and IndexPath, and real declarations
>> of Cost and Selectivity.

> Works for me, under the assumption that, down the road and without any
> rush, we can shuffle some more stuff around to make this whole area a
> bit cleaner.

Well, since we're at the start of a devel cycle, we'd have the rest of 9.6
for somebody to whack it around to their heart's content.  Once we ship
9.6 it would get a little harder to redefine the new header(s).

>> Not sure what to name the new header.

> Yeah, this is always a problem for such patches :-(  I have no great
> ideas ATM.

I'll draft something, but in the meantime, file name ideas solicited.
        regards, tom lane



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
... btw, what is the point of catalog/opfam_internal.h?  I see you added
it in b488c580aef4e05f, but it seems quite useless to have split it out
as a separate header, since only commands/opclasscmds.c uses it.

My attention got drawn to it because the current patch proposes to
#include it in amapi.h, which is as thorough a subversion of the concept
of "internal header" as I can readily think of.  If we're going to do
that with it we'd definitely need to rename it.  But I'm not following
why struct OpFamilyMember needs to be exposed at all.
        regards, tom lane



Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> ... btw, what is the point of catalog/opfam_internal.h?  I see you added
> it in b488c580aef4e05f, but it seems quite useless to have split it out
> as a separate header, since only commands/opclasscmds.c uses it.
> 
> My attention got drawn to it because the current patch proposes to
> #include it in amapi.h, which is as thorough a subversion of the concept
> of "internal header" as I can readily think of.  If we're going to do
> that with it we'd definitely need to rename it.  But I'm not following
> why struct OpFamilyMember needs to be exposed at all.

Oh, that slipped in there, didn't it.  The JSON DDL-deparse bits need
it -- last posted by Alex Shulgin here:
https://www.postgresql.org/message-id/CACACo5Q_UXYwF117LBhjZ3xaMPyrgqnqE%3DmXvRhEfjJ51aCfwQ%40mail.gmail.com

I suppose it shouldn't have been committed, and be part of the extension
instead.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> ... btw, what is the point of catalog/opfam_internal.h?  I see you added
>> it in b488c580aef4e05f, but it seems quite useless to have split it out
>> as a separate header, since only commands/opclasscmds.c uses it.

> Oh, that slipped in there, didn't it.  The JSON DDL-deparse bits need
> it -- last posted by Alex Shulgin here:
> https://www.postgresql.org/message-id/CACACo5Q_UXYwF117LBhjZ3xaMPyrgqnqE%3DmXvRhEfjJ51aCfwQ%40mail.gmail.com

I still don't see any reference to OpFamilyMember in there?

> I suppose it shouldn't have been committed, and be part of the extension
> instead.

Previously, OpFamilyMember was just a transient internal data structure
inside commands/opclasscmds.c.  Unless we intended that some chunks of
the code in there be exposed for use by extensions, it's not terribly
clear to me why extensions would need to access this struct.  Perhaps
we ought to just revert struct OpFamilyMember back into opclasscmds.c.

Regardless of that, I'm a bit skeptical that any of these structs ought
to be defined as part of the amapi.h interface.  They seem to be making
premature judgments as to what an opclass verifier will care about.  In
any case, tying the opclass verifier API to DDL-command implementation
details doesn't seem like a good plan.
        regards, tom lane



Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Tom Lane wrote:
> >> ... btw, what is the point of catalog/opfam_internal.h?  I see you added
> >> it in b488c580aef4e05f, but it seems quite useless to have split it out
> >> as a separate header, since only commands/opclasscmds.c uses it.
> 
> > Oh, that slipped in there, didn't it.  The JSON DDL-deparse bits need
> > it -- last posted by Alex Shulgin here:
> > https://www.postgresql.org/message-id/CACACo5Q_UXYwF117LBhjZ3xaMPyrgqnqE%3DmXvRhEfjJ51aCfwQ%40mail.gmail.com
> 
> I still don't see any reference to OpFamilyMember in there?

Sorry, that posting doesn't have the part that generates the JSON.  It's
ddl_deparse.c in the .tar.gz earlier in that thread:
http://www.postgresql.org/message-id/CACACo5QQuAV+n4Gi+YA1JF_a+QenR6SJuP8CYdPSrXKa+FHS3A@mail.gmail.com

> > I suppose it shouldn't have been committed, and be part of the extension
> > instead.
> 
> Previously, OpFamilyMember was just a transient internal data structure
> inside commands/opclasscmds.c.  Unless we intended that some chunks of
> the code in there be exposed for use by extensions, it's not terribly
> clear to me why extensions would need to access this struct.  Perhaps
> we ought to just revert struct OpFamilyMember back into opclasscmds.c.

The whole point of splitting the struct declaration to a new header was
to get a DDL deparser to examine the list of objects being created, so
that it could construct the deparsed command.  If the struct is opaque
to the outside world, there's no way to do that (as I recall, you can't
construct the full command starting from the parsed version only -- you
need access to the OIDs of the ops/procs actually added to the opclass.)

> Regardless of that, I'm a bit skeptical that any of these structs ought
> to be defined as part of the amapi.h interface.  They seem to be making
> premature judgments as to what an opclass verifier will care about.  In
> any case, tying the opclass verifier API to DDL-command implementation
> details doesn't seem like a good plan.

That's a different argument, with which I don't necessarily disagree.
I think it's not unlikely that a verifier will want to examine the
contents of the struct, but I don't think that means we need to expose
the struct in amapi.h.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>>> ... btw, what is the point of catalog/opfam_internal.h?

> The whole point of splitting the struct declaration to a new header was
> to get a DDL deparser to examine the list of objects being created, so
> that it could construct the deparsed command.  If the struct is opaque
> to the outside world, there's no way to do that (as I recall, you can't
> construct the full command starting from the parsed version only -- you
> need access to the OIDs of the ops/procs actually added to the opclass.)

Hm.  OK, looking closer, I see that we've effectively exposed these
structs as being what is passed to EventTriggerCollectCreateOpClass, so
even if there's not currently any committed code that can read that info,
we clearly need the structs to be accessible to interested event triggers.
I'm not sold that that was a great design, but it's what's there.

>> Regardless of that, I'm a bit skeptical that any of these structs ought
>> to be defined as part of the amapi.h interface.  They seem to be making
>> premature judgments as to what an opclass verifier will care about.  In
>> any case, tying the opclass verifier API to DDL-command implementation
>> details doesn't seem like a good plan.

> That's a different argument, with which I don't necessarily disagree.
> I think it's not unlikely that a verifier will want to examine the
> contents of the struct, but I don't think that means we need to expose
> the struct in amapi.h.

I think we'd be considerably better off to *not* re-use OpFamilyMember
in the verifier API.  It's a struct that was only ever designed for
internal use in CREATE/ALTER OPERATOR CLASS.

I'm kind of inclined to just let the verifiers read the catalogs for
themselves.  AFAICS, a loop around the results of SearchSysCacheList
is not going to be significantly more code than what this patch does,
and it avoids presuming that we know what the verifiers will wish to
look at.
        regards, tom lane



Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Tom Lane wrote:
> >>> ... btw, what is the point of catalog/opfam_internal.h?
> 
> > The whole point of splitting the struct declaration to a new header was
> > to get a DDL deparser to examine the list of objects being created, so
> > that it could construct the deparsed command.  If the struct is opaque
> > to the outside world, there's no way to do that (as I recall, you can't
> > construct the full command starting from the parsed version only -- you
> > need access to the OIDs of the ops/procs actually added to the opclass.)
> 
> Hm.  OK, looking closer, I see that we've effectively exposed these
> structs as being what is passed to EventTriggerCollectCreateOpClass, so
> even if there's not currently any committed code that can read that info,
> we clearly need the structs to be accessible to interested event triggers.
> I'm not sold that that was a great design, but it's what's there.

All the deparsers are expected to be able to understand internal
representations of commands, though, such as OIDs and so on.  If you
have a better idea (I don't), we can still rework the API we present to
deparser extensions.

> > That's a different argument, with which I don't necessarily disagree.
> > I think it's not unlikely that a verifier will want to examine the
> > contents of the struct, but I don't think that means we need to expose
> > the struct in amapi.h.
> 
> I think we'd be considerably better off to *not* re-use OpFamilyMember
> in the verifier API.  It's a struct that was only ever designed for
> internal use in CREATE/ALTER OPERATOR CLASS.
>
> I'm kind of inclined to just let the verifiers read the catalogs for
> themselves.  AFAICS, a loop around the results of SearchSysCacheList
> is not going to be significantly more code than what this patch does,
> and it avoids presuming that we know what the verifiers will wish to
> look at.

Hmm, so this amounts to saying the verifier can only run after the
catalog rows are written.  Is that okay?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> I'm kind of inclined to just let the verifiers read the catalogs for
>> themselves.  AFAICS, a loop around the results of SearchSysCacheList
>> is not going to be significantly more code than what this patch does,
>> and it avoids presuming that we know what the verifiers will wish to
>> look at.

> Hmm, so this amounts to saying the verifier can only run after the
> catalog rows are written.  Is that okay?

Why not?  Surely we are not interested in performance-optimizing for
the case of a detectably incorrect CREATE OP CLASS command.

I don't actually care that much about running the verifiers during
CREATE/etc OP CLASS at all.  It's at least as important to be able
to run them across the built-in opclasses, considering all the chances
for human error in constructing those things.  Even in the ALTER OP CLASS
case, the verifier would likely need to look at existing catalog rows as
well as the new ones.  So basically, I see zero value in exposing CREATE/
ALTER OP CLASS's internal working representation to the verifiers.
        regards, tom lane



Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-11-02 23:28, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Tom Lane wrote:
>>> Regardless of that, I'm a bit skeptical that any of these structs ought
>>> to be defined as part of the amapi.h interface.  They seem to be making
>>> premature judgments as to what an opclass verifier will care about.  In
>>> any case, tying the opclass verifier API to DDL-command implementation
>>> details doesn't seem like a good plan.
>
>> That's a different argument, with which I don't necessarily disagree.
>> I think it's not unlikely that a verifier will want to examine the
>> contents of the struct, but I don't think that means we need to expose
>> the struct in amapi.h.
>
> I think we'd be considerably better off to *not* re-use OpFamilyMember
> in the verifier API.  It's a struct that was only ever designed for
> internal use in CREATE/ALTER OPERATOR CLASS.
>
> I'm kind of inclined to just let the verifiers read the catalogs for
> themselves.  AFAICS, a loop around the results of SearchSysCacheList
> is not going to be significantly more code than what this patch does,
> and it avoids presuming that we know what the verifiers will wish to
> look at.

I like this idea. I didn't like from beginning that verifier is tied to 
opclass but didn't have better solution, but this seems reasonable.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Tue, Nov 3, 2015 at 2:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> I'm kind of inclined to just let the verifiers read the catalogs for
>> themselves.  AFAICS, a loop around the results of SearchSysCacheList
>> is not going to be significantly more code than what this patch does,
>> and it avoids presuming that we know what the verifiers will wish to
>> look at.

> Hmm, so this amounts to saying the verifier can only run after the
> catalog rows are written.  Is that okay?

Why not?  Surely we are not interested in performance-optimizing for
the case of a detectably incorrect CREATE OP CLASS command.

I don't actually care that much about running the verifiers during
CREATE/etc OP CLASS at all.  It's at least as important to be able
to run them across the built-in opclasses, considering all the chances
for human error in constructing those things.  Even in the ALTER OP CLASS
case, the verifier would likely need to look at existing catalog rows as
well as the new ones.  So basically, I see zero value in exposing CREATE/
ALTER OP CLASS's internal working representation to the verifiers.

I'm OK with validating opclass directly by system catalog, i.e. looping over SearchSysCacheList results. Teodor was telling me something similar personally.
I'll also try to rearrange header according to the comments upthread.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Tue, Nov 3, 2015 at 3:16 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Tue, Nov 3, 2015 at 2:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> I'm kind of inclined to just let the verifiers read the catalogs for
>> themselves.  AFAICS, a loop around the results of SearchSysCacheList
>> is not going to be significantly more code than what this patch does,
>> and it avoids presuming that we know what the verifiers will wish to
>> look at.

> Hmm, so this amounts to saying the verifier can only run after the
> catalog rows are written.  Is that okay?

Why not?  Surely we are not interested in performance-optimizing for
the case of a detectably incorrect CREATE OP CLASS command.

I don't actually care that much about running the verifiers during
CREATE/etc OP CLASS at all.  It's at least as important to be able
to run them across the built-in opclasses, considering all the chances
for human error in constructing those things.  Even in the ALTER OP CLASS
case, the verifier would likely need to look at existing catalog rows as
well as the new ones.  So basically, I see zero value in exposing CREATE/
ALTER OP CLASS's internal working representation to the verifiers.

I'm OK with validating opclass directly by system catalog, i.e. looping over SearchSysCacheList results. Teodor was telling me something similar personally.
I'll also try to rearrange header according to the comments upthread.

The next revision of patch is attached.
The changes are following:
  1. Definitions of Selectivity, Cost and declarations of IndexAmRoutine, PlannerInfo, IndexPath, IndexInfo are moved into separate header reldecls.h. That allows to get rid of #include "nodes/relation.h" in amapi.h.
  2. amvalidate method now gets opclass oid as parameter. Internally, amvalidate implementations do catalog lookups. opfam_internal.h isn't included from inappropriate places anymore.
  3. I removed amvalidate calls from opclasscmds.c. Validating user-defined opclasses is behavior change which can have issues with backward compatibility. I think if we want to introduce this, it should be considered separately of API rework.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Thu, Nov 12, 2015 at 2:49 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Tue, Nov 3, 2015 at 3:16 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Tue, Nov 3, 2015 at 2:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> I'm kind of inclined to just let the verifiers read the catalogs for
>> themselves.  AFAICS, a loop around the results of SearchSysCacheList
>> is not going to be significantly more code than what this patch does,
>> and it avoids presuming that we know what the verifiers will wish to
>> look at.

> Hmm, so this amounts to saying the verifier can only run after the
> catalog rows are written.  Is that okay?

Why not?  Surely we are not interested in performance-optimizing for
the case of a detectably incorrect CREATE OP CLASS command.

I don't actually care that much about running the verifiers during
CREATE/etc OP CLASS at all.  It's at least as important to be able
to run them across the built-in opclasses, considering all the chances
for human error in constructing those things.  Even in the ALTER OP CLASS
case, the verifier would likely need to look at existing catalog rows as
well as the new ones.  So basically, I see zero value in exposing CREATE/
ALTER OP CLASS's internal working representation to the verifiers.

I'm OK with validating opclass directly by system catalog, i.e. looping over SearchSysCacheList results. Teodor was telling me something similar personally.
I'll also try to rearrange header according to the comments upthread.

The next revision of patch is attached.
The changes are following:
  1. Definitions of Selectivity, Cost and declarations of IndexAmRoutine, PlannerInfo, IndexPath, IndexInfo are moved into separate header reldecls.h. That allows to get rid of #include "nodes/relation.h" in amapi.h.
  2. amvalidate method now gets opclass oid as parameter. Internally, amvalidate implementations do catalog lookups. opfam_internal.h isn't included from inappropriate places anymore.
  3. I removed amvalidate calls from opclasscmds.c. Validating user-defined opclasses is behavior change which can have issues with backward compatibility. I think if we want to introduce this, it should be considered separately of API rework.
​Patch was rebased against current master.
Any notes about current version of patch?
It would be nice to commit it and continue work on other parts of am extendability.​


------

Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-12-09 15:09, Alexander Korotkov wrote:
>
> ​Patch was rebased against current master.
> Any notes about current version of patch?
> It would be nice to commit it and continue work on other parts of am
> extendability.​
>

The rebase seems broken, there are things missing in this version of the 
patch (for example the validation functions).


--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
On Sat, Dec 12, 2015 at 9:21 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2015-12-09 15:09, Alexander Korotkov wrote:

​Patch was rebased against current master.
Any notes about current version of patch?
It would be nice to commit it and continue work on other parts of am
extendability.​


The rebase seems broken, there are things missing in this version of the patch (for example the validation functions).

​Ooops, sorry. Correct version is attached.​


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-12-12 23:17, Alexander Korotkov wrote:
> On Sat, Dec 12, 2015 at 9:21 PM, Petr Jelinek <petr@2ndquadrant.com
> <mailto:petr@2ndquadrant.com>> wrote:
>
>     On 2015-12-09 15:09, Alexander Korotkov wrote:
>
>
>         ​Patch was rebased against current master.
>         Any notes about current version of patch?
>         It would be nice to commit it and continue work on other parts of am
>         extendability.​
>
>
>     The rebase seems broken, there are things missing in this version of
>     the patch (for example the validation functions).
>
>
> ​Ooops, sorry. Correct version is attached.​
>

Hi,

I went over this.

I get these compiler warning about unused variables in the validation 
functions:
brin.c: In function ‘brinvalidate’:
brin.c:94:6: warning: variable ‘keytype’ set but not used 
[-Wunused-but-set-variable]      keytype;      ^
ginutil.c: In function ‘ginvalidate’:
ginutil.c:86:6: warning: variable ‘keytype’ set but not used 
[-Wunused-but-set-variable]      keytype;      ^
gist.c: In function ‘gistvalidate’:
gist.c:101:6: warning: variable ‘keytype’ set but not used 
[-Wunused-but-set-variable]      keytype;      ^
hash.c: In function ‘hashvalidate’:
hash.c:103:6: warning: variable ‘keytype’ set but not used 
[-Wunused-but-set-variable]      keytype;      ^
nbtree.c: In function ‘btvalidate’:
nbtree.c:134:6: warning: variable ‘keytype’ set but not used 
[-Wunused-but-set-variable]      keytype;      ^
nbtree.c:133:6: warning: variable ‘intype’ set but not used 
[-Wunused-but-set-variable]      intype,      ^
spgutils.c: In function ‘spgvalidate’:
spgutils.c:88:6: warning: variable ‘keytype’ set but not used 
[-Wunused-but-set-variable]      keytype;      ^

These look like copy-pastos of boilerplate.

Another note is that amvalidate SQL interface is not documented 
anywhere. I know it's mainly meant for regression tests and we for 
example don't document hashing functions but it's something to think 
about/discuss maybe.

Other than that I am happy with the patch.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Michael Paquier
Дата:
On Mon, Dec 14, 2015 at 11:26 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> These look like copy-pastos of boilerplate.
>
> Another note is that amvalidate SQL interface is not documented anywhere. I
> know it's mainly meant for regression tests and we for example don't
> document hashing functions but it's something to think about/discuss maybe.
>
> Other than that I am happy with the patch.

I have moved this patch to the next CF with the same status.
Alexander, could you address the warnings reported by Petr?
-- 
Michael



Re: WIP: Rework access method interface

От
Alexander Korotkov
Дата:
Hi!

On Mon, Dec 14, 2015 at 5:26 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
I went over this.

I get these compiler warning about unused variables in the validation functions:
brin.c: In function ‘brinvalidate’:
brin.c:94:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
      keytype;
      ^
ginutil.c: In function ‘ginvalidate’:
ginutil.c:86:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
      keytype;
      ^
gist.c: In function ‘gistvalidate’:
gist.c:101:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
      keytype;
      ^
hash.c: In function ‘hashvalidate’:
hash.c:103:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
      keytype;
      ^
nbtree.c: In function ‘btvalidate’:
nbtree.c:134:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
      keytype;
      ^
nbtree.c:133:6: warning: variable ‘intype’ set but not used [-Wunused-but-set-variable]
      intype,
      ^
spgutils.c: In function ‘spgvalidate’:
spgutils.c:88:6: warning: variable ‘keytype’ set but not used [-Wunused-but-set-variable]
      keytype;
      ^

These look like copy-pastos of boilerplate.
 
​Fixed in the attached version of patch.

Another note is that amvalidate SQL interface is not documented anywhere. I know it's mainly meant for regression tests and we for example don't document hashing functions but it's something to think about/discuss maybe.

What do you think about
​"​
System Administration Functions
​" chapter?
 
Other than that I am happy with the patch.

​Great!​

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 
Вложения

Re: WIP: Rework access method interface

От
Petr Jelinek
Дата:
On 2015-12-24 14:57, Alexander Korotkov wrote:
>
> What do you think about
> ​"​
> System Administration Functions
> ​" chapter?
> http://www.postgresql.org/docs/devel/static/functions-admin.html
>
>     Other than that I am happy with the patch.
>

Sounds good to me.

One last tiny gripe, I think the amroutine in RelationData should be 
named rd_amroutine for consistency.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: WIP: Rework access method interface

От
Alvaro Herrera
Дата:
I just noticed that this patch had remained in Waiting-for-author status
after Alexander had already submitted the updated version of the patch.
I marked it as ready-for-committer, because Petr stated so in the
thread.

Tom, you're registered as committer for this patch.  Do you have time in
the near future to review this patch?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I just noticed that this patch had remained in Waiting-for-author status
> after Alexander had already submitted the updated version of the patch.
> I marked it as ready-for-committer, because Petr stated so in the
> thread.

> Tom, you're registered as committer for this patch.  Do you have time in
> the near future to review this patch?

Yeah, I will look at it soon.
        regards, tom lane



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> [ aminterface-13.patch ]

I've started to review this.  There are a bunch of cosmetic things I don't
like, notably the include-file nesting you've chosen, but they're fixable.
One item that I think could use some discussion is where to put the new
amvalidate functions.  I don't especially like your choice to drop them
into nbtree.c, gist.c, etc, for a couple of reasons:

1. These aren't really at the same semantic level as functions like
btinsert or btgettuple; they're not part of the implementation of an
index, and indeed are *users* of indexes (at least of the catalog
indexes).

2. This approach substantially bloats the #include lists for the
relevant files, which again is a token of the validate functions not
belonging where they were put.

3. There's probably room to share code across the different validators;
but this design isn't very amenable to that.

A comparison point worth noting is that the amcostestimate functions
are in more or less the same boat: they aren't part of the index
implementation in any meaningful way, but are really part of the
planner instead.  Those are all in selfuncs.c, not under backend/access
at all.

There are a couple of things we could do instead:

* Put each amvalidate function into its own file (but probably keep it
in the same directory as now).  This is a reasonable response to
points #1 and #2 but isn't very much help for #3.

* Collect the amvalidate functions into one file, which then leaves
us wondering where to put that; surely not under any one AM's directory.
A new file in src/backend/access/index/ is one plausible solution.
This file would also be a reasonable place to put the amvalidate()
dispatch function itself.

I'm somewhat leaning to the second choice, but perhaps someone has
a better idea, or an argument against doing that.
        regards, tom lane



Re: WIP: Rework access method interface

От
Robert Haas
Дата:
On Jan 16, 2016, at 11:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
>> [ aminterface-13.patch ]
>
> I've started to review this.  There are a bunch of cosmetic things I don't
> like, notably the include-file nesting you've chosen, but they're fixable.
> One item that I think could use some discussion is where to put the new
> amvalidate functions.  I don't especially like your choice to drop them
> into nbtree.c, gist.c, etc, for a couple of reasons:
>
> 1. These aren't really at the same semantic level as functions like
> btinsert or btgettuple; they're not part of the implementation of an
> index, and indeed are *users* of indexes (at least of the catalog
> indexes).
>
> 2. This approach substantially bloats the #include lists for the
> relevant files, which again is a token of the validate functions not
> belonging where they were put.
>
> 3. There's probably room to share code across the different validators;
> but this design isn't very amenable to that.
>
> A comparison point worth noting is that the amcostestimate functions
> are in more or less the same boat: they aren't part of the index
> implementation in any meaningful way, but are really part of the
> planner instead.  Those are all in selfuncs.c, not under backend/access
> at all.
>
> There are a couple of things we could do instead:
>
> * Put each amvalidate function into its own file (but probably keep it
> in the same directory as now).  This is a reasonable response to
> points #1 and #2 but isn't very much help for #3.
>
> * Collect the amvalidate functions into one file, which then leaves
> us wondering where to put that; surely not under any one AM's directory.
> A new file in src/backend/access/index/ is one plausible solution.
> This file would also be a reasonable place to put the amvalidate()
> dispatch function itself.
>
> I'm somewhat leaning to the second choice, but perhaps someone has
> a better idea, or an argument against doing that.

Doesn't seem very modular.  How about putting common code there but AM-specific code in each AM's directory?  It would
benice if adding a new AM meant mostly adding a new directory, not much touching the common code. 

...Robert


Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Jan 16, 2016, at 11:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> There are a couple of things we could do instead:
>> 
>> * Put each amvalidate function into its own file (but probably keep it
>> in the same directory as now).  This is a reasonable response to
>> points #1 and #2 but isn't very much help for #3.
>> 
>> * Collect the amvalidate functions into one file, which then leaves
>> us wondering where to put that; surely not under any one AM's directory.
>> A new file in src/backend/access/index/ is one plausible solution.
>> This file would also be a reasonable place to put the amvalidate()
>> dispatch function itself.
>> 
>> I'm somewhat leaning to the second choice, but perhaps someone has
>> a better idea, or an argument against doing that.

> Doesn't seem very modular.  How about putting common code there but AM-specific code in each AM's directory?  It
wouldbe nice if adding a new AM meant mostly adding a new directory, not much touching the common code.
 

Then we're going to end up with option A; and I suspect that we'll never
bother with factoring out any common code, because it won't be worth the
notational trouble if it involves common code that's in a different file
in a different directory.

As for modularity, nobody's moaned particularly about the amcostestimate
functions all being in selfuncs.c.  It all depends on what you think is
"modular".
        regards, tom lane



Re: WIP: Rework access method interface

От
Andres Freund
Дата:
On January 16, 2016 6:32:47 PM GMT+01:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>As for modularity, nobody's moaned particularly about the
>amcostestimate
>functions all being in selfuncs.c.  It all depends on what you think is
>"modular".

Well, back then you couldn't really have a production grade indeed as an extension. With this were getting pretty close
tobeing able to do that for a bunch of useful indexes.
 

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.



Re: WIP: Rework access method interface

От
Amit Kapila
Дата:
On Sat, Jan 16, 2016 at 9:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > [ aminterface-13.patch ]
>
> I've started to review this.  There are a bunch of cosmetic things I don't
> like, notably the include-file nesting you've chosen, but they're fixable.
> One item that I think could use some discussion is where to put the new
> amvalidate functions.  I don't especially like your choice to drop them
> into nbtree.c, gist.c, etc, for a couple of reasons:
>
> 1. These aren't really at the same semantic level as functions like
> btinsert or btgettuple; they're not part of the implementation of an
> index, and indeed are *users* of indexes (at least of the catalog
> indexes).
>
> 2. This approach substantially bloats the #include lists for the
> relevant files, which again is a token of the validate functions not
> belonging where they were put.
>
> 3. There's probably room to share code across the different validators;
> but this design isn't very amenable to that.
>
> A comparison point worth noting is that the amcostestimate functions
> are in more or less the same boat: they aren't part of the index
> implementation in any meaningful way, but are really part of the
> planner instead.  Those are all in selfuncs.c, not under backend/access
> at all.
>
> There are a couple of things we could do instead:
>
> * Put each amvalidate function into its own file (but probably keep it
> in the same directory as now).  This is a reasonable response to
> points #1 and #2 but isn't very much help for #3.
>

Shouldn't we try to move amhandler function as well along with
amvalidate?  I think moving each am's handler and validate into
am specific new file can make this arrangement closer to what
we have for PL's (ex. we have plpgsql_validator and plpgsql_call_
handler in pl_handler.c and similar handler and validator functions
for other languages in their corresponding modules).


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Amit Kapila <amit.kapila16@gmail.com> writes:
> Shouldn't we try to move amhandler function as well along with
> amvalidate?  I think moving each am's handler and validate into
> am specific new file can make this arrangement closer to what
> we have for PL's (ex. we have plpgsql_validator and plpgsql_call_
> handler in pl_handler.c and similar handler and validator functions
> for other languages in their corresponding modules).

I feel no great need to move the amhandler functions, and if we did,
I would not want to put them into the same files as the amvalidate
functions.  As I said before, the latter are appendages to the AMs
that really don't have anything to do with the core index access code.
They require very different sets of #include files, for instance.

So I see the AMs as containing three separate subsets of code:
core index access/maintenance, amcostestimate, and amvalidate.
The second and third really need to be in separate files because of
#include footprint considerations, but the amhandler function can
perfectly well go in with the first group.
        regards, tom lane



Re: WIP: Rework access method interface

От
Tom Lane
Дата:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> [ aminterface-13.patch ]

I've committed this after some rather significant rework, not all of
it cosmetic in nature.  For example, the patch fell over under
CLOBBER_CACHE_ALWAYS (planner failing to copy data out of relcache
entries that might not survive long) and on 32-bit machines (failure
to fix index_getbitmap API properly).  I might've missed some other
portability issues; we'll see what the buildfarm says.

The amvalidate functions could do with quite a bit more work in the
long run.  For now, they more or less replace the queries we had to
remove from opr_sanity, but I'd like to see almost all of what
opr_sanity does with opclasses get pushed down, so that these functions
can provide a test facility for extensions' opclasses.  That does not
need to hold up adoption of the patch, though.
        regards, tom lane



Re: WIP: Rework access method interface

От
Robert Haas
Дата:
On Sat, Jan 16, 2016 at 12:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Then we're going to end up with option A; and I suspect that we'll never
> bother with factoring out any common code, because it won't be worth the
> notational trouble if it involves common code that's in a different file
> in a different directory.

Since when is sharing code across files in different directories even
an iota more difficult than sharing code across files in the same
directory?

Sharing code across multiple files is only slightly more difficult
than sharing it within a file.  You just have to create a header file
in the appropriate place and stick the necessary declarations in
there.  I'll admit that's some work, but it's not usually very much.
After that, where the header gets included from really makes no
difference.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company