Обсуждение: Working with PostgreSQL enums in C code

Поиск
Список
Период
Сортировка

Working with PostgreSQL enums in C code

От
Joseph Adams
Дата:
I encountered a situation while implementing JSON support where I
needed to return an enum value from a C function.  To clarify, here's
the SQL:

CREATE TYPE json_type_t AS ENUM ('null', 'string', 'number', 'bool',
'object', 'array');

CREATE OR REPLACE FUNCTION json_type(json)
RETURNS json_type_t
AS 'MODULE_PATHNAME','json_get_type'
LANGUAGE C STRICT IMMUTABLE;

I initially tried looking for another function returning an enum in
the PostgreSQL source tree, but I couldn't find any.  I guess this is
because enums are a relatively new feature in PostgreSQL.

I learned that to return an enum value from C, one needs to return the
OID of the right row of the pg_enum table.  I eventually managed to
write the code below, which is mostly based on the enum_in function in
src/backend/utils/adt/enum.c .

#define PG_RETURN_ENUM(typname, label) return enumLabelToOid(typname, label)

static Oid enumLabelToOid(const char *typname, const char *label)
{Oid         enumtypoid;HeapTuple   tup;Oid         ret;enumtypoid =
TypenameGetTypid(typname);Assert(OidIsValid(enumtypoid));tup= SearchSysCache2(ENUMTYPOIDNAME,
ObjectIdGetDatum(enumtypoid),                     CStringGetDatum(label));Assert(HeapTupleIsValid(tup));ret =
HeapTupleGetOid(tup);ReleaseSysCache(tup);returnret;
 
}

Feel free to nitpick the code above, as I'm still learning.  Note that
I replaced the more robust validity checks of enum_in with (quicker?)
asserts, with the assumption that correct programs would only pass
valid values to PG_RETURN_ENUM .

The code using the method above can be found here:

http://git.postgresql.org/gitweb?p=json-datatype.git;a=tree;f=contrib/json;h=1dd813da4016b31f35cb39b01c6d5f0999da672e;hb=092fa046f95580dd7906a07370ca401692a1f818
.  My testcases passed, so everything seems to work.

I suppose my PG_RETURN_ENUM macro is nice and simple, except for the
fact that the coder has to keep an enum names table in sync with the
SQL code and the C code.  However, going the other way around
(PG_GETARG_ENUM) would need access to that enum names table.  Hence,
it'd make sense to have macros for defining this table so both
PG_RETURN_ENUM and PG_GETARG_ENUM can reference it.

I believe that these macros would be a useful addition to the
PostgreSQL function manager API, as they would provide a decent way to
receive and return custom enums from C code.  Anyone agree/disagree?


Joey Adams


Re: Working with PostgreSQL enums in C code

От
Robert Haas
Дата:
On Fri, May 28, 2010 at 12:07 AM, Joseph Adams
<joeyadams3.14159@gmail.com> wrote:
> I learned that to return an enum value from C, one needs to return the
> OID of the right row of the pg_enum table.  I eventually managed to
> write the code below, which is mostly based on the enum_in function in
> src/backend/utils/adt/enum.c .

PG_RETURN macros shouldn't do any nontrivial processing (see the
existing ones for references).  I assume you have the enum labels
declared in pg_enum.h, so I think you can just return the correct OID
values directly.  Declare constants for them in pg_enum.h and then
just do PG_RETURN_OID(whatever).

#define JSONTypeNullOid   ...
#define JSONTypeStringOid ...
#define JSONTypeNumberOid ...

It really shouldn't be necessary to do a catalog lookup to retrieve a constant.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Working with PostgreSQL enums in C code

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, May 28, 2010 at 12:07 AM, Joseph Adams
> <joeyadams3.14159@gmail.com> wrote:
>> I learned that to return an enum value from C, one needs to return the
>> OID of the right row of the pg_enum table. �I eventually managed to
>> write the code below, which is mostly based on the enum_in function in
>> src/backend/utils/adt/enum.c .

> PG_RETURN macros shouldn't do any nontrivial processing (see the
> existing ones for references).

Yeah, that was my first reaction too.  If we don't already have one,
it would be appropriate to provide a "lookup enum value" function
(functionally about the same as enum_in, but designed to be called
conveniently from C).  Then, if you needed to work from a textual
enum label, you'd call that function and then PG_RETURN_OID.

However, for a built-in enum type, I agree with Robert's solution of
just #define-ing fixed OIDs for the values of the type.
        regards, tom lane


Re: Working with PostgreSQL enums in C code

От
Joseph Adams
Дата:
On Fri, May 28, 2010 at 9:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, May 28, 2010 at 12:07 AM, Joseph Adams
>> <joeyadams3.14159@gmail.com> wrote:
>>> I learned that to return an enum value from C, one needs to return the
>>> OID of the right row of the pg_enum table.  I eventually managed to
>>> write the code below, which is mostly based on the enum_in function in
>>> src/backend/utils/adt/enum.c .
>
>> PG_RETURN macros shouldn't do any nontrivial processing (see the
>> existing ones for references).
>
> Yeah, that was my first reaction too.  If we don't already have one,
> it would be appropriate to provide a "lookup enum value" function
> (functionally about the same as enum_in, but designed to be called
> conveniently from C).  Then, if you needed to work from a textual
> enum label, you'd call that function and then PG_RETURN_OID.

Here is the function I wrote to look up enum values:
Oid enum_label_to_oid(const char *typname, const char *label){    Oid            enumtypoid;    HeapTuple    tup;
Oid           ret;    enumtypoid = TypenameGetTypid(typname);    Assert(OidIsValid(enumtypoid));    tup =
SearchSysCache2(ENUMTYPOIDNAME,                         ObjectIdGetDatum(enumtypoid),
CStringGetDatum(label));   Assert(HeapTupleIsValid(tup));    ret = HeapTupleGetOid(tup);    ReleaseSysCache(tup);
returnret;} 

If this were added to PostgreSQL proper, what source/header files
would it make sense to put it in?  enum.c/builtins.h ?

> However, for a built-in enum type, I agree with Robert's solution of
> just #define-ing fixed OIDs for the values of the type.

I don't know if the JSON datatype will be a contrib module or built-in
yet, but if it were contrib, would it still be better to use fixed
OIDs anyway?  One issue with setting this precedent is that new
contrib modules using enums wouldn't be compatible with older versions
of PostgreSQL.  One idea might be to insert extend CREATE TYPE name AS
ENUM so OIDs can be specified explicitly, but then that could lead to
OID clashes.  That would be a really messy problem for users.

By the way, just curious: why can't the internal representation of an
enum just be an INT starting from 0 by default, like in C?  That would
make a heck of a lot more sense, in my opinion.  It might also allow
users to do things like this in the future:

CREATE TYPE number AS ENUM ('one' = 1, 'two' = 2, 'three' = 3);


Joey Adams


Re: Working with PostgreSQL enums in C code

От
Alvaro Herrera
Дата:
Excerpts from Joseph Adams's message of lun jun 07 17:16:12 -0400 2010:

> > However, for a built-in enum type, I agree with Robert's solution of
> > just #define-ing fixed OIDs for the values of the type.
> 
> I don't know if the JSON datatype will be a contrib module or built-in
> yet, but if it were contrib, would it still be better to use fixed
> OIDs anyway?

If it were contrib, fixed OIDs wouldn't cut it precisely for that
reason.  (A contrib module is not "built-in" for these purposes.)

> One issue with setting this precedent is that new
> contrib modules using enums wouldn't be compatible with older versions
> of PostgreSQL.  One idea might be to insert extend CREATE TYPE name AS
> ENUM so OIDs can be specified explicitly, but then that could lead to
> OID clashes.  That would be a really messy problem for users.

Yeah.  We've just defined an interface for pg_migrator-only usage,
allowing it to define the OID values of ENUMs; it wasn't considered a
good idea to expose the details to the user.

> By the way, just curious: why can't the internal representation of an
> enum just be an INT starting from 0 by default, like in C?  That would
> make a heck of a lot more sense, in my opinion.  It might also allow
> users to do things like this in the future:
> 
> CREATE TYPE number AS ENUM ('one' = 1, 'two' = 2, 'three' = 3);

The problem is that the output function needs to be able to figure out
the value starting with only the datum value.  If it had only the "1"
it couldn't know what enum it'd correspond to.  The other alternative
would have been to make enums 64 bits wide, carrying the enum OID in 32
bits and the value in the other 32.  This was dismissed as too wasteful.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Working with PostgreSQL enums in C code

От
Robert Haas
Дата:
On Mon, Jun 7, 2010 at 5:16 PM, Joseph Adams <joeyadams3.14159@gmail.com> wrote:
> I don't know if the JSON datatype will be a contrib module or built-in
> yet, but if it were contrib, would it still be better to use fixed
> OIDs anyway?

Part of the point is that EXPLAIN (FORMAT JSON) should return json, so
this needs to be built-in.  Otherwise, that won't work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Working with PostgreSQL enums in C code

От
Andrew Dunstan
Дата:

Joseph Adams wrote:
>
> By the way, just curious: why can't the internal representation of an
> enum just be an INT starting from 0 by default, like in C?  That would
> make a heck of a lot more sense, in my opinion.  It might also allow
> users to do things like this in the future:
>
>
>   

Please review the debates over the internal representation from several 
years ago when enums were implemented. Essentially the difficulty is 
that the output function needs to get nothing more than the value 
itself, and that means the representation needs to carry with it some 
information about *which* enum set it is in.

cheers

andrew