Обсуждение: Extending SQL in C using VARIABLE length type

Поиск
Список
Период
Сортировка

Extending SQL in C using VARIABLE length type

От
Carsten Kropf
Дата:
Hello everybody,
I am quite a novice in using the extension features of the PostgreSQL database. Actually, I have to do this for work at the university. At the moment, I am trying around a little bit with creating my own types using shared objects, written in C. The usage of static types with fixed length was actually no problem for me, so I proceeded to variable length types.
I created an n-dimensional point structure called "PointND" that contains a field of float8 values of dynamic length. I also put in a int4/int32 field for the length specification, as required by the documentation. So the structure looks like the following:
struct PointND
{
int32 dimensions;
float8 coordinates[1];
};
I hope, that at least this layout is as it is required. Together with this type, I also provide the required in and out functions to convert the structure to the internal/external representation properly.
The in/out functions work properly when giving the following statement (I also tried it using a debugger):
select '(4,5,6)'::pointnd;
            pointnd             
--------------------------------
 (4.000000, 5.000000, 6.000000)
(1 row)

So it seems, that at least these functions do what they are supposed to.
The problem I have is that if I now create a table that should store entries of the type pointnd, it won't store them actually. If I do an insert like the following:
insert into test (point) values ('(5,16,6)'::pointnd);
INSERT 0 1

I get the feedback that one new row has been created. Actually this row has been created and the in function is also called (I also checked this using the debugger). Now, I would have expected something like the following, when querying the table:
select * from test;
             point              
--------------------------------
 (5.000000, 16.000000, 6.000000)

But, actually I get the following:
select * from test;
             point              
--------------------------------
 (0.000000, 0.000000, 0.000000)

The SQL-Script used to create the type can be seen here:
CREATE TYPE pointnd(
INTERNALLENGTH = VARIABLE,
ALIGNMENT=DOUBLE,
INPUT=pointnd_in,
OUTPUT=pointnd_out,
RECEIVE=pointnd_recv,
SEND=pointnd_send,
STORAGE=PLAIN
);

I played around with the parameters a little bit, but still don't know where this behaviour comes from. Actually, I was thinking that I conform to the requirements given by Postgres after having read the documentation. Storage type set to another method (like MAIN) will result in a segmentation fault, though.
I would be very glad, if somebody could provide me some help to this issue because I could proceed with my "actual" work, after that.
Thank you in advance

Best regards

Carsten Kropf

Re: Extending SQL in C using VARIABLE length type

От
Yeb Havinga
Дата:
Carsten Kropf wrote:
> The usage of static types with fixed length was actually no problem
> for me, so I proceeded to variable length types.
> I created an n-dimensional point structure called "PointND" that
> contains a field of float8 values of dynamic length. I also put in a
> int4/int32 field for the length specification, as required by the
> documentation. So the structure looks like the following:
> struct PointND
> {
> int32 dimensions;
> float8 coordinates[1];
> };
The structure should begin with a int32 vl_len_ header. At creation /
palloc time the size must be set with the SET_VARSIZE macro, and the
size can be queried with the VARSIZE_ * macros -> doxygen.postgresql.org
is your friend here. Take a look at e.g. contrib/cube for examples.

regards,
Yeb Havinga


Re: Extending SQL in C using VARIABLE length type

От
Carsten Kropf
Дата:
Actually, I thought, I did this using the int32 variable called "dimension" which should be exactly this field. Unfortunately, it seems, that something is wrong here. I'll look inside the code of cube to determine the things I'm doing wrong, currently. Thanks so far for your advice.
My in-method is the following:
Datum pointnd_in(PG_FUNCTION_ARGS)
{
PointND * in = NULL;
// access the format string
char * in_string = PG_GETARG_CSTRING(0);
char * save = in_string;
// 1024 bytes should be sufficient for one coordinate
char curr_buffer[1024];
bool corrupted = false;
float8 * coordinates = NULL;
unsigned int dimensions = 0, i = 0;

coordinates = (float8 *) palloc(sizeof(float8));

// allocate the memory
// read bytewise and count the ',' in order to determine the amount of dimensions, after that: parse the point
in_string = find_char(in_string, '(');
// next sign after opening bracket
++in_string;
// read as long, as no closing bracket has been found
for (dimensions = 0; *in_string != ')' && !corrupted; ++dimensions)
{
// clear the memory
memset(curr_buffer, 0, 1024);
for (i = 0; *in_string != ',' && *in_string != 0 && *in_string != ')'; ++in_string, ++i)
{
if ((*in_string < '0' || *in_string > '9') && *in_string != '.')
{
corrupted = true;
break;
}
// copy current sign
curr_buffer[i] = *in_string;
}

// something has happened here (no valid number)
if (corrupted)
{
break;
}
coordinates = (float8 *) repalloc(coordinates, (dimensions + 1) * sizeof(float8));
//sscanf(curr_buffer, "%f", &coordinates[dimensions]);
coordinates[dimensions] = strtod(curr_buffer, NULL);

// if we have a comma here, skip it
if (*in_string == ',')
{
++in_string;
}
if (*in_string == ' ')
{
// skip space
++in_string;
}
}

// something lead to a corruption of the point
if (corrupted)
{
ereport(ERROR, (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), errmsg("invalid representation of a point: %s, has to look like \"(<coord1>,<coord2>,...)\"", save)));
}
else
{
in = (PointND *) palloc(sizeof(float8) * dimensions + VARHDRSZ);
SET_VARSIZE(in, dimensions);
// copy the coordinates to the data area
memcpy((void *) VARDATA(in), (void *) coordinates, (dimensions) * sizeof(float8));
}

PG_RETURN_POINTER(in);
}

So, probably in here, I am doing something terribly wrong and you could correct me at this point. As I already said, I thought that my int32 dimensions variable would represent exactly this vl_len_header field. The remaining stuff is OK then? That means the creation script and information about storage alignment and layout or do you detect addititonal errors?
I know, that this code might not be the best, but I am still in the testing phase of how to achieve sth using Postgres with C.
Thanks so far for your advice
Best regards
Carsten Kropf
Am 10.02.2010 um 11:39 schrieb Yeb Havinga:


Carsten Kropf wrote:
The usage of static types with fixed length was actually no problem for me, so I proceeded to variable length types.
I created an n-dimensional point structure called "PointND" that contains a field of float8 values of dynamic length. I also put in a int4/int32 field for the length specification, as required by the documentation. So the structure looks like the following:
struct PointND
{
int32 dimensions;
float8 coordinates[1];
};
The structure should begin with a int32 vl_len_ header. At creation / palloc time the size must be set with the SET_VARSIZE macro, and the size can be queried with the VARSIZE_ * macros -> doxygen.postgresql.org is your friend here. Take a look at e.g. contrib/cube for examples.

regards,
Yeb Havinga


Re: Extending SQL in C using VARIABLE length type

От
Yeb Havinga
Дата:
Carsten Kropf wrote:
> Actually, I thought, I did this using the int32 variable called
> "dimension" which should be exactly this field.
yes.
> in = (PointND *) palloc(sizeof(float8) * dimensions + VARHDRSZ);
> SET_VARSIZE(in, dimensions);
What about

len = sizeof(float8) * dimensions + VARHDRSZ;
in = (PointND *) palloc0(len);
SET_VARSIZE(in, len);



Re: Extending SQL in C using VARIABLE length type

От
Carsten Kropf
Дата:
Oh, I see, does the VARSIZE length field have to be the total number of bytes occupied (including VARHDRSZ and the size
ofthe structure) or only the size that is used by "my" datatype? Then it would become pretty much obvious, why this is
notsupposed to work. 
I'll try it out then.

regards
    Carsten Kropf
Am 10.02.2010 um 12:04 schrieb Yeb Havinga:

> Carsten Kropf wrote:
>> Actually, I thought, I did this using the int32 variable called "dimension" which should be exactly this field.
> yes.
>> in = (PointND *) palloc(sizeof(float8) * dimensions + VARHDRSZ);
>> SET_VARSIZE(in, dimensions);
> What about
>
> len = sizeof(float8) * dimensions + VARHDRSZ;
> in = (PointND *) palloc0(len);
> SET_VARSIZE(in, len);
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


Re: Extending SQL in C using VARIABLE length type

От
Yeb Havinga
Дата:
Carsten Kropf wrote:
> Oh, I see, does the VARSIZE length field have to be the total number of bytes occupied (including VARHDRSZ and the
sizeof the structure) or only the size that is used by "my" datatype? 
Yes
>  Then it would become pretty much obvious, why this is not supposed to work.
> I'll try it out then.
>
My €0,02: rename the dimensions to vl_len_ to avoid confusion and get
compiler errors where you now use 'dimension'. Add a macro that converts
a pointnd structure to dimension int by taking the VARSIZE_ANY_EXHDR /
sizeof(float8) and use it where dimension is used now. Or if your
database is small you could keep dimension in the structure.


Re: Extending SQL in C using VARIABLE length type

От
Carsten Kropf
Дата:
Thanks for the hint according to the cube, this was actually exactly what I have been looking for. I wanted to do
somethingsimilar like the cube but I didn't think that it would be implemented in multiple dimension. I just thought,
thecube were a 3-d construct, but as I see in the sources, it is, in fact, n-dimensional. So my problems are solved
here.Btw I could manage to get my own point to be saved in the database using your hints, thanks for this. 

regards
    Carsten Kropf
Am 10.02.2010 um 12:20 schrieb Yeb Havinga:

> Carsten Kropf wrote:
>> Oh, I see, does the VARSIZE length field have to be the total number of bytes occupied (including VARHDRSZ and the
sizeof the structure) or only the size that is used by "my" datatype? 
> Yes
>> Then it would become pretty much obvious, why this is not supposed to work.
>> I'll try it out then.
>>
> My €0,02: rename the dimensions to vl_len_ to avoid confusion and get compiler errors where you now use 'dimension'.
Adda macro that converts a pointnd structure to dimension int by taking the VARSIZE_ANY_EXHDR / sizeof(float8) and use
itwhere dimension is used now. Or if your database is small you could keep dimension in the structure. 
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


Re: Extending SQL in C using VARIABLE length type

От
Carsten Kropf
Дата:
Thanks a lot so far. I adopted my structures and am now storing two fields (v_len_ and dimensions) and the storage is now working properly. If I now would try to combine two of these points to a range (like cube) including an upper and a lower bound n-dimensional point structure, I don't get the point how to achieve this.
I tried around a little bit and ended up with a structure like the following:
/**
 * basic structure definition for a range containing an upper and a lower point (in multiple dimensions)
 */
struct Range
{
int32 v_len_;
/**
* the upper limit in each dimension
*/
struct PointND * upper;
/**
* the lower limit in each dimension
*/
struct PointND * lower;
};
However, the problem is again, how to put this range into a table. Actually, I don't know exactly, how to do this, I tried the following:
len = VARSIZE(upper) + VARSIZE(lower) + VARHDRSZ + 2 * sizeof(struct Point *);
result = (Range *) palloc0(len);
// result->upper = upper;
// result->lower = lower;
memcpy((void *) result->upper, (void *) upper, VARSIZE(upper));
memcpy((void *) result->lower, (void *) lower, VARSIZE(lower));

// set the var size
SET_VARSIZE(result, len);
But this didn't do the trick. I did not yet find sth in the code of postgres, how to build such a combined type, unfortunately (or I did not look at the right places until now). How would one do this?

Thanks in advance
regards
Carsten Kropf
Am 10.02.2010 um 12:20 schrieb Yeb Havinga:

Carsten Kropf wrote:
Oh, I see, does the VARSIZE length field have to be the total number of bytes occupied (including VARHDRSZ and the size of the structure) or only the size that is used by "my" datatype?
Yes
Then it would become pretty much obvious, why this is not supposed to work.
I'll try it out then.
 
My €0,02: rename the dimensions to vl_len_ to avoid confusion and get compiler errors where you now use 'dimension'. Add a macro that converts a pointnd structure to dimension int by taking the VARSIZE_ANY_EXHDR / sizeof(float8) and use it where dimension is used now. Or if your database is small you could keep dimension in the structure.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Extending SQL in C using VARIABLE length type

От
Tom Lane
Дата:
Carsten Kropf <ckropf2@fh-hof.de> writes:
> Thanks a lot so far. I adopted my structures and am now storing two fields (v_len_ and dimensions) and the storage is
nowworking properly. If I now would try to combine two of these points to a range (like cube) including an upper and a
lowerbound n-dimensional point structure, I don't get the point how to achieve this. 

Well, you can either make that a separate data type with its own
specialized functions for extracting the points, or you can make it into
an array, or possibly a composite type (record).  The example code seems
to be headed towards the first of these but maybe you should consider
the alternatives.

The main problem with your example code seems to be that it's confusing
a struct with a pointer to a struct.  If you made the struct members
be "struct PointND" rather than pointer to same, it would probably
work, as long as points aren't actually variable-length.  Otherwise
you'd need to deal with the fact that "lower" isn't really at a fixed
offset in the larger struct.

            regards, tom lane

Re: Extending SQL in C using VARIABLE length type

От
Carsten Kropf
Дата:
Thanks for this hint, I already got it to work in the meantime.
My approach now (based on the fact, that PointND is indeed a variable length type) is to have the following structure:
struct Range
{
    int vl_len_;
    struct PointND limits[1];
};
whereas now vl_len_ stores the total size of the structure (including the size of the limiting points) and the points
themselveskeep the layout contained in one of my previous mails. 
I programmed some macros to access the lower point (limits[0]) and the upper point (limits[1]), respectively.
This approach works also when putting it into a table, whereas it will not be like this, if I will be doing the whole
thingfor productive implementation. Each of the two points contains a vl_len_ field and a dimension field whereas the
dimensionshave to be the same for each point. So, according the storage of the range structure, this would yield an
overheadof 2 * vl_len_ and once the dimension field each of which is int32. So, in total, we would have an overhead of
12Bytes for each Range stored somewhere compared to a Range that has just one-dimensional offsets to show the
upper/lowerbound, as seen in cube. 
regards
    Carsten Kropf
Am 11.02.2010 um 16:39 schrieb Tom Lane:

> Carsten Kropf <ckropf2@fh-hof.de> writes:
>> Thanks a lot so far. I adopted my structures and am now storing two fields (v_len_ and dimensions) and the storage
isnow working properly. If I now would try to combine two of these points to a range (like cube) including an upper and
alower bound n-dimensional point structure, I don't get the point how to achieve this. 
>
> Well, you can either make that a separate data type with its own
> specialized functions for extracting the points, or you can make it into
> an array, or possibly a composite type (record).  The example code seems
> to be headed towards the first of these but maybe you should consider
> the alternatives.
>
> The main problem with your example code seems to be that it's confusing
> a struct with a pointer to a struct.  If you made the struct members
> be "struct PointND" rather than pointer to same, it would probably
> work, as long as points aren't actually variable-length.  Otherwise
> you'd need to deal with the fact that "lower" isn't really at a fixed
> offset in the larger struct.
>
>             regards, tom lane
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general