Thread: Using libpq, floats and binary data


Using libpq, floats and binary data

From: Michael Hull
Hi Guys,
I have just started to use the libpq interface to Postgres, and was
wondering if someone with some experience could explain the 'best
practices' :)

I am preparing a query, then submitting it with binary data. Six values
are ints, and one is a float.

For the ints, I understand that since I need network byte ordering, I
have to pump them through htonl(). I assume the same is true for the
float, so I have done some dirty, dirty casting, but this feels like a
source of problems later on. It does the correct thing, but is there a
better way of doing it?


Mike


const int nParams = 7;

char qString[] =
    "INSERT INTO raw_results (ca, kf, ks, lk, na, iinj, AP1height) "
    "VALUES ($1::int4, $2::int4, $3::int4, $4::int4, $5::int4, "
    "$6::int4, $7::float4)";

PGresult* pPreparedQuery = PQprepare(pDB, "InsertQ", qString, nParams, NULL);




float AP1Height = 3.141;
/* TROUBLE AHEAD! */
if( sizeof( float) != sizeof(int32_t) ) throw MyException();
int32_t AP_int = * ( (int32_t*) &AP1Height );
int32_t AP_intSwapped = htonl(AP_int);
float AP1HeightSwapped = *( (float*) &AP_intSwapped);
/* TROUBLE OVER */


int32_t ca= htonl(100);
int32_t kf= htonl(200);
int32_t ks= htonl(300);
int32_t lk= htonl(400);
int32_t na= htonl(500);
int32_t iinj= htonl(600);

const char* values[nParams] = {
    (char*)&ca,
    (char*)&kf,
    (char*)&ks,
    (char*)&lk,
    (char*)&na,   /* was (char*)&lk a second time: na was never sent */
    (char*)&iinj,
    (char*)&AP1HeightSwapped,
};
int lengths[nParams] = {
    sizeof(ca),
    sizeof(kf),
    sizeof(ks),
    sizeof(lk),
    sizeof(na),
    sizeof(iinj),
    sizeof(AP1HeightSwapped),
};

int binary[nParams] = {1, 1, 1, 1, 1, 1, 1};

PGresult* res = PQexecPrepared(pDB, "InsertQ", nParams, values,
                               lengths, binary, 0);
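A less fragile way to reinterpret the float's bytes (a sketch, not from the thread; `float_to_wire`/`wire_to_float` are hypothetical helper names) is to go through memcpy, which is well-defined where the pointer cast runs into strict-aliasing trouble:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helpers: reinterpret a float's bytes via memcpy, then
 * byte-swap with htonl/ntohl as for the ints. */
static uint32_t float_to_wire(float f)
{
    uint32_t u;
    assert(sizeof f == sizeof u);   /* same size guard as above */
    memcpy(&u, &f, sizeof u);       /* copy the raw bytes, no aliasing cast */
    return htonl(u);                /* host -> network byte order */
}

static float wire_to_float(uint32_t w)
{
    uint32_t u = ntohl(w);          /* network -> host byte order */
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

The swapped value then lives in a `uint32_t`, so you pass `(char*)&wire_value` with length `sizeof(uint32_t)` in the values/lengths arrays; the bytes never have to masquerade as a float at all.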

Re: Using libpq, floats and binary data

From: Merlin Moncure

On Wed, Sep 15, 2010 at 12:02 PM, Michael Hull
<mikehulluk@googlemail.com> wrote:
> [quoted text snipped]

there is a much better way -- libpqtypes:
http://libpqtypes.esilo.com/

char qString[] =
    "INSERT INTO raw_results (ca, kf, ks, lk, na, iinj, AP1height) "
    "VALUES (%int4, %int4, %int4, %int4, %int4, %int4, %float4)";

res = PQexecf(conn, qString, ca, kf, ks, lk, na, iinj, AP1Height);

getting data out is similarly easy.
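For example, reading a row back might look something like this (a sketch from memory of the libpqtypes API, not tested here, and it needs a live connection; check the libpqtypes docs for the exact signatures):

```c
#include <stdio.h>
#include <libpq-fe.h>
#include <libpqtypes.h>

/* Sketch: fetch one row with typed getters. Assumes PQinitTypes(conn)
 * has been called after connecting. Field numbers are 0-based. */
static void read_row(PGconn *conn)
{
    PGint4   ca;
    PGfloat4 ap1height;
    PGresult *res = PQexecf(conn,
        "SELECT ca, AP1height FROM raw_results LIMIT 1");

    if (!res || !PQgetf(res, 0, "%int4 %float4", 0, &ca, 1, &ap1height))
        fprintf(stderr, "fetch failed: %s\n", PQgeterror());
    else
        printf("ca=%d AP1height=%f\n", ca, ap1height);

    PQclear(res);
}
```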

merlin

Re: Using libpq, floats and binary data

From: Arjen Nienhuis
Hi,

Inserting many rows is almost always IO-bound. Converting ints and floats to text is CPU-bound and really fast anyway. To speed things up, first look at things like indexes, how often you need to COMMIT, or using COPY. Only then look at prepared statements and binary transfer modes; otherwise it's simply not worth the headache.

Groeten, Arjen

On Wed, Sep 15, 2010 at 6:02 PM, Michael Hull <mikehulluk@googlemail.com> wrote:
> [quoted text snipped]

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Using libpq, floats and binary data

From: Merlin Moncure
On Wed, Sep 15, 2010 at 6:18 PM, Arjen Nienhuis <a.g.nienhuis@gmail.com> wrote:
> Hi,
> Inserting many of rows is almost always IO bound. Converting ints and floats
> to text is CPU bound and really fast anyway. To speed things up first look
> at things like indexes, how often you need to COMMIT or using COPY. Only
> then look at prepared statements and binary transfer modes. Else it's simply
> not worth the headache.

That's an awfully broad statement, and untrue... many applications are
CPU-bound. It's easier to scale storage than CPU after a point. Also,
solid-state storage is going to become increasingly common going
forward.

Not all type parsing is as trivial as you claim; timestamps and bytea,
for example, are significantly cheaper to send in binary wire format.
Anyway, libpqtypes gives you all the advantages without the fuss. If
you are really looking to shave cycles, we let you prepare the format
string as well as the statement before sending it. We wrote this
interface for a reason: I'd say it cuts query time by around 20% on
average, in addition to the other advantages it provides.

merlin

Re: Using libpq, floats and binary data

From: Arjen Nienhuis
On Thu, Sep 16, 2010 at 12:37 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> [quoted text snipped]

I think we can agree on one thing: trying to do it without libpqtypes is a bad idea.

Groeten, Arjen

Re: Using libpq, floats and binary data

From: Merlin Moncure
On Wed, Sep 15, 2010 at 6:52 PM, Arjen Nienhuis <a.g.nienhuis@gmail.com> wrote:
> On Thu, Sep 16, 2010 at 12:37 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> [quoted text snipped]
>
> I think we can agree on one thing: trying to do it without libpqtypes is a
> bad idea.
> Groeten, Arjen

Yeah, except in the special case of bytea, where binary lets you avoid
escaping/encoding and a fairly large efficiency hit. The performance
benefit of sending ints in binary vs. text is barely measurable.
Timestamps are the biggest win among the common types, especially if
you are starting from something that is itself a binary time
representation.

Another reason a lot of people try to eke every cycle out of a query
(preparing the statement and such) is to cut query execution time as
much as possible, even in cases where it doesn't make much difference
in terms of overall system load. This can happen, for example, when you
have generated SQL that crawls the server record by record. We all know
this is often not the best way, but when working under those
constraints the method of query execution becomes quite important.
Exec prepared in binary vs. exec params in text can give a greater than
50% reduction in query time in such cases (some of that benefit is on
the client). If you're firing off big queries (in terms of running
time), obviously all this matters very little.

merlin