Re: performance libpq vs JDBC

Поиск
Список
Период
Сортировка
От Merlin Moncure
Тема Re: performance libpq vs JDBC
Дата
Msg-id AANLkTik-VoMb-8ok3dFTj8nsnAirigAcA9_41BZ2Lbca@mail.gmail.com
обсуждение исходный текст
Ответ на Re: performance libpq vs JDBC  (Richard Huxton <dev@archonet.com>)
Список pgsql-performance
On Thu, Dec 16, 2010 at 7:14 AM, Richard Huxton <dev@archonet.com> wrote:
> On 16/12/10 09:21, Werner Scholtes wrote:
>>
>> I assume that the wire protocol of PostgreSQL allows to transmit
>> multiple rows at once, but libpq doesn't have an interface to access it.
>> Is that right?
>
> Sounds wrong to me. The libpq client is the default reference implementation
> of the protocol. If there were large efficiencies that could be copied, they
> would be.
>
> Anyway - you don't need to assume what's in the protocol. It's documented
> here:
>  http://www.postgresql.org/docs/9.0/static/protocol.html
>
> I'd stick wireshark or some other network analyser on the two sessions - see
> exactly what is different.

There is only one explanation for the difference: they are slamming
data across the wire without waiting for the result.  libpq queries
are synchronous: you send a query, wait for the result.  This means
for very simple queries like the above you can become network bound.

In C/C++ you can work around this using a couple of different methods.
 COPY of course is the fastest, but extremely limiting in what it can
do.  We developed libpqtypes (I love talking about libpqtypes) to deal
with this problem.  In the attached example, it stacks data into an
array in the client, sends it to the server which unnests and inserts
it.  The attached example inserts a million rows in about 11 seconds
on my workstation (client side prepare could knock that down to 8 or
so).

If you need to do something fancy, the we typically create a receiving
function on the server in plpgsql which unnests() the result and makes
decisions, etc.  This is extremely powerful and you can compose and
send very rich data to/from postgres in a single query.

merlin

#include "libpq-fe.h"
#include "libpqtypes.h"

#define INS_COUNT 1000000

int main()
{
 int i;

 PGconn *conn = PQconnectdb("dbname=pg9");
 PGresult *res;
 if(PQstatus(conn) != CONNECTION_OK)
 {
   printf("bad connection");
   return -1;
 }

 PQtypesRegister(conn);

 PGregisterType type = {"ins_test", NULL, NULL};
 PQregisterComposites(conn, &type, 1);

 PGparam *p =  PQparamCreate(conn);
 PGarray arr;
 arr.param = PQparamCreate(conn);
 arr.ndims = 0;

 PGparam *t = PQparamCreate(conn);

 for(i=0; i<INS_COUNT; i++)
 {
   PGint4 a=i;
   PGtext b = "some_text";
   PGtimestamp c;
   PGbytea d;

   d.len = 8;
   d.data = b;

   c.date.isbc   = 0;
   c.date.year   = 2000;
   c.date.mon    = 0;
   c.date.mday   = 19;
   c.time.hour   = 10;
   c.time.min    = 41;
   c.time.sec    = 6;
   c.time.usec   = 0;
   c.time.gmtoff = -18000;

   PQputf(t, "%int4 %text %timestamptz %bytea", a, b, &c, &d);
   PQputf(arr.param, "%ins_test", t);
   PQparamReset(t);
 }

 if(!PQputf(p, "%ins_test[]", &arr))
 {
   printf("putf failed: %s\n", PQgeterror());
   return -1;
 }
 res = PQparamExec(conn, p, "insert into ins_test select * from
unnest($1) r(a, b, c, d)", 1);

 if(!res)
 {
   printf("got %s\n", PQgeterror());
   return -1;
 }
 PQclear(res);
 PQparamClear(p);
 PQfinish(conn);
}

В списке pgsql-performance по дате отправления:

Предыдущее
От: Krzysztof Nienartowicz
Дата:
Сообщение: Re: Help with bulk read performance
Следующее
От: "Pierre C"
Дата:
Сообщение: Re: Help with bulk read performance