Re: [HACKERS] PATCH: Batch/pipelining support for libpq

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Дата
Msg-id 20170622002908.jh5jlipswgsldswk@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: [HACKERS] PATCH: Batch/pipelining support for libpq  (Andres Freund <andres@anarazel.de>)
Ответы Re: [HACKERS] PATCH: Batch/pipelining support for libpq  (Craig Ringer <craig@2ndquadrant.com>)
Список pgsql-hackers
On 2017-06-21 16:40:48 -0700, Andres Freund wrote:
> On 2017-06-20 17:51:23 +0200, Daniel Verite wrote:
> >     Andres Freund wrote:
> > 
> > > FWIW, I still think this needs a pgbench or similar example integration,
> > > so we can actually properly measure the benefits.
> > 
> > Here's an updated version of the patch I made during review,
> > adding \beginbatch and \endbatch to pgbench.
> > The performance improvement appears clearly
> > with a custom script of this kind:
> >   \beginbatch
> >      UPDATE pgbench_branches SET bbalance = bbalance + 1 WHERE bid = 0;
> >      ..above repeated 1000 times...
> >   \endbatch
> > 
> > versus the same with a BEGIN; END; pair instead of \beginbatch \endbatch
> > 
> > On localhost on my desktop I tend to see a 30% difference in favor
> > of the batch mode with that kind of test.
> > On slower networks there are much bigger differences.
> 
> This is seriously impressive.  Just using the normal pgbench mixed
> workload, wrapping a whole transaction into a batch *doubles* the
> throughput.  And that's locally over a unix socket - the gain over
> actual network will be larger.

I've not analyzed this further, but something with the way network is
done isn't yet quite right either in the pgbench patch or in the libpq
patch.  You'll currently get IO like:

sendto(3, "B\0\0\0\22\0P1_2\0\0\0\0\0\0\1\0\0D\0\0\0\6P\0E\0\0\0\t\0"..., 36, MSG_NOSIGNAL, NULL, 0) = 36
sendto(3, "B\0\0\0\35\0P1_4\0\0\0\0\1\0\0\0\0073705952\0\1\0\0D\0"..., 47, MSG_NOSIGNAL, NULL, 0) = 47
sendto(3, "B\0\0\0\35\0P1_6\0\0\0\0\1\0\0\0\0077740854\0\1\0\0D\0"..., 47, MSG_NOSIGNAL, NULL, 0) = 47
sendto(3, "B\0\0\0\35\0P1_8\0\0\0\0\1\0\0\0\0071570280\0\1\0\0D\0"..., 47, MSG_NOSIGNAL, NULL, 0) = 47
sendto(3, "B\0\0\0\36\0P1_10\0\0\0\0\1\0\0\0\0072634305\0\1\0\0D"..., 48, MSG_NOSIGNAL, NULL, 0) = 48
sendto(3, "B\0\0\0\36\0P1_12\0\0\0\0\1\0\0\0\0078960656\0\1\0\0D"..., 48, MSG_NOSIGNAL, NULL, 0) = 48
sendto(3, "B\0\0\0\36\0P1_14\0\0\0\0\1\0\0\0\0073030370\0\1\0\0D"..., 48, MSG_NOSIGNAL, NULL, 0) = 48
sendto(3, "B\0\0\0\35\0P1_16\0\0\0\0\1\0\0\0\006376125\0\1\0\0D\0"..., 47, MSG_NOSIGNAL, NULL, 0) = 47
sendto(3, "B\0\0\0\36\0P1_18\0\0\0\0\1\0\0\0\0072982423\0\1\0\0D"..., 48, MSG_NOSIGNAL, NULL, 0) = 48
sendto(3, "B\0\0\0\36\0P1_20\0\0\0\0\1\0\0\0\0073860195\0\1\0\0D"..., 48, MSG_NOSIGNAL, NULL, 0) = 48
sendto(3, "B\0\0\0\36\0P1_22\0\0\0\0\1\0\0\0\0072794433\0\1\0\0D"..., 48, MSG_NOSIGNAL, NULL, 0) = 48
sendto(3, "B\0\0\0\36\0P1_24\0\0\0\0\1\0\0\0\0075475271\0\1\0\0D"..., 48, MSG_NOSIGNAL, NULL, 0) = 48
sendto(3, "B\0\0\0\23\0P1_25\0\0\0\0\0\0\1\0\0D\0\0\0\6P\0E\0\0\0\t"..., 37, MSG_NOSIGNAL, NULL, 0) = 37
sendto(3, "S\0\0\0\4", 5, MSG_NOSIGNAL, NULL, 0) = 5
recvfrom(3, "2\0\0\0\4n\0\0\0\4C\0\0\0\nBEGIN\0002\0\0\0\4T\0\0\0!\0"..., 16384, 0, NULL, NULL) = 775
recvfrom(3, 0x559a02667ff2, 15630, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667fb1, 15695, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667f6c, 15764, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667f2b, 15829, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667eea, 15894, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667ea9, 15959, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667e68, 16024, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667e24, 16092, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667de3, 16157, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667da2, 16222, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667d5d, 16291, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667d1c, 16356, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x559a02667d06, 16378, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)

I.e. we're doing tiny write send() syscalls (they should be coalesced)
and then completely unnecessarily call recv() over and over again
without polling.  To me it looks very much like we're just doing either
exactly once per command...

- Andres



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] Optional message to user when terminating/cancelling backend
Следующее
От: Amit Langote
Дата:
Сообщение: Re: [HACKERS] Rules on table partitions