Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with alot of columns

Поиск
Список
Период
Сортировка
От Thom Brown
Тема Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with alot of columns
Дата
Msg-id CAA-aLv7UYu3L7gEeBjOH-7a1O4PmX9huR9dYm2m=NO9VNTP-YA@mail.gmail.com
обсуждение исходный текст
Ответ на [HACKERS] SendRowDescriptionMessage() is slow for queries with a lot of columns  (Andres Freund <andres@anarazel.de>)
Ответы Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with alot of columns  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On 14 September 2017 at 07:34, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> When running workloads that include fast queries with a lot of columns,
> SendRowDescriptionMessage(), and the routines it calls, becomes a
> bottleneck.  Besides syscache lookups (see [1] and [2]) a major cost of
> that is manipulation of the StringBuffer and the version specific
> branches in the per-attribute loop.  As so often, the performance
> differential of this patch gets bigger when the other performance
> patches are applied.
>
> The issues in SendRowDescriptionMessage() are the following:
>
> 1) All the pq_sendint calls, and also the pq_sendstring, are
>    expensive. The amount of calculations done for a single 2/4 byte
>    addition to the stringbuffer (particularly enlargeStringInfo()) is
>    problematic, as are the reallocations themselves.
>
>    I addressed this by adding pq_send*_pre() wrappers, implemented as
>    inline functions, that require that memory is pre-allocated.
>    Combining that with doing a enlargeStringInfo() in
>    SendRowDescriptionMessage() that pre-allocates the maximum required
>    memory, yields pretty good speedup.
>
>    I'm not yet super sure about the implementation. For one, I'm not
>    sure this shouldn't instead be stringinfo.h functions, with very very
>    tiny pqformat.h wrappers. But conversely I think it'd make a lot of
>    sense for the pqformat integer functions to get rid of the
>    continually maintained trailing null-byte - I was hoping the compiler
>    could optimize that away, but alas, no luck.  As soon as a single
>    integer is sent, you can't rely on 0 terminated strings anyway.
>
> 2) It creates a new StringInfo in every iteration. That causes
>    noticeable memory management overhead, and restarts the size of the
>    buffer every time. When the description is relatively large, that
>    leads to a number of reallocations for every
>    SendRowDescriptionMessage() call.
>
>    My solution here was to create persistent StringInfo for
>    SendRowDescriptionMessage() that never gets deallocated (just
>    reset). That in combination with new versions of
>    pq_beginmessage/endmessage that keep the buffer alive, yields a nice
>    speedup.
>
>    Currently I'm using a static variable to allocate a string buffer for
>    the function. It'd probably better to manage that outside of a single
>    function - I'm also wondering why we're allocating a good number of
>    stringinfos in various places, rather than reuse them. Most of them
>    can't be entered recursively, and even if that's a concern, we could
>    have one reusable per portal or such.
>
> 3) The v2/v3 branches in the attribute loop are noticeable (others too,
>    but well...).  I solved that by splitting out the v2 and v3
>    per-attribute loops into separate functions.  Imo also a good chunk
>    more readable.
>
> Comments?

I've run a fairly basic test with a table with 101 columns, selecting
a single row from the table and I get the following results:


Columns with 1-character names:

master (80 jobs, 80 connections, 60 seconds):

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 559275
latency average = 8.596 ms
tps = 9306.984058 (including connections establishing)
tps = 11144.622096 (excluding connections establishing)


patched:

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 585526
latency average = 8.210 ms
tps = 9744.240847 (including connections establishing)
tps = 11454.339301 (excluding connections establishing)


master (80 jobs, 80 connections, 120 seconds):

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 120 s
number of transactions actually processed: 1108312
latency average = 8.668 ms
tps = 9229.356994 (including connections establishing)
tps = 9987.385281 (excluding connections establishing)


patched:

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 120 s
number of transactions actually processed: 1130313
latency average = 8.499 ms
tps = 9412.904876 (including connections establishing)
tps = 10319.404302 (excluding connections establishing)



Columns with at least 42-character names:

master (80 jobs, 80 connections, 60 seconds):

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 197815
latency average = 24.308 ms
tps = 3291.157486 (including connections establishing)
tps = 3825.309489 (excluding connections establishing)



patched:

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 198549
latency average = 24.214 ms
tps = 3303.896651 (including connections establishing)
tps = 3895.738024 (excluding connections establishing)



master (80 jobs, 80 connections, 120 seconds):

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 120 s
number of transactions actually processed: 391312
latency average = 24.551 ms
tps = 3258.493026 (including connections establishing)
tps = 3497.581844 (excluding connections establishing)



patched:

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 120 s
number of transactions actually processed: 391733
latency average = 24.526 ms
tps = 3261.805060 (including connections establishing)
tps = 3552.604408 (excluding connections establishing)


This is just using the patches you posted on this thread.  Does this
exercise the patch in the way you intended?

Regards

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Daniel Gustafsson
Дата:
Сообщение: Re: [HACKERS] Patch: add --if-exists to pg_recvlogical
Следующее
От: Amit Khandekar
Дата:
Сообщение: Re: [HACKERS] UPDATE of partition key