Thread: Performance penalty when requesting text values in binary format

Performance penalty when requesting text values in binary format

From
Jack Christensen
Date:
I'm the creator of pgx (https://github.com/jackc/pgx), a PostgreSQL driver for the Go language. I have found significant performance advantages in using the extended protocol and binary format values, in particular for types such as timestamptz.

However, I was recently very surprised to find that selecting a text type value in the binary format is significantly slower. In an example case that selects 1,000 rows, each with 5 text columns of 16 bytes, the time the application measures from sending the query to receiving the entire response is approximately 16% longer. Here is a link to the test benchmark: https://github.com/jackc/pg_text_binary_bench

Given that the text and binary formats for the text type are identical, I would not have expected any performance difference.
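
To make the point above concrete, here is a minimal Go sketch that computes the on-the-wire size of one DataRow message for the benchmark's row shape. It assumes the DataRow layout from the protocol documentation (a 1-byte message type, an Int32 length, an Int16 column count, then an Int32 length plus the value bytes for each column) and that no encoding conversion changes the length of a value. `dataRowSize` is a hypothetical helper for illustration; it is not part of pgx or of the benchmark.

```go
package main

import "fmt"

// dataRowSize returns the total size in bytes of a DataRow message whose
// columns carry non-NULL values of the given lengths.
// Hypothetical helper, for illustration only.
func dataRowSize(valueLens []int) int {
	size := 1 + 4 + 2 // message type byte, Int32 length, Int16 column count
	for _, n := range valueLens {
		size += 4 + n // Int32 value length, then the value bytes
	}
	return size
}

func main() {
	row := []int{16, 16, 16, 16, 16} // 5 text columns of 16 bytes each
	perRow := dataRowSize(row)
	fmt.Println("bytes per row:", perRow)
	fmt.Println("bytes for 1,000 rows:", 1000*perRow)
}
```

For the benchmark's shape this comes to 107 bytes per row in either format, so the two cases put the same number of bytes on the wire and the difference would have to come from how the server produces them.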

My C is rusty and my knowledge of the PG server internals is minimal, but the performance difference appears to come from the function textsend creating an extra copy where textout simply returns a pointer to the existing data. This copy seems to be superfluous.

I can work around this by specifying the format per result column instead of specifying binary for all columns, but this performance bug / anomaly seemed worth reporting.
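
The per-column workaround can be sketched as follows. This is a minimal illustration in Go, not pgx's actual implementation: it assumes the documented layout of the final section of a Bind message (an Int16 count of result-column format codes followed by one Int16 code per column, where 0 means text and 1 means binary) and the built-in type OIDs 25 (text) and 1043 (varchar). The helper names `resultFormatCodes` and `appendResultFormats` are hypothetical.

```go
package main

import "fmt"

// Format codes defined by the PostgreSQL frontend/backend protocol.
const (
	formatText   int16 = 0
	formatBinary int16 = 1
)

// OIDs of two built-in text-like types in pg_type.
const (
	oidText    uint32 = 25
	oidVarchar uint32 = 1043
)

// resultFormatCodes requests text format for text-like columns and binary
// format for every other column. Hypothetical helper, for illustration only.
func resultFormatCodes(columnOIDs []uint32) []int16 {
	codes := make([]int16, len(columnOIDs))
	for i, oid := range columnOIDs {
		switch oid {
		case oidText, oidVarchar:
			codes[i] = formatText
		default:
			codes[i] = formatBinary
		}
	}
	return codes
}

// appendResultFormats appends the result-format section of a Bind message:
// a big-endian Int16 count, then one big-endian Int16 code per column.
func appendResultFormats(buf []byte, codes []int16) []byte {
	n := uint16(len(codes))
	buf = append(buf, byte(n>>8), byte(n))
	for _, c := range codes {
		buf = append(buf, byte(uint16(c)>>8), byte(uint16(c)))
	}
	return buf
}

func main() {
	// Example result set: text, timestamptz (OID 1184), text.
	codes := resultFormatCodes([]uint32{oidText, 1184, oidText})
	fmt.Println(codes)
	fmt.Printf("% x\n", appendResultFormats(nil, codes))
}
```

Sending a single format code of 1 instead would request binary for every column, which is the case that showed the slowdown for text values.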

Jack


Re: Performance penalty when requesting text values in binary format

From
Laurenz Albe
Date:
On Sat, 2020-05-16 at 20:12 -0500, Jack Christensen wrote:
> I'm the creator of the PostgreSQL driver pgx (https://github.com/jackc/pgx) for the Go language.
> I have found significant performance advantages to using the extended protocol and binary format
> values -- in particular for types such as timestamptz.
> 
> However, I was recently very surprised to find that it is significantly slower to select a text
> type value in the binary format. For an example case of selecting 1,000 rows each with 5 text
> columns of 16 bytes each the application time from sending the query to having received the
> entire response is approximately 16% slower. Here is a link to the test benchmark:
> https://github.com/jackc/pg_text_binary_bench
> 
> Given that the text and binary formats for the text type are identical I would not have
> expected any performance differences.
> 
>  My C is rusty and my knowledge of the PG server internals is minimal but the performance
> difference appears to be that function textsend creates an extra copy where textout
> simply returns a pointer to the existing data. This seems to be superfluous.
> 
> I can work around this by specifying the format per result column instead of specifying
> binary for all but this performance bug / anomaly seemed worth reporting.

Did you profile your benchmark?
It would be interesting to know where the time is spent.

Yours,
Laurenz Albe




Re: Performance penalty when requesting text values in binary format

From
Jack Christensen
Date:
On Mon, May 18, 2020 at 7:07 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> Did you profile your benchmark?
> It would be interesting to know where the time is spent.

Unfortunately, I have not. Fortunately, it appears that Tom Lane recognized this as part of another issue and has prepared a patch.


Thanks,
Jack