Bytea network traffic: binary vs text result format

Поиск
Список
Период
Сортировка
От Miha D. Puc
Тема Bytea network traffic: binary vs text result format
Дата
Msg-id 466299A7.20403@eba.si
обсуждение исходный текст
Ответы Re: Bytea network traffic: binary vs text result format  (Markus Schiltknecht <markus@bluegap.ch>)
Re: Bytea network traffic: binary vs text result format  (Andrew McNamara <andrewm@object-craft.com.au>)
Список pgsql-interfaces
Hi!

There was some debate recently about using text or binary format.
There's people who would like to use it but have trouble converting
binary encoded results into native types and there's people that say
there's not much performance difference.

I'd like to stress that performance is very different over slow
network.  The biggest difference is for byte where the text format
performance is about 3.5 times worse at inserts and updates and about
2.9 times worse at selects . Here's the reasoning:

In text format bytea are escaped using PQescapeBytea. In an average
binary stream about 2/3 would be escaped. Each escaped byte becomes of
form \\ooo at upload and of form \ooo for download, so the size of the
escaped stream is 1/3 + 2/3 * 5 = 11/3 = 3.6 and 1/3 + 2/3 * 4 = 3
respectively.

Here are the results of my test. I inserted and selected an OpenOffice
document of size 2Mb over a 2M/512K cable.
text format: insert: 120.1s select: 24.9s
binary format: insert: 33.5s select: 8.6s
factor:  insert: 3.6 select: 2.9

The difference between the test and the above calculation comes from the
estimate that 2/3 of bytes are escaped where in fact 95 out of 256 are
escaped (63%).

So there is a need (people asking) and reason (performance) to use
binary format. But there's a huge drawback - the conversions. It's easy
for varchar, not too bad for basic types (int, float, bool), effort is
needed for timestamp, date, time and numeric is a pain.

So with all the above there should be a utility for conversion between
binary format and native types and/or string format in libpq.

Regards,
Miha Puc




В списке pgsql-interfaces по дате отправления:

Предыдущее
От: "Francisco Figueiredo Jr."
Дата:
Сообщение: Re: Question on NpgsqlParameter object
Следующее
От: Markus Schiltknecht
Дата:
Сообщение: Re: Bytea network traffic: binary vs text result format