Fwd: Proper query implementation for Postgresql driver

Поиск
Список
Период
Сортировка
От Shay Rojansky
Тема Fwd: Proper query implementation for Postgresql driver
Дата
Msg-id CADT4RqD5u6owNwEoxOevAjc4YzHE+uMrjYosKoy5vuNgS2ZB5Q@mail.gmail.com
обсуждение исходный текст
Ответ на Proper query implementation for Postgresql driver  (Shay Rojansky <roji@roji.org>)
Ответы Re: Fwd: Proper query implementation for Postgresql driver  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Fwd: Proper query implementation for Postgresql driver  (Abhijit Menon-Sen <ams@2ndQuadrant.com>)
Re: Fwd: Proper query implementation for Postgresql driver  (Craig Ringer <craig@2ndquadrant.com>)
Список pgsql-hackers
Hi again, a few days ago I sent a question (below) about using the extended query protocol efficiently (especially for non-prepared statements). Your responses spawned a good discussion between the npgsql developers, it's here if you wish to look/weigh in: https://github.com/npgsql/npgsql/issues/370

The idea of using extended query protocol for non-prepared queries raised another "best practices" question, which I'm hoping you can help with. To get rid of text encoding (which is locale-dependent, inefficient etc.) for certain fields, it seems that we have to get rid of it for *all* fields. This is because we send queries without knowing their result columns in advance, and would like to pack all messages into a single packet to avoid roundtrips. In other words, when we send a Bind message we don't yet have a RowDescription, and so we can' specify on an individual columnar basis which fields should be returned as binary and which as text. We're left with the option of either going "full binary" or having to support text encoding for *all* fields - the abolition of which was the whole point.

We can split the query process into two roundtrips - wait for the RowDescription and only then send Bind - in order to pick-and-choose text/binary on a columnar basis, but this is a non-trivial performance hit which we want to avoid.

Regarding the option of going "full binary"; selecting all types from pg_type shows hundreds of types. We'd have to provide binary encode/decode implementation for most (?) of them. In addition, say a new type is added (via an extension for example); text encoding at least had the advantage of not forcing us to support everything: the unknown type would be transferred as text and we'd provide that text to the user as an unprocessed string. Going full binary seems to eliminate this possibility.

I'd like to understand what we're supposed to do, as a Postgresql driver. Are we supposed to:
1) Go full binary and implement all types (but what to do about unknown ones)?
2) Do two roundtrips for queries, thereby hurting performance?
3) Support text encoding for all fields, and manage somehow with issues such as locale variations (by forcing the locale to be culture invariant, as we do now)?

Thanks again for your help!

Shay

---------- Forwarded message ----------
From: Shay Rojansky <roji@roji.org>
Date: Sun, Sep 28, 2014 at 11:53 AM
Subject: Proper query implementation for Postgresql driver
To: pgsql-hackers@postgresql.org


Hi.

I'm a core developer on npgsql (the Postgresql .NET provider), we're hard at work on a major 3.0 version. I have a fundamental question that's been bugging me for a while and would like to get your opinion on it.

Npgsql currently supports three basic query modes: simple, client-side parameters and prepared. The first two use the Postgresql simple query flow (client-side parameters means the user specifies parameters programmatically, just like with prepared queries, but the actual substitution work is done client-side). Prepared uses the Postgresql extended query flow.

According to the Postgresql docs (49.2.2), the simple query flow, "the format of the retrieved values is always text". This creates a burden where npgsql needs to parse textual (and locale-specific!) info (e.g. dates, money). The situation is even worse when doing client-side parameters, since npgsql has to *create* textual representations that match what Postgresql is expecting. The workaround for this issue up to now has been to switch to culture-invariant formatting (e.g. lc_monetary=C), but this approach imposes the setting on users and affects functions in ways they don't necessarily want.

I would, in theory, love to switch the entire thing to binary and thereby avoid all textual parsing once and for all. If I understand correctly, this means all queries must be implemented as extended queries, with numerous extra client-server roundtrips - which are a bit hard to stomach. Section 49.1.2 of the manual also states that the unnamed prepared statement and portal are optimized for the case of executing a query only once, hinting that this is the proper way to do things - but this optimization still cannot not eliminate the extra roundtrips mentioned above (PREPARE, BIND, EXECUTE).

Can someone please let me know what the recommended/best practice here would be?

Thanks,

Shay


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jim Nasby
Дата:
Сообщение: Re: proposal: plpgsql - Assert statement
Следующее
От: Craig Ringer
Дата:
Сообщение: Re: Proper query implementation for Postgresql driver