Thread: Why does prepareThreshold=0 (to cover pgbouncer transaction mode) disable protocol binary transfers? (w/ PoC patch and measurements)

Hi,

Given our usage of pgjdbc with ?prepareThreshold=0 connecting over pgbouncer with pool_mode=transactional, we (with my
co-worker Lukasz Pierzan) have spotted in app/pgjdbc Java profilers that pgjdbc is spending a lot of time in >>
converting the UUIDs from text to binary << when receiving a lot of UUIDs from the DB during resultSet fetching. Upon
further inspection we've concluded that even for basic simple SQL queries from pgjdbc with ?prepareThreshold=0, like
"select uuidtype from tab1mlnrows":
a) PostgreSQL will convert the binary ::uuid datatype to String before sending it over the TCP wire (CPU waste)
b) the network will transfer the string form of the UUIDs (network bandwidth and additional CPU waste for the TCP/IP stacks on both
sides)
c) java/pgjdbc will read it from the socket and convert it again from text to the native UUID binary form (big CPU waste)
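To make points (a)-(c) concrete, here is a minimal sketch (not from the original mail) of the two wire forms of a UUID: the text form is 36 bytes, the binary form is 16, and the text path additionally pays for UUID.fromString() parsing on the receiving side.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidWireFormats {
    // Binary wire format: two big-endian longs, 16 bytes total.
    static byte[] toBytes(UUID u) {
        return ByteBuffer.allocate(16)
                .putLong(u.getMostSignificantBits())
                .putLong(u.getLeastSignificantBits())
                .array();
    }

    // Binary decode: two long reads, no string parsing at all.
    static UUID fromBytes(byte[] b) {
        ByteBuffer buf = ByteBuffer.wrap(b);
        return new UUID(buf.getLong(), buf.getLong());
    }

    public static void main(String[] args) {
        UUID u = UUID.randomUUID();
        String text = u.toString();                   // 32 hex digits + 4 dashes
        System.out.println("text bytes:   " + text.length());     // 36
        System.out.println("binary bytes: " + toBytes(u).length); // 16
        // UUID.fromString() is the conversion the profiler showed as the hot spot:
        System.out.println("round-trips:  "
                + (UUID.fromString(text).equals(u) && fromBytes(toBytes(u)).equals(u)));
    }
}
```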

The important thing here is that pgjdbc is capable of using binary transfers of certain datatypes over sockets and the
Postgres protocol (and thus reducing CPU usage on both the sending and receiving sides; UUID is one of them), HOWEVER due to
prepareThreshold=0 this option is disabled and thus transfers happen to be much less efficient than they could be.
This means that anybody using the pgjdbc+PostgreSQL+pgbouncer/pool_mode=transactional stack (which seems to be the most popular
combo?) is affected by this.

We've hacked up a custom PoC pgjdbc patch that actually tries to enable binary transfers even with prepared
statements disabled. The binary transfer seemed to be disabled in two places:
(1) org.postgresql.jdbc.PgStatement.isOneShotQuery() - for some reason getForceBinaryTransfer() was checked
(2) org.postgresql.core.v3.QueryExecutorImpl.sendParse() - this seems to be more tricky; fields[] was cleared
We only performed this change + enabled the not-fully-documented "org.postgresql.forceBinary=true" for the sake of measuring
the potential performance impact of enabling binary transfer, not to fully understand the various pgjdbc/protocol/method
usage cases, so the patch may simply be wrong as it stands. The patch is probably even buggy, as we do not have full
insight.

Performance impact of above bugfix/enhancement for "select uuidtype from tab1mlnrows":

env specs:
    client: 8 VCPU r5.2xlarge (1s4c8t), OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode),
    DBserver: 4 VCPU i3.2xlarge (1s2c4t), PostgreSQL 13.5, Amazon Linux 2
    network: same AWS zone, ping RTT between client and server 0.360ms

env preparation/DB:
    create extension "uuid-ossp";
    create table t1 as select uuid_generate_v4() u from generate_series(1, 1000000) t;
client crude preparation:
    wget https://repo1.maven.org/maven2/org/postgresql/postgresql/42.3.4/postgresql-42.3.4.jar
    javac -cp postgresql-42.3.4.jar PostgresTest.java
    export CLASSPATH=.:postgresql-42.3.4.jar # or the patched pgjdbc
    java PostgresTest # adjust as required
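The PostgresTest source itself is not attached to the mail; the following is a hedged reconstruction of what such a fetch-loop harness presumably looks like (the JDBC URL is a placeholder, and the table/column names match the prep step above):

```java
// Hypothetical reconstruction of the PostgresTest harness (not the original
// source). Fetches all UUIDs from t1 and times the end-to-end loop.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.UUID;

public class PostgresTest {
    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java PostgresTest <jdbc-url>");
            System.out.println("e.g.   jdbc:postgresql://host/db?prepareThreshold=0");
            return;
        }
        long start = System.nanoTime();
        try (Connection c = DriverManager.getConnection(args[0])) {
            c.setAutoCommit(false); // pgjdbc only streams with a fetch size when autocommit is off
            try (Statement st = c.createStatement()) {
                st.setFetchSize(10_000);
                long n = 0;
                try (ResultSet rs = st.executeQuery("select u from t1")) {
                    while (rs.next()) {
                        UUID u = (UUID) rs.getObject(1); // text mode pays UUID parsing here
                        n++;
                    }
                }
                System.out.printf("fetched %d UUIDs in %d ms%n",
                        n, (System.nanoTime() - start) / 1_000_000);
            }
        }
    }
}
```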

default behavior/no optimization @ most recent pgjdbc (postgresql-42.3.4.jar):
    avg time of execution on the PostgreSQL side: ~348ms +/- 50ms according to
pg_stat_statements.mean_exec_time/stddev_exec_time
    avg end-to-end time of exec+fetching on the app side to receive 1 mln UUIDs: ~1000..1100ms
    1 call result size on the network: ~45MB (btw no SSL used)
    1 app @ client (1 javaapp/1 connection) -> ~460kB/s receive, ~20% CPU of a single PostgreSQL backend on the DBserver,
~91% javaapp thread (single-threaded pgjdbc conversion bottleneck)
    4 apps @ client -> ~30% OS DBserver utilization --> therefore 4/0.3=~13 VCPUs would theoretically be necessary to
fully utilize the 4 VCPU PostgreSQL DB

With the attached PoC patch enabling binary transfer (postgresql-42.3.5-SNAPSHOT.jar):
    avg time of execution on the PostgreSQL side: ~216ms according to pg_stat_statements.mean_exec_time
    avg end-to-end time of exec+fetching on the app side to receive 1 mln UUIDs: ~240ms
    1 call result size on the network: ~28MB
    1 app @ client (1 javaapp/1 connection) -> ~1.2MB/s receive stream, ~90% CPU of a single PostgreSQL backend on the
DBserver, ~65% javaapp single thread
    4 apps @ client -> ~90% OS DBserver utilization

Therefore the overhead of sending text instead of binary in this simplistic testcase seems to be:
    - 1.61x CPU on the DBserver side
    - 4.58x end-to-end fetching time (with much more CPU processing time on the client too)
    - 1.60x bandwidth overhead
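As a quick arithmetic cross-check (not part of the original mail), the summary ratios follow directly from the measurements quoted above:

```java
// Derives the summary ratios from the raw measurements in the two runs above.
public class RatioCheck {
    public static void main(String[] args) {
        System.out.printf("server CPU: %.2fx%n", 348.0 / 216.0);  // ~1.61
        System.out.printf("end-to-end: %.2fx%n", 1100.0 / 240.0); // ~4.58
        System.out.printf("bandwidth:  %.2fx%n", 45.0 / 28.0);    // ~1.61
    }
}
```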

Therefore we would ask someone more familiar with these matters than us to check whether it isn't simply an omission
in the code that binary transfers are not enabled without prepared statements (isOneShot=true /
prepareThreshold=0). Is there any reason why disabling prepared statements also disables binary transfers? The
setting as it stands couples two features into one variable.

Or is it that we should be more aware of the prepareThreshold=-1 option (that's negative 1), which is pretty rare / not well
known, and should actually be hinted at on
https://www.pgbouncer.org/faq.html#how-to-use-prepared-statements-with-transaction-pooling to favor -1 instead
of 0?
Apparently there's even a testPreparedStatement_negative1() testcase... however real testing reveals that using -1 with
pool_mode=transactional actually hits "prepared statement "S_1" already exists", so it enables both features together
again. Any chance to split those two features?

- Lukasz Pierzan & Jakub Wartak

Attachments
On Fri, Apr 22, 2022 at 7:06 AM Jakub Wartak <Jakub.Wartak@tomtom.com> wrote:
prepareThreshold=0

i.e., you are preventing the use of prepared statements being sent from the client to the server.

As a first point of order, the PostgreSQL "Simple Query Protocol" [1] says:

"In simple Query mode, the format of retrieved values is always text, except when the given command is a FETCH from a cursor declared with the BINARY option."

It appears as though setting prepareThreshold=0 causes the driver to use the Simple Query Protocol.

By forcing binary transfer you override prepareThreshold=0 and use a prepared statement anyway because it is only possible to get the binary data via the Extended Query Protocol (parse, bind, execute).

My understanding is that "Prepared Statements" is simply a different name for "Extended Query Protocol".

Where I think JDBC (and maybe pgbouncer) is going wrong with this, is they don't make (allow for) proper use of the "unnamed prepared statement" which: "...lasts only until the next Parse statement specifying the unnamed statement as destination is issued." [2]  The code does have oneShot in the parse/bind/execute path so it is recognized...at least in JDBC.

By using the unnamed prepared statement you get easy use of the Extended Query Protocol without having to retain any meaningful state which can be messed up if improperly shared.  pgbouncer, in transaction mode, should just enforce "Parse/Bind/Execute <unnamed>" and then get out of the way.  The JDBC can provide whatever friendly UX it wants so long as the user can specify that they don't care about caching and only want to use the unnamed prepared statement.

[1] https://www.postgresql.org/docs/current/protocol-flow.html#id-1.10.5.7.4
[2] https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY


Hopefully the above helps somewhat.  I tried getting my head around the JDBC end of this and it seems like they do have some provisions for it - whether they are sufficient or user-friendly is another matter.

You need to get your query into the "parse/bind/execute" flow path AND have it return true for oneShot; this will prevent a name from being assigned to the "prepared statement" and thus the dynamic <unnamed> one will be used for all three stages, and then whatever query comes along next can just do the same thing.  I would expect that to just work so far as usage of binary data for the UUID data goes.
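To illustrate the point above (an illustration of the wire protocol, not code from the thread): at the protocol level, the "unnamed prepared statement" is simply a Parse ('P') message whose statement-name field is the empty string, so a driver can use parse/bind/execute without creating any named server-side state that a pooler could mis-route.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ParseMessage {
    // Builds a Parse message: 'P', int32 length, cstring statement name,
    // cstring query text, int16 parameter count (0 = let the server infer).
    static byte[] build(String statementName, String query) {
        byte[] n = statementName.getBytes(StandardCharsets.UTF_8);
        byte[] q = query.getBytes(StandardCharsets.UTF_8);
        int bodyLen = n.length + 1 + q.length + 1 + 2;
        ByteBuffer msg = ByteBuffer.allocate(1 + 4 + bodyLen);
        msg.put((byte) 'P');
        msg.putInt(4 + bodyLen);   // length counts itself but not the tag byte
        msg.put(n).put((byte) 0);  // statement name, NUL-terminated ("" = unnamed)
        msg.put(q).put((byte) 0);  // query text, NUL-terminated
        msg.putShort((short) 0);   // no pre-specified parameter type OIDs
        return msg.array();
    }

    public static void main(String[] args) {
        // Unnamed statement: the first body byte (offset 5) is the NUL of "".
        byte[] unnamed = build("", "select u from t1");
        System.out.println("unnamed name is empty: " + (unnamed[5] == 0));
        // Named statement, e.g. what prepareThreshold promotion produces ("S_1"):
        byte[] named = build("S_1", "select u from t1");
        System.out.println("named starts with:     " + (char) named[5]);
    }
}
```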

I haven't done any tests, just some code review.

David J.

On 2022-04-22 Fr 14:05, David G. Johnston wrote:
> On Fri, Apr 22, 2022 at 7:06 AM Jakub Wartak <Jakub.Wartak@tomtom.com>
> wrote:
>
>     prepareThreshold=0
>
>
> i.e., you are preventing the use of prepared statements being sent
> from the client to the server.
>
> As a first point of order, the PostgreSQL "Simple Query Protocol" [1]
> says:
>
> "In simple Query mode, the format of retrieved values is always text,
> except when the given command is a FETCH from a cursor declared with
> the BINARY option."
>
> It appears as though setting prepareThreshold=0 causes the driver to
> use the Simple Query Protocol.
>
> By forcing binary transfer you override prepareThreshold=0 and use a
> prepared statement anyway because it is only possible to get the
> binary data via the Extended Query Protocol (parse, bind, execute).
>
> My understanding is that "Prepared Statements" is simply a different
> name for "Extended Query Protocol".
>
> Where I think JDBC (and maybe pgbouncer) is going wrong with this, is
> they don't make (allow for) proper use of the "unnamed prepared
> statement" which: "...lasts only until the next Parse statement
> specifying the unnamed statement as destination is issued." [2]  The
> code does have oneShot in the parse/bind/execute path so it is
> recognized...at least in JDBC.
>
> By using the unnamed prepared statement you get easy use of the
> Extended Query Protocol without having to retain any meaningful state
> which can be messed up if improperly shared.  pgbouncer, in
> transaction mode, should just enforce "Parse/Bind/Execute <unnamed>"
> and then get out of the way.  The JDBC can provide whatever friendly
> UX it wants so long as the user can specify that they don't care about
> caching and only want to use the unnamed prepared statement.
>
> [1]
> https://www.postgresql.org/docs/current/protocol-flow.html#id-1.10.5.7.4
> [2]
> https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY
>
> Hopefully the above helps somewhat.  I tried getting my head around
> the JDBC end of this and it seems like they do have some provisions
> for it - whether they are sufficient or user-friendly is another matter.
>
> You need to get your query into the "parse/bind/execute" flow path AND
> have it return true for oneShot; this will prevent a name from being
> assigned to the "prepared statement" and thus the dynamic <unnamed>
> one will be used for all three stages, and then whatever query comes
> along next can just do the same thing.  I would expect that to just
> work so far as usage of binary data for the UUID data goes.
>
> I haven't done any tests, just some code review.
>
>

How's that going to work if you have two prepared statements and want to
execute them in an interleaved fashion? How is the driver meant to know
when to use the unnamed statement and when not to?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




On Mon, Apr 25, 2022 at 1:08 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 2022-04-22 Fr 14:05, David G. Johnston wrote:
> On Fri, Apr 22, 2022 at 7:06 AM Jakub Wartak <Jakub.Wartak@tomtom.com>
> wrote:
>
>     prepareThreshold=0
>
>
> i.e., you are preventing the use of prepared statements being sent
> from the client to the server.
>
> As a first point of order, the PostgreSQL "Simple Query Protocol" [1]
> says:
>
> "In simple Query mode, the format of retrieved values is always text,
> except when the given command is a FETCH from a cursor declared with
> the BINARY option."
>
> It appears as though setting prepareThreshold=0 causes the driver to
> use the Simple Query Protocol.
>
> By forcing binary transfer you override prepareThreshold=0 and use a
> prepared statement anyway because it is only possible to get the
> binary data via the Extended Query Protocol (parse, bind, execute).
>
> My understanding is that "Prepared Statements" is simply a different
> name for "Extended Query Protocol".
>
> Where I think JDBC (and maybe pgbouncer) is going wrong with this, is
> they don't make (allow for) proper use of the "unnamed prepared
> statement" which: "...lasts only until the next Parse statement
> specifying the unnamed statement as destination is issued." [2]  The
> code does have oneShot in the parse/bind/execute path so it is
> recognized...at least in JDBC.
>
> By using the unnamed prepared statement you get easy use of the
> Extended Query Protocol without having to retain any meaningful state
> which can be messed up if improperly shared.  pgbouncer, in
> transaction mode, should just enforce "Parse/Bind/Execute <unnamed>"
> and then get out of the way.  The JDBC can provide whatever friendly
> UX it wants so long as the user can specify that they don't care about
> caching and only want to use the unnamed prepared statement.
>
> [1]
> https://www.postgresql.org/docs/current/protocol-flow.html#id-1.10.5.7.4
> [2]
> https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY
>
> Hopefully the above helps somewhat.  I tried getting my head around
> the JDBC end of this and it seems like they do have some provisions
> for it - whether they are sufficient or user-friendly is another matter.
>
> You need to get your query into the "parse/bind/execute" flow path AND
> have it return true for oneShot; this will prevent a name from being
> assigned to the "prepared statement" and thus the dynamic <unnamed>
> one will be used for all three stages, and then whatever query comes
> along next can just do the same thing.  I would expect that to just
> work so far as usage of binary data for the UUID data goes.
>
> I haven't done any tests, just some code review.
>
>

How's that going to work if you have two prepared statements and want to
execute them in an interleaved fashion? How is the driver meant to know
when to use the unnamed statement and when not to?


Session configuration.  If you are using a pooler mode that simply doesn't play nice with named prepared statements you must configure JDBC to not use them (ever) and use only the unnamed prepared statement for parse/bind/execute.  The interleaving you want to do is simply not possible (or, rather, you will not get the benefit of actually having the prepared statement saved in a cache for re-use, it will be re-parsed every time).

This seems better than nothing given lots of uses of prepared statements are simply to get access to the extended query protocol's parse/bind/execute.  It is quite possible an even better solution exists if pgJDBC and pgbouncer cooperate and design and implement something to overcome this complaint.  I'm not going there myself (not that I'm implementing this suggestion) as this solution seems simpler and sufficiently effective for the majority of use cases.  It also likely doesn't impose any kind of constraints (especially as this is basically implementation details + configuration, not code-level API changes) on a more nuanced solution.
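Concretely, the session configuration described above would presumably be expressed as a connection URL along these lines (host and database are placeholders; per this thread, today such a setting also disables binary transfer as a side effect):

```
jdbc:postgresql://db.example.com:5432/app?prepareThreshold=0
```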

David J.






> Where I think JDBC (and maybe pgbouncer) is going wrong with this, is
> they don't make (allow for) proper use of the "unnamed prepared
> statement" which: "...lasts only until the next Parse statement
> specifying the unnamed statement as destination is issued." [2]  The
> code does have oneShot in the parse/bind/execute path so it is
> recognized...at least in JDBC.

The driver does use the unnamed statement for pretty much everything until we see the same statement being used prepareThreshold times. Then we switch to a named statement. 


> Session configuration.  If you are using a pooler mode that simply doesn't play nice with named prepared statements you must configure JDBC to not use them (ever) and use only the unnamed prepared statement for parse/bind/execute.  The interleaving you want to do is simply not possible (or, rather, you will not get the benefit of actually having the prepared statement saved in a cache for re-use, it will be re-parsed every time).
>
> This seems better than nothing given lots of uses of prepared statements are simply to get access to the extended query protocol's parse/bind/execute.  It is quite possible an even better solution exists if pgJDBC and pgbouncer cooperate and design and implement something to overcome this complaint.  I'm not going there myself (not that I'm implementing this suggestion) as this solution seems simpler and sufficiently effective for the majority of use cases.  It also likely doesn't impose any kind of constraints (especially as this is basically implementation details + configuration, not code-level API changes) on a more nuanced solution.

There are a number of other session settings that we can't really track either.

Dave