Обсуждение: Prepared Statement Query Planning

Поиск
Список
Период
Сортировка

Prepared Statement Query Planning

От
Brett Henderson
Дата:
Hi,

I'm trying to use a PreparedStatement to query a small number of rows
from a table containing over 500 million rows.  I'm encountering bad
query plans due to the query plan being constructed without access to
the bind variables.

I've read this link which suggests disabling server side prepared
statements but it doesn't appear to be helping.  I am happy to disable
all server side prepared statements because I am effectively performing
batch queries where the cost of a query plan is relatively small
compared to the execution of the query.
http://jdbc.postgresql.org/documentation/83/server-prepare.html

I've added "prepareThreshold=0", and "prepareThreshold=3" to the
connection URL but neither appear to have any effect.  I've checked the
prepare threshold on the statement objects and found that they are being
set to the values I specify.

Is there any way of verifying what type of statement is being issued to
the server?

This is the query:
SELECT e.id, e.version, e.timestamp, e.visible, u.data_public,
u.id AS user_id, u.display_name, e.changeset_id, e.latitude, e.longitude
FROM nodes e
INNER JOIN (
SELECT id, version FROM
nodes WHERE timestamp > ? AND timestamp <= ?
) t ON e.id = t.id AND e.version = t.version
INNER JOIN changesets c ON e.changeset_id = c.id
INNER JOIN users u ON c.user_id = u.id

The node table contains approx 500 million rows, the user table approx
100 thousand rows, and the changeset table somewhere in between but
closer to the lower end.  The server has 32 GB of RAM so can fit smaller
tables in RAM, but not the node table.

The query retrieves all rows within a timestamp range.  I realise the
query could be re-written without the sub-select, but it is implemented
in this way as a result of the way the query is dynamically constructed
and allows the sub-select portion can be switched out for other row
selection criteria.

The timestamp range varies anywhere from 1 minute to 1 day, but no
larger than that.  The table contains data over a period of several years.

The full JDBC trace with loglevel=2 is below.  Is it possible to tell
whether server side prepared statements are being used from this?  Note
that this has been obtained from a windows laptop running PostgreSQL
8.3.5, but the real problem is occurring on an Ubuntu Linux server
running PostgreSQL 8.3.7.  The same JDBC driver is being used across the
board.
17:48:46.077 (1) PostgreSQL 8.3 JDBC4 with SSL (build 603)
17:48:46.091 (1) Trying to establish a protocol version 3 connection to
localhost:5432
17:48:46.204 (1)  FE=> StartupPacket(user=osm, database=api06_test,
client_encoding=UNICODE, DateStyle=ISO, extra_float_digits=2)
17:48:46.382 (1)  <=BE AuthenticationReqMD5(salt=xxxxxxxx)
17:48:46.391 (1)  FE=>
Password(md5digest=md5xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx)
17:48:46.397 (1)  <=BE AuthenticationOk
17:48:46.408 (1)  <=BE ParameterStatus(client_encoding = UNICODE)
17:48:46.408 (1)  <=BE ParameterStatus(DateStyle = ISO, DMY)
17:48:46.409 (1)  <=BE ParameterStatus(integer_datetimes = off)
17:48:46.409 (1)  <=BE ParameterStatus(is_superuser = on)
17:48:46.409 (1)  <=BE ParameterStatus(server_encoding = UTF8)
17:48:46.409 (1)  <=BE ParameterStatus(server_version = 8.3.5)
17:48:46.409 (1)  <=BE ParameterStatus(session_authorization = osm)
17:48:46.409 (1)  <=BE ParameterStatus(standard_conforming_strings = off)
17:48:46.409 (1)  <=BE ParameterStatus(TimeZone = Australia/Canberra)
17:48:46.409 (1)  <=BE BackendKeyData(pid=8292,ckey=224285055)
17:48:46.409 (1)  <=BE ReadyForQuery(I)
17:48:46.409 (1)     compatible = 8.3
17:48:46.409 (1)     loglevel = 2
17:48:46.409 (1)     prepare threshold = 0
17:48:46.442 (1) simple execute,
handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandler@1b8e059,
maxRows=0, fetchSize=10000, flags=9
17:48:46.442 (1)  FE=> Parse(stmt=S_1,query="BEGIN",oids={})
17:48:46.443 (1)  FE=> Bind(stmt=S_1,portal=null)
17:48:46.443 (1)  FE=> Execute(portal=null,limit=0)
17:48:46.443 (1)  FE=> Parse(stmt=S_2,query="SELECT e.id, e.version,
e.timestamp, e.visible, u.data_public, u.id AS user_id, u.display_name,
e.changeset_id, e.latitude, e.longitude FROM nodes e LEFT OUTER JOIN
changesets c ON e.changeset_id = c.id LEFT OUTER JOIN users u ON
c.user_id = u.id WHERE e.timestamp > $1 AND e.timestamp <= $2 ORDER BY
e.id, e.version",oids={0,0})
17:48:46.444 (1)  FE=> Describe(statement=S_2)
17:48:46.444 (1)  FE=> Bind(stmt=S_2,portal=C_3,$1=<1970-01-01
10:00:00.000000 +10:00:00>,$2=<2009-08-28 17:48:45.932000 +10:00:00>)
17:48:46.445 (1)  FE=> Execute(portal=C_3,limit=10000)
17:48:46.445 (1)  FE=> Sync
17:48:46.636 (1)  <=BE ParseComplete [S_1]
17:48:46.636 (1)  <=BE BindComplete [null]
17:48:46.636 (1)  <=BE CommandStatus(BEGIN)
17:48:46.636 (1)  <=BE ParseComplete [S_2]
17:48:46.636 (1)  <=BE ParameterDescription
17:48:46.637 (1)  <=BE RowDescription(10)
17:48:46.638 (1)  <=BE BindComplete [C_3]
17:48:46.638 (1)  <=BE CommandStatus(SELECT)
17:48:46.650 (1)  <=BE ReadyForQuery(T)
17:48:46.651 (1) simple execute,
handler=org.postgresql.jdbc2.AbstractJdbc2Connection$TransactionCommandHandler@7b6889,
maxRows=0, fetchSize=0, flags=22
17:48:46.651 (1)  FE=> CloseStatement(S_2)
17:48:46.651 (1)  FE=> ClosePortal(C_3)
17:48:46.651 (1)  FE=> Parse(stmt=S_4,query="COMMIT",oids={})
17:48:46.651 (1)  FE=> Bind(stmt=S_4,portal=null)
17:48:46.651 (1)  FE=> Execute(portal=null,limit=1)
17:48:46.651 (1)  FE=> Sync
17:48:46.652 (1)  <=BE CloseComplete
17:48:46.652 (1)  <=BE CloseComplete
17:48:46.652 (1)  <=BE ParseComplete [S_4]
17:48:46.652 (1)  <=BE BindComplete [null]
17:48:46.652 (1)  <=BE CommandStatus(COMMIT)
17:48:46.652 (1)  <=BE ReadyForQuery(I)
17:48:46.652 (1)  FE=> Terminate

If it helps, the following blog entry includes two query plan diagrams
with and without bind variables.
http://www.odecee.com.au/blogs/?p=134

Any suggestions on what I'm doing wrong?

Brett


Re: Prepared Statement Query Planning

От
Віталій Тимчишин
Дата:


2009/8/29 Brett Henderson <brett@bretth.com>

This is the query:
SELECT e.id, e.version, e.timestamp, e.visible, u.data_public,
u.id AS user_id, u.display_name, e.changeset_id, e.latitude, e.longitude
FROM nodes e
INNER JOIN (
SELECT id, version FROM
nodes WHERE timestamp > ? AND timestamp <= ?
) t ON e.id = t.id AND e.version = t.version
INNER JOIN changesets c ON e.changeset_id = c.id
INNER JOIN users u ON c.user_id = u.id

The node table contains approx 500 million rows, the user table approx 100 thousand rows, and the changeset table somewhere in between but closer to the lower end.  The server has 32 GB of RAM so can fit smaller tables in RAM, but not the node table.

The query retrieves all rows within a timestamp range.  I realise the query could be re-written without the sub-select, but it is implemented in this way as a result of the way the query is dynamically constructed and allows the sub-select portion can be switched out for other row selection criteria.

It the subselect is only for switching out, it can be rewritten to:

SELECT e.id, e.version, e.timestamp, e.visible, u.data_public,
u.id AS user_id, u.display_name, e.changeset_id, e.latitude, e.longitude
FROM nodes e
INNER JOIN (
values (?,?)
) t(from, to) ON timestamp > from AND timestamp <= to
INNER JOIN changesets c ON e.changeset_id = c.id
INNER JOIN users u ON c.user_id = u.id

Still don't know if this will make things better or worse.
Another (better) option could be not to add a join, but replace original "from nodes e" in this case:
SELECT e.id, e.version, e.timestamp, e.visible, u.data_public,
u.id AS user_id, u.display_name, e.changeset_id, e.latitude, e.longitude
FROM (select * from nodes where timestamp > ? AND timestamp <= ?) e
INNER JOIN changesets c ON e.changeset_id = c.id
INNER JOIN users u ON c.user_id = u.id

Best regards, Vitalii Tymchyshyn

Re: Prepared Statement Query Planning

От
Brett Henderson
Дата:
Віталій Тимчишин wrote:
It the subselect is only for switching out, it can be rewritten to:

SELECT e.id, e.version, e.timestamp, e.visible, u.data_public,
u.id AS user_id, u.display_name, e.changeset_id, e.latitude, e.longitude
FROM nodes e
INNER JOIN (
values (?,?)
) t(from, to) ON timestamp > from AND timestamp <= to
INNER JOIN changesets c ON e.changeset_id = c.id
INNER JOIN users u ON c.user_id = u.id

Still don't know if this will make things better or worse.
Another (better) option could be not to add a join, but replace original "from nodes e" in this case:
SELECT e.id, e.version, e.timestamp, e.visible, u.data_public,
u.id AS user_id, u.display_name, e.changeset_id, e.latitude, e.longitude
FROM (select * from nodes where timestamp > ? AND timestamp <= ?) e
INNER JOIN changesets c ON e.changeset_id = c.id
INNER JOIN users u ON c.user_id = u.id
Thanks for the tips.  I haven't see the first style before, and hadn't considered the second.

However I don't think it will work in my case.  I currently have three different ways of selecting records, 1. by timestamp range (as in my initial example), 2. by records in a temp table, and 3. unrestricted.  The unrestricted example isn't an issue because a full table scan is appropriate in that case.  However the temp table one is a different matter.  In that case the query looks like this:

SELECT e.id, e.version, e.timestamp, e.visible, u.data_public,
u.id AS user_id, u.display_name, e.changeset_id, e.latitude, e.longitude
FROM nodes e
INNER JOIN
tmp_nodes
t ON e.id = t.id AND e.version = t.version
INNER JOIN changesets c ON e.changeset_id = c.id
INNER JOIN users u ON c.user_id = u.id

The tmp_nodes table contains only two columns id and version which have been built up by previous queries.  Most of the query remains identical, but instead of doing a "SELECT id, version FROM nodes WHERE timestamp > ? AND timestamp <= ?", I join to tmp_nodes which contains only the records I'm interested in.  Originally I was creating a temp table in the timestamp range case as well but moved away in an attempt to get better performance, I suspect I was encountering bad query plans in that case as well.

Re: Prepared Statement Query Planning

От
Oliver Jowett
Дата:
Brett Henderson wrote:

> The full JDBC trace with loglevel=2 is below.  Is it possible to tell
> whether server side prepared statements are being used from this?  Note
> that this has been obtained from a windows laptop running PostgreSQL
> 8.3.5, but the real problem is occurring on an Ubuntu Linux server
> running PostgreSQL 8.3.7.  The same JDBC driver is being used across the
> board.

> 17:48:46.443 (1)  FE=> Parse(stmt=S_2,query="SELECT e.id, e.version,
> e.timestamp, e.visible, u.data_public, u.id AS user_id, u.display_name,
> e.changeset_id, e.latitude, e.longitude FROM nodes e LEFT OUTER JOIN
> changesets c ON e.changeset_id = c.id LEFT OUTER JOIN users u ON
> c.user_id = u.id WHERE e.timestamp > $1 AND e.timestamp <= $2 ORDER BY
> e.id, e.version",oids={0,0})
> 17:48:46.444 (1)  FE=> Describe(statement=S_2)
> 17:48:46.444 (1)  FE=> Bind(stmt=S_2,portal=C_3,$1=<1970-01-01
> 10:00:00.000000 +10:00:00>,$2=<2009-08-28 17:48:45.932000 +10:00:00>)
> 17:48:46.445 (1)  FE=> Execute(portal=C_3,limit=10000)
> 17:48:46.445 (1)  FE=> Sync

This is using a named statement (S_2 is the name)

It looks like you're using setFetchSize(). That forces use of a named
statement regardless of prepareThreshold (we have to keep the statement
and corresponding portal alive so we can do subsequent fetches, even if
there are some other intervening queries, so we can't use the unnamed
statement)

-O

Re: Prepared Statement Query Planning

От
Brett Henderson
Дата:
Oliver Jowett wrote:
> Brett Henderson wrote:
>
>> The full JDBC trace with loglevel=2 is below.  Is it possible to tell
>> whether server side prepared statements are being used from this?
>> Note that this has been obtained from a windows laptop running
>> PostgreSQL 8.3.5, but the real problem is occurring on an Ubuntu
>> Linux server running PostgreSQL 8.3.7.  The same JDBC driver is being
>> used across the board.
>
>> 17:48:46.443 (1)  FE=> Parse(stmt=S_2,query="SELECT e.id, e.version,
>> e.timestamp, e.visible, u.data_public, u.id AS user_id,
>> u.display_name, e.changeset_id, e.latitude, e.longitude FROM nodes e
>> LEFT OUTER JOIN changesets c ON e.changeset_id = c.id LEFT OUTER JOIN
>> users u ON c.user_id = u.id WHERE e.timestamp > $1 AND e.timestamp <=
>> $2 ORDER BY e.id, e.version",oids={0,0})
>> 17:48:46.444 (1)  FE=> Describe(statement=S_2)
>> 17:48:46.444 (1)  FE=> Bind(stmt=S_2,portal=C_3,$1=<1970-01-01
>> 10:00:00.000000 +10:00:00>,$2=<2009-08-28 17:48:45.932000 +10:00:00>)
>> 17:48:46.445 (1)  FE=> Execute(portal=C_3,limit=10000)
>> 17:48:46.445 (1)  FE=> Sync
>
> This is using a named statement (S_2 is the name)
>
> It looks like you're using setFetchSize(). That forces use of a named
> statement regardless of prepareThreshold (we have to keep the
> statement and corresponding portal alive so we can do subsequent
> fetches, even if there are some other intervening queries, so we can't
> use the unnamed statement)
Yes, I'm currently using a fetch size of 10000.  I can't allow all
results to be read at once because there can potentially be a huge
number of results in the queries.  I've just tested it out, and sure
enough leaving the fetch size at 0 prevents the use of named statements.

It sounds like I'm stuck with forcing the query planner via set
statements.  I'm currently using "set enable_seqscan=false;set
enable_hashjoin=false;set enable_mergejoin=false" and getting a good
query plan.

I'm not familiar with PostgreSQL internals, but I assume a portal is
basically a cursor?  So is there no way of creating a cursor and
fetching results in batches using an unnamed statement?

Brett


Re: Prepared Statement Query Planning

От
Tom Lane
Дата:
Brett Henderson <brett@bretth.com> writes:
> However I don't think it will work in my case.  I currently have three
> different ways of selecting records, 1. by timestamp range (as in my
> initial example), 2. by records in a temp table, and 3. unrestricted.
> The unrestricted example isn't an issue because a full table scan is
> appropriate in that case.  However the temp table one is a different
> matter.

In the case of a temp table you have to ANALYZE the temp table after
filling it; otherwise the planner is working completely blind as to
what is in it.  (autovacuum will not do this for you...)

            regards, tom lane

Re: Prepared Statement Query Planning

От
Віталій Тимчишин
Дата:


2009/8/29 Brett Henderson <brett@bretth.com>
Oliver Jowett wrote:
Brett Henderson wrote:

It looks like you're using setFetchSize(). That forces use of a named statement regardless of prepareThreshold (we have to keep the statement and corresponding portal alive so we can do subsequent fetches, even if there are some other intervening queries, so we can't use the unnamed statement)
Yes, I'm currently using a fetch size of 10000.  I can't allow all results to be read at once because there can potentially be a huge number of results in the queries.  I've just tested it out, and sure enough leaving the fetch size at 0 prevents the use of named statements.

How about "create or replace temp view tmp_v as <query>" without setFetchSize and then "select * from tmp_v" with setFetchSize? Not sure still if DDL can use prepared query parameters.
Another options would be either to use protocolLevel=2 (writing from memory, may misspell parameter name) - this will force client-side parameter embedding or to embed your timestamps into query text by yourself and not to use PreparedStatement at all. Fortunatelly this are not strings - so no quoting needed.

Best regards, Vitalii Tymchyshyn

Re: Prepared Statement Query Planning

От
Brett Henderson
Дата:
Віталій Тимчишин wrote:
>
>
> 2009/8/29 Brett Henderson <brett@bretth.com <mailto:brett@bretth.com>>
>
>     Oliver Jowett wrote:
>
>         Brett Henderson wrote:
>
>         It looks like you're using setFetchSize(). That forces use of
>         a named statement regardless of prepareThreshold (we have to
>         keep the statement and corresponding portal alive so we can do
>         subsequent fetches, even if there are some other intervening
>         queries, so we can't use the unnamed statement)
>
>     Yes, I'm currently using a fetch size of 10000.  I can't allow all
>     results to be read at once because there can potentially be a huge
>     number of results in the queries.  I've just tested it out, and
>     sure enough leaving the fetch size at 0 prevents the use of named
>     statements.
>
>
> How about "create or replace temp view tmp_v as <query>" without
> setFetchSize and then "select * from tmp_v" with setFetchSize? Not
> sure still if DDL can use prepared query parameters.
I don't think I can do that.  I couldn't use the syntax
"PREPARE STATEMENT mystatement (timestamp, timestamp) AS CREATE TEMP
VIEW tmp_nodes AS SELECT id, version FROM nodes WHERE timestamp > ? AND
timestamp <= ?"
> Another options would be either to use protocolLevel=2 (writing from
> memory, may misspell parameter name) - this will force client-side
> parameter embedding or to embed your timestamps into query text by
> yourself and not to use PreparedStatement at all. Fortunatelly this
> are not strings - so no quoting needed.
Will this force me to load all results in memory (ie. prevent me from
reading resultsets in batches)?  I'm hesitant to force the older
protocol, I've read elsewhere that it should be avoided if possible
(something about exception handling??).

At this point it sounds like the lesser evil is to to specify some "set
local enable_seqscan = false" type statements.  It seems to be working
well enough.

Brett


Re: Prepared Statement Query Planning

От
Brett Henderson
Дата:
Tom Lane wrote:
Brett Henderson <brett@bretth.com> writes: 
However I don't think it will work in my case.  I currently have three 
different ways of selecting records, 1. by timestamp range (as in my 
initial example), 2. by records in a temp table, and 3. unrestricted.  
The unrestricted example isn't an issue because a full table scan is 
appropriate in that case.  However the temp table one is a different 
matter.   
In the case of a temp table you have to ANALYZE the temp table after
filling it; otherwise the planner is working completely blind as to
what is in it.  (autovacuum will not do this for you...) 
Thanks for the tip.  I haven't tried the temp table queries yet.  They're next.