Discussion: Query does not return rows unless a LIMIT statement is used.
Hello!

This is my first time using Postgres, and my first posting to this mailing list.

I am evaluating the performance of Postgres version 8.1.2 as a possible replacement for our aging Sybase ASE version 11 database. However, I have run into what could be a show-stopper problem.

I am having a problem with a query taking a very long time to run. So long, in fact, that I have yet to see that query return any rows.

I will apologize now for what may become a very long message, but I want to be sure that I pass along all of the pertinent details. Here we go.

I am using Red Hat Enterprise Linux 3.0 on a box with 1.5 GB of RAM. The CPU is an Intel Xeon 2.4 GHz hyper-threaded chip with 512 KB cache.

I am trying to do a simple join on indexed columns between two tables. Table PID has 328,403 records, and table PV1 has 1,159,822 records. The query which I am using is:

SELECT *
FROM pid, pv1
WHERE pid.patient_id_internal_id = pv1.patient_id_internal_id
LIMIT 5324;

For table PID, the column patient_id_internal_id is defined as character varying(20). The index on this column is defined as:

    "kdx_pid" btree (patient_id_internal_id), tablespace "hl7_segments_ihe_idx"

For table PV1, the column patient_id_internal_id is defined as character varying(20). The index on this column is defined as:

    "patient_id_internal_id_idx" btree (patient_id_internal_id), tablespace "hl7_segments_ihe_idx"

Both tables live in the tablespace hl7_segments_ihe_data.

Note the LIMIT 5324 statement. I empirically determined that this magic number is the maximum row count before the query performance degrades completely. It appears to be a sudden performance degradation. If I increase the value in LIMIT beyond 5324, or if I remove the LIMIT altogether, then the query performance is horrible.
Here is the query plan with the LIMIT statement:

                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..1607.27 rows=5324 width=691)
   ->  Merge Join  (cost=0.00..350141.38 rows=1159822 width=691)
         Merge Cond: (("outer".patient_id_internal_id)::text = ("inner".patient_id_internal_id)::text)
         ->  Index Scan using kdx_pid on pid  (cost=0.00..49229.38 rows=328403 width=284)
         ->  Index Scan using patient_id_internal_id_idx on pv1  (cost=0.00..290598.21 rows=1159822 width=407)
(5 rows)

Time: 1.472 ms

Now here is the query plan without the LIMIT statement:

                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..350141.38 rows=1159822 width=691)
   Merge Cond: (("outer".patient_id_internal_id)::text = ("inner".patient_id_internal_id)::text)
   ->  Index Scan using kdx_pid on pid  (cost=0.00..49229.38 rows=328403 width=284)
   ->  Index Scan using patient_id_internal_id_idx on pv1  (cost=0.00..290598.21 rows=1159822 width=407)
(4 rows)

Time: 1.432 ms

I have tried various combinations of SET enable_seqscan = off;, SET enable_mergejoin = off;, and so forth. All to no avail. The bottom line is still that LIMIT statement. Without it the query performance is awful.

From observation using the TOP utility, I have noticed that when the query is running without the LIMIT statement, the postmaster process resident memory size is increasing at a steady rate. It would seem that the query results are being gathered in memory first before being displayed in PSQL. Is there a way around this?

After loading the tables with data I did perform a VACUUM ANALYZE and a database restart prior to any querying.
I have also modified the following parameters in the postgresql.conf file, and then restarted the database:

shared_buffers = 16384
commit_delay = 100000
commit_siblings = 5
enable_seqscan = off
effective_cache_size = 100000
log_destination = 'stderr'
redirect_stderr = on
log_directory = 'pg_log'
log_truncate_on_rotation = on
log_rotation_age = 1440
log_rotation_size = 200
lc_messages = 'en_US'
lc_monetary = 'en_US'
lc_numeric = 'en_US'
lc_time = 'en_US'

What am I doing wrong? Has anyone run into something similar? Any help would be greatly appreciated. Thank you in advance.

--
Regards,
Richard Kut
Database Administrator
Research & Development
Intelerad Medical Systems Inc.
460 Ste-Catherine West, Suite 210
Montreal, Quebec, Canada H3B 1A7
Tel: 514.931.6222 x7733
Fax: 514.931.4653
rkut@intelerad.com
www.intelerad.com

This email or any attachments may contain confidential or legally privileged information intended for the sole use of the addressees. Any use, redistribution, disclosure, or reproduction of this information, except as intended, is prohibited. If you received this email in error, please notify the sender and remove all copies of the message, including any attachments.
On 2/15/06 10:06 AM, "Richard Kut" <rkut@intelerad.com> wrote:

> Note the LIMIT 5324 statement. I empirically determined that this magic
> number is the maximum row count before the query performance degrades
> completely. It appears to be a sudden performance degradation. If I increase
> the value in LIMIT beyond 5324, or if I remove the LIMIT altogether, then the
> query performance is horrible.

Can you send EXPLAIN ANALYZE output?
That has timing and "real" numbers of rows, etc.

> From observation using the TOP utility, I have noticed that when the query is
> running without the LIMIT statement, that the postmaster process resident
> memory size is increasing at a steady rate. It would seem that the query
> results are being gathered in memory first before being displayed in PSQL. Is
> there a way around this?

That is what is going on, I think.
Richard Kut <rkut@intelerad.com> writes:
> Note the LIMIT 5324 statement. I empirically determined that this magic
> number is the maximum row count before the query performance degrades
> completely. It appears to be a sudden performance degradation. If I increase
> the value in LIMIT beyond 5324, or if I remove the LIMIT altogether, then the
> query performance is horrible.

This is really, really hard to believe considering that you're showing the exact same query plan in both cases. Are you certain you didn't get confused about which EXPLAIN output was which? It's entirely possible for the planner to pick a different underlying plan depending on the LIMIT setting --- small LIMIT values will make it tend to prefer fast-start query plans, since a slow-overall plan with nil startup cost can beat a fast-overall plan with high startup cost if you're not intending to run it to completion. Up until I got to the query plans I thought you were describing a situation where the planner had switched over to a high-startup-cost plan much sooner than it should do, but your EXPLAIN printouts don't reflect any such thing.

If it is a question of bad choice of query plan, the answer is probably that you need to VACUUM and/or ANALYZE the tables involved so that the planner has more accurate statistics to work with. Do the estimated row counts shown by EXPLAIN look reasonable at all?

> From observation using the TOP utility, I have noticed that when the query is
> running without the LIMIT statement, that the postmaster process resident
> memory size is increasing at a steady rate. It would seem that the query
> results are being gathered in memory first before being displayed in PSQL. Is
> there a way around this?

The postmaster doesn't gather up query results, but psql itself will. If you want incremental fetch of results, use a cursor and FETCH some appropriate number of rows at a time.

> commit_delay = 100000

This is unlikely to be a good idea.
Your other parameter settings look reasonable though.

			regards, tom lane
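The cursor-and-FETCH approach described above might look roughly like the following in psql. This is only a sketch, reusing the table and column names from the original query; the cursor name and batch size are arbitrary:

```sql
BEGIN;

-- Declare a cursor over the full join instead of running it to completion.
DECLARE join_cur CURSOR FOR
    SELECT *
    FROM pid, pv1
    WHERE pid.patient_id_internal_id = pv1.patient_id_internal_id;

-- Fetch the result incrementally; psql then buffers only one batch
-- at a time instead of accumulating all 1.1M rows in memory.
FETCH 5000 FROM join_cur;
FETCH 5000 FROM join_cur;   -- repeat until no rows come back

CLOSE join_cur;
COMMIT;
```

Cursors must live inside a transaction block (or be declared WITH HOLD), hence the BEGIN/COMMIT.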
Hi Tom!

Thanks for the quick response! Here are the query plans for both queries.

EXPLAIN ANALYZE SELECT *
FROM pid, pv1
WHERE pid.patient_id_internal_id = pv1.patient_id_internal_id
LIMIT 5324;

 Limit  (cost=0.00..33408.19 rows=5324 width=691) (actual time=0.054..106.972 rows=5324 loops=1)
   ->  Nested Loop  (cost=0.00..7277902.27 rows=1159822 width=691) (actual time=0.051..99.230 rows=5324 loops=1)
         ->  Index Scan using patient_id_internal_id_idx on pv1  (cost=0.00..290598.21 rows=1159822 width=407) (actual time=0.020..14.995 rows=5324 loops=1)
         ->  Index Scan using kdx_pid on pid  (cost=0.00..6.01 rows=1 width=284) (actual time=0.009..0.010 rows=1 loops=5324)
               Index Cond: ((pid.patient_id_internal_id)::text = ("outer".patient_id_internal_id)::text)
 Total runtime: 111.137 ms
(6 rows)

Time: 112.405 ms

EXPLAIN ANALYZE SELECT *
FROM pid, pv1
WHERE pid.patient_id_internal_id = pv1.patient_id_internal_id;

 Hash Join  (cost=62558.39..509419.48 rows=1159822 width=691) (actual time=3348.848..163051.064 rows=1159805 loops=1)
   Hash Cond: (("outer".patient_id_internal_id)::text = ("inner".patient_id_internal_id)::text)
   ->  Index Scan using patient_id_internal_id_idx on pv1  (cost=0.00..290598.21 rows=1159822 width=407) (actual time=0.020..56378.131 rows=1159822 loops=1)
   ->  Hash  (cost=49229.38..49229.38 rows=328403 width=284) (actual time=3342.874..3342.874 rows=328403 loops=1)
         ->  Index Scan using kdx_pid on pid  (cost=0.00..49229.38 rows=328403 width=284) (actual time=0.010..2110.172 rows=328403 loops=1)
 Total runtime: 163917.488 ms
(6 rows)

Time: 164006.053 ms

I hope that this helps.
On Wednesday 15 February 2006 10:53, Tom Lane wrote:
> The postmaster doesn't gather up query results, but psql itself will.
> If you want incremental fetch of results, use a cursor and FETCH some
> appropriate number of rows at a time.

--
Regards,
Richard Kut
Richard Kut <rkut@intelerad.com> writes:
>    ->  Hash  (cost=49229.38..49229.38 rows=328403 width=284) (actual time=3342.874..3342.874 rows=328403 loops=1)
>          ->  Index Scan using kdx_pid on pid  (cost=0.00..49229.38 rows=328403 width=284) (actual time=0.010..2110.172 rows=328403 loops=1)

Hm, it looks like you've still got the planner's hands tied behind its back --- specifically, I'll bet enable_seqscan = off. There's no reason at all to use a full-table indexscan to load a hash table. Please try it again with default planner parameter settings.

A general comment from comparing your two cases is that random_page_cost is probably too high for the scenario you are testing, which looks to be a case where both tables are fully cached in memory. However, before reducing it you should ask yourself whether that's still likely to be true in production. It's a bad idea to optimize on the basis of test cases that are much smaller than your production scenario will be ...

			regards, tom lane
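Re-running the test with the planner parameters reset, as suggested above, could look like this. A sketch only; the random_page_cost value shown is illustrative (4 is the 8.1 default), and whether lowering it is appropriate depends on the cache-residency caveat just given:

```sql
-- Put the planner back on its default settings before re-testing:
SET enable_seqscan = on;
RESET enable_mergejoin;

-- Only if the working set really is fully cached in production,
-- a lower random_page_cost may be worth trying:
SET random_page_cost = 2;

EXPLAIN ANALYZE
SELECT *
FROM pid, pv1
WHERE pid.patient_id_internal_id = pv1.patient_id_internal_id;
```

SET affects only the current session, so these experiments don't disturb other connections; the permanent settings stay in postgresql.conf.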
Hi Tom!

Here are the results with enable_seqscan = on:

hl7segmentsihe=> set enable_seqscan = on;
SET
Time: 0.210 ms
hl7segmentsihe=> EXPLAIN ANALYZE SELECT *
hl7segmentsihe-> FROM pid, pv1
hl7segmentsihe-> WHERE pid.patient_id_internal_id = pv1.patient_id_internal_id
hl7segmentsihe-> LIMIT 5324;

 Limit  (cost=27704.04..28790.59 rows=5324 width=691) (actual time=3865.677..17464.071 rows=5324 loops=1)
   ->  Hash Join  (cost=27704.04..264408.14 rows=1159822 width=691) (actual time=3865.674..17453.938 rows=5324 loops=1)
         Hash Cond: (("outer".patient_id_internal_id)::text = ("inner".patient_id_internal_id)::text)
         ->  Seq Scan on pv1  (cost=0.00..80441.22 rows=1159822 width=407) (actual time=4.907..9361.791 rows=654414 loops=1)
         ->  Hash  (cost=14375.03..14375.03 rows=328403 width=284) (actual time=3853.090..3853.090 rows=328403 loops=1)
               ->  Seq Scan on pid  (cost=0.00..14375.03 rows=328403 width=284) (actual time=6.198..2658.306 rows=328403 loops=1)
 Total runtime: 18120.833 ms
(7 rows)

Time: 18148.473 ms

hl7segmentsihe=> EXPLAIN ANALYZE SELECT *
hl7segmentsihe-> FROM pid, pv1
hl7segmentsihe-> WHERE pid.patient_id_internal_id = pv1.patient_id_internal_id
hl7segmentsihe-> ;

 Hash Join  (cost=27704.04..264408.14 rows=1159822 width=691) (actual time=3469.373..107465.440 rows=1159805 loops=1)
   Hash Cond: (("outer".patient_id_internal_id)::text = ("inner".patient_id_internal_id)::text)
   ->  Seq Scan on pv1  (cost=0.00..80441.22 rows=1159822 width=407) (actual time=4.977..10166.541 rows=1159822 loops=1)
   ->  Hash  (cost=14375.03..14375.03 rows=328403 width=284) (actual time=3457.473..3457.473 rows=328403 loops=1)
         ->  Seq Scan on pid  (cost=0.00..14375.03 rows=328403 width=284) (actual time=6.322..2288.957 rows=328403 loops=1)
 Total runtime: 108304.628 ms
(6 rows)

Time: 108305.898 ms

I hope that this helps.

On Wednesday 15 February 2006 11:21, Tom Lane wrote:
> Hm, it looks like you've still got the planner's hands tied behind its
> back --- specifically, I'll bet enable_seqscan = off. There's no reason
> at all to use a full-table indexscan to load a hash table. Please try
> it again with default planner parameter settings.

--
Regards,
Richard Kut
Richard Kut <rkut@intelerad.com> writes:
> Here are the results with enable_seqscan = on:

Hmm, there's something pretty odd about your data, it seems like. In the planner's preferred hash-join plan, it seems to be necessary to scan through 654414 rows of pv1 to find the first 5324 rows that join to pid:

>  Limit  (cost=27704.04..28790.59 rows=5324 width=691) (actual time=3865.677..17464.071 rows=5324 loops=1)
>    ->  Hash Join  (cost=27704.04..264408.14 rows=1159822 width=691) (actual time=3865.674..17453.938 rows=5324 loops=1)
>          Hash Cond: (("outer".patient_id_internal_id)::text = ("inner".patient_id_internal_id)::text)
>          ->  Seq Scan on pv1  (cost=0.00..80441.22 rows=1159822 width=407) (actual time=4.907..9361.791 rows=654414 loops=1)
                                                                                                         ^^^^^^^^^^^
>          ->  Hash  (cost=14375.03..14375.03 rows=328403 width=284) (actual time=3853.090..3853.090 rows=328403 loops=1)
>                ->  Seq Scan on pid  (cost=0.00..14375.03 rows=328403 width=284) (actual time=6.198..2658.306 rows=328403 loops=1)
>  Total runtime: 18120.833 ms

This is strange because all the other results look like there's about a one-to-one correspondence. In the case that is fast for you, the outer scan appears to have matched every time:

>  Limit  (cost=0.00..33408.19 rows=5324 width=691) (actual time=0.054..106.972 rows=5324 loops=1)
>    ->  Nested Loop  (cost=0.00..7277902.27 rows=1159822 width=691) (actual time=0.051..99.230 rows=5324 loops=1)
>          ->  Index Scan using patient_id_internal_id_idx on pv1  (cost=0.00..290598.21 rows=1159822 width=407) (actual time=0.020..14.995 rows=5324 loops=1)
                                                                                                                                            ^^^^^^^^^
>          ->  Index Scan using kdx_pid on pid  (cost=0.00..6.01 rows=1 width=284) (actual time=0.009..0.010 rows=1 loops=5324)
>                Index Cond: ((pid.patient_id_internal_id)::text = ("outer".patient_id_internal_id)::text)
>  Total runtime: 111.137 ms

Do you have any idea why the hash case is so unlucky?

BTW, these examples sure look like they are not being done on exactly the same tables... the names are different.
regards, tom lane
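One way to dig into the skew being asked about here is to check directly whether the early pv1 rows have matches in pid. A diagnostic sketch (not from the original thread), using the column names above; the LIMIT value is arbitrary:

```sql
-- Of the first 10000 pv1 rows (in index order), how many join to pid?
SELECT count(*) AS matched
FROM (SELECT patient_id_internal_id
      FROM pv1
      ORDER BY patient_id_internal_id
      LIMIT 10000) AS v
JOIN pid USING (patient_id_internal_id);

-- And overall: how many pv1 rows have no matching pid row at all?
SELECT count(*) AS unmatched
FROM pv1
LEFT JOIN pid USING (patient_id_internal_id)
WHERE pid.patient_id_internal_id IS NULL;
```

If a large block of pv1 rows at one end of the key range has no pid counterpart, that would explain why the sequential-scan hash join had to chew through 654414 rows before producing its first 5324 matches.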
Hi Tom!

Yes, they are the same tables in both examples. I have no idea why the query behaves this way.

On Wednesday 15 February 2006 12:01, Tom Lane wrote:
> Do you have any idea why the hash case is so unlucky?
>
> BTW, these examples sure look like they are not being done on exactly
> the same tables... the names are different.

--
Regards,
Richard Kut