Discussion: Postgres not using indexes
We have a huge performance issue in Postgres that surfaced because existing indexes are not being used, as in the example below, on both the 8.3.5 and 9.0 versions.

Client_Orders table with an int ID as PK, which is the order_id, and indexed - about 155,000 rows.
Order_Items table with an int ID primary key and an INDEX on Order_id (int) matching the ID in the client_orders table above - about 33 million rows.

A query like the one below takes almost ten minutes to complete, even though the result set is only 33768 rows and Total query runtime: 427539 ms!!! This is due to the fact that the index on Order_Items is NOT used and a sequential scan is done instead, but this is obviously not acceptable from a performance point of view. If I add a LIMIT 1000, for instance, then the index is used and the query returns results in no time, as expected, but as soon as I go higher in the limit to a few thousand, the index on Order_Items.Order_id is no longer used - why??? Is there any way to force Postgres to use the existing indexes instead of a table seq scan, which is deadly?

select oi.id from order_items oi INNER JOIN client_orders co ON oi.order_id = co.id

Regards,
Nenea Nelu.

________________________________
Attention:
The information contained in this message and or attachments is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any system and destroy any copies.
I force postgresql to use indexes instead of sequential scans by setting enable_seqscan = off in postgresql.conf, and it helps in a lot of cases. Probably not the best practice, but it does improve a lot of the queries we execute on a regular basis. It forces the planner to prefer indexes. I've also noticed that LIMIT behavior, which is sort of puzzling to me.

From: pgsql-bugs-owner@postgresql.org [mailto:pgsql-bugs-owner@postgresql.org] On Behalf Of Lawrence Cohan
Sent: Wednesday, March 30, 2011 10:01 AM
To: pgsql-bugs@postgresql.org
Subject: [BUGS] Postgres not using indexes
Hello

2011/3/30 Lawrence Cohan <LCohan@web.com>:
> We have a huge performance issue in Postgres that surfaced due to existing
> indexes not being used like in the example below in both 8.3.5 and 9.0
> versions.
>
> select oi.id from order_items oi INNER JOIN client_orders co ON oi.order_id
> = co.id

Did you run ANALYZE and VACUUM? Can you send the result of EXPLAIN ANALYZE SELECT ...?

Please run ANALYZE and VACUUM first.

regards
Pavel Stehule
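Pavel's suggestion, sketched as a psql session (table names are taken from the thread; the plan output and timings will of course depend on the actual data):

```sql
-- Reclaim dead tuples and refresh the planner's statistics first.
VACUUM ANALYZE client_orders;
VACUUM ANALYZE order_items;

-- Then capture the real plan, row estimates, and timings.
EXPLAIN ANALYZE
SELECT oi.id
FROM order_items oi
INNER JOIN client_orders co ON oi.order_id = co.id;
```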
Thanks for the tip; however, No. 1 is that we can't do that in the production environment due to the impact, and No. 2 is that I tried it and it is still not using an index on the large table but a seq scan.

From: Nathan M. Davalos [mailto:n.davalos@sharedmarketing.com]
Sent: March-30-11 12:05 PM
To: Lawrence Cohan; pgsql-bugs@postgresql.org
Subject: RE: [BUGS] Postgres not using indexes

I force postgresql to use indexes instead of sequential scans by setting enable_seqscan = off in postgresql.conf, and it helps in a lot of cases. Probably not the best practice, but it does improve a lot of the queries we execute on a regular basis. It forces the planner to prefer indexes. I've also noticed that LIMIT behavior, which is sort of puzzling to me.
Lawrence Cohan <LCohan@web.com> wrote:

> We have a huge performance issue in Postgres that surfaced due to
> existing indexes not being used

This doesn't sound like a bug; it sounds like you haven't tuned your server. For starters, you should check out this page:

http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

As a quick try, you could issue these statements on the connection right before one of the problem queries:

set effective_cache_size = '7GB'; -- use 1 or 2 GB less than RAM on the machine
set random_page_cost = 2;
-- now try your query

If, after reading the above-cited page and tuning your server, you still have performance problems, pick one query to work on first and follow the steps outlined here:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

Use the pgsql-performance list for performance issues, not the bugs list. You'll find more people there who will be able to help you with performance issues.

-Kevin
Please see results attached, before and after the vacuum, and note they are taken from version 9.0. As regular maintenance we reindex/vacuum/analyze the entire database once a week and run ANALYZE against it every few hours.

Lawrence Cohan.

-----Original Message-----
From: Pavel Stehule [mailto:pavel.stehule@gmail.com]
Sent: March-30-11 12:08 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Postgres not using indexes

Did you run ANALYZE and VACUUM? Can you send the result of EXPLAIN ANALYZE SELECT ...?

Please run ANALYZE and VACUUM first.

regards
Pavel Stehule
Attachments
We thank you for the links, which have lots of info, and please note that we tuned our servers as recommended by EnterpriseDB experts while they were in house for our hardware/software migrations, and the settings you mentioned are in place already.

Regards,
Lawrence Cohan.

-----Original Message-----
From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
Sent: March-30-11 12:45 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: Re: [BUGS] Postgres not using indexes
Lawrence Cohan <LCohan@web.com> wrote:

> From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
>> [configuration advice]
>> If, after reading the above-cited page and tuning your server you
>> still have performance problems, pick one query to work on first,
>> and follow the steps outlined here:
>>
>> http://wiki.postgresql.org/wiki/SlowQueryQuestions

> We thank you for the links, which have lots of info, and please
> note that we tuned our servers as recommended by EnterpriseDB
> experts while they were in house for our hardware/software
> migrations, and the settings you mentioned are in place already.

Then the next step would be to provide enough information on one of the slow queries for people to be able to offer useful advice. Your other post showed the query and the EXPLAIN ANALYZE output, but the other information listed in the above-cited page is useful when trying to understand a problem. I'm particularly curious about the data types of the id columns and the specifics of the index definitions.

-Kevin
Please see the updated attachment, which includes the tables involved in the simple query below and all their indexes. We believe that the performance issue is due to the query not using any index but doing seq scans instead, and this is very little related to the knowledge from the link you posted below. As you can see, we picked a simple query with an INNER JOIN between two indexed tables where Postgres 8.3 and 9.0 decides not to use existing indexes, for whatever reason.

select oi.id from order_items oi inner join clients_orders co on oi.order_id = co.id;

Lawrence Cohan.

-----Original Message-----
From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
Sent: March-30-11 1:33 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: RE: [BUGS] Postgres not using indexes
Attachments
Lawrence Cohan <LCohan@web.com> wrote:

> Please see updated attachment that includes the tables involved in
> the simple query below and all their indexes.

Well, that rules out a couple of common problems (comparisons between different types and incorrect indexing).

> We believe that the performance issue is due to the query not
> using any index but doing seq scans instead

So it appears.

> and this is very little related to the knowledge from the link you
> posted below.

Oh, but it is very much related. The PostgreSQL optimizer looks at all the various plans available, calculates a cost for each, and runs the one with the lowest calculated cost. Various configuration parameters affect the costing calculations, and thus the plan ultimately chosen. To get good plans, the configuration must accurately model the actual costs for your particular machine. Having seen that the types match and the indexes look usable, it must come down to something in your configuration. Probably the easiest way to show that is to run the query here and post the results:

http://wiki.postgresql.org/wiki/Server_Configuration

> As you can see we picked a simple query with INNER JOIN between
> two indexed tables where postgres 8.3 and 9.0 decides to not use
> existing indexes for whatever reason.

The reason is that with the configuration you're using, PostgreSQL calculates the cost of using the index as being higher than the cost of a sequential scan. The trick is to find where your configuration is wrong, so that the calculated costs better match the reality on your server.

-Kevin
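Kevin's point can be illustrated with a toy calculation (a rough sketch only; the planner's real cost model also accounts for CPU costs, correlation, and caching, and the page count here is hypothetical). With the default seq_page_cost = 1.0 and random_page_cost = 4.0, fetching rows scattered over even half of a large table's pages through an index costs out worse than one sequential pass, so the planner picks the seq scan:

```sql
-- Hypothetical 500,000-page table, costed with the default constants:
SELECT 500000 * 1.0 AS seq_scan_page_cost,     -- one sequential pass over every page
       250000 * 4.0 AS index_fetch_page_cost;  -- random heap fetches over half the pages
-- 500000 vs 1000000: the sequential scan wins. Lowering random_page_cost
-- (e.g. to 2, as suggested earlier) narrows the gap in the index's favor.
```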
I think you are right (my bad); please see the results below, plus a little bit more info about the environment, and sorry I missed that before. I've been told the server was tuned to the best for what we need, and it looks like we will need to change at least the two values below, and maybe play with work_mem to see if it solves our issues.

The only issue is that we are running a 24/7 web site against the db, and if we need to restart PG for the changes to take effect we will need to wait for a downtime window before any changes can be made.

'shared_buffers';'500MB' - shared_buffers should be 10% to 25% of available RAM -> change it to 2GB
'effective_cache_size';'2GB' - effective_cache_size should be 75% of available RAM -> change it to 10GB
'work_mem';'1MB' - increase it to 8MB, 32MB, 256MB, 1GB and check for better results.

PostgreSQL 9.0.2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
2 x Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
4 x 4GB = 16GB RAM

--query results below:
'version';'PostgreSQL 9.0.2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit'
'archive_command';'cp %p /pglog/wal_export/%f'
'archive_mode';'on'
'archive_timeout';'3min'
'autovacuum_analyze_threshold';'1000'
'autovacuum_vacuum_threshold';'1000'
'bytea_output';'escape'
'checkpoint_segments';'64'
'checkpoint_warning';'1min'
'client_encoding';'UNICODE'
'effective_cache_size';'2GB'
'escape_string_warning';'off'
'lc_collate';'en_US.UTF-8'
'lc_ctype';'en_US.UTF-8'
'listen_addresses';'xxx.xxx.xxx.xxx'
'log_autovacuum_min_duration';'2s'
'log_checkpoints';'on'
'log_destination';'syslog'
'log_line_prefix';'user=%u,db=%d '
'log_min_duration_statement';'1s'
'maintenance_work_mem';'256MB'
'max_connections';'1200'
'max_stack_depth';'2MB'
'port';'5432'
'server_encoding';'UTF8'
'shared_buffers';'500MB'
'syslog_facility';'local0'
'syslog_ident';'postgres'
'TimeZone';'Canada/Eastern'
'vacuum_cost_delay';'10ms'
'wal_buffers';'4MB'
'wal_level';'hot_standby'

-----Original Message-----
From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
Sent: March-30-11 1:33 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: RE: [BUGS] Postgres not using indexes
I'm just a lowly end user. Bumping the default statistics target or using ALTER TABLE ... SET STATISTICS has made large differences in query performance on large tables. The query planner has to guesstimate using the statistics, and sometimes on large tables the guesstimate isn't very good with a small statistical sample.

On 3/30/2011 12:05 PM, Lawrence Cohan wrote:
> I think you are right (my bad); please see the results below, plus a little bit more info about the environment.
>
> 'shared_buffers';'500MB' - shared_buffers should be 10% to 25% of available RAM -> change it to 2GB
> 'effective_cache_size';'2GB' - effective_cache_size should be 75% of available RAM -> change it to 10GB
> 'work_mem';'1MB' - increase it to 8MB, 32MB, 256MB, 1GB and check for better results.
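Harry's suggestion can be sketched as follows (the column choice is an assumption based on the join being discussed; note that on 8.4 and later the default default_statistics_target is already 100 rather than 10):

```sql
-- Sample order_id more thoroughly so the planner's row estimates improve:
ALTER TABLE order_items ALTER COLUMN order_id SET STATISTICS 1000;
ANALYZE order_items;  -- re-gather statistics so the new target takes effect
```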
Harry Rossignol <harrywr2@comcast.net> wrote:

> I'm just a lowly end user. Bumping the default statistics target
> or using ALTER TABLE SET STATISTICS has made large differences in
> query performance on large tables.

The default has been bumped up in later versions, so that shouldn't be as big a problem as it once was. In this particular case the EXPLAIN ANALYZE showed the estimated row counts were quite accurate compared to the actual, so this definitely wasn't the problem this time.

-Kevin
Lawrence Cohan <LCohan@web.com> wrote:

> looks like we will need to change at least the two values below
> and maybe play with work_mem to see if it solves our issues.

You will probably get better throughput by bumping up shared_buffers to the recommended setting, but beware of "stalls" in query processing at checkpoint time. If that happens you want to make the background writer more aggressive and/or back off on shared_buffers, so that there isn't such a glut of dirty pages to write during a checkpoint.

I think even the recommended setting for effective_cache_size is on the low side. This one affects how expensive the optimizer thinks index usage will be, so given your current problem this is probably important to raise. I add up shared_buffers and what `free` reports as cached space after PostgreSQL has been running a while. That usually winds up being 1GB to 2GB less than total memory on our machines, so actually, I usually just start there.

We usually need to reduce random_page_cost to get good plans. For a fully-cached database you may want to reduce both seq_page_cost and random_page_cost to equal numbers around 0.05. With partial caching, we often leave seq_page_cost alone and reduce random_page_cost to 2. YMMV.

The setting for work_mem can be tricky, especially with 1200 connections configured. Each connection may be using one or more allocations of work_mem at the same time. Which leads to the question of why you have 1200 connections configured. You are almost always better off using a connection pooler to limit this to something on the order of twice your CPU cores plus your effective spindle count. Tomcat has a very good connection pooler built in, as do many other products. There are also good external poolers, like pgpool and pgbouncer.

With a reasonable amount of RAM you're almost always better off bumping wal_buffers to 32MB.
> The only issue is that we are running a 24/7 web site against the
> db and if we need to restart PG for the changes to take place we
> will need to wait for a downtime before any changes can be made.

Some of these can be set per user with ALTER ROLE. New connections would then start using the new settings with no down time.

-Kevin
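A sketch of the ALTER ROLE approach Kevin describes (the role name is hypothetical, and the values are the ones discussed upthread; they take effect for new sessions with no server restart):

```sql
-- New connections made as "webapp" pick these up immediately:
ALTER ROLE webapp SET work_mem = '8MB';       -- per-sort/hash memory for this role
ALTER ROLE webapp SET random_page_cost = 2;   -- cheaper index probes in cost estimates
```

Note that settings like shared_buffers are server-wide and still require a restart; this technique only helps for parameters that can be changed per session.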
Thank you for all your suggestions - will attempt to make changes as recomm= ended one at a time and will post back the results. Regards, Lawrence Cohan. -----Original Message----- From: pgsql-bugs-owner@postgresql.org [mailto:pgsql-bugs-owner@postgresql.o= rg] On Behalf Of Kevin Grittner Sent: March-30-11 4:12 PM To: pgsql-bugs@postgresql.org; Lawrence Cohan Subject: Re: [BUGS] Postgres not using indexes Lawrence Cohan <LCohan@web.com> wrote: > looks like we will need to change at least the two values below > and maybe play with work_mem to see if it solves our issues. You will probably get better throughput by bumping up shared_buffers to the recommended setting, but beware of "stalls" in query processing at checkpoint time. If that happens you want to make the background writer more aggressive and/or back off on shared_memory, so that there isn't such a glut of dirty pages to write during a checkpoint. I think even the recommended setting for effective_cache size is on the low side. This one affects how expensive the optimizer thinks index usage will be, so given your current problem this is probably important to raise. I add up shared_buffers and what free tells me is cached space is after PostgreSQL has been running a while. That usually winds up being 1GB to 2GB less than total memory on our machines, so actually, I usually just start there. We usually need to reduce random_page_cost to get good plans. For a fully-cached database you may want to reduce both seq_page_cost and random_page_cost to equal numbers around 0.05. With partial caching, we often leave seq_page_cost alone and reduce random_page_cost to 2. YMMV. The setting for work_mem can be tricky, especially with 1200 connections configured. Each connection may be using one or more allocations of work_mem at the same time. Which leads to the question of why you have 1200 connections configured. 
You are almost always better off using a connection pooler to limit
this to something on the order of twice your CPU cores plus your
effective spindle count.  Tomcat has a very good connection pooler
built in, as do many other products.  There are also good external
poolers, like pgpool and pgbouncer.

With a reasonable amount of RAM you're almost always better off
bumping wal_buffers to 32MB.

> The only issue is that we are running a 24/7 web site against the
> db and if we need to restart PG for the changes to take place we
> will need to wait for a downtime before any changes can be made.

Some of these can be set per user with ALTER ROLE.  New connections
would then start using the new settings with no down time.

-Kevin

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Attention: The information contained in this message and or attachments is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any system and destroy any copies.
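[Editorial note: a minimal pgbouncer sketch of the "twice your CPU cores plus your effective spindle count" sizing heuristic Kevin describes. Every value here is an assumption for illustration (12 cores and 8 effective spindles giving 2 * 12 + 8 = 32 server connections, matching the poster's hardware later in the thread), not a tested production configuration.]

```ini
; pgbouncer.ini -- illustrative sketch only
[databases]
orders_db = host=127.0.0.1 port=5432 dbname=orders_db

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
; 2 * 12 CPU cores + 8 effective spindles = 32 server connections,
; even if the application opens far more client connections
default_pool_size = 32
max_client_conn = 1200
```

With this in place the application keeps its 1200 client connections, but PostgreSQL only ever sees about 32 concurrent backends, which also makes the work_mem budget much easier to reason about.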
On Wed, Mar 30, 2011 at 7:32 PM, Lawrence Cohan <LCohan@web.com> wrote:
> Please see updated attachment that includes the tables involved in the
> simple query below and all their indexes. We believe that the
> performance issue is due to the query not using any index but doing seq
> scans instead, and this is very little related to the knowledge from
> the link you posted below. As you can see we picked a simple query with
> INNER JOIN between two indexed tables where Postgres 8.3 and 9.0
> decides to not use existing indexes for whatever reason.

Have you tried using enable_seqscan = off and looked at the explain
output then?

I think you'll find Postgres is choosing not to use the indexes
because they're just not helpful. If you disable seqscan it should
use the indexes but I expect it will be slower. The current plan is
doing a single pass through both tables using entirely sequential
i/o. That's about as fast as you could hope for it to be as your
query does require reading all the data.

--
greg
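[Editorial note: the experiment Greg suggests can be run as below, using the table names from the thread. Disabling sequential scans for the current session only lets you compare plans without affecting other connections.]

```sql
-- Disable seq scans for this session only, then compare the plan and
-- timing against the default. Do not set this globally.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT oi.id
FROM order_items oi
INNER JOIN client_orders co ON oi.order_id = co.id;

-- Restore the default planner behavior afterwards:
RESET enable_seqscan;
```

If the indexed plan really is faster, that is evidence the cost settings (random_page_cost, effective_cache_size) are misleading the planner, which is the configuration discussion later in this thread.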
Greg Stark <gsstark@mit.edu> wrote:

> your query does require reading all the data.

Huh?  It requires reading all the data from at least *one* of the
tables.  It could conceivably be faster to read all the data from
the table with 23,980 rows and randomly pick out the necessary
33,768 rows from the table with 33,909,690 rows using the
idx_order_items_order_id index.

-Kevin
On Thu, Mar 31, 2011 at 11:33 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Greg Stark <gsstark@mit.edu> wrote:
>
>> your query does require reading all the data.
>
> Huh?  It requires reading all the data from at least *one* of the
> tables.

The query he posted a plan for was:

EXPLAIN ANALYZE select oi.id from order_items oi inner join
clients_orders co on oi.order_id = co.id;

And the plan for it looks like it's optimal to me:

'Hash Join  (cost=780.55..1908023.16 rows=1027457 width=4) (actual
time=63.506..85607.003 rows=33768 loops=1)'
'  Hash Cond: (oi.order_id = co.id)'
'  ->  Seq Scan on order_items oi  (cost=0.00..1558536.52
rows=33843152 width=8) (actual time=0.005..69718.563 rows=33909137
loops=1)'
'  ->  Hash  (cost=480.80..480.80 rows=23980 width=4) (actual
time=13.072..13.072 rows=23980 loops=1)'
'        Buckets: 4096  Batches: 1  Memory Usage: 844kB'
'        ->  Seq Scan on clients_orders co  (cost=0.00..480.80
rows=23980 width=4) (actual time=0.006..6.570 rows=23980 loops=1)'
'Total runtime: 85613.391 ms'

--
greg
Greg Stark <gsstark@mit.edu> wrote:
> On Thu, Mar 31, 2011 at 11:33 PM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>> Greg Stark <gsstark@mit.edu> wrote:
>>
>>> your query does require reading all the data.
>>
>> Huh?  It requires reading all the data from at least *one* of the
>> tables.
>
> The query he posted a plan for was:
>
> EXPLAIN ANALYZE select oi.id from order_items oi inner join
> clients_orders co on oi.order_id = co.id;
>
> And the plan for it looks like it's optimal to me:
>
> 'Hash Join  (cost=780.55..1908023.16 rows=1027457 width=4) (actual
> time=63.506..85607.003 rows=33768 loops=1)'
> '  Hash Cond: (oi.order_id = co.id)'
> '  ->  Seq Scan on order_items oi  (cost=0.00..1558536.52
> rows=33843152 width=8) (actual time=0.005..69718.563 rows=33909137
> loops=1)'
> '  ->  Hash  (cost=480.80..480.80 rows=23980 width=4) (actual
> time=13.072..13.072 rows=23980 loops=1)'
> '        Buckets: 4096  Batches: 1  Memory Usage: 844kB'
> '        ->  Seq Scan on clients_orders co  (cost=0.00..480.80
> rows=23980 width=4) (actual time=0.006..6.570 rows=23980 loops=1)'
> 'Total runtime: 85613.391 ms'

It may or may not be optimal, but the assertion that all 33.9
*million* order_items rows must be read to pick out the needed 33.8
*thousand* is just plain incorrect.  Personally, I won't be shocked
if using the index to cut the tuples accessed by three orders of
magnitude is faster.

-Kevin
Thank you for all your suggestions and I hope the "set enable_seqscan = off;" will work for the time being until we can make PG config changes and do more testing in the near future. We do expect much better performance with the index being used on the 33+ million row table vs a seq scan, and I will post back real results as soon as I can get them done on the production servers.

Regards,
Lawrence Cohan.

-----Original Message-----
From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
Sent: April-01-11 10:38 AM
To: Greg Stark
Cc: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: Re: Postgres not using indexes

Greg Stark <gsstark@mit.edu> wrote:
> On Thu, Mar 31, 2011 at 11:33 PM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>> Greg Stark <gsstark@mit.edu> wrote:
>>
>>> your query does require reading all the data.
>>
>> Huh?  It requires reading all the data from at least *one* of the
>> tables.
>
> The query he posted a plan for was:
>
> EXPLAIN ANALYZE select oi.id from order_items oi inner join
> clients_orders co on oi.order_id = co.id;
>
> And the plan for it looks like it's optimal to me:
>
> 'Hash Join  (cost=780.55..1908023.16 rows=1027457 width=4) (actual
> time=63.506..85607.003 rows=33768 loops=1)'
> '  Hash Cond: (oi.order_id = co.id)'
> '  ->  Seq Scan on order_items oi  (cost=0.00..1558536.52
> rows=33843152 width=8) (actual time=0.005..69718.563 rows=33909137
> loops=1)'
> '  ->  Hash  (cost=480.80..480.80 rows=23980 width=4) (actual
> time=13.072..13.072 rows=23980 loops=1)'
> '        Buckets: 4096  Batches: 1  Memory Usage: 844kB'
> '        ->  Seq Scan on clients_orders co  (cost=0.00..480.80
> rows=23980 width=4) (actual time=0.006..6.570 rows=23980 loops=1)'
> 'Total runtime: 85613.391 ms'

It may or may not be optimal, but the assertion that all 33.9
*million* order_items rows must be read to pick out the needed 33.8
*thousand* is just plain incorrect.
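[Editorial note: rather than setting enable_seqscan = off in postgresql.conf for the whole cluster, the override can be confined to a single transaction with SET LOCAL, so every other query keeps normal planning. A sketch using the query from the thread:]

```sql
-- SET LOCAL reverts automatically at COMMIT or ROLLBACK, so the
-- planner override cannot leak into other work on this connection.
BEGIN;
SET LOCAL enable_seqscan = off;

SELECT oi.id
FROM order_items oi
INNER JOIN client_orders co ON oi.order_id = co.id;

COMMIT;
```

This is still a blunt instrument; fixing random_page_cost and effective_cache_size, as discussed below in the thread, addresses the root cause instead of masking it.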
Personally, I won't be shocked if using the index to cut the tuples
accessed by three orders of magnitude is faster.

-Kevin
Hello Kevin,

We managed to put together a new test server running PG 9.0.2 on 2 sockets x 6 cores = 12 CPUs with 64 GB RAM against a 3PAR 10TB SAN. We kept the settings I submitted already (and enclosed below) and after 12 hours of pounding the box with PGBENCH running 8 scripts to perform all of the INSERT/UPDATE/DELETE/SELECT statements we wanted, we got a pretty good picture of what it can do with those settings. We got a load average of 60 with CPU up around that 60% mark, pushing through about 1400 transactions per second for 12 hours. We made the changes as suggested and listed below, but the throughput dropped from 1400 t/s to 400 t/s and I suspect these are the "stalls" you mentioned. Here's what we changed:

Current Settings                 Test Settings
================================================================
shared_buffers = 500MB           shared_buffers = 8GB
effective_cache_size = 2GB       effective_cache_size = 32GB

Just to be 100% accurate, we ask what you mean by:

1) "Make the background writer more aggressive and/or back off on
shared_memory, so that there isn't such a glut of dirty pages to
write during a checkpoint."

By aggressive, do you mean changing any of the following?

# - Background Writer -

#bgwriter_delay = 200ms                # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100           # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0         # 0-10.0 multiplier on buffers scanned/round

Or would we be better off with the checkpoint segment handling - any of the below:

# - Checkpoints -

checkpoint_segments = 64               # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min             # range 30s-1h
#checkpoint_completion_target = 0.5    # checkpoint target duration, 0.0 - 1.0
checkpoint_warning = 60s               # 0 disables

Best regards,
Lawrence Cohan.
-----Original Message-----
From: pgsql-bugs-owner@postgresql.org [mailto:pgsql-bugs-owner@postgresql.org] On Behalf Of Kevin Grittner
Sent: March-30-11 4:12 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: Re: [BUGS] Postgres not using indexes

Lawrence Cohan <LCohan@web.com> wrote:

> looks like we will need to change at least the two values below
> and maybe play with work_mem to see if it solves our issues.

You will probably get better throughput by bumping up
shared_buffers to the recommended setting, but beware of "stalls"
in query processing at checkpoint time.  If that happens you want
to make the background writer more aggressive and/or back off on
shared_memory, so that there isn't such a glut of dirty pages to
write during a checkpoint.

I think even the recommended setting for effective_cache_size is on
the low side.  This one affects how expensive the optimizer thinks
index usage will be, so given your current problem this is probably
important to raise.  I add up shared_buffers and what free tells me
is cached space after PostgreSQL has been running a while.  That
usually winds up being 1GB to 2GB less than total memory on our
machines, so actually, I usually just start there.

We usually need to reduce random_page_cost to get good plans.  For
a fully-cached database you may want to reduce both seq_page_cost
and random_page_cost to equal numbers around 0.05.  With partial
caching, we often leave seq_page_cost alone and reduce
random_page_cost to 2.  YMMV.

The setting for work_mem can be tricky, especially with 1200
connections configured.  Each connection may be using one or more
allocations of work_mem at the same time.  Which leads to the
question of why you have 1200 connections configured.  You are
almost always better off using a connection pooler to limit this to
something on the order of twice your CPU cores plus your effective
spindle count.  Tomcat has a very good connection pooler built in,
as do many other products.
There are also good external poolers, like pgpool and pgbouncer.

With a reasonable amount of RAM you're almost always better off
bumping wal_buffers to 32MB.

> The only issue is that we are running a 24/7 web site against the
> db and if we need to restart PG for the changes to take place we
> will need to wait for a downtime before any changes can be made.

Some of these can be set per user with ALTER ROLE.  New connections
would then start using the new settings with no down time.

-Kevin
Lawrence Cohan <LCohan@web.com> wrote:

> We managed to put together a new test server running PG 9.0.2 on
> 2socketsx6cores = 12CPU with 64 GB RAM against a 3PAR 10TB SAN. We
> kept the settings I submitted already (and enclosed below) and
> after 12 hours of pounding the box with PGBENCH running 8 scripts
> to perform all of INSERT/UPDATE/DELETE/SELECT statements we wanted
> we got a pretty good picture of what can do with those settings.
> We got a load average of 60 with CPU up and around that 60% mark,
> pushing through about 1400 transactions per second for 12 hours.
> We made the changes as suggested and listed below but the
> throughput dropped from 1400 t/s to 400 t/s and I suspect are the
> "stalled" transactions you mentioned about.
>
> Here's what we changed:
>
> Current Settings                 Test Settings
> ================================================================
> shared_buffers = 500MB           shared_buffers = 8GB
> effective_cache_size = 2GB       effective_cache_size = 32GB

To make sure I understand, are the "Current Settings" the ones
which performed better?

> Just to be 100% accurate we ask you what do you mean by:
>
> 1) "Make the background writer more aggressive and/or back
> off on shared_memory, so that there isn't such a glut of dirty
> pages to Write during a checkpoint."
>
> By aggressive does he mean changing any of the following?
> # - Background Writer -
>
> #bgwriter_delay = 200ms
> #bgwriter_lru_maxpages = 100
> #bgwriter_lru_multiplier = 2.0

We use these overrides:

bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0

> Or we should be better of by the checkpoint segment handling - any
> of the below:
> # - Checkpoints -
>
> checkpoint_segments = 64
> #checkpoint_timeout = 5min
> #checkpoint_completion_target = 0.5
> checkpoint_warning = 60s

You might consider increasing checkpoint_timeout if you can
tolerate the increased recovery time if there is a crash.  You
should probably boost checkpoint_completion_target to 0.8 or 0.9.
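[Editorial note: a postgresql.conf fragment pulling together the overrides Kevin quotes above. The checkpoint_timeout line is an assumption based on his caveat about recovery time, not a value he gave.]

```
# Background writer: flush dirty pages sooner, so checkpoints
# have less accumulated work to do in one burst
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0

# Checkpoints: spread the checkpoint writes over more of the interval
checkpoint_segments = 64
checkpoint_completion_target = 0.9
#checkpoint_timeout = 15min   # only if longer crash recovery is acceptable
```

These take effect on a config reload (pg_ctl reload), not just a restart, so they can be tried on a live system.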
Really, if you don't already have it, you should get a copy of Greg
Smith's recent book on performance:

http://www.postgresql.org/docs/books/

It gives good descriptions of all of these parameters and advice on
incremental tuning to find your best settings.  The fact that you
listed shared_buffers and effective_cache_size together at least
suggests that you don't yet grasp the roles of these settings.  One
affects how much memory PostgreSQL allocates; the other has
absolutely nothing to do with that.  effective_cache_size affects
the costs assigned to various plans, thereby affecting plan choice.
While a high shared_buffers setting might lead to a glut of writes
around commit time, setting effective_cache_size incorrectly might
lead to plans which don't read the data efficiently.

Seeing what vmstat or iostat say during a slow episode, and seeing
whether the episodes correspond to checkpoints, will give you a
better indication of where the problem lies.

-Kevin
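[Editorial note: one way to check whether the slow episodes line up with checkpoints, as Kevin suggests, is to enable log_checkpoints in postgresql.conf and watch the cluster-wide checkpoint counters. The pg_stat_bgwriter view exists in 8.3 and later.]

```sql
-- If checkpoints_req grows quickly relative to checkpoints_timed,
-- checkpoint_segments is likely too low for the write load.
-- buffers_backend counts writes that backends had to do themselves,
-- a sign the background writer is not keeping up.
SELECT checkpoints_timed, checkpoints_req,
       buffers_checkpoint, buffers_clean, buffers_backend
FROM pg_stat_bgwriter;
```

Comparing the checkpoint timestamps from the server log against the slow periods seen in vmstat or iostat confirms or rules out checkpoints as the cause.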
Many thanks for all your advice and we will use Greg Smith's book on performance to incrementally tune our environment.

Regards,
Lawrence Cohan.

-----Original Message-----
From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
Sent: April-21-11 3:38 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: RE: [BUGS] Postgres not using indexes

Lawrence Cohan <LCohan@web.com> wrote:

> We managed to put together a new test server running PG 9.0.2 on
> 2socketsx6cores = 12CPU with 64 GB RAM against a 3PAR 10TB SAN. We
> kept the settings I submitted already (and enclosed below) and
> after 12 hours of pounding the box with PGBENCH running 8 scripts
> to perform all of INSERT/UPDATE/DELETE/SELECT statements we wanted
> we got a pretty good picture of what can do with those settings.
> We got a load average of 60 with CPU up and around that 60% mark,
> pushing through about 1400 transactions per second for 12 hours.
> We made the changes as suggested and listed below but the
> throughput dropped from 1400 t/s to 400 t/s and I suspect are the
> "stalled" transactions you mentioned about.
>
> Here's what we changed:
>
> Current Settings                 Test Settings
> ================================================================
> shared_buffers = 500MB           shared_buffers = 8GB
> effective_cache_size = 2GB       effective_cache_size = 32GB

To make sure I understand, are the "Current Settings" the ones
which performed better?

> Just to be 100% accurate we ask you what do you mean by:
>
> 1) "Make the background writer more aggressive and/or back
> off on shared_memory, so that there isn't such a glut of dirty
> pages to Write during a checkpoint."
>
> By aggressive does he mean changing any of the following?
> # - Background Writer -
>
> #bgwriter_delay = 200ms
> #bgwriter_lru_maxpages = 100
> #bgwriter_lru_multiplier = 2.0

We use these overrides:

bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0

> Or we should be better of by the checkpoint segment handling - any
> of the below:
> # - Checkpoints -
>
> checkpoint_segments = 64
> #checkpoint_timeout = 5min
> #checkpoint_completion_target = 0.5
> checkpoint_warning = 60s

You might consider increasing checkpoint_timeout if you can
tolerate the increased recovery time if there is a crash.  You
should probably boost checkpoint_completion_target to 0.8 or 0.9.

Really, if you don't already have it, you should get a copy of Greg
Smith's recent book on performance:

http://www.postgresql.org/docs/books/

It gives good descriptions of all of these parameters and advice on
incremental tuning to find your best settings.  The fact that you
listed shared_buffers and effective_cache_size together at least
suggests that you don't yet grasp the roles of these settings.  One
affects how much memory PostgreSQL allocates; the other has
absolutely nothing to do with that.  effective_cache_size affects
the costs assigned to various plans, thereby affecting plan choice.
While a high shared_buffers setting might lead to a glut of writes
around commit time, setting effective_cache_size incorrectly might
lead to plans which don't read the data efficiently.

Seeing what vmstat or iostat say during a slow episode, and seeing
whether the episodes correspond to checkpoints, will give you a
better indication of where the problem lies.

-Kevin