Discussion: significant slow down with various LIMIT

From:
Helio Campos Mello de Andrade
Date:

Andrey,
- Another idea for your problem is the one Kevin gave in the following message:

##########################################################################################################################
> SELECT * FROM t_route
>   WHERE t_route.route_type_fk = 1
>   limit 4; 
 
This one scanned the t_route table until it found four rows that
matched.  It apparently didn't need to look at very many rows to find
the four matches, so it was fast.
> SELECT * FROM t_route
>   WHERE t_route.route_type_fk =
>     (SELECT id FROM t_route_type WHERE type = 2)
>   limit 4; 
 
This one came up with an id for a route type that didn't have any
matches in the t_route table, so it had to scan the entire t_route
table.  (Based on your next query, the subquery probably returned
NULL, so there might be room for some optimization here.)  If you had
chosen a route type with at least four matches near the start of the
route table, this query would have completed quickly.
> SELECT * FROM t_route, t_route_type
>   WHERE t_route.route_type_fk = t_route_type.id
>     AND type = 2
>   limit 4; 
 
Since it didn't find any t_route_type row which matched, it knew
there couldn't be any output from the JOIN, so it skipped the scan of
the t_route table entirely.
-Kevin
##############################################################################################################
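As a side note, the difference between the three quoted forms can be seen directly by comparing their plans. A hedged sketch (table and column names taken from Kevin's examples above):

```sql
-- Compare the planner's choices for the three variants. With no
-- matching t_route_type row, only the explicit JOIN form lets the
-- planner skip scanning t_route entirely.
EXPLAIN SELECT * FROM t_route
 WHERE t_route.route_type_fk = 1
 LIMIT 4;

EXPLAIN SELECT * FROM t_route
 WHERE t_route.route_type_fk =
       (SELECT id FROM t_route_type WHERE type = 2)
 LIMIT 4;

EXPLAIN SELECT * FROM t_route
  JOIN t_route_type ON t_route.route_type_fk = t_route_type.id
 WHERE t_route_type.type = 2
 LIMIT 4;
```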
Regards....

-------- Original Message --------
From: norn
Newsgroups: pgsql.performance
Subject: Re: [PERFORM] significant slow down with various LIMIT
Date: Thu, 8 Apr 2010 18:13:33 -0700 (PDT)


Kevin, thanks for your attention!
I've read SlowQueryQuestions, but I still can't find the bottleneck...

Here is the requested information:
OS: Ubuntu 9.10 64bit, PostgreSQL 8.4.2 with PostGIS
Hardware: AMD Phenom(tm) II X4 945, 8GB RAM, 2 SATA 750GB (pg db
installed in software RAID 0)
Please also note that this hardware isn't a dedicated DB server; it
also serves as a web server and file server.

I have about 3 million rows in core_object, 1.5 million in
plugins_plugin_addr and 1.5 million in plugins_guide_address.
When there were 300,000+ objects, queries worked perfectly, but as the
db grows, things get worse...

# select version();
PostgreSQL 8.4.2 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real
(Ubuntu 4.4.1-4ubuntu8) 4.4.1, 64-bit
---postgresql.conf---
data_directory = '/mnt/fast/postgresql/8.4/main'
hba_file = '/etc/postgresql/8.4/main/pg_hba.conf'
ident_file = '/etc/postgresql/8.4/main/pg_ident.conf'
external_pid_file = '/var/run/postgresql/8.4-main.pid'
listen_addresses = 'localhost'
port = 5432
max_connections = 250
unix_socket_directory = '/var/run/postgresql'
ssl = true
shared_buffers = 1024MB
temp_buffers = 16MB
work_mem = 128MB
maintenance_work_mem = 512MB
fsync = off
wal_buffers = 4MB
checkpoint_segments = 16
effective_cache_size = 1536MB
log_min_duration_statement = 8000
log_line_prefix = '%t '
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
standard_conforming_strings = on
escape_string_warning = off
constraint_exclusion = on
checkpoint_completion_target = 0.9
---end postgresql.conf---

I hope this helps!
Any ideas are appreciated!


On Apr 9, 12:44 am, Kevin.Gritt...@wicourts.gov ("Kevin Grittner")
wrote:
>
> Could you show us the output from "select version();", describe your
> hardware and OS, and show us the contents of your postgresql.conf
> file (with all comments removed)?  We can then give more concrete
> advice than is possible with the information provided so far.
>
> http://wiki.postgresql.org/wiki/SlowQueryQuestions
>
> -Kevin
>
> --
> Sent via pgsql-performance mailing list (pgsql-performa...@postgresql.org)
> To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list ()
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

From:
"Kevin Grittner"
Date:

[rearranging to put related information together]

norn

Since the LIMIT 3 and LIMIT 4 queries generated exactly the same
plan, the increased time for LIMIT 4 suggests that there are 3
matching rows which are near the end of the index it is scanning, but
the fourth one is much farther in.

Since what you're showing suggests that the active portion of your
data is heavily cached, you might benefit from decreasing
random_page_cost, and possibly also seq_page_cost.

> 8GB RAM

> effective_cache_size = 1536MB

> Please also note that this hardware isn't dedicated DB server, but
> also serve as web server and file server.

Even with those other uses, you're likely to actually be using 6 GB
or 7 GB for cache.  I'd set effective_cache_size in that range.

> max_connections = 250
> work_mem = 128MB

While probably not related to this problem, that's a dangerous
combination.  What if all 250 connections are active with a query
which uses work_mem memory?  A single connection can actually be
using several work_mem allocations at once.
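As a rough illustration of the danger (numbers from the settings above): 250 connections times 128 MB is up to 32 GB for a single sort apiece, on an 8 GB machine. A common pattern, sketched here as a suggestion rather than a prescription, is a conservative global setting raised per-session only where needed:

```sql
-- postgresql.conf: keep the global setting conservative, e.g.
--   work_mem = 16MB
-- Then raise it only in the sessions that run big sorts or hashes:
SET work_mem = '128MB';
-- ... run the heavy reporting query here ...
RESET work_mem;  -- back to the server default for this session
```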

> 2 SATA 750GB (pg db installed in software RAID 0)

You do realize that if either drive dies you lose all your data on
that pair of drives, right?  I hope the value of the data and
well-tested backup procedures keep the loss to something which is
acceptable.

> I have about 3 million rows in core_object, 1.5 million in
> plugin_plugin_addr and 1.5 million in plugins_guide_address.
> When there were 300 000+ objects queries works perfectly, but as db
> enlarge things go worse...

With a relational database, it's not unusual for the most efficient
plan to depend on the quantity of data in the tables.  It is
important that your statistics are kept up-to-date so that plans can
adapt to the changing table sizes or data distributions.  The
effective_cache_size and cost parameters are also used to calculate
the costs of various plans, so adjusting those may help the optimizer
make good choices.
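For example, statistics can be refreshed manually after large data changes, and the last automatic run can be checked in the standard statistics views (table names taken from this thread):

```sql
-- Refresh planner statistics after bulk changes:
ANALYZE core_object;
ANALYZE plugins_plugin_addr;
ANALYZE plugins_guide_address;

-- Or rely on autovacuum, and check when it last analyzed each table:
SELECT relname, last_analyze, last_autoanalyze
  FROM pg_stat_user_tables
 WHERE relname IN ('core_object', 'plugins_plugin_addr',
                   'plugins_guide_address');
```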

-Kevin


From:
norn
Date:

Kevin,
I appreciate your help very much!

> Since the LIMIT 3 and LIMIT 4 queries generated exactly the same
> plan, the increased time for LIMIT 4 suggests that there are 3
> matching rows which are near the end of the index it is scanning, but
> the fourth one is much farther in.
Yes, you are right. I checked the ids of the rows and found that the
problem occurs when one id is 1377077 and the next one is 132604.
The table was modified with several new rows, and the problem now
happens between LIMIT 4 and 5, which is further evidence that you
are right!

Following your advice, I set the following:
effective_cache_size = 6144MB
random_page_cost = 0.25
seq_page_cost = 0.25
max_connections = 50 # thanks for pointing this out! I don't really
need so many

> > 2 SATA 750GB (pg db installed in software RAID 0)
>
> You do realize that if either drive dies  you lose all your data on
> that pair of drives, right?  I hope the value of the data and well
> tested backup procedures keeps the loss to something which is
> acceptable.
Thanks for the attention! I already have regular backup procedures,
so there is no extra risk related to drive failure...

I restarted PostgreSQL with the new settings and got no performance
improvement in this particular query...
Do you have any idea how much random_page_cost and seq_page_cost
should be decreased?
I should also note that I ran REINDEX DATABASE while trying to solve
the problem myself... it didn't help...


From:
"Kevin Grittner"
Date:

norn <> wrote:

> I restarted Postgresql with new settings and got no performance
> improvements in this particular query...

The cost settings help the optimizer make good decisions about plan
choice.  I guess I don't have much reason to believe, at this point,
that there is a better plan for it to choose for this query.  Do you
think you see one?  What would that be?  (We might be able to force
that plan and find out if you're right, which can be a valuable
diagnostic step, even if the way it gets forced isn't a
production-quality solution.)

Are you able to share the table descriptions?  (That might help us
suggest an index or some such which might help.)

> Do you have ideas how much random_page_cost and seq_page_cost
> should be decreased?

It really depends on how much of your active data set is cached.  If
it is effectively fully cached, you might want to go to 0.01 for
both (or even lower).  Many of our databases perform best with
seq_page_cost = 1 and random_page_cost = 2.  With some, either of
those "extremes" causes some queries to optimize poorly, and we've
had luck with 0.3 and 0.5.  This is one worth testing with your
workload, because you can make some queries faster at the expense of
others; sometimes it comes down to which needs better response time
to keep your users happy.
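One safe way to run that test is per-session, so nothing changes for other connections; the values below are just the examples from the paragraph above, not recommendations:

```sql
-- Experiment per-session; these settings do not affect other backends.
SET seq_page_cost = 0.3;
SET random_page_cost = 0.5;
EXPLAIN ANALYZE SELECT 1;  -- substitute the problem query here
-- Restore the server defaults for this session:
RESET seq_page_cost;
RESET random_page_cost;
```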

-Kevin

From:
norn
Date:

On Apr 13, 5:28 am, Kevin.Gritt...@wicourts.gov ("Kevin Grittner")
wrote:
> The cost settings help the optimizer make good decisions about plan
> choice.  I guess I don't have much reason to believe, at this point,
> that there is a better plan for it to choose for this query.  Do you
> think you see one?  What would that be?  (We might be able to force
> that plan and find out if you're right, which can be a valuable
> diagnostic step, even if the way it gets forced isn't a
> production-quality solution.)
I have no deep knowledge of PostgreSQL, so I have no idea which plan
is the best, but I am wondering why there is such a big gap between
the two limits and how to avoid it...

> Are you able to share the table descriptions?  (That might help us
> suggest an index or some such which might help.)
sure, here it is

# \d core_object
                            Table "public.core_object"
  Column   |  Type   |                        Modifiers
-----------+---------+----------------------------------------------------------
 id        | integer | not null default nextval('core_object_id_seq'::regclass)
 typeid_id | integer | not null
Indexes:
    "core_object_pkey" PRIMARY KEY, btree (id)
    "core_object_pkey_desc" btree (id DESC)
    "core_object_typeid_id" btree (typeid_id)
Foreign-key constraints:
    "core_object_typeid_id_fkey" FOREIGN KEY (typeid_id) REFERENCES core_obj_typeset(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
    TABLE "plugins_plugin_addr" CONSTRAINT "plugins_plugin_addr_oid_id_fkey" FOREIGN KEY (oid_id) REFERENCES core_object(id) DEFERRABLE INITIALLY DEFERRED
...and many others, so I skipped as irrelevant....

#  \d plugins_plugin_addr
                       Table "public.plugins_plugin_addr"
    Column     |  Type   |                            Modifiers
---------------+---------+------------------------------------------------------------------
 id            | integer | not null default nextval('plugins_plugin_addr_id_seq'::regclass)
 oid_id        | integer | not null
 sub_attrib_id | integer | not null
 address_id    | integer | not null
Indexes:
    "plugins_plugin_addr_pkey" PRIMARY KEY, btree (id)
    "plugins_plugin_addr_sub_attrib_id_key" UNIQUE, btree (sub_attrib_id)
    "plugins_plugin_addr_address_id" btree (address_id)
    "plugins_plugin_addr_oid_id" btree (oid_id)
Foreign-key constraints:
    "plugins_plugin_addr_address_id_fkey" FOREIGN KEY (address_id) REFERENCES plugins_guide_address(id) DEFERRABLE INITIALLY DEFERRED
    "plugins_plugin_addr_oid_id_fkey" FOREIGN KEY (oid_id) REFERENCES core_object(id) DEFERRABLE INITIALLY DEFERRED
    "plugins_plugin_addr_sub_attrib_id_fkey" FOREIGN KEY (sub_attrib_id) REFERENCES plugins_sub_attrib(id) DEFERRABLE INITIALLY DEFERRED

#  \d plugins_guide_address
                          Table "public.plugins_guide_address"
    Column    |          Type          |                             Modifiers
--------------+------------------------+--------------------------------------------------------------------
 id           | integer                | not null default nextval('plugins_guide_address_id_seq'::regclass)
 country_id   | integer                |
 region_id    | integer                |
 city_id      | integer                |
 zip_id       | integer                |
 street_id    | integer                |
 house        | character varying(20)  |
 district_id  | integer                |
 code         | character varying(23)  |
 significance | smallint               |
 alias_fr     | character varying(300) |
 alias_ru     | character varying(300) |
 alias_en     | character varying(300) |
 alias_de     | character varying(300) |
 alias_it     | character varying(300) |
 alias_len    | smallint               |
Indexes:
    "plugins_guide_address_pkey" PRIMARY KEY, btree (id)
    "plugins_guide_address_uniq" UNIQUE, btree (country_id, region_id, district_id, city_id, street_id, house)
    "plugins_guide_address_alias_ru" btree (alias_ru)
    "plugins_guide_address_city_id" btree (city_id)
    "plugins_guide_address_code" btree (code)
    "plugins_guide_address_country_id" btree (country_id)
    "plugins_guide_address_district_id" btree (district_id)
    "plugins_guide_address_house" btree (house)
    "plugins_guide_address_house_upper" btree (upper(house::text))
    "plugins_guide_address_region_id" btree (region_id)
    "plugins_guide_address_significance" btree (significance)
    "plugins_guide_address_street_id" btree (street_id)
    "plugins_guide_address_zip_id" btree (zip_id)
Foreign-key constraints:
    "plugins_guide_address_city_id_fkey" FOREIGN KEY (city_id) REFERENCES plugins_guide_city(id) DEFERRABLE INITIALLY DEFERRED
    "plugins_guide_address_country_id_fkey" FOREIGN KEY (country_id) REFERENCES plugins_guide_country(id) DEFERRABLE INITIALLY DEFERRED
    "plugins_guide_address_district_id_fkey" FOREIGN KEY (district_id) REFERENCES plugins_guide_district(id) DEFERRABLE INITIALLY DEFERRED
    "plugins_guide_address_region_id_fkey" FOREIGN KEY (region_id) REFERENCES plugins_guide_region(id) DEFERRABLE INITIALLY DEFERRED
    "plugins_guide_address_street_id_fkey" FOREIGN KEY (street_id) REFERENCES plugins_guide_street(id) DEFERRABLE INITIALLY DEFERRED
    "plugins_guide_address_zip_id_fkey" FOREIGN KEY (zip_id) REFERENCES plugins_guide_zip(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
    TABLE "plugins_guide_ziphelper" CONSTRAINT "plugins_guide_ziphelper_address_id_fkey" FOREIGN KEY (address_id) REFERENCES plugins_guide_address(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "plugins_plugin_addr" CONSTRAINT "plugins_plugin_addr_address_id_fkey" FOREIGN KEY (address_id) REFERENCES plugins_guide_address(id) DEFERRABLE INITIALLY DEFERRED

------------end---------------

From:
"Kevin Grittner"
Date:

norn <> wrote:

> I am wondering why there are so big gap between two limits and how
> to avoid this...

I think we've already established that it is because of the
percentage of the table which must be scanned to get to the desired
number of rows.  The problem is exacerbated by the fact that it's a
"backward" scan on the index, which is slower than a forward scan --
mainly because disks spin in one direction, and the spacing of the
sectors is optimized for forward scans.

There are a couple things to try which will give a more complete
picture of what might work to make the run time more predictable.
Please try these, and run EXPLAIN ANALYZE of your problem query each
way.

(1) Try it without the ORDER BY clause and the LIMIT.

(2) Temporarily take that top index out of consideration.  (Don't
worry, it'll come back when you issue the ROLLBACK -- just don't
forget the BEGIN statement.)

BEGIN;
DROP INDEX plugins_plugin_addr_oid_id;
explain analyze <your query>
ROLLBACK;

(3) Try it like this (untested, so you may need to fix it up):

explain analyze
SELECT core_object.id
  from (SELECT id, city_id FROM "plugins_guide_address")
       "plugins_guide_address"
  JOIN "plugins_plugin_addr"
    ON ("plugins_plugin_addr"."address_id"
       = "plugins_guide_address"."id")
  JOIN "core_object"
    ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
  WHERE "plugins_guide_address"."city_id" = 4535
  ORDER BY "core_object"."id" DESC
  LIMIT 4 -- or whatever it normally takes to cause the problem
;

-Kevin

From:
Chris Bowlby
Date:

I'm also wondering whether re-clustering the table on the index
that's used would help, such that:

CLUSTER plugins_plugin_addr USING plugins_plugin_addr_oid_id;

and then see if that makes any change in the differences you're seeing.
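For reference, a hedged sketch of what that would involve; note that CLUSTER takes an exclusive lock and rewrites the whole table, and the physical ordering is not maintained for rows added later, so it may need to be repeated:

```sql
-- Rewrite plugins_plugin_addr in oid_id order (ACCESS EXCLUSIVE lock,
-- so run during a maintenance window).
CLUSTER plugins_plugin_addr USING plugins_plugin_addr_oid_id;
ANALYZE plugins_plugin_addr;  -- refresh statistics after the rewrite
```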

On 04/13/2010 02:24 PM, Kevin Grittner wrote:
> norn <> wrote:
>
>
>> I am wondering why there are so big gap between two limits and how
>> to avoid this...
>>
>
> I think we've already established that it is because of the
> percentage of the table which must be scanned to get to the desired
> number of rows.  The problem is exacerbated by the fact that it's a
> "backward" scan on the index, which is slower than a forward scan --
> mainly because disks spin in one direction, and the spacing of the
> sectors is optimized for forward scans.
>
> There are a couple things to try which will give a more complete
> picture of what might work to make the run time more predictable.
> Please try these, and run EXPLAIN ANALYZE of your problem query each
> way.
>
> (1) Try it without the ORDER BY clause and the LIMIT.
>
> (2) Temporarily take that top index out of consideration.  (Don't
> worry, it'll come back when you issue the ROLLBACK -- just don't
> forget the BEGIN statement.)
>
> BEGIN;
> DROP INDEX plugins_plugin_addr_oid_id;
> explain analyze <your query>
> ROLLBACK;
>
> (3) Try it like this (untested, so you may need to fix it up):
>
> explain analyze
> SELECT core_object.id
>   from (SELECT id, city_id FROM "plugins_guide_address")
>        "plugins_guide_address"
>   JOIN "plugins_plugin_addr"
>     ON ("plugins_plugin_addr"."address_id"
>        = "plugins_guide_address"."id")
>   JOIN "core_object"
>     ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
>   WHERE "plugins_guide_address"."city_id" = 4535
>   ORDER BY "core_object"."id" DESC
>   LIMIT 4 -- or whatever it normally takes to cause the problem
> ;
>
> -Kevin
>
>


From:
"Kevin Grittner"
Date:

"Kevin Grittner" <> wrote:

> (3) Try it like this (untested, so you may need to fix it up):
>
> explain analyze
> SELECT core_object.id
>   from (SELECT id, city_id FROM "plugins_guide_address")
>        "plugins_guide_address"
>   JOIN "plugins_plugin_addr"
>     ON ("plugins_plugin_addr"."address_id"
>        = "plugins_guide_address"."id")
>   JOIN "core_object"
>     ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
>   WHERE "plugins_guide_address"."city_id" = 4535
>   ORDER BY "core_object"."id" DESC
>   LIMIT 4 -- or whatever it normally takes to cause the problem
> ;

Hmph.  I see I didn't take that quite where I intended.
Forget the above and try this:

explain analyze
SELECT core_object.id
  from (SELECT id, city_id FROM "plugins_guide_address"
          WHERE "city_id" = 4535) "plugins_guide_address"
  JOIN "plugins_plugin_addr"
    ON ("plugins_plugin_addr"."address_id"
       = "plugins_guide_address"."id")
  JOIN "core_object"
    ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
  ORDER BY "core_object"."id" DESC
  LIMIT 4 -- or whatever it normally takes to cause the problem
;

-Kevin

From:
norn
Date:

Kevin,
thanks for your time!
Here are the requested tests.

> (1) Try it without the ORDER BY clause and the LIMIT.
W/o the 'order by' it works instantly (about 1ms!)
 Limit  (cost=0.00..3.59 rows=5 width=4) (actual time=0.127..0.229 rows=5 loops=1)
   ->  Nested Loop  (cost=0.00..277863.53 rows=386544 width=4) (actual time=0.125..0.224 rows=5 loops=1)
         ->  Nested Loop  (cost=0.00..91136.78 rows=386544 width=4) (actual time=0.106..0.154 rows=5 loops=1)
               ->  Index Scan using plugins_guide_address_city_id on plugins_guide_address  (cost=0.00..41109.07 rows=27673 width=4) (actual time=0.068..0.080 rows=5 loops=1)
                     Index Cond: (city_id = 4535)
               ->  Index Scan using plugins_plugin_addr_address_id on plugins_plugin_addr  (cost=0.00..1.63 rows=14 width=8) (actual time=0.011..0.012 rows=1 loops=5)
                     Index Cond: (plugins_plugin_addr.address_id = plugins_guide_address.id)
         ->  Index Scan using core_object_pkey on core_object  (cost=0.00..0.47 rows=1 width=4) (actual time=0.011..0.012 rows=1 loops=5)
               Index Cond: (core_object.id = plugins_plugin_addr.oid_id)
 Total runtime: 0.328 ms
(10 rows)


W/o the limit it takes 1.4 seconds, which is still better than...
 Sort  (cost=199651.74..200618.10 rows=386544 width=4) (actual time=1153.167..1157.841 rows=43898 loops=1)
   Sort Key: core_object.id
   Sort Method:  quicksort  Memory: 3594kB
   ->  Hash Join  (cost=81234.35..163779.93 rows=386544 width=4) (actual time=122.050..1128.909 rows=43898 loops=1)
         Hash Cond: (core_object.id = plugins_plugin_addr.oid_id)
         ->  Seq Scan on core_object  (cost=0.00..46467.07 rows=3221307 width=4) (actual time=0.011..378.677 rows=3221349 loops=1)
         ->  Hash  (cost=76402.55..76402.55 rows=386544 width=4) (actual time=121.170..121.170 rows=43898 loops=1)
               ->  Nested Loop  (cost=368.81..76402.55 rows=386544 width=4) (actual time=8.645..104.842 rows=43898 loops=1)
                     ->  Bitmap Heap Scan on plugins_guide_address  (cost=368.81..26374.83 rows=27673 width=4) (actual time=8.599..15.590 rows=26583 loops=1)
                           Recheck Cond: (city_id = 4535)
                           ->  Bitmap Index Scan on plugins_guide_address_city_id  (cost=0.00..361.89 rows=27673 width=0) (actual time=7.856..7.856 rows=26583 loops=1)
                                 Index Cond: (city_id = 4535)
                     ->  Index Scan using plugins_plugin_addr_address_id on plugins_plugin_addr  (cost=0.00..1.63 rows=14 width=8) (actual time=0.002..0.003 rows=2 loops=26583)
                           Index Cond: (plugins_plugin_addr.address_id = plugins_guide_address.id)
 Total runtime: 1162.193 ms
(15 rows)

>(2) Temporarily take that top index out of consideration
It works nicely! The query takes about 0.6 seconds, as expected!

explain analyze SELECT core_object.id from "core_object" INNER JOIN "plugins_plugin_addr" ON ("core_object"."id" = "plugins_plugin_addr"."oid_id") INNER JOIN "plugins_guide_address" ON ("plugins_plugin_addr"."address_id" = "plugins_guide_address"."id") WHERE "plugins_guide_address"."city_id" = 4535 ORDER BY "core_object"."id" DESC;

 Limit  (cost=112274.36..112275.66 rows=5 width=4) (actual time=200.758..637.039 rows=5 loops=1)
   ->  Merge Join  (cost=112274.36..213042.22 rows=386544 width=4) (actual time=200.754..637.035 rows=5 loops=1)
         Merge Cond: (core_object.id = plugins_plugin_addr.oid_id)
         ->  Index Scan Backward using core_object_pkey on core_object  (cost=0.00..86916.44 rows=3221307 width=4) (actual time=0.115..302.512 rows=1374693 loops=1)
         ->  Sort  (cost=112274.36..113240.72 rows=386544 width=4) (actual time=154.635..154.635 rows=5 loops=1)
               Sort Key: plugins_plugin_addr.oid_id
               Sort Method:  quicksort  Memory: 3594kB
               ->  Nested Loop  (cost=368.81..76402.55 rows=386544 width=4) (actual time=9.522..126.206 rows=43898 loops=1)
                     ->  Bitmap Heap Scan on plugins_guide_address  (cost=368.81..26374.83 rows=27673 width=4) (actual time=9.367..21.311 rows=26583 loops=1)
                           Recheck Cond: (city_id = 4535)
                           ->  Bitmap Index Scan on plugins_guide_address_city_id  (cost=0.00..361.89 rows=27673 width=0) (actual time=8.577..8.577 rows=26583 loops=1)
                                 Index Cond: (city_id = 4535)
                     ->  Index Scan using plugins_plugin_addr_address_id on plugins_plugin_addr  (cost=0.00..1.63 rows=14 width=8) (actual time=0.002..0.003 rows=2 loops=26583)
                           Index Cond: (plugins_plugin_addr.address_id = plugins_guide_address.id)
 Total runtime: 637.620 ms
(15 rows)


> (3) Try it like this (untested, so you may need to fix it up):
explain analyze
SELECT core_object.id
  from (SELECT id, city_id FROM "plugins_guide_address"
          WHERE "city_id" = 4535) "plugins_guide_address"
  JOIN "plugins_plugin_addr"
    ON ("plugins_plugin_addr"."address_id"
       = "plugins_guide_address"."id")
  JOIN "core_object"
    ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
  ORDER BY "core_object"."id" DESC
  LIMIT 5;
 Limit  (cost=0.00..11.51 rows=5 width=4) (actual time=494.600..4737.867 rows=5 loops=1)
   ->  Merge Join  (cost=0.00..889724.50 rows=386544 width=4) (actual time=494.599..4737.862 rows=5 loops=1)
         Merge Cond: (plugins_plugin_addr.oid_id = core_object.id)
         ->  Nested Loop  (cost=0.00..789923.00 rows=386544 width=4) (actual time=450.359..4269.608 rows=5 loops=1)
               ->  Index Scan Backward using plugins_plugin_addr_oid_id on plugins_plugin_addr  (cost=0.00..45740.51 rows=1751340 width=8) (actual time=0.038..321.285 rows=1374690 loops=1)
               ->  Index Scan using plugins_guide_address_pkey on plugins_guide_address  (cost=0.00..0.41 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=1374690)
                     Index Cond: (public.plugins_guide_address.id = plugins_plugin_addr.address_id)
                     Filter: (public.plugins_guide_address.city_id = 4535)
         ->  Index Scan Backward using core_object_pkey on core_object  (cost=0.00..86916.44 rows=3221307 width=4) (actual time=0.008..288.625 rows=1374693 loops=1)
 Total runtime: 4737.964 ms
(10 rows)

So, as we can see, dropping the index may help, but why? What should
I do in my particular situation? I hope that analyzing my tests helps
you give some recommendations! :)

Thanks again for your time!

On Apr 14, 10:31 pm, Kevin.Gritt...@wicourts.gov ("Kevin Grittner")
wrote:
> "Kevin Grittner" <Kevin.Gritt...@wicourts.gov> wrote:
> > (3) Try it like this (untested, so you may need to fix it up):
>
> > explain analyze
> > SELECT core_object.id
> >   from (SELECT id, city_id FROM "plugins_guide_address")
> >        "plugins_guide_address"
> >   JOIN "plugins_plugin_addr"
> >     ON ("plugins_plugin_addr"."address_id"
> >        = "plugins_guide_address"."id")
> >   JOIN "core_object"
> >     ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
> >   WHERE "plugins_guide_address"."city_id" = 4535
> >   ORDER BY "core_object"."id" DESC
> >   LIMIT 4 -- or whatever it normally takes to cause the problem
> > ;
>
> Hmph.  I see I didn't take that quite where I intended.
> Forget the above and try this:
>
> explain analyze
> SELECT core_object.id
>   from (SELECT id, city_id FROM "plugins_guide_address"
>           WHERE "city_id" = 4535) "plugins_guide_address"
>   JOIN "plugins_plugin_addr"
>     ON ("plugins_plugin_addr"."address_id"
>        = "plugins_guide_address"."id")
>   JOIN "core_object"
>     ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
>   ORDER BY "core_object"."id" DESC
>   LIMIT 4 -- or whatever it normally takes to cause the problem
> ;
>
> -Kevin
>
> --
> Sent via pgsql-performance mailing list (pgsql-performa...@postgresql.org)
> To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-performance


From:
"Kevin Grittner"
Date:

norn <> wrote:

>> (1) Try it without the ORDER BY clause and the LIMIT.
> W/o the 'order by' it works instantly (about 1ms!)

> W/o the limit it takes 1.4 seconds

>>(2) Temporarily take that top index out of consideration
> It works nice! Query takes about 0.6 seconds as expected!

> So, as we can see, dropping index may help, but why? What shall I
> do in my particular situation? Probably analyzing my tests help
> you giving some recommendations, I hope so! :)

The combination of the ORDER BY DESC and the LIMIT causes it to
think it can get the right data most quickly by scanning backwards
on the index.  It's wrong about that.  With the information from the
additional plans, it seems that this bad estimate might be why it's
not recognizing the plan which is actually four orders of magnitude
faster:

Index Scan using plugins_guide_address_city_id
  on plugins_guide_address
  Index Cond: (city_id = 4535)
  estimated rows=27673
  actual rows=5

Try this:

ALTER TABLE ALTER plugins_guide_address
  ALTER COLUMN city_id SET STATISTICS 1000;
ANALYZE plugins_guide_address;

Then try your query.
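If it helps, the effect of the higher statistics target can be checked afterwards in the standard pg_stats view; a sketch, assuming the column names from this thread:

```sql
-- After the ANALYZE, verify the planner's view of city_id:
-- n_distinct and the number of "most common values" tracked should
-- reflect the larger statistics target.
SELECT n_distinct,
       array_length(most_common_freqs, 1) AS n_mcv
  FROM pg_stats
 WHERE tablename = 'plugins_guide_address'
   AND attname = 'city_id';
```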

I have one more diagnostic query to test, if the above doesn't work:

explain analyze
SELECT id FROM
  (
    SELECT core_object.id
      FROM "core_object"
      JOIN "plugins_plugin_addr"
        ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
      JOIN "plugins_guide_address"
        ON ("plugins_plugin_addr"."address_id" =
            "plugins_guide_address"."id")
      WHERE "plugins_guide_address"."city_id" = 4535
  ) x
  ORDER BY id DESC
  LIMIT 4;

-Kevin

From:
"Kevin Grittner"
Date:

I wrote:

> ALTER TABLE ALTER plugins_guide_address
>   ALTER COLUMN city_id SET STATISTICS 1000;

One too many ALTERs in there.  Should be:

ALTER TABLE plugins_guide_address
  ALTER COLUMN city_id SET STATISTICS 1000;

-Kevin

From:
norn
Date:

> Try this:
>
> ALTER TABLE ALTER plugins_guide_address
>   ALTER COLUMN city_id SET STATISTICS 1000;
> ANALYZE plugins_guide_address;
>
> Then try your query.
No luck... The same query time...

> I have one more diagnostic query to test, if the above doesn't work:
>
> explain analyze
> SELECT id FROM
>   (
>     SELECT core_object.id
>       FROM "core_object"
>       JOIN "plugins_plugin_addr"
>         ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
>       JOIN "plugins_guide_address"
>         ON ("plugins_plugin_addr"."address_id" =
>             "plugins_guide_address"."id")
>       WHERE "plugins_guide_address"."city_id" = 4535
>   ) x
>   ORDER BY id DESC
>   LIMIT 4;

 Limit  (cost=0.00..8.29 rows=4 width=4) (actual time=0.284..1322.792 rows=4 loops=1)
   ->  Merge Join  (cost=0.00..993770.68 rows=479473 width=4) (actual time=0.281..1322.787 rows=4 loops=1)
         Merge Cond: (plugins_plugin_addr.oid_id = core_object.id)
         ->  Nested Loop  (cost=0.00..887841.46 rows=479473 width=4) (actual time=0.194..1201.318 rows=4 loops=1)
               ->  Index Scan Backward using plugins_plugin_addr_oid_id on plugins_plugin_addr  (cost=0.00..51546.26 rows=1980627 width=8) (actual time=0.117..87.035 rows=359525 loops=1)
               ->  Index Scan using plugins_guide_address_pkey on plugins_guide_address  (cost=0.00..0.41 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=359525)
                     Index Cond: (plugins_guide_address.id = plugins_plugin_addr.address_id)
                     Filter: (plugins_guide_address.city_id = 4535)
         ->  Index Scan Backward using core_object_pkey on core_object  (cost=0.00..91309.16 rows=3450658 width=4) (actual time=0.079..73.071 rows=359525 loops=1)
 Total runtime: 1323.065 ms
(10 rows)


From:
norn
Date:

On Apr 21, 9:52 pm, Kevin.Gritt...@wicourts.gov ("Kevin Grittner")
wrote:
> I wrote:
> > ALTER TABLE ALTER plugins_guide_address
> >   ALTER COLUMN city_id SET STATISTICS 1000;
>
> One too many ALTERs in there.  Should be:
>
> ALTER TABLE plugins_guide_address
>   ALTER COLUMN city_id SET STATISTICS 1000;


Yeah, I noticed that and ran it correctly.