Обсуждение: Planner create a slow plan without an available index

Поиск
Список
Период
Сортировка

Planner create a slow plan without an available index

От
Ben-Nes Yonatan
Дата:
Hi All,

I got a weird problem with the planner which cause my queries to take
ages... ill try to explain it shortly and summarized... :)

I got the following table (which got 1.2 million rows):

                      Table "public.items"
            Column           |     Type     |      Modifiers
----------------------------+--------------+---------------------
  items_id                   | text         | not null
  price                    | numeric(8,2) | not null
  left                       | integer      |
  right                      | integer      |
Indexes:
     "items_items_id_key" UNIQUE, btree (items_id)
     "items_left" btree (left)
     "items_left_right" btree (left, right)

 From that table I created the next table in order to save "ORDER BY
price" at the queries:

bh.com=# CREATE TABLE items_price AS SELECT * FROM items ORDER BY price;

After the creation of the table I created indexes which are exactly the
same as the items table has (the source table).
Later I ran on both tables "VACUUM FULL ANALYZE".


Now here start the weird stuff....

bh.com=# EXPLAIN ANALYZE SELECT * FROM items WHERE left>=(SELECT left
FROM category WHERE category_id=821) AND right<=(SELECT right FROM
category WHERE category_id=821) OFFSET 24 LIMIT 13;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=58.27..86.55 rows=13 width=619) (actual
time=0.811..130.993 rows=9 loops=1)
    InitPlan
      ->  Index Scan using category_pkey on category  (cost=0.00..3.03
rows=1 width=4) (actual time=0.118..0.124 rows=1 loops=1)
            Index Cond: (category_id = 821)
      ->  Index Scan using category_pkey on category  (cost=0.00..3.03
rows=1 width=4) (actual time=0.012..0.016 rows=1 loops=1)
            Index Cond: (category_id = 821)
    ->  Index Scan using items_left_right on items
(cost=0.00..294897.72 rows=135553 width=619) (actual time=0.314..130.815
rows=33 loops=1)
          Index Cond: ((left >= $0) AND (right <= $1))
  Total runtime: 131.140 ms
(9 rows)

bh.com=# ANALYZE items;
ANALYZE
bh.com=# EXPLAIN ANALYZE SELECT * FROM items WHERE left>=(SELECT left
FROM category WHERE category_id=821) AND right<=(SELECT right FROM
category WHERE category_id=821) OFFSET 24 LIMIT 13;


QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=57.11..84.77 rows=13 width=626) (actual
time=45.512..145316.423 rows=9 loops=1)
    InitPlan
      ->  Index Scan using category_pkey on category  (cost=0.00..3.03
rows=1 width=4) (actual time=0.185..0.191 rows=1 loops=1)
            Index Cond: (category_id = 821)
      ->  Index Scan using category_pkey on category  (cost=0.00..3.03
rows=1 width=4) (actual time=0.026..0.032 rows=1 loops=1)
            Index Cond: (category_id = 821)
    ->  Index Scan using items_left on items  (cost=0.00..293408.52
rows=137924 width=626) (actual time=45.008..145316.246 rows=33 loops=1)
          Index Cond: (left >= $0)
          Filter: (right <= $1)
  Total runtime: 145316.590 ms
(10 rows)


The "ANALYZE items" actually made the planner work without the INDEX and
by that the query became a lot slower! after running VACUUM ANALYZE on
the items table I receive good results back again.
Now I do know the diffrence between the 2 actions (VACUUM ANALYZE vs.
ANALYZE) but whats bug me is that when I do the exact same operations on
items_price (which is the same table exactly with the same indexes just
ordered diffrently) I receive a slow result no matter what I do!

I tried to mess with "ALTER TABLE items_price ALTER right SET STATISTICS
;" (and also on left) with diffrent values up to even 1000 but that
didnt help a bit (I did ran VACUUM ANALYZE after each change).

I'm quite clueless and also quite in a hurry to finish this project so
any help or a piece of clue will be welcomed gladly!

Thanks alot in advance (even only for reading what I wrote :P),
Ben-Nes Yonatan
Canaan Surfing ltd.
http://www.canaan.net.il

Re: Planner create a slow plan without an available index

От
Tom Lane
Дата:
Ben-Nes Yonatan <da@canaan.co.il> writes:
> Indexes:
>      "items_items_id_key" UNIQUE, btree (items_id)
>      "items_left" btree (left)
>      "items_left_right" btree (left, right)

You could get rid of the items_left index --- it's redundant with the
first column of the combined index anyway.

> bh.com=# EXPLAIN ANALYZE SELECT * FROM items WHERE left>=(SELECT left
> FROM category WHERE category_id=821) AND right<=(SELECT right FROM
> category WHERE category_id=821) OFFSET 24 LIMIT 13;

Doing OFFSET/LIMIT without an ORDER BY is just asking for trouble.
If you were to specify "ORDER BY left, right" that would probably
convince the planner to use the index you want.

However ... this query is basically going to suck with any btree index,
because btree can't usefully do range checks on two separate variables.
There's an exactly similar problem being discussed over in pgsql-novice:
http://archives.postgresql.org/pgsql-novice/2005-08/msg00243.php

            regards, tom lane

Re: Planner create a slow plan without an available index

От
Ben-Nes Yonatan
Дата:
Tom Lane wrote:
> Ben-Nes Yonatan <da@canaan.co.il> writes:
>
>>Indexes:
>>     "items_items_id_key" UNIQUE, btree (items_id)
>>     "items_left" btree (left)
>>     "items_left_right" btree (left, right)
>
>
> You could get rid of the items_left index --- it's redundant with the
> first column of the combined index anyway.
>
>
>>bh.com=# EXPLAIN ANALYZE SELECT * FROM items WHERE left>=(SELECT left
>>FROM category WHERE category_id=821) AND right<=(SELECT right FROM
>>category WHERE category_id=821) OFFSET 24 LIMIT 13;
>
>
> Doing OFFSET/LIMIT without an ORDER BY is just asking for trouble.
> If you were to specify "ORDER BY left, right" that would probably
> convince the planner to use the index you want.
>
> However ... this query is basically going to suck with any btree index,
> because btree can't usefully do range checks on two separate variables.
> There's an exactly similar problem being discussed over in pgsql-novice:
> http://archives.postgresql.org/pgsql-novice/2005-08/msg00243.php
>
>             regards, tom lane

First of all thanks I did succed to use the index that way and to
receive less then 80ms responds, but if imporvement is possible I would
like to do it.

If btree index is not suitable for this query then which index is? as
far as I understand the rtree index doesnt support range checks and the
hash index is not recommended by almost everyone (including the manual)
so the only one left is the gist, is that the most suitable index for
this query? if so can you give me a link as to where I can learn how to
use such an index efficently? (by the way the only link that worked at
the postgresql manual "Chapter 48. GiST Indexes" is the one which direct
to "the University of California at Berkeley's GiST Indexing Project web
site" the other 2 links direct to 404 pages and I guess that they should
be removed).

Thanks alot,
    Ben-Nes Yonatan

Re: Planner create a slow plan without an available index

От
Martijn van Oosterhout
Дата:
On Tue, Aug 30, 2005 at 11:25:26AM +0200, Ben-Nes Yonatan wrote:
> Tom Lane wrote:
> >However ... this query is basically going to suck with any btree index,
> >because btree can't usefully do range checks on two separate variables.
> >There's an exactly similar problem being discussed over in pgsql-novice:
> >http://archives.postgresql.org/pgsql-novice/2005-08/msg00243.php
>
> If btree index is not suitable for this query then which index is? as
> far as I understand the rtree index doesnt support range checks and the
> hash index is not recommended by almost everyone (including the manual)
> so the only one left is the gist, is that the most suitable index for
> this query? if so can you give me a link as to where I can learn how to
> use such an index efficently? (by the way the only link that worked at
> the postgresql manual "Chapter 48. GiST Indexes" is the one which direct
> to "the University of California at Berkeley's GiST Indexing Project web
> site" the other 2 links direct to 404 pages and I guess that they should
> be removed).

Basically, no index is really setup for the query as you wrote it.
Indexes generally improve performance by taking advantage of order. You
have two constants (the two subqueries) and two variables (left and
right). Andyou applying range range (not equality) to each one. There
is no way to order an index that would give the answer you want.

rtree works on multidimesional (geometric) data. It can do range tests
(is object A to the left of object B) but it's only applicable if your
conditions can be interpreted that way.

GiST is for creating custom index types, hardly likely to be useful
in your case.

I can't tell from your query (and PostgreSQL certainly can't) but are
there other constraints such left <= right or something similar for the
constants. If so, maybe adding that will help PostgreSQL optimise your
query. Maybe indexing on (left+right) might work for you but it all
depends on the structure of your data.

If you want more help, you're going to need to provide information on
your database including what left, right and items actually are.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Вложения

Re: Planner create a slow plan without an available index

От
Bruno Wolff III
Дата:
On Tue, Aug 30, 2005 at 11:25:26 +0200,
  Ben-Nes Yonatan <da@canaan.co.il> wrote:
>
> If btree index is not suitable for this query then which index is? as
> far as I understand the rtree index doesnt support range checks and the
> hash index is not recommended by almost everyone (including the manual)
> so the only one left is the gist, is that the most suitable index for
> this query? if so can you give me a link as to where I can learn how to
> use such an index efficently? (by the way the only link that worked at
> the postgresql manual "Chapter 48. GiST Indexes" is the one which direct
> to "the University of California at Berkeley's GiST Indexing Project web
> site" the other 2 links direct to 404 pages and I guess that they should
> be removed).

rtree indexes allow you to quickly check for containment. Range checking
is one dimensional containment.

Re: Planner create a slow plan without an available index

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> rtree works on multidimesional (geometric) data. It can do range tests
> (is object A to the left of object B) but it's only applicable if your
> conditions can be interpreted that way.

> GiST is for creating custom index types, hardly likely to be useful
> in your case.

Actually either rtree or GIST should be able to do something useful with
this, since it's basically a 1-D overlap query.  The main problem with
GIST is to find a suitable opclass, since there aren't any in the core
system.  Possibly contrib/seg could be used.

            regards, tom lane

Re: Planner create a slow plan without an available index

От
Ben-Nes Yonatan
Дата:
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>
>>rtree works on multidimesional (geometric) data. It can do range tests
>>(is object A to the left of object B) but it's only applicable if your
>>conditions can be interpreted that way.
>
>
>>GiST is for creating custom index types, hardly likely to be useful
>>in your case.
>
>
> Actually either rtree or GIST should be able to do something useful with
> this, since it's basically a 1-D overlap query.  The main problem with
> GIST is to find a suitable opclass, since there aren't any in the core
> system.  Possibly contrib/seg could be used.
>
>             regards, tom lane

Ok first of all thanks guys as always for your help, and I will try to
use rtree to improve my query (hopefuly ill be able to come back and say
that it worked :)).

I got another question which is not connected and probably its just me
being paranoid late at night but still... :)

at chapter "21.1.3. Preventing transaction ID wraparound failures" of
the postgresql manual its written the following info:
"Prior to PostgreSQL 7.2, the only defense against XID wraparound was to
re-initdb at least every 4 billion transactions. This of course was not
very satisfactory for high-traffic sites, so a better solution has been
devised. The new approach allows a server to remain up indefinitely,
without initdb or any sort of restart. The price is this maintenance
requirement: every table in the database must be vacuumed at least once
every billion transactions."
My postgresql version is 8.01 (I should have mentioned that at start no? :))

Now again im probably just paranoid but when I'm starting a transaction
and in it im making more then 4 billions diffrent queries
(select,insert,update,truncate...) and then im closing it, its counted
as only one transaction right? (should I duck to avoid the manual? ;))

As always thanks alot!
Ben-Nes Yonatan

Re: Planner create a slow plan without an available index

От
Alvaro Herrera
Дата:
On Wed, Aug 31, 2005 at 01:27:30AM +0200, Ben-Nes Yonatan wrote:

> Now again im probably just paranoid but when I'm starting a transaction
> and in it im making more then 4 billions diffrent queries
> (select,insert,update,truncate...) and then im closing it, its counted
> as only one transaction right? (should I duck to avoid the manual? ;))

Yes, one transaction.  You cannot do more than 4 billion commands -- the
limit is 2^32 anyway.

--
Alvaro Herrera <alvherre[]alvh.no-ip.org>      Architect, www.EnterpriseDB.com
"Los románticos son seres que mueren de deseos de vida"

Re: Planner create a slow plan without an available index

От
Bruno Wolff III
Дата:
On Wed, Aug 31, 2005 at 01:27:30 +0200,
  Ben-Nes Yonatan <da@canaan.co.il> wrote:
>
> Now again im probably just paranoid but when I'm starting a transaction
> and in it im making more then 4 billions diffrent queries
> (select,insert,update,truncate...) and then im closing it, its counted
> as only one transaction right? (should I duck to avoid the manual? ;))

I believe there is a limit on the number of queries in a transaction of
2 or 4 billion (though this may be just in functions).

Ignoring subtransactions, all these queries count as just one transaction.
I am not sure how subtransactions are counted.

Re: Planner create a slow plan without an available index

От
Alvaro Herrera
Дата:
On Tue, Aug 30, 2005 at 10:39:57PM -0500, Bruno Wolff III wrote:
> On Wed, Aug 31, 2005 at 01:27:30 +0200,
>   Ben-Nes Yonatan <da@canaan.co.il> wrote:
> >
> > Now again im probably just paranoid but when I'm starting a transaction
> > and in it im making more then 4 billions diffrent queries
> > (select,insert,update,truncate...) and then im closing it, its counted
> > as only one transaction right? (should I duck to avoid the manual? ;))
>
> I believe there is a limit on the number of queries in a transaction of
> 2 or 4 billion (though this may be just in functions).
>
> Ignoring subtransactions, all these queries count as just one transaction.
> I am not sure how subtransactions are counted.

If the subtransaction writes at least a tuple, it counts as another
transaction.  Else it doesn't count.

--
Alvaro Herrera <alvherre[]alvh.no-ip.org>      Architect, www.EnterpriseDB.com
"I'm always right, but sometimes I'm more right than other times."
                                                  (Linus Torvalds)

Re: Planner create a slow plan without an available index

От
Ben-Nes Yonatan
Дата:
Alvaro Herrera wrote:
> On Tue, Aug 30, 2005 at 10:39:57PM -0500, Bruno Wolff III wrote:
>
>>On Wed, Aug 31, 2005 at 01:27:30 +0200,
>>  Ben-Nes Yonatan <da@canaan.co.il> wrote:
>>
>>>Now again im probably just paranoid but when I'm starting a transaction
>>>and in it im making more then 4 billions diffrent queries
>>>(select,insert,update,truncate...) and then im closing it, its counted
>>>as only one transaction right? (should I duck to avoid the manual? ;))
>>
>>I believe there is a limit on the number of queries in a transaction of
>>2 or 4 billion (though this may be just in functions).
>>
>>Ignoring subtransactions, all these queries count as just one transaction.
>>I am not sure how subtransactions are counted.
>
>
> If the subtransaction writes at least a tuple, it counts as another
> transaction.  Else it doesn't count.
>

Oh crap I fear that now im in serious troubles....
Where can I read about this limitation? and beside that what if I count
the number of queries and every 900,000 or so I create a subtransaction
and continue my process with it, will that work or I'm just trying to be
a smart ass with the db?

As always thanks alot,
    Ben-Nes Yonatan

Re: Planner create a slow plan without an available index

От
Martijn van Oosterhout
Дата:
On Wed, Aug 31, 2005 at 09:19:05AM +0200, Ben-Nes Yonatan wrote:
> >If the subtransaction writes at least a tuple, it counts as another
> >transaction.  Else it doesn't count.
> >
>
> Oh crap I fear that now im in serious troubles....
> Where can I read about this limitation? and beside that what if I count
> the number of queries and every 900,000 or so I create a subtransaction
> and continue my process with it, will that work or I'm just trying to be
> a smart ass with the db?

Um, 1 billion transactions is 1 thousand million. So 900,000
inserts/updates are not even one tenth of one percent of the limit for
one transaction.

Are you really approaching a billion inserts/updates per transaction?
That's alot of diskspace being used...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Вложения

Re: Planner create a slow plan without an available index

От
Ben-Nes Yonatan
Дата:
Martijn van Oosterhout wrote:
> On Wed, Aug 31, 2005 at 09:19:05AM +0200, Ben-Nes Yonatan wrote:
>
>>>If the subtransaction writes at least a tuple, it counts as another
>>>transaction.  Else it doesn't count.
>>>
>>
>>Oh crap I fear that now im in serious troubles....
>>Where can I read about this limitation? and beside that what if I count
>>the number of queries and every 900,000 or so I create a subtransaction
>>and continue my process with it, will that work or I'm just trying to be
>>a smart ass with the db?
>
>
> Um, 1 billion transactions is 1 thousand million. So 900,000
> inserts/updates are not even one tenth of one percent of the limit for
> one transaction.
>
> Are you really approaching a billion inserts/updates per transaction?
> That's alot of diskspace being used...
>
> Have a nice day,

No apprantly I just lack a decent sleep.... I think that ill stop ask
you guys questions before you will decide to get your clubs out... :P
In other words I was mistaken and thought about a million and not a
billion :)

With hopes that this is the end of my bugging :)
Thanks alot,
    Ben-Nes Yonatan