Обсуждение: A slow query

Поиск
Список
Период
Сортировка

A slow query

От
Alban Hertroys
Дата:
Hi all,

We're using some 3rd party product that uses inheritence, and the
following query is rather slow on PostgreSQL 7.4.7 (debian stable). Any
suggestions how to speed it up?

explain analyze SELECT
otype,owner,rnumber,dir,number,dnumber,pos,snumber FROM mm_posrel posrel
ORDER BY number DESC LIMIT 25;

QUERY PLAN


------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=7996.04..7996.10 rows=25 width=60) (actual
time=2329.505..2329.767 rows=25 loops=1)
    ->  Sort  (cost=7996.04..8157.42 rows=64553 width=60) (actual
time=2329.495..2329.585 rows=25 loops=1)
          Sort Key: posrel.number
          ->  Result  (cost=0.00..1510.51 rows=64553 width=60) (actual
time=0.045..1644.541 rows=75597 loops=1)
                ->  Append  (cost=0.00..1510.51 rows=64553 width=60)
(actual time=0.034..977.543 rows=75597 loops=1)
                      ->  Seq Scan on mm_posrel posrel
(cost=0.00..1510.51 rows=64551 width=39) (actual time=0.027..436.501
rows=75597 loops=1)
                      ->  Seq Scan on mm_menu_item posrel
(cost=0.00..0.00 rows=1 width=60) (actual time=0.005..0.005 rows=0 loops=1)
                      ->  Seq Scan on mm_cms_operation posrel
(cost=0.00..0.00 rows=1 width=60) (actual time=0.005..0.005 rows=0 loops=1)
  Total runtime: 2332.136 ms
(9 rows)


The tables look like (I added the pkeys after the indexes on number, it
didn't change the problem):

Table "public.mm_posrel"
  Column  |  Type   | Modifiers
---------+---------+-----------
  number  | integer | not null
  otype   | integer | not null
  owner   | text    | not null
  snumber | integer | not null
  dnumber | integer | not null
  rnumber | integer | not null
  dir     | integer |
  pos     | integer |
Indexes:
     "mm_posrel_pkey" primary key, btree (number)
     "mm_posrel_dnumber_idx" btree (dnumber)
     "mm_posrel_number_idx" btree (number)
     "mm_posrel_rnumber_idx" btree (rnumber)
     "mm_posrel_snumber_idx" btree (snumber)
Inherits: mm_insrel


Table "public.mm_menu_item"
  Column  |  Type   | Modifiers
---------+---------+-----------
  number  | integer | not null
  otype   | integer | not null
  owner   | text    | not null
  snumber | integer | not null
  dnumber | integer | not null
  rnumber | integer | not null
  dir     | integer |
  pos     | integer |
  name    | text    | not null
Indexes:
     "mm_menu_item_pkey" primary key, btree (number)
     "mm_menu_item_dnumber_idx" btree (dnumber)
     "mm_menu_item_number_idx" btree (number)
     "mm_menu_item_rnumber_idx" btree (rnumber)
     "mm_menu_item_snumber_idx" btree (snumber)
Inherits: mm_posrel

Table "public.mm_cms_operation"
  Column  |  Type   | Modifiers
---------+---------+-----------
  number  | integer | not null
  otype   | integer | not null
  owner   | text    | not null
  snumber | integer | not null
  dnumber | integer | not null
  rnumber | integer | not null
  dir     | integer |
  pos     | integer |
  m_type  | text    | not null
  getvars | text    |
Indexes:
     "mm_cms_operation_pkey" primary key, btree (number)
     "mm_cms_operation_dnumber_idx" btree (dnumber)
     "mm_cms_operation_number_idx" btree (number)
     "mm_cms_operation_rnumber_idx" btree (rnumber)
     "mm_cms_operation_snumber_idx" btree (snumber)
Inherits: mm_posrel

Table "public.mm_insrel"
  Column  |  Type   | Modifiers
---------+---------+-----------
  number  | integer | not null
  otype   | integer | not null
  owner   | text    | not null
  snumber | integer | not null
  dnumber | integer | not null
  rnumber | integer | not null
  dir     | integer |
Indexes:
     "mm_insrel_dnumber_idx" btree (dnumber)
     "mm_insrel_number_idx" btree (number)
     "mm_insrel_rnumber_idx" btree (rnumber)
     "mm_insrel_snumber_idx" btree (snumber)
Inherits: mm_object

Table "public.mm_object"
  Column |  Type   | Modifiers
--------+---------+-----------
  number | integer | not null
  otype  | integer | not null
  owner  | text    | not null
Indexes:
     "mm_object_pkey" primary key, btree (number)



--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: A slow query

От
Alban Hertroys
Дата:
Alban Hertroys wrote:
> Hi all,
>
> We're using some 3rd party product that uses inheritence, and the
> following query is rather slow on PostgreSQL 7.4.7 (debian stable). Any
> suggestions how to speed it up?

A few more datapoints:
- Database was vacuum full analyzed just before the query.
- The same query on mm_insrel takes just over 11s.
- The actual and estimated row numbers are accurate.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: A slow query - Help please?

От
Alban Hertroys
Дата:
Alban Hertroys wrote:
> Hi all,
>
> We're using some 3rd party product that uses inheritence, and the
> following query is rather slow on PostgreSQL 7.4.7 (debian stable). Any
> suggestions how to speed it up?

We really need this solved. Isn't anybody able to shed some light on
this? Is it possible to make this query use an index scan, preferably
w/o disabling sequential scanning?

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: A slow query - Help please?

От
"hubert depesz lubaczewski"
Дата:
On 6/16/06, Alban Hertroys <alban@magproductions.nl> wrote:
We really need this solved. Isn't anybody able to shed some light on
this? Is it possible to make this query use an index scan, preferably
w/o disabling sequential scanning?

ditch the inheritance. it is no good, and makes everything too complicated to work with.

in case you can't, do something similar to this:

select * from
(
select * from only table_a  order by number desc limit 25
union
select * from only table_b  order by number desc limit 25
union
select * from only table_c  order by number desc limit 25
) x
order by number desc limit 25;

it should be faster. and yes, i know it's ugly.

depesz

--
http://www.depesz.com/ - nowy, lepszy depesz

Re: A slow query - Help please?

От
Alban Hertroys
Дата:
hubert depesz lubaczewski wrote:
> ditch the inheritance. it is no good, and makes everything too
> complicated to work with.

Would love to, but that's what the system we use generates (MMBase, for
the record). We can probably rework the generated tables, but it isn't
entirely certain that won't break MMBase.

> in case you can't, do something similar to this:
>
> select * from
> (
> select * from only table_a  order by number desc limit 25
> union
> select * from only table_b  order by number desc limit 25
> union
> select * from only table_c  order by number desc limit 25
> ) x
> order by number desc limit 25;

Actually, the query isn't valid this way.

The "order by" clauses aren't allowed there - I don't know about the
"limit" ones, but they're pointless w/o "order by" anyway.

And of course 'select *' isn't going to work in the subqueries, the base
table contains less columns than the inherited tables.

After removing the "order by" and "limit" clauses, it is about exactly
as slow as querying the inherited table directly:

select * from table_a order by number desc limit 25;

I'm afraid this didn't quite solve the problem...
We probably need to get rid of the inheritence - it can be done, as the
same product manages to run on MySQL as well...

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: A slow query - Help please?

От
"Gurjeet Singh"
Дата:
Hubert said 'do "something similar" to this'... The syntax was
incorrect for sure.

If you wish to be able to do ORDER BY and LIMIT in UNION queries, you
need to make them sub-queries "something similar" to this:

select * from
(
select * from (select * from only table_a  order by number desc limit 25) as A
union
select * from (select * from only table_b  order by number desc limit 25) as B
union
select * from (select * from only table_c  order by number desc limit 25) as C
) x
order by number desc limit 25;

    Again, this query and the syntax is totally untested (I don't have
access to a PG right now); and I don't know much about the history of
this thread!!! But I think it will allow you to do a LIMIT 25 alright.

> And of course 'select *' isn't going to work in the subqueries, the base
> table contains less columns than the inherited tables.

    I don't think you have much of an option in that case...

Re: A slow query - Help please?

От
Alban Hertroys
Дата:
hubert depesz lubaczewski wrote:
> in case you can't, do something similar to this:
>
> select * from
> (
> select * from only table_a  order by number desc limit 25
> union
> select * from only table_b  order by number desc limit 25
> union
> select * from only table_c  order by number desc limit 25
> ) x
> order by number desc limit 25;
>
> it should be faster. and yes, i know it's ugly.

I found a way that works, and is indeed quite a bit faster. It is even
uglier than what you proposed. The problem wasn't the "order by" in the
subquery, but the "order by" combined with the "union":

EXPLAIN ANALYZE
SELECT *
FROM (
     SELECT number, otype, owner, snumber, dnumber, rnumber, dir, pos
       FROM ONLY mm_posrel
      ORDER BY number DESC
      LIMIT 25
) a
UNION ALL
SELECT *
FROM (
     SELECT number, otype, owner, snumber, dnumber, rnumber, dir, pos
       FROM ONLY mm_menu_item
      ORDER BY number DESC
      LIMIT 25
) b
UNION ALL
SELECT *
FROM (
     SELECT number, otype, owner, snumber, dnumber, rnumber, dir, pos
       FROM ONLY mm_cms_operation
      ORDER BY number DESC
      LIMIT 25
) c
ORDER BY number DESC LIMIT 25;

Output of explain is attached, for those interested.

Now all we need to do is getting MMBase to do its queries like this :P
Thanks a bunch for setting me on the right track.
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //
                                                                                        QUERY PLAN
                                                                  

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=7.65..7.71 rows=25 width=60) (actual time=1.352..1.678 rows=25 loops=1)
   ->  Sort  (cost=7.65..7.84 rows=75 width=60) (actual time=1.344..1.428 rows=25 loops=1)
         Sort Key: number
         ->  Append  (cost=0.00..5.32 rows=75 width=60) (actual time=0.069..1.164 rows=25 loops=1)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..1.72 rows=25 width=60) (actual time=0.063..0.877 rows=25
loops=1)
                     ->  Subquery Scan a  (cost=0.00..1.47 rows=25 width=60) (actual time=0.052..0.651 rows=25 loops=1)
                           ->  Limit  (cost=0.00..1.22 rows=25 width=60) (actual time=0.040..0.386 rows=25 loops=1)
                                 ->  Index Scan Backward using mm_posrel_pkey on mm_posrel  (cost=0.00..3675.54
rows=75595width=60) (actual time=0.032..0.187 rows=25 loops=1) 
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..1.80 rows=25 width=60) (actual time=0.031..0.031 rows=0
loops=1)
                     ->  Subquery Scan b  (cost=0.00..1.55 rows=25 width=60) (actual time=0.023..0.023 rows=0 loops=1)
                           ->  Limit  (cost=0.00..1.30 rows=25 width=60) (actual time=0.017..0.017 rows=0 loops=1)
                                 ->  Index Scan Backward using mm_menu_item_pkey on mm_menu_item  (cost=0.00..52.00
rows=1000width=60) (actual time=0.009..0.009 rows=0 loops=1) 
               ->  Subquery Scan "*SELECT* 3"  (cost=0.00..1.80 rows=25 width=60) (actual time=0.028..0.028 rows=0
loops=1)
                     ->  Subquery Scan c  (cost=0.00..1.55 rows=25 width=60) (actual time=0.020..0.020 rows=0 loops=1)
                           ->  Limit  (cost=0.00..1.30 rows=25 width=60) (actual time=0.013..0.013 rows=0 loops=1)
                                 ->  Index Scan Backward using mm_cms_operation_pkey on mm_cms_operation
(cost=0.00..52.00rows=1000 width=60) (actual time=0.007..0.007 rows=0 loops=1) 
 Total runtime: 2.077 ms
(17 rows)


Re: A slow query - Help please?

От
Jim Nasby
Дата:
On Jun 19, 2006, at 7:00 AM, Alban Hertroys wrote:
> Now all we need to do is getting MMBase to do its queries like this :P

Probably a better bet would be going to 8.1 and using constraint
elimination.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461



Re: A slow query - Help please?

От
"hubert depesz lubaczewski"
Дата:
On 6/19/06, Alban Hertroys <alban@magproductions.nl> wrote:
I found a way that works, and is indeed quite a bit faster. It is even
uglier than what you proposed. The problem wasn't the "order by" in the
subquery, but the "order by" combined with the "union":


sorry, i always forget about  the fact that union doesn't like internal order and limits. subquery is of course perfectly fine - i'm glad you found that :)

depesz

--
http://www.depesz.com/ - nowy, lepszy depesz

Re: A slow query - Help please?

От
Alban Hertroys
Дата:
Jim Nasby wrote:
> On Jun 19, 2006, at 7:00 AM, Alban Hertroys wrote:
>
>> Now all we need to do is getting MMBase to do its queries like this :P
>
>
> Probably a better bet would be going to 8.1 and using constraint
> elimination.

I searched the documentation, google and wikipedia for "constraint
elimination", but couldn't find anything more specific than a reference
to an O'Reilly conference about the subject.

Maybe you mean constraint exclusion?

If so, is that going to help excluding partitions (basically the same
thing, it seems) from a query based on an ORDER BY and a LIMIT?

Say we take the query I posted:
    "SELECT * FROM mm_posrel ORDER BY number DESC LIMIT 25;"
and the knowledge that this table is inherited by two other tables, with
number being unique across them (though PostgreSQL probably doesn't know
about this).
Can constraint exclusion determine that the last 25 number values do not
occur in some of the tables?

This looks liek an interesting solution, could save us quite a bit of
work if we manage to use this...

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: A slow query - Help please?

От
Martijn van Oosterhout
Дата:
On Wed, Jun 21, 2006 at 11:56:27AM +0200, Alban Hertroys wrote:
> Say we take the query I posted:
>     "SELECT * FROM mm_posrel ORDER BY number DESC LIMIT 25;"
> and the knowledge that this table is inherited by two other tables, with
> number being unique across them (though PostgreSQL probably doesn't know
> about this).
> Can constraint exclusion determine that the last 25 number values do not
> occur in some of the tables?

ISTM that what would really work well is some kind of "Merge Sort" node
that would work by having multiple subnodes which are already sorted
and merging them into one sorted list.

The planner would use this whenever it saw a query of the form:

SELECT * FROM a
UNION ALL
SELECT * FROM b
ORDER BY c;

It would push the ORDER BY down to the subqueries and then merge the
results. If the subqueries can be read efficiently sorted (via an index
for example) then you would get very quick output, especially if you
have a LIMIT clause.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: A slow query - Help please?

От
Alban Hertroys
Дата:
Alban Hertroys wrote:
> Jim Nasby wrote:
>> Probably a better bet would be going to 8.1 and using constraint
>> elimination.

> Maybe you mean constraint exclusion?
>
> If so, is that going to help excluding partitions (basically the same
> thing, it seems) from a query based on an ORDER BY and a LIMIT?
>
> Say we take the query I posted:
>     "SELECT * FROM mm_posrel ORDER BY number DESC LIMIT 25;"
> and the knowledge that this table is inherited by two other tables, with
> number being unique across them (though PostgreSQL probably doesn't know
> about this).
> Can constraint exclusion determine that the last 25 number values do not
> occur in some of the tables?

I did some experiments on my PostgreSQL 8.1 server at home (gotta love
UNIX & SSH), with the following setup:
                            Table "public.object"
  Column |  Type   |                        Modifiers

--------+---------+---------------------------------------------------------
  number | integer | not null default nextval('object_number_seq'::regclass)
  title  | text    | not null
Indexes:
     "object_pkey" PRIMARY KEY, btree (number)

                            Table "public.content"
  Column  |  Type   |                        Modifiers

---------+---------+---------------------------------------------------------
  number  | integer | not null default
nextval('object_number_seq'::regclass)
  title   | text    | not null
  summary | text    | not null
  body    | text    | not null
Inherits: object

                           Table "public.menu_item"
  Column |  Type   |                        Modifiers

--------+---------+---------------------------------------------------------
  number | integer | not null default nextval('object_number_seq'::regclass)
  title  | text    | not null
  pos    | integer | not null default 1
Inherits: object

I inserted a few records into "object" (30, IIRC) and did:

  SET constraint_exclusion=on;
  explain analyze select number, title from object order by number desc
limit 10;
                                                                QUERY
PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=131.34..131.37 rows=10 width=36) (actual
time=0.335..0.358 rows=10 loops=1)
    ->  Sort  (cost=131.34..135.67 rows=1730 width=36) (actual
time=0.331..0.338 rows=10 loops=1)
          Sort Key: public."object".number
          ->  Result  (cost=0.00..38.30 rows=1730 width=36) (actual
time=0.097..0.248 rows=30 loops=1)
                ->  Append  (cost=0.00..38.30 rows=1730 width=36)
(actual time=0.091..0.184 rows=30 loops=1)
                      ->  Seq Scan on "object"  (cost=0.00..1.30 rows=30
width=12) (actual time=0.090..0.129 rows=30 loops=1)
                      ->  Seq Scan on menu_item "object"
(cost=0.00..21.00 rows=1100 width=36) (actual time=0.001..0.001 rows=0
loops=1)
                      ->  Seq Scan on content "object"
(cost=0.00..16.00 rows=600 width=36) (actual time=0.001..0.001 rows=0
loops=1)
  Total runtime: 0.446 ms
(9 rows)

As you can see, it still scans the empty tables menu_item and content.
So I'm afraid this is no solution to our problem... :(

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: A slow query - Help please?

От
Alban Hertroys
Дата:
Martijn van Oosterhout wrote:
> ISTM that what would really work well is some kind of "Merge Sort" node
> that would work by having multiple subnodes which are already sorted
> and merging them into one sorted list.

Would... So this isn't available yet?

> The planner would use this whenever it saw a query of the form:
>
> SELECT * FROM a
> UNION ALL
> SELECT * FROM b
> ORDER BY c;
>
> It would push the ORDER BY down to the subqueries and then merge the
> results. If the subqueries can be read efficiently sorted (via an index
> for example) then you would get very quick output, especially if you
> have a LIMIT clause.

I just realized that OFFSET kind of complicates the problem.

If PostgreSQL would handle this (for inheritance as well, I hope), it'd
need to keep track of how many records came from which tables to set the
offsets in the subqueries appropriately, which of course depends on the
previous query... Well, I said it complicates things...

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: A slow query - Help please?

От
Martijn van Oosterhout
Дата:
On Wed, Jun 21, 2006 at 04:12:08PM +0200, Alban Hertroys wrote:
> Martijn van Oosterhout wrote:
> >ISTM that what would really work well is some kind of "Merge Sort" node
> >that would work by having multiple subnodes which are already sorted
> >and merging them into one sorted list.
>
> Would... So this isn't available yet?

Not AFAIK.

> >It would push the ORDER BY down to the subqueries and then merge the
> >results. If the subqueries can be read efficiently sorted (via an index
> >for example) then you would get very quick output, especially if you
> >have a LIMIT clause.
>
> I just realized that OFFSET kind of complicates the problem.
>
> If PostgreSQL would handle this (for inheritance as well, I hope), it'd
> need to keep track of how many records came from which tables to set the
> offsets in the subqueries appropriately, which of course depends on the
> previous query... Well, I said it complicates things...

OFFSET is not a problem at all. It's just code for "throw away first N
rows". Once you have the above node type, the executor would simply
throw away somed rows, whichever table they came from.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Confused about locales

От
John Gunther
Дата:
I've been reading about locales, encodings, sort orders, the to_ascii
function and, embarrasingly, I'm more confused than enlightened.:

What I want is very simple:
1) I want the database to correctly accept, store, and display
alphabetic characters, including European accented characters, in HTML
forms.
2) I want sorting to ignore the diacritical marks so that, for example,
u, u-accent, and u-umlaut are all sorted as if they were plain u.
3) I want sorting to ignore non-alphanumerics, letter case, and white space.

To illustrate, the following data is in sorted order:

St-Émile
stendahl
st ènders
St. Epson

Can someone tell me what combination of PostgreSQL and Linux settings I
need for this? It seems like a very basic question, but I'm just dense,
I guess. I've tried a half dozen time-consuming configs without success.

Thank you.

John Gunther

Re: Confused about locales

От
"Tomi NA"
Дата:
On 8/19/06, John Gunther <owner@bucksvsbytes.com> wrote:
> I've been reading about locales, encodings, sort orders, the to_ascii
> function and, embarrasingly, I'm more confused than enlightened.:
>
> What I want is very simple:
> 1) I want the database to correctly accept, store, and display
> alphabetic characters, including European accented characters, in HTML
> forms.
> 2) I want sorting to ignore the diacritical marks so that, for example,
> u, u-accent, and u-umlaut are all sorted as if they were plain u.
> 3) I want sorting to ignore non-alphanumerics, letter case, and white space.
>
> To illustrate, the following data is in sorted order:
>
> St-Émile
> stendahl
> st ènders
> St. Epson
>
> Can someone tell me what combination of PostgreSQL and Linux settings I
> need for this? It seems like a very basic question, but I'm just dense,
> I guess. I've tried a half dozen time-consuming configs without success.

Well, you'll obviously have to use UTF if you plan on supporting more
then one language with different accented characters. The sorting
issue is a bit of a problem, though. Pgsql uses the same collation in
all databases in a database cluster (carved into stone at cluster
init) so I don't know of a good way you could collate your data....you
could concievably keep a copy of accented strings replacing the
accented characters with their non-accented counterparts as you see
fit and collate on that column, but that's not a very elegant way of
handling the problem, is it?
You might have more luck with another database like mysql 4.1+ (where
accent-insensitive UTF collation is directly supported), MS SQL (where
you can define encoding and collation settings at the database level,
and so concievably have a database for each language, if you know
exactly which languages you'll have) or Firebird (where you define an
encoding at the column level and can collate any way you wish in each
column).

Hope I've helped,
t.n.a.