Обсуждение: A slow query
Hi all, We're using some 3rd party product that uses inheritence, and the following query is rather slow on PostgreSQL 7.4.7 (debian stable). Any suggestions how to speed it up? explain analyze SELECT otype,owner,rnumber,dir,number,dnumber,pos,snumber FROM mm_posrel posrel ORDER BY number DESC LIMIT 25; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=7996.04..7996.10 rows=25 width=60) (actual time=2329.505..2329.767 rows=25 loops=1) -> Sort (cost=7996.04..8157.42 rows=64553 width=60) (actual time=2329.495..2329.585 rows=25 loops=1) Sort Key: posrel.number -> Result (cost=0.00..1510.51 rows=64553 width=60) (actual time=0.045..1644.541 rows=75597 loops=1) -> Append (cost=0.00..1510.51 rows=64553 width=60) (actual time=0.034..977.543 rows=75597 loops=1) -> Seq Scan on mm_posrel posrel (cost=0.00..1510.51 rows=64551 width=39) (actual time=0.027..436.501 rows=75597 loops=1) -> Seq Scan on mm_menu_item posrel (cost=0.00..0.00 rows=1 width=60) (actual time=0.005..0.005 rows=0 loops=1) -> Seq Scan on mm_cms_operation posrel (cost=0.00..0.00 rows=1 width=60) (actual time=0.005..0.005 rows=0 loops=1) Total runtime: 2332.136 ms (9 rows) The tables look like (I added the pkeys after the indexes on number, it didn't change the problem): Table "public.mm_posrel" Column | Type | Modifiers ---------+---------+----------- number | integer | not null otype | integer | not null owner | text | not null snumber | integer | not null dnumber | integer | not null rnumber | integer | not null dir | integer | pos | integer | Indexes: "mm_posrel_pkey" primary key, btree (number) "mm_posrel_dnumber_idx" btree (dnumber) "mm_posrel_number_idx" btree (number) "mm_posrel_rnumber_idx" btree (rnumber) "mm_posrel_snumber_idx" btree (snumber) Inherits: mm_insrel Table "public.mm_menu_item" Column | Type | Modifiers ---------+---------+----------- number | integer | not null otype | integer | not null owner | text | not null snumber | integer | not null dnumber | integer | not null rnumber | integer | not null dir | integer | pos | integer | name | text | not null Indexes: "mm_menu_item_pkey" primary key, btree (number) "mm_menu_item_dnumber_idx" btree (dnumber) "mm_menu_item_number_idx" btree (number) "mm_menu_item_rnumber_idx" btree (rnumber) "mm_menu_item_snumber_idx" btree (snumber) Inherits: mm_posrel Table "public.mm_cms_operation" Column | Type | Modifiers ---------+---------+----------- number | integer | not null otype | integer | not null owner | text | not null snumber | integer | not null dnumber | integer | not null rnumber | integer | not null dir | integer | pos | integer | m_type | text | not null getvars | text | Indexes: "mm_cms_operation_pkey" primary key, btree (number) "mm_cms_operation_dnumber_idx" btree (dnumber) "mm_cms_operation_number_idx" btree (number) "mm_cms_operation_rnumber_idx" btree (rnumber) "mm_cms_operation_snumber_idx" btree (snumber) Inherits: mm_posrel Table "public.mm_insrel" Column | Type | Modifiers ---------+---------+----------- number | integer | not null otype | integer | not null owner | text | not null snumber | integer | not null dnumber | integer | not null rnumber | integer | not null dir | integer | Indexes: "mm_insrel_dnumber_idx" btree (dnumber) "mm_insrel_number_idx" btree (number) "mm_insrel_rnumber_idx" btree (rnumber) "mm_insrel_snumber_idx" btree (snumber) Inherits: mm_object Table "public.mm_object" Column | Type | Modifiers --------+---------+----------- number | integer | not null otype | integer | not null owner | text | not null Indexes: "mm_object_pkey" primary key, btree (number) -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
Alban Hertroys wrote: > Hi all, > > We're using some 3rd party product that uses inheritence, and the > following query is rather slow on PostgreSQL 7.4.7 (debian stable). Any > suggestions how to speed it up? A few more datapoints: - Database was vacuum full analyzed just before the query. - The same query on mm_insrel takes just over 11s. - The actual and estimated row numbers are accurate. -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
Alban Hertroys wrote: > Hi all, > > We're using some 3rd party product that uses inheritence, and the > following query is rather slow on PostgreSQL 7.4.7 (debian stable). Any > suggestions how to speed it up? We really need this solved. Isn't anybody able to shed some light on this? Is it possible to make this query use an index scan, preferably w/o disabling sequential scanning? Regards, -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
On 6/16/06, Alban Hertroys <alban@magproductions.nl> wrote:
ditch the inheritance. it is no good, and makes everything too complicated to work with.
in case you can't, do something similar to this:
select * from
(
select * from only table_a order by number desc limit 25
union
select * from only table_b order by number desc limit 25
union
select * from only table_c order by number desc limit 25
) x
order by number desc limit 25;
it should be faster. and yes, i know it's ugly.
depesz
--
http://www.depesz.com/ - nowy, lepszy depesz
We really need this solved. Isn't anybody able to shed some light on
this? Is it possible to make this query use an index scan, preferably
w/o disabling sequential scanning?
ditch the inheritance. it is no good, and makes everything too complicated to work with.
in case you can't, do something similar to this:
select * from
(
select * from only table_a order by number desc limit 25
union
select * from only table_b order by number desc limit 25
union
select * from only table_c order by number desc limit 25
) x
order by number desc limit 25;
it should be faster. and yes, i know it's ugly.
depesz
--
http://www.depesz.com/ - nowy, lepszy depesz
hubert depesz lubaczewski wrote: > ditch the inheritance. it is no good, and makes everything too > complicated to work with. Would love to, but that's what the system we use generates (MMBase, for the record). We can probably rework the generated tables, but it isn't entirely certain that won't break MMBase. > in case you can't, do something similar to this: > > select * from > ( > select * from only table_a order by number desc limit 25 > union > select * from only table_b order by number desc limit 25 > union > select * from only table_c order by number desc limit 25 > ) x > order by number desc limit 25; Actually, the query isn't valid this way. The "order by" clauses aren't allowed there - I don't know about the "limit" ones, but they're pointless w/o "order by" anyway. And of course 'select *' isn't going to work in the subqueries, the base table contains less columns than the inherited tables. After removing the "order by" and "limit" clauses, it is about exactly as slow as querying the inherited table directly: select * from table_a order by number desc limit 25; I'm afraid this didn't quite solve the problem... We probably need to get rid of the inheritence - it can be done, as the same product manages to run on MySQL as well... Regards, -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
Hubert said 'do "something similar" to this'... The syntax was incorrect for sure. If you wish to be able to do ORDER BY and LIMIT in UNION queries, you need to make them sub-queries "something similar" to this: select * from ( select * from (select * from only table_a order by number desc limit 25) as A union select * from (select * from only table_b order by number desc limit 25) as B union select * from (select * from only table_c order by number desc limit 25) as C ) x order by number desc limit 25; Again, this query and the syntax is totally untested (I don't have access to a PG right now); and I don't know much about the history of this thread!!! But I think it will allow you to do a LIMIT 25 alright. > And of course 'select *' isn't going to work in the subqueries, the base > table contains less columns than the inherited tables. I don't think you have much of an option in that case...
hubert depesz lubaczewski wrote: > in case you can't, do something similar to this: > > select * from > ( > select * from only table_a order by number desc limit 25 > union > select * from only table_b order by number desc limit 25 > union > select * from only table_c order by number desc limit 25 > ) x > order by number desc limit 25; > > it should be faster. and yes, i know it's ugly. I found a way that works, and is indeed quite a bit faster. It is even uglier than what you proposed. The problem wasn't the "order by" in the subquery, but the "order by" combined with the "union": EXPLAIN ANALYZE SELECT * FROM ( SELECT number, otype, owner, snumber, dnumber, rnumber, dir, pos FROM ONLY mm_posrel ORDER BY number DESC LIMIT 25 ) a UNION ALL SELECT * FROM ( SELECT number, otype, owner, snumber, dnumber, rnumber, dir, pos FROM ONLY mm_menu_item ORDER BY number DESC LIMIT 25 ) b UNION ALL SELECT * FROM ( SELECT number, otype, owner, snumber, dnumber, rnumber, dir, pos FROM ONLY mm_cms_operation ORDER BY number DESC LIMIT 25 ) c ORDER BY number DESC LIMIT 25; Output of explain is attached, for those interested. Now all we need to do is getting MMBase to do its queries like this :P Thanks a bunch for setting me on the right track. -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World // QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=7.65..7.71 rows=25 width=60) (actual time=1.352..1.678 rows=25 loops=1) -> Sort (cost=7.65..7.84 rows=75 width=60) (actual time=1.344..1.428 rows=25 loops=1) Sort Key: number -> Append (cost=0.00..5.32 rows=75 width=60) (actual time=0.069..1.164 rows=25 loops=1) -> Subquery Scan "*SELECT* 1" (cost=0.00..1.72 rows=25 width=60) (actual time=0.063..0.877 rows=25 loops=1) -> Subquery Scan a (cost=0.00..1.47 rows=25 width=60) (actual time=0.052..0.651 rows=25 loops=1) -> Limit (cost=0.00..1.22 rows=25 width=60) (actual time=0.040..0.386 rows=25 loops=1) -> Index Scan Backward using mm_posrel_pkey on mm_posrel (cost=0.00..3675.54 rows=75595width=60) (actual time=0.032..0.187 rows=25 loops=1) -> Subquery Scan "*SELECT* 2" (cost=0.00..1.80 rows=25 width=60) (actual time=0.031..0.031 rows=0 loops=1) -> Subquery Scan b (cost=0.00..1.55 rows=25 width=60) (actual time=0.023..0.023 rows=0 loops=1) -> Limit (cost=0.00..1.30 rows=25 width=60) (actual time=0.017..0.017 rows=0 loops=1) -> Index Scan Backward using mm_menu_item_pkey on mm_menu_item (cost=0.00..52.00 rows=1000width=60) (actual time=0.009..0.009 rows=0 loops=1) -> Subquery Scan "*SELECT* 3" (cost=0.00..1.80 rows=25 width=60) (actual time=0.028..0.028 rows=0 loops=1) -> Subquery Scan c (cost=0.00..1.55 rows=25 width=60) (actual time=0.020..0.020 rows=0 loops=1) -> Limit (cost=0.00..1.30 rows=25 width=60) (actual time=0.013..0.013 rows=0 loops=1) -> Index Scan Backward using mm_cms_operation_pkey on mm_cms_operation (cost=0.00..52.00rows=1000 width=60) (actual time=0.007..0.007 rows=0 loops=1) Total runtime: 2.077 ms (17 rows)
On Jun 19, 2006, at 7:00 AM, Alban Hertroys wrote: > Now all we need to do is getting MMBase to do its queries like this :P Probably a better bet would be going to 8.1 and using constraint elimination. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On 6/19/06, Alban Hertroys <alban@magproductions.nl> wrote:
sorry, i always forget about the fact that union doesn't like internal order and limits. subquery is of course perfectly fine - i'm glad you found that :)
depesz
--
http://www.depesz.com/ - nowy, lepszy depesz
I found a way that works, and is indeed quite a bit faster. It is even
uglier than what you proposed. The problem wasn't the "order by" in the
subquery, but the "order by" combined with the "union":
sorry, i always forget about the fact that union doesn't like internal order and limits. subquery is of course perfectly fine - i'm glad you found that :)
--
http://www.depesz.com/ - nowy, lepszy depesz
Jim Nasby wrote: > On Jun 19, 2006, at 7:00 AM, Alban Hertroys wrote: > >> Now all we need to do is getting MMBase to do its queries like this :P > > > Probably a better bet would be going to 8.1 and using constraint > elimination. I searched the documentation, google and wikipedia for "constraint elimination", but couldn't find anything more specific than a reference to an O'Reilly conference about the subject. Maybe you mean constraint exclusion? If so, is that going to help excluding partitions (basically the same thing, it seems) from a query based on an ORDER BY and a LIMIT? Say we take the query I posted: "SELECT * FROM mm_posrel ORDER BY number DESC LIMIT 25;" and the knowledge that this table is inherited by two other tables, with number being unique across them (though PostgreSQL probably doesn't know about this). Can constraint exclusion determine that the last 25 number values do not occur in some of the tables? This looks liek an interesting solution, could save us quite a bit of work if we manage to use this... -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
On Wed, Jun 21, 2006 at 11:56:27AM +0200, Alban Hertroys wrote: > Say we take the query I posted: > "SELECT * FROM mm_posrel ORDER BY number DESC LIMIT 25;" > and the knowledge that this table is inherited by two other tables, with > number being unique across them (though PostgreSQL probably doesn't know > about this). > Can constraint exclusion determine that the last 25 number values do not > occur in some of the tables? ISTM that what would really work well is some kind of "Merge Sort" node that would work by having multiple subnodes which are already sorted and merging them into one sorted list. The planner would use this whenever it saw a query of the form: SELECT * FROM a UNION ALL SELECT * FROM b ORDER BY c; It would push the ORDER BY down to the subqueries and then merge the results. If the subqueries can be read efficiently sorted (via an index for example) then you would get very quick output, especially if you have a LIMIT clause. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Вложения
Alban Hertroys wrote: > Jim Nasby wrote: >> Probably a better bet would be going to 8.1 and using constraint >> elimination. > Maybe you mean constraint exclusion? > > If so, is that going to help excluding partitions (basically the same > thing, it seems) from a query based on an ORDER BY and a LIMIT? > > Say we take the query I posted: > "SELECT * FROM mm_posrel ORDER BY number DESC LIMIT 25;" > and the knowledge that this table is inherited by two other tables, with > number being unique across them (though PostgreSQL probably doesn't know > about this). > Can constraint exclusion determine that the last 25 number values do not > occur in some of the tables? I did some experiments on my PostgreSQL 8.1 server at home (gotta love UNIX & SSH), with the following setup: Table "public.object" Column | Type | Modifiers --------+---------+--------------------------------------------------------- number | integer | not null default nextval('object_number_seq'::regclass) title | text | not null Indexes: "object_pkey" PRIMARY KEY, btree (number) Table "public.content" Column | Type | Modifiers ---------+---------+--------------------------------------------------------- number | integer | not null default nextval('object_number_seq'::regclass) title | text | not null summary | text | not null body | text | not null Inherits: object Table "public.menu_item" Column | Type | Modifiers --------+---------+--------------------------------------------------------- number | integer | not null default nextval('object_number_seq'::regclass) title | text | not null pos | integer | not null default 1 Inherits: object I inserted a few records into "object" (30, IIRC) and did: SET constraint_exclusion=on; explain analyze select number, title from object order by number desc limit 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=131.34..131.37 rows=10 width=36) (actual time=0.335..0.358 rows=10 loops=1) -> Sort (cost=131.34..135.67 rows=1730 width=36) (actual time=0.331..0.338 rows=10 loops=1) Sort Key: public."object".number -> Result (cost=0.00..38.30 rows=1730 width=36) (actual time=0.097..0.248 rows=30 loops=1) -> Append (cost=0.00..38.30 rows=1730 width=36) (actual time=0.091..0.184 rows=30 loops=1) -> Seq Scan on "object" (cost=0.00..1.30 rows=30 width=12) (actual time=0.090..0.129 rows=30 loops=1) -> Seq Scan on menu_item "object" (cost=0.00..21.00 rows=1100 width=36) (actual time=0.001..0.001 rows=0 loops=1) -> Seq Scan on content "object" (cost=0.00..16.00 rows=600 width=36) (actual time=0.001..0.001 rows=0 loops=1) Total runtime: 0.446 ms (9 rows) As you can see, it still scans the empty tables menu_item and content. So I'm afraid this is no solution to our problem... :( -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
Martijn van Oosterhout wrote: > ISTM that what would really work well is some kind of "Merge Sort" node > that would work by having multiple subnodes which are already sorted > and merging them into one sorted list. Would... So this isn't available yet? > The planner would use this whenever it saw a query of the form: > > SELECT * FROM a > UNION ALL > SELECT * FROM b > ORDER BY c; > > It would push the ORDER BY down to the subqueries and then merge the > results. If the subqueries can be read efficiently sorted (via an index > for example) then you would get very quick output, especially if you > have a LIMIT clause. I just realized that OFFSET kind of complicates the problem. If PostgreSQL would handle this (for inheritance as well, I hope), it'd need to keep track of how many records came from which tables to set the offsets in the subqueries appropriately, which of course depends on the previous query... Well, I said it complicates things... -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
On Wed, Jun 21, 2006 at 04:12:08PM +0200, Alban Hertroys wrote: > Martijn van Oosterhout wrote: > >ISTM that what would really work well is some kind of "Merge Sort" node > >that would work by having multiple subnodes which are already sorted > >and merging them into one sorted list. > > Would... So this isn't available yet? Not AFAIK. > >It would push the ORDER BY down to the subqueries and then merge the > >results. If the subqueries can be read efficiently sorted (via an index > >for example) then you would get very quick output, especially if you > >have a LIMIT clause. > > I just realized that OFFSET kind of complicates the problem. > > If PostgreSQL would handle this (for inheritance as well, I hope), it'd > need to keep track of how many records came from which tables to set the > offsets in the subqueries appropriately, which of course depends on the > previous query... Well, I said it complicates things... OFFSET is not a problem at all. It's just code for "throw away first N rows". Once you have the above node type, the executor would simply throw away somed rows, whichever table they came from. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Вложения
I've been reading about locales, encodings, sort orders, the to_ascii function and, embarrasingly, I'm more confused than enlightened.: What I want is very simple: 1) I want the database to correctly accept, store, and display alphabetic characters, including European accented characters, in HTML forms. 2) I want sorting to ignore the diacritical marks so that, for example, u, u-accent, and u-umlaut are all sorted as if they were plain u. 3) I want sorting to ignore non-alphanumerics, letter case, and white space. To illustrate, the following data is in sorted order: St-Émile stendahl st ènders St. Epson Can someone tell me what combination of PostgreSQL and Linux settings I need for this? It seems like a very basic question, but I'm just dense, I guess. I've tried a half dozen time-consuming configs without success. Thank you. John Gunther
On 8/19/06, John Gunther <owner@bucksvsbytes.com> wrote: > I've been reading about locales, encodings, sort orders, the to_ascii > function and, embarrasingly, I'm more confused than enlightened.: > > What I want is very simple: > 1) I want the database to correctly accept, store, and display > alphabetic characters, including European accented characters, in HTML > forms. > 2) I want sorting to ignore the diacritical marks so that, for example, > u, u-accent, and u-umlaut are all sorted as if they were plain u. > 3) I want sorting to ignore non-alphanumerics, letter case, and white space. > > To illustrate, the following data is in sorted order: > > St-Émile > stendahl > st ènders > St. Epson > > Can someone tell me what combination of PostgreSQL and Linux settings I > need for this? It seems like a very basic question, but I'm just dense, > I guess. I've tried a half dozen time-consuming configs without success. Well, you'll obviously have to use UTF if you plan on supporting more then one language with different accented characters. The sorting issue is a bit of a problem, though. Pgsql uses the same collation in all databases in a database cluster (carved into stone at cluster init) so I don't know of a good way you could collate your data....you could concievably keep a copy of accented strings replacing the accented characters with their non-accented counterparts as you see fit and collate on that column, but that's not a very elegant way of handling the problem, is it? You might have more luck with another database like mysql 4.1+ (where accent-insensitive UTF collation is directly supported), MS SQL (where you can define encoding and collation settings at the database level, and so concievably have a database for each language, if you know exactly which languages you'll have) or Firebird (where you define an encoding at the column level and can collate any way you wish in each column). Hope I've helped, t.n.a.