Обсуждение: Problem with slow query with WHERE conditions with OR clause on primary keys
Problem with slow query with WHERE conditions with OR clause on primary keys
От
Krzysztof Olszewski
Дата:
Hi, my sql is very simple, returns one row, where conditions are assigned to primary keys select g.gd_index, gd.full_name from gd g join gd_data gd on (g.id_gd = gd.id_gd) where gd.id_gd_data = 1111 OR g.id_gd = 1111; but generates "crazy" plan with Merge Join on big amount of rows (both tables contains 500000 rows) because Index scans ignore conditions, conditions are processed after index sacans on Merge Join Merge Join (cost=0.00..46399.80 rows=2 width=115) (actual time=3.881..644.409 rows=1 loops=1) Merge Cond: (g.id_gd = gd.id_gd) Join Filter: ((gd.id_gd_data = 1111) OR (g.id_gd = 1111)) -> Index Scan using pk_gd on gd g (cost=0.00..14117.79 rows=500001 width=40) (actual time=0.019..146.521 rows=500001 loops=1) -> Index Scan using fki_gd on gd_data gd (cost=0.00..22282.04 rows=500001 width=99) (actual time=0.016..157.384 rows=500001 loops=1) Total runtime: 644.460 ms model is very simple CREATE TABLE gd ( id_gd bigint NOT NULL, gd_index character varying(60) NOT NULL, notes text, notes_exists integer NOT NULL DEFAULT 0, CONSTRAINT pk_gd PRIMARY KEY (id_gd ) ) CREATE TABLE gd_data ( id_gd_data bigint NOT NULL, id_gd bigint NOT NULL, short_name character varying(120) NOT NULL, full_name character varying(512) NOT NULL, notes text, notes_exists integer NOT NULL DEFAULT 0, CONSTRAINT pk_gd_data PRIMARY KEY (id_gd_data ), CONSTRAINT fk_gd FOREIGN KEY (id_gd) REFERENCES gd (id_gd) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION ) CREATE INDEX fki_gd ON gd_data USING btree (id_gd ); my configuration from (select * from pg_settings): "server_version";"9.1.10" "block_size";"8192" "cpu_index_tuple_cost";"0.005" "cpu_operator_cost";"0.0025" "cpu_tuple_cost";"0.01" "cursor_tuple_fraction";"0.1" "default_statistics_target";"1000" "enable_bitmapscan";"on" "enable_hashagg";"on" "enable_hashjoin";"on" "enable_indexscan";"on" "enable_material";"on" "enable_mergejoin";"on" "enable_nestloop";"on" "enable_seqscan";"on" "enable_sort";"on" "enable_tidscan";"on" "maintenance_work_mem";"262144" "max_connections";"10" "max_files_per_process";"1000" "max_locks_per_transaction";"64" "max_pred_locks_per_transaction";"64" "max_prepared_transactions";"10" "random_page_cost";"1.5" "seq_page_cost";"1" "shared_buffers";"65536" "temp_buffers";"1024" "work_mem";"131072" Thank you for your help. Kris Olszewski
Re: Problem with slow query with WHERE conditions with OR clause on primary keys
От
David Johnston
Дата:
Krzysztof Olszewski wrote > Hi, > > my sql is very simple, > returns one row, > where conditions are assigned to primary keys > > > */select g.gd_index, gd.full_name/**/ > /**/from gd g join gd_data gd on (g.id_gd = gd.id_gd)/**/ > /**/where gd.id_gd_data = 1111 OR g.id_gd = 1111;/* > > > but generates "crazy" plan with Merge Join on big amount of rows (both > tables contains 500000 rows) > because Index scans ignore conditions, conditions are processed after > index sacans on Merge Join > > */Merge Join (cost=0.00..46399.80 rows=2 width=115) (actual > time=3.881..644.409 rows=1 loops=1)/**/ > /**/ Merge Cond: (g.id_gd = gd.id_gd)/**/ > /**/ Join Filter: ((gd.id_gd_data = 1111) OR (g.id_gd = 1111))/**/ > /**/ -> Index Scan using pk_gd on gd g (cost=0.00..14117.79 > rows=500001 width=40) (actual time=0.019..146.521 rows=500001 loops=1)/**/ > /**/ -> Index Scan using fki_gd on gd_data gd (cost=0.00..22282.04 > rows=500001 width=99) (actual time=0.016..157.384 rows=500001 loops=1)/**/ > /**/Total runtime: 644.460 ms/* > > > model is very simple > > > /CREATE TABLE gd (// > // id_gd bigint NOT NULL,// > // gd_index character varying(60) NOT NULL,// > // notes text,// > // notes_exists integer NOT NULL DEFAULT 0,// > // CONSTRAINT pk_gd PRIMARY KEY (id_gd )// > //)// > // > // > //CREATE TABLE gd_data (// > // id_gd_data bigint NOT NULL,// > // id_gd bigint NOT NULL,// > // short_name character varying(120) NOT NULL,// > // full_name character varying(512) NOT NULL,// > // notes text,// > // notes_exists integer NOT NULL DEFAULT 0,// > // CONSTRAINT pk_gd_data PRIMARY KEY (id_gd_data ),// > // CONSTRAINT fk_gd FOREIGN KEY (id_gd)// > // REFERENCES gd (id_gd) MATCH SIMPLE// > // ON UPDATE NO ACTION ON DELETE NO ACTION// > //)// > // > //CREATE INDEX fki_gd// > // ON gd_data// > // USING btree// > // (id_gd );// > / > > > my configuration from (select * from pg_settings): > > "server_version";"9.1.10" > > Thank you for your help. > > > Kris Olszewski It cannot do any better since it cannot pre-filter either table using the where condition without risking removing rows that would meet the other table's condition post-join. The query you are executing makes no sense to me: I don't understand why you would ever filter on gd.id_gd_data given the model you are showing. I believe your understanding of your model - or the model itself - is flawed but as you have only provided code it is impossible to pinpoint where exactly the disconnect resides. You can either fix the model or the query - the later by implementing sub-selects with where clauses manually - which then encodes an assumption about your data that the current query cannot make. Your model implies that a single gd record can have multiple gd_data records associated with it. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-slow-query-with-WHERE-conditions-with-OR-clause-on-primary-keys-tp5782803p5782822.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
Re: Problem with slow query with WHERE conditions with OR clause on primary keys
От
David Johnston
Дата:
kolszew73@gmail.com wrote > Thanx for your answer > > My example is trivial because i want to show strange (for me) postgres > behavior with dealing with primary keys (extreme example), in real > situation user put search condition e.g. "Panas" and this generates query > ... > where gd.other_code like 'Panas%' OR g.code like 'Panas%' > .. > > both columns has very good indexes and selectivity for "like 'Panas%'" ... > > I have experience from Oracle with this type of queries, and Oracle have > no problem with it, > executes select on index on other_code from gd and join g > in next step executes select on index on code from g and join gd > and this two results are connected in last step (like union) > very fast on minimal cost > > and in my opinion read whole huge tables only for 10 rows in result where > conditions are very good ... is strange I suppose the equivalent query that you'd want would be: SELECT ... FROM gd JOIN gd_data USING (id_gd) WHERE id_gd IN ( SELECT id_gd FROM gd WHERE ... UNION ALL -distinct not required in this situation SELECT id_gd FROM gd_data WHERE ... ) --ignoring NULL implications It does make sense conceptually... David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-slow-query-with-WHERE-conditions-with-OR-clause-on-primary-keys-tp5782803p5783942.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
Re: Problem with slow query with WHERE conditions with OR clause on primary keys
От
Andreas Karlsson
Дата:
On 12/11/2013 12:30 AM, Krzysztof Olszewski wrote: > select g.gd_index, gd.full_name > from gd g join gd_data gd on (g.id_gd = gd.id_gd) > where gd.id_gd_data = 1111 OR g.id_gd = 1111; Have you tried writing the query to filter on gd.id_gd rather than g.id_gd? I am not sure if the query planner will realize that it can replace g.id_gd with gd.id_gd in the where clause. select g.gd_index, gd.full_name from gd g join gd_data gd on (g.id_gd = gd.id_gd) where gd.id_gd_data = 1111 OR gd.id_gd = 1111; -- Andreas Karlsson
Re: Problem with slow query with WHERE conditions with OR clause on primary keys
От
"kolszew73@gmail.com"
Дата:
Thanx for your answer My example is trivial because i want to show strange (for me) postgres behavior with dealing with primary keys (extreme example), in real situation user put search condition e.g. "Panas" and this generates query ... where gd.other_code like 'Panas%' OR g.code like 'Panas%' .. both columns has very good indexes and selectivity for "like 'Panas%'" ... I have experience from Oracle with this type of queries, and Oracle have no problem with it, executes select on index on other_code from gd and join g in next step executes select on index on code from g and join gd and this two results are connected in last step (like union) very fast on minimal cost and in my opinion read whole huge tables only for 10 rows in result where conditions are very good ... is strange -- View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-slow-query-with-WHERE-conditions-with-OR-clause-on-primary-keys-tp5782803p5783927.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
Re: Re: Problem with slow query with WHERE conditions with OR clause on primary keys
От
Pavel Stehule
Дата:
2013/12/18 kolszew73@gmail.com <kolszew73@gmail.com>
Pavel Stehule
Thanx for your answer
My example is trivial because i want to show strange (for me) postgres
behavior with dealing with primary keys (extreme example), in real situation
user put search condition e.g. "Panas" and this generates query
...
where gd.other_code like 'Panas%' OR g.code like 'Panas%'
..
both columns has very good indexes and selectivity for "like 'Panas%'" ...
I have experience from Oracle with this type of queries, and Oracle have no
problem with it,
executes select on index on other_code from gd and join g
in next step executes select on index on code from g and join gd
and this two results are connected in last step (like union)
very fast on minimal cost
and in my opinion read whole huge tables only for 10 rows in result where
conditions are very good ... is strange
Maybe index is not in good form
try to build index with varchar_pattern_ops flag
http://postgres.cz/wiki/PostgreSQL_SQL_Tricks_I#LIKE_optimalization
http://postgres.cz/wiki/PostgreSQL_SQL_Tricks_I#LIKE_optimalization
CREATE INDEX like_index ON people(surname varchar_pattern_ops);Regards
Pavel Stehule
--
View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-slow-query-with-WHERE-conditions-with-OR-clause-on-primary-keys-tp5782803p5783927.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance