Обсуждение: Very slow query
The following query on some small datasets takes over a second to run.
I'd be grateful for some help in understanding the explain output, and
to remake the code.
Looks like the sort is using up quite a bit of resources. I made an
index on boards using columns "b.n_type, b.n_id, b.t_name" but the index
was not recorded in explain analyze. (see "testindex" below).
I am using PostgreSQL 7.4.2 on i386-pc-linux-gnu (Debian). The query is
in a psql function (I've changed variables to hard coded integers
below). The db is vacuumed every night.
Thanks for any help;
Rory
SELECT DISTINCT
b.n_id as id,
b.n_type,
CASE b.n_type WHEN 0 THEN 'personal'
WHEN 1 THEN 'private'
WHEN 2 THEN 'blog'
ELSE 'public'
END as type,
b.t_name as title
FROM
boards b, people p, objects o
WHERE
b.b_hidden = 'f'
AND
(
b.n_type = 3
OR
b.n_creator = 71
OR
( b.n_id = o.n_board_id
AND
o.n_creator = 71
AND
o.n_joined > 0
)
)
ORDER BY
b.n_type, b.n_id ASC, b.t_name;
trial=> \d boards
Table "public.boards"
Column | Type | Modifiers
---------------+-----------------------------+----------------------------------------------------------
n_id | integer | not null default nextval('public.boards_n_id_seq'::text)
b_hidden | boolean | default false
dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
t_mobile | character varying(15) |
t_email | character varying(50) |
n_creator | integer | not null
n_type | smallint | not null default 0
t_name | character varying(100) | not null
t_description | character varying(500) |
n_id_photo | integer |
n_bg_colour | integer | default 0
Indexes:
"boards_pkey" primary key, btree (n_id)
"boards_t_email_key" unique, btree (t_email)
"boards_t_mobile_key" unique, btree (t_mobile)
"testindex" btree (n_type, n_id, t_name)
Foreign-key constraints:
"$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
"$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
Triggers:
tr_update_modified_time BEFORE UPDATE ON boards FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()
trial=> \d people
Table "public.people"
Column | Type | Modifiers
-------------------+-----------------------------+----------------------------------------------------------
n_id | integer | not null default nextval('public.people_n_id_seq'::text)
n_object_id | integer |
n_objects_counter | integer | default 0
b_hidden | boolean | default false
dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
t_nickname | character varying(20) | not null
t_firstname | character varying(20) | not null
t_surname | character varying(25) | not null
t_mobile | character varying(15) |
t_email | character varying(50) |
t_password | character varying(15) | not null
b_registered | boolean | default false
n_email_status | smallint | default 0
n_txt_status | smallint | default 0
b_work_hours | boolean | default false
b_confirmations | boolean | default true
Indexes:
"people_pkey" primary key, btree (n_id)
"people_t_email_key" unique, btree (t_email)
"people_t_mobile_key" unique, btree (t_mobile)
"people_t_nickname_key" unique, btree (t_nickname)
Foreign-key constraints:
"object_chk" FOREIGN KEY (n_object_id) REFERENCES objects(n_id) ON UPDATE CASCADE ON DELETE SET NULL
Triggers:
tr_update_modified_time BEFORE UPDATE ON people FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()
trial=> \d objects
Table "public.objects"
Column | Type | Modifiers
---------------+-----------------------------+-----------------------------------------------------------
n_id | integer | not null default nextval('public.objects_n_id_seq'::text)
t_text_id | character varying(25) | not null
b_hidden | boolean | default false
dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
n_creator | integer | not null
n_type | smallint | not null default 0
t_name | character varying(100) | not null
t_description | text |
t_postcode | character varying(10) |
n_id_photo | integer |
n_board_id | integer | not null
n_joined | smallint |
b_trigger | boolean | default true
Indexes:
"objects_pkey" primary key, btree (n_id)
Foreign-key constraints:
"$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
"$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
"$3" FOREIGN KEY (n_board_id) REFERENCES boards(n_id) ON UPDATE CASCADE ON DELETE CASCADE
Unique (cost=3677.26..3711.48 rows=102 width=18) (actual time=12566.422..13045.404 rows=4 loops=1)
-> Sort (cost=3677.26..3684.10 rows=2738 width=18) (actual time=12566.413..12809.928 rows=158254 loops=1)
Sort Key: b.n_type, b.n_id, b.t_name, CASE WHEN (b.n_type = 0) THEN 'personal'::text WHEN (b.n_type = 1) THEN
'private'::te
xt WHEN (b.n_type = 2) THEN 'blog'::text ELSE 'public'::text END
-> Nested Loop (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254 loops=1)
-> Seq Scan on people p (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1)
-> Materialize (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362 loops=67)
-> Nested Loop (cost=3.27..3442.79 rows=41 width=18) (actual time=0.216..273.709 rows=2362
loops=1)
Join Filter: ((("inner".n_id = "outer".n_board_id) OR ("inner".n_type = 3) OR
("inner".n_creator= 71)) A
ND (("outer".n_creator = 71) OR ("inner".n_type = 3) OR ("inner".n_creator = 71)) AND (("outer".n_joined > 0) OR
("inner".n_type= 3
) OR ("inner".n_creator = 71)))
-> Seq Scan on objects o (cost=0.00..27.87 rows=787 width=10) (actual time=0.005..1.372
rows=787loops=
1)
-> Materialize (cost=3.27..4.29 rows=102 width=22) (actual time=0.001..0.105 rows=102
loops=787)
-> Seq Scan on boards b (cost=0.00..3.27 rows=102 width=22) (actual
time=0.017..0.352rows=102 lo
ops=1)
Filter: (b_hidden = false)
Total runtime: 13051.065 ms
--
Rory Campbell-Lange
<rory@campbell-lange.net>
<www.campbell-lange.net>
Sorry for replying to my own post, but I'm anxious for an answer. Should
I provide other information?
Thanks
Rory
On 10/05/04, Rory Campbell-Lange (rory@campbell-lange.net) wrote:
> The following query on some small datasets takes over a second to run.
> I'd be grateful for some help in understanding the explain output, and
> to remake the code.
>
> Looks like the sort is using up quite a bit of resources. I made an
> index on boards using columns "b.n_type, b.n_id, b.t_name" but the index
> was not recorded in explain analyze. (see "testindex" below).
>
> I am using PostgreSQL 7.4.2 on i386-pc-linux-gnu (Debian). The query is
> in a psql function (I've changed variables to hard coded integers
> below). The db is vacuumed every night.
>
> Thanks for any help;
> Rory
>
>
> SELECT DISTINCT
> b.n_id as id,
> b.n_type,
> CASE b.n_type WHEN 0 THEN 'personal'
> WHEN 1 THEN 'private'
> WHEN 2 THEN 'blog'
> ELSE 'public'
> END as type,
> b.t_name as title
> FROM
> boards b, people p, objects o
> WHERE
> b.b_hidden = 'f'
> AND
> (
> b.n_type = 3
> OR
> b.n_creator = 71
> OR
> ( b.n_id = o.n_board_id
> AND
> o.n_creator = 71
> AND
> o.n_joined > 0
> )
> )
> ORDER BY
> b.n_type, b.n_id ASC, b.t_name;
>
> trial=> \d boards
> Table "public.boards"
> Column | Type | Modifiers
> ---------------+-----------------------------+----------------------------------------------------------
> n_id | integer | not null default nextval('public.boards_n_id_seq'::text)
> b_hidden | boolean | default false
> dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
> dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
> t_mobile | character varying(15) |
> t_email | character varying(50) |
> n_creator | integer | not null
> n_type | smallint | not null default 0
> t_name | character varying(100) | not null
> t_description | character varying(500) |
> n_id_photo | integer |
> n_bg_colour | integer | default 0
> Indexes:
> "boards_pkey" primary key, btree (n_id)
> "boards_t_email_key" unique, btree (t_email)
> "boards_t_mobile_key" unique, btree (t_mobile)
> "testindex" btree (n_type, n_id, t_name)
> Foreign-key constraints:
> "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
> "$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
> Triggers:
> tr_update_modified_time BEFORE UPDATE ON boards FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()
>
> trial=> \d people
> Table "public.people"
> Column | Type | Modifiers
> -------------------+-----------------------------+----------------------------------------------------------
> n_id | integer | not null default nextval('public.people_n_id_seq'::text)
> n_object_id | integer |
> n_objects_counter | integer | default 0
> b_hidden | boolean | default false
> dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
> dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
> t_nickname | character varying(20) | not null
> t_firstname | character varying(20) | not null
> t_surname | character varying(25) | not null
> t_mobile | character varying(15) |
> t_email | character varying(50) |
> t_password | character varying(15) | not null
> b_registered | boolean | default false
> n_email_status | smallint | default 0
> n_txt_status | smallint | default 0
> b_work_hours | boolean | default false
> b_confirmations | boolean | default true
> Indexes:
> "people_pkey" primary key, btree (n_id)
> "people_t_email_key" unique, btree (t_email)
> "people_t_mobile_key" unique, btree (t_mobile)
> "people_t_nickname_key" unique, btree (t_nickname)
> Foreign-key constraints:
> "object_chk" FOREIGN KEY (n_object_id) REFERENCES objects(n_id) ON UPDATE CASCADE ON DELETE SET NULL
> Triggers:
> tr_update_modified_time BEFORE UPDATE ON people FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()
>
> trial=> \d objects
> Table "public.objects"
> Column | Type | Modifiers
> ---------------+-----------------------------+-----------------------------------------------------------
> n_id | integer | not null default nextval('public.objects_n_id_seq'::text)
> t_text_id | character varying(25) | not null
> b_hidden | boolean | default false
> dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
> dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
> n_creator | integer | not null
> n_type | smallint | not null default 0
> t_name | character varying(100) | not null
> t_description | text |
> t_postcode | character varying(10) |
> n_id_photo | integer |
> n_board_id | integer | not null
> n_joined | smallint |
> b_trigger | boolean | default true
> Indexes:
> "objects_pkey" primary key, btree (n_id)
> Foreign-key constraints:
> "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
> "$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
> "$3" FOREIGN KEY (n_board_id) REFERENCES boards(n_id) ON UPDATE CASCADE ON DELETE CASCADE
>
>
>
> Unique (cost=3677.26..3711.48 rows=102 width=18) (actual time=12566.422..13045.404 rows=4 loops=1)
> -> Sort (cost=3677.26..3684.10 rows=2738 width=18) (actual time=12566.413..12809.928 rows=158254 loops=1)
> Sort Key: b.n_type, b.n_id, b.t_name, CASE WHEN (b.n_type = 0) THEN 'personal'::text WHEN (b.n_type = 1)
THEN'private'::te
> xt WHEN (b.n_type = 2) THEN 'blog'::text ELSE 'public'::text END
> -> Nested Loop (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254
loops=1)
> -> Seq Scan on people p (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1)
> -> Materialize (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362
loops=67)
> -> Nested Loop (cost=3.27..3442.79 rows=41 width=18) (actual time=0.216..273.709 rows=2362
loops=1)
> Join Filter: ((("inner".n_id = "outer".n_board_id) OR ("inner".n_type = 3) OR
("inner".n_creator= 71)) A
> ND (("outer".n_creator = 71) OR ("inner".n_type = 3) OR ("inner".n_creator = 71)) AND (("outer".n_joined > 0) OR
("inner".n_type= 3
> ) OR ("inner".n_creator = 71)))
> -> Seq Scan on objects o (cost=0.00..27.87 rows=787 width=10) (actual time=0.005..1.372
rows=787loops=
> 1)
> -> Materialize (cost=3.27..4.29 rows=102 width=22) (actual time=0.001..0.105 rows=102
loops=787)
> -> Seq Scan on boards b (cost=0.00..3.27 rows=102 width=22) (actual
time=0.017..0.352rows=102 lo
> ops=1)
> Filter: (b_hidden = false)
> Total runtime: 13051.065 ms
>
>
>
> --
> Rory Campbell-Lange
> <rory@campbell-lange.net>
> <www.campbell-lange.net>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>
--
Rory Campbell-Lange
<rory@campbell-lange.net>
<www.campbell-lange.net>
[snip] Rory Campbell-Lange wrote: >> >> >>SELECT DISTINCT >> b.n_id as id, >> b.n_type, >> CASE b.n_type WHEN 0 THEN 'personal' >> WHEN 1 THEN 'private' >> WHEN 2 THEN 'blog' >> ELSE 'public' >> END as type, >> b.t_name as title >> FROM >> boards b, people p, objects o >> WHERE >> b.b_hidden = 'f' >> AND >> ( >> b.n_type = 3 >> OR >> b.n_creator = 71 >> OR >> ( b.n_id = o.n_board_id >> AND >> o.n_creator = 71 >> AND >> o.n_joined > 0 >> ) >> ) >> ORDER BY >> b.n_type, b.n_id ASC, b.t_name; >> You don't seem to be joining the people table to the boards or objects table. In fact the people table is never referenced anywhere but the FROM clause. This might be why it is not running so well. People seems to be referenced by both boards and objects, so does there need to be a join between all 3 tables? [snip] >> "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE [snip] >> "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE It is this "missing" join which seems to cause 158254 rows to appear in this nested loop, when PG only reckons it is going to see 2738 rows >> -> Nested Loop (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254 loops=1) I cant remember what the correct term is, but basically all rows from the seq scan combines with all rows from the materialize, which is what is causing the 150,000 odd rows to appear (is that called a cross product anyone?) i.e 67 rows from the seq x 2362 rows from the materialize = 158254 output rows for the nested loop to chug through. >> -> Seq Scan on people p (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1) >> -> Materialize (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362 loops=67) I reckon fix the unconstrained join and your query might run a little faster. >> -> Nested Loop (cost=3.27..3442.79 rows=41 width=18) (actual time=0.216..273.709 rows=2362 loops=1) The estimate for this nested loop seems a little off, note the guesstimate in the first set of brackets of 41, and compare with the actual result in the second set of 2362. Have you vacuum analyzed recently? HTH Nick
On 10/05/2004 22:30 Rory Campbell-Lange wrote: > Sorry for replying to my own post, but I'm anxious for an answer. Should > I provide other information? Look carefully at your column types. I can see several smallint columns in there WHERE clause which are not expicitely typed as such. -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
Rory Campbell-Lange wrote:
>Wow, this is an education! My php profiler now shows:
>
>0.02 db->db_board_listing C: /var/www/trial/php/db.php:175 M: 867024
>0.02 ob_start C: /var/www/trial/php/db.php:238 M: 867024
>0.02 pg_exec C: /var/www/trial/php/db.php:239 M: 908328
>1.24 pg_result_status C: /var/www/trial/php/db.php:240 M: 908392
>1.24 ob_end_clean C: /var/www/trial/php/db.php:241 M: 908416
>1.24 db->result_checker C: /var/www/trial/php/db.php:243 M: 867472
>1.24 is_resource C: /var/www/trial/php/db.php:1317 M: 867472
>1.24 pg_numrows C: /var/www/trial/php/db.php:248 M: 867568
>1.24 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867600
>1.24 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867760
>
>and now shows:
>
>C: /var/www/trial/php/core.php:151 M: 867040
> 0.02 db->db_board_listing C: /var/www/trial/php/db.php:175 M: 867040
> 0.02 ob_start C: /var/www/trial/php/db.php:238 M: 867040
> 0.02 pg_exec C: /var/www/trial/php/db.php:239 M: 908344
> 0.14 pg_result_status C: /var/www/trial/php/db.php:240 M: 908408
> 0.14 ob_end_clean C: /var/www/trial/php/db.php:241 M: 908432
> 0.14 db->result_checker C: /var/www/trial/php/db.php:243 M: 867488
> 0.14 is_resource C: /var/www/trial/php/db.php:1317 M: 867488
> 0.14 pg_numrows C: /var/www/trial/php/db.php:248 M: 867584
> 0.14 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867616
> 0.14 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867776
>
>This is a saving of 1.1 seconds!
>
>
>
Thats a bit better ;-)
>>On 10/05/2004 22:30 Rory Campbell-Lange wrote:
>>
>>
>>>Sorry for replying to my own post, but I'm anxious for an answer. Should
>>>I provide other information?
>>>
>>>
>>Look carefully at your column types. I can see several smallint columns in
>>there WHERE clause which are not expicitely typed as such.
>>
>>
>
>I'm not sure how to do this, Paul. do I do b.n_id::smallint ? Is
>smallint not implied?
>
>Thanks
>Rory
>
>
>
Not quite. Explicit casts are needed when you have any numbers in the
WHERE condition and the columns are not of type integer/int4. For
example I have tweaked your query.
WHERE
b.b_hidden = 'f'
AND
(
b.n_type = 3::smallint
OR
b.n_creator = 71
OR
(
b.n_id = o.n_board_id
AND
o.n_creator = 71
AND
o.n_joined > 0::smallint
)
)
Note that b.n_creator and o.n_creator do not need explicit casts because
they are both of type integer anyway. You could of course put them in
for clarity. PG only casts the numbers to integer's, and not to smallint
or bigint, which basically means it does not use any indexes on that
column. This is fixed in 7.4 I believe, which you seem to be running
anyway so you might not be affected.
HTH
Nick
On 11/05/04, Nick Barr (nicky@chuckie.co.uk) wrote: > Rory Campbell-Lange wrote: > > > Look carefully at your column types. I can see several smallint > > > columns in there WHERE clause which are not expicitely typed as > > > such. > > I'm not sure how to do this, Paul. do I do b.n_id::smallint ? Is > > smallint not implied? > Not quite. Explicit casts are needed when you have any numbers in the > WHERE condition and the columns are not of type integer/int4. For > example I have tweaked your query. ... > Note that b.n_creator and o.n_creator do not need explicit casts because > they are both of type integer anyway. You could of course put them in > for clarity. PG only casts the numbers to integer's, and not to smallint > or bigint, which basically means it does not use any indexes on that > column. This is fixed in 7.4 I believe, which you seem to be running > anyway so you might not be affected. Mmm. Seems like I should try profiling with and without the cast. Maybe there is little value in defining a column as a smallint, other than checking the length of input. Thanks again for your help, Rory -- Rory Campbell-Lange <rory@campbell-lange.net> <www.campbell-lange.net>
Rory Campbell-Lange <rory@campbell-lange.net> writes:
> The following query on some small datasets takes over a second to run.
> I'd be grateful for some help in understanding the explain output, and
> to remake the code.
I think the problem is you have an unconstrained join against "people
p", which is wasting a bunch of time generating duplicate records
(and then sorting them, and then discarding them in the DISTINCT step...)
regards, tom lane