query optimization scenarios 17,701 times faster!!!

Поиск
Список
Период
Сортировка
Hi Everyone - this is my first post (but I have been lurking on and off for
a couple of years). Congratulations on a steadily improving product!

This is not a question, just some observations on performance that I thought
might trigger someone's thinking on ways to improve query optimization.

The following is a list of query pairs (one fast, one slow) that must
produce identical results by definition (and do), but have very different
execution times. Especially the last example. I did some development work on
a commercial SQL database product 7 years ago (names will not be used), so
although I am clueless about PostgreSQL internals, I (think) I have a grip
on some of the query optimization issues (though not necessarily a grip on
life). The data set used for all of these queries was very small - most
tables had a few hundred records or less. No, its not very scientific, but I
believe its illustrative none-the-less.

I'll just make a couple of observations on the last query and leave everyone
else to reach their own conclusions.

1) the two versions of the last query must produce identical results by
definition (and they do)
2) it appears that the optimizer is doing all of the join work before ever
considering the impact of where clause restrictions. (this may not be the
case, but it appears so)
3) It could have said, hey, I have a where clause restriction on the primary
key that is equal to a fixed value. So I have a single row from that table
to deal with, all of the columns come from that table too, and further its
left joined to the rest of the crap so I can safely ignore it.

I always think its illustrative to look at extreme examples like this to
point out optimizations that may be overlooked. If I ever get some free
time, I look forward to contributing to this wonderful project!


NOTES:
1) I didn't include the schema to keep this post reasonable. send email to
rdyas@adelphia.net if you want the schema to look into this further.
2) the primary keys for the following tables are
org_milestones.id
tasks.task_id
contacts.contact_id
organizations.org_id


EXPLAIN ANALYZE SELECT DISTINCT org_milestones.completed_on,
org_milestones.id, org_milestones.milestone_id, org_milestones.notes,
org_milestones.org_id FROM org_milestones RIGHT OUTER JOIN organizations ON
(org_milestones.org_id = organizations.org_id) LEFT OUTER JOIN contacts ON
(contacts.org_id = organizations.org_id) LEFT OUTER JOIN tasks ON
(tasks.org_id = organizations.org_id) WHERE (organizations.org_id = 71)
ORDER BY org_milestones.completed_on, org_milestones.id
79.20 msec

EXPLAIN ANALYZE SELECT org_milestones.completed_on, org_milestones.id,
org_milestones.milestone_id, org_milestones.notes, org_milestones.org_id
FROM org_milestones RIGHT OUTER JOIN organizations ON (org_milestones.org_id
= organizations.org_id) WHERE (organizations.org_id = 71) ORDER BY
org_milestones.completed_on, org_milestones.id
6.44 msec
= 12 times faster

---------------
EXPLAIN ANALYZE SELECT DISTINCT tasks.completed, tasks.contact_id,
tasks.created_by, tasks.notes, tasks.objective, tasks.org_id, tasks.outcome,
tasks.priority, tasks.start_date, tasks.start_time, tasks.task_id,
tasks.task_type FROM tasks RIGHT OUTER organizations ON (tasks.org_id =
organizations.org_id) LEFT OUTER JOIN org_milestones ON
(org_milestones.org_id = organizations.org_id) LEFT OUTER JOIN contacts ON
(contacts.org_id = organizations.org_id) WHERE (organizations.org_id = 71)
ORDER BY tasks.start_date, tasks.start_time
2,548.71 msec

EXPLAIN ANALYZE SELECT tasks.completed, tasks.contact_id, tasks.created_by,
tasks.notes, tasks.objective, tasks.org_id, tasks.outcome, tasks.priority,
tasks.start_date, tasks.start_time, tasks.task_id, tasks.task_type FROM
tasks RIGHT OUTER JOIN organizations ON (tasks.org_id =
organizations.org_id) WHERE (organizations.org_id = 71) ORDER BY
tasks.start_date, tasks.start_time
29.21 msec
= 87 times faster

-----------
EXPLAIN ANALYZE SELECT DISTINCT contacts.address_1, contacts.address_2,
contacts.assistant_email, contacts.assistant_name, contacts.assistant_phone,
contacts.city, contacts.contact_id, contacts.email_address,
contacts.first_name, contacts.functional_role, contacts.last_name,
contacts.needs, contacts.notes, contacts.org_id, contacts.pain,
contacts.phone, contacts.reasons, contacts.reports_to, contacts.state,
contacts.title, contacts.zip_code, (contacts.first_name || ' ' ||
contacts.last_name) AS full_name FROM contacts RIGHT OUTER JOIN
organizations ON (contacts.org_id = organizations.org_id) LEFT OUTER JOIN
org_milestones ON (org_milestones.org_id = organizations.org_id) LEFT OUTER
JOIN tasks ON (tasks.org_id = organizations.org_id) WHERE
(organizations.org_id = 71)
2056.83 msec

EXPLAIN ANALYZE SELECT DISTINCT contacts.address_1, contacts.address_2,
contacts.assistant_email, contacts.assistant_name, contacts.assistant_phone,
contacts.city, contacts.contact_id, contacts.email_address,
contacts.first_name, contacts.functional_role, contacts.last_name,
contacts.needs, contacts.notes, contacts.org_id, contacts.pain,
contacts.phone, contacts.reasons, contacts.reports_to, contacts.state,
contacts.title, contacts.zip_code, (contacts.first_name || ' ' ||
contacts.last_name) AS full_name FROM contacts RIGHT OUTER JOIN
organizations ON (contacts.org_id = organizations.org_id) WHERE
(organizations.org_id = 71)
27.41 msec
= 75 times faster
-----------------

EXPLAIN ANALYZE SELECT DISTINCT organizations.city, organizations.inactive,
organizations.java_developers, organizations.name, organizations.org_id,
organizations.overview, organizations.phone, organizations.salesperson,
organizations.state, organizations.time_zone FROM organizations LEFT OUTER
JOIN org_milestones ON (org_milestones.org_id = organizations.org_id) LEFT
OUTER JOIN contacts ON (contacts.org_id = organizations.org_id) LEFT OUTER
JOIN tasks ON (tasks.org_id = organizations.org_id) WHERE
(organizations.org_id = 71) ORDER BY organizations.name
12,567.87 msec

EXPLAIN ANALYZE SELECT organizations.city, organizations.inactive,
organizations.java_developers, organizations.name, organizations.org_id,
organizations.overview, organizations.phone, organizations.salesperson,
organizations.state, organizations.time_zone FROM organizations WHERE
(organizations.org_id = 71)
0.71 msec
= 17,701 times faster
-----------------------



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: CLOSE command tag
Следующее
От: Dennis Björklund
Дата:
Сообщение: Re: query optimization scenarios 17,701 times faster!!!