Обсуждение: BUG #5294: Sorts on more than just the order-by clause

Поиск
Список
Период
Сортировка

BUG #5294: Sorts on more than just the order-by clause

От
"Allen Johnson"
Дата:
The following bug has been logged online:

Bug reference:      5294
Logged by:          Allen Johnson
Email address:      akjohnson78@gmail.com
PostgreSQL version: 8.4.2
Operating system:   Red Hat ES 5.4
Description:        Sorts on more than just the order-by clause
Details:

I've been porting our app from Oracle to Postgres and keeping an eye on
performance. Largely, Postgres is performing just as well or better!

I did run into an issue where we are performing a group-by on about 10
columns followed by an order-by of about 5 columns. This query was taking
twice as long as Oracle. When looking at the explain plan, Postgres seems to
be using all the columns in the group-by for sorting instead of _only_ using
what is in the order-by.

While the results are correct this seems to be causing a performance problem
since Postgres is sorting on more columns than it is being asked to. I
reworked the query to get rid of the extra sorting columns but this seems
like a hack. Below is an example query followed by the ugly hack.

Note: The execution times in this example don't mean anything because they
are running on a blank test db. On the production database there was a huge
difference in execution time in favor of the hack query. I just wanted to
illustrate that the sort keys seem incorrect.

Example Query:

select
  ct.name, c.lname, c.fname, c.mname,
  c.email, c.address1, c.address2,
  c.city, c.state, c.zip_code,
  count(a.id)
from
  contacts c
  inner join contact_types ct on (ct.code = c.contact_type_code)
  left join attachments a on (a.contact_id = c.id)
where
  c.company_id = 1
group by
  ct.name, c.lname, c.fname, c.mname,
  c.email, c.address1, c.address2,
  c.city, c.state, c.zip_code
order by
  ct.name, c.lname, c.fname, c.mname;

Example Explain:
                                                                       QUERY
PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------

 GroupAggregate  (cost=27.98..28.11 rows=3 width=1864) (actual
time=0.037..0.037 rows=0 loops=1)
   ->  Sort  (cost=27.98..27.99 rows=3 width=1864) (actual time=0.035..0.035
rows=0 loops=1)
         Sort Key: ct.name, c.lname, c.fname, c.mname, c.email, c.address1,
c.address2, c.city, c.state, c.zip_code
         Sort Method:  quicksort  Memory: 17kB
         ->  Nested Loop Left Join  (cost=4.27..27.96 rows=3 width=1864)
(actual time=0.017..0.017 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..16.55 rows=1 width=1864) (actual
time=0.014..0.014 rows=0 loops=1)
                     ->  Index Scan using contacts_company on contacts c
(cost=0.00..8.27 rows=1 width=1760) (actual time=0.012..0.012 rows=0
loops=1)
                           Index Cond: (company_id = 1)
                     ->  Index Scan using contact_types_pkey on
contact_types ct  (cost=0.00..8.27 rows=1 width=152) (never executed)
                           Index Cond: ((ct.code)::text =
(c.contact_type_code)::text)
               ->  Bitmap Heap Scan on attachments a  (cost=4.27..11.37
rows=3 width=12) (never executed)
                     Recheck Cond: (a.contact_id = c.id)
                     ->  Bitmap Index Scan on attachments_contact
(cost=0.00..4.27 rows=3 width=0) (never executed)
                           Index Cond: (a.contact_id = c.id)
 Total runtime: 0.192 ms
(15 rows)

* Notice how the sort key is using many more columns than the order-by has
specified.

Hack Query:
select * from (
  select
    ct.name as ct_name, c.lname, c.fname, c.mname,
    c.email, c.address1, c.address2,
    c.city, c.state, c.zip_code,
    count(a.id)
  from
    contacts c
    inner join contact_types ct on (ct.code = c.contact_type_code)
    left join attachments a on (a.contact_id = c.id)
  where
    c.company_id = 1
  group by
    ct.name, c.lname, c.fname, c.mname,
    c.email, c.address1, c.address2,
    c.city, c.state, c.zip_code
) as results
order by
  ct_name, lname, fname, mname;

Hack Explain:
                                                                       QUERY
PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------

 Sort  (cost=28.13..28.14 rows=3 width=1868) (actual time=0.054..0.054
rows=0 loops=1)
   Sort Key: ct.name, c.lname, c.fname, c.mname
   Sort Method:  quicksort  Memory: 17kB
   ->  HashAggregate  (cost=28.04..28.08 rows=3 width=1864) (actual
time=0.021..0.021 rows=0 loops=1)
         ->  Nested Loop Left Join  (cost=4.27..27.96 rows=3 width=1864)
(actual time=0.018..0.018 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..16.55 rows=1 width=1864) (actual
time=0.016..0.016 rows=0 loops=1)
                     ->  Index Scan using contacts_company on contacts c
(cost=0.00..8.27 rows=1 width=1760) (actual time=0.013..0.013 rows=0
loops=1)
                           Index Cond: (company_id = 1)
                     ->  Index Scan using contact_types_pkey on
contact_types ct  (cost=0.00..8.27 rows=1 width=152) (never executed)
                           Index Cond: ((ct.code)::text =
(c.contact_type_code)::text)
               ->  Bitmap Heap Scan on attachments a  (cost=4.27..11.37
rows=3 width=12) (never executed)
                     Recheck Cond: (a.contact_id = c.id)
                     ->  Bitmap Index Scan on attachments_contact
(cost=0.00..4.27 rows=3 width=0) (never executed)
                           Index Cond: (a.contact_id = c.id)
 Total runtime: 0.259 ms
(15 rows)

Re: BUG #5294: Sorts on more than just the order-by clause

От
Tom Lane
Дата:
"Allen Johnson" <akjohnson78@gmail.com> writes:
> I did run into an issue where we are performing a group-by on about 10
> columns followed by an order-by of about 5 columns. This query was taking
> twice as long as Oracle. When looking at the explain plan, Postgres seems to
> be using all the columns in the group-by for sorting instead of _only_ using
> what is in the order-by.

If it's using a sort-uniq type of plan (which it is), it has to sort
by all the grouping columns, else the results won't be correct.

In practice, I really doubt this would make a measurable performance
difference, since most row comparisons would arrive at a result before
they got to the lowest-order columns.

I think your gripe may actually have to do with a misestimate of the
relative costs of hash- and sort-based grouping, but analyze results
on a toy example don't illuminate that sort of problem at all :-(

            regards, tom lane

Re: BUG #5294: Sorts on more than just the order-by clause

От
Allen Johnson
Дата:
> In practice, I really doubt this would make a measurable performance
> difference, since most row comparisons would arrive at a result before
> they got to the lowest-order columns.
>
> I think your gripe may actually have to do with a misestimate of the
> relative costs of hash- and sort-based grouping, but analyze results
> on a toy example don't illuminate that sort of problem at all :-(

Yes, this toy example doesn't show how much time was spent on the
actual sorting (of the production data, obviously). What I can do is
assemble a test database with similar amount of data and repost the
`explain analyze` from that if there is any interest.

What I noticed in the production query was that ~1000ms was spent on
sorting alone. The hack query reduced that to ~400ms. I should also
note that there was plenty of work_mem and that the sort was not
hitting disk.

I should be able to get that going sometime early tomorrow. All I'm
going to do is generate a lot of contacts by randomly choosing from a
set of lastnames, firstnames, etc as well as randomly insert some
number of attachments for each.

I'm open to any suggestions on testing methodologies.

AJ

Re: BUG #5294: Sorts on more than just the order-by clause

От
Greg Stark
Дата:
On Thu, Jan 21, 2010 at 9:27 PM, Allen Johnson <akjohnson78@gmail.com> wrote:
> What I noticed in the production query was that ~1000ms was spent on
> sorting alone. The hack query reduced that to ~400ms. I should also
> note that there was plenty of work_mem and that the sort was not
> hitting disk.
>

The "hack" form actually seems far worse to me but I guess it depends
on the actual data layout.

Have you tried something like this which I would expect to be far faster:

select
 ct.name, c.lname, c.fname, c.mname,
 c.email, c.address1, c.address2,
 c.city, c.state, c.zip_code,
 (select count(*) from attachments where attachments.contact_id = c.id)
from
 contacts c
 inner join contact_types ct on (ct.code = c.contact_type_code)
where
 c.company_id = 1
order by
 ct.name, c.lname, c.fname, c.mname;


The question arises why Postgres can't automatically detect that this
query is equivalent. That might come when we start implementing the
"functional dependency" stuff from the standard and can determine that
the group by list uniquely identifies a row from the first join.
Currently we don't do that kind of analysis.

--
greg

Re: BUG #5294: Sorts on more than just the order-by clause

От
Greg Stark
Дата:
On Fri, Jan 22, 2010 at 2:02 PM, Greg Stark <gsstark@mit.edu> wrote:
> The question arises why Postgres can't automatically detect that this
> query is equivalent.

Hm, actually rereading your query it's not technically equivalent.
Since you weren't grouping on contact.id or contact_type.code if you
happened to have multiple contacts that had the same name and address
and multiple codes which had the same name then your query would
consolidate them into one result record. You might know that will
never happen but the database can't prove that's true so it would
never be able to do this transform.


--
greg

Re: BUG #5294: Sorts on more than just the order-by clause

От
Allen Johnson
Дата:
Ok, I've generated a test database with:
  * 20,000 users
  * 250,000 contacts
  * 1,124,700 attachments

The summary of the results is that the normal query takes about 32sec
on my machine. The hack query takes about  13sec.

Below are the queries and their `explain analyze` outputs. I've
attached the test layout as well as a zip file containing the ruby
scripts that generate the data in the default format that the 'copy'
command expects.

If anyone else wants to give it a try this is the procedure. I wrote
these scripts in a hurry so I'm sure there could be many improvements
:)

1. Apply the tables.sql file to a test database (might want to run the
create index commands after the data load)
2. Create the data files using scripts in the create-scripts.zip like this:
      ruby create-user.rb ; creates /tmp/users
      ruby create-contacts.rb ; creates /tmp/contacts
      ruby create-attachments.rb ; creates /tmp/attachments
3. Load data into the test database
      copy users from '/tmp/users';
      copy contacts from '/tmp/contacts';
      copy attachments from '/tmp/attachments';
4. Run the `create index` statements in the tables.sql file

I also have a pg_dump version if anyone wants it (~6MB gzipped).

/* NORMAL QUERY */
select
  users.id, users.full_name, users.username,
  ct.name as type_name,
  c.lname, c.fname, c.mname, c.email,
  c.address1, c.city, c.state, c.zip_code,
  c.created_date,
  count(a.id) as attachment_count

from
  contacts c
  inner join users on (users.id = c.user_id)
  inner join contact_types ct on (ct.code = c.contact_type_code)
  left join attachments a on (a.contact_id = c.id)

where
  users.company_id = 1
  and c.contact_type_code in ('BOSS', 'EMP', 'WORK')

group by
  users.id, users.full_name, users.username,
  ct.name, c.lname, c.fname, c.mname, c.email,
  c.address1, c.city, c.state, c.zip_code,
  c.created_date

order by
  users.full_name,
  ct.name, c.lname, c.fname, c.mname,
  c.created_date

EXPLAIN ANALYZE OUTPUT:

        QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=102724.80..122894.80 rows=403400 width=124)
(actual time=30683.912..32431.153 rows=75228 loops=1)
   ->  Sort  (cost=102724.80..103733.30 rows=403400 width=124) (actual
time=30683.869..31236.760 rows=345543 loops=1)
         Sort Key: users.full_name, ct.name, c.lname, c.fname,
c.mname, c.created_date, users.id, users.username, c.email,
c.address1, c.city, c.state, c.zip_code
         Sort Method:  quicksort  Memory: 96578kB
         ->  Merge Right Join  (cost=16571.91..65164.53 rows=403400
width=124) (actual time=1946.609..7523.831 rows=345543 loops=1)
               Merge Cond: (a.contact_id = c.id)
               ->  Index Scan using attachments_contact on attachments
a  (cost=0.00..39729.87 rows=1124700 width=12) (actual
time=0.082..2150.242 rows=1124662 loops=1)
               ->  Sort  (cost=16571.91..16732.30 rows=64157
width=124) (actual time=1946.463..2384.232 rows=345537 loops=1)
                     Sort Key: c.id
                     Sort Method:  quicksort  Memory: 21439kB
                     ->  Hash Join  (cost=553.92..11449.19 rows=64157
width=124) (actual time=81.120..1584.615 rows=75228 loops=1)
                           Hash Cond: ((c.contact_type_code)::text =
(ct.code)::text)
                           ->  Hash Join  (cost=552.81..10565.92
rows=64157 width=121) (actual time=81.063..1285.727 rows=75228
loops=1)
                                 Hash Cond: (c.user_id = users.id)
                                 ->  Seq Scan on contacts c
(cost=0.00..7534.50 rows=122469 width=96) (actual time=0.023..526.259
rows=150077 loops=1)
                                       Filter:
((contact_type_code)::text = ANY ('{BOSS,EMP,WORK}'::text[]))
                                 ->  Hash  (cost=427.00..427.00
rows=10065 width=33) (actual time=80.974..80.974 rows=10065 loops=1)
                                       ->  Seq Scan on users
(cost=0.00..427.00 rows=10065 width=33) (actual time=0.022..37.797
rows=10065 loops=1)
                                             Filter: (company_id = 1)
                           ->  Hash  (cost=1.05..1.05 rows=5 width=12)
(actual time=0.037..0.037 rows=5 loops=1)
                                 ->  Seq Scan on contact_types ct
(cost=0.00..1.05 rows=5 width=12) (actual time=0.018..0.024 rows=5
loops=1)
 Total runtime: 32551.132 ms
(22 rows)


/* HACK QUERY */
select * from (
  select
    users.id, users.full_name, users.username,
    ct.name as type_name,
    c.lname, c.fname, c.mname, c.email,
    c.address1, c.city, c.state, c.zip_code,
    c.created_date,
    count(a.id) as attachment_count

  from
    contacts c
    inner join users on (users.id = c.user_id)
    inner join contact_types ct on (ct.code = c.contact_type_code)
    left join attachments a on (a.contact_id = c.id)

  where
    users.company_id = 1
    and c.contact_type_code in ('BOSS', 'EMP', 'WORK')

  group by
    users.id, users.full_name, users.username,
    ct.name, c.lname, c.fname, c.mname, c.email,
    c.address1, c.city, c.state, c.zip_code,
    c.created_date
) as results
order by
  full_name,
  type_name, lname, fname, mname,
  created_date

/* EXPLAIN ANALYZE OUTPUT */

        QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=420987.80..421996.30 rows=403400 width=1688) (actual
time=12579.266..12663.383 rows=75228 loops=1)
   Sort Key: users.full_name, ct.name, c.lname, c.fname, c.mname, c.created_date
   Sort Method:  quicksort  Memory: 21435kB
   ->  HashAggregate  (cost=79283.53..84326.03 rows=403400 width=124)
(actual time=9546.773..9721.322 rows=75228 loops=1)
         ->  Merge Right Join  (cost=16571.91..65164.53 rows=403400
width=124) (actual time=1857.597..7428.718 rows=345543 loops=1)
               Merge Cond: (a.contact_id = c.id)
               ->  Index Scan using attachments_contact on attachments
a  (cost=0.00..39729.87 rows=1124700 width=12) (actual
time=0.862..2298.099 rows=1124662 loops=1)
               ->  Sort  (cost=16571.91..16732.30 rows=64157
width=124) (actual time=1856.666..2279.646 rows=345537 loops=1)
                     Sort Key: c.id
                     Sort Method:  quicksort  Memory: 21439kB
                     ->  Hash Join  (cost=553.92..11449.19 rows=64157
width=124) (actual time=54.465..1500.539 rows=75228 loops=1)
                           Hash Cond: ((c.contact_type_code)::text =
(ct.code)::text)
                           ->  Hash Join  (cost=552.81..10565.92
rows=64157 width=121) (actual time=54.375..1210.697 rows=75228
loops=1)
                                 Hash Cond: (c.user_id = users.id)
                                 ->  Seq Scan on contacts c
(cost=0.00..7534.50 rows=122469 width=96) (actual time=0.035..492.531
rows=150077 loops=1)
                                       Filter:
((contact_type_code)::text = ANY ('{BOSS,EMP,WORK}'::text[]))
                                 ->  Hash  (cost=427.00..427.00
rows=10065 width=33) (actual time=54.277..54.277 rows=10065 loops=1)
                                       ->  Seq Scan on users
(cost=0.00..427.00 rows=10065 width=33) (actual time=0.032..27.676
rows=10065 loops=1)
                                             Filter: (company_id = 1)
                           ->  Hash  (cost=1.05..1.05 rows=5 width=12)
(actual time=0.055..0.055 rows=5 loops=1)
                                 ->  Seq Scan on contact_types ct
(cost=0.00..1.05 rows=5 width=12) (actual time=0.026..0.035 rows=5
loops=1)
 Total runtime: 12752.540 ms
(22 rows)

Вложения

Re: BUG #5294: Sorts on more than just the order-by clause

От
Tom Lane
Дата:
Allen Johnson <akjohnson78@gmail.com> writes:
> Ok, I've generated a test database with:
>   * 20,000 users
>   * 250,000 contacts
>   * 1,124,700 attachments
> The summary of the results is that the normal query takes about 32sec
> on my machine. The hack query takes about  13sec.

I poked at this for a bit.  At least with the test data (dunno about
your real data), the first few grouping columns are pretty nearly unique
so the "extra" sort columns really aren't affecting the runtime anyway.
I believe that the reason the hacked query is cheaper is simply that the
sort is sorting fewer rows because it's applied after aggregation
instead of beforehand.

The planner is well aware of that effect, but the reason it fails to
choose hashed aggregation is that it doesn't think the aggregation will
reduce the number of rows --- so it estimates the sort for that case as
being much more expensive than it really is.  Notice that the
post-aggregation and pre-aggregation rowcount estimates are just the
same in both these queries.  If I force choose_hashed_grouping() to
make the other decision, I get the same plan out of the "normal"
query as the hacked query produces.

I have an idea for improving the accuracy of the post-aggregation
rowcount estimate, which I'll post on pgsql-hackers in a bit.  But
it's not something I have enough confidence in to risk back-patching.
So for the moment your hack with forcing the sort to be done separately
is probably your best answer.

            regards, tom lane