Обсуждение: PGSQL-Performance mailing list

Поиск
Список
Период
Сортировка

PGSQL-Performance mailing list

От
John Sherwood
Дата:
Hi,

I'm having some trouble getting this query down for large users.   Basically, we deal with competition entries.  For this customer (a.k.a. site) we've got ~1,000,000 entries from ~100,000 contestants.

SELECT  contestants.*, sum(entries.worth) as db_entries, count(entries.id) as db_actions FROM "contestants" INNER JOIN "entries" ON "entries"."contestant_id" = "contestants"."id" WHERE "entries"."campaign_id" IN (SELECT id FROM "campaigns"  WHERE "campaigns"."site_id" = $1) AND (entries.status != 'Invalid') GROUP BY contestants.id  ORDER BY db_actions desc LIMIT 20 OFFSET 0

Here's the explain:

                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3783168.28..3783168.33 rows=20 width=103)
   ->  Sort  (cost=3783168.28..3792902.85 rows=3893825 width=103)
         Sort Key: (count(entries.id))
         ->  GroupAggregate  (cost=3004896.86..3679555.00 rows=3893825 width=103)
               ->  Merge Join  (cost=3004896.86..3611413.06 rows=3893825 width=103)
                     Merge Cond: (contestants.id = entries.contestant_id)
                     ->  Index Scan using contestants_pkey on contestants  (cost=0.43..534782.85 rows=7490021 width=95)
                     ->  Sort  (cost=3004710.57..3014464.64 rows=3901628 width=12)
                           Sort Key: entries.contestant_id
                           ->  Nested Loop  (cost=19.88..2577567.27 rows=3901628 width=12)
                                 ->  Bitmap Heap Scan on campaigns  (cost=19.32..1332.62 rows=374 width=4)
                                       Recheck Cond: (site_id = 152)
                                       ->  Bitmap Index Scan on index_campaigns_on_site_id  (cost=0.00..19.22 rows=374 width=0)
                                             Index Cond: (site_id = 152)
                                 ->  Index Scan using index_entries_on_campaign_id on entries  (cost=0.57..6784.01 rows=10432 width=16)
                                       Index Cond: (campaign_id = campaigns.id)
                                       Filter: ((status)::text <> 'Invalid'::text)

Any thoughts on tweaks I could apply to speed this up?

Thanks,

John

Re: Help with 'contestant' query

От
"Greg Sabino Mullane"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

John Sherwood asked:

> SELECT  contestants.*, sum(entries.worth) as db_entries, count(entries.id)
> as db_actions FROM "contestants" INNER JOIN "entries" ON
> "entries"."contestant_id" = "contestants"."id" WHERE
> "entries"."campaign_id" IN (SELECT id FROM "campaigns"  WHERE
> "campaigns"."site_id" = $1) AND (entries.status != 'Invalid') GROUP BY
> contestants.id  ORDER BY db_actions desc LIMIT 20 OFFSET 0

If you have a lot of 'Invalid' entries, a partial index will help:

CREATE INDEX index_entries_on_campaign_id_valid ON entries(campaign_id) WHERE status <> 'Invalid';

> Here's the explain:

An EXPLAIN ANALYZE is always better, fwiw.

I noticed you have a contestants.* plus a GROUP BY contestants.id, 
which suggests that a) this is not the exact query, or b) id is the only 
column in that table. Either way, if you only need the contestant id, you 
can remove that table from the query, and just use entries.contestant_id instead, 
getting rid of the IN() clause in the process:

SELECT e.contestant_id, SUM(e.worth) AS db_entries, COUNT(e.id) AS db_actions
FROM entries e
JOIN campaigns c ON (c.id = e.campaign_id AND c.site_id = $1)
AND e.status <> 'Invalid'
GROUP BY e.contestant_id
ORDER BY db_actions DESC
LIMIT 20 OFFSET 0;

- -- 
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201507031516
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAlWW4/UACgkQvJuQZxSWSsgzRgCeLrZAoGZPZV/FSVmSAChFT3lS
FSkAoOEEAbH6/RGMqzNxEaW8Fq6OpA0/
=5/ys
-----END PGP SIGNATURE-----