Обсуждение: efficient query help
Hi all, I have crafted the following query which returns the results I want but runs extremely slow (I'm sure it's the "not in" part that does it) since the tables can contain large amounts of data. Is there a better, more efficient way to build this query? It's important that I return zero counts for id flag combinations that do not have an entry in the log table hence the union with the "not in": id is a text field and flgtime is a timestamp. flags is a table that contains a single column with a row for each unique flag (text). select id, flag, count(*) as count from logwhere date(flgtime) = '2002-04-16' and flag is not null group by id, flag union select distinct l.id, f.flag, 0 as count from log l, flags fwhere (l.id, f.flag) not in (select id, flag fromlog where date(flgtime) = '2002-04-16' and fag is not null) group by l.id, f.flag ; Thanks for any suggestions. -- Laurette Cisneros Database Roadie (510) 420-3137 NextBus Information Systems, Inc. www.nextbus.com Where's my....bus?
> I have crafted the following query which returns the results I want but > runs extremely slow (I'm sure it's the "not in" part that does it) since > the tables can contain large amounts of data. > > Is there a better, more efficient way to build this query? Why not try NOT EXISTS? I find that NOT IN bogs down on even simple queries.
Hmmm... Is this better? Please let me know; it looks better with a small set
of sample data, but I'd be curious how it does with your real table:
select id, flag, sum(count)
from ( select id, flag, 1 as count from log where flgtime='2002-04-16'
union all
select id, f.flag, 0 as count from log l, flags f
/* don't think this is neccessary where flgtime <> '2002-04-16' */
) as t
where flag is not null
group by id, flag;
[ for others: the point of the query is to count the number of unique
id/flag combos on 4/16/2002, and union this with all possible combos of
ids/flags that aren't present on non-4/16/2002 ]
Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant
> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Laurette Cisneros
> Sent: Monday, April 22, 2002 2:11 PM
> To: pgsql-sql@postgresql.org
> Subject: [SQL] efficient query help
>
>
>
> Hi all,
>
> I have crafted the following query which returns the results I want but
> runs extremely slow (I'm sure it's the "not in" part that does it) since
> the tables can contain large amounts of data.
>
> Is there a better, more efficient way to build this query?
>
> It's important that I return zero counts for id flag combinations
> that do not
> have an entry in the log table hence the union with the "not in":
>
> id is a text field and flgtime is a timestamp. flags is a table
> that contains
> a single column with a row for each unique flag (text).
>
> select id, flag, count(*) as count
> from log
> where date(flgtime) = '2002-04-16'
> and flag is not null
> group by id, flag
> union
> select distinct l.id, f.flag, 0 as count
> from log l, flags f
> where (l.id, f.flag) not in
> (select id, flag
> from log
> where date(flgtime) = '2002-04-16'
> and fag is not null)
> group by l.id, f.flag
> ;
>
> Thanks for any suggestions.
>
> --
> Laurette Cisneros
> Database Roadie
> (510) 420-3137
> NextBus Information Systems, Inc.
> www.nextbus.com
> Where's my....bus?
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>
P.S. It might be faster if your change the second half of the inner query to "SELECT DISTINCT ON(id, f.flag)" and case the numbers to int2 (assuming your data won't overflow this); this improves things about 20% with my dataset (flags n=26, log n=896). By my tests, this is about 4-5x faster than your original query. Is there anything better than this? Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant > -----Original Message----- > From: Joel Burton [mailto:joel@joelburton.com] > Sent: Wednesday, April 24, 2002 10:06 AM > To: Laurette Cisneros; Pgsql-Sql@Postgresql. Org > Subject: RE: [SQL] efficient query help > > > Hmmm... Is this better? Please let me know; it looks better with > a small set of sample data, but I'd be curious how it does with > your real table: > > select id, > flag, > sum(count) > > from ( select id, > flag, > 1::int2 as count <- change here > from log > where flgtime='2002-04-16' > and flag is not null <- change here > > union all > > select DISTINCT ON (id, f.flag) id, <- change here > f.flag, > 0::int2 as count <- change here > from log l, > flags f > > /* don't think this is neccessary, but might make it faster: where flgtime <> '2002-04-16' and l.flag is not null */ > > ) as t > > where flag is not null > > group by id, flag; > > > [ for others: the point of the query is to count the number of > unique id/flag combos on 4/16/2002, and union this with all > possible combos of ids/flags that aren't present on non-4/16/2002 ] > > > Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton > Knowledge Management & Technology Consultant > > > -----Original Message----- > > From: pgsql-sql-owner@postgresql.org > > [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Laurette Cisneros > > Sent: Monday, April 22, 2002 2:11 PM > > To: pgsql-sql@postgresql.org > > Subject: [SQL] efficient query help > > > > > > > > Hi all, > > > > I have crafted the following query which returns the results I want but > > runs extremely slow (I'm sure it's the "not in" part that does it) since > > the tables can contain large amounts of data. > > > > Is there a better, more efficient way to build this query? > > > > It's important that I return zero counts for id flag combinations > > that do not > > have an entry in the log table hence the union with the "not in": > > > > id is a text field and flgtime is a timestamp. flags is a table > > that contains > > a single column with a row for each unique flag (text). > > > > select id, flag, count(*) as count > > from log > > where date(flgtime) = '2002-04-16' > > and flag is not null > > group by id, flag > > union > > select distinct l.id, f.flag, 0 as count > > from log l, flags f > > where (l.id, f.flag) not in > > (select id, flag > > from log > > where date(flgtime) = '2002-04-16' > > and fag is not null) > > group by l.id, f.flag > > ; > > > > Thanks for any suggestions. > > > > -- > > Laurette Cisneros > > Database Roadie > > (510) 420-3137 > > NextBus Information Systems, Inc. > > www.nextbus.com > > Where's my....bus? > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 4: Don't 'kill -9' the postmaster > >