Обсуждение: SQL over my head...
I've got a rather difficult query that I'm hoping someone can help with. I have two tables, foo and bar. I'll just speak SQL... CREATE TABLE foo ( id serial primary key, attr_a int4 not null, attr_b int4 not null, bar_id int4 not null, unique(attr_a , attr_b, bar_id), foreign key(bar_id) references bar(id) ); CREATE TABLE bar ( id serial primary key, attr_a int4 not null, ts timestamp not null ); What I want to do is find all of the foo.id's where the foo.bar_id in that row points to a bar where the bar.ts is the most recent time that is before the current time among the sets of identical combiniations of foo.attr_a, foo.attr_b and bar.attr_a. Said another way... If I find the 3 item sets of foo.attr_a, foo._attr_b, and the bar.attr_a that a foo.bar_id points to, and sort them into groups where the the three values are the same, I want, as a result, one item from each of the groups where the time is the maximum of that group which is still earlier than the current time. Is this possible? I'd appreciate any help anyone can give. Thanks. Greg Brauer greg@wildbrain.com
I think that SELECT F0.id FROM Foo AS F0 JOIN Bar AS B0 ON (F0.id=B0.id)WHERE ts = (SELECT MAX(ts) FROM Foo ASF1, Bar as B1 WHERE ts < CURRENT_TIME AND F0.attr_a=F1.attr_a AND F0.attr_b=F1.attr_b AND B0.attr_a=B1.attr_a); is what you want. If not, can you post a brief dump of INSERT statements with sample data, and the output you're expecting? If you want the combinations of f.attr_a, f.attr_b, and b.attr_a only for naturally joining foo/bars, join the tables in the subquery with the same JOIN ... as in the outer query. It's not clear from your question if you want this or not. - J. Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant > -----Original Message----- > From: pgsql-sql-owner@postgresql.org > [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Gregory Brauer > Sent: Thursday, May 16, 2002 3:10 PM > To: pgsql-sql@postgresql.org > Subject: [SQL] SQL over my head... > > > > I've got a rather difficult query that I'm hoping someone can help with. > > I have two tables, foo and bar. > > I'll just speak SQL... > > CREATE TABLE foo > ( > id serial primary key, > attr_a int4 not null, > attr_b int4 not null, > bar_id int4 not null, > > unique(attr_a , attr_b, bar_id), > > foreign key(bar_id) references bar(id) > ); > > CREATE TABLE bar > ( > id serial primary key, > attr_a int4 not null, > ts timestamp not null > ); > > > What I want to do is find all of the foo.id's where the > foo.bar_id in that row points to a bar where the bar.ts > is the most recent time that is before the current time > among the sets of identical combiniations of foo.attr_a, > foo.attr_b and bar.attr_a. > > Said another way... > > If I find the 3 item sets of foo.attr_a, foo._attr_b, > and the bar.attr_a that a foo.bar_id points to, and sort > them into groups where the the three values are the same, > I want, as a result, one item from each of the groups > where the time is the maximum of that group which is > still earlier than the current time. > > Is this possible? I'd appreciate any help anyone can give. > > Thanks. > > Greg Brauer > greg@wildbrain.com > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
> -----Original Message----- > From: pgsql-sql-owner@postgresql.org > [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Joel Burton > Sent: Thursday, May 16, 2002 4:14 PM > To: Gregory Brauer; pgsql-sql@postgresql.org > Subject: Re: [SQL] SQL over my head... > > > I think that > > SELECT F0.id > FROM Foo AS F0 > JOIN Bar AS B0 ON (F0.id=B0.id) > WHERE ts = > (SELECT MAX(ts) > FROM Foo AS F1, > Bar as B1 > WHERE ts < CURRENT_TIME > AND F0.attr_a=F1.attr_a > AND F0.attr_b=F1.attr_b > AND B0.attr_a=B1.attr_a); Just glancing over this, I realized that this will perform slowly. If you have indexes on sensible things (ts, ids, attrs, etc.), something like: SELECT F0.id FROM Foo as F0 JOIN Bar as B0 ON (F0.id=B0.id)WHERE NOT EXISTS ( (SELECT * FROM FooAS F1, Bar as B1 WHERE ts < CURRENT_TIME AND F0.attr_a=F1.attr_a AND F0.attr_b=F1.attr_b AND B0.attr_a=B1.attr_a AND B1.ts > B0.ts) should be equivalent (assuming NOT NULL data) and perform better.
Joel Burton wrote: >>-----Original Message----- >>From: pgsql-sql-owner@postgresql.org >>[mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Joel Burton >>Sent: Thursday, May 16, 2002 4:14 PM >>To: Gregory Brauer; pgsql-sql@postgresql.org >>Subject: Re: [SQL] SQL over my head... >> >> >>I think that >> >>SELECT F0.id >> FROM Foo AS F0 >> JOIN Bar AS B0 ON (F0.id=B0.id) >> WHERE ts = >> (SELECT MAX(ts) >> FROM Foo AS F1, >> Bar as B1 >> WHERE ts < CURRENT_TIME >> AND F0.attr_a=F1.attr_a >> AND F0.attr_b=F1.attr_b >> AND B0.attr_a=B1.attr_a); > > > Just glancing over this, I realized that this will perform slowly. If you > have indexes on sensible things (ts, ids, attrs, etc.), something like: > > SELECT F0.id > FROM Foo as F0 > JOIN Bar as B0 ON (F0.id=B0.id) > WHERE NOT EXISTS ( > (SELECT * > FROM Foo AS F1, > Bar as B1 > WHERE ts < CURRENT_TIME > AND F0.attr_a=F1.attr_a > AND F0.attr_b=F1.attr_b > AND B0.attr_a=B1.attr_a > AND B1.ts > B0.ts) > > should be equivalent (assuming NOT NULL data) and perform better. > Wow, thanks for the help! I'm still parsing this, but one comment and one question... First, I realized that though I will probably need this more general case later, in what I am doing right now, I know a single bar.attr_a value that I want, so only foo.attr_a and foo.attr_b are variable. Does that change anything? Secondly, in the third line, should (F0.id=B0.id) actually be (F0.bar_id=B0.id) to join on the foreign key? I'm trying this out now... Greg
> -----Original Message----- > From: Gregory Brauer [mailto:greg@wildbrain.com] > Sent: Thursday, May 16, 2002 4:39 PM > To: Joel Burton > Cc: pgsql-sql@postgresql.org > Subject: Re: [SQL] SQL over my head... > > > > SELECT F0.id > > FROM Foo as F0 > > JOIN Bar as B0 ON (F0.id=B0.id) > > WHERE NOT EXISTS ( > > (SELECT * > > FROM Foo AS F1, > > Bar as B1 > > WHERE ts < CURRENT_TIME > > AND F0.attr_a=F1.attr_a > > AND F0.attr_b=F1.attr_b > > AND B0.attr_a=B1.attr_a > > AND B1.ts > B0.ts) > > > Wow, thanks for the help! I'm still parsing this, but one comment > and one question... > > First, I realized that though I will probably need this > more general case later, in what I am doing right now, I know > a single bar.attr_a value that I want, so only foo.attr_a and > foo.attr_b are variable. Does that change anything? If I'm understanding correctly, just make the line "AND B1.attr_a=<constant>" To only look at groups where that's the attribute you want and add "AND B0.attr_a=<same_constant>" at the end of the query (after the closing paren) to make that count for the outer query, too. > Secondly, in the third line, should (F0.id=B0.id) actually be > (F0.bar_id=B0.id) to join on the foreign key? Yes, it should. - J. Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant
Joel Burton wrote: >>-----Original Message----- >>From: Gregory Brauer [mailto:greg@wildbrain.com] >>Sent: Thursday, May 16, 2002 4:39 PM >>To: Joel Burton >>Cc: pgsql-sql@postgresql.org >>Subject: Re: [SQL] SQL over my head... >> >>>SELECT F0.id >>> FROM Foo as F0 >>> JOIN Bar as B0 ON (F0.id=B0.id) >>> WHERE NOT EXISTS ( >>> (SELECT * >>> FROM Foo AS F1, >>> Bar as B1 >>> WHERE ts < CURRENT_TIME >>> AND F0.attr_a=F1.attr_a >>> AND F0.attr_b=F1.attr_b >>> AND B0.attr_a=B1.attr_a >>> AND B1.ts > B0.ts) Excellent! This was very close, and put me on the right track. Here is what ended up working: SELECT F0.id FROM Foo as F0 JOIN Bar as B0 ON (F0.bar_id=B0.id) WHERE NOT EXISTS (SELECT * FROMFoo AS F1, JOIN Bar as B1 ON (F1.bar_id=B1.id) WHERE B1.ts < CURRENT_TIMESTAMP AND F0.attr_a=F1.attr_a AND F0.attr_b=F1.attr_b AND B0.attr_a=B1.attr_a AND B1.ts > B0.ts); There was an extra "(", and it should be CURRENT_TIMESTAMP. Thanks! Greg