Обсуждение: SQL over my head...

Поиск
Список
Период
Сортировка

SQL over my head...

От
Gregory Brauer
Дата:
I've got a rather difficult query that I'm hoping someone can help with.

I have two tables, foo and bar.

I'll just speak SQL...

CREATE TABLE foo  (    id           serial primary key,    attr_a       int4 not null,    attr_b       int4 not null,
bar_id       int4 not null,
 
    unique(attr_a , attr_b, bar_id),
    foreign key(bar_id) references bar(id)  );

CREATE TABLE bar  (    id           serial primary key,    attr_a       int4 not null,    ts           timestamp not
null );
 


What I want to do is find all of the foo.id's where the
foo.bar_id in that row points to a bar where the bar.ts
is the most recent time that is before the current time
among the sets of identical combiniations of foo.attr_a,
foo.attr_b and bar.attr_a.

Said another way...

If I find the 3 item sets of foo.attr_a, foo._attr_b,
and the bar.attr_a that a foo.bar_id points to, and sort
them into groups where the the three values are the same,
I want, as a result, one item from each of the groups
where the time is the maximum of that group which is
still earlier than the current time.

Is this possible?  I'd appreciate any help anyone can give.

Thanks.

Greg Brauer
greg@wildbrain.com



Re: SQL over my head...

От
"Joel Burton"
Дата:
I think that

SELECT F0.id FROM Foo AS F0 JOIN Bar AS B0 ON (F0.id=B0.id)WHERE ts =           (SELECT MAX(ts)              FROM Foo
ASF1,                   Bar as B1             WHERE ts < CURRENT_TIME               AND F0.attr_a=F1.attr_a
 AND F0.attr_b=F1.attr_b               AND B0.attr_a=B1.attr_a);
 

is what you want. If not, can you post a brief dump of INSERT statements
with sample data, and the output you're expecting?

If you want the combinations of f.attr_a, f.attr_b, and b.attr_a only for
naturally joining foo/bars, join the tables in the subquery with the same
JOIN ... as in the outer query. It's not clear from your question if you
want this or not.

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Gregory Brauer
> Sent: Thursday, May 16, 2002 3:10 PM
> To: pgsql-sql@postgresql.org
> Subject: [SQL] SQL over my head...
>
>
>
> I've got a rather difficult query that I'm hoping someone can help with.
>
> I have two tables, foo and bar.
>
> I'll just speak SQL...
>
> CREATE TABLE foo
>    (
>      id           serial primary key,
>      attr_a       int4 not null,
>      attr_b       int4 not null,
>      bar_id       int4 not null,
>
>      unique(attr_a , attr_b, bar_id),
>
>      foreign key(bar_id) references bar(id)
>    );
>
> CREATE TABLE bar
>    (
>      id           serial primary key,
>      attr_a       int4 not null,
>      ts           timestamp not null
>    );
>
>
> What I want to do is find all of the foo.id's where the
> foo.bar_id in that row points to a bar where the bar.ts
> is the most recent time that is before the current time
> among the sets of identical combiniations of foo.attr_a,
> foo.attr_b and bar.attr_a.
>
> Said another way...
>
> If I find the 3 item sets of foo.attr_a, foo._attr_b,
> and the bar.attr_a that a foo.bar_id points to, and sort
> them into groups where the the three values are the same,
> I want, as a result, one item from each of the groups
> where the time is the maximum of that group which is
> still earlier than the current time.
>
> Is this possible?  I'd appreciate any help anyone can give.
>
> Thanks.
>
> Greg Brauer
> greg@wildbrain.com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>



Re: SQL over my head...

От
"Joel Burton"
Дата:
> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Joel Burton
> Sent: Thursday, May 16, 2002 4:14 PM
> To: Gregory Brauer; pgsql-sql@postgresql.org
> Subject: Re: [SQL] SQL over my head...
>
>
> I think that
>
> SELECT F0.id
>   FROM Foo AS F0
>   JOIN Bar AS B0 ON (F0.id=B0.id)
>  WHERE ts =
>             (SELECT MAX(ts)
>                FROM Foo AS F1,
>                     Bar as B1
>               WHERE ts < CURRENT_TIME
>                 AND F0.attr_a=F1.attr_a
>                 AND F0.attr_b=F1.attr_b
>                 AND B0.attr_a=B1.attr_a);

Just glancing over this, I realized that this will perform slowly. If you
have indexes on sensible things (ts, ids, attrs, etc.), something like:

SELECT F0.id FROM Foo as F0 JOIN Bar as B0 ON (F0.id=B0.id)WHERE NOT EXISTS (            (SELECT *               FROM
FooAS F1,                    Bar as B1              WHERE ts < CURRENT_TIME                AND F0.attr_a=F1.attr_a
         AND F0.attr_b=F1.attr_b                AND B0.attr_a=B1.attr_a                AND B1.ts > B0.ts)
 

should be equivalent (assuming NOT NULL data) and perform better.




Re: SQL over my head...

От
Gregory Brauer
Дата:
Joel Burton wrote:
>>-----Original Message-----
>>From: pgsql-sql-owner@postgresql.org
>>[mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Joel Burton
>>Sent: Thursday, May 16, 2002 4:14 PM
>>To: Gregory Brauer; pgsql-sql@postgresql.org
>>Subject: Re: [SQL] SQL over my head...
>>
>>
>>I think that
>>
>>SELECT F0.id
>>  FROM Foo AS F0
>>  JOIN Bar AS B0 ON (F0.id=B0.id)
>> WHERE ts =
>>            (SELECT MAX(ts)
>>               FROM Foo AS F1,
>>                    Bar as B1
>>              WHERE ts < CURRENT_TIME
>>                AND F0.attr_a=F1.attr_a
>>                AND F0.attr_b=F1.attr_b
>>                AND B0.attr_a=B1.attr_a);
> 
> 
> Just glancing over this, I realized that this will perform slowly. If you
> have indexes on sensible things (ts, ids, attrs, etc.), something like:
> 
> SELECT F0.id
>   FROM Foo as F0
>   JOIN Bar as B0 ON (F0.id=B0.id)
>  WHERE NOT EXISTS (
>              (SELECT *
>                 FROM Foo AS F1,
>                      Bar as B1
>                WHERE ts < CURRENT_TIME
>                  AND F0.attr_a=F1.attr_a
>                  AND F0.attr_b=F1.attr_b
>                  AND B0.attr_a=B1.attr_a
>                  AND B1.ts > B0.ts)
> 
> should be equivalent (assuming NOT NULL data) and perform better.
> 


Wow, thanks for the help!  I'm still parsing this, but one comment
and one question...

First, I realized that though I will probably need this
more general case later, in what I am doing right now, I know
a single bar.attr_a value that I want, so only foo.attr_a and
foo.attr_b are variable.  Does that change anything?

Secondly, in the third line, should (F0.id=B0.id) actually be
(F0.bar_id=B0.id) to join on the foreign key?

I'm trying this out now...

Greg



Re: SQL over my head...

От
"Joel Burton"
Дата:
> -----Original Message-----
> From: Gregory Brauer [mailto:greg@wildbrain.com]
> Sent: Thursday, May 16, 2002 4:39 PM
> To: Joel Burton
> Cc: pgsql-sql@postgresql.org
> Subject: Re: [SQL] SQL over my head...
> >
> > SELECT F0.id
> >   FROM Foo as F0
> >   JOIN Bar as B0 ON (F0.id=B0.id)
> >  WHERE NOT EXISTS (
> >              (SELECT *
> >                 FROM Foo AS F1,
> >                      Bar as B1
> >                WHERE ts < CURRENT_TIME
> >                  AND F0.attr_a=F1.attr_a
> >                  AND F0.attr_b=F1.attr_b
> >                  AND B0.attr_a=B1.attr_a
> >                  AND B1.ts > B0.ts)
> >
> Wow, thanks for the help!  I'm still parsing this, but one comment
> and one question...
>
> First, I realized that though I will probably need this
> more general case later, in what I am doing right now, I know
> a single bar.attr_a value that I want, so only foo.attr_a and
> foo.attr_b are variable.  Does that change anything?

If I'm understanding correctly, just make the line "AND
B1.attr_a=<constant>"
To only look at groups where that's the attribute you want and add "AND
B0.attr_a=<same_constant>" at the end of the query (after the closing paren)
to make that count for the outer query, too.

> Secondly, in the third line, should (F0.id=B0.id) actually be
> (F0.bar_id=B0.id) to join on the foreign key?

Yes, it should.

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant



Re: SQL over my head...

От
Gregory Brauer
Дата:
Joel Burton wrote:
>>-----Original Message-----
>>From: Gregory Brauer [mailto:greg@wildbrain.com]
>>Sent: Thursday, May 16, 2002 4:39 PM
>>To: Joel Burton
>>Cc: pgsql-sql@postgresql.org
>>Subject: Re: [SQL] SQL over my head...
>>
>>>SELECT F0.id
>>>  FROM Foo as F0
>>>  JOIN Bar as B0 ON (F0.id=B0.id)
>>> WHERE NOT EXISTS (
>>>             (SELECT *
>>>                FROM Foo AS F1,
>>>                     Bar as B1
>>>               WHERE ts < CURRENT_TIME
>>>                 AND F0.attr_a=F1.attr_a
>>>                 AND F0.attr_b=F1.attr_b
>>>                 AND B0.attr_a=B1.attr_a
>>>                 AND B1.ts > B0.ts)


Excellent!  This was very close, and put me on the right track.
Here is what ended up working:


SELECT F0.id  FROM Foo as F0  JOIN Bar as B0 ON (F0.bar_id=B0.id) WHERE NOT EXISTS             (SELECT *
FROMFoo AS F1,                JOIN Bar as B1 ON (F1.bar_id=B1.id)               WHERE B1.ts < CURRENT_TIMESTAMP
       AND F0.attr_a=F1.attr_a                 AND F0.attr_b=F1.attr_b                 AND B0.attr_a=B1.attr_a
      AND B1.ts > B0.ts);
 

There was an extra "(", and it should be CURRENT_TIMESTAMP.

Thanks!

Greg