Обсуждение: sequential joins

Поиск
Список
Период
Сортировка

sequential joins

От
Oleg Lebedev
Дата:
You guys and gals were really helpful!
I hope you can help me with this problem too.
I have an Activity record, that has fields like artist, designer,
programmer containing ids of users from User table, which are assigned
to the current activity.
What I need to do is create a view, that would contain all the Activity
information, except artist, designer, and programmer should be filled
out with corresponding usernames from the User table.
Here is one way to do this:
Schemas:
Activity: name, artist, designer, programmer
User: username, objectid

SELECT name, artistname, designername, programmername
FROM
(SELECT *
FROM activity a
LEFT OUTER JOIN (SELECT username AS artistname, objectid AS userid FROM
User) n1
ON a.artist = n1.userid
LEFT OUTER JOIN (SELECT username AS designername, objectid AS userid
FROM User) n2
ON a.designer = n2.userid
LEFT OUTER JOIN (SELECT username AS programmername, objectid AS userid
FROM User) n3
ON a.programmer = n3.userid) names;

I wonder if there is a better way to do this. Maybe using CASE WHEN THEN
ELSE END clause to avoid multiple scans?
thanks,

Oleg



Re: sequential joins

От
"Josh Berkus"
Дата:
Oleg,

> I wonder if there is a better way to do this. Maybe using CASE WHEN
>  THEN
> ELSE END clause to avoid multiple scans?
> thanks,

No, not really.  PostgreSQL is pretty good about detecting multiplereferences to the same table in subselects and
optimizingyour queryappropriately.  On occassion, I've had to do this with the same tablesub-selected 50 seperate times
andPostgres handles it OK.
 

Sometimes you can use a straight LEFT OUTER JOIN instead of asubselect.  This depends entirely on whether you are
planningon doingany GROUPing or totals on the main query.  If NOT, then:
 

SELECT tablea.f1, tablea.f2, alias1.f3, alias2.f3
FROM tablea LEFT OUTER JOIN (SELECT f2, f3         FROM tableb WHERE f4 = "1") alias1    ON tablea.f1 = alias1.f2LEFT
OUTERJOIN (SELECT f2, f3         FROM tableb WHERE f4 = "2") alias2    ON tablea.f1 = alias2.f2;
 

Is equivalent to:

SELECT tablea.f1, tablea.f2, alias1.f3, alias2.f3
FROM tablea LEFT OUTER JOIN tableb AS alias1    ON (tablea.f1 = alias1.f2 AND alias1.f4 = "1")LEFT OUTER JOIN tableb AS
alias2   ON (tablea.f1 = alias2.f2 AND alias2.f4 = "2") ;
 

And the second should run a bit faster.


(FYI:  MS SQL Server 7.0 does *not* optimize for multiple subselects onthe same table.  I recently found this out the
hardway, and crashedan MS SQL Server despite 1gb of memory in the machine.  The same querydoes OK in Postgres on less
hardware)

-Josh Berkus


Re: sequential joins

От
Oleg Lebedev
Дата:
Josh,
the sub-select statement I was using were not for filtering data (in which
case you can move filtering condition in ON part of the join), but rather
renaming the column 'username' with the name I need in the current join
statement. In this case, columns with new names ('artist', 'designer',
'programmer') would be appended to the overall query result and can be
selected by top-level select statement.
This looks like a very clumsy way to do what I need.
Any ideas?
thanks,

Oleg

Josh Berkus wrote:

> Oleg,
>
> > I wonder if there is a better way to do this. Maybe using CASE WHEN
> >  THEN
> > ELSE END clause to avoid multiple scans?
> > thanks,
>
> No, not really.  PostgreSQL is pretty good about detecting multiple
>  references to the same table in subselects and optimizing your query
>  appropriately.  On occassion, I've had to do this with the same table
>  sub-selected 50 seperate times and Postgres handles it OK.
>
> Sometimes you can use a straight LEFT OUTER JOIN instead of a
>  subselect.  This depends entirely on whether you are planning on doing
>  any GROUPing or totals on the main query.  If NOT, then:
>
> SELECT tablea.f1, tablea.f2, alias1.f3, alias2.f3
> FROM tablea
>         LEFT OUTER JOIN (SELECT f2, f3
>                          FROM tableb WHERE f4 = "1") alias1
>                 ON tablea.f1 = alias1.f2
>         LEFT OUTER JOIN (SELECT f2, f3
>                          FROM tableb WHERE f4 = "2") alias2
>                 ON tablea.f1 = alias2.f2;
>
> Is equivalent to:
>
> SELECT tablea.f1, tablea.f2, alias1.f3, alias2.f3
> FROM tablea
>         LEFT OUTER JOIN tableb AS alias1
>                 ON (tablea.f1 = alias1.f2 AND alias1.f4 = "1")
>         LEFT OUTER JOIN tableb AS alias2
>                 ON (tablea.f1 = alias2.f2 AND alias2.f4 = "2") ;
>
> And the second should run a bit faster.
>
> (FYI:  MS SQL Server 7.0 does *not* optimize for multiple subselects on
>  the same table.  I recently found this out the hard way, and crashed
>  an MS SQL Server despite 1gb of memory in the machine.  The same query
>  does OK in Postgres on less hardware)
>
> -Josh Berkus
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)



Re: sequential joins

От
"Josh Berkus"
Дата:
Oleg,

> the sub-select statement I was using were not for filtering data (in
>  which
> case you can move filtering condition in ON part of the join), but
>  rather
> renaming the column 'username' with the name I need in the current
>  join
> statement. In this case, columns with new names ('artist',
>  'designer',
> 'programmer') would be appended to the overall query result and can
>  be
> selected by top-level select statement.
> This looks like a very clumsy way to do what I need.
> Any ideas?

No, now I'm confused.  Can you explain better what you are trying to
 achieve as a result?

-Josh




Re: sequential joins

От
Oleg Lebedev
Дата:
Ok, here it is:
SELECT *
FROM
activity a
LEFT OUTER JOIN       (SELECT     username AS artistname,                           objectid AS userid       FROM
 user) u1
 
ON a.artist = u1.userid;

After this join, I would have a view, that contains original activity table
data, appended with column artistname, which would contain artist username,
referenced from this activity record.
If I join the result with       (SELECT     username AS programmername,                           objectid AS userid
  FROM         user) u2
 
ON a.programmer = u2.userid
I would have programmername column appended, containing programmers username,
etc.
I hope this clarifies the issue.
thanks,


Josh Berkus wrote:

> Oleg,
>
> > the sub-select statement I was using were not for filtering data (in
> >  which
> > case you can move filtering condition in ON part of the join), but
> >  rather
> > renaming the column 'username' with the name I need in the current
> >  join
> > statement. In this case, columns with new names ('artist',
> >  'designer',
> > 'programmer') would be appended to the overall query result and can
> >  be
> > selected by top-level select statement.
> > This looks like a very clumsy way to do what I need.
> > Any ideas?
>
> No, now I'm confused.  Can you explain better what you are trying to
>  achieve as a result?
>
> -Josh
>
>   ------------------------------------------------------------------------
>
>   ------------------------------------------------------------------------
>
>   ------------------------------------------------------------------------



Re: sequential joins

От
Tom Lane
Дата:
Oleg Lebedev <olebedev@waterford.org> writes:
> SELECT *
> FROM
> activity a
> LEFT OUTER JOIN
>         (SELECT     username AS artistname,
>                             objectid AS userid
>         FROM         user) u1
> ON a.artist = u1.userid;

This seems like the hard way compared to

SELECT a.*, u1.username AS artistname, u1.objectid AS userid      ... more fields here ...
FROM
activity a
LEFT OUTER JOIN user u1 ON a.artist = u1.userid
... more JOINs here ...

I think that you will in fact end up with the same plan, but that's
only because the PG planner is smart enough to convert the first
into the second.  Might as well save the notational cruft to begin with.
        regards, tom lane