Re: SELECT too complex?
От | Rory Campbell-Lange |
---|---|
Тема | Re: SELECT too complex? |
Дата | |
Msg-id | 20030627222112.GA553@campbell-lange.net обсуждение исходный текст |
Ответ на | Re: SELECT too complex? (Josh Berkus <josh@agliodbs.com>) |
Ответы |
Re: SELECT too complex?
(Josh Berkus <josh@agliodbs.com>)
|
Список | pgsql-novice |
Hi Josh. Thanks very much for your input. On 27/06/03, Josh Berkus (josh@agliodbs.com) wrote: > As for simplifying, I'm afraid that I don't really understand what > you're trying to get with this query. If I'm clueless, send me a > version of the query with each clause commented. The aim of the select is to find all objects which a person has joined. Objects are shared on boards, recorded in board_objects. People join objects as board_object_members, which point to instances of board_objects. So, an object can show up on one or more boards. One can be a member of the object on one or more boards. One may also not have the ability to "see" a board_object record of an object because one isn't joined to that particular board. So I am also an object and my membership of a board depends on whether or not a board_objects record exists for that board. (See ***) Mmm. Clear as mud. > > FOR resulter IN > > SELECT > > o.t_text_id as itemid, > > pers.t_nickname as itemauthor, > > o.n_type as itemtype, > > o.t_name as itemtitle, > > o.t_description as itembody, > > to_char(o.dt_created, 'DD Mon'') as itemcreated, > > to_char(botimer, ''DD on'') as itemupdated, > > COALESCE (c_count, 0) as itemcommentcount, > > p.n_id as imageid, > > COALESCE(p.t_path, ''none'') as imagesrc, > > p.n_width as imagewidth, > > p.n_height as imageheight > > FROM > > people pers, > > objects o > > LEFT OUTER JOIN photo p ON o.n_id_photo = p.n_id -- join photo on objects. There may or may -- not be a photo. > > LEFT OUTER JOIN ( > > SELECT > > bo.n_object_id as boider, > > max(bo.dt_modified) as botimer > > FROM > > board_objects bo, board_object_members bom > > WHERE > > bom.n_person_id = personid *** > > AND > > bom.n_board_object_id = bo.n_id > > GROUP BY > > bo.n_object_id > > ) as timer -- here I grab the most recently updated instance of a -- board_ object to which I am attached. It is important to -- grab the latest as I may be attached to the same object -- on several boards, each which may have different -- update times. -- personid is my object id. > > You have a lot of left outer joins on aggregate subselects in this query. > try putting the subselect in the SELECT clause, and using ORDER BY ____ DESC > LIMIT 1 instead of MAX. OK. Sounds sensible. > Also, I fail to see the reason you need to left outer join on Board_Objects, > when as subselect to board_objects is in your WHERE criteria. Presumably, if > there are no records in board_objects, you'll lose the row from person > anyway. > What about doing a straight join to board_objects and simplifing your outer > joins thereby? Good point! Excellent analysis; thanks. Presumably reducing outer joins will in theory reduce the query time? > > > ON o.n_id = timer.boider > > LEFT OUTER JOIN ( > > SELECT > > bo.n_object_id as c_bo_id, > > count(comm.n_id) as c_count > > FROM > > board_objects bo, comments comm > > WHERE > > comm.n_board_object_id = bo.n_id > > AND > > bo.n_board_id IN > > You *definitely* don't want to use IN in this subquery, unless you're on 7.4. > Use EXISTS instead. I'm also unclear on why you need this IN subquery at all. > You're checking board_objects against itself; why not just put it in the main > subselect? Mmm. This bit of the query grabs the count of all comments on all versions of the object that I'm attached to. It disregards comments on board_objects I'm not joined to. To paraphrase: Get count of comments on board_objects where board_objects.object_id is this object id BUT only get objects on boards where I'm a member. I haven't really used EXISTS. Makes sense, thanks. > > > (SELECT > > n_board_id > > FROM > > board_objects > > WHERE > > n_object_id = persobjid > > ) > > GROUP BY > > bo.n_object_id, bo.n_object_id > > ) as counter > > ON o.n_id = counter.c_bo_id > > WHERE > > o.n_creator = pers.n_id > > AND > > o.n_id IN ( > > Once again, through 7.3, use EXISTS, not IN. Yup. > > > SELECT DISTINCT ON (bo.dt_modified, bo.n_object_id) > > bo.n_object_id as bo_obid > > Why do you need a SELECT DISTINCT if you're just checking for existence? I found that a normal query was finding more copies of an object because an object can be reflected several times in board_objects. > > > FROM > > board_objects bo, board_object_members bom > > WHERE > > bo.n_id = bom.n_board_object_id > > AND > > bom.n_person_id = personid > > ) > > ORDER BY > > botimer DESC > > LIMIT > > locallimit > > OFFSET > > localoffset LOOP > > Overall, yeah, I think you're making the query too complex. You have 3 > subselects consulting the same table; surely at least 2 of these could be > consolidated. Going through it I'm not sure the aims of the query subselects are all that wrong (i.e. most recent object I'm attached to; sum of comments on this object that I'm attached to but only in the context of the boards I belong to; ensuring that I'm WHEREing only objects that I can access). Although I could be horribly wrong :-). However your analysis has made me think much more deeply about how the joins work and whether or not I need left outer joins. More than that, the difficulty I've had explaining the database structure to you means that I should carefully reconsider it. Many thanks for the time you've taken to answer my question. Rory -- Rory Campbell-Lange <rory@campbell-lange.net> <www.campbell-lange.net>
В списке pgsql-novice по дате отправления: