Обсуждение: Query Problem
Folks: Here's the problem, in abstract: I need to select every record in table A that does not have a link in table B Join Table C where Table C.account = 11 The relevant fields: Table_A CaseID Data Table_B GroupID CaseID Amount Table_C GroupID AccountID Thus, I need to select: SELECT Data FROM Table A WHERE CaseID NOT IN (SELECT CaseID FROM Table_B, Table_CWHERE Table_B.GroupID = TableC.GroupID AND TableC.AccountID =11) The problem is, since Table_B and Table_C are large (10,000 records +) this exclusion query takes several *minutes* to run. I've fooled around with drectional joins, views, and temporary tables, but I can seem to find anything that works faster. Suggestions? -Josh Berkus -- ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete informationtechnology josh@agliodbs.com and data management solutions (415) 436-9166 for law firms, small businesses fax 436-0137 and non-profit organizations. pager 338-4078 San Francisco
Josh Berkus <josh@agliodbs.com> writes: > Folks: > > Here's the problem, in abstract: I need to select every record in table > A that does not have a link in table B Join Table C where Table > C.account = 11 > > The relevant fields: > > Table_A > CaseID > Data > > Table_B > GroupID > CaseID > Amount > > Table_C > GroupID > AccountID > > Thus, I need to select: > > SELECT Data FROM Table A > WHERE CaseID NOT IN ( > SELECT CaseID FROM Table_B, Table_C > WHERE Table_B.GroupID = TableC.GroupID > AND TableC.AccountID = 11) > > The problem is, since Table_B and Table_C are large (10,000 records +) > this exclusion query takes several *minutes* to run. > > I've fooled around with drectional joins, views, and temporary tables, > but I can seem to find anything that works faster. Suggestions? Without having tried, something like this might be faster: SELECT Data FROM Table A WHERE CaseID NOT IN (SELECT CaseID FROM Table_BWHERE Table_B.GroupID IN ( SELECT GroupID FROM Table_C WHERETableC.AccountID = 11)) Tomas
What is the explain output for the queries you've tried? Stephan Szabo sszabo@bigpanda.com On Wed, 25 Oct 2000, Josh Berkus wrote: > > Folks: > > Here's the problem, in abstract: I need to select every record in table > A that does not have a link in table B Join Table C where Table > C.account = 11 > > The relevant fields: > > Table_A > CaseID > Data > > Table_B > GroupID > CaseID > Amount > > Table_C > GroupID > AccountID > > Thus, I need to select: > > SELECT Data FROM Table A > WHERE CaseID NOT IN ( > SELECT CaseID FROM Table_B, Table_C > WHERE Table_B.GroupID = TableC.GroupID > AND TableC.AccountID = 11) > > The problem is, since Table_B and Table_C are large (10,000 records +) > this exclusion query takes several *minutes* to run. > > I've fooled around with drectional joins, views, and temporary tables, > but I can seem to find anything that works faster. Suggestions? > > -Josh Berkus > > -- > ______AGLIO DATABASE SOLUTIONS___________________________ > Josh Berkus > Complete information technology josh@agliodbs.com > and data management solutions (415) 436-9166 > for law firms, small businesses fax 436-0137 > and non-profit organizations. pager 338-4078 > San Francisco >
Josh Berkus <josh@agliodbs.com> writes: > Thus, I need to select: > SELECT Data FROM Table A > WHERE CaseID NOT IN ( > SELECT CaseID FROM Table_B, Table_C > WHERE Table_B.GroupID = TableC.GroupID > AND TableC.AccountID = 11) > The problem is, since Table_B and Table_C are large (10,000 records +) > this exclusion query takes several *minutes* to run. I don't think there is any good way to make this fast in current sources. A partial workaround is to use a temp table: SELECT CaseID INTO TEMP TABLE mycaseids FROM Table_B, Table_CWHERE Table_B.GroupID = TableC.GroupID AND TableC.AccountID= 11; CREATE INDEX mycaseids_idx ON mycaseids(caseid); -- critical! SELECT Data FROM TableA upper WHERE NOT EXISTS (select 1 from mycaseids where caseid = upper.caseid); You'd need to check with EXPLAIN, but the EXISTS subplan should make use of the index to probe the temp table, so you get one index lookup per outer tuple. Better than a complete scan of the subselect outputs, which is what you'll get with the NOT IN style. In 7.1 it'll be possible to do this with an outer join, which should be a lot quicker: SELECT Data FROM TableA LEFT JOIN (SELECT CaseID FROM Table_B, Table_CWHERE Table_B.GroupID = TableC.GroupIDAND TableC.AccountID = 11) subselect ON (tablea.caseid = subselect.caseid) WHERE subselect.caseid IS NULL; ie, do the outer join and then discard the successfully-matched rows. Further down the pike, we have plans to make the system smart enough to transform IN and NOT IN constructs into join-like queries automatically. Right now, though, they're best rewritten into something else when performance is important. regards, tom lane
Michael, > SELECT Data > FROM Table A > WHERE NOT EXISTS ( > SELECT * FROM Table_B, Table_C > WHERE Table_B.GroupID = TableC.GroupID > AND TableC.AccountID = 11 > ) > > I think that the not exists is a bit quicker than the NOT IN. Give it > a whirl. A *lot* faster. Like, 7x as fast. I'd forgotten about EXISTS, since I so seldom have a use for it ... but this is shy it was created, I guess. Thanks so much for your help! -Josh Berkus -- ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete informationtechnology josh@agliodbs.com and data management solutions (415) 436-9166 for law firms, small businesses fax 436-0137 and non-profit organizations. pager 338-4078 San Francisco
Hello folks, I got a headache now, cause our admin played with postgres settings, something about pg_locale, as he says. Perhaps the reason is elsewhere, i dont know. As result, now all queries, written inside the php code on multiply lines, returns the following: ERROR: parser: parse error at or near " " The same query, pasted to psql, works well. The same query written on single line in php code works well. But i see that my older scripts with multiply line queries works well. Whats up?! Why? There is no errors in this code: ------ $result = @pg_exec($db,"SET DATESTYLE = 'ISO';SELECT DISTINCT meznieciba, date_part('day',date(datums)) AS dd, date_part('month',date(datums))AS mm, date_part('year',date(datums)) AS yyyy, galvene, kajene FROM izsolesWHERE datums= '$datums'") or die(pg_errormessage()); ------ it works fine when written on the single line but now i see the above stupid error message.. it so funny to have to rewrite queries to single line?.. --:)-- Best regards,Sandis
Sandis Jerics <sandis@mediaparks.lv> writes: > As result, now all queries, written inside the php code on multiply lines, > returns the following: > ERROR: parser: parse error at or near " " At a guess, you're having trouble with newline representations (Unix convention is \n only, DOS/Windows convention is \r\n, and then there's Macintosh which likes \r only). We've been working to change Postgres to accept all of these choices, but depending on which version of which tool you are using, you may need to toe the Unix line faithfully. The above message looks a lot like something spitting up on a stray \r. Dunno what your admin did to make the problem appear where you hadn't had it before... regards, tom lane
Sandis Jerics <sandis@mediaparks.lv> writes: > As result, now all queries, written inside the php code on multiply lines, > returns the following: > ERROR: parser: parse error at or near " " At a guess, you're having trouble with newline representations (Unix convention is \n only, DOS/Windows convention is \r\n, and then there's Macintosh which likes \r only). We've been working to change Postgres to accept all of these choices, but depending on which version of which tool you are using, you may need to toe the Unix line faithfully. The above message looks a lot like something spitting up on a stray \r. Dunno what your admin did to make the problem appear where you hadn't had it before... regards, tom lane