Обсуждение: Is not equal to query...
Hi everyone, A bit of help please. This query works for me, and gives me 332,000 rows of records out of my table of 400,000 rows. It takes about a minute to run. SELECT vehicle.* FROM vehicle, incidents WHERE incidents.key = vehicle.incident_link; Would I would like to do however, is remove the data from this table that does not match the above query. So to start with, I thought that I'd better try and identify this data using the query below. You'll note I have simply changed the '=' for '!=' instead. SELECT vehicle.* FROM vehicle, incidents WHERE incidents.key != vehicle.incident_link; However when I run this query, after about 10-15 minutes, I still don't have a result. I don't really understand why not. Cheers as always. James
On Wed, Jun 8, 2011 at 4:57 PM, James David Smith <james.david.smith@gmail.com> wrote: > Hi everyone, > > A bit of help please. This query works for me, and gives me 332,000 > rows of records out of my table of 400,000 rows. It takes about a > minute to run. > > SELECT vehicle.* > FROM vehicle, incidents > WHERE incidents.key = vehicle.incident_link; > > Would I would like to do however, is remove the data from this table > that does not match the above query. So to start with, I thought that > I'd better try and identify this data using the query below. You'll > note I have simply changed the '=' for '!=' instead. > > SELECT vehicle.* > FROM vehicle, incidents > WHERE incidents.key != vehicle.incident_link; > > However when I run this query, after about 10-15 minutes, I still > don't have a result. I don't really understand why not. You're getting a combination of every record from vehicle combined with every record incidents except in the very specific case where the identifiers match -- not want you want. There are a ton of ways to do what you want in sql. Typically the best/fastest is left join/not null: if you are trying to find vehicles with an incident key that is not in the incident table: SELECT vehicle.* FROM vehicle LEFT JOIN incidents ON incidents.key = vehicle.incident_link WHERE incidents.key IS NULL Incidents with no record in the vehicle table: SELECT incidents .* FROM incidents LEFT JOIN vehicle ON incidents.key = vehicle.incident_link WHERE vehicle.incident_link IS NULL Another way to do it is with 'where not exists' -- which is a lot easier to fold into delete syntax: SELECT vehicle.* FROM vehicle WHERE NOT EXISTS ( SELECT 1 FROM incidents WHERE incidents.key = vehicle.incident_link ) which you can turn into: DELETE FROM vehicle WHERE NOT EXISTS ( SELECT 1 FROM incidents WHERE incidents.key = vehicle.incident_link ) etc (don't jump directly to the delete, test it first!) merlin
On Thu, Jun 9, 2011 at 6:17 AM, James David Smith <james.david.smith@gmail.com> wrote: > Hi Merlin, > > Thank you very much for the full and brilliant reply. The last query > you wrote does exactly what I want it too. I wonder whether it's not > too much trouble whether you could explain to me a couple of things > though...? > > 1) I understand what you are saying my query does. Like an outer join > I think? But don't get why. Using '!=' is the opposite of '=' is it > not? It is the opposite -- just not in the way you are thinking. In SQL, joins between tables means 'give me every combination of data from table A combined with table B given a condition'. If A and B each have 100 records with identifiers 1-100, the not equal join would give you a join result of 99 records for A=1 (with B 2-100), 99 records for A=2 (with B 1, 3-100) etc. for a total of 9900 records. Your problem is that you are still associating tables A and B in your head in a way that is not expressed in the join. You have to imagine both tables as pools of unassociated records with no ordering except for what you give in the query. > 2) I understand the 'where not exists' query you suggest, and have > used that, but I don't see why you use '1' in it. What does the 1 do? where not exists means 'return this record if this query does not return at least 1 record' -- since we don't care what is in the record that is actually returned, I just use 1 as a shorthand because in SQL you can't write queries that return 0 fields -- this is a pretty common convention. I instead of 1, I could have used *, -999, or anything -- it doesn't matter. Aside: exists/not exists in relational parlance are called 'semi joins'. They can often be faster than regular joins because the server can bail early without having to express the full join. merlin