Обсуждение: dumb question
Hi List, I am a noob trying to do something that seems like it should be easy but I can't figure it out. I have a table like so: id | ref_id | sts ------------------ 1 | | 0 2 | 1 | 1 3 | | 0 4 | | 0 5 | 4 | 1 6 | | 0 7 | 6 | 1 I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id. so the answer would be id=3. Thanks for any pointers, Steve --
On Thursday, June 2, 2016, Steve Clark <steve.clark@netwolves.com> wrote:
Hi List,
I am a noob trying to do something that seems like it should be easy but I can't figure it out.
I have a table like so:
id | ref_id | sts
------------------
1 | | 0
2 | 1 | 1
3 | | 0
4 | | 0
5 | 4 | 1
6 | | 0
7 | 6 | 1
I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id.
so the answer would be id=3.
Thanks for any pointers,
Steve
So, of all the rows whose sts is 0 and the id is not in (or not exists) ref_id....subquery for selection
Give me the maximum id...parent query with group by.
David J.
Something like:
That assumes that ref_id is null. It would help to see your table structure and the query you tried that doesn't work. If ref_id is actually a character string then you might need ref_id='' or coalesce(ref_id,'')='' if it can be null or empty string.
Cheers,
Steve
On Thu, Jun 2, 2016 at 10:16 AM, Steve Clark <steve.clark@netwolves.com> wrote:
Hi List,
I am a noob trying to do something that seems like it should be easy but I can't figure it out.
I have a table like so:
id | ref_id | sts
------------------
1 | | 0
2 | 1 | 1
3 | | 0
4 | | 0
5 | 4 | 1
6 | | 0
7 | 6 | 1
I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id.
so the answer would be id=3.
Thanks for any pointers,
Steve
--
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
2016-06-02 14:23 GMT-03:00 Steve Crawford <scrawford@pinpointresearch.com>:
Something like:select max(id) from yourtable where sts=0 and ref_id is null;That assumes that ref_id is null. It would help to see your table structure and the query you tried that doesn't work. If ref_id is actually a character string then you might need ref_id='' or coalesce(ref_id,'')='' if it can be null or empty string.Cheers,SteveOn Thu, Jun 2, 2016 at 10:16 AM, Steve Clark <steve.clark@netwolves.com> wrote:Hi List,
I am a noob trying to do something that seems like it should be easy but I can't figure it out.
I have a table like so:
id | ref_id | sts
------------------
1 | | 0
2 | 1 | 1
3 | | 0
4 | | 0
5 | 4 | 1
6 | | 0
7 | 6 | 1
I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id.
so the answer would be id=3.
Thanks for any pointers,
Steve
--
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
I think sts=0 means ref_id is null
So, what I think he wants to achieve is:
select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);
Isn't it?
On Thu, Jun 2, 2016 at 10:40 AM, Felipe Santos <felipepts@gmail.com> wrote:
The OP will need to explain further as we are all guessing. As I mentioned in my earlier (accidental top - curses GMail) post, table structures and the query or queries that don't work would be useful. So would a description of the problem that is being solved since there could be better approaches.2016-06-02 14:23 GMT-03:00 Steve Crawford <scrawford@pinpointresearch.com>:Something like:select max(id) from yourtable where sts=0 and ref_id is null;That assumes that ref_id is null. It would help to see your table structure and the query you tried that doesn't work. If ref_id is actually a character string then you might need ref_id='' or coalesce(ref_id,'')='' if it can be null or empty string.Cheers,SteveOn Thu, Jun 2, 2016 at 10:16 AM, Steve Clark <steve.clark@netwolves.com> wrote:Hi List,
I am a noob trying to do something that seems like it should be easy but I can't figure it out.
I have a table like so:
id | ref_id | sts
------------------
1 | | 0
2 | 1 | 1
3 | | 0
4 | | 0
5 | 4 | 1
6 | | 0
7 | 6 | 1
I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id.
so the answer would be id=3.
Thanks for any pointers,
Steve
--
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-generalI think sts=0 means ref_id is nullSo, what I think he wants to achieve is:select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);Isn't it?
Cheers,
Steve
The OP will need to explain further as we are all guessing. As I mentioned in my earlier (accidental top - curses GMail) post, table structures and the query or queries that don't work would be useful. So would a description of the problem that is being solved since there could be better approaches.On Thu, Jun 2, 2016 at 10:40 AM, Felipe Santos <felipepts@gmail.com> wrote:I think sts=0 means ref_id is nullSo, what I think he wants to achieve is:select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);Isn't it?
Maybe we mis-interpreted but as written this is the solution. My out-loud thinking was a more verbose version of this.
David J.
Thanks all the below seem to do the trick. On 06/02/2016 01:58 PM, David G. Johnston wrote: > select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable); select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable); -- Stephen Clark
On 6/2/2016 11:10 AM, Steve Clark wrote:
Thanks all the below seem to do the trick.
On 06/02/2016 01:58 PM, David G. Johnston wrote:select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);
select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);
do note, this is whats known as an 'anti-join', and these can be pretty expensive on large tables.
-- john r pierce, recycling bits in santa cruz
On 6/2/2016 11:10 AM, Steve Clark wrote:Thanks all the below seem to do the trick.
On 06/02/2016 01:58 PM, David G. Johnston wrote:select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);
select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);do note, this is whats known as an 'anti-join', and these can be pretty expensive on large tables.
+1
Though I suspect that with a partial index on (id, sts=0) and (ref_id, ref_id IS NOT NULL), though highly sensitive to density, that even for large total row counts it would perform pretty well; but I'm not knowledgeable in how smart we are here. Selecting, in descending order, (id where sts = 0), from the index and then poking into index(ref_id) should, particularly if the cross-set is sparse, pretty quickly find a non-match.
David J.
Thanks all the below seem to do the trick.
On 06/02/2016 01:58 PM, David G. Johnston wrote:
On 06/02/2016 01:58 PM, David G. Johnston wrote:
select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);
select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);
--
Stephen Clark
On Thu, Jun 2, 2016 at 3:23 PM, David G. Johnston <david.g.johnston@gmail.com> wrote: > On Thu, Jun 2, 2016 at 4:11 PM, John R Pierce <pierce@hogranch.com> wrote: >> Thanks all the below seem to do the trick. I doubt it -- using NOT IN requires (per the SQL specification) handling NULLs in a way that probably does not give you the answer you want. (NOT IN also is often much slower than the NOT EXISTS test which will actually give you the answer you want.) test=# create table t (id int not null primary key, ref_id int, sts int not null default 0); CREATE TABLE test=# insert into t values (1,null,0),(2,1,1),(3,null,0),(4,null,0),(5,4,1),(6,null,0),(7,6,1); INSERT 0 7 test=# select max(id) from t where sts=0 and id not in (select ref_id from t); max ----- (1 row) test=# select max(id) from t t1 where sts = 0 and not exists (select * from t t2 where t2.ref_id = t1.id); max ----- 3 (1 row) Note that providing minimal setup (like the above) helps in getting good answers quickly. >> do note, this is whats known as an 'anti-join', and these can be pretty >> expensive on large tables. > > +1 *Can* be. Proper indexing can make them very reasonable. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Jun 2, 2016 at 3:23 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Thu, Jun 2, 2016 at 4:11 PM, John R Pierce <pierce@hogranch.com> wrote:
>> Thanks all the below seem to do the trick.
I doubt it -- using NOT IN requires (per the SQL specification)
handling NULLs in a way that probably does not give you the answer
you want. (NOT IN also is often much slower than the NOT EXISTS
test which will actually give you the answer you want.)
test=# create table t (id int not null primary key, ref_id int, sts
int not null default 0);
CREATE TABLE
test=# insert into t values
(1,null,0),(2,1,1),(3,null,0),(4,null,0),(5,4,1),(6,null,0),(7,6,1);
INSERT 0 7
test=# select max(id) from t where sts=0 and id not in (select ref_id from t);
max
-----
(1 row)
test=# select max(id) from t t1 where sts = 0 and not exists (select *
from t t2 where t2.ref_id = t1.id);
max
-----
3
(1 row)
Note that providing minimal setup (like the above) helps in getting
good answers quickly.
>> do note, this is whats known as an 'anti-join', and these can be pretty
>> expensive on large tables.
>
> +1
*Can* be. Proper indexing can make them very reasonable.
Doh (me)...
Indeed, NOT IN (...) doesn't qualify as an anti-join since (for one) it cannot (I don't think) be optimized in this way as the entire contents of the IN() need to be determined. IOW, its not really a join but just another predicate condition whose one side is a subquery.
That said, writing out a full anti-join NOT EXISTS (or, similarly, a semi-join EXISTS) clause can be a bit tedious for ad-hoc stuff while the IN() variation is a bit more succinct and, I'd venture to say, unfortunately familiar. In can be made to work in this situation by writing the expression as IN (SELECT ref_id FROM t WHERE ref_id IS NOT NULL).
David J.