Обсуждение: dumb question

Поиск
Список
Период
Сортировка

dumb question

От
Steve Clark
Дата:
Hi List,

I am a noob trying to do something that seems like it should be easy but I can't figure it out.

I have a table like so:

id | ref_id | sts
------------------
1  |        |  0
2  | 1      |  1
3  |        |  0
4  |        |  0
5  | 4      |  1
6  |        |  0
7  | 6      |  1

I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id.

so the answer would be id=3.

Thanks for any pointers,
Steve

--



Re: dumb question

От
"David G. Johnston"
Дата:
On Thursday, June 2, 2016, Steve Clark <steve.clark@netwolves.com> wrote:
Hi List,

I am a noob trying to do something that seems like it should be easy but I can't figure it out.

I have a table like so:

id | ref_id | sts
------------------
1  |        |  0
2  | 1      |  1
3  |        |  0
4  |        |  0
5  | 4      |  1
6  |        |  0
7  | 6      |  1

I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id.

so the answer would be id=3.

Thanks for any pointers,
Steve


So, of all the rows whose sts is 0 and the id is not in (or not exists) ref_id....subquery for selection
Give me the maximum id...parent query with group by.

David J.

Re: dumb question

От
Steve Crawford
Дата:
Something like:

select max(id) from yourtable where sts=0 and ref_id is null;

That assumes that ref_id is null. It would help to see your table structure and the query you tried that doesn't work. If ref_id is actually a character string then you might need ref_id='' or coalesce(ref_id,'')='' if it can be null or empty string.

Cheers,
Steve


On Thu, Jun 2, 2016 at 10:16 AM, Steve Clark <steve.clark@netwolves.com> wrote:
Hi List,

I am a noob trying to do something that seems like it should be easy but I can't figure it out.

I have a table like so:

id | ref_id | sts
------------------
1  |        |  0
2  | 1      |  1
3  |        |  0
4  |        |  0
5  | 4      |  1
6  |        |  0
7  | 6      |  1

I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id.

so the answer would be id=3.

Thanks for any pointers,
Steve

--



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: dumb question

От
Felipe Santos
Дата:


2016-06-02 14:23 GMT-03:00 Steve Crawford <scrawford@pinpointresearch.com>:
Something like:

select max(id) from yourtable where sts=0 and ref_id is null;

That assumes that ref_id is null. It would help to see your table structure and the query you tried that doesn't work. If ref_id is actually a character string then you might need ref_id='' or coalesce(ref_id,'')='' if it can be null or empty string.

Cheers,
Steve


On Thu, Jun 2, 2016 at 10:16 AM, Steve Clark <steve.clark@netwolves.com> wrote:
Hi List,

I am a noob trying to do something that seems like it should be easy but I can't figure it out.

I have a table like so:

id | ref_id | sts
------------------
1  |        |  0
2  | 1      |  1
3  |        |  0
4  |        |  0
5  | 4      |  1
6  |        |  0
7  | 6      |  1

I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id.

so the answer would be id=3.

Thanks for any pointers,
Steve


--



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



I think sts=0 means ref_id is null

So, what I think he wants to achieve is:

select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);

Isn't it?

Re: dumb question

От
Steve Crawford
Дата:
On Thu, Jun 2, 2016 at 10:40 AM, Felipe Santos <felipepts@gmail.com> wrote:


2016-06-02 14:23 GMT-03:00 Steve Crawford <scrawford@pinpointresearch.com>:
Something like:

select max(id) from yourtable where sts=0 and ref_id is null;

That assumes that ref_id is null. It would help to see your table structure and the query you tried that doesn't work. If ref_id is actually a character string then you might need ref_id='' or coalesce(ref_id,'')='' if it can be null or empty string.

Cheers,
Steve


On Thu, Jun 2, 2016 at 10:16 AM, Steve Clark <steve.clark@netwolves.com> wrote:
Hi List,

I am a noob trying to do something that seems like it should be easy but I can't figure it out.

I have a table like so:

id | ref_id | sts
------------------
1  |        |  0
2  | 1      |  1
3  |        |  0
4  |        |  0
5  | 4      |  1
6  |        |  0
7  | 6      |  1

I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id.

so the answer would be id=3.

Thanks for any pointers,
Steve


--



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



I think sts=0 means ref_id is null

So, what I think he wants to achieve is:

select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);

Isn't it?

The OP will need to explain further as we are all guessing. As I mentioned in my earlier (accidental top - curses GMail) post, table structures and the query or queries that don't work would be useful. So would a description of the problem that is being solved since there could be better approaches.

Cheers,
Steve

Re: dumb question

От
"David G. Johnston"
Дата:
On Thu, Jun 2, 2016 at 1:48 PM, Steve Crawford <scrawford@pinpointresearch.com> wrote:
On Thu, Jun 2, 2016 at 10:40 AM, Felipe Santos <felipepts@gmail.com> wrote:
I think sts=0 means ref_id is null

So, what I think he wants to achieve is:

select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);

Isn't it?

The OP will need to explain further as we are all guessing. As I mentioned in my earlier (accidental top - curses GMail) post, table structures and the query or queries that don't work would be useful. So would a description of the problem that is being solved since there could be better approaches.


​Maybe we mis-interpreted but as written this is the solution.  My out-loud thinking was a more verbose version of this.​
 

​David J.​

Re: dumb question

От
Steve Clark
Дата:
Thanks all the below seem to do the trick.

On 06/02/2016 01:58 PM, David G. Johnston wrote:
> select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);

select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);

--
Stephen Clark

Re: dumb question

От
John R Pierce
Дата:
On 6/2/2016 11:10 AM, Steve Clark wrote:
Thanks all the below seem to do the trick.

On 06/02/2016 01:58 PM, David G. Johnston wrote:
select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);

select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);


do note, this is whats known as an 'anti-join', and these can be pretty expensive on large tables.  



-- 
john r pierce, recycling bits in santa cruz

Re: dumb question

От
"David G. Johnston"
Дата:
On Thu, Jun 2, 2016 at 4:11 PM, John R Pierce <pierce@hogranch.com> wrote:
On 6/2/2016 11:10 AM, Steve Clark wrote:
Thanks all the below seem to do the trick.

On 06/02/2016 01:58 PM, David G. Johnston wrote:
select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);

select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);


do note, this is whats known as an 'anti-join', and these can be pretty expensive on large tables.   

​+1
 
​Though I suspect that with a partial index on (id, sts=0) and (ref_id, ref_id IS NOT NULL), though highly sensitive to density, that even for large​ total row counts it would perform pretty well; but I'm not knowledgeable in how smart we are here.  Selecting, in descending order, (id where sts = 0), from the index and then poking into index(ref_id) should, particularly if the cross-set is sparse, pretty quickly find a non-match.

David J.

Re: dumb question

От
Steve Clark
Дата:
Thanks all the below seem to do the trick.

On 06/02/2016 01:58 PM, David G. Johnston wrote:
select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);

select max(id) from yourtable where sts=0 and id not in (select ref_id from yourtable);

--
Stephen Clark

Re: dumb question

От
Kevin Grittner
Дата:
On Thu, Jun 2, 2016 at 3:23 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Thu, Jun 2, 2016 at 4:11 PM, John R Pierce <pierce@hogranch.com> wrote:
>> Thanks all the below seem to do the trick.

I doubt it -- using NOT IN requires (per the SQL specification)
handling NULLs in a way that probably does not give you the answer
you want.  (NOT IN also is often much slower than the NOT EXISTS
test which will actually give you the answer you want.)

test=# create table t (id int not null primary key, ref_id int, sts
int not null default 0);
CREATE TABLE
test=# insert into t values
(1,null,0),(2,1,1),(3,null,0),(4,null,0),(5,4,1),(6,null,0),(7,6,1);
INSERT 0 7
test=# select max(id) from t where sts=0 and id not in (select ref_id from t);
 max
-----

(1 row)

test=# select max(id) from t t1 where sts = 0 and not exists (select *
from t t2 where t2.ref_id = t1.id);
 max
-----
   3
(1 row)

Note that providing minimal setup (like the above) helps in getting
good answers quickly.

>> do note, this is whats known as an 'anti-join', and these can be pretty
>> expensive on large tables.
>
> +1

*Can* be.  Proper indexing can make them very reasonable.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: dumb question

От
"David G. Johnston"
Дата:
On Thu, Jun 2, 2016 at 5:44 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
On Thu, Jun 2, 2016 at 3:23 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Thu, Jun 2, 2016 at 4:11 PM, John R Pierce <pierce@hogranch.com> wrote:
>> Thanks all the below seem to do the trick.

I doubt it -- using NOT IN requires (per the SQL specification)
handling NULLs in a way that probably does not give you the answer
you want.  (NOT IN also is often much slower than the NOT EXISTS
test which will actually give you the answer you want.)

test=# create table t (id int not null primary key, ref_id int, sts
int not null default 0);
CREATE TABLE
test=# insert into t values
(1,null,0),(2,1,1),(3,null,0),(4,null,0),(5,4,1),(6,null,0),(7,6,1);
INSERT 0 7
test=# select max(id) from t where sts=0 and id not in (select ref_id from t);
 max
-----

(1 row)

test=# select max(id) from t t1 where sts = 0 and not exists (select *
from t t2 where t2.ref_id = t1.id);
 max
-----
   3
(1 row)

Note that providing minimal setup (like the above) helps in getting
good answers quickly.

>> do note, this is whats known as an 'anti-join', and these can be pretty
>> expensive on large tables.
>
> +1

*Can* be.  Proper indexing can make them very reasonable.

​Doh (me)...

Indeed, NOT IN (...) doesn't qualify as an anti-join since (for one) it cannot (I don't think) be optimized in this way as the entire contents of the IN() need to be determined.  IOW, its not really a join but just another predicate condition whose one side is a subquery.

That said, writing out a full anti-join NOT EXISTS (or, similarly, a semi-join EXISTS) clause can be a bit tedious for ad-hoc stuff while the IN() variation is a bit more succinct and, I'd venture to say, unfortunately familiar.  In can be made to work in this situation by writing the expression as IN (SELECT ref_id FROM t WHERE ref_id IS NOT NULL).

David J.