Обсуждение: Confusing results with lateral references
Hi,
I am seeing different results with two queries which AFAIU have same semantics and hence are expected to give same results.
Table "public.t1"
Column | Type | Modifiers
--------+---------+-----------
val | integer |
val2 | integer |
postgres=# \d t2
Table "public.t2"
Column | Type | Modifiers
--------+---------+-----------
val | integer |
val2 | integer |
There's no data in the table to start with.
postgres=# insert into t1 values (1, 1);
postgres=# insert into t2 values (1, 1), (2, 2);
Session 1
postgres=# begin;
BEGIN
postgres=# update t1 set val = 2 where val2 = 1;
UPDATE 1
Session 2
postgres=# select * from t1, (select distinct val, val2 from t2) t2 where t1.val = t2.val for update of t1;
query waits here because of FOR UPDATE clause
Session 1
postgres=# commit;
COMMIT
Session 2 gives no rows
postgres=# select * from t1, (select distinct val, val2 from t2) t2 where t1.val = t2.val for update of t1;
val | val2 | val | val2
-----+------+-----+------
(0 rows)
Reset values of t1
postgres=# update t1 set val = 1 where val2 = 1;
UPDATE 1
Session 1
postgres=# begin;
BEGIN
postgres=# update t1 set val = 2 where val2 = 1;
UPDATE 1
Session 2
postgres=# select * from t1, lateral (select distinct val, val2 from t2 where t2.val = t1.val) t2 for update of t1;
query waits here
Session 1
postgres=# commit;
COMMIT
postgres=# select * from t1, lateral (select distinct val, val2 from t2 where t2.val = t1.val) t2 for update of t1;
val | val2 | val | val2
-----+------+-----+------
2 | 1 | 1 | 1
(1 row)
AFAIU, both the queries
select * from t1, (select distinct val, val2 from t2) t2 where t1.val = t2.val for update of t1;
select * from t1, (select distinct val, val2 from t2) t2 where t1.val = t2.val for update of t1;
AND
select * from t1, lateral (select distinct val, val2 from t2 where t2.val = t1.val) t2 for update of t1;
have same semantic and should give same results.
select * from t1, lateral (select distinct val, val2 from t2 where t2.val = t1.val) t2 for update of t1;
have same semantic and should give same results.
Is seeing different results expected behaviour?
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
There's another seemingly wrong result, not with lateral, but with FOR UPDATE.
postgres=# select * from t1;
val | val2
-----+------
1 | 1
(1 row)
postgres=# select * from t2;
val | val2
-----+------
1 | 1
2 | 2
1 | 1
(3 rows)
Session 1
postgres=# begin;
BEGIN
postgres=# update t1 set val = 2 where val2 = 1;
UPDATE 1
Session 2postgres=# select * from t1;
val | val2
-----+------
1 | 1
(1 row)
postgres=# select * from t2;
val | val2
-----+------
1 | 1
2 | 2
1 | 1
(3 rows)
Session 1
postgres=# begin;
BEGIN
postgres=# update t1 set val = 2 where val2 = 1;
UPDATE 1
postgres=# select * from t1 left join t2 on (t1.val = t2.val) for update of t1;
postgres=# commit;
COMMIT
COMMIT
Session 2 query returns two rows
select * from t1 left join t2 on (t1.val = t2.val) for update of t1;
val | val2 | val | val2
-----+------+-----+------
2 | 1 | |
2 | 1 | |
(2 rows)
select * from t1 left join t2 on (t1.val = t2.val) for update of t1;
val | val2 | val | val2
-----+------+-----+------
2 | 1 | |
2 | 1 | |
(2 rows)
It's confusing to see two rows from left join result when the table really has only a single row. Is this behaviour expected?
On Thu, Dec 3, 2015 at 3:49 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
Hi,I am seeing different results with two queries which AFAIU have same semantics and hence are expected to give same results.postgres=# \d t1
Table "public.t1"
Column | Type | Modifiers
--------+---------+-----------
val | integer |
val2 | integer |
postgres=# \d t2
Table "public.t2"
Column | Type | Modifiers
--------+---------+-----------
val | integer |
val2 | integer |There's no data in the table to start with.Session 2 gives results of the query
postgres=# insert into t1 values (1, 1);
postgres=# insert into t2 values (1, 1), (2, 2);
Session 1
postgres=# begin;
BEGIN
postgres=# update t1 set val = 2 where val2 = 1;
UPDATE 1
Session 2
postgres=# select * from t1, (select distinct val, val2 from t2) t2 where t1.val = t2.val for update of t1;
query waits here because of FOR UPDATE clause
Session 1
postgres=# commit;
COMMIT
Session 2 gives no rows
postgres=# select * from t1, (select distinct val, val2 from t2) t2 where t1.val = t2.val for update of t1;
val | val2 | val | val2
-----+------+-----+------
(0 rows)
Reset values of t1
postgres=# update t1 set val = 1 where val2 = 1;
UPDATE 1
Session 1
postgres=# begin;
BEGIN
postgres=# update t1 set val = 2 where val2 = 1;
UPDATE 1
Session 2
postgres=# select * from t1, lateral (select distinct val, val2 from t2 where t2.val = t1.val) t2 for update of t1;
query waits here
Session 1
postgres=# commit;
COMMIT
postgres=# select * from t1, lateral (select distinct val, val2 from t2 where t2.val = t1.val) t2 for update of t1;
val | val2 | val | val2
-----+------+-----+------
2 | 1 | 1 | 1
(1 row)AFAIU, both the queries
select * from t1, (select distinct val, val2 from t2) t2 where t1.val = t2.val for update of t1;AND
select * from t1, lateral (select distinct val, val2 from t2 where t2.val = t1.val) t2 for update of t1;
have same semantic and should give same results.Is seeing different results expected behaviour?--Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On 2015/12/03 21:26, Ashutosh Bapat wrote: > Session 1 > postgres=# begin; > BEGIN > postgres=# update t1 set val = 2 where val2 = 1; > UPDATE 1 > > Session 2 > postgres=# select * from t1 left join t2 on (t1.val = t2.val) for update of > t1; > > query waits > > Session 1 > postgres=# commit; > COMMIT > > > Session 2 query returns two rows > select * from t1 left join t2 on (t1.val = t2.val) for update of t1; > val | val2 | val | val2 > -----+------+-----+------ > 2 | 1 | | > 2 | 1 | | > (2 rows) > > It's confusing to see two rows from left join result when the table really > has only a single row. Is this behaviour expected? Maybe it is. Because the other table still has two (1, 1) rows, LockRows's subplan would still produce two rows in result, no? Thanks, Amit
On Fri, Dec 4, 2015 at 10:58 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2015/12/03 21:26, Ashutosh Bapat wrote:
> Session 1
> postgres=# begin;
> BEGIN
> postgres=# update t1 set val = 2 where val2 = 1;
> UPDATE 1
>
> Session 2
> postgres=# select * from t1 left join t2 on (t1.val = t2.val) for update of
> t1;
>
> query waits
>
> Session 1
> postgres=# commit;
> COMMIT
>
>
> Session 2 query returns two rows
> select * from t1 left join t2 on (t1.val = t2.val) for update of t1;
> val | val2 | val | val2
> -----+------+-----+------
> 2 | 1 | |
> 2 | 1 | |
> (2 rows)
>
> It's confusing to see two rows from left join result when the table really
> has only a single row. Is this behaviour expected?
Maybe it is. Because the other table still has two (1, 1) rows, LockRows's
subplan would still produce two rows in result, no?
Documentation at http://www.postgresql.org/docs/9.4/static/queries-table-expressions.html says
- (T1) LEFT OUTER JOIN (T2)
First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition with any row in T2, a joined row is added with null values in columns of T2. Thus, the joined table always has at least one row for each row in T1.
Thanks,
Amit
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: > I am seeing different results with two queries which AFAIU have same > semantics and hence are expected to give same results. > postgres=# select * from t1, (select distinct val, val2 from t2) t2ss where t1.val = t2ss.val for update of t1; > postgres=# select * from t1, lateral (select distinct val, val2 from t2 where t2.val = t1.val) t2ss for update of t1; (I renamed your inline sub-selects to avoid confusion between them and the table t2.) I'm skeptical that those should be claimed to have identical semantics. In the first example, after we've found the join row (1,1,1,1), we block to see if the pending update on t1 will commit. After it does, we recheck the join condition using the updated row from t1 (and the original row from t2ss). The condition fails, so the updated row is not output. The same thing happens in the second example, ie, we consider the updated row from t1 and the non-updated row from t2ss (NOT t2). There are no join conditions to recheck (in the outer query level), so the row passes, and we output it. If you'd allowed the FOR UPDATE to propagate into the sub-select, then the sub-select's conditions would be considered as needing rechecks ... of course, that would require removing the DISTINCT. This example does show that a lateral reference to a FOR UPDATE table from a non-FOR-UPDATE subselect has confusing behavior. Maybe we ought to forbid that. regards, tom lane
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: > There's another seemingly wrong result, not with lateral, but with FOR > UPDATE. [ shrug... ] You're getting the post-update images of the two join rows that would have been reported without FOR UPDATE. This one is definitely not a bug. regards, tom lane
On Fri, Dec 4, 2015 at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: >> I am seeing different results with two queries which AFAIU have same >> semantics and hence are expected to give same results. > >> postgres=# select * from t1, (select distinct val, val2 from t2) t2ss where t1.val = t2ss.val for update of t1; > >> postgres=# select * from t1, lateral (select distinct val, val2 from t2 where t2.val = t1.val) t2ss for update of t1; > > (I renamed your inline sub-selects to avoid confusion between them and the > table t2.) > > I'm skeptical that those should be claimed to have identical semantics. > > In the first example, after we've found the join row (1,1,1,1), we block > to see if the pending update on t1 will commit. After it does, we recheck > the join condition using the updated row from t1 (and the original row > from t2ss). The condition fails, so the updated row is not output. Check. > The same thing happens in the second example, ie, we consider the updated > row from t1 and the non-updated row from t2ss (NOT t2). There are no join > conditions to recheck (in the outer query level), so the row passes, and > we output it. What's surprising is that t2.val = t1.val isn't rechecked here. I think that's not really possible, because of the DISTINCT operation, which prevents us from identifying a single row from t2 that accounts for the subquery's output row. Not sure whether it would work without the DISTINCT. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company