Re: Postgress 13.x: wrong result for delete with subquery

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Postgress 13.x: wrong result for delete with subquery
Дата
Msg-id 2828817.1611959177@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: Postgress 13.x: wrong result for delete with subquery  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
s.p.e@gmx-topmail.de writes:
> I found a dataset (attachment bug13.csv); which produces a wrong result on postgreSQL 13 for a special delete
command:

Yeah.  "git bisect" pins this on Jeff's commit 230230223, and with a bit
of digging it's not hard to see what the problem is.  The plan ends up
as a hash join on the varchar columns:

 Delete on pg_temp_3.t1  (cost=853.22..1730.13 rows=0 width=0)
   ->  Hash Join  (cost=853.22..1730.13 rows=7289 width=12)
         Output: t1.ctid, t2.ctid
         Inner Unique: true
         Hash Cond: (((t1.id1)::text = (t2.id1)::text) AND ((t1.id2)::text = (t2.id2)::text))
         ->  Seq Scan on pg_temp_3.t1  (cost=0.00..634.55 rows=29155 width=74)
               Output: t1.ctid, t1.id1, t1.id2
         ->  Hash  (cost=809.48..809.48 rows=2916 width=74)
               Output: t2.ctid, t2.id1, t2.id2
               ->  HashAggregate  (cost=780.32..809.48 rows=2916 width=74)
                     Output: t2.ctid, t2.id1, t2.id2
                     Group Key: (t2.id1)::text, (t2.id2)::text
                     ->  Seq Scan on pg_temp_3.t2  (cost=0.00..634.55 rows=29155 width=74)
                           Output: t2.ctid, t2.id1, t2.id2, t2.id1, t2.id2

Notice that the hashagg's input relation produces five columns,
ctid, id1, id2, id1::text, id2::text (EXPLAIN doesn't show the implicit
casts on the last two, which are the hashing columns).  find_hash_columns
decides that only the first three of these need be spilled, which ends up
making the hash keys NULL in reloaded tuples.  The only astonishing thing
about this test case is that just one tuple fails to be joined.

So fundamentally, this patch confused "Vars in the qual trees" with
"input columns that we might access", which is just wrong.  In some
cases, the input columns represent expressions not plain Vars.

We might be able to salvage this by having find_hash_columns examine
the Agg node's grpColIdx list and assume that columns listed there need
to be preserved.  But frankly, now that I've seen this case, I'm not
sure that there's anything correct about the approach being used.
We might be well advised to just revert 230230223 and think harder.

            regards, tom lane



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #16843: pg_upgrade from 12.5 to 13.1 with extension plperlu failed
Следующее
От: Andrey Borodin
Дата:
Сообщение: Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data