Обсуждение: Dangerous Naming Confusion
Good evening,
We were expecting the queries that use dts_temp to only return 3 rows. However the subquery starting at line 36 returns ALL 250,000 rows from dts_orders. Note that the "order_id" field doesn't exist in the dts_temp table, so I'm assuming PG is using the "order_id" field from the dts_orders table. If I use explicit table references like in the query at line 48, then I get the error I would expect that the "order_id" column doesn't exist in dts_temp.
Please see my gist at https://gist.github.com/dtseiler/9ef0a5e2b1e0efc6a13d5661436d4056 for a complete test case.
I tested this on PG 12.6 and 13.2 and observed the same on both.
We were expecting the queries that use dts_temp to only return 3 rows. However the subquery starting at line 36 returns ALL 250,000 rows from dts_orders. Note that the "order_id" field doesn't exist in the dts_temp table, so I'm assuming PG is using the "order_id" field from the dts_orders table. If I use explicit table references like in the query at line 48, then I get the error I would expect that the "order_id" column doesn't exist in dts_temp.
When I use the actual column name "a" for dts_temp, then I get the 3 rows back as expected.
I'm wondering if this is expected behavior that PG uses the dts_orders.order_id value in the subquery "select order_id from dts_temp" when dts_temp doesn't have its own order_id column. I would have expected an error that the column doesn't exist. Seems very counter-intuitive to think PG would use a column from a different table.
This issue was discovered today when this logic was used in an UPDATE and ended up locking all rows in a 5M row table and brought many apps to a grinding halt. Thankfully it was caught and killed before it actually updated anything.
Thanks,
Don.
-- Don Seiler
www.seiler.us
www.seiler.us
On 3/29/21 3:00 PM, Don Seiler wrote: > Good evening, > > Please see my gist at > https://gist.github.com/dtseiler/9ef0a5e2b1e0efc6a13d5661436d4056 > <https://gist.github.com/dtseiler/9ef0a5e2b1e0efc6a13d5661436d4056> for > a complete test case. > > I tested this on PG 12.6 and 13.2 and observed the same on both. > > We were expecting the queries that use dts_temp to only return 3 rows. > However the subquery starting at line 36 returns ALL 250,000 rows from > dts_orders. Note that the "order_id" field doesn't exist in the dts_temp > table, so I'm assuming PG is using the "order_id" field from the > dts_orders table. If I use explicit table references like in the query > at line 48, then I get the error I would expect that the "order_id" > column doesn't exist in dts_temp. > > When I use the actual column name "a" for dts_temp, then I get the 3 > rows back as expected. > > I'm wondering if this is expected behavior that PG uses the > dts_orders.order_id value in the subquery "select order_id from > dts_temp" when dts_temp doesn't have its own order_id column. I would > have expected an error that the column doesn't exist. Seems very > counter-intuitive to think PG would use a column from a different table. See: https://www.postgresql.org/message-id/Pine.LNX.4.56.0308011345320.881@krusty.credativ.de > > This issue was discovered today when this logic was used in an UPDATE > and ended up locking all rows in a 5M row table and brought many apps to > a grinding halt. Thankfully it was caught and killed before it actually > updated anything. > > Thanks, > Don. > -- > Don Seiler > www.seiler.us <http://www.seiler.us> -- Adrian Klaver adrian.klaver@aklaver.com
On Mon, Mar 29, 2021 at 3:20 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 3/29/21 3:00 PM, Don Seiler wrote:
>
> I'm wondering if this is expected behavior that PG uses the
> dts_orders.order_id value in the subquery "select order_id from
> dts_temp" when dts_temp doesn't have its own order_id column. I would
> have expected an error that the column doesn't exist. Seems very
> counter-intuitive to think PG would use a column from a different table.
See:
https://www.postgresql.org/message-id/Pine.LNX.4.56.0308011345320.881@krusty.credativ.de
There is also an FAQ entry:
David J.
On Mon, Mar 29, 2021 at 5:22 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Mar 29, 2021 at 3:20 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:On 3/29/21 3:00 PM, Don Seiler wrote:
>
> I'm wondering if this is expected behavior that PG uses the
> dts_orders.order_id value in the subquery "select order_id from
> dts_temp" when dts_temp doesn't have its own order_id column. I would
> have expected an error that the column doesn't exist. Seems very
> counter-intuitive to think PG would use a column from a different table.
See:
https://www.postgresql.org/message-id/Pine.LNX.4.56.0308011345320.881@krusty.credativ.deThere is also an FAQ entry:David J.
Thanks,