Обсуждение: Broken type checking for empty subqueries
Hello, after upgrade from PG11 to PG14.8 the following statement started failing: select a::BIGINT FROM ( SELECT '$maybeNumber'::VARCHAR AS a WHERE NOT '$maybeNumber' LIKE '%maybeNumber' ) AS foo; Expected (as in PG11): The query should return empty result with one column. Actual: SQL Error [22P02]: ERROR: invalid input syntax for type bigint: "$maybeNumber". Where is a workaround that shows the behavior is quite non-stable. Following passes normally while still using the same subquery with one more empty result from another one: select a::BIGINT FROM ( SELECT '$maybeNumber'::VARCHAR AS a WHERE NOT '$maybeNumber' LIKE '%maybeNumber' UNION SELECT '0' AS a WHERE false ) AS foo; Thanks all for the great work you do on PG. Marek Malevic
On 9/28/23 11:08, Marek Malevič wrote: > Hello, > > after upgrade from PG11 to PG14.8 the following statement started failing: > > select > a::BIGINT > FROM ( > SELECT > '$maybeNumber'::VARCHAR AS a > WHERE NOT '$maybeNumber' LIKE '%maybeNumber' > ) AS foo; > > Expected (as in PG11): The query should return empty result with one > column. > Actual: SQL Error [22P02]: ERROR: invalid input syntax for type bigint: > "$maybeNumber". > > Where is a workaround that shows the behavior is quite non-stable. > Following passes normally while still using the same subquery with one > more empty result from another one: > > select > a::BIGINT > FROM ( > SELECT > '$maybeNumber'::VARCHAR AS a > WHERE NOT '$maybeNumber' LIKE '%maybeNumber' > UNION > SELECT '0' AS a WHERE false > ) AS foo; I bisected this back to 4be058fe9ec5e630239b656af21fc083371f30ed, "In the planner, replace an empty FROM clause with a dummy RTE." -- Vik Fearing
On Fri, 29 Sept 2023 at 00:28, Vik Fearing <vik@postgresfriends.org> wrote:
> I bisected this back to 4be058fe9ec5e630239b656af21fc083371f30ed, "In
> the planner, replace an empty FROM clause with a dummy RTE."
This just seems to be due to the fact that that commit allowed
FROM-less subqueries to be pulled up and that now allows constant
folding to attempt to perform the cast to BIGINT. An OFFSET 0 in the
subquery would prevent the subquery being pulled up.
After the pullup, and constant folding the WHERE clause, the query becomes:
select '$maybeNumber'::bigint where false;
When we constant-fold the target list, we'll get the error as it'll
try and convert the string to a bigint.
Without that ability, we'd have to do more run-time evaluation in
queries such as:
postgres=# explain verbose select '10' + 1;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=4)
Output: 11
but as you can see, the planner knows the value is 11.
There have been reports about these things before. For example [1].
I've not looked at the code to see if this would be practical or not,
but I wonder if we could reduce these bug reports by using the new
soft error reporting that's now done in the input functions to have
constant folding just silently not do any folding for the expression
if a cast fails. I don't particularly see anything wrong with making
these queries work if they currently fail. I imagine most people who
have had them fail during constant folding have just redesigned or
found some hack to prevent the folding from taking place anyway.
Overall, it wouldn't be great if we had to ensure we keep these sorts
of queries working as they did in previous versions. That would just
prevent us from adding certain optimisations that might be very
useful.
David
[1] https://www.postgresql.org/message-id/17637-5904e3fdee533c7f@postgresql.org
On Fri, 29 Sept 2023 at 01:41, David Rowley <dgrowleyml@gmail.com> wrote:
> I've not looked at the code to see if this would be practical or not,
> but I wonder if we could reduce these bug reports by using the new
> soft error reporting that's now done in the input functions to have
> constant folding just silently not do any folding for the expression
> if a cast fails. I don't particularly see anything wrong with making
> these queries work if they currently fail. I imagine most people who
> have had them fail during constant folding have just redesigned or
> found some hack to prevent the folding from taking place anyway.
I got curious if this would be difficult or not and came up with the
attached attempt-to-do-it-to-see-if-it-works grade patch.
On a very quick test, it does seem to work.
Patched:
postgres=# explain verbose select '123a'::varchar::bigint where false;
-- previously failed
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=8)
Output: ('123a'::cstring)::bigint
One-Time Filter: false
(3 rows)
postgres=# explain verbose select '123'::varchar::bigint where false;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=8)
Output: '123'::bigint
One-Time Filter: false
(3 rows)
Master:
postgres=# explain verbose select '123a'::varchar::bigint where false;
ERROR: invalid input syntax for type bigint: "123a"
postgres=# explain verbose select '123'::varchar::bigint where false;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=8)
Output: '123'::bigint
One-Time Filter: false
(3 rows)
I didn't really follow the soft error development work done in
d9f7f5d32 very closely. I'm assuming any extension that has not
updated its type input function(s) to use errsave/ereturn instead of
elog/ereport will just continue to fail and there's nothing really we
can do about that. Maybe that's not really a problem as constant
folding would raise errors today for invalid inputs for the given type
anyway. It would at least work nicer for all our built-in types that
did have their input functions modified to suppress invalid inputs
when there's an ErrorSaveContext.
Should we consider doing something like this?
David
Вложения
David Rowley <dgrowleyml@gmail.com> writes:
> I've not looked at the code to see if this would be practical or not,
> but I wonder if we could reduce these bug reports by using the new
> soft error reporting that's now done in the input functions to have
> constant folding just silently not do any folding for the expression
> if a cast fails.
Sadly, I doubt that would cover enough of the problem space to make
much difference to people who try to do this sort of thing.
> ... I imagine most people who
> have had them fail during constant folding have just redesigned or
> found some hack to prevent the folding from taking place anyway.
Yeah. The given query looks like it was already hacked to avoid
constant-folding, though I wonder if whoever wrote it really understood
that. Otherwise it'd be a lot more natural to use something like CASE.
Anyway, adding OFFSET 0 to suppress sub-select folding is probably
the best answer here.
regards, tom lane
On Fri, 29 Sept 2023 at 03:57, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <dgrowleyml@gmail.com> writes: > > I've not looked at the code to see if this would be practical or not, > > but I wonder if we could reduce these bug reports by using the new > > soft error reporting that's now done in the input functions to have > > constant folding just silently not do any folding for the expression > > if a cast fails. > > Sadly, I doubt that would cover enough of the problem space to make > much difference to people who try to do this sort of thing. hmm, I guess it does nothing for stuff like overflow errors since we didn't adjust the ereports of those functions: We'd still get: postgres=# explain verbose select 0x7FFFFFFF + 1 where false; ERROR: integer out of range David
David Rowley <dgrowleyml@gmail.com> writes:
> On Fri, 29 Sept 2023 at 03:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Sadly, I doubt that would cover enough of the problem space to make
>> much difference to people who try to do this sort of thing.
> hmm, I guess it does nothing for stuff like overflow errors since we
> didn't adjust the ereports of those functions:
Yeah. I suppose we could start trying to expand the use of soft
error reporting, but it'd be a herculean, probably multi-year
task to get to where even "most cases" would be covered.
I was actually wondering after sending my previous message whether
CASE wouldn't do the job for the OP. We do have a rule that once
a CASE test expression has reduced to constant-true or constant-false,
we skip const-folding for CASE arms that are thereby proven unreachable.
Thus for instance
regression=# select case when 1 < 2 then 42 else 1/0 end;
case
------
42
(1 row)
The big trick in using this is to understand what is covered by
const-folding and what is not. Initial conversion of a literal string
isn't, because that's done at parse time. So this still fails:
regression=# select case when 1 < 2 then 42 else 'foo'::int end;
ERROR: invalid input syntax for type integer: "foo"
LINE 1: select case when 1 < 2 then 42 else 'foo'::int end;
^
but you can get around that like this:
regression=# select case when 1 < 2 then 42 else 'foo'::text::int end;
case
------
42
(1 row)
because the text-to-int conversion is considered a run-time not
parse-time operation.
So I wonder whether that weird-looking sub-select couldn't be converted to
a simple SELECT CASE with both more readability and better future-proofing
against optimizer improvements.
Another way is to push it all into procedural logic, ie a plpgsql function
that doesn't try to evaluate the problematic expression until it's
checked the test expression.
regards, tom lane
On Fri, 29 Sept 2023 at 10:53, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yeah. I suppose we could start trying to expand the use of soft > error reporting, but it'd be a herculean, probably multi-year > task to get to where even "most cases" would be covered. I had a go at trying to categorise the reports we've received over the years about this. My search was just limited to the search term "constant folding" It looks like of the 9 below, the input function soft errors would cover #0 (this report), #1 and #4. Division by 0 covers another 2 (#5 and #8). So we could fix "most cases" if we added soft errors to arithmetic. Also, from looking at [1], there has been some interest in the past to stop these surprising const-folding errors. 0. This thread. Casting issue. text can't be parsed by type's input function. effectively: # explain select '$maybeNumber'::bigint where false; ERROR: invalid input syntax for type bigint: "$maybeNumber" 1. Bug 17637 https://www.postgresql.org/message-id/17637-5904e3fdee533c7f%40postgresql.org Casting issue in case statement with tmp as (select 1::int8 id, '123.4'::text v) select t.id, case m.mapped_to when 1 then v::float8 else null end, case m.mapped_to when 2 then v::bool else null end from tmp t join tmap m on m.id = t.id; ERROR: invalid input syntax for type boolean: "123.4" 2. https://www.postgresql.org/message-id/CAAT35tGXUYgjjViNZ5%2B9nFkrOcmgE4ce%2BVvekjgKDy_C38RT2g%40mail.gmail.com immutable function raising an exception SELECT raise_exception_immutable('foo') WHERE false; ERROR: foo 3. Bug 16545 https://www.postgresql.org/message-id/16545-affff840bc4e72ca%40postgresql.org Complaint about coalesce evaluating arguments after the first non-NULL value. # SELECT coalesce((SELECT 'ONE'), (SELECT 'TWO' WHERE '123' ~ ((xpath('/tag/text()','<tag>[</tag>'))[1]::TEXT) ) ); ERROR: invalid regular expression: brackets [] not balanced ereport in RE_compile_and_cache(). 4. https://www.postgresql.org/message-id/C0FDEC5E-0E01-4FAB-A7A6-3FAC1F94B51E%40gmail.com Appears an error from the t7.value::numeric case below. -- CASE WHEN t9.is_internal_namespace = true -- AND t9.code = 'STORAGE_POSITION.STORAGE_RACK_ROW' -- AND (t10.code = 'INTEGER' OR t10.code = 'REAL') -- THEN t7.value::numeric = 1 -- ELSE false -- END not much further details. 5. https://www.postgresql.org/message-id/A737B7A37273E048B164557ADEF4A58B17BB4EF8%40ntex2010a.host.magwien.gv.at division by zero. test=> CREATE FUNCTION zero() RETURNS integer IMMUTABLE LANGUAGE SQL AS 'SELECT 0'; CREATE FUNCTION test=> SELECT CASE WHEN (SELECT zero()) = 0 THEN -1 ELSE 1/zero() END; ERROR: division by zero 6. https://www.postgresql.org/message-id/flat/11494.1144794560%40sss.pgh.pa.us#68696f2d794cd2d64f9c596782ee8f3a Some procedural language error. 7. https://www.postgresql.org/message-id/flat/815.1049942992%40sss.pgh.pa.us#0874c4dffad1d389fe3f812f18039583 A very old one. Unsure if it relates to casting or a bug that was fixed. 8. https://www.postgresql.org/message-id/20020414165222.914FB475451%40postgresql.org A very old one from 2002 SELECT CASE WHEN 1 = 2 THEN 1 / 0 WHEN 1 = 1 THEN 1.0 END; ERROR: floating point exception! The last floating point operation either exceeded legal ranges or was a divide by zero Stopped as these reports are getting very old and less valuable. David [1] https://www.postgresql.org/message-id/265964.1595523454%40sss.pgh.pa.us