Обсуждение: BUG #17769: Assert triggered in indxpath.c
The following bug has been logged on the website:
Bug reference: 17769
Logged by: Robins Tharakan
Email address: tharakan@gmail.com
PostgreSQL version: 15.1
Operating system: Ubuntu 20.04
Description:
This assert() is reproducible. It is similar to a recent bug-report [1],
but reporting separately since the TRAP signature is different.
TRAP: failed Assert("outer_rel->rows > 0"), File: "indxpath.c", Line: 1909,
PID: 3364016
Git: 117d2604c2@master
SQL / backtrace / backtrace full excerpt below.
Backtrace
=========
Core was generated by `postgres: 117d2604c2@master@sqith: ubuntu postgres
127.0.0.1(55644) SELECT '.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f8691c5c859 in __GI_abort () at abort.c:79
#2 0x0000558e00499dfe in ExceptionalCondition (conditionName=0x558e0064c590
"outer_rel->rows > 0", fileName=0x558e0064c4cc "indxpath.c",
lineNumber=1909) at assert.c:66
#3 0x0000558e0015194f in get_loop_count (root=0x558e01c056e0, cur_relid=3,
outer_relids=0x7f8676a230e8) at indxpath.c:1909
#4 0x0000558e0014fe33 in build_index_paths (root=0x558e01c056e0,
rel=0x558e01c201a0, index=0x558e01c2ba98, clauses=0x7ffed1004890,
useful_predicate=false, scantype=ST_ANYSCAN,
skip_nonnative_saop=0x7ffed10047a6, skip_lower_saop=0x7ffed10047a7) at
indxpath.c:957
#5 0x0000558e0014f9b4 in get_index_paths (root=0x558e01c056e0,
rel=0x558e01c201a0, index=0x558e01c2ba98, clauses=0x7ffed1004890,
bitindexpaths=0x7ffed1004b20) at indxpath.c:728
#6 0x0000558e0014f83b in get_join_index_paths (root=0x558e01c056e0,
rel=0x558e01c201a0, index=0x558e01c2ba98, rclauseset=0x7ffed1004c30,
jclauseset=0x7ffed1004d40,
eclauseset=0x7ffed1004e50, bitindexpaths=0x7ffed1004b20,
relids=0x7f8676a230c8, considered_relids=0x7ffed1004ac0) at indxpath.c:665
#7 0x0000558e0014f3a5 in consider_index_join_outer_rels
(root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98,
rclauseset=0x7ffed1004c30, jclauseset=0x7ffed1004d40,
eclauseset=0x7ffed1004e50, bitindexpaths=0x7ffed1004b20,
indexjoinclauses=0x7f8676a23368, considered_clauses=1,
considered_relids=0x7ffed1004ac0) at indxpath.c:579
#8 0x0000558e0014f1cf in consider_index_join_clauses (root=0x558e01c056e0,
rel=0x558e01c201a0, index=0x558e01c2ba98, rclauseset=0x7ffed1004c30,
jclauseset=0x7ffed1004d40,
eclauseset=0x7ffed1004e50, bitindexpaths=0x7ffed1004b20) at
indxpath.c:476
#9 0x0000558e0014eb20 in create_index_paths (root=0x558e01c056e0,
rel=0x558e01c201a0) at indxpath.c:306
#10 0x0000558e0013625e in set_plain_rel_pathlist (root=0x558e01c056e0,
rel=0x558e01c201a0, rte=0x558e01c05050) at allpaths.c:768
#11 0x0000558e00135dcb in set_rel_pathlist (root=0x558e01c056e0,
rel=0x558e01c201a0, rti=3, rte=0x558e01c05050) at allpaths.c:484
#12 0x0000558e00135a11 in set_base_rel_pathlists (root=0x558e01c056e0) at
allpaths.c:336
#13 0x0000558e00135745 in make_one_rel (root=0x558e01c056e0,
joinlist=0x7f8676a224c8) at allpaths.c:206
#14 0x0000558e00175ae4 in query_planner (root=0x558e01c056e0,
qp_callback=0x558e0017c30f <standard_qp_callback>, qp_extra=0x7ffed1005210)
at planmain.c:278
#15 0x0000558e00178556 in grouping_planner (root=0x558e01c056e0,
tuple_fraction=0) at planner.c:1496
#16 0x0000558e00177c05 in subquery_planner (glob=0x558e01c04e30,
parse=0x558e01aa9238, parent_root=0x0, hasRecursion=false, tuple_fraction=0)
at planner.c:1065
#17 0x0000558e001761bd in standard_planner (parse=0x558e01aa9238,
query_string=0x558e01aa7f38 "SELECT\nFROM
pg_catalog.pg_statio_all_tables AS ref_0,\n LATERAL (SELECT\n", ' '
<repeats 14 times>, "WHERE ref_0.schemaname = ref_0.relname) AS subq_0\n
;", cursorOptions=2048, boundParams=0x0) at planner.c:411
Backtrace full excerpt
======================
#2 0x0000558e00499dfe in ExceptionalCondition (conditionName=0x558e0064c590
"outer_rel->rows > 0", fileName=0x558e0064c4cc "indxpath.c",
lineNumber=1909) at assert.c:66
No locals.
#3 0x0000558e0015194f in get_loop_count (root=0x558e01c056e0, cur_relid=3,
outer_relids=0x7f8676a230e8) at indxpath.c:1909
outer_rel = 0x558e01c2cf58
rowcount = 9.4860604001519336e-322
result = 0
outer_relid = 6
#4 0x0000558e0014fe33 in build_index_paths (root=0x558e01c056e0,
rel=0x558e01c201a0, index=0x558e01c2ba98, clauses=0x7ffed1004890,
useful_predicate=false, scantype=ST_ANYSCAN,
skip_nonnative_saop=0x7ffed10047a6, skip_lower_saop=0x7ffed10047a7) at
indxpath.c:957
result = 0x0
ipath = 0x558e004dff49 <palloc+278>
index_clauses = 0x7f8676a233c8
outer_relids = 0x7f8676a230e8
loop_count = 4.6475954467446791e-310
orderbyclauses = 0x0
orderbyclausecols = 0x0
index_pathkeys = 0x7ffed1004790
useful_pathkeys = 0x6491e3bad6b60900
found_lower_saop_clause = false
pathkeys_possibly_useful = false
index_is_ordered = false
index_only_scan = false
indexcol = 2
#5 0x0000558e0014f9b4 in get_index_paths (root=0x558e01c056e0,
rel=0x558e01c201a0, index=0x558e01c2ba98, clauses=0x7ffed1004890,
bitindexpaths=0x7ffed1004b20) at indxpath.c:728
indexpaths = 0x100000001
skip_nonnative_saop = false
skip_lower_saop = false
lc = 0x7ffed10047e0
#6 0x0000558e0014f83b in get_join_index_paths (root=0x558e01c056e0,
rel=0x558e01c201a0, index=0x558e01c2ba98, rclauseset=0x7ffed1004c30,
jclauseset=0x7ffed1004d40,
eclauseset=0x7ffed1004e50, bitindexpaths=0x7ffed1004b20,
relids=0x7f8676a230c8, considered_relids=0x7ffed1004ac0) at indxpath.c:665
clauseset = {nonempty = true, indexclauses = {0x7f8676a23398, 0x0
<repeats 31 times>}}
indexcol = 2
#7 0x0000558e0014f3a5 in consider_index_join_outer_rels
(root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98,
rclauseset=0x7ffed1004c30, jclauseset=0x7ffed1004d40,
eclauseset=0x7ffed1004e50, bitindexpaths=0x7ffed1004b20,
indexjoinclauses=0x7f8676a23368, considered_clauses=1,
considered_relids=0x7ffed1004ac0) at indxpath.c:579
iclause = 0x7f8676a23108
clause_relids = 0x7f8676a230c8
parent_ec = 0x7f8676a209a8
num_considered_relids = 0
lc__state = {l = 0x7f8676a23368, i = 0}
lc = 0x7f8676a23380
SQL
===
SELECT
FROM pg_catalog.pg_statio_all_tables AS ref_0,
LATERAL (SELECT
WHERE ref_0.schemaname = ref_0.relname) AS subq_0;
Thanks to SQLSmith / SQLReduce for the find.
Reference:
1.
https://www.postgresql.org/message-id/flat/CAMbWs490BcsZqXF%2BfoXiQ0zdw-FffZJA6%2BvBNYtJF0vkDyJ98g%40mail.gmail.com#1a3b983d61aeb5b76ac7f4012da5f7ea
-
Robins Tharakan
Amazon Web Services
On Thu, Feb 2, 2023 at 8:13 PM PG Bug reporting form <noreply@postgresql.org> wrote:
SQL
===
SELECT
FROM pg_catalog.pg_statio_all_tables AS ref_0,
LATERAL (SELECT
WHERE ref_0.schemaname = ref_0.relname) AS subq_0;
Thanks for the report! I can reproduce this issue with a simplified
query.
SELECT FROM
(SELECT n.nspname AS schemaname, c.relname
FROM pg_class c LEFT JOIN pg_namespace n
ON n.oid = c.relnamespace) AS ref_0,
LATERAL (SELECT WHERE ref_0.schemaname = ref_0.relname) AS subq_0;
I believe what happens here is that after we pull up the LATERAL
subquery we would have a FromExpr with only one child. Then
remove_useless_result_rtes elides it and merges its qual, which is
'n.nspname = c.relname', up to its parent. This qual is then handed to
EC machinery.
When it comes to remove_useless_joins, we try to remove the LEFT join.
We are supposed to remove any joinquals referencing rel 'n'. But here
remove_rel_from_query only checks rel->joininfo, and neglects to notice
that there is another joinqual referencing rel 'n' existing in EC. I
think this is where the problem arises.
Another finding is that in remove_useless_result_rtes after we elide the
single-child FromExpr and merge its qual up to its parent, the LEFT join
is actually can be reduced to inner join. But we fail to perform that
since all this happens after we've done outer join reduction.
Thanks
Richard
query.
SELECT FROM
(SELECT n.nspname AS schemaname, c.relname
FROM pg_class c LEFT JOIN pg_namespace n
ON n.oid = c.relnamespace) AS ref_0,
LATERAL (SELECT WHERE ref_0.schemaname = ref_0.relname) AS subq_0;
I believe what happens here is that after we pull up the LATERAL
subquery we would have a FromExpr with only one child. Then
remove_useless_result_rtes elides it and merges its qual, which is
'n.nspname = c.relname', up to its parent. This qual is then handed to
EC machinery.
When it comes to remove_useless_joins, we try to remove the LEFT join.
We are supposed to remove any joinquals referencing rel 'n'. But here
remove_rel_from_query only checks rel->joininfo, and neglects to notice
that there is another joinqual referencing rel 'n' existing in EC. I
think this is where the problem arises.
Another finding is that in remove_useless_result_rtes after we elide the
single-child FromExpr and merge its qual up to its parent, the LEFT join
is actually can be reduced to inner join. But we fail to perform that
since all this happens after we've done outer join reduction.
Thanks
Richard
Richard Guo <guofenglinux@gmail.com> writes:
> I believe what happens here is that after we pull up the LATERAL
> subquery we would have a FromExpr with only one child. Then
> remove_useless_result_rtes elides it and merges its qual, which is
> 'n.nspname = c.relname', up to its parent. This qual is then handed to
> EC machinery.
> When it comes to remove_useless_joins, we try to remove the LEFT join.
> We are supposed to remove any joinquals referencing rel 'n'. But here
> remove_rel_from_query only checks rel->joininfo, and neglects to notice
> that there is another joinqual referencing rel 'n' existing in EC. I
> think this is where the problem arises.
Actually, the problem is that we *shouldn't* try to remove the left
join, because the reference to ref_0.schemaname is logically "above"
that join. I messed up the test for that in join_is_removable().
Apparently, we had no test cases that would expose such a condition,
probably because such a reference would typically cause strength-reducing
the left join.
> Another finding is that in remove_useless_result_rtes after we elide the
> single-child FromExpr and merge its qual up to its parent, the LEFT join
> is actually can be reduced to inner join. But we fail to perform that
> since all this happens after we've done outer join reduction.
Yeah. If we *did* do that then this bug would have escaped detection
awhile longer :-(. I'm not terribly concerned about the lack of such
an optimization, but maybe we can look at it someday.
Fix pushed, thanks for investigating!
regards, tom lane