Discussion: slow queries over information_schema.tables
Hi
one customer reported slow queries over information_schema.tables. The index over relname is never used, probably due to the cast to information_schema.sql_identifier.
Slow query
select * from information_schema.tables where table_name = 'pg_class';
Usually it is hard to fix this at the application level due to the corporate environment.
Regards
Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Slow query
> select * from information_schema.tables where table_name = 'pg_class';

Yeah.  This has been complained of many times before.  The core of the problem, I think, is that we're unable to convert the condition on table_name into an indexscan on pg_class.relname, because the view has cast pg_class.relname to the sql_identifier domain.

There are two different issues in that.  One is that the domain might have constraints (though in reality it does not), so the planner can't throw away the CoerceToDomain node, and thus can't match the expression to the index.  Even if we did throw away the CoerceToDomain, it still would not work because the domain is declared to be over varchar, and so there's a cast-to-varchar underneath the CoerceToDomain.

The latter half of the problem can be exhibited in simplified form as

regression=# explain select * from pg_class where relname = 'pg_class'::text;
                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on pg_class  (cost=0.00..175.41 rows=7 width=791)
   Filter: ((relname)::text = 'pg_class'::text)
(2 rows)

which is much stupider than what you get with a name comparison:

regression=# explain select * from pg_class where relname = 'pg_class';
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Index Scan using pg_class_relname_nsp_index on pg_class  (cost=0.28..8.29 rows=1 width=791)
   Index Cond: (relname = 'pg_class'::name)
(2 rows)

(We've seen complaints about this form of problem, too.)  Since there's no name-vs-text equality operator, we end up applying a cast to text so that the texteq operator can be used, and now we're out of luck for matching that to the index.  To add insult to injury, we also fail to match the cast expression to the available statistics, so that we don't get quite the right selectivity estimate.
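For concreteness, a name-vs-text comparison with strcmp-style semantics can be sketched outside the backend like this: compare the shared prefix bytewise, then break ties on length. This is a standalone illustration only; name_text_cmp is a made-up helper, not the actual PostgreSQL code.

```c
#include <assert.h>
#include <string.h>

/*
 * Sketch of a strcmp-style comparison between a NUL-terminated "name"
 * and a length-counted "text" value: compare the common prefix
 * bytewise, then break ties on length.  Returns <0, 0, or >0.
 */
static int
name_text_cmp(const char *name, const char *text, size_t textlen)
{
    size_t  namelen = strlen(name);
    size_t  minlen = namelen < textlen ? namelen : textlen;
    int     result = memcmp(name, text, minlen);

    if (result == 0)
    {
        if (namelen > textlen)
            result = 1;
        else if (namelen < textlen)
            result = -1;
    }
    return result;
}
```

Because this ordering depends only on byte values, not on any collation, it is the kind of comparison a name index can serve directly.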
The most straightforward way to fix that would be to add some cross-type comparison operators to the name_ops operator family.  While we only directly need 'name = text' to make this case work, the opr_sanity regression test will complain if we don't supply a fairly complete set of operators, and I think not without reason.  So I experimented with doing that, as in the very-much-WIP 0001 patch below.  A name index can only cope with strcmp-style comparison semantics, not strcoll-style, so while it's okay to call the equality operator "=", I felt we'd better call the inequality operators ~<~ and so on.

While the patch as presented fixes the case shown above, it's not all good: the problem with having a new 'text = name' operator is that it will also capture cases where you would like to have a text index working with a comparison value of type "name".  If 'text = name' isn't part of the text_ops opclass then that doesn't work.  I think that the reason for the join.sql regression test failure shown in patch 0001 is likewise that, since 'text = name' isn't part of the text_ops opclass, the join elimination stuff is unable to prove that it can eliminate a join to a unique text column.

This could probably be fixed by creating yet another dozen cross-type operators that implement text-vs-name comparisons using strcoll semantics (hence, using the normal '<' operator names), and adding that set to the text_ops opfamily.  I didn't do that here (yet), because it seems pretty tedious.  Also worth noting is that the name_ops and text_pattern_ops opfamilies are implementing identical semantics.  I wonder whether we could merge them.

While most of the other regression test output changes shown in the 0001 patch are unsurprising, one that is surprising is that a merge join on two text columns has started sorting USING ~<~ rather than the normal text ordering.
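The reason the strcmp-based inequality operators borrow the ~<~ names can be shown in miniature: byte ordering disagrees with typical linguistic (strcoll) orderings, e.g. every uppercase ASCII letter sorts before every lowercase one. A trivial standalone sketch, where byte_less is an illustrative helper and not PostgreSQL code:

```c
#include <assert.h>
#include <string.h>

/*
 * strcmp-style ("C" locale / byte) ordering: 'B' (66) sorts before
 * 'a' (97), unlike typical linguistic orderings where "apple" would
 * come before "Banana".  An index built on byte semantics therefore
 * cannot serve a collation-aware "<" -- hence reusing the ~<~ names
 * already established by text_pattern_ops.
 */
static int
byte_less(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}
```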
The reason for this seems to be that the 'text = text' operator is now a member of name_ops as well as text_ops, and select_outer_pathkeys_for_merge arbitrarily uses the lowest-numbered opfamily OID if it has a choice.  We could avoid that by renumbering name_ops to have an OID higher than text_ops, though that's certainly klugy.  Or we might just decide that we like that plan better anyway, since strcmp-based comparison is probably faster than strcoll-based comparison.  (It'd be kind of nice if the choice were based on something more principled than OID order, though I don't especially want to do anything about that right now.)

Now, 0001 by itself doesn't do much for Pavel's complaint, because the view is still forcing a cast to sql_identifier to be inserted atop pg_class.relname, even though we no longer need any cast for simple name-vs-text cases.  0002 is a really preliminary POC for eliminating CoerceToDomain nodes at plan time if the domain has no constraints to check.  While this is enough to check the effects on plan selection, it's certainly not committable as-is, because the resulting plan is broken if someone then adds a constraint to the domain.  (There is a regression test failure, not shown in 0002, for a case that tests exactly that scenario.)

What we would have to do to make 0002 committable is to add sinval signaling to invalidate any cached plan that's dependent on an assumption of no constraints on a domain.  This is something that we probably want anyway, since it would be necessary if we ever want to compile domain constraint expressions as part of the plan tree rather than leaving them to run-time.  While the sinval additions per se would be fairly straightforward, I wonder whether anyone is likely to complain about the race conditions that will ensue from not taking any locks associated with the domain type; i.e., it's possible that a query would execute with slightly out-of-date assumptions about what constraints a domain has.
I suspect that we have race conditions like that already, but they might be worse if we inspect the domain during planning rather than at executor startup.  Is anyone worried enough about that to want to add locking overhead to domain usage?  (I'm not.)

0001+0002 still don't get the job done: now we strip off the CoerceToDomain successfully, but we still have "relname::varchar" being compared to the comparison value, so we still can't match that to the index.  0003 is a somewhat klugy solution to that part, allowing indxpath.c to construct a name equality test from a texteq restriction condition.  (This is semantically OK because equality is equality in both name and text domains.  We could not convert inequalities, at least not as freely; maybe it could be done in C locale, but I've not done anything about that here.)

With all three patches in place, we get

regression=# explain select * from pg_class where relname::information_schema.sql_identifier = 'foo'::text;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Index Scan using pg_class_relname_nsp_index on pg_class  (cost=0.28..8.30 rows=7 width=781)
   Index Cond: (relname = 'foo'::text)
   Filter: ((relname)::text = 'foo'::text)
(3 rows)

so we've fixed the lack of indexscan, but we're still a bit off about the statistics.  I don't have any immediate proposal about how to fix the latter, but anyway this is enough to make Pavel's case a lot better.

0003 essentially converts "namecol::text texteq textvalue" into "namecol nameeqtext textvalue", relying on the new equality operator introduced by 0001.  Another way we could approach this is to dispense with 0001 altogether, and use a variant of 0003 that converts such a clause to "namecol nameeq textvalue::name".  At first glance it seems like that wouldn't work quite right: the cast to name will silently truncate overlength text values, which could allow such a value to be found equal to a name that it shouldn't be equal to.
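The truncation hazard, and why a recheck of the original clause would rescue it, can be illustrated standalone. This sketch assumes the usual NAMEDATALEN of 64 (63 data bytes plus a terminator); truncate_to_name is a hypothetical stand-in for the text-to-name cast, not PostgreSQL code:

```c
#include <assert.h>
#include <string.h>

#define NAMEDATALEN 64          /* usual build default: 63 data bytes + NUL */

/* Hypothetical stand-in for casting text to name: silently truncate. */
static void
truncate_to_name(const char *text, char *name)
{
    strncpy(name, text, NAMEDATALEN - 1);
    name[NAMEDATALEN - 1] = '\0';
}

/*
 * Two distinct overlength strings that share their first 63 bytes become
 * equal after truncation.  A converted index condition comparing the
 * truncated values reports a false match; a recheck comparing the full
 * original strings then rejects it, which is why treating the converted
 * clause as lossy makes the transformation safe.
 */
```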
However, it really would be OK because the context is that we're assuming the converted clause is lossy (i.e., might have false matches), and so we'll recheck the original clause as a filter condition, which will get rid of such false matches.  The recheck behavior is just slightly-annoying overhead in 0003 as presented, but it'd be essential in the other version.

I'm not entirely sure whether to go with 0001+0003 or this alternative approach.  The alternative is surely far less invasive; 0001 might have more side effects I haven't thought of.  However, my gut feeling is that 0001 would be a good thing on balance, because it'd give the system considerably more information about name-vs-text comparisons than it has now.  In principle that should lead to better plans for other cases besides the narrow one we're specifically thinking of.

Comments, complaints, other ideas?

			regards, tom lane

diff --git a/src/backend/access/nbtree/nbtcompare.c b/src/backend/access/nbtree/nbtcompare.c
index 6f2ad23..103c97a 100644
*** a/src/backend/access/nbtree/nbtcompare.c
--- b/src/backend/access/nbtree/nbtcompare.c
*************** btnamesortsupport(PG_FUNCTION_ARGS)
*** 360,362 ****
--- 360,408 ----
  	ssup->comparator = btnamefastcmp;
  	PG_RETURN_VOID();
  }
+ 
+ Datum
+ btnametextcmp(PG_FUNCTION_ARGS)
+ {
+ 	Name		arg1 = PG_GETARG_NAME(0);
+ 	text	   *arg2 = PG_GETARG_TEXT_PP(1);
+ 	size_t		len1 = strlen(NameStr(*arg1));
+ 	size_t		len2 = VARSIZE_ANY_EXHDR(arg2);
+ 	int32		result;
+ 
+ 	result = memcmp(NameStr(*arg1), VARDATA_ANY(arg2), Min(len1, len2));
+ 	if (result == 0)
+ 	{
+ 		if (len1 > len2)
+ 			result = A_GREATER_THAN_B;
+ 		else if (len1 < len2)
+ 			result = A_LESS_THAN_B;
+ 	}
+ 
+ 	PG_FREE_IF_COPY(arg2, 1);
+ 
+ 	PG_RETURN_INT32(result);
+ }
+ 
+ Datum
+ bttextnamecmp(PG_FUNCTION_ARGS)
+ {
+ 	text	   *arg1 = PG_GETARG_TEXT_PP(0);
+ 	Name		arg2 = PG_GETARG_NAME(1);
+ 	size_t		len1 = VARSIZE_ANY_EXHDR(arg1);
+ 	size_t		len2 = strlen(NameStr(*arg2));
+ 	int32		result;
+ 
+ 	result = memcmp(VARDATA_ANY(arg1), NameStr(*arg2),
Min(len1, len2)); + if (result == 0) + { + if (len1 > len2) + result = A_GREATER_THAN_B; + else if (len1 < len2) + result = A_LESS_THAN_B; + } + + PG_FREE_IF_COPY(arg1, 0); + + PG_RETURN_INT32(result); + } diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c index c266da2..1ae3214 100644 *** a/src/backend/utils/adt/name.c --- b/src/backend/utils/adt/name.c *************** namege(PG_FUNCTION_ARGS) *** 184,191 **** PG_RETURN_BOOL(strncmp(NameStr(*arg1), NameStr(*arg2), NAMEDATALEN) >= 0); } ! /* (see char.c for comparison/operation routines) */ int namecpy(Name n1, const NameData *n2) --- 184,292 ---- PG_RETURN_BOOL(strncmp(NameStr(*arg1), NameStr(*arg2), NAMEDATALEN) >= 0); } + /* + * Cross-type comparison functions using name (i.e., strcmp) semantics. + */ ! Datum ! nameeqtext(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(btnametextcmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) == 0); ! } ! ! Datum ! namenetext(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(btnametextcmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) != 0); ! } ! ! Datum ! namelttext(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(btnametextcmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) < 0); ! } ! ! Datum ! nameletext(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(btnametextcmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) <= 0); ! } ! ! Datum ! namegttext(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(btnametextcmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) > 0); ! } ! ! Datum ! namegetext(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(btnametextcmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) >= 0); ! } ! ! Datum ! texteqname(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(bttextnamecmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) == 0); ! } ! ! Datum ! textnename(PG_FUNCTION_ARGS) ! { ! 
PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(bttextnamecmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) != 0); ! } ! ! Datum ! textltname(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(bttextnamecmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) < 0); ! } ! ! Datum ! textlename(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(bttextnamecmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) <= 0); ! } ! ! Datum ! textgtname(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(bttextnamecmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) > 0); ! } ! ! Datum ! textgename(PG_FUNCTION_ARGS) ! { ! PG_RETURN_BOOL(DatumGetInt32(DirectFunctionCall2(bttextnamecmp, ! PG_GETARG_DATUM(0), ! PG_GETARG_DATUM(1))) >= 0); ! } ! ! /* ! * Miscellaneous support functions ! */ int namecpy(Name n1, const NameData *n2) diff --git a/src/include/catalog/pg_amop.dat b/src/include/catalog/pg_amop.dat index 075a54c..914cc2a 100644 *** a/src/include/catalog/pg_amop.dat --- b/src/include/catalog/pg_amop.dat *************** *** 317,322 **** --- 317,373 ---- amoprighttype => 'name', amopstrategy => '5', amopopr => '>(name,name)', amopmethod => 'btree' }, + # crosstype operators name v text + { amopfamily => 'btree/name_ops', amoplefttype => 'name', + amoprighttype => 'text', amopstrategy => '1', amopopr => '~<~(name,text)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'name', + amoprighttype => 'text', amopstrategy => '2', amopopr => '~<=~(name,text)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'name', + amoprighttype => 'text', amopstrategy => '3', amopopr => '=(name,text)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'name', + amoprighttype => 'text', amopstrategy => '4', amopopr => '~>=~(name,text)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'name', + amoprighttype => 'text', amopstrategy 
=> '5', amopopr => '~>~(name,text)', + amopmethod => 'btree' }, + + # crosstype operators text v name + { amopfamily => 'btree/name_ops', amoplefttype => 'text', + amoprighttype => 'name', amopstrategy => '1', amopopr => '~<~(text,name)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'text', + amoprighttype => 'name', amopstrategy => '2', amopopr => '~<=~(text,name)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'text', + amoprighttype => 'name', amopstrategy => '3', amopopr => '=(text,name)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'text', + amoprighttype => 'name', amopstrategy => '4', amopopr => '~>=~(text,name)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'text', + amoprighttype => 'name', amopstrategy => '5', amopopr => '~>~(text,name)', + amopmethod => 'btree' }, + + # crosstype operators text v text (same as text_pattern_ops) + { amopfamily => 'btree/name_ops', amoplefttype => 'text', + amoprighttype => 'text', amopstrategy => '1', amopopr => '~<~(text,text)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'text', + amoprighttype => 'text', amopstrategy => '2', amopopr => '~<=~(text,text)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'text', + amoprighttype => 'text', amopstrategy => '3', amopopr => '=(text,text)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'text', + amoprighttype => 'text', amopstrategy => '4', amopopr => '~>=~(text,text)', + amopmethod => 'btree' }, + { amopfamily => 'btree/name_ops', amoplefttype => 'text', + amoprighttype => 'text', amopstrategy => '5', amopopr => '~>~(text,text)', + amopmethod => 'btree' }, + # btree text_ops { amopfamily => 'btree/text_ops', amoplefttype => 'text', diff --git a/src/include/catalog/pg_amproc.dat b/src/include/catalog/pg_amproc.dat index 0ef2c08..e0bf039 100644 
*** a/src/include/catalog/pg_amproc.dat --- b/src/include/catalog/pg_amproc.dat *************** *** 152,157 **** --- 152,166 ---- amprocrighttype => 'name', amprocnum => '1', amproc => 'btnamecmp' }, { amprocfamily => 'btree/name_ops', amproclefttype => 'name', amprocrighttype => 'name', amprocnum => '2', amproc => 'btnamesortsupport' }, + { amprocfamily => 'btree/name_ops', amproclefttype => 'name', + amprocrighttype => 'text', amprocnum => '1', amproc => 'btnametextcmp' }, + { amprocfamily => 'btree/name_ops', amproclefttype => 'text', + amprocrighttype => 'name', amprocnum => '1', amproc => 'bttextnamecmp' }, + { amprocfamily => 'btree/name_ops', amproclefttype => 'text', + amprocrighttype => 'text', amprocnum => '1', amproc => 'bttext_pattern_cmp' }, + { amprocfamily => 'btree/name_ops', amproclefttype => 'text', + amprocrighttype => 'text', amprocnum => '2', + amproc => 'bttext_pattern_sortsupport' }, { amprocfamily => 'btree/numeric_ops', amproclefttype => 'numeric', amprocrighttype => 'numeric', amprocnum => '1', amproc => 'numeric_cmp' }, { amprocfamily => 'btree/numeric_ops', amproclefttype => 'numeric', diff --git a/src/include/catalog/pg_opclass.dat b/src/include/catalog/pg_opclass.dat index 13928ba..9ea8783 100644 *** a/src/include/catalog/pg_opclass.dat --- b/src/include/catalog/pg_opclass.dat *************** *** 96,101 **** --- 96,103 ---- { opcmethod => 'btree', opcname => 'name_ops', opcfamily => 'btree/name_ops', opcintype => 'name', opckeytype => 'cstring' }, + { opcmethod => 'btree', opcname => 'name_text_ops', + opcfamily => 'btree/name_ops', opcintype => 'text', opcdefault => 'f' }, { opcmethod => 'hash', opcname => 'name_ops', opcfamily => 'hash/name_ops', opcintype => 'name' }, { oid => '3125', oid_symbol => 'NUMERIC_BTREE_OPS_OID', diff --git a/src/include/catalog/pg_operator.dat b/src/include/catalog/pg_operator.dat index ce23c2f..d488c85 100644 *** a/src/include/catalog/pg_operator.dat --- b/src/include/catalog/pg_operator.dat 
*************** *** 74,80 **** oprright => 'char', oprresult => 'bool', oprcom => '=(char,char)', oprnegate => '<>(char,char)', oprcode => 'chareq', oprrest => 'eqsel', oprjoin => 'eqjoinsel' }, ! { oid => '93', descr => 'equal', oprname => '=', oprcanmerge => 't', oprcanhash => 't', oprleft => 'name', oprright => 'name', oprresult => 'bool', oprcom => '=(name,name)', oprnegate => '<>(name,name)', oprcode => 'nameeq', oprrest => 'eqsel', --- 74,80 ---- oprright => 'char', oprresult => 'bool', oprcom => '=(char,char)', oprnegate => '<>(char,char)', oprcode => 'chareq', oprrest => 'eqsel', oprjoin => 'eqjoinsel' }, ! { oid => '93', oid_symbol => 'NameEqualOperator', descr => 'equal', oprname => '=', oprcanmerge => 't', oprcanhash => 't', oprleft => 'name', oprright => 'name', oprresult => 'bool', oprcom => '=(name,name)', oprnegate => '<>(name,name)', oprcode => 'nameeq', oprrest => 'eqsel', *************** *** 107,112 **** --- 107,169 ---- oprcode => 'starts_with', oprrest => 'prefixsel', oprjoin => 'prefixjoinsel' }, + { oid => '254', oid_symbol => 'NameEqualTextOperator', descr => 'equal', + oprname => '=', oprcanmerge => 't', oprleft => 'name', oprright => 'text', + oprresult => 'bool', oprcom => '=(text,name)', oprnegate => '<>(name,text)', + oprcode => 'nameeqtext', oprrest => 'eqsel', oprjoin => 'eqjoinsel' }, + { oid => '255', descr => 'less than', + oprname => '~<~', oprleft => 'name', oprright => 'text', oprresult => 'bool', + oprcom => '~>~(text,name)', oprnegate => '~>=~(name,text)', + oprcode => 'namelttext', oprrest => 'scalarltsel', + oprjoin => 'scalarltjoinsel' }, + { oid => '256', descr => 'less than or equal', + oprname => '~<=~', oprleft => 'name', oprright => 'text', oprresult => 'bool', + oprcom => '~>=~(text,name)', oprnegate => '~>~(name,text)', + oprcode => 'nameletext', oprrest => 'scalarlesel', + oprjoin => 'scalarlejoinsel' }, + { oid => '257', descr => 'greater than or equal', + oprname => '~>=~', oprleft => 'name', oprright => 'text', 
oprresult => 'bool', + oprcom => '~<=~(text,name)', oprnegate => '~<~(name,text)', + oprcode => 'namegetext', oprrest => 'scalargesel', + oprjoin => 'scalargejoinsel' }, + { oid => '258', descr => 'greater than', + oprname => '~>~', oprleft => 'name', oprright => 'text', oprresult => 'bool', + oprcom => '~<~(text,name)', oprnegate => '~<=~(name,text)', + oprcode => 'namegttext', oprrest => 'scalargtsel', + oprjoin => 'scalargtjoinsel' }, + { oid => '259', descr => 'not equal', + oprname => '<>', oprleft => 'name', oprright => 'text', oprresult => 'bool', + oprcom => '<>(text,name)', oprnegate => '=(name,text)', + oprcode => 'namenetext', oprrest => 'neqsel', oprjoin => 'neqjoinsel' }, + { oid => '260', oid_symbol => 'TextEqualNameOperator', descr => 'equal', + oprname => '=', oprcanmerge => 't', oprleft => 'text', oprright => 'name', + oprresult => 'bool', oprcom => '=(name,text)', oprnegate => '<>(text,name)', + oprcode => 'texteqname', oprrest => 'eqsel', oprjoin => 'eqjoinsel' }, + { oid => '261', descr => 'less than', + oprname => '~<~', oprleft => 'text', oprright => 'name', oprresult => 'bool', + oprcom => '~>~(name,text)', oprnegate => '~>=~(text,name)', + oprcode => 'textltname', oprrest => 'scalarltsel', + oprjoin => 'scalarltjoinsel' }, + { oid => '262', descr => 'less than or equal', + oprname => '~<=~', oprleft => 'text', oprright => 'name', oprresult => 'bool', + oprcom => '~>=~(name,text)', oprnegate => '~>~(text,name)', + oprcode => 'textlename', oprrest => 'scalarlesel', + oprjoin => 'scalarlejoinsel' }, + { oid => '263', descr => 'greater than or equal', + oprname => '~>=~', oprleft => 'text', oprright => 'name', oprresult => 'bool', + oprcom => '~<=~(name,text)', oprnegate => '~<~(text,name)', + oprcode => 'textgename', oprrest => 'scalargesel', + oprjoin => 'scalargejoinsel' }, + { oid => '264', descr => 'greater than', + oprname => '~>~', oprleft => 'text', oprright => 'name', oprresult => 'bool', + oprcom => '~<~(name,text)', oprnegate => 
'~<=~(text,name)', + oprcode => 'textgtname', oprrest => 'scalargtsel', + oprjoin => 'scalargtjoinsel' }, + { oid => '265', descr => 'not equal', + oprname => '<>', oprleft => 'text', oprright => 'name', oprresult => 'bool', + oprcom => '<>(name,text)', oprnegate => '=(text,name)', + oprcode => 'textnename', oprrest => 'neqsel', oprjoin => 'neqjoinsel' }, + { oid => '349', descr => 'append element onto end of array', oprname => '||', oprleft => 'anyarray', oprright => 'anyelement', oprresult => 'anyarray', oprcode => 'array_append' }, diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 034a41e..b74a4cb 100644 *** a/src/include/catalog/pg_proc.dat --- b/src/include/catalog/pg_proc.dat *************** *** 673,678 **** --- 673,721 ---- proname => 'line_distance', prorettype => 'float8', proargtypes => 'line line', prosrc => 'line_distance' }, + { oid => '240', + proname => 'nameeqtext', proleakproof => 't', prorettype => 'bool', + proargtypes => 'name text', prosrc => 'nameeqtext' }, + { oid => '241', + proname => 'namelttext', proleakproof => 't', prorettype => 'bool', + proargtypes => 'name text', prosrc => 'namelttext' }, + { oid => '242', + proname => 'nameletext', proleakproof => 't', prorettype => 'bool', + proargtypes => 'name text', prosrc => 'nameletext' }, + { oid => '243', + proname => 'namegetext', proleakproof => 't', prorettype => 'bool', + proargtypes => 'name text', prosrc => 'namegetext' }, + { oid => '244', + proname => 'namegttext', proleakproof => 't', prorettype => 'bool', + proargtypes => 'name text', prosrc => 'namegttext' }, + { oid => '245', + proname => 'namenetext', proleakproof => 't', prorettype => 'bool', + proargtypes => 'name text', prosrc => 'namenetext' }, + { oid => '246', descr => 'less-equal-greater', + proname => 'btnametextcmp', proleakproof => 't', prorettype => 'int4', + proargtypes => 'name text', prosrc => 'btnametextcmp' }, + { oid => '247', + proname => 'texteqname', proleakproof => 't', 
prorettype => 'bool', + proargtypes => 'text name', prosrc => 'texteqname' }, + { oid => '248', + proname => 'textltname', proleakproof => 't', prorettype => 'bool', + proargtypes => 'text name', prosrc => 'textltname' }, + { oid => '249', + proname => 'textlename', proleakproof => 't', prorettype => 'bool', + proargtypes => 'text name', prosrc => 'textlename' }, + { oid => '250', + proname => 'textgename', proleakproof => 't', prorettype => 'bool', + proargtypes => 'text name', prosrc => 'textgename' }, + { oid => '251', + proname => 'textgtname', proleakproof => 't', prorettype => 'bool', + proargtypes => 'text name', prosrc => 'textgtname' }, + { oid => '252', + proname => 'textnename', proleakproof => 't', prorettype => 'bool', + proargtypes => 'text name', prosrc => 'textnename' }, + { oid => '253', descr => 'less-equal-greater', + proname => 'bttextnamecmp', proleakproof => 't', prorettype => 'int4', + proargtypes => 'text name', prosrc => 'bttextnamecmp' }, + { oid => '274', descr => 'current date and time - increments during transactions', proname => 'timeofday', provolatile => 'v', prorettype => 'text', diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index 1f53780..a72881a 100644 *** a/src/test/regress/expected/join.out --- b/src/test/regress/expected/join.out *************** from *** 4393,4410 **** left join uniquetbl u1 ON u1.f1 = t1.string4) ss on t0.f1 = ss.case1 where ss.stringu2 !~* ss.case1; ! QUERY PLAN ! -------------------------------------------------------------------------------------------- ! Nested Loop ! Join Filter: (CASE t1.ten WHEN 0 THEN 'doh!'::text ELSE NULL::text END = t0.f1) -> Nested Loop ! -> Seq Scan on int4_tbl i4 ! -> Index Scan using tenk1_unique2 on tenk1 t1 ! Index Cond: (unique2 = i4.f1) ! Filter: (stringu2 !~* CASE ten WHEN 0 THEN 'doh!'::text ELSE NULL::text END) ! -> Materialize ! -> Seq Scan on text_tbl t0 ! 
(9 rows) select t0.* from --- 4393,4413 ---- left join uniquetbl u1 ON u1.f1 = t1.string4) ss on t0.f1 = ss.case1 where ss.stringu2 !~* ss.case1; ! QUERY PLAN ! -------------------------------------------------------------------------------------------------- ! Nested Loop Left Join ! Join Filter: (u1.f1 = t1.string4) -> Nested Loop ! Join Filter: (CASE t1.ten WHEN 0 THEN 'doh!'::text ELSE NULL::text END = t0.f1) ! -> Nested Loop ! -> Seq Scan on int4_tbl i4 ! -> Index Scan using tenk1_unique2 on tenk1 t1 ! Index Cond: (unique2 = i4.f1) ! Filter: (stringu2 !~* CASE ten WHEN 0 THEN 'doh!'::text ELSE NULL::text END) ! -> Materialize ! -> Seq Scan on text_tbl t0 ! -> Seq Scan on uniquetbl u1 ! (12 rows) select t0.* from diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out index 6072f6b..5edeaab 100644 *** a/src/test/regress/expected/opr_sanity.out --- b/src/test/regress/expected/opr_sanity.out *************** int24ge(smallint,integer) *** 525,530 **** --- 525,544 ---- int42ge(integer,smallint) oideq(oid,oid) oidne(oid,oid) + nameeqtext(name,text) + namelttext(name,text) + nameletext(name,text) + namegetext(name,text) + namegttext(name,text) + namenetext(name,text) + btnametextcmp(name,text) + texteqname(text,name) + textltname(text,name) + textlename(text,name) + textgename(text,name) + textgtname(text,name) + textnename(text,name) + bttextnamecmp(text,name) float4eq(real,real) float4ne(real,real) float4lt(real,real) diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out index c55de5d..a000695 100644 *** a/src/test/regress/expected/partition_join.out --- b/src/test/regress/expected/partition_join.out *************** SELECT t1.a, t2.b FROM prt1 t1, prt2 t2 *** 1045,1058 **** Merge Join Merge Cond: ((t1.a = t2.b) AND (((((t1.*)::prt1))::text) = ((((t2.*)::prt2))::text))) -> Sort ! 
Sort Key: t1.a, ((((t1.*)::prt1))::text) -> Result -> Append -> Seq Scan on prt1_p1 t1 -> Seq Scan on prt1_p2 t1_1 -> Seq Scan on prt1_p3 t1_2 -> Sort ! Sort Key: t2.b, ((((t2.*)::prt2))::text) -> Result -> Append -> Seq Scan on prt2_p1 t2 --- 1045,1058 ---- Merge Join Merge Cond: ((t1.a = t2.b) AND (((((t1.*)::prt1))::text) = ((((t2.*)::prt2))::text))) -> Sort ! Sort Key: t1.a, ((((t1.*)::prt1))::text) USING ~<~ -> Result -> Append -> Seq Scan on prt1_p1 t1 -> Seq Scan on prt1_p2 t1_1 -> Seq Scan on prt1_p3 t1_2 -> Sort ! Sort Key: t2.b, ((((t2.*)::prt2))::text) USING ~<~ -> Result -> Append -> Seq Scan on prt2_p1 t2 diff --git a/src/test/regress/expected/select_views.out b/src/test/regress/expected/select_views.out index bf003ad..1aeed84 100644 *** a/src/test/regress/expected/select_views.out --- b/src/test/regress/expected/select_views.out *************** NOTICE: f_leak => hamburger *** 1326,1335 **** (1 row) EXPLAIN (COSTS OFF) SELECT * FROM my_property_normal WHERE f_leak(passwd); ! QUERY PLAN ! -------------------------------------------------------------- Seq Scan on customer ! Filter: (f_leak(passwd) AND (name = (CURRENT_USER)::text)) (2 rows) SELECT * FROM my_property_secure WHERE f_leak(passwd); --- 1326,1335 ---- (1 row) EXPLAIN (COSTS OFF) SELECT * FROM my_property_normal WHERE f_leak(passwd); ! QUERY PLAN ! ------------------------------------------------------ Seq Scan on customer ! Filter: (f_leak(passwd) AND (name = CURRENT_USER)) (2 rows) SELECT * FROM my_property_secure WHERE f_leak(passwd); *************** NOTICE: f_leak => passwd123 *** 1340,1351 **** (1 row) EXPLAIN (COSTS OFF) SELECT * FROM my_property_secure WHERE f_leak(passwd); ! QUERY PLAN ! ----------------------------------------------- Subquery Scan on my_property_secure Filter: f_leak(my_property_secure.passwd) -> Seq Scan on customer ! 
         Filter: (name = (CURRENT_USER)::text)
  (4 rows)

  --
--- 1340,1351 ----
  (1 row)

  EXPLAIN (COSTS OFF) SELECT * FROM my_property_secure WHERE f_leak(passwd);
!                  QUERY PLAN
! ---------------------------------------------
   Subquery Scan on my_property_secure
     Filter: f_leak(my_property_secure.passwd)
     ->  Seq Scan on customer
!          Filter: (name = CURRENT_USER)
  (4 rows)

  --
*************** NOTICE: f_leak => hamburger
*** 1367,1376 ****
  EXPLAIN (COSTS OFF)
  SELECT * FROM my_property_normal v WHERE f_leak('passwd') AND f_leak(passwd);
!                                 QUERY PLAN
! -----------------------------------------------------------------------------------------
   Seq Scan on customer
!    Filter: (f_leak('passwd'::text) AND f_leak(passwd) AND (name = (CURRENT_USER)::text))
  (2 rows)

  SELECT * FROM my_property_secure v
--- 1367,1376 ----
  EXPLAIN (COSTS OFF)
  SELECT * FROM my_property_normal v WHERE f_leak('passwd') AND f_leak(passwd);
!                             QUERY PLAN
! ---------------------------------------------------------------------------------
   Seq Scan on customer
!    Filter: (f_leak('passwd'::text) AND f_leak(passwd) AND (name = CURRENT_USER))
  (2 rows)

  SELECT * FROM my_property_secure v
*************** NOTICE: f_leak => passwd
*** 1386,1397 ****
  EXPLAIN (COSTS OFF)
  SELECT * FROM my_property_secure v WHERE f_leak('passwd') AND f_leak(passwd);
!                          QUERY PLAN
! ----------------------------------------------------------------------------
   Subquery Scan on v
     Filter: f_leak(v.passwd)
     ->  Seq Scan on customer
!          Filter: (f_leak('passwd'::text) AND (name = (CURRENT_USER)::text))
  (4 rows)

  --
--- 1386,1397 ----
  EXPLAIN (COSTS OFF)
  SELECT * FROM my_property_secure v WHERE f_leak('passwd') AND f_leak(passwd);
!                      QUERY PLAN
! --------------------------------------------------------------------
   Subquery Scan on v
     Filter: f_leak(v.passwd)
     ->  Seq Scan on customer
!          Filter: (f_leak('passwd'::text) AND (name = CURRENT_USER))
  (4 rows)

  --
*************** NOTICE: f_leak => 9801-2345-6789-0123
*** 1409,1423 ****
  (1 row)

  EXPLAIN (COSTS OFF) SELECT * FROM my_credit_card_normal WHERE f_leak(cnum);
!                      QUERY PLAN
! -----------------------------------------------------
   Hash Join
     Hash Cond: (r.cid = l.cid)
     ->  Seq Scan on credit_card r
           Filter: f_leak(cnum)
     ->  Hash
           ->  Seq Scan on customer l
!                Filter: (name = (CURRENT_USER)::text)
  (7 rows)

  SELECT * FROM my_credit_card_secure WHERE f_leak(cnum);
--- 1409,1423 ----
  (1 row)

  EXPLAIN (COSTS OFF) SELECT * FROM my_credit_card_normal WHERE f_leak(cnum);
!                  QUERY PLAN
! ---------------------------------------------
   Hash Join
     Hash Cond: (r.cid = l.cid)
     ->  Seq Scan on credit_card r
           Filter: f_leak(cnum)
     ->  Hash
           ->  Seq Scan on customer l
!                Filter: (name = CURRENT_USER)
  (7 rows)

  SELECT * FROM my_credit_card_secure WHERE f_leak(cnum);
*************** NOTICE: f_leak => 1111-2222-3333-4444
*** 1428,1435 ****
  (1 row)

  EXPLAIN (COSTS OFF) SELECT * FROM my_credit_card_secure WHERE f_leak(cnum);
!                         QUERY PLAN
! -----------------------------------------------------------
   Subquery Scan on my_credit_card_secure
     Filter: f_leak(my_credit_card_secure.cnum)
     ->  Hash Join
--- 1428,1435 ----
  (1 row)

  EXPLAIN (COSTS OFF) SELECT * FROM my_credit_card_secure WHERE f_leak(cnum);
!                     QUERY PLAN
! ---------------------------------------------------
   Subquery Scan on my_credit_card_secure
     Filter: f_leak(my_credit_card_secure.cnum)
     ->  Hash Join
*************** EXPLAIN (COSTS OFF) SELECT * FROM my_cre
*** 1437,1443 ****
           ->  Seq Scan on credit_card r
           ->  Hash
                 ->  Seq Scan on customer l
!                      Filter: (name = (CURRENT_USER)::text)
  (8 rows)

  --
--- 1437,1443 ----
           ->  Seq Scan on credit_card r
           ->  Hash
                 ->  Seq Scan on customer l
!                      Filter: (name = CURRENT_USER)
  (8 rows)

  --
*************** EXPLAIN (COSTS OFF) SELECT * FROM my_cre
*** 1471,1477 ****
                 ->  Seq Scan on credit_card r_1
                 ->  Hash
                       ->  Seq Scan on customer l_1
!                            Filter: (name = (CURRENT_USER)::text)
  (13 rows)

  SELECT * FROM my_credit_card_usage_secure
--- 1471,1477 ----
                 ->  Seq Scan on credit_card r_1
                 ->  Hash
                       ->  Seq Scan on customer l_1
!                            Filter: (name = CURRENT_USER)
  (13 rows)

  SELECT * FROM my_credit_card_usage_secure
*************** EXPLAIN (COSTS OFF) SELECT * FROM my_cre
*** 1502,1508 ****
                 ->  Seq Scan on credit_card r_1
                 ->  Hash
                       ->  Seq Scan on customer l
!                            Filter: (name = (CURRENT_USER)::text)
  (13 rows)

  --
--- 1502,1508 ----
                 ->  Seq Scan on credit_card r_1
                 ->  Hash
                       ->  Seq Scan on customer l
!                            Filter: (name = CURRENT_USER)
  (13 rows)

  --
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 8df3693..34c8f68 100644
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*************** eval_const_expressions_mutator(Node *nod
*** 3699,3704 ****
--- 3699,3761 ----
  				newbtest->location = btest->location;
  				return (Node *) newbtest;
  			}
+ 		case T_CoerceToDomain:
+ 			{
+ 				/*
+ 				 * If the domain currently has no constraints, we replace the
+ 				 * CoerceToDomain node with a simple RelabelType, which is
+ 				 * both far faster to execute and more amenable to later
+ 				 * optimization.  (XXX this would need to be backed up with
+ 				 * some plan invalidation support, but leave that for later.)
+ 				 *
+ 				 * Also, in estimation mode, just drop the CoerceToDomain
+ 				 * node, effectively assuming that the coercion will succeed.
+ 				 */
+ 				CoerceToDomain *cdomain = (CoerceToDomain *) node;
+ 				CoerceToDomain *newcdomain;
+ 				Node	   *arg;
+ 
+ 				arg = eval_const_expressions_mutator((Node *) cdomain->arg,
+ 													 context);
+ 				if (context->estimate ||
+ 					!DomainHasConstraints(cdomain->resulttype))
+ 				{
+ 					/* This bit should match the RelabelType logic above */
+ 					while (arg && IsA(arg, RelabelType))
+ 						arg = (Node *) ((RelabelType *) arg)->arg;
+ 
+ 					if (arg && IsA(arg, Const))
+ 					{
+ 						Const	   *con = (Const *) arg;
+ 
+ 						con->consttype = cdomain->resulttype;
+ 						con->consttypmod = cdomain->resulttypmod;
+ 						con->constcollid = cdomain->resultcollid;
+ 						return (Node *) con;
+ 					}
+ 					else
+ 					{
+ 						RelabelType *newrelabel = makeNode(RelabelType);
+ 
+ 						newrelabel->arg = (Expr *) arg;
+ 						newrelabel->resulttype = cdomain->resulttype;
+ 						newrelabel->resulttypmod = cdomain->resulttypmod;
+ 						newrelabel->resultcollid = cdomain->resultcollid;
+ 						newrelabel->relabelformat = cdomain->coercionformat;
+ 						newrelabel->location = cdomain->location;
+ 						return (Node *) newrelabel;
+ 					}
+ 				}
+ 
+ 				newcdomain = makeNode(CoerceToDomain);
+ 				newcdomain->arg = (Expr *) arg;
+ 				newcdomain->resulttype = cdomain->resulttype;
+ 				newcdomain->resulttypmod = cdomain->resulttypmod;
+ 				newcdomain->resultcollid = cdomain->resultcollid;
+ 				newcdomain->coercionformat = cdomain->coercionformat;
+ 				newcdomain->location = cdomain->location;
+ 				return (Node *) newcdomain;
+ 			}
  		case T_PlaceHolderVar:

  			/*
*************** eval_const_expressions_mutator(Node *nod
*** 3770,3776 ****
  	 * For any node type not handled above, copy the node unchanged but
  	 * const-simplify its subexpressions.  This is the correct thing for node
  	 * types whose behavior might change between planning and execution, such
! 	 * as CoerceToDomain.  It's also a safe default for new node types not
  	 * known to this routine.
  	 */
  	return ece_generic_processing(node);
--- 3827,3833 ----
  	 * For any node type not handled above, copy the node unchanged but
  	 * const-simplify its subexpressions.  This is the correct thing for node
  	 * types whose behavior might change between planning and execution, such
! 	 * as CurrentOfExpr.  It's also a safe default for new node types not
  	 * known to this routine.
  	 */
  	return ece_generic_processing(node);
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 5f46415..c490364 100644
*** a/src/backend/optimizer/path/indxpath.c
--- b/src/backend/optimizer/path/indxpath.c
***************
*** 35,40 ****
--- 35,41 ----
  #include "optimizer/var.h"
  #include "utils/builtins.h"
  #include "utils/bytea.h"
+ #include "utils/fmgroids.h"
  #include "utils/lsyscache.h"
  #include "utils/pg_locale.h"
  #include "utils/selfuncs.h"
***************
*** 44,49 ****
--- 45,53 ----
  #define IndexCollMatchesExprColl(idxcollation, exprcollation) \
  	((idxcollation) == InvalidOid || (idxcollation) == (exprcollation))

+ /* XXX this ought to be available from fmgroids.h, but it isn't */
+ #define F_NAME_VARCHAR 1401
+ 
  /* Whether we are looking for plain indexscan, bitmap scan, or either */
  typedef enum
  {
*************** static bool ec_member_matches_indexcol(P
*** 172,177 ****
--- 176,189 ----
  						   void *arg);
  static bool match_boolean_index_clause(Node *clause, int indexcol,
  						   IndexOptInfo *index);
+ static bool match_name_index_clause(Node *leftop,
+ 						Node *rightop,
+ 						Relids left_relids,
+ 						Relids right_relids,
+ 						Oid expr_op,
+ 						bool plain_op,
+ 						int indexcol,
+ 						IndexOptInfo *index);
  static bool match_special_index_operator(Expr *clause,
  							 Oid opfamily, Oid idxcollation,
  							 bool indexkey_on_left);
*************** match_clause_to_indexcol(IndexOptInfo *i
*** 2474,2479 ****
--- 2486,2504 ----
  		return false;
  	}

+ 	/*
+ 	 * If we didn't match above, perhaps it is a textual operator that we
+ 	 * could convert to a name operator.
+ 	 */
+ 	if (opfamily == NAME_BTREE_FAM_OID)
+ 	{
+ 		if (match_name_index_clause(leftop, rightop,
+ 									left_relids, right_relids,
+ 									expr_op, plain_op,
+ 									indexcol, index))
+ 			return true;
+ 	}
+ 
  	return false;
  }

*************** match_index_to_operand(Node *operand,
*** 3331,3337 ****
   * because constant simplification does the reverse transformation;
   * without this code there'd be no way to use such an index at all.)
   *
!  * Three routines are provided here:
   *
   * match_special_index_operator() is just an auxiliary function for
   * match_clause_to_indexcol(); after the latter fails to recognize a
--- 3356,3372 ----
   * because constant simplification does the reverse transformation;
   * without this code there'd be no way to use such an index at all.)
   *
!  * Another thing we do here is to support indexing of textual comparisons to
!  * "name" columns, for instance "WHERE pg_class.relname::text = something".
!  * We can handle that by replacing the text operator with a name operator.
!  *
!  * Because this infrastructure was designed for the lossy-operator case,
!  * we'll end up applying the original operator as a qpqual condition even
!  * though it is redundant in the "boolean" and "name" cases.  This is
!  * annoying, but such cases don't come up often enough to justify refactoring
!  * to avoid it (at least, not yet).
!  *
!  * Four routines are provided here:
   *
   * match_special_index_operator() is just an auxiliary function for
   * match_clause_to_indexcol(); after the latter fails to recognize a
*************** match_index_to_operand(Node *operand,
*** 3342,3353 ****
   * match_boolean_index_clause() similarly detects clauses that can be
   * converted into boolean equality operators.
   *
   * expand_indexqual_conditions() converts a list of RestrictInfo nodes
   * (with implicit AND semantics across list elements) into a list of clauses
   * that the executor can actually handle.  For operators that are members of
   * the index's opfamily this transformation is a no-op, but clauses recognized
!  * by match_special_index_operator() or match_boolean_index_clause() must be
!  * converted into one or more "regular" indexqual conditions.
   */

  /*
--- 3377,3391 ----
   * match_boolean_index_clause() similarly detects clauses that can be
   * converted into boolean equality operators.
   *
+  * match_name_index_clause() similarly detects clauses that can be
+  * converted into name comparison operators.
+  *
   * expand_indexqual_conditions() converts a list of RestrictInfo nodes
   * (with implicit AND semantics across list elements) into a list of clauses
   * that the executor can actually handle.  For operators that are members of
   * the index's opfamily this transformation is a no-op, but clauses recognized
!  * by the above routines must be converted into one or more "regular"
!  * indexqual conditions.
   */

  /*
*************** match_boolean_index_clause(Node *clause,
*** 3393,3398 ****
--- 3431,3523 ----
  }

  /*
+  * extract_name_from_text_coercion
+  *	  Helper function for match_name_index_clause.
+  *
+  * If the given operand represents coercion of a "name" value to "text"
+  * or "varchar", return the "name" subexpression.  Else return NULL.
+  */
+ static Node *
+ extract_name_from_text_coercion(Node *operand)
+ {
+ 	/* Ignore any RelabelType node above the operand */
+ 	if (operand && IsA(operand, RelabelType))
+ 		operand = (Node *) ((RelabelType *) operand)->arg;
+ 
+ 	/* Match if it's a call to name_text(), in either of its guises */
+ 	if (operand && IsA(operand, FuncExpr) &&
+ 		(((FuncExpr *) operand)->funcid == F_NAME_TEXT ||
+ 		 ((FuncExpr *) operand)->funcid == F_NAME_VARCHAR))
+ 	{
+ 		Assert(list_length(((FuncExpr *) operand)->args) == 1);
+ 		return (Node *) linitial(((FuncExpr *) operand)->args);
+ 	}
+ 
+ 	return NULL;
+ }
+ 
+ /*
+  * match_name_index_clause
+  *	  Recognize restriction clauses that can be matched to a "name" index.
+  *
+  * This should be called only when the index's operator family is
+  * NAME_BTREE_FAM_OID.  We check to see if the clause is a textual comparison
+  * clause that can be converted into a name comparison matching the index.
+  * Since this is called after match_clause_to_indexcol has disassembled the
+  * clause, we rely on that work rather than re-disassembling the clause.
+  *
+  * Converting texteq is safe because text and name have the same notion of
+  * equality, viz bitwise.  If we ever support collations in which text
+  * equality is not bitwise, we'd need to reject texteq clauses using such
+  * collations.  We cannot convert textual inequality clauses because they
+  * might use a different sort order.  (XXX: maybe convert if C locale?)
+  */
+ static bool
+ match_name_index_clause(Node *leftop,
+ 						Node *rightop,
+ 						Relids left_relids,
+ 						Relids right_relids,
+ 						Oid expr_op,
+ 						bool plain_op,
+ 						int indexcol,
+ 						IndexOptInfo *index)
+ {
+ 	Index		index_relid = index->rel->relid;
+ 	Node	   *nameop;
+ 
+ 	/*
+ 	 * For the moment, only consider text equality.
+ 	 */
+ 	if (expr_op != TextEqualOperator)
+ 		return false;
+ 
+ 	/* We don't support ScalarArrayOp either */
+ 	if (!plain_op)
+ 		return false;
+ 
+ 	/*
+ 	 * Check to see if we have (indexkey::text = constant), following the
+ 	 * same rules as in match_clause_to_indexcol about what is a constant.
+ 	 */
+ 	if ((nameop = extract_name_from_text_coercion(leftop)) != NULL &&
+ 		match_index_to_operand(nameop, indexcol, index) &&
+ 		!bms_is_member(index_relid, right_relids) &&
+ 		!contain_volatile_functions(rightop))
+ 		return true;
+ 
+ 	/*
+ 	 * Also try (constant = indexkey::text).
+ 	 */
+ 	if ((nameop = extract_name_from_text_coercion(rightop)) != NULL &&
+ 		match_index_to_operand(nameop, indexcol, index) &&
+ 		!bms_is_member(index_relid, left_relids) &&
+ 		!contain_volatile_functions(leftop))
+ 		return true;
+ 
+ 	return false;
+ }
+ 
+ /*
   * match_special_index_operator
   *	  Recognize restriction clauses that can be used to generate
   *	  additional indexscanable qualifications.
*************** expand_boolean_index_clause(Node *clause
*** 3732,3738 ****
   * The input is a single RestrictInfo, the output a list of RestrictInfos.
   *
   * In the base case this is just list_make1(), but we have to be prepared to
!  * expand special cases that were accepted by match_special_index_operator().
   */
  static List *
  expand_indexqual_opclause(RestrictInfo *rinfo, Oid opfamily, Oid idxcollation)
--- 3857,3864 ----
   * The input is a single RestrictInfo, the output a list of RestrictInfos.
   *
   * In the base case this is just list_make1(), but we have to be prepared to
!  * expand special cases that were accepted by match_special_index_operator()
!  * or match_name_index_clause().
   */
  static List *
  expand_indexqual_opclause(RestrictInfo *rinfo, Oid opfamily, Oid idxcollation)
*************** expand_indexqual_opclause(RestrictInfo *
*** 3814,3819 ****
--- 3940,3990 ----
  										   patt->constvalue);
  			}
  			break;
+ 
+ 			/*
+ 			 * This is in lots of opclasses, but we should only change it if
+ 			 * it was matched to a "name" index.
+ 			 */
+ 		case TextEqualOperator:
+ 			if (opfamily == NAME_BTREE_FAM_OID)
+ 			{
+ 				Node	   *nameop;
+ 				bool		leftisname = false;
+ 				bool		rightisname = false;
+ 				Oid			new_op;
+ 				Expr	   *opexpr;
+ 
+ 				/* Drop any coercion to text on either side */
+ 				if ((nameop = extract_name_from_text_coercion(leftop)) != NULL)
+ 				{
+ 					leftop = nameop;
+ 					leftisname = true;
+ 				}
+ 				if ((nameop = extract_name_from_text_coercion(rightop)) != NULL)
+ 				{
+ 					rightop = nameop;
+ 					rightisname = true;
+ 				}
+ 				/* Choose new operator, construct replacement clause */
+ 				if (leftisname)
+ 				{
+ 					if (rightisname)
+ 						new_op = NameEqualOperator;
+ 					else
+ 						new_op = NameEqualTextOperator;
+ 				}
+ 				else
+ 				{
+ 					/* Should not be here unless one side was coerced */
+ 					Assert(rightisname);
+ 					new_op = TextEqualNameOperator;
+ 				}
+ 				opexpr = make_opclause(new_op, BOOLOID, false,
+ 									   (Expr *) leftop, (Expr *) rightop,
+ 									   InvalidOid, InvalidOid);
+ 				return list_make1(make_simple_restrictinfo(opexpr));
+ 			}
+ 			break;
  	}

  	/* Default case: just make a list of the unmodified indexqual */
Hi,

On 2018-12-05 12:24:54 -0500, Tom Lane wrote:
> The core of the problem, I think, is that we're unable to convert the
> condition on table_name into an indexscan on pg_class.relname, because
> the view has cast pg_class.relname to the sql_identifier domain.
>
> There are two different issues in that.  One is that the domain might
> have constraints (though in reality it does not), so the planner can't
> throw away the CoerceToDomain node, and thus can't match the expression
> to the index.  Even if we did throw away the CoerceToDomain, it still
> would not work because the domain is declared to be over varchar, and
> so there's a cast-to-varchar underneath the CoerceToDomain.

Couldn't we make expression simplification replace CoerceToDomain with a
RelabelType if the constraint is simple enough? That's not entirely
trivial because we'd have to look into the typecache to get the
constraints, but that doesn't sound too bad.

> The most straightforward way to fix that would be to add some cross-type
> comparison operators to the name_ops operator family.

That seems reasonable.

> While we only
> directly need 'name = text' to make this case work, the opr_sanity
> regression test will complain if we don't supply a fairly complete set of
> operators, and I think not without reason.  So I experimented with doing
> that, as in the very-much-WIP 0001 patch below.  A name index can only
> cope with strcmp-style comparison semantics, not strcoll-style, so while
> it's okay to call the equality operator = I felt we'd better call the
> inequality operators ~<~ and so on.  While the patch as presented fixes
> the case shown above, it's not all good: the problem with having a new
> 'text = name' operator is that it will also capture cases where you would
> like to have a text index working with a comparison value of type "name".
> If 'text = name' isn't part of the text_ops opclass then that doesn't
> work.  I think that the reason for the join.sql regression test failure
> shown in patch 0001 is likewise that since 'text = name' isn't part of the
> text_ops opclass, the join elimination stuff is unable to prove that it
> can eliminate a join to a unique text column.  This could probably be
> fixed by creating yet another dozen cross-type operators that implement
> text vs name comparisons using strcoll semantics (hence, using the normal
> '<' operator names), and adding that set to the text_ops opfamily.
> I didn't do that here (yet), because it seems pretty tedious.

Is there a way we could make that less laborious by allowing a bit more
casting?

> 0002 is a really preliminary POC for eliminating CoerceToDomain nodes
> at plan time if the domain has no constraints to check.  While this is
> enough to check the effects on plan selection, it's certainly not
> committable as-is, because the resulting plan is broken if someone then
> adds a constraint to the domain.  (There is a regression test failure,
> not shown in 0002, for a case that tests exactly that scenario.)

Hah.

> What we would have to do to make 0002 committable is to add sinval
> signaling to invalidate any cached plan that's dependent on an
> assumption of no constraints on a domain.  This is something that
> we probably want anyway, since it would be necessary if we ever want
> to compile domain constraint expressions as part of the plan tree
> rather than leaving them to run-time.

Yea, that seems good. I don't like the fact that expression
initialization does the lookups for constraints right now.

> While the sinval additions per se would be fairly straightforward,
> I wonder whether anyone is likely to complain about the race conditions
> that will ensue from not taking any locks associated with the domain
> type; i.e. it's possible that a query would execute with slightly
> out-of-date assumptions about what constraints a domain has.  I suspect
> that we have race conditions like that already, but they might be worse
> if we inspect the domain during planning rather than at executor
> startup.  Is anyone worried enough about that to want to add locking
> overhead to domain usage?  (I'm not.)

I'm not either.

> 0001+0002 still don't get the job done: now we strip off the
> CoerceToDomain successfully, but we still have "relname::varchar"
> being compared to the comparison value, so we still can't match
> that to the index.  0003 is a somewhat klugy solution to that part,
> allowing indxpath.c to construct a name equality test from a texteq
> restriction condition.  (This is semantically OK because equality
> is equality in both name and text domains.  We could not convert
> inequalities, at least not as freely; maybe it could be done in
> C locale, but I've not done anything about that here.)
>
> With all three patches in place, we get
>
> regression=# explain select * from pg_class where relname::information_schema.sql_identifier = 'foo'::text;
>                                           QUERY PLAN
> ---------------------------------------------------------------------------------------------
>  Index Scan using pg_class_relname_nsp_index on pg_class  (cost=0.28..8.30 rows=7 width=781)
>    Index Cond: (relname = 'foo'::text)
>    Filter: ((relname)::text = 'foo'::text)
> (3 rows)
>
> so we've fixed the lack of indexscan but we're still a bit off about the
> statistics.  I don't have any immediate proposal about how to fix the
> latter, but anyway this is enough to make Pavel's case a lot better.
>
> 0003 essentially converts "namecol::text texteq textvalue" into
> "namecol nameeqtext textvalue", relying on the new equality
> operator introduced by 0001.

Ugh, that's indeed a bit kludgy. It'd be nice to have an approach that's
usable outside of one odd builtin type. I was wondering for a bit
whether we could have logic to move the cast to the other side of an
operator, but I don't see how we could make that generally safe.

Greetings,

Andres Freund
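[For readers reproducing this: the shape of the problem, and the direct-catalog workaround where the application can be changed, can be sketched in a psql session like this. The plan shapes in the comments are abbreviations of the EXPLAIN output shown upthread, not fresh runs.]

```sql
-- Slow: the information_schema view casts pg_class.relname to
-- sql_identifier (a domain over varchar), so the b-tree index on
-- relname cannot be matched.
EXPLAIN SELECT * FROM information_schema.tables
WHERE table_name = 'pg_class';
-- upthread this shows: Seq Scan on pg_class,
--   Filter: ((relname)::text = 'pg_class'::text)

-- Fast: querying the catalog directly with a name-typed comparison
-- value lets the planner use pg_class_relname_nsp_index.
EXPLAIN SELECT * FROM pg_class
WHERE relname = 'pg_class'::name;
-- upthread this shows: Index Scan using pg_class_relname_nsp_index
```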
Andres Freund <andres@anarazel.de> writes:
> On 2018-12-05 12:24:54 -0500, Tom Lane wrote:
>> There are two different issues in that.  One is that the domain might
>> have constraints (though in reality it does not), so the planner can't
>> throw away the CoerceToDomain node, and thus can't match the expression
>> to the index.  Even if we did throw away the CoerceToDomain, it still
>> would not work because the domain is declared to be over varchar, and
>> so there's a cast-to-varchar underneath the CoerceToDomain.

> Couldn't we make expression simplification replace CoerceToDomain with a
> RelabelType if the constraint is simple enough? That's not entirely
> trivial because we'd have to look into the typecache to get the
> constraints, but that doesn't sound too bad.

Not following what you have in mind here?  My 0002 throws away the
CoerceToDomain if there are *no* constraints, but I can't see any
situation in which we'd likely be able to ignore a constraint,
simple or not.

>> 0003 essentially converts "namecol::text texteq textvalue" into
>> "namecol nameeqtext textvalue", relying on the new equality
>> operator introduced by 0001.

> Ugh, that's indeed a bit kludgy. It'd be nice to have an approach that's
> usable outside of one odd builtin type. I was wondering for a bit
> whether we could have logic to move the cast to the other side of an
> operator, but I don't see how we could make that generally safe.

Yeah.  It seems like it could be a special case of a more general
expression transform facility, but we have no such facility now.

On the other hand, all of match_special_index_operator is an ugly
single-purpose kluge already, so I'm not feeling that awful about
throwing another special case into it.  Someday it would be nice
to replace that code with something more general and extensible,
but today is not that day as far as I'm concerned.

			regards, tom lane
Hi,

On 2018-12-05 13:22:23 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2018-12-05 12:24:54 -0500, Tom Lane wrote:
> >> There are two different issues in that.  One is that the domain might
> >> have constraints (though in reality it does not), so the planner can't
> >> throw away the CoerceToDomain node, and thus can't match the expression
> >> to the index.  Even if we did throw away the CoerceToDomain, it still
> >> would not work because the domain is declared to be over varchar, and
> >> so there's a cast-to-varchar underneath the CoerceToDomain.
>
> > Couldn't we make expression simplification replace CoerceToDomain with a
> > RelabelType if the constraint is simple enough? That's not entirely
> > trivial because we'd have to look into the typecache to get the
> > constraints, but that doesn't sound too bad.
>
> Not following what you have in mind here?  My 0002 throws away the
> CoerceToDomain if there are *no* constraints, but I can't see any
> situation in which we'd likely be able to ignore a constraint,
> simple or not.

Yea, simple probably means nonexistent for now. We could e.g. optimize
some NOT NULL checks away, but it's probably not worth it.

> >> 0003 essentially converts "namecol::text texteq textvalue" into
> >> "namecol nameeqtext textvalue", relying on the new equality
> >> operator introduced by 0001.
>
> > Ugh, that's indeed a bit kludgy. It'd be nice to have an approach that's
> > usable outside of one odd builtin type. I was wondering for a bit
> > whether we could have logic to move the cast to the other side of an
> > operator, but I don't see how we could make that generally safe.
>
> Yeah.  It seems like it could be a special case of a more general
> expression transform facility, but we have no such facility now.
>
> On the other hand, all of match_special_index_operator is an ugly
> single-purpose kluge already, so I'm not feeling that awful about
> throwing another special case into it.  Someday it would be nice
> to replace that code with something more general and extensible,
> but today is not that day as far as I'm concerned.

Fair enough.

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2018-12-05 13:22:23 -0500, Tom Lane wrote:
>> Not following what you have in mind here?  My 0002 throws away the
>> CoerceToDomain if there are *no* constraints, but I can't see any
>> situation in which we'd likely be able to ignore a constraint,
>> simple or not.

> Yea, simple probably means nonexistent for now. We could e.g. optimize
> some NOT NULL checks away, but it's probably not worth it.

Ah, yes, that is a case where we might have enough knowledge to prove
the test redundant --- but considering that we explicitly discourage
domain NOT NULL as bad style and not fully supported, I can't get
excited about it.  I suppose in some cases we might be able to use
predtest.c to prove domain CHECK constraints redundant, but I bet that
it's not worth the trouble.

The bigger picture here is that people seem to like to use domains
as simple type aliases, which will never have any constraints, but
we're very dumb about that today.  So the patch as presented seems
like it has lots of potential applicability, as long as we fix the
invalidation aspect.

			regards, tom lane
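[The "domains as simple type aliases" pattern under discussion can be illustrated with a hypothetical domain; the name `order_id` is made up for the example. A constraint-free domain is what 0002 can reduce to a cheap RelabelType at plan time, and adding a constraint afterwards is exactly the event that would have to invalidate any cached plans built under the no-constraints assumption.]

```sql
-- A constraint-free domain used purely as a type alias: its
-- CoerceToDomain node checks nothing at runtime, so 0002 could
-- replace it with a RelabelType during planning.
CREATE DOMAIN order_id AS bigint;

-- The hazard: adding a constraint later means any plan that dropped
-- the CoerceToDomain is now wrong, hence the need for sinval-based
-- plan invalidation discussed here.
ALTER DOMAIN order_id ADD CHECK (VALUE > 0);
```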
On 2018-12-05 13:41:32 -0500, Tom Lane wrote:
> The bigger picture here is that people seem to like to use domains
> as simple type aliases, which will never have any constraints, but
> we're very dumb about that today.  So the patch as presented seems
> like it has lots of potential applicability, as long as we fix the
> invalidation aspect.

Entirely agreed.
On Wed, Dec 5, 2018 at 1:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ah, yes, that is a case where we might have enough knowledge to prove
> the test redundant --- but considering that we explicitly discourage
> domain NOT NULL as bad style and not fully supported, I can't get
> excited about it.  I suppose in some cases we might be able to use
> predtest.c to prove domain CHECK constraints redundant, but I bet that
> it's not worth the trouble.
>
> The bigger picture here is that people seem to like to use domains
> as simple type aliases, which will never have any constraints, but
> we're very dumb about that today.  So the patch as presented seems
> like it has lots of potential applicability, as long as we fix the
> invalidation aspect.

I'm not thrilled about depending on sinval without locking,
particularly given that my proposal to make sure we
AcceptInvalidationMessages() at least once per query was shouted down.
That means that in corner cases, you could execute many queries
without flushing the old cache entries.  However, I do agree that
locking every type, function, operator, etc. involved in the query is
impractical.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> I'm not thrilled about depending on sinval without locking,
> particularly given that my proposal to make sure we
> AcceptInvalidationMessages() at least once per query was shouted down.

It's fairly hard to imagine practical cases where we'd not call
AcceptInvalidationMessages at least once per query, so I'm not
very sure what you're on about.

			regards, tom lane
On Thu, Dec 6, 2018 at 11:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > I'm not thrilled about depending on sinval without locking,
> > particularly given that my proposal to make sure we
> > AcceptInvalidationMessages() at least once per query was shouted down.
>
> It's fairly hard to imagine practical cases where we'd not call
> AcceptInvalidationMessages at least once per query, so I'm not
> very sure what you're on about.

Unless I'm confused, it happens any time you run a query that only
touches tables using lockmodes previously acquired by the current
transaction.  Like:

BEGIN;
some query;
the same query again;

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Dec 6, 2018 at 11:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's fairly hard to imagine practical cases where we'd not call
>> AcceptInvalidationMessages at least once per query, so I'm not
>> very sure what you're on about.

> Unless I'm confused, it happens any time you run a query that only
> touches tables using lockmodes previously acquired by the current
> transaction.  Like:
>
> BEGIN;
> some query;
> the same query again;

In my testing, that still hits AIM() during parserOpenTable().

[ further experimentation... ]  It looks like if you prepare
a query and then just execute it repeatedly in one transaction,
you'd not reach AIM (as long as you were getting generic plans).
Possibly that's a gap worth closing.

			regards, tom lane
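[The prepared-statement gap Tom describes can be sketched as a session; this assumes the plancache settles on a generic plan, and the comments summarize the behavior reported above rather than anything verified independently:]

```sql
BEGIN;
PREPARE q AS SELECT count(*) FROM pg_class;
EXECUTE q;  -- first execution opens pg_class, which reaches
            -- AcceptInvalidationMessages() along the way
EXECUTE q;  -- re-uses the cached generic plan under locks already held;
            -- per this discussion, AcceptInvalidationMessages() may not
            -- be reached again for the rest of the transaction
COMMIT;
```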
On Thu, Dec 6, 2018 at 12:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In my testing, that still hits AIM() during parserOpenTable().

Oh, I see.  relation_openrv_extended() calls it.

> [ further experimentation... ]  It looks like if you prepare
> a query and then just execute it repeatedly in one transaction,
> you'd not reach AIM (as long as you were getting generic plans).
> Possibly that's a gap worth closing.

If we called it at the start of every query, couldn't we dispense with
the call in relation_openrv_extended()?  On net, that would actually
mean fewer calls to AcceptInvalidationMessages(), assuming you
sometimes run queries that touch multiple relations.  And the behavior
would be more predictable, too, because you'd (hopefully) have no
cases where a command failed to see the results of DDL that committed
before the command was issued.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Dec 6, 2018 at 12:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> [ further experimentation... ]  It looks like if you prepare
>> a query and then just execute it repeatedly in one transaction,
>> you'd not reach AIM (as long as you were getting generic plans).
>> Possibly that's a gap worth closing.

> If we called it at the start of every query, couldn't we dispense with
> the call in relation_openrv_extended()?

No.  You need to do AIM *after* obtaining the lock, else you still
have the race condition that you can execute a query on a table
without being aware of recent DDL on it.

What we could possibly do to close the gap cited above is to have
plancache.c's CheckCachedPlan force an AIM call if it notices that
the plan it wants to re-use has a non-empty PlanInvalItems list.
If it does not, then there is nothing that AIM would cause invalidation
for anyway.  This still leaves us with a query-duration-sized race
condition window for DDL on functions and domains, but not any larger
than that.

			regards, tom lane
On Thu, Dec 6, 2018 at 12:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > If we called it at the start of every query, couldn't we dispense with
> > the call in relation_openrv_extended()?
>
> No.  You need to do AIM *after* obtaining the lock, else you still
> have the race condition that you can execute a query on a table
> without being aware of recent DDL on it.

Huh?  The call in relation_openrv_extended() happens before acquiring
the lock.

> What we could possibly do to close the gap cited above is to have
> plancache.c's CheckCachedPlan force an AIM call if it notices that
> the plan it wants to re-use has a non-empty PlanInvalItems list.
> If it does not, then there is nothing that AIM would cause invalidation
> for anyway.  This still leaves us with a query-duration-sized race
> condition window for DDL on functions and domains, but not any larger
> than that.

That would be a nice place to get.  Not perfect, but better than now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Dec 6, 2018 at 12:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> No.  You need to do AIM *after* obtaining the lock, else you still
>> have the race condition that you can execute a query on a table
>> without being aware of recent DDL on it.

> Huh?  The call in relation_openrv_extended() happens before acquiring
> the lock.

Oh, I was looking at the wrong AIM call :-(.  You're talking about
this one:

	/*
	 * Check for shared-cache-inval messages before trying to open the
	 * relation.  This is needed even if we already hold a lock on the
	 * relation, because GRANT/REVOKE are executed without taking any lock
	 * on the target relation, and we want to be sure we see current ACL
	 * information.  We can skip this if asked for NoLock, on the
	 * assumption that such a call is not the first one in the current
	 * command, and so we should be reasonably up-to-date already.
	 * (XXX this all could stand to be redesigned, but for the moment
	 * we'll keep doing this like it's been done historically.)
	 */
	if (lockmode != NoLock)
		AcceptInvalidationMessages();

which is indeed about as bletcherous as it could possibly be.

The ideal thing would be for GRANT/REVOKE to act more like other DDL,
but I'm not sure how we get to that point: REVOKE SELECT, at the very
least, would have to take AccessExclusiveLock so that it'd block
SELECT.  People doing things like GRANT/REVOKE ON ALL TABLES would
doubtless complain about the greatly increased risk of deadlock from
taking a pile of AELs.  And that still would only fix the problem for
table privileges, not privileges on other object types that we have no
consistent locking scheme for.

I can see the attraction of replacing these AIM calls (and, probably,
the one in AtStart_Cache) with a single call that occurs somewhere
during statement startup.  Not sure exactly where that should be; but
there is certainly not anything principled about doing it in
relation_open_xxx().

			regards, tom lane
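[The lockless GRANT/REVOKE behavior described in that comment can be sketched as an interleaving of two hypothetical sessions; the table and role names are made up, and the outcome comment only restates the race the comment warns about:]

```sql
-- Session 1
BEGIN;
SELECT * FROM accounts;   -- takes AccessShareLock, sees old ACL

-- Session 2 (concurrently; takes NO lock on accounts, so it is
-- not blocked by session 1's open transaction)
REVOKE SELECT ON accounts FROM reporting_role;

-- Session 1, still in the same transaction:
SELECT * FROM accounts;   -- whether this sees the REVOKE depends on
                          -- when AcceptInvalidationMessages() next
                          -- runs in this backend
COMMIT;
```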
I wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> Slow query
>> select * from information_schema.tables where table_name = 'pg_class';

> Yeah.  This has been complained of many times before.
> The core of the problem, I think, is that we're unable to convert the
> condition on table_name into an indexscan on pg_class.relname, because
> the view has cast pg_class.relname to the sql_identifier domain.
> There are two different issues in that.  One is that the domain might
> have constraints (though in reality it does not), so the planner can't
> throw away the CoerceToDomain node, and thus can't match the expression
> to the index.  Even if we did throw away the CoerceToDomain, it still
> would not work because the domain is declared to be over varchar, and
> so there's a cast-to-varchar underneath the CoerceToDomain.

After my last few commits, the only issue that's left here is the
cast-to-varchar implied by casting to sql_identifier.  Upthread
I showed a possible planner hack to get rid of that, and we could
still solve it that way so far as allowing indexscans on catalogs
is concerned.  However, I wonder what people would think of a
more aggressive approach, viz:

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 0fbcfa8..3891e3b 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -216,7 +216,7 @@ CREATE DOMAIN character_data AS character varying COLLATE "C";
  * SQL_IDENTIFIER domain
  */
 
-CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
+CREATE DOMAIN sql_identifier AS name;

I've not checked to verify that sql_identifier is used for all and only
those view columns that expose "name" catalog columns.  If the SQL
committee was sloppy about that, this idea might not work.  But assuming
that the length restriction is valid for the columns that have this
type, would this be an OK idea?  It does seem to fix the poor-plan-quality
problem at a stroke, with no weird planner hacks.

What I find in the SQL spec is

    5.5 SQL_IDENTIFIER domain

    Function
    Define a domain that contains all valid <identifier body>s and
    <delimited identifier body>s.

    Definition
    CREATE DOMAIN SQL_IDENTIFIER AS
        CHARACTER VARYING (L)
        CHARACTER SET SQL_IDENTIFIER;
    GRANT USAGE ON DOMAIN SQL_IDENTIFIER
        TO PUBLIC WITH GRANT OPTION;

    Description
    1) This domain specifies all variable-length character values that
       conform to the rules for formation and representation of an SQL
       <identifier body> or an SQL <delimited identifier body>.
       NOTE 4 - There is no way in SQL to specify a <domain
       constraint> that would be true for the body of any valid SQL
       <regular identifier> or <delimited identifier> and false for all
       other character string values.
    2) L is the implementation-defined maximum length of <identifier
       body> and <delimited identifier body>.

So we'd be violating the part of the spec that says that the domain's
base type is varchar, but considering all the other requirements here
that we're blithely ignoring, maybe that's not such a sin.  With the
recent collation changes, type name is hard to functionally distinguish
from a domain over varchar anyway.  Furthermore, since name's length limit
corresponds to the "implementation-defined maximum length" part of the
spec, you could argue that in some ways this definition is closer to the
spec than what we've got now.

Thoughts?

			regards, tom lane
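[Editor's note: the "I've not checked to verify" caveat above can be at least partially checked from SQL itself, since the columns view records the declared domain of each view column. A rough sketch, assuming a stock installation; this only shows which columns are declared sql_identifier, not which underlying catalog column or expression feeds them, so information_schema.sql still needs manual inspection:]

```sql
-- List every information_schema view column declared as sql_identifier.
-- Expression-backed columns (e.g. dtd_identifier, specific_name) look
-- the same here as plain "name"-backed ones.
SELECT table_name, column_name
FROM information_schema.columns
WHERE table_schema = 'information_schema'
  AND domain_name = 'sql_identifier'
ORDER BY table_name, ordinal_position;
```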
On Thu, Dec 20, 2018 at 12:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>> Slow query
>>> select * from information_schema.tables where table_name = 'pg_class';
>> Yeah.  This has been complained of many times before.
>> The core of the problem, I think, is that we're unable to convert the
>> condition on table_name into an indexscan on pg_class.relname, because
>> the view has cast pg_class.relname to the sql_identifier domain.
>> There are two different issues in that.  One is that the domain might
>> have constraints (though in reality it does not), so the planner can't
>> throw away the CoerceToDomain node, and thus can't match the expression
>> to the index.  Even if we did throw away the CoerceToDomain, it still
>> would not work because the domain is declared to be over varchar, and
>> so there's a cast-to-varchar underneath the CoerceToDomain.
>
> After my last few commits, the only issue that's left here is the
> cast-to-varchar implied by casting to sql_identifier.  Upthread
> I showed a possible planner hack to get rid of that, and we could
> still solve it that way so far as allowing indexscans on catalogs
> is concerned.  However, I wonder what people would think of a
> more aggressive approach, viz:
>
> diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
> index 0fbcfa8..3891e3b 100644
> --- a/src/backend/catalog/information_schema.sql
> +++ b/src/backend/catalog/information_schema.sql
> @@ -216,7 +216,7 @@ CREATE DOMAIN character_data AS character varying COLLATE "C";
>   * SQL_IDENTIFIER domain
>   */
>
> -CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
> +CREATE DOMAIN sql_identifier AS name;
>
> I've not checked to verify that sql_identifier is used for all and only
> those view columns that expose "name" catalog columns.  If the SQL
> committee was sloppy about that, this idea might not work.  But assuming
> that the length restriction is valid for the columns that have this
> type, would this be an OK idea?  It does seem to fix the poor-plan-quality
> problem at a stroke, with no weird planner hacks.
>
> What I find in the SQL spec is
>
> 5.5 SQL_IDENTIFIER domain
>
> Function
> Define a domain that contains all valid <identifier body>s and
> <delimited identifier body>s.
>
> Definition
> CREATE DOMAIN SQL_IDENTIFIER AS
>     CHARACTER VARYING (L)
>     CHARACTER SET SQL_IDENTIFIER;
> GRANT USAGE ON DOMAIN SQL_IDENTIFIER
>     TO PUBLIC WITH GRANT OPTION;
>
> Description
> 1) This domain specifies all variable-length character values that
>    conform to the rules for formation and representation of an SQL
>    <identifier body> or an SQL <delimited identifier body>.
>    NOTE 4 - There is no way in SQL to specify a <domain
>    constraint> that would be true for the body of any valid SQL
>    <regular identifier> or <delimited identifier> and false for all
>    other character string values.
> 2) L is the implementation-defined maximum length of <identifier
>    body> and <delimited identifier body>.
>
> So we'd be violating the part of the spec that says that the domain's
> base type is varchar, but considering all the other requirements here
> that we're blithely ignoring, maybe that's not such a sin.  With the
> recent collation changes, type name is hard to functionally distinguish
> from a domain over varchar anyway.  Furthermore, since name's length limit
> corresponds to the "implementation-defined maximum length" part of the
> spec, you could argue that in some ways this definition is closer to the
> spec than what we've got now.
>
> Thoughts?
A very common case will be comparison with the text type - something like

SELECT * FROM information_schema.tables WHERE table_name = lower('somename');
> 			regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
> On Thu, Dec 20, 2018 at 12:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> After my last few commits, the only issue that's left here is the
>> cast-to-varchar implied by casting to sql_identifier.  Upthread
>> I showed a possible planner hack to get rid of that, and we could
>> still solve it that way so far as allowing indexscans on catalogs
>> is concerned.  However, I wonder what people would think of a
>> more aggressive approach, viz:
>> -CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
>> +CREATE DOMAIN sql_identifier AS name;

> A very common case will be comparison with the text type - something like
> SELECT * FROM information_schema.tables WHERE table_name =
> lower('somename');

Yeah, that's not really an issue.  After applying the above one-liner
to HEAD, I get plans like this:

regression=# explain SELECT * FROM information_schema.tables WHERE table_name = lower('somename');
                                  QUERY PLAN
------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=8.76..18.60 rows=1 width=608)
   ->  Hash Join  (cost=8.34..10.07 rows=1 width=141)
         Hash Cond: (nc.oid = c.relnamespace)
         ->  Seq Scan on pg_namespace nc  (cost=0.00..1.62 rows=33 width=68)
               Filter: (NOT pg_is_other_temp_schema(oid))
         ->  Hash  (cost=8.33..8.33 rows=1 width=77)
               ->  Index Scan using pg_class_relname_nsp_index on pg_class c  (cost=0.28..8.33 rows=1 width=77)
                     Index Cond: ((relname)::name = 'somename'::text)
                     Filter: ((relkind = ANY ('{r,v,f,p}'::"char"[])) AND (pg_has_role(relowner, 'USAGE'::text) OR has_table_privilege(oid, 'SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR has_any_column_privilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text)))
   ->  Nested Loop  (cost=0.42..8.46 rows=1 width=132)
         ->  Index Scan using pg_type_oid_index on pg_type t  (cost=0.28..8.29 rows=1 width=72)
               Index Cond: (c.reloftype = oid)
         ->  Index Scan using pg_namespace_oid_index on pg_namespace nt  (cost=0.14..0.16 rows=1 width=68)
               Index Cond: (oid = t.typnamespace)
(14 rows)

You could surely argue about whether this is too complicated, but it's not
the planner's fault that we've got so many conditions here ...

			regards, tom lane
On Thu, Dec 20, 2018 at 5:29 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> On Thu, Dec 20, 2018 at 12:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> After my last few commits, the only issue that's left here is the
>>> cast-to-varchar implied by casting to sql_identifier.  Upthread
>>> I showed a possible planner hack to get rid of that, and we could
>>> still solve it that way so far as allowing indexscans on catalogs
>>> is concerned.  However, I wonder what people would think of a
>>> more aggressive approach, viz:
>>> -CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
>>> +CREATE DOMAIN sql_identifier AS name;
>
>> A very common case will be comparison with the text type - something like
>> SELECT * FROM information_schema.tables WHERE table_name =
>> lower('somename');
>
> Yeah, that's not really an issue.  After applying the above one-liner
> to HEAD, I get plans like this:
>
> regression=# explain SELECT * FROM information_schema.tables WHERE table_name =
> lower('somename');
>                                   QUERY PLAN
> ------------------------------------------------------------------------------
>  Nested Loop Left Join  (cost=8.76..18.60 rows=1 width=608)
>    ->  Hash Join  (cost=8.34..10.07 rows=1 width=141)
>          Hash Cond: (nc.oid = c.relnamespace)
>          ->  Seq Scan on pg_namespace nc  (cost=0.00..1.62 rows=33 width=68)
>                Filter: (NOT pg_is_other_temp_schema(oid))
>          ->  Hash  (cost=8.33..8.33 rows=1 width=77)
>                ->  Index Scan using pg_class_relname_nsp_index on pg_class c  (cost=0.28..8.33 rows=1 width=77)
>                      Index Cond: ((relname)::name = 'somename'::text)
>                      Filter: ((relkind = ANY ('{r,v,f,p}'::"char"[])) AND (pg_has_role(relowner, 'USAGE'::text) OR has_table_privilege(oid, 'SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR has_any_column_privilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text)))
>    ->  Nested Loop  (cost=0.42..8.46 rows=1 width=132)
>          ->  Index Scan using pg_type_oid_index on pg_type t  (cost=0.28..8.29 rows=1 width=72)
>                Index Cond: (c.reloftype = oid)
>          ->  Index Scan using pg_namespace_oid_index on pg_namespace nt  (cost=0.14..0.16 rows=1 width=68)
>                Index Cond: (oid = t.typnamespace)
> (14 rows)
>
> You could surely argue about whether this is too complicated, but it's not
> the planner's fault that we've got so many conditions here ...
this plan looks great
Pavel
> 			regards, tom lane
I wrote:
> ... However, I wonder what people would think of a
> more aggressive approach, viz:
> -CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
> +CREATE DOMAIN sql_identifier AS name;
>
> I've not checked to verify that sql_identifier is used for all and only
> those view columns that expose "name" catalog columns.  If the SQL
> committee was sloppy about that, this idea might not work.

I poked into this by instrumenting the parser to see what type
conversions it inserts into the information_schema views.  It appears
that the vast majority of conversions to sql_identifier are indeed on
"name" columns, but we have some occurrences of cases like this:

    CAST(a.attnum AS sql_identifier) AS dtd_identifier,

and some like this:

    CAST(p.proname || '_' || CAST(p.oid AS text) AS sql_identifier) AS specific_name

It doesn't look to me like converting to name rather than varchar would
have any real performance consequence in either case: certainly we're
not going to be able to propagate WHERE conditions on these view columns
back to any existing catalog index, regardless of the cast.

However, the second case offers something else to worry about: what if
the concatenation yields a string longer than NAMEDATALEN?  As the code
stands, the view will simply return a string that's too long to be a
name, which arguably is a violation of SQL spec.  If we change
sql_identifier to be "name", the cast will silently truncate, which also
arguably is a violation of SQL spec, because I think specific_name is
supposed to be unique.  (The point of concatenating the function OID is
to make it so.)

Perhaps we could fix this by truncating the p.proname part to ensure
that the concatenation result fits in NAMEDATALEN.  I'm not sure about
a good way to get a correct value of NAMEDATALEN into the
information_schema script, though.  Worse, I don't think we expose any
convenient way to truncate a string based on byte length rather than
character length (substr() does the latter).

So I think that a reasonable way to tackle this might be to provide a
C function along the lines of

    nameconcatoid(name, oid) returns name

which contracts to produce "$1 || '_' || $2" while truncating $1 only
as much as needed to make the result fit in NAMEDATALEN.

			regards, tom lane
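[Editor's note: the contract sketched above can be illustrated outside the backend. The function name and structure here are hypothetical stand-ins; a real implementation would live in the server and use its multibyte-aware truncation rather than plain byte slicing:]

```c
#include <stdio.h>
#include <string.h>

#define NAMEDATALEN 64          /* PostgreSQL's compile-time limit, incl. NUL */

/*
 * Illustrative stand-in for the proposed nameconcatoid(): build
 * "name_oid", truncating only the name part (byte-wise here) just
 * enough that the result, including its terminating NUL, fits in
 * NAMEDATALEN.  The "_oid" suffix is never truncated, which is what
 * preserves the uniqueness of specific_name.
 */
static void
name_concat_oid(char *dst, const char *name, unsigned int oid)
{
    char        suffix[16];     /* "_" + up to 10 digits + NUL */
    size_t      sufflen;
    size_t      namelen;

    snprintf(suffix, sizeof(suffix), "_%u", oid);
    sufflen = strlen(suffix);
    namelen = strlen(name);
    if (namelen + sufflen > NAMEDATALEN - 1)
        namelen = NAMEDATALEN - 1 - sufflen;
    memcpy(dst, name, namelen);
    strcpy(dst + namelen, suffix);
}
```

Short names pass through unchanged ("pg_class", 1259 gives "pg_class_1259"); only overlong names lose trailing bytes.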
Just brainstorming here. Another crazy idea would be to get rid of the "name" data type, at least from the user-visible planner point of view. It would probably still have to be stored as a fixed-length data type like today, but with a one-byte length header. That would make it possible for the planner to treat it as if it were just another varchar.
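[Editor's note: for context on why this would be a storage-level change, today's "name" is a fixed-width type, unlike varlena types such as text. That is visible directly in the catalog:]

```sql
-- A fixed-width type reports its byte width in typlen (NAMEDATALEN,
-- 64 in a stock build, for "name"); variable-length types report -1.
SELECT typname, typlen FROM pg_type WHERE typname IN ('name', 'text');
```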