Re: slow queries over information schema.tables

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: slow queries over information schema.tables
Дата
Msg-id CAFj8pRDrfRR8MNxPTuPnN27FUcNeKwiPgiFttzt5K_XVWKOH7g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: slow queries over information schema.tables  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: slow queries over information schema.tables  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers


čt 20. 12. 2018 v 0:14 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> Slow query
>> select * from information_schema.tables where table_name = 'pg_class';

> Yeah.  This has been complained of many times before.

> The core of the problem, I think, is that we're unable to convert the
> condition on table_name into an indexscan on pg_class.relname, because
> the view has cast pg_class.relname to the sql_identifier domain.

> There are two different issues in that.  One is that the domain might
> have constraints (though in reality it does not), so the planner can't
> throw away the CoerceToDomain node, and thus can't match the expression
> to the index.  Even if we did throw away the CoerceToDomain, it still
> would not work because the domain is declared to be over varchar, and
> so there's a cast-to-varchar underneath the CoerceToDomain.

After my last few commits, the only issue that's left here is the
cast-to-varchar implied by casting to sql_identifier.  Upthread
I showed a possible planner hack to get rid of that, and we could
still solve it that way so far as allowing indexscans on catalogs
is concerned.  However, I wonder what people would think of a
more aggressive approach, viz:

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 0fbcfa8..3891e3b 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -216,7 +216,7 @@ CREATE DOMAIN character_data AS character varying COLLATE "C";
  * SQL_IDENTIFIER domain
  */

-CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
+CREATE DOMAIN sql_identifier AS name;



I've not checked to verify that sql_identifier is used for all and only
those view columns that expose "name" catalog columns.  If the SQL
committee was sloppy about that, this idea might not work.  But assuming
that the length restriction is valid for the columns that have this
type, would this be an OK idea?  It does seem to fix the poor-plan-quality
problem at a stroke, with no weird planner hacks.

What I find in the SQL spec is

         5.5  SQL_IDENTIFIER domain

         Function

         Define a domain that contains all valid <identifier body>s and
         <delimited identifier body>s.

         Definition

         CREATE DOMAIN SQL_IDENTIFIER AS
             CHARACTER VARYING (L)
             CHARACTER SET SQL_IDENTIFIER;

         GRANT USAGE ON DOMAIN SQL_IDENTIFIER
             TO PUBLIC WITH GRANT OPTION;

         Description

         1) This domain specifies all variable-length character values that
            conform to the rules for formation and representation of an SQL
            <identifier body> or an SQL <delimited identifier body>.

            NOTE 4 - There is no way in SQL to specify a <domain
            constraint> that would be true for the body of any valid SQL
            <regular identifier> or <delimited identifier> and false for all
            other character string values.

         2) L is the implementation-defined maximum length of <identifier
            body> and <delimited identifier body>.

So we'd be violating the part of the spec that says that the domain's
base type is varchar, but considering all the other requirements here
that we're blithely ignoring, maybe that's not such a sin.  With the
recent collation changes, type name is hard to functionally distinguish
from a domain over varchar anyway.  Furthermore, since name's length limit
corresponds to the "implementation-defined maximum length" part of the
spec, you could argue that in some ways this definition is closer to the
spec than what we've got now.

Thoughts?

The very common will be compare with text type - some like

SELECT * FROM information_schema.tables WHERE table_name = lower('somename');



                        regards, tom lane

В списке pgsql-hackers по дате отправления:

Предыдущее
От: John Naylor
Дата:
Сообщение: Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Следующее
От: David Rowley
Дата:
Сообщение: Switching to 64-bit Bitmapsets