Обсуждение: [GENERAL] optimize pg_tables query ( text vs varchar ) ...why ?

Поиск
Список
Период
Сортировка

[GENERAL] optimize pg_tables query ( text vs varchar ) ...why ?

От
gmb
Дата:
Hi
For DDL purposes we make significant use of pg_catalog tables/views.
Were investigating performance issues in a typical function:

CREATE FUNCTION tableexists( s TEXT , t TEXT ) returns boolean as
$$
  SELECT count(tablename) = 1 FROM pg_tables WHERE schemaname=$1 and
tablename=$2;
$$
language sql

When change the params of above function to VARCHAR (instead of TEXT),
performance improved dramatically.
We then changed params to NAME ( as per pg_tables column type ) , but the
performance stayed more or less the same.

Can somebody explain this to me ? Is there a better way in which to handle
these ?
(This will be implemented on most object in the catalog e.g. columns,
sequences, functions, etc )

Regards
gmb



--
View this message in context:
http://www.postgresql-archive.org/optimize-pg-tables-query-text-vs-varchar-why-tp5978592.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: [GENERAL] optimize pg_tables query ( text vs varchar ) ...why ?

От
vinny
Дата:
On 2017-08-16 14:41, gmb wrote:
> Hi
> For DDL purposes we make significant use of pg_catalog tables/views.
> Were investigating performance issues in a typical function:
>
> CREATE FUNCTION tableexists( s TEXT , t TEXT ) returns boolean as
> $$
>   SELECT count(tablename) = 1 FROM pg_tables WHERE schemaname=$1 and
> tablename=$2;
> $$
> language sql
>
> When change the params of above function to VARCHAR (instead of TEXT),
> performance improved dramatically.
> We then changed params to NAME ( as per pg_tables column type ) , but
> the
> performance stayed more or less the same.
>
> Can somebody explain this to me ? Is there a better way in which to
> handle
> these ?
> (This will be implemented on most object in the catalog e.g. columns,
> sequences, functions, etc )
>
> Regards
> gmb
>
>
>
> --
> View this message in context:
> http://www.postgresql-archive.org/optimize-pg-tables-query-text-vs-varchar-why-tp5978592.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.


A wild stab in the dark: typecasting?
pg_tables returns 'name' type, not TEXT, so some sort of transformation
has to be done and that takestime.


Re: [GENERAL] optimize pg_tables query ( text vs varchar ) ...why ?

От
Tom Lane
Дата:
gmb <gmbouwer@gmail.com> writes:
> CREATE FUNCTION tableexists( s TEXT , t TEXT ) returns boolean as
> $$
>   SELECT count(tablename) = 1 FROM pg_tables WHERE schemaname=$1 and
> tablename=$2;
> $$
> language sql

> When change the params of above function to VARCHAR (instead of TEXT),
> performance improved dramatically.
> We then changed params to NAME ( as per pg_tables column type ) , but the
> performance stayed more or less the same.

> Can somebody explain this to me ?

The parser has two plausible choices for interpreting the "=" operators
in your WHERE clause: they could mean the text = text operator, or the
name = name operator.  (Type varchar has no operators of its own.)

When the presented situation is name = text, the parser will choose
the text = text operator because text is a preferred type.  When the
presented situation is name = varchar or name = name, it will choose
the name = name operator due to being a closer match.  See
https://www.postgresql.org/docs/current/static/typeconv.html
So you end up with either something like "schemaname::text = param"
or "schemaname = param::name".

After that, the planner has to implement the query, and the problem
is that the available indexes are on "schemaname" not "schemaname::text",
and they can only use the name = name operator anyway.  So you're
getting either a plan like

regression=# explain SELECT count(tablename) = 1 FROM pg_tables WHERE schemaname='foo'::text and tablename='bar'::text;
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Aggregate  (cost=116.23..116.24 rows=1 width=1)
   ->  Nested Loop  (cost=0.00..116.22 rows=1 width=64)
         Join Filter: (c.relnamespace = n.oid)
         ->  Seq Scan on pg_namespace n  (cost=0.00..1.72 rows=1 width=4)
               Filter: ((nspname)::text = 'foo'::text)
         ->  Seq Scan on pg_class c  (cost=0.00..114.48 rows=2 width=72)
               Filter: ((relkind = ANY ('{r,p}'::"char"[])) AND ((relname)::text = 'bar'::text))
(7 rows)

or one like

regression=# explain SELECT count(tablename) = 1 FROM pg_tables WHERE schemaname='foo'::name and tablename='bar'::name;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Aggregate  (cost=9.91..9.92 rows=1 width=1)
   ->  Nested Loop  (cost=0.28..9.91 rows=1 width=64)
         Join Filter: (c.relnamespace = n.oid)
         ->  Index Scan using pg_class_relname_nsp_index on pg_class c  (cost=0.28..8.30 rows=1 width=72)
               Index Cond: (relname = 'bar'::name)
               Filter: (relkind = ANY ('{r,p}'::"char"[]))
         ->  Seq Scan on pg_namespace n  (cost=0.00..1.60 rows=1 width=4)
               Filter: (nspname = 'foo'::name)
(8 rows)

You don't generally have to worry about this when you're writing queries
with simple literal comparison values, because the parser will interpret
untyped literals as having the appropriate type automatically.  But in a
function, those parameters already have types, and they might not be the
most desirable ones for the purpose.

Personally I'd have left the function parameters as text and inserted
explicit coercions:

  SELECT count(tablename) = 1 FROM pg_tables
  WHERE schemaname = $1::name and tablename = $2::name;

            regards, tom lane


Re: [GENERAL] optimize pg_tables query ( text vs varchar ) ...why ?

От
gmb
Дата:
Thanks for taking the time, Tom.


Tom Lane-2 wrote
> After that, the planner has to implement the query, and the problem
> is that the available indexes are on "schemaname" not "schemaname::text",
> and they can only use the name = name operator anyway.

Did some digging earlier, and found exactly what you refer to here: seq scan
when using 'text' and index scan on 'name'.
I was not aware that an "incorrect" typecast can have that effect on how the
planner choose to use indexes or not use them ( may have to go back and
review a lot of other poor performing queries as well ).


Tom Lane-2 wrote
> Personally I'd have left the function parameters as text and inserted
> explicit coercions:

Just out of curiosity , is there a reason why this will be you preference ?
I ran some benchmarks using *function tableexists(  s name, t name )* and
the performance turned out pretty well.
I guess, from a "readability" point of view it may be unclear to uninformed
people what a 'name' type actually is ( it was unknown to me until this
morning ), so that may be reason enough to stick with "known" types like
TEXT.

Thanks, appreciate this.

Regards
gmb



--
View this message in context:
http://www.postgresql-archive.org/optimize-pg-tables-query-text-vs-varchar-why-tp5978592p5978619.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: [GENERAL] optimize pg_tables query ( text vs varchar ) ...why ?

От
Tom Lane
Дата:
gmb <gmbouwer@gmail.com> writes:
> Tom Lane-2 wrote
>> Personally I'd have left the function parameters as text and inserted
>> explicit coercions:

> Just out of curiosity , is there a reason why this will be you preference ?

Well, if the rest of your code thinks that table names are of type text
(which is reasonable, as most of what you might want to do with them
would be better served by text), then it seems like you want to keep
this odd catalog interaction isolated within the function rather than
advertise it in the function's API.

            regards, tom lane


Re: [GENERAL] optimize pg_tables query ( text vs varchar ) ...why ?

От
gmb
Дата:
Thanks for this , Tom



--
View this message in context:
http://www.postgresql-archive.org/optimize-pg-tables-query-text-vs-varchar-why-tp5978592p5978654.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.