Обсуждение: Explain is slow with tables having many columns

Поиск
Список
Период
Сортировка

Explain is slow with tables having many columns

От
legrand legrand
Дата:
Hello,
I have found that explain on tables with many (hundreds) columns
are slow compare to nominal executions.

This can break application performances when using auto_explain or
pg_store_plans.

Here is my test case (with 500 columns, can be pushed to 1000 or 1600)

create table a();

DECLARE
i int;
BEGIN
for i in 1..500
loop
execute 'alter table a add column a'||i::text||' int';
end loop;
END
$$;

#\timing
#select a500 from a;
 a500 
------
(0 rows)
Time: 0,319 ms


#explain analyze select a500 from a;
                                         QUERY PLAN                             
--------------------------------------------------------------------------------------------
 Seq Scan on a  (cost=0.00..10.40 rows=40 width=4) (actual time=0.010..0.010
rows=0 loops=1)
 Planning time: 0.347 ms
 Execution time: 0.047 ms
(3 rows)
Time: 4,290 ms


Here is a loop to try to understand where this comes from 

DO
$$
DECLARE
i int;
j int;
BEGIN
for j in 1..100
loop
for i in 1..500
loop
execute 'explain select a'||i::text||' from a';
end loop;
end loop;
END
$$;

Using perf top, most of the cpu time seems to come from relutils.c
colname_is_unique:

 59,54%  libc-2.26.so                        [.] __GI___strcmp_ssse3
  26,11%  postgres                           [.] colname_is_unique.isra.2
   1,46%  postgres                           [.] AllocSetAlloc
   1,43%  postgres                           [.] SearchCatCache3
   0,70%  postgres                           [.] set_relation_column_names
   0,56%  libc-2.26.so                       [.] __strlen_avx2


select version();
 PostgreSQL 11devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
7.2.0-8ubuntu3) 7.2.0, 64-bit

Could this be improved ?

Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Re: Explain is slow with tables having many columns

От
Justin Pryzby
Дата:
On Mon, Sep 24, 2018 at 12:22:28PM -0700, legrand legrand wrote:
> Hello,
> I have found that explain on tables with many (hundreds) columns
> are slow compare to nominal executions.

See also this thread from last month:

https://www.postgresql.org/message-id/flat/CAEe%3DmRnNNL3RDKJDmY%3D_mpcpAb5ugYL9NcchELa6Qgtoz2NjCw%40mail.gmail.com

Justin


Re: Explain is slow with tables having many columns

От
legrand legrand
Дата:
Justin Pryzby wrote
> On Mon, Sep 24, 2018 at 12:22:28PM -0700, legrand legrand wrote:
>> Hello,
>> I have found that explain on tables with many (hundreds) columns
>> are slow compare to nominal executions.
> 
> See also this thread from last month:
> 
> https://www.postgresql.org/message-id/flat/CAEe%3DmRnNNL3RDKJDmY%3D_mpcpAb5ugYL9NcchELa6Qgtoz2NjCw%40mail.gmail.com
> 
> Justin

maybe, I will check that patch ...

I thought it would also have been related to
https://www.postgresql.org/message-id/CAMkU%3D1xPqHP%3D7YPeChq6n1v_qd4WGf%2BZvtnR-b%2BgyzFqtJqMMQ%40mail.gmail.com

Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Re: Explain is slow with tables having many columns

От
Andres Freund
Дата:
Hi,

(CCing -hackers)

On 2018-09-24 12:22:28 -0700, legrand legrand wrote:
> I have found that explain on tables with many (hundreds) columns
> are slow compare to nominal executions.

Yea, colname_is_unique() (called via make_colname_unique()) is
essentially O(#total_columns) and rougly called once for each column in
a select list (or using or ...).  IIRC we've hit this once when I was at
citus, too.

We really should be usign a more appropriate datastructure here - very
likely a hashtable.  Unfortunately such a change would likely be a bit
too much to backpatch...

Greetings,

Andres Freund


Re: Explain is slow with tables having many columns

От
Andres Freund
Дата:
Hi,

(CCing -hackers)

On 2018-09-24 12:22:28 -0700, legrand legrand wrote:
> I have found that explain on tables with many (hundreds) columns
> are slow compare to nominal executions.

Yea, colname_is_unique() (called via make_colname_unique()) is
essentially O(#total_columns) and rougly called once for each column in
a select list (or using or ...).  IIRC we've hit this once when I was at
citus, too.

We really should be usign a more appropriate datastructure here - very
likely a hashtable.  Unfortunately such a change would likely be a bit
too much to backpatch...

Greetings,

Andres Freund


Re: Explain is slow with tables having many columns

От
Andres Freund
Дата:
Hi,

On 2018-09-24 12:43:44 -0700, legrand legrand wrote:
> Justin Pryzby wrote
> > On Mon, Sep 24, 2018 at 12:22:28PM -0700, legrand legrand wrote:
> >> Hello,
> >> I have found that explain on tables with many (hundreds) columns
> >> are slow compare to nominal executions.
> > 
> > See also this thread from last month:
> > 
> >
https://www.postgresql.org/message-id/flat/CAEe%3DmRnNNL3RDKJDmY%3D_mpcpAb5ugYL9NcchELa6Qgtoz2NjCw%40mail.gmail.com
> > 
> > Justin
> 
> maybe, I will check that patch ...
> 
> I thought it would also have been related to
> https://www.postgresql.org/message-id/CAMkU%3D1xPqHP%3D7YPeChq6n1v_qd4WGf%2BZvtnR-b%2BgyzFqtJqMMQ%40mail.gmail.com

Neither of these are related to the problem.

Greetings,

Andres Freund