Обсуждение: Slow pg_publication_tables with many schemas and tables

Поиск
Список
Период
Сортировка

Slow pg_publication_tables with many schemas and tables

От
Edilmar Alves
Дата:
Hi,

I use PG 11.5 into CentOS6 server, with 50 schemas, exactly equals in 
tables structure, and more than 400 tables/schema. Then, there is more 
than 20000 tables.

I found the discussion in pgsql-general thread:

https://www.postgresql.org/message-id/flat/11566.1558463253%40sss.pgh.pa.us#ec144ebcd8a829010fc82a7fe2abfd3f

but thread was closed.

Then, I sent here in performance list my problem.

-------------------------------------

I changed the original PG view like said in the above thread:

CREATE OR REPLACE VIEW pg_catalog.pg_publication_tables AS
SELECT
P.pubname AS pubname,
N.nspname AS schemaname,
C.relname AS tablename
FROM pg_publication P, pg_class C
JOIN pg_namespace N ON (N.oid = C.relnamespace),
LATERAL pg_get_publication_tables(P.pubname)
WHERE C.oid = pg_get_publication_tables.relid;

but the problem continues. It is very slow to process the query used by 
replication system:

SELECT DISTINCT t.schemaname, t.tablename FROM 
pg_catalog.pg_publication_tables t WHERE t.pubname IN ('mypubschema');

-------------------------------------

Then, in my case I created a publication for each schema and all tables 
with the same same of the schema, creating 50 publications.

After this, I changed the view above to this:

CREATE OR REPLACE VIEW pg_catalog.pg_publication_tables AS
SELECT p.pubname, c.schemaname, c.tablename
FROM pg_publication p
JOIN pg_tables c ON p.pubname = c.schemaname;

And the query below became very fast:

SELECT DISTINCT t.schemaname, t.tablename FROM 
pg_catalog.pg_publication_tables t WHERE t.pubname IN ('mypubschema');

My problem was solved but I think next version of pg should verify this 
problem to find a general solution.






Re: Slow pg_publication_tables with many schemas and tables

От
Tom Lane
Дата:
Edilmar Alves <edilmaralves@intersite.com.br> writes:
> I use PG 11.5 into CentOS6 server, with 50 schemas, exactly equals in 
> tables structure, and more than 400 tables/schema. Then, there is more 
> than 20000 tables.

Possibly you should rethink that design, but ...

> I changed the original PG view like said in the above thread:
> CREATE OR REPLACE VIEW pg_catalog.pg_publication_tables AS
> SELECT
> P.pubname AS pubname,
> N.nspname AS schemaname,
> C.relname AS tablename
> FROM pg_publication P, pg_class C
> JOIN pg_namespace N ON (N.oid = C.relnamespace),
> LATERAL pg_get_publication_tables(P.pubname)
> WHERE C.oid = pg_get_publication_tables.relid;
> but the problem continues. It is very slow to process the query used by 
> replication system:
> SELECT DISTINCT t.schemaname, t.tablename FROM 
> pg_catalog.pg_publication_tables t WHERE t.pubname IN ('mypubschema');

What do you get from EXPLAIN ANALYZE for that?

> After this, I changed the view above to this:
> CREATE OR REPLACE VIEW pg_catalog.pg_publication_tables AS
> SELECT p.pubname, c.schemaname, c.tablename
> FROM pg_publication p
> JOIN pg_tables c ON p.pubname = c.schemaname;
> And the query below became very fast:

As a wise man once said, I can make my program arbitrarily fast
if it doesn't have to give the right answer ... and this query
obviously doesn't produce the correct answer, except in the
contrived special case where the content of a publication is
exactly the content of a schema.  So I don't see what your
point is here.

Please see

https://wiki.postgresql.org/wiki/Slow_Query_Questions

regarding useful ways to present performance problems.

            regards, tom lane



Re: Slow pg_publication_tables with many schemas and tables

От
Edilmar Alves
Дата:

Hi,

Em 26/09/2019 19:11, Tom Lane escreveu:
Edilmar Alves <edilmaralves@intersite.com.br> writes:
I use PG 11.5 into CentOS6 server, with 50 schemas, exactly equals in 
tables structure, and more than 400 tables/schema. Then, there is more 
than 20000 tables.
Possibly you should rethink that design, but ...

My design is this because I have a system with 50 enterprises using the same server.

Before each enterprise used a separated database, and my webapp had a connection pool

for each database. Then, if for example, my connection pool had minconn=10 and maxconn=20,

it was totalminconn=500 and totalmaxconn=1000. When I migrated to just one database and 50

schemas, it was so better to manage just one connection pool, minor hardware resource usage.


I changed the original PG view like said in the above thread:
CREATE OR REPLACE VIEW pg_catalog.pg_publication_tables AS
SELECT
P.pubname AS pubname,
N.nspname AS schemaname,
C.relname AS tablename
FROM pg_publication P, pg_class C
JOIN pg_namespace N ON (N.oid = C.relnamespace),
LATERAL pg_get_publication_tables(P.pubname)
WHERE C.oid = pg_get_publication_tables.relid;
but the problem continues. It is very slow to process the query used by 
replication system:
SELECT DISTINCT t.schemaname, t.tablename FROM 
pg_catalog.pg_publication_tables t WHERE t.pubname IN ('mypubschema');
What do you get from EXPLAIN ANALYZE for that?

The Analyze from original VIEW and the VIEW suggested below

for PGv12 update have a flow diagram very similar, just one

step better in the updated version, for my cenario with 50 schemas.


After this, I changed the view above to this:
CREATE OR REPLACE VIEW pg_catalog.pg_publication_tables AS
SELECT p.pubname, c.schemaname, c.tablename
FROM pg_publication p
JOIN pg_tables c ON p.pubname = c.schemaname;
And the query below became very fast:
As a wise man once said, I can make my program arbitrarily fast
if it doesn't have to give the right answer ... and this query
obviously doesn't produce the correct answer, except in the
contrived special case where the content of a publication is
exactly the content of a schema.  So I don't see what your
point is here.

I know my VIEW is not a general purpose solution.

I just submitted this message to the group because

in this kind of situation of many schemas and tables/schema,

the original VIEW and the VIEW below suggested to become

the new on in PGv12 run very slow.


Please see

https://wiki.postgresql.org/wiki/Slow_Query_Questions

regarding useful ways to present performance problems.
			regards, tom lane
--
Вложения