Re: Initial COPY of Logical Replication is too slow
| От | Masahiko Sawada |
|---|---|
| Тема | Re: Initial COPY of Logical Replication is too slow |
| Дата | |
| Msg-id | CAD21AoA6i2ui8FMZeuU_KxX4t-fM8G==zTW2Dp6-goujttrpew@mail.gmail.com обсуждение исходный текст |
| Ответ на | Initial COPY of Logical Replication is too slow (Marcos Pegoraro <marcos@f10.com.br>) |
| Список | pgsql-hackers |
Hi, On Sat, Dec 6, 2025 at 4:19 AM Marcos Pegoraro <marcos@f10.com.br> wrote: > > Subscriber needs to ask publisher about tables and fields to COPY and it uses pg_get_publication_tables for that, and itis too slow when the number of tables is high because on every table it's subscribed it has to run this select. Yeah, if we pass a publication that a lot of tables belong to to pg_get_publication_tables(), it could take a long time to return as it needs to construct many entries. > We can get the same result with a join on pg_publication_rel. You changed the query not to use pg_get_publication_tables(): - "SELECT DISTINCT" - " (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)" - " THEN NULL ELSE gpt.attrs END)" - " FROM pg_publication p," - " LATERAL pg_get_publication_tables(p.pubname) gpt," - " pg_class c" - " WHERE gpt.relid = %u AND c.oid = gpt.relid" - " AND p.pubname IN ( %s )", - lrel->remoteid, - pub_names->data); + "SELECT CASE WHEN cardinality(r.prattrs) <> relnatts THEN " + "r.prattrs END FROM pg_class c " + "LEFT JOIN LATERAL (SELECT DISTINCT prattrs FROM " + "pg_publication_rel r INNER JOIN pg_publication p " + "ON p.oid = r.prpubid WHERE c.oid = r.prrelid AND " + "pubname in ( %s )) r ON TRUE WHERE c.oid = %u", + pub_names->data, + lrel->remoteid); Simply replacing pg_get_publication_tables() with joining on pg_publication_rel doesn't work since pg_get_publication_tables() cares for several cases, for example where the specified columns are generated columns and the specified table is a partitioned table etc. Therefore the patch doesn't pass the regression tests. I think it would make more sense to introduce a dedicated SQL function that takes the reloid as well as the list of publications and returns the relation's the column list and row filter expression while filtering unnecessary rows inside the function. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
В списке pgsql-hackers по дате отправления: