Re: [PATCH 07/16] Log enough data into the wal to reconstruct logical changes from it if wal_level=logical
| From | Kevin Grittner |
|---|---|
| Subject | Re: [PATCH 07/16] Log enough data into the wal to reconstruct logical changes from it if wal_level=logical |
| Date | |
| Msg-id | 4FD86AFA02000025000483F6@gw.wicourts.gov |
| In reply to | [PATCH 07/16] Log enough data into the wal to reconstruct logical changes from it if wal_level=logical (Andres Freund <andres@2ndquadrant.com>) |
| Replies | Re: [PATCH 07/16] Log enough data into the wal to reconstruct logical changes from it if wal_level=logical (Andres Freund <andres@2ndquadrant.com>) |
| List | pgsql-hackers |
Andres Freund <andres@2ndquadrant.com> wrote:

> This adds a new wal_level value 'logical'
>
> Missing cases:
> - heap_multi_insert
> - primary key changes for updates
> - no primary key
> - LOG_NEWPAGE

First, wow! I look forward to the point where we can replace our trigger-based replication with this! Your "missing cases" for primary key issues would not cause us any pain for our current system, since we require a primary key and don't support updates to PKs for replicated tables.

While I don't expect that the first cut of this will be able to replace our replication-related functionality, I'm interested in making sure it can be extended in that direction, so I have a couple of things to consider:

(1) For our usage, with dozens of source databases feeding into multiple aggregate databases and interfaces, DDL replication is of little if any interest. It should be easy enough to ignore as long as it is low volume, so that doesn't worry me too much; but if I'm missing something and you run across any logical WAL logging for DDL which does generate a lot of WAL traffic, it would be nice to have a way to turn that off at generation time rather than filtering it or ignoring it later. (Probably won't be an issue; just a heads-up.)

(2) To match the functionality we now have, we would need the logical stream to include the *before* image of the whole tuple for each row updated or deleted. I understand that this is not needed for the use cases you are initially targeting; I just hope the design leaves this option open without needing to disturb other use cases. Perhaps this would require yet another wal_level value. Perhaps rather than testing the current value directly to determine whether to log something, the GUC processing could set some booleans, for faster testing and less code churn when the initial implementation is expanded to support other use cases (like ours).
(3) Similar to point 2, it would be extremely desirable to be able to determine table names and column names for the tuples in a stream from that stream itself, without needing to query a hot standby or do similar digging into other sources of information. Not only will the various source databases all have different OID values for the same objects, and the aggregate targets have different values from each other and from the sources, but some targets don't have the tables at all. I'm talking about our database transaction repository and the interfaces to business partners, which we currently drive off of the same transaction stream which drives replication.

Would it be helpful or just a distraction if I were to provide a more detailed description of our whole replication / transaction store / interface area? If it would be useful, I could also describe some other replication patterns I have seen over the years. In particular, one which might be interesting is where subsets of the data are distributed to multiple standalone machines which have intermittent or unreliable connections to a central site, which periodically collects data from all the remote sites, recalculates distribution, and sends transactions back out to those remote sites to add, remove, and update rows based on the distribution rules and the new data.

-Kevin