Обсуждение: Reversing pg_logical replication
I dug around on the web for comments and tools about how to reverse the master/slave relationship of a pg_logical publisher/subscriber pair without having to start fresh and re-copy all the data. All I found were several discussions that seemed to say, "You can't do it. You have to promote the subscriber to publisher, then blow off the original publisher's tables, make it a subscriber, and start from scratch."
--
eMolecules, Inc.
---------------------------------
But the create-subscription SQL has a "with (copy_data = false)", which seems to do exactly what I need. So I tried it, and it works fine. For a publish/subscript from servers A --> B, the procedure is:
- Shut off activity
- delete subscription from B
- delete publication from A
- create publication on B
- create subscription on A with copy_data = false.
It seems simple. It works. Have I overlooked something? For all the hand-wringing I found when others asked about this, I'm worried that I overlooked something.
Thanks,
Craig
---------------------------------
Craig A. James
Chief Technology OfficerCraig A. James
On 21/09/2018 17:21, Craig James wrote: > I dug around on the web for comments and tools about how to reverse the > master/slave relationship of a pg_logical publisher/subscriber pair > without having to start fresh and re-copy all the data. All I found were > several discussions that seemed to say, "You can't do it. You have to > promote the subscriber to publisher, then blow off the original > publisher's tables, make it a subscriber, and start from scratch." > > But the create-subscription SQL has a "with (copy_data = false)", which > seems to do exactly what I need. So I tried it, and it works fine. This works as long as both nodes are in sync and no one is writing to either of them while you do these changes. The hard part is doing it while both nodes are active. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Using Linux/PostgreSQL 9.5 & setup on a VM. I'm receiving the following errors during an insert using a function but this does not happen every time. When this doesoccur the database/system is unavailable until the processes are killed or the service is restarted. "LOG: server process (PID xxxxx) was terminated by signal 9: Killed DETAIL: Failed process was running select insert_data ($1, $2,....) WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because anotherserver process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect .... LOG: connection received FATAL: the database system is in recovery mode" The last 2 lines are repeated and logging is not available. How can I determine data/table/shared memory corruption? Thank you in advance for your responses. D. Ferrell
On 27/9/18 3:39 μ.μ., Ferrell, Denise SDC wrote: > Using Linux/PostgreSQL 9.5 & setup on a VM. > > I'm receiving the following errors during an insert using a function but this does not happen every time. When this doesoccur the database/system is unavailable until the processes are killed or the service is restarted. > > "LOG: server process (PID xxxxx) was terminated by signal 9: Killed > DETAIL: Failed process was running > select insert_data ($1, $2,....) > WARNING: terminating connection because of crash of another server process > DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because anotherserver process exited abnormally and possibly corrupted shared memory. > HINT: In a moment you should be able to reconnect .... > LOG: connection received > FATAL: the database system is in recovery mode" Did your DB eventually recover? Have you ruled reasons like OOM, non-optimal vm.swapiness , etc ? > > The last 2 lines are repeated and logging is not available. > > How can I determine data/table/shared memory corruption? > > Thank you in advance for your responses. > > D. Ferrell -- Achilleas Mantzios IT DEV Lead IT DEPT Dynacom Tankers Mgmt
On Thu, Sep 27, 2018 at 7:40 AM Ferrell, Denise SDC <denise.ferrell@sdc-world.com> wrote: > Using Linux/PostgreSQL 9.5 & setup on a VM. Please always show the entire version number -- it can matter a lot. One way to see it is to run `select version();` on a database connection. From the shell you can run one of the executables with the --version option, which will show the version and do nothing else. If you are running in a VM, please say which product, and show the exact version of that. > I'm receiving the following errors during an insert using a function but this does not happen every time. When this doesoccur the database/system is unavailable until the processes are killed or the service is restarted. > > "LOG: server process (PID xxxxx) was terminated by signal 9: Killed > DETAIL: Failed process was running > select insert_data ($1, $2,....) > WARNING: terminating connection because of crash of another server process > DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because anotherserver process exited abnormally and possibly corrupted shared memory. > HINT: In a moment you should be able to reconnect .... > LOG: connection received > FATAL: the database system is in recovery mode" > > The last 2 lines are repeated and logging is not available. > > How can I determine data/table/shared memory corruption? You have not shown any evidence of corruption. What happened was that some database service process was killed with SIGKILL (kill -9). Never do that. It may have been done by the Out of Memory Killer process. You should disable that by setting vm.overcommit_memory=2 and vm.overcommit_ratio=80 *or maybe as high as 90). When a database service process is killed with SIGKILL the service cannot continue without *RISK* of corruption, so it does a crash and restart. These messages: FATAL: the database system is in recovery mode Just mean that an attempt was made to the database while the recovery was in progress, so the connection attempt was rejected. Just ignore those messages. Just wait for the service to restart. -- Kevin Grittner VMware vCenter Server https://www.vmware.com/
This seems to have been the issue. Thank you for the responses. ________________________________________ From: Kevin Grittner <kgrittn@gmail.com> Sent: Thursday, September 27, 2018 9:24 AM To: Ferrell, Denise SDC Cc: pgsql-admin@postgresql.org Subject: Re: Data Corruption Errors On Thu, Sep 27, 2018 at 7:40 AM Ferrell, Denise SDC <denise.ferrell@sdc-world.com> wrote: > Using Linux/PostgreSQL 9.5 & setup on a VM. Please always show the entire version number -- it can matter a lot. One way to see it is to run `select version();` on a database connection. From the shell you can run one of the executables with the --version option, which will show the version and do nothing else. If you are running in a VM, please say which product, and show the exact version of that. > I'm receiving the following errors during an insert using a function but this does not happen every time. When this doesoccur the database/system is unavailable until the processes are killed or the service is restarted. > > "LOG: server process (PID xxxxx) was terminated by signal 9: Killed > DETAIL: Failed process was running > select insert_data ($1, $2,....) > WARNING: terminating connection because of crash of another server process > DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because anotherserver process exited abnormally and possibly corrupted shared memory. > HINT: In a moment you should be able to reconnect .... > LOG: connection received > FATAL: the database system is in recovery mode" > > The last 2 lines are repeated and logging is not available. > > How can I determine data/table/shared memory corruption? You have not shown any evidence of corruption. What happened was that some database service process was killed with SIGKILL (kill -9). Never do that. It may have been done by the Out of Memory Killer process. You should disable that by setting vm.overcommit_memory=2 and vm.overcommit_ratio=80 *or maybe as high as 90). When a database service process is killed with SIGKILL the service cannot continue without *RISK* of corruption, so it does a crash and restart. These messages: FATAL: the database system is in recovery mode Just mean that an attempt was made to the database while the recovery was in progress, so the connection attempt was rejected. Just ignore those messages. Just wait for the service to restart. -- Kevin Grittner VMware vCenter Server https://www.vmware.com/