At Wed, 01 Feb 2017 12:13:04 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20170201.121304.267734380.horiguchi.kyotaro@lab.ntt.co.jp>
> > > I tried a committed Logical Replication environment. I found
> > > that replication between databases of different encodings did
> > > not convert encodings in character type columns. Is this
> > > behavior correct?
> >
> > The output plugin for subscription is pgoutput. It currently
> > doesn't consider encoding, but that could easily be added if
> > the desired encoding is communicated to it.
> >
> > The easiest (though somewhat fragile) way I can think of is:
> >
> > - Subscriber connects with client_encoding specification and the
> > output plugin pgoutput decides whether it accepts the encoding
> > or not. If the subscriber doesn't, pgoutput sends data without
> > conversion.
> >
> > The attached small patch does this and works with the following
> > CREATE SUBSCRIPTION.
>
> Oops. It forgot to handle conversion failure. That is fixed in
> the attached patch.
>
> > CREATE SUBSCRIPTION sub1 CONNECTION 'host=/tmp port=5432 dbname=postgres client_encoding=EUC_JP' PUBLICATION pub1;
> >
> >
> > Also we may have explicit negotiation on, for example,
> > CREATE_REPLICATION_SLOT.
> >
> > 'CREATE_REPLICATION_SLOT sub1 LOGICAL pgoutput ENCODING EUC_JP'
> >
> > Or output plugin may take options.
> >
> > 'CREATE_REPLICATION_SLOT sub1 LOGICAL pgoutput OPTIONS(encoding EUC_JP)'
> >
> >
> > Any opinions?

This patch stalls replication when the publisher finds an
inconvertible character in a tuple to be sent. In that case, the
subscription must be dropped and recreated to move forward.
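
To illustrate the stall, here is a minimal Python sketch. It is not
the pgoutput code. The function publish() and the sample rows are
made up for illustration, and Python's euc_jp codec stands in for
the server-side conversion.

```python
def publish(rows, client_encoding):
    """Hypothetical stand-in for the publisher's send loop.

    Encodes each row for the subscriber and stops at the first row
    that cannot be converted. Returns (sent, stuck_at), where
    stuck_at is the index of the failing row, or None if all rows
    were converted.
    """
    sent = []
    for i, text in enumerate(rows):
        try:
            # Strict conversion: raises on an inconvertible character.
            sent.append(text.encode(client_encoding))
        except UnicodeEncodeError:
            # Nothing after this row can be sent; the stream is stuck.
            return sent, i
    return sent, None


# The second row holds an emoji, which has no EUC_JP equivalent.
rows = ["日本語", "\U0001F600", "テスト"]
sent, stuck_at = publish(rows, "euc_jp")
print(len(sent), stuck_at)
```

The third row is convertible but never goes out, because the stream
cannot get past the second one.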

regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center