Re: [HACKERS] Logical Replication and Character encoding
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: [HACKERS] Logical Replication and Character encoding |
Date | |
Msg-id | 20170227.142312.56921714.horiguchi.kyotaro@lab.ntt.co.jp |
In reply to | Re: [HACKERS] Logical Replication and Character encoding ("Shinoda, Noriyoshi" <noriyoshi.shinoda@hpe.com>) |
Responses | Re: [HACKERS] Logical Replication and Character encoding (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>) |
List | pgsql-hackers |
Sorry for the absence.

At Fri, 24 Feb 2017 02:43:14 +0000, "Shinoda, Noriyoshi" <noriyoshi.shinoda@hpe.com> wrote in <AT5PR84MB00847ABEA48EAE9A97D51157EE520@AT5PR84MB0084.NAMPRD84.PROD.OUTLOOK.COM>
> >From: Peter Eisentraut [mailto:peter.eisentraut@2ndquadrant.com]
> >Sent: Friday, February 24, 2017 1:32 AM
> >To: Petr Jelinek <petr.jelinek@2ndquadrant.com>; Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
> >Cc: craig@2ndquadrant.com; Shinoda, Noriyoshi <noriyoshi.shinoda@hpe.com>; pgsql-hackers@postgresql.org
> >Subject: Re: [HACKERS] Logical Replication and Character encoding
> >
> >On 2/17/17 10:14, Peter Eisentraut wrote:
> >> Well, it is sort of a libpq connection, and a proper libpq client
> >> should set the client encoding, and a proper libpq server should do
> >> encoding conversion accordingly. If we just play along with this, it
> >> all works correctly.
> >>
> >> Other output plugins are free to ignore the encoding settings (just
> >> like libpq can send binary data in some cases).
> >>
> >> The attached patch puts it all together.
> >
> >committed
..
> However, in the case of PUBLICATION(UTF-8) and SUBSCRIPTION(EUC_JP) environment, the following error was output and the process went down.
...
> LOG: starting logical replication worker for subscription "sub1"
> LOG: logical replication apply for subscription "sub1" has started
> ERROR: insufficient data left in message
> LOG: worker process: logical replication worker for subscription 16439 (PID 22583) exited with exit code 1

Yeah, the patch sends the converted string together with the length of the original string. Encoding conversion usually changes the length of a string, so I doubt that the reverse case was working correctly either. As a result, pq_sendstring is not usable for this case, since we don't have the true length of the string to be sent. That is why my first patch did the conversion explicitly using pg_server_to_client().

That being said, I think the more important point is that the consensus on the policy for logical replication between databases with different encodings is to refuse the connection. The reason is that it surely breaks BDR usage for some combinations of encodings.

Anyway, the attached patch fixes the current encoding bug in logical replication.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index bc6e9b5..da81a2d 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -16,6 +16,7 @@
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_type.h"
 #include "libpq/pqformat.h"
+#include "mb/pg_wchar.h"
 #include "replication/logicalproto.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
@@ -442,9 +443,13 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple)
 		pq_sendbyte(out, 't');	/* 'text' data follows */
 
 		outputstr = OidOutputFunctionCall(typclass->typoutput, values[i]);
+
+		if (pg_get_client_encoding() != GetDatabaseEncoding())
+			outputstr = pg_server_to_client(outputstr, strlen(outputstr));
+
 		len = strlen(outputstr) + 1;	/* null terminated */
 		pq_sendint(out, len, 4);		/* length */
-		pq_sendstring(out, outputstr);	/* data */
+		appendBinaryStringInfo(out, outputstr, len); /* data */
 
 		pfree(outputstr);
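
The length mismatch described in the message can be illustrated without any PostgreSQL code. The following is only a minimal sketch under stated assumptions: `Msg`, `put_field`, `get_field` and the two `send_with_*` functions are hypothetical names invented for this illustration (a fixed buffer stands in for a StringInfo), and the byte sequences for "日本語" are hard-coded instead of being produced by a real encoding conversion. It shows a length-prefixed field whose declared length was taken from the UTF-8 string while the payload is the shorter EUC_JP string, which is roughly the situation that makes a reader run out of data.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* "日本語" in two encodings, hard-coded so no conversion library is needed. */
static const char utf8_text[]  = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E"; /* 9 bytes */
static const char eucjp_text[] = "\xC6\xFC\xCB\xDC\xB8\xEC";             /* 6 bytes */

/* Hypothetical message buffer; a stand-in for a StringInfo. */
typedef struct
{
    unsigned char data[64];
    size_t        len;
} Msg;

/* Append a 4-byte big-endian length, then payload_len bytes of payload. */
static void
put_field(Msg *m, uint32_t declared_len, const char *payload, size_t payload_len)
{
    assert(m->len + 4 + payload_len <= sizeof m->data);
    m->data[m->len++] = (unsigned char) (declared_len >> 24);
    m->data[m->len++] = (unsigned char) (declared_len >> 16);
    m->data[m->len++] = (unsigned char) (declared_len >> 8);
    m->data[m->len++] = (unsigned char) declared_len;
    memcpy(m->data + m->len, payload, payload_len);
    m->len += payload_len;
}

/*
 * Read the single field in the message. Returns the declared length, or -1
 * when the message holds fewer payload bytes than the length prefix claims
 * (the "insufficient data left in message" situation).
 */
static long
get_field(const Msg *m)
{
    uint32_t declared;

    if (m->len < 4)
        return -1;
    declared = ((uint32_t) m->data[0] << 24) | ((uint32_t) m->data[1] << 16) |
               ((uint32_t) m->data[2] << 8) | (uint32_t) m->data[3];
    if (declared > m->len - 4)
        return -1;
    return (long) declared;
}

/* Length taken from the source string, payload taken from the converted one. */
static long
send_with_stale_length(void)
{
    Msg      m = {{0}, 0};
    uint32_t len = (uint32_t) strlen(utf8_text) + 1;    /* null terminated */

    put_field(&m, len, eucjp_text, strlen(eucjp_text) + 1);
    return get_field(&m);
}

/* Convert first, then measure: length and payload describe the same bytes. */
static long
send_with_converted_length(void)
{
    Msg    m = {{0}, 0};
    size_t len = strlen(eucjp_text) + 1;                /* null terminated */

    put_field(&m, (uint32_t) len, eucjp_text, len);
    return get_field(&m);
}
```

In this sketch `send_with_stale_length()` declares 10 bytes but writes only 7, so the reader reports -1, whereas `send_with_converted_length()` declares and writes 7 bytes and reads back cleanly. Measuring the string only after conversion is the same ordering the patch above uses.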