Re: JSON decoding plugin
From | Merlin Moncure |
---|---|
Subject | Re: JSON decoding plugin |
Date | |
Msg-id | CAHyXU0zE-kcofYtnoQqu7i2odH+L8zhfQz9-Ozbdga4NUXgBJg@mail.gmail.com |
In reply to | JSON decoding plugin (Euler Taveira <euler@timbira.com.br>) |
Responses | Re: JSON decoding plugin (Euler Taveira <euler@timbira.com.br>) |
List | pgsql-hackers |
On Mon, Dec 9, 2013 at 7:03 AM, Euler Taveira <euler@timbira.com.br> wrote:
> Hi,
>
> A few months ago, it was proposed [1] that it would be interesting to
> have a json output plugin for logical decoding. Here it is.
>
> Each transaction is a JSON object that can contain xid (optional),
> timestamp (optional), and a change array. Each element of change is a
> command that was decoded, and it can contain: kind (I/U/D), schema
> (optional), table, columnnames, columntypes (optional), columnvalues,
> and oldkeys (only for U/D). columnnames, columntypes and columnvalues
> are arrays. oldkeys is an object that contains the following arrays:
> keynames, keytypes (optional), and keyvalues.
>
> The JSON objects are serialized if you are decoding a series of
> transactions. Here is an output example:
>
> {
>     "xid": 702,
>     "change": [
>         {
>             "kind": "insert",
>             "schema": "public",
>             "table": "foo",
>             "columnnames": ["a", "b", "c"],
>             "columntypes": ["int4", "int4", "text"],
>             "columnvalues": [1, 2, "test"]
>         }
>         ,{
>             "kind": "update",
>             "schema": "public",
>             "table": "foo",
>             "columnnames": ["a", "b", "c"],
>             "columntypes": ["int4", "int4", "text"],
>             "columnvalues": [1, 2, "test2"],
>             "oldkeys": {
>                 "keynames": ["a", "b"],
>                 "keytypes": ["int4", "int4"],
>                 "keyvalues": [1, 2]
>             }
>         }
>     ]
> }
> {
>     "xid": 703,
>     "change": [
>         {
>             "kind": "update",
>             "schema": "public",
>             "table": "foo",
>             "columnnames": ["a", "b", "c"],
>             "columntypes": ["int4", "int4", "text"],
>             "columnvalues": [1, 3, "test2"],
>             "oldkeys": {
>                 "keynames": ["a", "b"],
>                 "keytypes": ["int4", "int4"],
>                 "keyvalues": [1, 2]
>             }
>         }
>     ]
> }
> {
>     "xid": 704,
>     "change": [
>         {
>             "kind": "delete",
>             "schema": "public",
>             "table": "foo",
>             "oldkeys": {
>                 "keynames": ["a", "b"],
>                 "keytypes": ["int4", "int4"],
>                 "keyvalues": [1, 3]
>             }
>         }
>     ]
> }
>
> Some data types were adapted to conform to the JSON spec. NaN and
> Infinity are not valid JSON symbols, so their representation is NULL
> (as in some JSON implementations).
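As an illustration of the change format described above (this helper is hypothetical, not part of the patch), a consumer could render each element of the change array back into a SQL statement, using oldkeys for the WHERE clause of updates and deletes:

```python
def sql_literal(v):
    # Naive SQL literal rendering for this sketch: quote strings, map JSON
    # null to NULL, pass numbers through. Real code must be stricter.
    if v is None:
        return "NULL"
    if isinstance(v, str):
        return "'" + v.replace("'", "''") + "'"
    return str(v)

def change_to_sql(change):
    # Turn one element of the "change" array back into SQL.
    # 'change_to_sql' is a hypothetical consumer-side helper name.
    table = "%s.%s" % (change.get("schema", "public"), change["table"])
    if change["kind"] == "insert":
        cols = ", ".join(change["columnnames"])
        vals = ", ".join(sql_literal(v) for v in change["columnvalues"])
        return "INSERT INTO %s (%s) VALUES (%s);" % (table, cols, vals)
    # update and delete carry oldkeys for row identification
    keys = change["oldkeys"]
    where = " AND ".join("%s = %s" % (k, sql_literal(v))
                         for k, v in zip(keys["keynames"], keys["keyvalues"]))
    if change["kind"] == "delete":
        return "DELETE FROM %s WHERE %s;" % (table, where)
    assign = ", ".join("%s = %s" % (c, sql_literal(v))
                       for c, v in zip(change["columnnames"],
                                       change["columnvalues"]))
    return "UPDATE %s SET %s WHERE %s;" % (table, assign, where)
```

For the first change in the example above this yields INSERT INTO public.foo (a, b, c) VALUES (1, 2, 'test');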
> Due to the JSON datatype's simplicity, I represent the vast majority of
> Postgres datatypes as strings (however, I admit that we could mimic the
> json datatype conversion rules).
>
> The oldkeys treatment follows what was defined by commit [2]. It uses:
>
> (i) primary key (default behavior);
> (ii) unique index (if REPLICA IDENTITY USING INDEX is defined for the table);
> (iii) full tuple (if REPLICA IDENTITY FULL is defined for the table);
> (iv) nothing, which means an error (if REPLICA IDENTITY NOTHING is
> defined for the table).
>
> TOAST columns get special treatment for UPDATEs. If a tuple that
> contains a TOAST field is updated, the TOAST field is included iff it is
> changed too, i.e. unchanged TOAST fields are omitted from the columns*
> arrays. This means less overhead while transmitting, processing and
> applying changes.
>
> By design, (i) the output plugin doesn't know about aborted
> transactions, and (ii) subtransactions are reordered into a toplevel
> transaction and only the committed pieces are passed to the plugin.
>
> You can test it by running the regression tests (e.g. 'make test') or
> using the following steps:
>
> postgresql.conf:
>
> wal_level = logical
> max_wal_senders = 2
> max_logical_slots = 2
>
> start collecting WAL records:
>
> $ pg_recvlogical --slot=foo -d euler -f /dev/stdout
> --plugin=json_decoding_plugin --init
>
> [execute some transactions]
>
> start printing decoded transactions:
>
> $ pg_recvlogical --slot=foo -d euler -f /dev/stdout --start
>
> stop collecting WAL records:
>
> $ pg_recvlogical --slot=foo -d euler -f /dev/stdout --stop
>
> Comments?

This is pretty neat. A couple of minor questions:

*) Aren't you *en*coding data into json, not the other way around (decoding)?

*) Consider generating a long bytea instead of explicitly writing a 32kb
sql into the patch.

*) You've built your own json serializer here. Maybe some code could be
shared with the json type?

*) Consider removing 'plugin' from the name of the plugin:
--plugin=json_decoding etc.

merlin
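One consequence of the output format shown above is that the transactions arrive as back-to-back JSON objects with no enclosing array, so a consumer needs an incremental parser rather than a single json parse. A minimal sketch in Python, assuming the captured stream fits in memory (the function name is hypothetical):

```python
import json

def iter_transactions(buf):
    # Split a stream of concatenated JSON objects (one per decoded
    # transaction, as in the example output) into Python dicts.
    dec = json.JSONDecoder()
    pos = 0
    while pos < len(buf):
        # skip whitespace between adjacent objects
        while pos < len(buf) and buf[pos].isspace():
            pos += 1
        if pos >= len(buf):
            break
        # raw_decode parses one value and reports where it ended
        obj, pos = dec.raw_decode(buf, pos)
        yield obj
```

A production consumer reading from pg_recvlogical would feed chunks into a buffer and retry raw_decode as more data arrives; the loop above shows only the splitting logic.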