Re: additional json functionality
От | Merlin Moncure |
---|---|
Тема | Re: additional json functionality |
Дата | |
Msg-id | CAHyXU0xa=1jJ3+4LJBRpyoFfUsz=7mQxQg8mdteAacbnNxJahw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: additional json functionality ("David E. Wheeler" <david@justatheory.com>) |
Ответы |
Re: additional json functionality
(Andrew Dunstan <andrew@dunslane.net>)
Re: additional json functionality (Hannu Krosing <hannu@2ndQuadrant.com>) |
Список | pgsql-hackers |
On Fri, Nov 15, 2013 at 1:51 PM, David E. Wheeler <david@justatheory.com> wrote: > On Nov 15, 2013, at 6:35 AM, Merlin Moncure <mmoncure@gmail.com> wrote: > >> Here are the options on the table: >> 1) convert existing json type to binary flavor (notwithstanding objections) >> 2) maintain side by side types, one representing binary, one text. >> unfortunately, i think the text one must get the name 'json' due to >> unfortunate previous decision. >> 3) merge the behaviors into a single type and get the best of both >> worlds (as suggested upthread). >> >> I think we need to take a *very* hard look at #3 before exploring #1 >> or #2: Haven't through it through yet but it may be possible to handle >> this in such a way that will be mostly transparent to the end user and >> may have other benefits such as a faster path for serialization. > > If it’s possible to preserve order and still get the advantages of binary representation --- which are substantial (seehttp://theory.so/pg/2013/10/23/testing-nested-hstore/ and http://theory.so/pg/2013/10/25/indexing-nested-hstore/ fora couple of examples) --- without undue maintenance overhead, then great. > > I am completely opposed to duplicate key preservation in JSON, though. It has caused us a fair number of headaches at $work. Kinda yes, kinda no. Here's a rough sketch of what I'm thinking: *) 'json' type internally has a binary as well a text representation. The text representation is basically the current type behavior (duduplicated unordered). The binary representation is the hstore-ish variant. The text mode is discarded when it's deemed no longer appropriate to be needed, and, once gone, can never be rebuilt as it was. *) only the binary internal representation ever gets stored to disk (or anything else). *) the text mode is preferred for output if it is there. otherwise, a deduplicated, reordered text representation is generated *) When literal text is casted to json, the binary structure is built up and kept alongside binary mode. So, if you went: 'select '{"a": 1, "a": 2}'::json', you'd get the same thing back. (This is how it works now.). but, if you went: 'insert into foo select '{"a": 1, "a": 2}'::json returning *', you'd get {"a": 2} backessentially (although technically that would be a kind of race). *) When the json is stored to table, the text representation gets immediately discarded on the basis that it's no longer the true representation of the data. *) Ditto when making any equality operation (not as sure on this point). *) Ditto when doing any operation that mutates the structure in any way. the text representation is immutable except during serialization and if it gets invalidated it gets destroyed. *) New API function: json_simplify(); or some such. It reorders and dedups from user's point of view (but really just kills off the text representation) *) once the text mode is gone, you get basically the proposed 'hstore' behavior. *) serialization functions are generally used in contexts that do not store anything but get output as query data. They create *only* the text mode. However, if the resultant json is stored anywhere, the text mode is destroyed and replaced with binary variant. This is both a concession to the current behavior and an optimization of 'serialization-in-query' for which I think the binary mode is pessimal performance wise. so, xxx_to_json serialization functions work exactly as they do now which fixes my problem essentially. *) if you are unhappy with duplicates in the above, just get use to calling json_simpify() on the serialized json (or deal with in on the client side). This is all pretty glossy, but maybe there is a way forward... merlin
В списке pgsql-hackers по дате отправления: