Re: Wanted: jsonb on-disk representation documentation
| От | Andres Freund |
|---|---|
| Тема | Re: Wanted: jsonb on-disk representation documentation |
| Дата | |
| Msg-id | 20140507181344.GK13397@awork2.anarazel.de обсуждение исходный текст |
| Ответ на | Re: Wanted: jsonb on-disk representation documentation (Peter Geoghegan <pg@heroku.com>) |
| Список | pgsql-hackers |
On 2014-05-07 10:55:28 -0700, Peter Geoghegan wrote:
> On Wed, May 7, 2014 at 4:40 AM, Andres Freund <andres@anarazel.de> wrote:
> > * The jentry representation should be changed so it's possible to get the type
> > of a entry without checking individual types. That'll make code like
> > findJsonbValueFromSuperHeader() (well, whatever you've renamed it to)
> > much easier to read. Preferrably so it an have the same values (after
> > shifting/masking) ask the JBE variants. And it needs space for futher
> > types (or representations thereof).
>
> I don't know what you mean. Every place that macros like
> JBE_ISNUMERIC() are used subsequently involves access to (say) numeric
> union fields. At best, you could have those functions check that the
> types match, and then handle each case in a switch that only looked at
> (say) the "candidate", but that doesn't really save much at all. It
> wouldn't take much to have the macros give enum constant values back
> as you suggest, though.
Compare for (i = 0; i < count; i++) { JEntry *e = array + i;
if (JBE_ISNULL(*e) && key->type == jbvNull) { result->type = jbvNull;
result->estSize= sizeof(JEntry); } else if (JBE_ISSTRING(*e) && key->type == jbvString) {
result->type = jbvString; result->val.string.val = data + JBE_OFF(*e); result->val.string.len =
JBE_LEN(*e); result->estSize = sizeof(JEntry) + result->val.string.len; } else if
(JBE_ISNUMERIC(*e)&& key->type == jbvNumeric) { result->type = jbvNumeric;
result->val.numeric= (Numeric) (data + INTALIGN(JBE_OFF(*e)));
result->estSize = 2 * sizeof(JEntry) + VARSIZE_ANY(result->val.numeric); }
elseif (JBE_ISBOOL(*e) && key->type == jbvBool) { result->type = jbvBool;
result->val.boolean= JBE_ISBOOL_TRUE(*e) != 0; result->estSize = sizeof(JEntry); } else
continue;
if (compareJsonbScalarValue(key, result) == 0) return result; }
with
for (i = 0; i < count; i++) { JEntry *e = array + i;
if (!JBE_TYPE_IS_SCALAR(*e)) continue;
if (JBE_TYPE(*e) != key->type) continue;
result = getJsonbValue(e);
if (compareJsonbScalarValue(key, result) == 0) return result; }
Yes, it's not a fair comparison because I made up getJsonbValue(). But
it's *much* more readable regardless. And there's more places that could
use it. Like the second half of findJsonbValueFromSuperHeader(). FWIW,
that's one of the requests I definitely made before.
> > * I wonder if the hash/object pair representation is wise and if it
> > shouldn't be keys combined with offsets first, and then the
> > values. That will make access much faster. And that's what jsonb
> > essentially is about.
>
> I like that the physical layout reflects the layout of the original
> JSON document.
Don't see much value in that. This is a binary representation and it'd
be bijective.
> Besides, I don't think this is obviously the case. Are
> you sure that it won't be more useful to make children as close to
> their parents as possible? Particularly given idiomatic usage.
Because - if done right - it would allow elementwise access without
scanning previous values.
> > * I think both arrays and hashes should grow individual structs. With
> > space for additional flags.
> Why?
Because a) it will make the code more readable b) it'll allow for
adding different representations of hashes/arrays.
Greetings,
Andres Freund
--Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services
В списке pgsql-hackers по дате отправления: