Re: Wanted: jsonb on-disk representation documentation

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Wanted: jsonb on-disk representation documentation
Дата
Msg-id 20140507181344.GK13397@awork2.anarazel.de
обсуждение исходный текст
Ответ на Re: Wanted: jsonb on-disk representation documentation  (Peter Geoghegan <pg@heroku.com>)
Список pgsql-hackers
On 2014-05-07 10:55:28 -0700, Peter Geoghegan wrote:
> On Wed, May 7, 2014 at 4:40 AM, Andres Freund <andres@anarazel.de> wrote:
> > * The jentry representation should be changed so it's possible to get the type
> >   of a entry without checking individual types. That'll make code like
> >   findJsonbValueFromSuperHeader() (well, whatever you've renamed it to)
> >   much easier to read. Preferrably so it an have the same values (after
> >   shifting/masking) ask the JBE variants. And it needs space for futher
> >   types (or representations thereof).
>
> I don't know what you mean. Every place that macros like
> JBE_ISNUMERIC() are used subsequently involves access to (say) numeric
> union fields. At best, you could have those functions check that the
> types match, and then handle each case in a switch that only looked at
> (say) the "candidate", but that doesn't really save much at all. It
> wouldn't take much to have the macros give enum constant values back
> as you suggest, though.

Compare    for (i = 0; i < count; i++)    {        JEntry       *e = array + i;
        if (JBE_ISNULL(*e) && key->type == jbvNull)        {            result->type = jbvNull;
result->estSize= sizeof(JEntry);        }        else if (JBE_ISSTRING(*e) && key->type == jbvString)        {
 result->type = jbvString;            result->val.string.val = data + JBE_OFF(*e);            result->val.string.len =
JBE_LEN(*e);           result->estSize = sizeof(JEntry) + result->val.string.len;        }        else if
(JBE_ISNUMERIC(*e)&& key->type == jbvNumeric)        {            result->type = jbvNumeric;
result->val.numeric= (Numeric) (data + INTALIGN(JBE_OFF(*e)));
 
            result->estSize = 2 * sizeof(JEntry) +                VARSIZE_ANY(result->val.numeric);        }
elseif (JBE_ISBOOL(*e) && key->type == jbvBool)        {            result->type = jbvBool;
result->val.boolean= JBE_ISBOOL_TRUE(*e) != 0;            result->estSize = sizeof(JEntry);        }        else
   continue;
 
        if (compareJsonbScalarValue(key, result) == 0)            return result;    }
with
    for (i = 0; i < count; i++)    {        JEntry       *e = array + i;
                       if (!JBE_TYPE_IS_SCALAR(*e))                           continue;
                       if (JBE_TYPE(*e) != key->type)                           continue;
                       result = getJsonbValue(e);
        if (compareJsonbScalarValue(key, result) == 0)            return result;    }

Yes, it's not a fair comparison because I made up getJsonbValue(). But
it's *much* more readable regardless. And there's more places that could
use it. Like the second half of findJsonbValueFromSuperHeader(). FWIW,
that's one of the requests I definitely made before.

> > * I wonder if the hash/object pair representation is wise and if it
> >   shouldn't be keys combined with offsets first, and then the
> >   values. That will make access much faster. And that's what jsonb
> >   essentially is about.
>
> I like that the physical layout reflects the layout of the original
> JSON document.

Don't see much value in that. This is a binary representation and it'd
be bijective.

> Besides, I don't think this is obviously the case. Are
> you sure that it won't be more useful to make children as close to
> their parents as possible? Particularly given idiomatic usage.

Because - if done right - it would allow elementwise access without
scanning previous values.

> > * I think both arrays and hashes should grow individual structs. With
> >   space for additional flags.

> Why?

Because a) it will make the code more readable b) it'll allow for
adding different representations of hashes/arrays.

Greetings,

Andres Freund

--Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers