Re: [HACKERS] GSOC - TOAST'ing in slices

From Robert Haas
Subject Re: [HACKERS] GSOC - TOAST'ing in slices
Date
Msg-id CA+TgmoZOTjzt4qCHQJNtvJ-EksUqdHNhT_WvEwaL5ZjpAyz3Eg@mail.gmail.com
In response to Re: [HACKERS] GSOC - TOAST'ing in slices  (George Papadrosou <gpapadrosou@gmail.com>)
Responses Re: [HACKERS] GSOC - TOAST'ing in slices
List pgsql-hackers
On Tue, Mar 14, 2017 at 10:03 PM, George Papadrosou
<gpapadrosou@gmail.com> wrote:
> The project’s idea is to implement different slicing approaches according to
> the value’s datatype. For example a text field could be split upon character
> boundaries while a JSON document would be split in a way that allows fast
> access to its keys or values.

Hmm.  So if you had a long text field containing multibyte characters,
and you split it after, say, every 1024 characters rather than after
every N bytes, then you could do substr() without detoasting the whole
field.  On the other hand, my guess is that you'd waste a fair amount
of space in the TOAST table, because it's unlikely that the chunks
would be exactly the right size to fill every page of the table
completely.  On balance it seems like you'd be worse off, because
substr() probably isn't all that common an operation.
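
To make the tradeoff above concrete, here is a minimal Python sketch of
character-based slicing. It is not how PostgreSQL's TOAST code works; the
names slice_by_chars and substr and the 0-based indexing are assumptions
made for illustration. Each chunk holds exactly 1024 characters, so a
substring lookup can compute which chunks it needs without reading the rest.

```python
CHUNK_CHARS = 1024  # characters per chunk, the figure used in the message

def slice_by_chars(text, chunk_chars=CHUNK_CHARS):
    """Split text into UTF-8 encoded chunks of chunk_chars characters each."""
    return [text[i:i + chunk_chars].encode("utf-8")
            for i in range(0, len(text), chunk_chars)]

def substr(chunks, start, length, chunk_chars=CHUNK_CHARS):
    """Return (substring, chunks_read) for a 0-based character range.

    Hypothetical helper: only the chunks covering the range are decoded.
    """
    if length <= 0 or start < 0:
        return "", 0
    first = start // chunk_chars
    last = (start + length - 1) // chunk_chars
    needed = chunks[first:last + 1]
    decoded = "".join(c.decode("utf-8") for c in needed)
    offset = start - first * chunk_chars
    return decoded[offset:offset + length], len(needed)

# Mixed 1-, 2- and 3-byte characters: chunk byte sizes come out unequal.
text = "aé漢" * 2000
chunks = slice_by_chars(text)
piece, chunks_read = substr(chunks, 3000, 10)
byte_sizes = [len(c) for c in chunks]
```

With multibyte text the entries in byte_sizes differ from chunk to chunk,
which is the source of the wasted space mentioned above: chunks of unequal
byte length are unlikely to pack pages completely.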

Now, in contrast, slicing JSON is a very common operation, so a
smarter slicing scheme might well pay off, but the question is - what
kind of a splitting method would actually allow fast access to the
keys or values?  It strikes me that this might be a difficult problem.
Tabula rasa, you could design a serialization format that was aware
that it might get toasted and was constructed in such a way as to
contain boundaries that are actually referenced from within the
format, so that, say, after reading the top-level keys and values, you
could know that you next need chunk #103.  But unless the existing
jsonb binary format was designed with that in mind, it doesn't seem
likely to end up being true just by chance.
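
As a rough illustration of such a toast-aware format, the Python sketch
below keeps an index of top-level keys in chunk 0 and stores the values in
the chunks that follow. This layout is entirely hypothetical and is not the
jsonb binary format; serialize, fetch and CHUNK_BYTES are made-up names,
and the sketch ignores the case where the index itself outgrows one chunk.

```python
import json

CHUNK_BYTES = 2000  # hypothetical chunk size, chosen only for the example

def serialize(doc, chunk_bytes=CHUNK_BYTES):
    """Return a list of chunks: chunk 0 is an index, the rest hold value bytes."""
    body = bytearray()
    index = {}
    for key, value in doc.items():
        encoded = json.dumps(value).encode("utf-8")
        index[key] = (len(body), len(encoded))  # byte offset and length in the body
        body += encoded
    data_chunks = [bytes(body[i:i + chunk_bytes])
                   for i in range(0, len(body), chunk_bytes)]
    return [json.dumps(index).encode("utf-8")] + data_chunks

def fetch(chunks, key, chunk_bytes=CHUNK_BYTES):
    """Return (value, chunk_numbers_read) for one top-level key."""
    offset, length = json.loads(chunks[0])[key]
    first = offset // chunk_bytes
    last = (offset + length - 1) // chunk_bytes
    read = list(range(first + 1, last + 2))  # +1 because chunk 0 is the index
    raw = b"".join(chunks[n] for n in read)
    start = offset - first * chunk_bytes
    return json.loads(raw[start:start + length]), [0] + read

doc = {"a": "x" * 5000, "b": [1, 2, 3], "c": {"k": "v"}}
chunks = serialize(doc)
value, chunks_read = fetch(chunks, "b")
```

Here fetching "b" touches only the index chunk and the one data chunk that
holds the value, while the large "a" value is never read. The point of the
sketch is that the chunk boundaries have to be known to the format itself,
which is why an existing format is unlikely to have this property by chance.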

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


