Re: jsonb format is pessimal for toast compression

From: Peter Geoghegan
Subject: Re: jsonb format is pessimal for toast compression
Date:
Message-ID: CAM3SWZSDMkntNCG8dm-grcke_BjZ6U3sSDdMVWhpC_VXJwQ_Jw@mail.gmail.com
In response to: Re: jsonb format is pessimal for toast compression  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: jsonb format is pessimal for toast compression  (Stephen Frost <sfrost@snowman.net>)
List: pgsql-hackers
On Mon, Aug 11, 2014 at 12:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think that's a good point.

I think that there may be something to be said for the current layout.
Having adjacent keys and values could take better advantage of CPU
cache characteristics. For example, I've heard more than once of
approaches to improving B-Tree locality that force keys and values to
be adjacent on individual B-Tree pages [1]. And FWIW, based on earlier
research into user requirements in this area, I believe that very
large jsonb datums are not considered all that compelling. Document
database systems have considerable limitations here.

> On the general topic, I don't think it's reasonable to imagine that
> we're going to come up with a single heuristic that works well for
> every kind of input data.  What pglz is doing - assuming that if the
> beginning of the data is incompressible then the rest probably is too
> - is fundamentally reasonable, notwithstanding the fact that it
> doesn't happen to work out well for JSONB.  We might be able to tinker
> with that general strategy in some way that seems to fix this case and
> doesn't appear to break others, but there's some risk in that, and
> there's no obvious reason in my mind why PGLZ should be required to fly
> blind.  So I think it would be a better idea to arrange some method by
> which JSONB (and perhaps other data types) can provide compression
> hints to pglz.
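
To make the heuristic under discussion concrete, here is a minimal Python
sketch of "give up if the beginning does not compress". It is not pglz's
code: zlib stands in for pglz, and the function name
compress_with_early_bailout and the 1024-byte probe size are assumptions
chosen purely for illustration.

```python
import random
import zlib


def compress_with_early_bailout(data: bytes, probe_bytes: int = 1024):
    """Return compressed bytes, or None if we decide to store data raw.

    Illustrative only: the decision is based on the leading probe_bytes,
    on the assumption that an incompressible start predicts an
    incompressible remainder.
    """
    probe = data[:probe_bytes]
    # If the leading chunk does not shrink, give up without looking further.
    if len(zlib.compress(probe)) >= len(probe):
        return None
    packed = zlib.compress(data)
    # Only keep the compressed form if it actually saves space.
    return packed if len(packed) < len(data) else None


# Hypothetical input: an incompressible prefix followed by a highly
# repetitive tail, loosely analogous to the jsonb case in this thread.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(1024))
tricky = noise + b'{"key": "value"}' * 500

bailed_out = compress_with_early_bailout(tricky) is None
whole_is_compressible = len(zlib.compress(tricky)) < len(tricky)
```

In this sketch the early probe rejects an input that would have compressed
well as a whole, which is the failure mode being described for JSONB.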

If there is to be any effort to make jsonb a more effective target for
compression, I imagine that that would have to target redundancy
between JSON documents. With idiomatic usage, we can expect plenty of
it.
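
As a rough illustration of what redundancy between documents could buy,
the following Python sketch compresses one JSON document on its own and
then again with a preset dictionary built from a sibling document. This is
only a sketch using zlib's preset-dictionary support, not a proposal for
how jsonb or pglz would do it; the document fields and helper names are
made up for the example.

```python
import json
import zlib


def compress_alone(doc: bytes) -> bytes:
    # Each document is compressed in isolation, as with per-datum compression.
    return zlib.compress(doc, 9)


def compress_with_dict(doc: bytes, shared: bytes) -> bytes:
    # The preset dictionary lets matches refer to bytes seen in "shared".
    c = zlib.compressobj(level=9, zdict=shared)
    return c.compress(doc) + c.flush()


def decompress_with_dict(blob: bytes, shared: bytes) -> bytes:
    # The same dictionary must be supplied to decompress.
    d = zlib.decompressobj(zdict=shared)
    return d.decompress(blob) + d.flush()


# Hypothetical documents that share keys, as idiomatic JSON usage tends to.
docs = [
    json.dumps({
        "customer_id": i,
        "shipping_address": f"{i} Main Street",
        "order_status": "shipped",
        "payment_method": "credit_card",
    }).encode()
    for i in range(1, 4)
]
shared = docs[0]  # one sibling document serves as the dictionary

alone_size = len(compress_alone(docs[1]))
dict_size = len(compress_with_dict(docs[1], shared))
```

Comparing alone_size with dict_size shows how much of a small document's
content is repeated from its siblings rather than within itself.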

[1] http://www.vldb.org/conf/1999/P7.pdf , "We also forced each key
and child pointer to be adjacent to each other physically"
-- 
Peter Geoghegan


