Re: Compressed TOAST Slicing

From: Tomas Vondra
Subject: Re: Compressed TOAST Slicing
Date:
Msg-id: f51ef505-bec2-928b-f63c-9865b0e8fcd3@2ndquadrant.com
In response to: Re: Compressed TOAST Slicing  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On 2/20/19 7:50 PM, Robert Haas wrote:
> On Wed, Feb 20, 2019 at 1:45 PM Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>> What this does not support: any function that probably wants
>> less-than-everything, but doesn’t know how big a slice to look
>> for. Stephen thinks I should put an iterator on decompression,
>> which would be an interesting piece of work. Having looked at
>> the json code a little, doing partial searches would require a lot
>> of re-work that is above my paygrade, but if there was an iterator
>> in place, at least that next stop would then be open.
>>
>> Note that adding an iterator isn’t adding two ways to do the same 
>> thing, since the iterator would slot nicely underneath the existing
>> slicing API, and just iterate to the requested slice size. So this
>> is easily just “another step” along the train line to providing
>> streaming access to compressed and TOASTed data.
> 
> Yeah.  Plus, I'm not sure the iterator thing is even the right design
> for the JSONB case.  It might be better to think, for that case, about
> whether there's someway to operate directly on the compressed data.
> If you could somehow jigger the format and the chunking so that you
> could jump directly to the right chunk and decompress from there,
> rather than having to walk over all of the earlier chunks to figure
> out where the data you want is, you could probably obtain a large
> performance benefit.  But figuring out how to design such a scheme
> seems pretty far afield from the topic at hand.
> 

I doubt that operating directly on compressed data is doable/efficient
unless the compression format was designed with that use case in mind,
which pglz almost certainly was not. Furthermore, I think there's a
layering issue - the compression currently happens in the TOAST
infrastructure (and Paul's patch does not change this), while operating
on compressed data is inherently specific to a given data type.
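
To illustrate why: pglz is an LZ-family compressor, and in that family
a match tag copies bytes from output that was already produced. A
simplified sketch of such a decode loop (not the actual pglz code;
decode_len/decode_off are hypothetical helpers standing in for pglz's
tag parsing):

    while (src < srcend && dst < dstend)
    {
        unsigned char ctrl = *src++;    /* 8 flag bits: literal or match */
        int           bit;

        for (bit = 0; bit < 8 && src < srcend && dst < dstend; bit++)
        {
            if (ctrl & 1)
            {
                /* hypothetical helpers, standing in for tag parsing */
                int32   len = decode_len(&src);
                int32   off = decode_off(&src);

                /* match: copy bytes from already-produced output */
                while (len-- > 0 && dst < dstend)
                {
                    *dst = *(dst - off);
                    dst++;
                }
            }
            else
                *dst++ = *src++;        /* literal byte */

            ctrl >>= 1;
        }
    }

Every match depends on the output that precedes it, so to produce byte
N you have to decompress bytes 0..N-1 first - random access would need
a different format (e.g. independently compressed blocks plus an offset
index, roughly what Robert describes above).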

> I'd actually be inclined not to add an iterator until we have a real 
> user for it, for exactly the reason that we don't know that it is
> the right thing.  But there is certain value in decompressing
> partially, to a known byte position, as your patch does, no matter
> what we decide to do about that stuff.
> 

Well, I think Simon's suggestion was that we should also use the
iterator from the JSONB code, so that would be a use for it. And if
Paul implemented the current slicing on top of the iterator, that would
also be a user (even without the JSONB stuff).
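
FWIW here is a rough sketch of how those pieces could fit together -
all the names below are hypothetical, invented purely for illustration,
not anything that exists in PostgreSQL today:

    /*
     * Hypothetical iterator API. The iterator keeps its position in
     * both streams, so each call resumes where the previous one
     * stopped.
     */
    typedef struct PGLZ_DecompressIterator
    {
        const char *src;        /* current position in compressed input */
        const char *srcend;
        char       *dst;        /* current position in output buffer */
        char       *dstend;
    } PGLZ_DecompressIterator;

    /* Produce up to "nbytes" more bytes; returns 0 at end of data. */
    extern int32 pglz_decompress_iterate(PGLZ_DecompressIterator *iter,
                                         int32 nbytes);

    /*
     * The current slicing behavior then becomes a thin wrapper:
     * iterate exactly once, up to the requested slice size.
     */
    static int32
    pglz_decompress_slice(const char *source, int32 slen,
                          char *dest, int32 slicelen)
    {
        PGLZ_DecompressIterator iter;

        iter.src = source;
        iter.srcend = source + slen;
        iter.dst = dest;
        iter.dstend = dest + slicelen;

        return pglz_decompress_iterate(&iter, slicelen);
    }

A JSONB search could then call the iterator repeatedly and stop as soon
as it has decompressed enough of the datum, instead of having to commit
to a slice size up front.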

But I think Andres is right that this might increase the complexity of
the patch too much, possibly pushing it out of PG12. I don't see
anything wrong with doing the simple approach now and then extending it
to handle JSONB later, if someone wants to invest their time into it.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

