Re: Zedstore - compressed in-core columnar storage

From Heikki Linnakangas
Subject Re: Zedstore - compressed in-core columnar storage
Date
Msg-id b9af4752-a844-f97b-8f08-dd483b5e6070@iki.fi
In reply to Re: Zedstore - compressed in-core columnar storage  (Ashwin Agrawal <aagrawal@pivotal.io>)
Responses Re: Zedstore - compressed in-core columnar storage  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List pgsql-hackers
On 14/08/2019 20:32, Ashwin Agrawal wrote:
> On Wed, Aug 14, 2019 at 2:51 AM Ashutosh Sharma wrote:
>> 2) Is there a chance that IndexOnlyScan would ever be required for
>>    zedstore tables considering the design approach taken for it?
> 
> We have not given much thought to IndexOnlyScans so far. But I think
> IndexOnlyScan definitely would be beneficial for zedstore as
> well. Even for normal index scans, fetching as many columns as
> possible from the index itself and getting only the rest of the
> required columns from the table would be good for zedstore. It would
> help to further cut down IO. Ideally, for visibility checking only
> TidTree needs to be scanned and visibility checked with the same, so
> the cost of checking is much lower compared to heap (if VM can't be
> consulted) but still is a cost. Also, with vacuum, if UNDO log gets
> trimmed, the visibility checks are pretty cheap. Still, given all
> that, having a VM-type structure to optimize the same further would
> help.

Hmm, yeah. An index-only scan on a zedstore table could perform the "VM
checks" by checking the TID tree in the zedstore. It's not as compact as
the 2 bits per heap page in the heapam's visibility map, but it's pretty
good.

>> Further, I tried creating a zedstore table with a btree index on one
>> of its columns and loaded around 50 lakh (5 million) records into the
>> table. When the indexed column was scanned (with enable_seqscan set
>> to off), it went for IndexOnlyScan and that took around 15-20 times
>> longer than an IndexOnlyScan on a heap table, just because
>> IndexOnlyScan in zedstore always goes to heap as the visibility check
>> fails.

Currently, an index-only scan on zedstore should be pretty much the same 
speed as a regular index scan. All the visibility checks will fail, and 
you end up fetching every row from the table, just like a regular index 
scan. So I think what you're seeing is that index fetches on a
zedstore table are much slower than on heap.
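
To illustrate why the two scan types cost about the same when every visibility check fails, here is a toy Python model, not zedstore or PostgreSQL code. The names `index_only_scan`, `is_all_visible`, and `fetch_row` are hypothetical stand-ins for the executor loop, the visibility check, and the table fetch.

```python
from dataclasses import dataclass


@dataclass
class ScanStats:
    index_entries: int = 0   # index tuples visited
    table_fetches: int = 0   # times the scan had to visit the table


def index_only_scan(index_entries, is_all_visible, fetch_row):
    """Toy index-only scan over (key, tid) pairs.

    is_all_visible(tid) is a hypothetical visibility check; when it
    returns False the scan falls back to fetching the row from the
    table, exactly as a regular index scan would.
    """
    stats = ScanStats()
    keys = []
    for key, tid in index_entries:
        stats.index_entries += 1
        if is_all_visible(tid):
            keys.append(key)          # answered from the index alone
            continue
        stats.table_fetches += 1      # fallback: visit the table
        if fetch_row(tid) is not None:
            keys.append(key)
    return keys, stats


# A made-up 1000-row table and an index on its first column.
table = {tid: ("key%d" % tid, "payload") for tid in range(1000)}
index = [(row[0], tid) for tid, row in table.items()]

# Visibility check always fails (the situation described above).
_, never_visible = index_only_scan(index, lambda tid: False, table.get)
# Visibility check always succeeds (what a VM-like structure would allow).
_, always_visible = index_only_scan(index, lambda tid: True, table.get)
print(never_visible.table_fetches, always_visible.table_fetches)
```

In this model the failing check costs one table fetch per index entry, so the "index-only" scan degenerates into a regular index scan.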

Ideally, on a column store the index fetches would only fetch the needed 
columns, but I don't think that's been implemented yet, so all the 
columns are fetched. That can make a big difference, if you have a wide 
table with lots of columns, but only actually need a few of them. Was 
your test case something like that?
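
As a rough sketch of how much column projection could matter, here is a toy column store in Python that counts per-column reads. `ToyColumnStore` and its `fetch` method are invented for illustration and say nothing about zedstore's actual on-disk layout or API; the 50-column width is an arbitrary assumption.

```python
class ToyColumnStore:
    """Each column is stored separately; reading a row touches one
    entry per requested column. Purely illustrative."""

    def __init__(self, columns):
        self.columns = columns        # column name -> list indexed by tid
        self.column_reads = 0         # crude stand-in for I/O cost

    def fetch(self, tid, needed=None):
        # needed=None models the current behaviour: fetch every column.
        names = list(self.columns) if needed is None else needed
        row = {}
        for name in names:
            self.column_reads += 1
            row[name] = self.columns[name][tid]
        return row


def make_store(ncols=50, nrows=1000):
    return ToyColumnStore(
        {"c%d" % i: list(range(nrows)) for i in range(ncols)}
    )


# Fetch 1000 rows by TID, all columns versus only the two we need.
store_all = make_store()
for tid in range(1000):
    store_all.fetch(tid)

store_projected = make_store()
for tid in range(1000):
    store_projected.fetch(tid, needed=["c0", "c7"])

print(store_all.column_reads, store_projected.column_reads)
```

With these made-up numbers the unprojected fetch touches 25 times as many column entries, which is the kind of gap a wide table could produce.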

We haven't spent much effort on optimizing index fetches yet, so I hope
there are many other little tweaks we can make there as well to speed
it up.

- Heikki


