[proposal] de-TOAST'ing using a iterator

From: Binguo Bao
Subject: [proposal] de-TOAST'ing using a iterator
Msg-id CAL-OGks_onzpc9M9bXPCztMofWULcFkyeCeKiAgXzwRL8kXiag@mail.gmail.com
Replies: Re: [proposal] de-TOAST'ing using a iterator (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-hackers
Hi hackers!
This proposal aims to provide the ability to de-TOAST a fully TOAST'd and compressed field using an iterator and then update the appropriate parts of the code to use the iterator where possible instead of de-TOAST'ing and de-compressing the entire value. Examples where this can be helpful include using position() from the beginning of the value, or doing a pattern or substring match.

de-TOAST iterator overview:
1. The caller requests a slice of the attribute value from the de-TOAST iterator.
2. The de-TOAST iterator checks whether the slice is available in the output buffer. If it is, the iterator returns
    the result directly; otherwise, go to step 3.
3. The de-TOAST iterator checks whether the slice is available in the input buffer. If it is, go to step 4; otherwise,
    call fetch_datum_iterator to fetch datums from disk into the input buffer.
4. If the data in the input buffer is compressed, decompress data from the input buffer into the output buffer until
    the caller's needs are met.
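
The steps above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the patch itself: zlib stands in for PostgreSQL's own compression, and the names toast, DetoastIterator, ensure, position and CHUNK_SIZE are hypothetical and do not come from the patch.

```python
import zlib

CHUNK_SIZE = 256  # hypothetical chunk size, chosen only for this sketch


def toast(value: bytes) -> list:
    """Compress a value and split it into chunks (stand-in for TOAST storage)."""
    compressed = zlib.compress(value)
    return [compressed[i:i + CHUNK_SIZE]
            for i in range(0, len(compressed), CHUNK_SIZE)]


class DetoastIterator:
    """Illustrative iterator that fetches and decompresses only on demand."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)          # stands in for fetching datums from disk
        self._decomp = zlib.decompressobj()  # streaming decompressor (assumption: zlib)
        self._input = b""                    # input buffer: fetched, not yet decompressed
        self.output = bytearray()            # output buffer: decompressed bytes
        self.chunks_fetched = 0
        self.exhausted = False

    def ensure(self, need: int) -> None:
        """Make at least `need` bytes available in the output buffer, if they exist."""
        # Step 2: the loop is skipped when the output buffer already suffices.
        while len(self.output) < need and not self.exhausted:
            if not self._input:
                # Step 3: input buffer is empty, so fetch the next chunk.
                chunk = next(self._chunks, None)
                if chunk is None:
                    self.output += self._decomp.flush()
                    self.exhausted = True
                    break
                self._input = chunk
                self.chunks_fetched += 1
            # Step 4: decompress only as much as the caller still needs.
            want = need - len(self.output)
            self.output += self._decomp.decompress(self._input, want)
            self._input = self._decomp.unconsumed_tail


def position(needle: bytes, it: DetoastIterator, step: int = 4096) -> int:
    """Return the 1-based position of needle, or 0 if absent, scanning slice by slice."""
    searched = 0
    while True:
        it.ensure(len(it.output) + step)  # Step 1: request the next slice
        # Re-scan a small overlap so a match spanning two slices is not missed.
        start = max(0, searched - len(needle) + 1)
        idx = it.output.find(needle, start)
        if idx >= 0:
            return idx + 1
        if it.exhausted:
            return 0
        searched = len(it.output)
```

With a value built like the test data below, searching for '123' should stop after the first slice, while searching for '321' has to walk through essentially the whole value, which mirrors the two kinds of queries in the benchmark.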

I've implemented a prototype and applied it to the position() function to test performance.
Test tables:
-----------------------------------------------------------------------------------------------------
create table detoast_c (id serial primary key,
a text
);
insert into detoast_c (a) select repeat('1234567890-=abcdefghijklmnopqrstuvwxyz', 1000000)||'321' as a from generate_series(1,100);

create table detoast_u (id serial primary key,
a text
);
alter table detoast_u alter a set storage external;
insert into detoast_u (a) select repeat('1234567890-=abcdefghijklmnopqrstuvwxyz', 1000000)||'321' as a from generate_series(1,100);
**************************************************************************************
-----------------------------------------------------------------------
 query                                       | master (ms) | patch (ms)
-----------------------------------------------------------------------
 select position('123' in a) from detoast_c; |    4054.838 |   1440.735
 select position('321' in a) from detoast_c; |   25549.270 |  27696.245
 select position('123' in a) from detoast_u; |    8116.996 |   1386.802
 select position('321' in a) from detoast_u; |   28442.116 |  27672.319
-----------------------------------------------------------------------
**************************************************************************************
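The iterator greatly improves the efficiency of partial de-TOAST: position('123' in a) runs about 2.8x faster on detoast_c (4054.838 ms to 1440.735 ms) and about 5.9x faster on detoast_u (8116.996 ms to 1386.802 ms). Full de-TOAST shows little degradation: position('321' in a) is about 8% slower on detoast_c and about 3% faster on detoast_u.
Next, I will continue to study how to apply iterators to more queries and how to improve iterator efficiency, for example by using macros instead of function calls.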

The patch is also available on GitHub[1].
Any suggestions or comments would be much appreciated :)

Best regards, Binguo Bao.
