Re: substring on bit(n) and bytea types is slow

From Evgeny Morozov
Subject Re: substring on bit(n) and bytea types is slow
Date
Msg-id CALtd4uVUrocyVzqthXmOJ665FZTFWv-nRR99FF7Ds5AyXR-9Cg@mail.gmail.com
In response to Re: substring on bit(n) and bytea types is slow  (Arjen Nienhuis <a.g.nienhuis@gmail.com>)
List pgsql-general
On 2 March 2016 at 00:33, Arjen Nienhuis <a.g.nienhuis@gmail.com> wrote:


On Feb 29, 2016 22:26, "Evgeny Morozov" <evgeny.morozov+list+pgsql@shift-technology.com> wrote:
> SELECT substring(bitarray from (32 * (n - 1) + 1) for 32) -- bitarray is a column of type bit(64000000)
> FROM array_test_bit
> JOIN generate_series(1, 10000) n ON true;

Substring on a bit string is not optimized for long TOASTed values. Substring on text is optimized for that. The current code fetches the whole 8MB from the table every time.
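For comparison, here is a minimal sketch of the same slice read against a bytea column. It assumes a hypothetical table array_test_bytea with a column bytearray (neither name is from the thread), and it assumes the value is stored uncompressed: the PostgreSQL documentation says substring on text and bytea only fetches the required TOAST chunks when the out-of-line value is not compressed, which is what STORAGE EXTERNAL arranges. Whether this helps in a given setup would need to be measured.

```sql
-- Hypothetical table: the same data as array_test_bit, but stored as bytea.
CREATE TABLE array_test_bytea (bytearray bytea);

-- EXTERNAL = out-of-line storage without compression, so substring()
-- can fetch only the TOAST chunks it needs. Set this before loading the
-- data: it only affects values written afterwards.
ALTER TABLE array_test_bytea ALTER COLUMN bytearray SET STORAGE EXTERNAL;

-- Read the n-th 4-byte (32-bit) slot, mirroring the bit(n) query above.
SELECT substring(bytearray from (4 * (n - 1) + 1) for 4)
FROM array_test_bytea
JOIN generate_series(1, 10000) n ON true;
```

The trade-off is storage: with EXTERNAL the full 8MB is kept uncompressed on disk.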

I see, thanks. Is there a better way to pack a large number of integers efficiently with reasonable read/write performance?

I tried arrays of bit varying, which seemed perfect, but in practice, when I stored 4M integers in one, each taking as few bits as possible, the table took 13MB, the same as if I had stored all of them as bit(24). In fact, an array of 4M bit(10) integers also takes 13MB, while bit(8) takes only 0.7MB. bit(9) is where things get weird: for integers 1 to 4M it takes 13MB, but if I multiply them by 2 (i.e. store 4M even integers) it takes 0.7MB! So there must be some kind of compression going on there, but I don't understand how it works.
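One way to check whether TOAST compression accounts for these sizes is to load the same array into two tables, one with the default storage and one with compression disabled, and compare what pg_column_size reports. This is only a diagnostic sketch: the table and column names are hypothetical, and the bit(9) cast is just one guess at how the test data was built.

```sql
-- Hypothetical tables; default storage for arrays allows compression.
CREATE TABLE varbit_compressed (vals bit varying[]);
CREATE TABLE varbit_plain (vals bit varying[]);

-- Store out of line but never compress.
ALTER TABLE varbit_plain ALTER COLUMN vals SET STORAGE EXTERNAL;

-- Build the same 4M-element array for both tables.
-- Note: casting an integer to bit(9) keeps only its 9 rightmost bits.
INSERT INTO varbit_compressed
SELECT array_agg(n::bit(9)::bit varying ORDER BY n)
FROM generate_series(1, 4000000) n;

INSERT INTO varbit_plain
SELECT array_agg(n::bit(9)::bit varying ORDER BY n)
FROM generate_series(1, 4000000) n;

-- pg_column_size reports the bytes used to store the value,
-- after compression if any was applied.
SELECT 'compressed' AS variant, pg_column_size(vals) AS stored_bytes
FROM varbit_compressed
UNION ALL
SELECT 'plain', pg_column_size(vals)
FROM varbit_plain;
```

If the two numbers differ widely, compression is doing the work; repeating the test with n * 2 in place of n would show whether the even-integer case compresses differently.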
