[HACKERS] Remove 1MB size limit in tsvector

From Ildus Kurbangaliev
Subject [HACKERS] Remove 1MB size limit in tsvector
Date
Msg-id 20170801170846.66e3ab06@wp.localdomain
Responses Re: [HACKERS] Remove 1MB size limit in tsvector  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hello, hackers!

Historically, the tsvector type has not been able to hold more than
1MB of data. I want to propose a patch that removes that limit.

That limit comes from the 'pos' field of WordEntry, which has only
20 bits of storage.
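
To make the arithmetic concrete, here is a minimal sketch of a 20-bit
offset field. Only the 20-bit 'pos' field comes from the description
above; the 'haspos' and 'len' fields, the OldWordEntry name and both
helper functions are assumptions added for illustration and are not
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative layout only: 'pos' is the 20-bit field discussed above.
 * 'haspos' and 'len' are assumed fillers that pad the entry to 32 bits.
 */
typedef struct
{
    uint32_t haspos:1,
             len:11,
             pos:20;
} OldWordEntry;

#define OLD_POS_BITS 20
#define OLD_MAX_POS  ((1u << OLD_POS_BITS) - 1)   /* 1048575 */

/* Returns 1 if a byte offset can be represented in 20 bits, else 0. */
int old_offset_fits(uint32_t offset)
{
    return offset <= OLD_MAX_POS;
}

/* Stores an offset in the bit field and returns what was actually kept. */
uint32_t old_store_pos(uint32_t offset)
{
    OldWordEntry e = {0};

    e.pos = offset;     /* unsigned bit field: value is reduced modulo 2^20 */
    return e.pos;
}
```

An offset of 1048576 (exactly 1MB) no longer fits, so lexeme data
beyond that point cannot be addressed through such a field.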

In the proposed patch I removed this field; instead, I keep offsets
only at every Nth item in the WordEntry array. N is currently set to
4, because that gave the best results in my benchmarks. It can be
increased in the future without affecting data already stored in the
database. Removing the field also improves compression of tsvectors.
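
The idea of storing an offset only at every Nth item can be sketched
as follows. This is not the patch's actual data layout: the two
parallel arrays, the OFFSET_STRIDE macro and both function names are
hypothetical, and alignment and position data are ignored. The sketch
only shows that the offset of any item can be recovered from the
nearest preceding stored offset plus the lengths of the items in
between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OFFSET_STRIDE 4     /* "N" from the message above */

/*
 * Fill anchors[k] with the byte offset of item k * OFFSET_STRIDE.
 * 'anchors' must have room for (n + OFFSET_STRIDE - 1) / OFFSET_STRIDE
 * elements.
 */
void sketch_build_anchors(const uint16_t *lens, size_t n, uint32_t *anchors)
{
    uint32_t off = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (i % OFFSET_STRIDE == 0)
            anchors[i / OFFSET_STRIDE] = off;
        off += lens[i];
    }
}

/*
 * Recover the byte offset of item i: start from the stored offset of
 * its group and add the lengths of the items that precede it in the
 * group (at most OFFSET_STRIDE - 1 additions).
 */
uint32_t sketch_offset(const uint32_t *anchors, const uint16_t *lens,
                       size_t n, size_t i)
{
    assert(i < n);

    uint32_t off = anchors[i / OFFSET_STRIDE];

    for (size_t j = i - (i % OFFSET_STRIDE); j < i; j++)
        off += lens[j];
    return off;
}
```

With a full-width stored offset, the addressable size is no longer
bounded by a 20-bit field, and the per-item cost is a short walk over
at most N - 1 lengths.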

I also simplified the code by creating functions that can be used to
build tsvectors. Previously, the code fragments that built a tsvector
were duplicated in several places.

The new patch also frees some space in WordEntry that can be used to
store additional information about the stored words.

---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
