Re: Fillfactor for GIN indexes
| От | Tomas Vondra |
|---|---|
| Тема | Re: Fillfactor for GIN indexes |
| Дата | |
| Msg-id | 54EC8779.6000103@2ndquadrant.com обсуждение исходный текст |
| Ответ на | Re: Fillfactor for GIN indexes (Alexander Korotkov <aekorotkov@gmail.com>) |
| Ответы |
Re: Fillfactor for GIN indexes
|
| Список | pgsql-hackers |
Hi,
I've been wondering whether this might improve behavior with one of my
workloads, suffering by GIN bloat - the same one I used to test GIN
fastscan, for example.
It's a batch process that loads a mailing list archive into a table with
a GIN index on message body, by doing something like this:
for file in files: BEGIN; for message in file:
SAVEPOINT s; INSERT INTO messages VALUES (...)
if error: ROLLBACK TO s;
COMMIT;
And there are multiple processes, each processing subset of mbox files.
There are ~1M messages and right after the load I see this:
List of relations Schema | Name | Type | Owner | Table | Size
--------+------------------+-------+-------+----------+--------- public | message_body_idx | index | tomas | messages
|2247 MB (1 row)
and after VACUUM FULL:
List of relations Schema | Name | Type | Owner | Table | Size
--------+------------------+-------+-------+----------+--------- public | message_body_idx | index | tomas | messages
|403 MB (1 row)
So the index is ~5x larger, which is probably expected due to the amount
of random inserts within a very short time (~15 minutes), executed in
parallel.
I hoped lowering the fillfactor will improve this, but fillfactor=75 had
pretty much no effect in this case. Is that expected for this kind of
workload? I see the previous discussion talked about random updates, not
inserts, so maybe that's the culprit?
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
В списке pgsql-hackers по дате отправления: