Hash Index on Partitioned Table

Поиск
Список
Период
Сортировка
От peter.borissow@kartographia.com
Тема Hash Index on Partitioned Table
Дата
Msg-id 1685540675.65414951@apps.rackspace.com
обсуждение исходный текст
Ответы Re: Hash Index on Partitioned Table
Список pgsql-general

Dear PostgreSQL Community,

 

I have a rather large database with ~250 billion records in a partitioned table. The database has been running and ingesting data continuously for about 3 years.

 

I had a "regular" BTree index on one of the fields (a unique bigint column) but it was getting too big for the disk it was on. The index was consuming 6.4 TB of disk space.

 

I created a new disk with double the size, dropped the original index, and started to generate a new index.

 

After doing some research I decided to try to create a hash index instead of a BTree. For my purposes, the index is only used to find specific numbers ("=" and "IN" queries). From what I read, the hash index should run a little faster than btree for my use case and should use less disk space.

 

After 115 hours, the hash index is still generating and is using significantly more disk space than the original BTree index (8.4 TB vs 6.4 TB). I don't know how to check the status of the index creation task so I can't really estimate how much longer it will take or how much disk space it will consume.

 

Questions:

 

(1) Why is the hash index consuming more disk space than the btree index? Is it because the hash of the bigint values larger than the storing the bigints in the btree?

(2) Are there any known issues having a hash index on this many records?

(3) Are there any known issues having a hash index on partitioned tables?

(4) Is there any way to estimate when the index process will complete?

 

Server info:

 - PostgreSQL 13

 - Ubuntu 20.04.6 LTS

 - 64 cores (only 1 is ever used during index creation)

 - Memory usage is steady at 58GB/188GB

 - All disks are high speed NVMe drives

 - 1,686 tables in the partition

 

Thanks in advance,

Peter

В списке pgsql-general по дате отправления:

Предыдущее
От: hubert depesz lubaczewski
Дата:
Сообщение: Re: Is there a bug in psql? (SELECT ''';)
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Pg 16: will pg_dump & pg_restore be faster?