I have this table and index:
create table t(id int, hash int);
create index idx_t on t(hash);
The value of the hash column, which is indexed, is a pseudo-random
number. I load the table and measure the time per insert.
What I've observed is that inserts slow down as the table grows to
1,000,000 records. Observing the pg_stat* tables, I see that the data
page reads per unit time stay steady, but that index page reads grow
quickly, (shared_buffers was set to 2000).
I'm guessing that this is because inserts will append to data pages,
but there is a random probe to the btree to maintain the index.
This is a test program, but I'm wondering about my application that
will need row counts much beyond 1,000,000.
Questions:
- Am I reasoning about the data and index costs correctly?
- In order to keep insert times from dropping too much, do I simply
need to increase shared_buffers in order to accomodate more of the
index?
- Assuming I'm willing to buy enough RAM and set shmmax high enough,
are there practical limits on how big shared_buffers can be that will
limit how far I can pursue such a strategy?
Jack Orenstein