I'm stuck trying to tune a big-ish postgres db and wondering if anyone
has any pointers.
I can't get Postgres to make good use of the plentiful RAM available,
or to stop thrashing the disks.
One main table: ~30 million rows, 20 columns, all integer, smallint or
char(2), and most of them indexed. It holds webserver logs and is
almost entirely foreign key ids. Row size is ~100 bytes.
The typical query is an aggregate over a large number of rows (say ~25% of the table):
SELECT COUNT(*), COUNT(DISTINCT user_id)
FROM table
WHERE epoch > ...
AND epoch < ...
AND country = ...
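We can post full EXPLAIN ANALYZE output if useful; we've been
eyeballing plans along these lines (the table name "logs", the epoch
range and the country code below are made-up stand-ins for the real
values):

  -- check what plan the query above actually gets
  -- ("logs" and the literal values are stand-ins)
  EXPLAIN ANALYZE
  SELECT COUNT(*), COUNT(DISTINCT user_id)
  FROM logs
  WHERE epoch > 1104537600
    AND epoch < 1107216000
    AND country = 'UK';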
The box has 4 x Opterons, 4 GB RAM and five 15k RPM disks in RAID 5. We
wanted fast query/lookup, and we know the hardware can deliver fast disk IO.
Running a typical query like the above seems to:
* hardly tax a single CPU
* leave plenty of RAM free
* thrash the disks about
The last point is based mostly on the observation that another tiddly,
unrelated MySQL db, which normally runs fast, grinds to a halt whenever
we're querying the Postgres db (while CPU and memory still appear to
have spare capacity).
We've currently got the settings below, and have tried doubling/halving
them, restarting and benchmarking a test query after each change. None
of it appears to materially alter our query time.
shared_buffers = 128MB
temp_buffers = 160MB
work_mem = 200MB
max_stack_depth = 7MB
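After each restart we sanity-check that the new values are live and
then re-time the same query from psql, roughly like this (again, "logs"
is a stand-in for the real table name):

  -- confirm the changed settings actually took effect after the restart
  SHOW shared_buffers;
  SHOW temp_buffers;
  SHOW work_mem;

  -- then re-run the benchmark query with psql's wall-clock timing on
  \timing
  SELECT COUNT(*), COUNT(DISTINCT user_id) FROM logs WHERE ...;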
We're less concerned about insert speed. There are typically only 1 or
2 users, but we want their queries to be fast.
Perhaps a little extreme, but I'm trying to find a way to express the
following so that Postgres understands it:
* Load this table, and one or two of its indexes (epoch, user_id), into RAM.
* All of the table, all of those indexes.
* Keep them there, but keep a disk-based backup for integrity.
* Run all selects against the in-RAM copy. Always.
Aka: I know we're hitting this table (and a couple of its columns) lots
and lots, so just get it into RAM and stop thrashing the disks.
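Back of the envelope: ~30 million rows x ~100 bytes is about 3 GB for
the table alone, so with the two hot indexes it's tight in 4 GB of RAM,
but close. We've been sizing things along these lines ("logs" and the
index names are stand-ins again):

  -- on-disk size of the table itself
  SELECT pg_size_pretty(pg_relation_size('logs'));
  -- the two indexes we most want resident
  SELECT pg_size_pretty(pg_relation_size('logs_epoch_idx'));
  SELECT pg_size_pretty(pg_relation_size('logs_user_id_idx'));
  -- the table plus all of its indexes together
  SELECT pg_size_pretty(pg_total_relation_size('logs'));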
Pointers welcome.