Folks,
One of Postgres' poorest-performing areas is aggregates. This is an
unfortunate side effect of our fully extensible aggregate and type
system. However, I thought that the folks on this list might have a
few tips on making aggregates perform faster.
Here's mine: Aggregate Caching Table
This is a brute-force approach. However, if you have a table with a
million records for which users *frequently* ask for grand totals or
counts, it can work fine.
A simple example:
CREATE TABLE client_case_counts (
    -- one cached count per client; the PRIMARY KEY keeps the table
    -- to a single row per client and makes the trigger's lookup fast
    client_id INT NOT NULL PRIMARY KEY
        REFERENCES clients(client_id) ON DELETE CASCADE,
    no_cases INT NOT NULL DEFAULT 0
);
Then create triggers:
CREATE FUNCTION tf_maintain_client_counts()
RETURNS trigger AS '
BEGIN
    -- First try to bump an existing cached count.
    UPDATE client_case_counts SET no_cases = no_cases + 1
        WHERE client_id = NEW.client_id;
    -- No cached row yet: seed one. (INSERT ... VALUES cannot take
    -- a WHERE clause, so use INSERT ... SELECT instead.)
    INSERT INTO client_case_counts ( client_id, no_cases )
        SELECT NEW.client_id, 1
        WHERE NOT EXISTS (SELECT 1 FROM client_case_counts ccc2
                          WHERE ccc2.client_id = NEW.client_id);
    RETURN NEW;
END;' LANGUAGE 'plpgsql';
CREATE TRIGGER tg_maintain_client_counts
    AFTER INSERT ON cases
    FOR EACH ROW EXECUTE PROCEDURE tf_maintain_client_counts();
etc. You also need a matching trigger on DELETE (and on UPDATE, if a
case can move between clients) so the cached counts never drift; a
sketch of the delete side follows.
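Here is a minimal sketch of that delete side. It assumes rows only
ever leave cases via plain DELETE; the names tf_decrement_client_counts
and tg_decrement_client_counts are mine, not anything standard:

CREATE FUNCTION tf_decrement_client_counts()
RETURNS trigger AS '
BEGIN
    -- In a DELETE trigger the vanishing row is OLD, not NEW.
    UPDATE client_case_counts SET no_cases = no_cases - 1
        WHERE client_id = OLD.client_id;
    RETURN OLD;
END;' LANGUAGE 'plpgsql';

CREATE TRIGGER tg_decrement_client_counts
    AFTER DELETE ON cases
    FOR EACH ROW EXECUTE PROCEDURE tf_decrement_client_counts();

Once the triggers are in place, the expensive aggregate becomes a
single-row indexed lookup:

SELECT no_cases FROM client_case_counts WHERE client_id = 42;

instead of

SELECT count(*) FROM cases WHERE client_id = 42;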
While effective, this approach is costly: every INSERT or DELETE on
cases now pays for an extra UPDATE on the caching table. It is also
limited to whatever aggregate requests you anticipated when you built
the table ... it does no good for aggregates over a user-defined
range.
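For instance, a query like this (opened_on is a hypothetical date
column on cases) still has to scan the table, cache or no cache:

SELECT count(*) FROM cases
    WHERE opened_on BETWEEN '2002-01-01' AND '2002-06-30';

You could cache per-month subtotals and sum them, but the number of
ranges users can ask for multiplies quickly.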
What have other Postgres users done to speed up aggregates on large
tables?
-Josh Berkus