I thought I'd post a follow-up to my earlier question about storage
bloat. I tried attacking the problem by playing a little loose with
database normalization and using arrays instead of flattening
everything out into separate rows.
So my original table,
CREATE TABLE original (
    dbIndex integer,
    index1  smallint,
    index2  smallint,
    index3  smallint,
    dvalue  real
);
becomes
CREATE TABLE newtable (
    dbIndex integer,
    dvalue  real[][][]
);
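For concreteness, writing and reading the array layout looks like this
(the literal and the subscripts are just placeholder values, not my
real data -- the old index1/index2/index3 become 1-based array
subscripts):

INSERT INTO newtable
    VALUES (1, '{{{1.0,2.0},{3.0,4.0}},{{5.0,6.0},{7.0,8.0}}}');

-- fetch one sample at position (index1, index2, index3) = (1, 2, 1)
SELECT dvalue[1][2][1] FROM newtable WHERE dbIndex = 1;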
Now I'd expect somewhat better storage here just because I'm dropping
3 smallint fields (2 bytes x 3, i.e. 6 bytes saved per value -- which
by itself is nowhere near a 6-fold reduction). In actuality, however,
I'm seeing a 30 fold decrease:
original table = 48 MB
newtable = 1.6 MB !!!!!
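Back-of-envelope, much of the gap makes sense once per-row overhead is
counted, not just the payload bytes. A rough sketch (the 23-byte tuple
header and 4-byte line pointer are typical PostgreSQL per-row costs I'm
assuming here, not values measured on my system):

```python
# Rough per-row accounting; header/pointer sizes are assumed
# typical PostgreSQL values, not measured.
TUPLE_HEADER = 23          # approximate heap tuple header
ITEM_POINTER = 4           # line pointer in the page
PAYLOAD_OLD = 4 + 3*2 + 4  # dbIndex + three smallints + real
PAYLOAD_NEW = 4            # just the real, once headers amortize away

per_row_old = TUPLE_HEADER + ITEM_POINTER + PAYLOAD_OLD
print(per_row_old)                 # bytes stored per 4-byte sample
print(per_row_old / PAYLOAD_NEW)   # shrink factor before compression
```

So roughly a 10x reduction comes just from shedding per-row overhead,
and compression of the TOASTed arrays plausibly accounts for the rest.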
If I'm reading the data directory correctly, the array tuples are
being TOASTed (which also compresses them): the main relation for
newtable is only 8 kB, and its pg_toast table is 1.6 MB.
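The compression part is plausible for this kind of data: time-series
samples tend to contain long repetitive byte runs. TOAST uses pglz, an
LZ-family compressor; the sketch below uses zlib purely as a stand-in
to show the effect, not PostgreSQL's actual codec, and the synthetic
signal is just an assumption about what smooth recordings look like:

```python
import struct
import zlib

# a smooth, periodic recording flattened to float4 bytes,
# standing in for one TOASTed array value
samples = [0.1 * (i % 50) for i in range(10000)]
raw = b"".join(struct.pack("<f", s) for s in samples)

packed = zlib.compress(raw)
print(len(raw), len(packed))  # compressed size is a small fraction
```

Real recordings with noise won't compress this dramatically, but the
direction of the effect is the same.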
So I'm very pleased with the storage here. Plus, reads are faster than
the original design. All in all, I think I've found a good solution
for my kind of data.
Now I know that the use of arrays is considered taboo in the books,
but in my case the data being stored really do have a natural array
(matrix) structure: it's a time-based recording, with one sample per
time point and per X,Y position -- a natural 3-D matrix.
What are the possible downsides that I may be missing?