Hello,
First post on the list, so please bear with me :)
I'm working on a fairly large DB (for me) - 23 GB of data, 42 GB of indexes, 100 M rows in my biggest table - and I'm trying to find possible optimisations of the storage requirements... and hopefully to keep more of the indexes in RAM.
This DB is actually an import of "events" over a sliding 30-day window, and I have no control over the events that are generated. I do have control over the import script, though.
Many of the columns have the following characteristics:
- VARCHAR
- low cardinality (typically < 100 distinct values)
- but new values can appear "at any time" when importing data from external systems. I don't expect the cardinality to grow significantly, though.
Storing those columns naively is quite demanding compared to the amount of information they carry, and I'm looking for ways to optimise this. Obviously I could:
- normalize the DB by adding another table and a FOREIGN KEY - that table could be maintained via triggers, for instance, at a cost in complexity (triggers, applications accessing the DB, ...) - a rough sketch of what I mean is below
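Something like this is what I have in mind for the normalized approach. It's only a sketch: the table, column and function names ("events", "status_dict", "status_to_id") are made up, the function syntax assumes PostgreSQL (PL/pgSQL), and it ignores concurrent writers on the dictionary table.

  -- Lookup table holding the distinct text values and their small keys.
  CREATE TABLE status_dict (
      status_id   smallint PRIMARY KEY,
      status_text varchar  NOT NULL UNIQUE
  );

  -- Minimal stand-in for the real events table: the VARCHAR column is
  -- replaced by a SMALLINT foreign key.
  CREATE TABLE events (
      event_id   bigint    PRIMARY KEY,
      event_time timestamp NOT NULL,
      status_id  smallint  NOT NULL REFERENCES status_dict (status_id)
  );

  -- Resolve incoming text to its key, creating a new dictionary row the
  -- first time a value is seen. Not concurrency-safe: assumes a single
  -- import process writes to status_dict.
  CREATE OR REPLACE FUNCTION status_to_id(p_text varchar)
  RETURNS smallint AS $$
  DECLARE
      v_id smallint;
  BEGIN
      SELECT status_id INTO v_id
        FROM status_dict
       WHERE status_text = p_text;
      IF v_id IS NULL THEN
          INSERT INTO status_dict (status_id, status_text)
          SELECT COALESCE(MAX(status_id), 0) + 1, p_text
            FROM status_dict
          RETURNING status_id INTO v_id;
      END IF;
      RETURN v_id;
  END;
  $$ LANGUAGE plpgsql;

The import script would then call status_to_id() for each incoming value instead of storing the raw string, and anything reading the DB would have to join back to status_dict to get the text - which is exactly the extra complexity (and the changes to applications and ORM mappings) that I'd like to avoid.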
I would find it much more elegant to use a datatype where my VARCHARs would internally be stored as a SMALLINT (or similar) and indexed as a SMALLINT, while externally still behaving as a VARCHAR (comparisons, ORM bindings, ...).
I haven't found any datatype that works like this :(
Does anyone know of such a solution?
Thank you,
Olivier