On Tue, Jun 29, 2010 at 4:00 PM, sam mulube <sam.mulube@gmail.com> wrote:
> Hi,
>
> we are considering database partitioning as a possible solution to
> some performance issues we are having with our database, and we are
> trying to decide on a partitioning scheme. We have a moderately write
> heavy application (approx 50 inserts per second, with writes
> outnumbering reads by roughly 5:1), and the table in question looks
> something like this:
>
> -------------------------------------------------------------------------------
> column name : id | value | server_id | created_at
> column type : integer | string | integer | timestamp
> with time zone
> other info : pk | | fk, indexed |
> indexed
> -------------------------------------------------------------------------------
>
> Or initial thoughts on partitioning was to partition by date using the
> created_at column, with a separate partition for each month; however
> the vast majority of our inserts would be for 'now', so we would be
> almost entirely writing to the partition for the current month. Other
> month partitions might get occasional updates, but this would be a
> relatively infrequent occurrence.
>
> Alternatively we wondered about partitioning by the server_id foreign
> key, using for example the modulo of the foreign key id. This would
> give us a finite number of partitions (rather than the potentially
> unbounded date option), and would likely cause writes to be much more
> evenly distributed between the partitions.
>
> Does anyone have any likely idea which would be the better choice. The
> single hot table getting most of the inserts, which might mean any
> indexes are fully in memory, or dividing the writes more evenly over
> all of our partitions?
Are most of your selects for now to now - 1 day or so as well? If so,
then look at having one big partition for historical data and one
small one for the last day. Every x hours run a cron job that moves
everything in the current partition to the old archive partition(s).