On 1/17/09, Erik Jones <ejones@engineyard.com> wrote:
> On Jan 15, 2009, at 8:06 PM, Tom Lane wrote:
>> "Jamie Tufnell" <diesql@googlemail.com> writes:
>>> item_count int -- this is derived from (select count(*) from items
>>> where group_id = id)
>>> ...
>>
>>> item_count would be updated by insert/update/delete triggers on the
>>> items table, hopefully that would ensure it is always correct?
>>
>> Concurrent updates to the items table make this much harder than
>> it might first appear. If you're willing to serialize all your
>> updating
>> transactions then you can make it work, but ...
>
> That was exactly the caveat I was about to point out.  That being
> said, keeping COUNT() values and other computed statistics based on
> other data in the database *is* a fairly common "tactic".  One method
> that I've used with great success to avoid the serialization problem
> is to have your triggers insert the necessary information for the
> update into a separate "update queue" table.  You then have a
> separate process that routinely sweeps that queue, aggregates the
> queued updates, and applies the net change to the count value of each
> affected groups entry.
Fortunately our items table rarely sees concurrent writes.  It's over
99% reads and is typically updated by just one user.  We are already
caching these aggregates and other data in a separate layer, and my
goal is to see if I can get rid of that layer.

In light of your advice, though, I will think things through a bit
more first.
Thanks for your help!
Jamie