On Wed, Feb 8, 2012 at 12:03 PM, David Yeu <david.yeu@skype.net> wrote:
> Hi there,
>
> We've got a pretty large table that sees millions of new rows a day, and
> we're trying our best to optimize queries against it. We're hoping to find
> some guidance on this list.
>
> Thankfully, the types of queries that we perform against this table are
> pretty constrained. We never update rows and we never join against other
> tables. The table essentially looks like this:
>
> | id | group_id | created_at | everything elseŠ
>
> Where `id' is the primary key, auto-incrementing, `group_id' is the
> foreign key that we always scope against, and `created_at' is the
> insertion time. We have indices against the primary key and the group_id.
> Our queries essentially fall into the following cases:
>
> * Š WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
> * Š WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
> * Š WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
> * Š WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
>
> In human words, we're looking for:
>
> * The most recent (20) rows.
> * The most recent rows after a given `id'.
> * Twenty rows before a given `id'.
> * Pages of twenty rows.
You can probably significantly optimize this. But first, can we see
some explain analyze for the affected queries?
merlin