Thread: Performance on large, append-only tables

Performance on large, append-only tables

From: David Yeu
Date:
Hi there,

We've got a pretty large table that sees millions of new rows a day, and
we're trying our best to optimize queries against it. We're hoping to find
some guidance on this list.

Thankfully, the types of queries that we perform against this table are
pretty constrained. We never update rows and we never join against other
tables. The table essentially looks like this:

| id | group_id | created_at | everything else…

Where `id' is the primary key, auto-incrementing, `group_id' is the
foreign key that we always scope against, and `created_at' is the
insertion time. We have indices against the primary key and the group_id.
Our queries essentially fall into the following cases:

 * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
 * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
 * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
 * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

In human words, we're looking for:

 * The most recent (20) rows.
 * The most recent rows after a given `id'.
 * Twenty rows before a given `id'.
 * Pages of twenty rows.

Originally, this table was part of our primary database, but recently we
saw queries take upwards of thirty seconds to complete. Since we're
serving web requests, that's basically unacceptable, and it caused a lot
of requests to back up. Our interim solution has been to simply carve
out a new database that hosts only this table, and that has worked to some
degree. We aren't seeing thirty-second-plus database response times
anymore, but some queries still take many seconds, and the cost of spinning
up a new master-slave configuration hasn't been cheap.

In the meantime, we're hoping to investigate other ways to optimize this
table and the queries against it. Heroku's data team has suggested balling
up these rows into arrays, where a single row would represent a group_id,
and the data would occupy a single column as an array. We don't have any
experience with this and were wondering if anyone here has tried it.

And finally, we're also trying out alternative stores, since it seems like
this data and its retrieval could be well suited to document-oriented
backends. Redis and DynamoDB are currently the best contenders.

Thanks in advance for any help,

Regards,

Dave Yeu & Neil Sarkar
GroupMe



Re: Performance on large, append-only tables

From: Merlin Moncure
Date:
On Wed, Feb 8, 2012 at 12:03 PM, David Yeu <david.yeu@skype.net> wrote:
> Hi there,
>
> We've got a pretty large table that sees millions of new rows a day, and
> we're trying our best to optimize queries against it. We're hoping to find
> some guidance on this list.
>
> Thankfully, the types of queries that we perform against this table are
> pretty constrained. We never update rows and we never join against other
> tables. The table essentially looks like this:
>
> | id | group_id | created_at | everything else…
>
> Where `id' is the primary key, auto-incrementing, `group_id' is the
> foreign key that we always scope against, and `created_at' is the
> insertion time. We have indices against the primary key and the group_id.
> Our queries essentially fall into the following cases:
>
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
>  * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
>
> In human words, we're looking for:
>
>  * The most recent (20) rows.
>  * The most recent rows after a given `id'.
>  * Twenty rows before a given `id'.
>  * Pages of twenty rows.

You can probably significantly optimize this.  But first, can we see
some explain analyze for the affected queries?
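
For example, something like this for each of the query shapes (the table
name here is a placeholder, and substitute a real value for the ?; the
BUFFERS option needs 9.0+, plain EXPLAIN ANALYZE works on older versions):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM your_table
    WHERE group_id = ?
    ORDER BY created_at DESC LIMIT 20;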

merlin

Re: Performance on large, append-only tables

From: Claudio Freire
Date:
On Wed, Feb 8, 2012 at 3:03 PM, David Yeu <david.yeu@skype.net> wrote:
> Thankfully, the types of queries that we perform against this table are
> pretty constrained. We never update rows and we never join against other
> tables. The table essentially looks like this:
>
> | id | group_id | created_at | everything else…
...
> Our queries essentially fall into the following cases:
>
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
>  * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

I think you have something to gain from partitioning.
You could partition on group_id, which is akin to sharding, only on a
single server, and that would significantly decrease each partition's
index size. Since those queries' performance is highly dependent on
index size, and since you seem to have such a huge table, I would
imagine such partitioning would help keep the indices performant.
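
A rough, untested sketch of what that could look like with
inheritance-based partitioning (the table name, range boundaries and
index columns are just placeholders):

    CREATE TABLE messages_g0 (
        CHECK (group_id >= 0 AND group_id < 1000000)
    ) INHERITS (messages);
    CREATE TABLE messages_g1 (
        CHECK (group_id >= 1000000 AND group_id < 2000000)
    ) INHERITS (messages);
    -- one child table per group_id range, each with its own, much smaller index:
    CREATE INDEX messages_g0_group_created_idx ON messages_g0 (group_id, created_at);
    CREATE INDEX messages_g1_group_created_idx ON messages_g1 (group_id, created_at);

Inserts then need to be routed to the right child (via a trigger or in
the application), and with constraint_exclusion = partition the planner
only visits the one child whose CHECK constraint matches WHERE group_id = ?.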

Now, we do need statistics. How many groups are there? Do they grow
with your table, or is the number of groups constant? Which values of
offsets do you use? (offset is quite expensive)

And of course... explain analyze.

Re: Performance on large, append-only tables

From: Marti Raudsepp
Date:
On Wed, Feb 8, 2012 at 20:03, David Yeu <david.yeu@skype.net> wrote:
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
>  * Pages of twenty rows.

A good improvement for this sort of query is the "scalable paging"
trick. Instead of increasing the OFFSET argument -- which means that
Postgres has to scan more and more rows -- you remember the index key
where the last page ended.

In other words, you get the first page using:
WHERE group_id = ? ORDER BY created_at DESC LIMIT 20

Say this page returns created_at values between 2012-01-01 and
2012-01-10 (the oldest row on the page being 2012-01-01). If the user
clicks "next page", you run a query like this instead:
WHERE group_id = ? AND created_at < '2012-01-01' ORDER BY created_at DESC LIMIT 20

Thus, every "next page" fetch takes constant time. Of course
there's a small problem when two rows have equal times. Then, you can
add the primary key to the sort key to disambiguate those rows:

WHERE group_id = ? AND (created_at, pkey_col) < ('2012-01-01', 712)
ORDER BY created_at DESC, pkey_col DESC LIMIT 20

Of course an index on (group_id, created_at) or (group_id, created_at,
pkey_col) is necessary for these to work well.
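
For example (CONCURRENTLY avoids blocking writes while the index builds;
the table and index names here are just placeholders):

    CREATE INDEX CONCURRENTLY messages_group_created_pkey_idx
        ON messages (group_id, created_at, pkey_col);

Scanned backwards, the same index also covers the plain
ORDER BY created_at DESC LIMIT 20 case for a given group_id.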

Regards,
Marti

Re: Performance on large, append-only tables

From: Tom Lane
Date:
David Yeu <david.yeu@skype.net> writes:
> Our queries essentially fall into the following cases:

>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
>  * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

All of those should be extremely cheap if you've got the right indexes,
with the exception of the last one.  Large OFFSET values are never a
good idea, because Postgres always has to scan and discard that many
rows.  If you need to fetch successive pages, consider using a cursor
with a series of FETCH commands.  Another possibility, if the data is
sufficiently constrained, is to move the limit point with each new
query, ie instead of OFFSET use something like

    WHERE group_id = ? AND created_at < last-previous-value
    ORDER BY created_at DESC LIMIT 20;
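
The cursor approach could look roughly like this (names are placeholders,
and the cursor has to live inside a transaction unless it's declared
WITH HOLD):

    BEGIN;
    DECLARE recent_rows CURSOR FOR
        SELECT * FROM messages WHERE group_id = ? ORDER BY created_at DESC;
    FETCH 20 FROM recent_rows;   -- first page
    FETCH 20 FROM recent_rows;   -- next page, continues where the last FETCH stopped
    CLOSE recent_rows;
    COMMIT;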

            regards, tom lane

Re: Performance on large, append-only tables

From: "Kevin Grittner"
Date:
David Yeu <david.yeu@skype.net> wrote:

> We have indices against the primary key and the group_id.
> Our queries essentially fall into the following cases:
>
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
>  * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
>
> In human words, we're looking for:
>
>  * The most recent (20) rows.
>  * The most recent rows after a given `id'.
>  * Twenty rows before a given `id'.
>  * Pages of twenty rows.

The first thing I would try is building an index (perhaps
CONCURRENTLY to avoid disrupting production) on (group_id,
created_at).  It might also be worth creating an index on (group_id,
id, created_at), but that's a less-sure win.
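
Concretely, something along these lines (index names are placeholders,
and substitute your actual table name):

    CREATE INDEX CONCURRENTLY my_table_group_created_idx
        ON my_table (group_id, created_at);
    CREATE INDEX CONCURRENTLY my_table_group_id_created_idx
        ON my_table (group_id, id, created_at);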

> Originally, this table was part of our primary database, but
> recently we saw queries take upwards of thirty seconds or more to
> complete. Since we're serving web requests, that's basically
> unacceptable, and caused a lot of requests to backup.

With only the indexes you mention, it had to be doing either
complete table scans for each request, or a lot of random access to
rows it didn't need.

> Our interim solution has been to simply carve out a new database
> that hosts only this table, and that has worked to some degree. We
> aren't seeing thirty seconds plus database response times anymore,
> but some queries still take many seconds and the cost of spinning
> up a new master-slave configuration hasn't been cheap.

Well, throwing hardware at something doesn't generally hurt, but
it's not the first solution I would try, especially when the product
you're using has ways to tune performance.

> In the meantime, we're hoping to investigate other ways to
> optimize this table and the queries against it. Heroku's data team
> has suggested balling up these rows into arrays, where a single
> row would represent a group_id, and the data would occupy a single
> column as an array.

Ugh.  You're a long way from needing to give up on the relational
model here.

> And finally, we're also trying out alternative stores, since it
> seems like this data and its retrieval could be well suited to
> document-oriented backends. Redis and DynamoDB are currently the
> best contenders.

Your current use of PostgreSQL is more or less equivalent to driving
a car around in first gear.  You might consider a tuned PostgreSQL
as another alternative store.  :-)

-Kevin

Re: Performance on large, append-only tables

From: David Yeu
Date:
Yeah, Reply-All...

Begin forwarded message:

> From: David Yeu <david.yeu@skype.net>
> Subject: Re: [PERFORM] Performance on large, append-only tables
> Date: February 10, 2012 10:59:04 AM EST
> To: Merlin Moncure <mmoncure@gmail.com>
>
> On Feb 10, 2012, at 10:19 AM, Merlin Moncure wrote:
>
>> You can probably significantly optimize this.  But first, can we see
>> some explain analyze for the affected queries?
>
> Sorry, we should have included these in the original post. Here's the EXPLAIN output for an "id < ?" query:
>
>
> => EXPLAIN ANALYZE SELECT "lines".* FROM "lines" WHERE (lines.deleted_at IS NULL) AND ("lines".group_id = ?) AND (id < ?) ORDER BY id DESC LIMIT 20 OFFSET 0;
>                                                                       QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=9267.44..9267.45 rows=20 width=1321) (actual time=348.844..348.877 rows=20 loops=1)
>   ->  Sort  (cost=9267.44..9269.76 rows=4643 width=1321) (actual time=348.840..348.852 rows=20 loops=1)
>         Sort Key: id
>         Sort Method:  top-N heapsort  Memory: 29kB
>         ->  Index Scan using index_lines_on_group_id on lines  (cost=0.00..9242.73 rows=4643 width=1321) (actual time=6.131..319.835 rows=23038 loops=1)
>               Index Cond: (group_id = ?)
>               Filter: ((deleted_at IS NULL) AND (id < ?))
> Total runtime: 348.987 ms
>
>
> A quick suggestion from Heroku yesterday was a new index on (group_id, id). After adding it to a database fork, we ended up with:
>
>
> => EXPLAIN ANALYZE SELECT "lines".* FROM "lines" WHERE (lines.deleted_at IS NULL) AND ("lines".group_id = ?) AND (id < ?) ORDER BY id DESC LIMIT 20 OFFSET 0;
>                                                                            QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=0.00..28.88 rows=20 width=1321) (actual time=17.216..109.905 rows=20 loops=1)
>   ->  Index Scan Backward using index_lines_on_group_id_and_id on lines  (cost=0.00..6416.04 rows=4443 width=1321) (actual time=17.207..109.867 rows=20 loops=1)
>         Index Cond: ((group_id = ?) AND (id < ?))
>         Filter: (deleted_at IS NULL)
> Total runtime: 110.039 ms
>
>
> The result has been pretty dramatic for the "id <> ?" queries, which make up the bulk of the queries. Running a whole bunch of EXPLAIN ANALYZE queries also showed that some queries were actually choosing to use the index on `id' instead of `group_id', and that performed about as poorly as expected. Thankfully, the new index on (group_id, id) seems to be preferable nearly always.
>
> And for reference, here's the EXPLAIN for the LIMIT, OFFSET query:
>
>
> => EXPLAIN ANALYZE SELECT "lines".* FROM "lines" WHERE (lines.deleted_at IS NULL) AND ("lines".group_id = ?) ORDER BY id DESC LIMIT 20 OFFSET 60;
>                                                                      QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=9274.45..9274.46 rows=20 width=1321) (actual time=109.674..109.708 rows=20 loops=1)
>   ->  Sort  (cost=9274.42..9276.75 rows=4646 width=1321) (actual time=109.606..109.657 rows=80 loops=1)
>         Sort Key: id
>         Sort Method:  top-N heapsort  Memory: 43kB
>         ->  Index Scan using index_lines_on_group_id on lines  (cost=0.00..9240.40 rows=4646 width=1321) (actual time=0.117..98.905 rows=7999 loops=1)
>               Index Cond: (group_id = ?)
>               Filter: (deleted_at IS NULL)
> Total runtime: 109.753 ms
>
>
> - Dave
>



Re: Performance on large, append-only tables

From: Claudio Freire
Date:
On Fri, Feb 10, 2012 at 1:19 PM, David Yeu <david.yeu@skype.net> wrote:
>> => EXPLAIN ANALYZE SELECT "lines".* FROM "lines" WHERE (lines.deleted_at IS NULL) AND ("lines".group_id = ?) AND (id < ?) ORDER BY id DESC LIMIT 20 OFFSET 0;

Interesting...

Do you have many "deleted" rows?
Do you always filter them out like this?

Because in that case, you can add the condition to the indices to
exclude deleted rows from the index. This is a big win if you have
many deleted rows; the only catch is that the index predicate has to
match the condition used in the query exactly (verbatim).

That, and an index on "(group_id, created_at) WHERE (deleted_at IS
NULL)" to catch the sorted-by-date kind of query, and you'll be done, I
think.
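
For example, using the table and column names from your EXPLAIN output
(the index names are made up, and this is untested):

    CREATE INDEX CONCURRENTLY index_lines_on_group_id_and_id_live
        ON lines (group_id, id) WHERE deleted_at IS NULL;
    CREATE INDEX CONCURRENTLY index_lines_on_group_id_and_created_at_live
        ON lines (group_id, created_at) WHERE deleted_at IS NULL;

Queries then have to keep the deleted_at IS NULL condition spelled out
so the planner will consider these partial indexes.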

Re: Performance on large, append-only tables

From: David Yeu
Date:
On Feb 10, 2012, at 11:26 AM, Claudio Freire wrote:
> That, and an index on "(group_id, created_at) where (deleted_at IS
> NULL)" to catch the sorted by date kind of query, and you'll be done I
> think.

Yeah, I didn't quite get that right -- we're actually sorting all these queries by "id DESC", not "created_at DESC", so that seems to obviate the need for any index on created_at.

Dave




Re: Performance on large, append-only tables

From: Claudio Freire
Date:
On Fri, Feb 10, 2012 at 1:45 PM, David Yeu <david.yeu@skype.net> wrote:
> On Feb 10, 2012, at 11:26 AM, Claudio Freire wrote:
>> That, and an index on "(group_id, created_at) where (deleted_at IS
>> NULL)" to catch the sorted by date kind of query, and you'll be done I
>> think.
>
> Yeah, I didn't quite get that right -- we're actually sorting all these queries by "id DESC", not "created_at DESC", so that seems to obviate the need for any index on created_at.

From your OP:

>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;

Re: Performance on large, append-only tables

From: David Yeu
Date:
On Feb 10, 2012, at 11:58 AM, Claudio Freire wrote:

> From your OP:
>
>> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;

Yup, sorry.

Dave



Re: Performance on large, append-only tables

From: Claudio Freire
Date:
On Fri, Feb 10, 2012 at 2:00 PM, David Yeu <david.yeu@skype.net> wrote:
>> From your OP:
>>
>>> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
>
> Yup, sorry.

Ah, ok, so that should do it.
If you need further improvement, remember to take a look at the deleted stuff.