Discussion: Performance on large, append-only tables
Hi there,

We've got a pretty large table that sees millions of new rows a day, and
we're trying our best to optimize queries against it. We're hoping to find
some guidance on this list.

Thankfully, the types of queries that we perform against this table are
pretty constrained. We never update rows and we never join against other
tables. The table essentially looks like this:

| id | group_id | created_at | everything else…

Where `id' is the primary key, auto-incrementing, `group_id' is the
foreign key that we always scope against, and `created_at' is the
insertion time. We have indices against the primary key and the group_id.
Our queries essentially fall into the following cases:

* … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
* … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
* … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
* … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

In human words, we're looking for:

* The most recent (20) rows.
* The most recent rows after a given `id'.
* Twenty rows before a given `id'.
* Pages of twenty rows.

Originally, this table was part of our primary database, but recently we
saw queries take upwards of thirty seconds to complete. Since we're
serving web requests, that's basically unacceptable, and it caused a lot
of requests to back up. Our interim solution has been to simply carve out
a new database that hosts only this table, and that has worked to some
degree. We aren't seeing thirty-second-plus database response times
anymore, but some queries still take many seconds, and the cost of
spinning up a new master-slave configuration hasn't been cheap.

In the meantime, we're hoping to investigate other ways to optimize this
table and the queries against it. Heroku's data team has suggested
balling up these rows into arrays, where a single row would represent a
group_id and the data would occupy a single column as an array. We don't
have any experience with this and were wondering if anyone here has
tried it.

And finally, we're also trying out alternative stores, since it seems
like this data and its retrieval could be well suited to
document-oriented backends. Redis and DynamoDB are currently the best
contenders.

Thanks in advance for any help,

Regards,

Dave Yeu & Neil Sarkar
GroupMe
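For reference, a minimal sketch of the table as described above. The types
are assumptions, the foreign-key target and the "everything else" columns
are omitted, and the deleted_at column only shows up later in the thread's
EXPLAIN output:

  CREATE TABLE lines (
      id         bigserial PRIMARY KEY,               -- auto-incrementing
      group_id   bigint      NOT NULL,                -- always scoped against
      created_at timestamptz NOT NULL DEFAULT now(),  -- insertion time
      deleted_at timestamptz                          -- soft delete
      -- ... everything else
  );

  CREATE INDEX index_lines_on_group_id ON lines (group_id);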
On Wed, Feb 8, 2012 at 12:03 PM, David Yeu <david.yeu@skype.net> wrote:
> Hi there,
>
> We've got a pretty large table that sees millions of new rows a day, and
> we're trying our best to optimize queries against it. We're hoping to find
> some guidance on this list.
>
> Thankfully, the types of queries that we perform against this table are
> pretty constrained. We never update rows and we never join against other
> tables. The table essentially looks like this:
>
> | id | group_id | created_at | everything else…
>
> Where `id' is the primary key, auto-incrementing, `group_id' is the
> foreign key that we always scope against, and `created_at' is the
> insertion time. We have indices against the primary key and the group_id.
> Our queries essentially fall into the following cases:
>
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
> * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
> * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
>
> In human words, we're looking for:
>
> * The most recent (20) rows.
> * The most recent rows after a given `id'.
> * Twenty rows before a given `id'.
> * Pages of twenty rows.

You can probably significantly optimize this.  But first, can we see
some explain analyze for the affected queries?

merlin
On Wed, Feb 8, 2012 at 3:03 PM, David Yeu <david.yeu@skype.net> wrote:
> Thankfully, the types of queries that we perform against this table are
> pretty constrained. We never update rows and we never join against other
> tables. The table essentially looks like this:
>
> | id | group_id | created_at | everything else…
...
> Our queries essentially fall into the following cases:
>
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
> * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
> * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

I think you have something to gain from partitioning. You could
partition on group_id, which is akin to sharding, only on a single
server, and that would significantly decrease each partition's index
size. Since those queries' performance is highly dependent on index
size, and since you seem to have such a huge table, I would imagine
such partitioning would help keep the indices performant.

Now, we do need statistics. How many groups are there? Do they grow
with your table, or is the number of groups roughly constant? Which
offset values do you use? (OFFSET is quite expensive.)

And of course... explain analyze.
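For illustration, a rough sketch of what inheritance-based partitioning on
group_id could look like. The child table names, range boundaries, and the
two-way split are made up for the example; a real setup would use many more
partitions and also needs an insert trigger or application-side routing:

  CREATE TABLE lines_g0 (CHECK (group_id >= 0       AND group_id < 1000000)) INHERITS (lines);
  CREATE TABLE lines_g1 (CHECK (group_id >= 1000000 AND group_id < 2000000)) INHERITS (lines);

  -- Each child carries its own, much smaller index.
  CREATE INDEX lines_g0_group_id_id ON lines_g0 (group_id, id);
  CREATE INDEX lines_g1_group_id_id ON lines_g1 (group_id, id);

  -- With constraint_exclusion = partition (the default), a query such as
  --   SELECT * FROM lines WHERE group_id = 1500123 ORDER BY id DESC LIMIT 20;
  -- only has to visit lines_g1 and its index.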
On Wed, Feb 8, 2012 at 20:03, David Yeu <david.yeu@skype.net> wrote:
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
> * Pages of twenty rows.

A good improvement for this sort of query is the "scalable paging"
trick. Instead of increasing the OFFSET argument -- which means that
Postgres has to scan and discard more and more rows -- you should
remember the index key where the last page ended.

In other words, you get the first page using:

  WHERE group_id = ? ORDER BY created_at DESC LIMIT 20

Say this page returns created_at values between 2012-01-01 (the oldest
row on the page) and 2012-01-10 (the newest). If the user clicks "next
page", you run a query like this instead:

  WHERE group_id = ? AND created_at < '2012-01-01'
  ORDER BY created_at DESC LIMIT 20

Thus, every "next page" fetch takes roughly constant time.

Of course there's a small problem when two rows have equal times: the
boundary row can be skipped or repeated. You can add the primary key to
the sort key to disambiguate those rows:

  WHERE group_id = ? AND (created_at, pkey_col) < ('2012-01-01', 712)
  ORDER BY created_at DESC, pkey_col DESC LIMIT 20

Of course, an index on (group_id, created_at) or (group_id, created_at,
pkey_col) is necessary for these to work well.

Regards,
Marti
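Filled out against the `lines' table from this thread (the $n parameters
and the index name are placeholders), the keyset approach could look like
the sketch below; a plain ascending index still serves the DESC ordering
via a backward index scan:

  CREATE INDEX lines_group_created_id ON lines (group_id, created_at, id);

  -- First page:
  SELECT *
    FROM lines
   WHERE group_id = $1
   ORDER BY created_at DESC, id DESC
   LIMIT 20;

  -- Next (older) page, seeded with the created_at/id of the last row shown:
  SELECT *
    FROM lines
   WHERE group_id = $1
     AND (created_at, id) < ($2, $3)
   ORDER BY created_at DESC, id DESC
   LIMIT 20;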
David Yeu <david.yeu@skype.net> writes:
> Our queries essentially fall into the following cases:
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
> * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
> * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

All of those should be extremely cheap if you've got the right indexes,
with the exception of the last one. Large OFFSET values are never a good
idea, because Postgres always has to scan and discard that many rows. If
you need to fetch successive pages, consider using a cursor with a
series of FETCH commands. Another possibility, if the data is
sufficiently constrained, is to move the limit point with each new
query, ie instead of OFFSET use something like

  WHERE group_id = ? AND created_at < last-previous-value
  ORDER BY created_at DESC LIMIT 20;

regards, tom lane
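A minimal sketch of the cursor variant mentioned here (the cursor name and
the literal group value are placeholders; a cursor only survives for the
duration of its transaction unless declared WITH HOLD):

  BEGIN;

  DECLARE lines_page CURSOR FOR
      SELECT * FROM lines
       WHERE group_id = 42
       ORDER BY created_at DESC;

  FETCH 20 FROM lines_page;   -- page 1
  FETCH 20 FROM lines_page;   -- page 2, without re-scanning the first 20 rows
  CLOSE lines_page;

  COMMIT;

For a stateless web application this only pays off if the whole paging
session fits inside one transaction (or a WITH HOLD cursor on a pinned
connection), which is why the "move the limit point" variant is usually
the more practical of the two.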
David Yeu <david.yeu@skype.net> wrote:

> We have indices against the primary key and the group_id.
> Our queries essentially fall into the following cases:
>
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
> * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
> * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
>
> In human words, we're looking for:
>
> * The most recent (20) rows.
> * The most recent rows after a given `id'.
> * Twenty rows before a given `id'.
> * Pages of twenty rows.

The first thing I would try is building an index (perhaps
CONCURRENTLY to avoid disrupting production) on (group_id,
created_at). It might also be worth creating an index on (group_id,
id, created_at), but that's a less-sure win.

> Originally, this table was part of our primary database, but
> recently we saw queries take upwards of thirty seconds to
> complete. Since we're serving web requests, that's basically
> unacceptable, and it caused a lot of requests to back up.

With only the indexes you mention, it had to be doing either
complete table scans for each request, or a lot of random access to
rows it didn't need.

> Our interim solution has been to simply carve out a new database
> that hosts only this table, and that has worked to some degree. We
> aren't seeing thirty-second-plus database response times anymore,
> but some queries still take many seconds, and the cost of spinning
> up a new master-slave configuration hasn't been cheap.

Well, throwing hardware at something doesn't generally hurt, but
it's not the first solution I would try, especially when the
product you're using has ways to tune performance.

> In the meantime, we're hoping to investigate other ways to
> optimize this table and the queries against it. Heroku's data team
> has suggested balling up these rows into arrays, where a single
> row would represent a group_id and the data would occupy a single
> column as an array.

Ugh. You're a long way from needing to give up on the relational
model here.

> And finally, we're also trying out alternative stores, since it
> seems like this data and its retrieval could be well suited to
> document-oriented backends. Redis and DynamoDB are currently the
> best contenders.

Your current use of PostgreSQL is more or less equivalent to
driving a car around in first gear. You might consider a tuned
PostgreSQL as another alternative store. :-)

-Kevin
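As a concrete sketch of that suggestion, using the table name that shows
up later in the thread's EXPLAIN output (the index names are illustrative,
and CREATE INDEX CONCURRENTLY cannot run inside a transaction block):

  CREATE INDEX CONCURRENTLY index_lines_on_group_id_and_created_at
      ON lines (group_id, created_at);

  -- The "less-sure win":
  CREATE INDEX CONCURRENTLY index_lines_on_group_id_id_and_created_at
      ON lines (group_id, id, created_at);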
Yeah, Reply-All...

Begin forwarded message:

> From: David Yeu <david.yeu@skype.net>
> Subject: Re: [PERFORM] Performance on large, append-only tables
> Date: February 10, 2012 10:59:04 AM EST
> To: Merlin Moncure <mmoncure@gmail.com>
>
> On Feb 10, 2012, at 10:19 AM, Merlin Moncure wrote:
>
>> You can probably significantly optimize this.  But first, can we see
>> some explain analyze for the affected queries?
>
> Sorry, we should have included these in the original post. Here's the
> EXPLAIN output for an "id < ?" query:
>
> => EXPLAIN ANALYZE SELECT "lines".* FROM "lines" WHERE (lines.deleted_at IS NULL) AND ("lines".group_id = ?) AND (id < ?) ORDER BY id DESC LIMIT 20 OFFSET 0;
>                                                                        QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=9267.44..9267.45 rows=20 width=1321) (actual time=348.844..348.877 rows=20 loops=1)
>    ->  Sort  (cost=9267.44..9269.76 rows=4643 width=1321) (actual time=348.840..348.852 rows=20 loops=1)
>          Sort Key: id
>          Sort Method: top-N heapsort  Memory: 29kB
>          ->  Index Scan using index_lines_on_group_id on lines  (cost=0.00..9242.73 rows=4643 width=1321) (actual time=6.131..319.835 rows=23038 loops=1)
>                Index Cond: (group_id = ?)
>                Filter: ((deleted_at IS NULL) AND (id < ?))
>  Total runtime: 348.987 ms
>
> A quick suggestion from Heroku yesterday was a new index on (group_id,
> id). After adding it to a database fork, we ended up with:
>
> => EXPLAIN ANALYZE SELECT "lines".* FROM "lines" WHERE (lines.deleted_at IS NULL) AND ("lines".group_id = ?) AND (id < ?) ORDER BY id DESC LIMIT 20 OFFSET 0;
>                                                                           QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..28.88 rows=20 width=1321) (actual time=17.216..109.905 rows=20 loops=1)
>    ->  Index Scan Backward using index_lines_on_group_id_and_id on lines  (cost=0.00..6416.04 rows=4443 width=1321) (actual time=17.207..109.867 rows=20 loops=1)
>          Index Cond: ((group_id = ?) AND (id < ?))
>          Filter: (deleted_at IS NULL)
>  Total runtime: 110.039 ms
>
> The result has been pretty dramatic for the "id < ?" queries, which make
> up the bulk of the queries. Running a whole bunch of EXPLAIN ANALYZE
> queries also showed that some queries were actually choosing to use the
> index on `id' instead of `group_id', and that performed about as poorly
> as expected. Thankfully, the new index on (group_id, id) seems to be
> preferable nearly always.
>
> And for reference, here's the EXPLAIN for the LIMIT, OFFSET query:
>
> => EXPLAIN ANALYZE SELECT "lines".* FROM "lines" WHERE (lines.deleted_at IS NULL) AND ("lines".group_id = ?) ORDER BY id DESC LIMIT 20 OFFSET 60;
>                                                                       QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=9274.45..9274.46 rows=20 width=1321) (actual time=109.674..109.708 rows=20 loops=1)
>    ->  Sort  (cost=9274.42..9276.75 rows=4646 width=1321) (actual time=109.606..109.657 rows=80 loops=1)
>          Sort Key: id
>          Sort Method: top-N heapsort  Memory: 43kB
>          ->  Index Scan using index_lines_on_group_id on lines  (cost=0.00..9240.40 rows=4646 width=1321) (actual time=0.117..98.905 rows=7999 loops=1)
>                Index Cond: (group_id = ?)
>                Filter: (deleted_at IS NULL)
>  Total runtime: 109.753 ms
>
> - Dave
On Fri, Feb 10, 2012 at 1:19 PM, David Yeu <david.yeu@skype.net> wrote:
>> => EXPLAIN ANALYZE SELECT "lines".* FROM "lines" WHERE (lines.deleted_at IS NULL) AND ("lines".group_id = ?) AND (id < ?) ORDER BY id DESC LIMIT 20 OFFSET 0;

Interesting...

Do you have many "deleted" rows? Do you always filter them out like
this?

Because in that case, you can add the condition to the indices to
exclude deleted rows from the index. This is a big win if you have many
deleted rows; the only requirement is that the index predicate be the
same (verbatim) as the condition used in the query.

That, and an index on "(group_id, created_at) WHERE deleted_at IS NULL"
to catch the sorted-by-date kind of query, and you'll be done, I think.
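A sketch of the partial indexes being described, using the table and
column names from the EXPLAIN output above (the index names are
illustrative):

  -- Serves the id-based queries, and stays much smaller if many rows
  -- are soft-deleted.
  CREATE INDEX CONCURRENTLY index_lines_live_on_group_id_and_id
      ON lines (group_id, id)
      WHERE deleted_at IS NULL;

  -- Serves the ORDER BY created_at DESC variants.
  CREATE INDEX CONCURRENTLY index_lines_live_on_group_id_and_created_at
      ON lines (group_id, created_at)
      WHERE deleted_at IS NULL;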
On Feb 10, 2012, at 11:26 AM, Claudio Freire wrote:

> That, and an index on "(group_id, created_at) WHERE deleted_at IS NULL"
> to catch the sorted-by-date kind of query, and you'll be done, I think.

Yeah, I didn't quite get that right -- we're actually sorting all these
queries by "id DESC", not "created_at DESC", so that seems to obviate
the need for any index on created_at.

Dave
On Fri, Feb 10, 2012 at 1:45 PM, David Yeu <david.yeu@skype.net> wrote:
> On Feb 10, 2012, at 11:26 AM, Claudio Freire wrote:
>> That, and an index on "(group_id, created_at) WHERE deleted_at IS NULL"
>> to catch the sorted-by-date kind of query, and you'll be done, I think.
>
> Yeah, I didn't quite get that right -- we're actually sorting all these
> queries by "id DESC", not "created_at DESC", so that seems to obviate
> the need for any index on created_at.

From your OP:

> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
On Feb 10, 2012, at 11:58 AM, Claudio Freire wrote:

> From your OP:
>
>> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;

Yup, sorry.

Dave
On Fri, Feb 10, 2012 at 2:00 PM, David Yeu <david.yeu@skype.net> wrote:
>> From your OP:
>>
>>> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
>
> Yup, sorry.

Ah, ok, so that should do it.

If you need further improvement, remember to take a look at the
deleted stuff.