Re: Tuning resource parameters for a logging database.

From:        Alex Thurlow
Subject:     Re: Tuning resource parameters for a logging database.
Date:
Msg-id:      4A15ADD9.6040600@blastro.com
In reply to: Re: Tuning resource parameters for a logging database. (Scott Marlowe <scott.marlowe@gmail.com>)
Responses:   Re: Tuning resource parameters for a logging database. (Scott Marlowe <scott.marlowe@gmail.com>)
List:        pgsql-general
I should also specify how my inserts are happening, I guess. I'm actually writing the logs to flat files and then inserting them into the database on an hourly basis using COPY, so I don't need to worry as much about the log insert speed as I do about the reporting.
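For reference, each hourly batch is a single bulk load along these lines (the table, column, and file names here are illustrative, not the actual ones):

```sql
-- Hourly batch load: pull the last hour's flat log file into the
-- log table in one COPY, which is far cheaper than row-by-row INSERTs.
-- Table name, column list, and path are illustrative assumptions.
COPY log_lines (log_date, host, status, line)
    FROM '/var/log/app/2009052114.log'
    WITH DELIMITER E'\t';
```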
On 5/21/2009 2:36 PM, Scott Marlowe wrote:
> On Thu, May 21, 2009 at 1:13 PM, Alex Thurlow <alex@blastro.com> wrote:
>> I have a postgresql database that I'm using for logging of data. There's
>> basically one table where each row is a line from my log files. It's
>> getting to a size where it's running very slow though. There are about
>> 10 million log lines per day and I keep 30 days of data in it. All the
>> columns I filter on are indexed (mostly I just use date).
>
> **DING DING DING** you've just said the magic phrase that says that
> partitioning would be a help.
>
>> And I tend to pull one day of data at a time with grouped counts by 1 or
>> 2 other columns. There also tends to be only 1 or 2 of these large
>> queries running at any given time, so a lot of resources can be thrown
>> at each one. I'm wondering what my resource parameters should be for
>> optimal speed of the selects on this database, since I haven't seen a
>> good example where someone has done anything like this.
>
> With a logging database you're optimizing two often opposing actions: lots
> of small inserts in a stream that HAVE to get processed and put in
> efficiently. This is often accomplished with minimal shared_buffers and
> work_mem, because there's no need for the overhead of large
> shared_buffers, and insert queries for logging dbs don't need much
> work_mem. With a reporting database you run queries that chew up tons of
> memory, both shared_buffers and work_mem, for efficient operation.
>
>> The machine is an 8 core opteron (I know I won't really use those, but
>> Dell threw in the 2nd proc for free) with 8 GB RAM. The database is on a
>> RAID 10 JFS partition.
>
> Yeah, CPUs are cheap, might as well stock up on them. A reporting database
> can quickly go cpu bound if everything the users want to see fits in
> memory.
>
>> This is what I have in postgresql.conf right now:
>>
>> shared_buffers = 64MB
>
> Small for reporting, just right for logging. I'd try something bigger but
> not insanely huge. Let the OS do the caching of 90% of the data, let the
> db cache a good sized working set.
> 256MB to 1GB is reasonable, based on benchmarks of your own queries.
>
>> work_mem = 128MB
>
> Bigger than needed for logging, good for reporting. You can probably just
> leave it.
>
>> maintenance_work_mem = 256MB
>> max_fsm_pages = 614400
>
> If you're not partitioning then this needs to be big enough to contain a
> day's+ worth of dead rows. Look at lowering your random_page_cost, and
> increasing default_statistics_target to 100 to 1000, depending on your
> data and explain analyze query testing.
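A sketch of the date-based partitioning Scott suggests, using the inheritance-plus-CHECK-constraint approach available in the 8.x series (table and column names are illustrative, not from the thread):

```sql
-- Parent table: queries go here, data lives in the children.
CREATE TABLE log_lines (
    log_date  date NOT NULL,
    host      text,
    status    integer,
    line      text
);

-- One child table per day. The CHECK constraint lets the planner
-- skip irrelevant days when constraint_exclusion is enabled.
CREATE TABLE log_lines_2009_05_21 (
    CHECK (log_date = DATE '2009-05-21')
) INHERITS (log_lines);

CREATE INDEX log_lines_2009_05_21_date_idx
    ON log_lines_2009_05_21 (log_date);

-- COPY each hourly file straight into that day's child table.
-- Expiring day 31 becomes a cheap DROP TABLE instead of a huge
-- DELETE followed by vacuum work.
```

Note that `constraint_exclusion = on` must be set in postgresql.conf (or per session) for the planner to prune partitions.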
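Pulling the numbers from this exchange together, a postgresql.conf starting point for the reporting workload might look like the following (the exact values are starting guesses to benchmark against your own queries, not settled recommendations):

```
shared_buffers = 512MB            # reporting: somewhere in the 256MB-1GB range
work_mem = 128MB                  # fine as-is for the big grouped day reports
maintenance_work_mem = 256MB
max_fsm_pages = 614400            # must cover a day's dead rows if not partitioning
random_page_cost = 2.0            # lower than the 4.0 default, per the advice above
default_statistics_target = 100   # raise toward 1000 if EXPLAIN ANALYZE warrants it
```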
--
Alex Thurlow
Blastro Networks

http://www.blastro.com
http://www.roxwel.com
http://www.yallwire.com