Re: Tuning resource parameters for a logging database.

From: Alex Thurlow
Subject: Re: Tuning resource parameters for a logging database.
Msg-id: 4A15ADD9.6040600@blastro.com
In response to: Re: Tuning resource parameters for a logging database.  (Scott Marlowe <scott.marlowe@gmail.com>)
Responses: Re: Tuning resource parameters for a logging database.  (Scott Marlowe <scott.marlowe@gmail.com>)
List: pgsql-general
I should also specify how my inserts are happening, I guess.  I'm actually writing the logs to flat files and then inserting them into the database on an hourly basis using COPY, so I don't need to worry as much about insert speed as I do about reporting.
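
For reference, the hourly load amounts to something like this (the table, columns, and file path are hypothetical, not from this thread):

    -- bulk-load one hour of flat-file log lines in a single pass
    COPY logs (log_date, host, url, status)
        FROM '/var/log/app/current_hour.log' WITH CSV;

Note that server-side COPY ... FROM reads a file on the database server itself (and in that era required superuser); psql's \copy is the client-side equivalent.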

On 5/21/2009 2:36 PM, Scott Marlowe wrote:
On Thu, May 21, 2009 at 1:13 PM, Alex Thurlow <alex@blastro.com> wrote: 
I have a postgresql database that I'm using for logging of data. There's
basically one table where each row is a line from my log files.  It's
getting to a size where it's running very slow though.  There are about 10
million log lines per day and I keep 30 days of data in it. All the columns
I filter on are indexed (mostly I just use date).   
**DING DING DING**  you've just said the magic phrase that means
partitioning would be a help.
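
A minimal sketch of date partitioning as it worked in 2009-era PostgreSQL, using inheritance plus CHECK constraints (table and column names are hypothetical):

    -- parent table stays empty; each child holds one day of log lines
    CREATE TABLE logs (
        log_date  date NOT NULL,
        host      text,
        url       text
    );

    CREATE TABLE logs_2009_05_21 (
        CHECK (log_date = DATE '2009-05-21')
    ) INHERITS (logs);

    -- off by default in 8.3; lets the planner use the CHECK constraints
    SET constraint_exclusion = on;

    -- a date filter now skips every other day's child table entirely
    SELECT count(*) FROM logs WHERE log_date = DATE '2009-05-21';

Since the load is an hourly COPY, you can COPY straight into the current day's child and skip the usual insert-redirect trigger.  Expiring old data also becomes a near-instant DROP TABLE on the 31-day-old child instead of a 10-million-row DELETE plus vacuum.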
 
 And I tend to pull one
day of data at a time with grouped counts by 1 or 2 other columns.  There
also tends to be only 1 or 2 of these large queries running at any given
time, so a lot of resources can be thrown at each one.
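
A typical day's report in that shape might look like this (column names are hypothetical):

    SELECT host, status, count(*)
    FROM logs
    WHERE log_date = DATE '2009-05-21'
    GROUP BY host, status;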

I'm wondering what my resource parameters should be for optimal speed of the
selects on this database, since I haven't seen a good example where someone
has done anything like this.   
With a logging database you're optimizing for two often opposing
workloads.  On one side, lots of small inserts in a stream that HAVE
to get processed and stored efficiently.  That side is often served
with minimal shared_buffers and work_mem, because there's no need for
the overhead of large shared_buffers, and insert queries for logging
dbs don't need much work_mem.

With a reporting database you run queries that chew up tons of memory,
and both shared_buffers and work_mem need to be large for efficient
operation.
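
One way to serve both patterns from a single instance (a sketch; the role name is hypothetical): keep modest server-wide defaults for the load path and raise work_mem only where the reports run:

    -- in the reporting session only:
    SET work_mem = '256MB';

    -- or persistently, for a dedicated reporting role:
    ALTER ROLE reporting SET work_mem = '256MB';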
 
The machine is an 8-core Opteron (I know I won't really use all of those,
but Dell threw in the 2nd proc for free) with 8 GB RAM.  The database is
on a RAID 10 JFS partition.
Yeah, CPUs are cheap, might as well stock up on them.  A reporting
database can quickly go CPU-bound if everything the users want to see
fits in memory.
 
This is what I have in postgresql.conf right now:

shared_buffers = 64MB   
Small for reporting, just right for logging.  I'd try something bigger,
but not insanely huge.  Let the OS cache 90% of the data, and let the
db cache a good-sized working set.  256MB to 1GB is reasonable, based
on benchmarks of your own queries.
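
As a starting point in postgresql.conf (the midpoint here is an assumption to benchmark against your own queries, not a measured recommendation):

    shared_buffers = 512MB    # requires a server restart to take effect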
 
work_mem = 128MB   
Bigger than needed for logging, good for reporting.  You can probably
just leave it.
 
maintenance_work_mem = 256MB
max_fsm_pages = 614400   
If you're not partitioning, then this needs to be big enough to track
a day or more of dead rows.
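
Rough arithmetic, assuming on the order of 100 rows per 8KB page: deleting one day (10 million rows) touches roughly 100,000 pages, so the current max_fsm_pages = 614400 covers several days of churn, provided VACUUM runs at least daily.  With daily partitions you sidestep this entirely, since dropping a child table frees its space without going through the FSM.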

Look at lowering your random_page_cost, and at increasing
default_statistics_target to somewhere between 100 and 1000, depending
on your data and on EXPLAIN ANALYZE testing of your queries.
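
A sketch of that testing loop (the specific values are assumptions to vary, and the table and columns are hypothetical as above):

    SET random_page_cost = 2.0;            -- default 4.0; lower favors index scans
    SET default_statistics_target = 200;   -- default was 10 in 8.3
    ANALYZE logs;                          -- rebuild statistics at the new target
    EXPLAIN ANALYZE
    SELECT host, count(*)
    FROM logs
    WHERE log_date = DATE '2009-05-21'
    GROUP BY host;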
 


-- 
Alex Thurlow
Blastro Networks

http://www.blastro.com
http://www.roxwel.com
http://www.yallwire.com
