Re: Large Database Performance suggestions
From | Scott Marlowe |
---|---|
Subject | Re: Large Database Performance suggestions |
Date | |
Msg-id | 1098421413.21035.78.camel@localhost.localdomain |
In reply to | Large Database Performance suggestions (Joshua Marsh <icub3d@gmail.com>) |
List | pgsql-performance |
On Thu, 2004-10-21 at 21:14, Joshua Marsh wrote:
> Hello everyone,
>
> I am currently working on a data project that uses PostgreSQL
> extensively to store, manage and maintain the data. We haven't had
> any problems regarding database size until recently. The three major
> tables we use never get bigger than 10 million records. With this
> size, we can do things like storing the indexes or even the tables in
> memory to allow faster access.
>
> Recently, we have found customers who are wanting to use our service
> with data files between 100 million and 300 million records. At that
> size, each of the three major tables will hold between 150 million and
> 700 million records. At this size, I can't expect it to run queries
> in 10-15 seconds (what we can do with 10 million records), but would
> prefer to keep them all under a minute.
>
> We did some original testing on a server with 8GB of RAM and found we
> can do operations on data files up to 50 million records fairly well,
> but performance drops dramatically after that. Does anyone have any
> suggestions on a good way to improve performance for these extra large
> tables? Things that have come to mind are replication and Beowulf
> clusters, but from what I have recently studied, these don't do so well
> with singular processes. We will have parallel processes running, but
> it's more important that each individual process be fast than that
> several parallel processes run at once.

I'd assume that what's happening is that up to a certain data set size
it all fits in memory, and past that point you go from being CPU/memory
bandwidth limited to being I/O limited. If this is the case, then a
faster storage subsystem is the only real answer. If the database is
mostly read, then a large RAID 5 or RAID 1+0 array should help quite a
bit.

You might wanna post some explain analyze output for the queries that
slow down past a certain size, along with the schema for those tables,
etc...
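For reference, gathering that output just means prefixing the slow query with EXPLAIN ANALYZE in psql; something along these lines (the table and column names here are made up for illustration, not taken from the poster's schema):

```sql
-- Hypothetical example: run the slow query under EXPLAIN ANALYZE
-- and post the resulting plan, with actual row counts and timings.
EXPLAIN ANALYZE
SELECT c.region, count(*)
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.order_date >= '2004-01-01'
GROUP BY c.region;
```

The plan output shows, for each node, the estimated versus actual rows and the time spent, which is what makes it possible to see whether the planner is choosing sequential scans over indexes once the tables outgrow memory.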