Re: Speed / Server

From Anthony Presley
Subject Re: Speed / Server
Date
Msg-id 1254867456.3477.312.camel@speedy.resolution.com
In reply to Re: Speed / Server  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-performance
On Tue, 2009-10-06 at 17:16 -0400, Merlin Moncure wrote:
> On Sun, Oct 4, 2009 at 6:45 PM,  <anthony@resolution.com> wrote:
> > All:
> >
> > We have a web-application which is growing ... fast.  We're currently
> > running on (1) quad-core Xeon 2.0Ghz with a RAID-1 setup, and 8GB of RAM.
> >
> > Our application collects a lot of sensor data, which means that we have 1
> > table which has about 8 million rows, and we're adding about 2.5 million
> > rows per month.
> >
> > The problem is, this next year we're anticipating significant growth,
> > where we may be adding more like 20 million rows per month (roughly 15GB
> > of data).
> >
> > A row of data might have:
> >  The system identifier (int)
> >  Date/Time read (timestamp)
> >  Sensor identifier (int)
> >  Data Type (int)
> >  Data Value (double)
>
> One approach that can sometimes help is to use arrays to pack data.
> Arrays may or may not work for the data you are collecting: they work
> best when you always pull the entire array for analysis and not a
> particular element of the array.  Arrays work well because they pack
> more data into index fetches and you get to skip the 20 byte tuple
> header.  That said, they are an 'optimization trade off'...you are
> making one type of query fast at the expense of others.
>
> In terms of hardware, bulking up memory will only get you so
> far...sooner or later you have to come to terms with the fact that you
> are dealing with 'big' data and need to make sure your storage can cut
> the mustard.  Your focus on hardware upgrades should probably be size
> and quantity of disk drives in a big raid 10.
>
> Single user or 'small number of user'  big data queries tend to
> benefit more from fewer core, fast cpus.
>
> Also, with big data, you want to make sure your table design and
> indexing strategy is as tight as possible.

Thanks for all of the input.  One thing we're going to try is to slice
up the data based on the data type, so that the rows are spread across
about 15 different tables.  The largest of these will hold about 50% of
the data, with the remainder distributed unevenly across the other
tables.
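
A minimal sketch of this kind of split, routing each reading to a table
chosen by its data type. The table naming scheme (readings_type_N), the
column names, and the three example type codes are assumptions for
illustration, not the actual schema. SQLite (via Python's sqlite3 module)
is used only to keep the example self-contained; the same routing idea
would apply on PostgreSQL with its own DDL.

```python
import sqlite3

# Hypothetical data-type codes; the post only says there are about 15.
DATA_TYPES = [1, 2, 3]


def table_for(data_type):
    """Return the per-type table name (the naming scheme is an assumption)."""
    return f"readings_type_{int(data_type)}"


def create_tables(conn, data_types):
    """Create one table per data type; the type itself is implied by the table."""
    for dt in data_types:
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table_for(dt)} ("
            "system_id INTEGER NOT NULL, "
            "read_at TEXT NOT NULL, "
            "sensor_id INTEGER NOT NULL, "
            "value REAL NOT NULL)"
        )


def insert_reading(conn, system_id, read_at, sensor_id, data_type, value):
    """Route a single reading to the table that matches its data type."""
    conn.execute(
        f"INSERT INTO {table_for(data_type)} "
        "(system_id, read_at, sensor_id, value) VALUES (?, ?, ?, ?)",
        (system_id, read_at, sensor_id, value),
    )


conn = sqlite3.connect(":memory:")
create_tables(conn, DATA_TYPES)

# (system_id, read_at, sensor_id, data_type, value): made-up sample rows.
sample_rows = [
    (1, "2009-10-06 17:00:00", 10, 1, 20.5),
    (1, "2009-10-06 17:00:00", 11, 2, 0.93),
    (1, "2009-10-06 17:05:00", 10, 1, 20.7),
]
for row in sample_rows:
    insert_reading(conn, *row)

# Row count per table, to see how unevenly the data ends up distributed.
counts = {
    dt: conn.execute(f"SELECT COUNT(*) FROM {table_for(dt)}").fetchone()[0]
    for dt in DATA_TYPES
}
print(counts)
```

Because the data type is implied by the table, the per-type tables no
longer need to store it, which trims each row slightly.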

Most of the graphs / reports that we're doing only need one type of
data at a time, but several will need to stitch together data from
multiple tables.
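
One way such a multi-type report could be stitched together is a UNION ALL
over the per-type tables, re-adding the data type as a literal column. The
sketch below assumes hypothetical table and column names (readings_type_N,
system_id, read_at, sensor_id, value) and uses SQLite through Python's
sqlite3 module purely so that it runs standalone; it is not the actual
schema or query.

```python
import sqlite3


def table_for(data_type):
    """Return the per-type table name (the naming scheme is an assumption)."""
    return f"readings_type_{int(data_type)}"


def combined_query(data_types):
    """Build a UNION ALL over the per-type tables, tagging rows with their type."""
    parts = [
        f"SELECT system_id, read_at, sensor_id, {int(dt)} AS data_type, value "
        f"FROM {table_for(dt)}"
        for dt in data_types
    ]
    return " UNION ALL ".join(parts) + " ORDER BY read_at, data_type"


conn = sqlite3.connect(":memory:")
for dt in (1, 2):
    conn.execute(
        f"CREATE TABLE {table_for(dt)} ("
        "system_id INTEGER, read_at TEXT, sensor_id INTEGER, value REAL)"
    )

# Made-up sample readings, already split by data type.
conn.executemany(
    f"INSERT INTO {table_for(1)} VALUES (?, ?, ?, ?)",
    [(1, "2009-10-06 17:00:00", 10, 20.5), (1, "2009-10-06 17:05:00", 10, 20.7)],
)
conn.executemany(
    f"INSERT INTO {table_for(2)} VALUES (?, ?, ?, ?)",
    [(1, "2009-10-06 17:00:00", 11, 0.93)],
)

# A report that needs both data types reads them back as one result set.
combined = conn.execute(combined_query([1, 2])).fetchall()
for row in combined:
    print(row)
```

Single-type reports would query one table directly and skip the union.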

These changes, combined with some new processors and a fast RAID-10
system, should give us what we need going forward.

Thanks again!


--
Anthony

