Re: High Frequency Inserts to Postgres Database vs Writing to a File

From: Anj Adu
Subject: Re: High Frequency Inserts to Postgres Database vs Writing to a File
Msg-id f2fd819a0911040858w103c90cbi63d2fda3b814707f@mail.gmail.com
In reply to: High Frequency Inserts to Postgres Database vs Writing to a File  (Jay Manni <JManni@FireEye.com>)
List: pgsql-performance
> I have an application wherein a process needs to read data from a stream and
> store the records for further analysis and reporting. The data in the stream
> is in the form of variable length records with clearly defined fields – so
> it can be stored in a database or in a file. The only caveat is that the
> rate of records coming in the stream could be several 1000 records a second.
> The design choice I am faced with currently is whether to use a postgres
> database or a flat file for this purpose. My application already maintains a
> postgres (8.3.4) database for other reasons – so it seemed like the
> straightforward thing to do. However I am concerned about the performance
> overhead of writing several 1000 records a second to the database. The same
> database is being used simultaneously for other activities as well and I do
> not want those to be adversely affected by this operation (especially the
> query times). The advantage of running complex queries to mine the data in
> various different ways is very appealing but the performance concerns are
> making me wonder if just using a flat file to store the data would be a
> better approach.
>
>
>
> Anybody have any experience in high frequency writes to a postgres database?


As mentioned earlier in this thread, make sure your hardware can
scale. You may hit a "monolithic hardware" wall and have to
distribute your data across multiple boxes, with your application
managing the distribution and access. A RAID 10 storage
architecture (since fast writes are critical) on a multi-core box
(preferably 8 cores) with fast SCSI disks (15K rpm) may be a good
starting point.

We have a similar requirement and we scale by distributing the data
across multiple boxes. This is key.
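
A minimal sketch of application-managed distribution, assuming each
record carries a key that can be hashed to choose a box. The host
names, the `source_id` field, and the function names are hypothetical
and not from the original post; this is one possible routing scheme,
not a description of any particular setup.

```python
import zlib

# Hypothetical host names; the post does not name any servers.
SHARDS = ["db-box-0", "db-box-1", "db-box-2", "db-box-3"]


def shard_for(key, shards=SHARDS):
    """Pick a box for a record key using a stable CRC32 hash."""
    # zlib.crc32 gives the same value in every process, unlike the
    # built-in hash(), which is randomized per interpreter run for strings.
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]


def route(records, key_field="source_id", shards=SHARDS):
    """Group records (dicts) into per-box batches for bulk loading."""
    batches = {name: [] for name in shards}
    for rec in records:
        batches[shard_for(str(rec[key_field]), shards)].append(rec)
    return batches
```

Each batch could then be written to its box in one bulk operation.
Note that plain modulo hashing moves most keys when the number of
boxes changes, so the shard count should be chosen with growth in mind.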

If you need to run complex queries, plan on aggregation strategies
(processes that aggregate and optimize the data storage to facilitate
faster access).
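
As an illustration of what such an aggregation process might do, the
sketch below rolls raw records up into per-minute counts so that
reports can read a small summary instead of scanning every row. The
record shape `(timestamp, event_type)` and the function name are
assumptions made for the example.

```python
from collections import Counter
from datetime import datetime


def rollup_per_minute(records):
    """Count records per (minute, event_type) bucket.

    `records` is an iterable of (timestamp, event_type) pairs;
    this shape is hypothetical.
    """
    counts = Counter()
    for ts, event_type in records:
        # Truncate the timestamp to the start of its minute.
        bucket = ts.replace(second=0, microsecond=0)
        counts[(bucket, event_type)] += 1
    return counts
```

The resulting counts could be stored in a separate summary table that
the reporting queries read from.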

Partitioning is key. You will need to purge old data at some point.
Without partitions, you will run into trouble with the time taken to
delete old data as well as with the availability of disk space.
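
A rough sketch of how daily partitions might be managed, assuming the
inheritance-based partitioning available in PostgreSQL 8.3 (child
tables with CHECK constraints). The table name `events`, the column
`recorded_at`, and the helper names are hypothetical. The code only
builds SQL strings; running them against a database is left to the
application.

```python
from datetime import date, timedelta

PARENT = "events"       # hypothetical parent table name
TS_COL = "recorded_at"  # hypothetical timestamp column


def partition_name(day):
    """Name of the child table holding one day of data."""
    return f"{PARENT}_{day:%Y_%m_%d}"


def create_partition_sql(day):
    """DDL for a one-day child table with a CHECK constraint on its range."""
    nxt = day + timedelta(days=1)
    return (
        f"CREATE TABLE {partition_name(day)} ("
        f"CHECK ({TS_COL} >= DATE '{day.isoformat()}' "
        f"AND {TS_COL} < DATE '{nxt.isoformat()}')"
        f") INHERITS ({PARENT});"
    )


def purge_sql(today, keep_days, oldest):
    """DROP statements for every daily partition older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    stmts = []
    day = oldest
    while day < cutoff:
        stmts.append(f"DROP TABLE IF EXISTS {partition_name(day)};")
        day += timedelta(days=1)
    return stmts
```

Dropping a child table removes a whole day of data at once, which
avoids a long-running DELETE on one large table and the dead rows it
would leave behind for vacuum to reclaim.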

These are just guidelines for a big warehouse style database.
