Re: Using Postgres to store high volume streams of sensor readings

From: Greg Smith
Subject: Re: Using Postgres to store high volume streams of sensor readings
Date:
Msg-id: Pine.GSO.4.64.0811211224290.23150@westnet.com
In reply to: Re: Using Postgres to store high volume streams of sensor readings  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Using Postgres to store high volume streams of sensor readings  ("Ciprian Dorin Craciun" <ciprian.craciun@gmail.com>)
List: pgsql-general
On Fri, 21 Nov 2008, Tom Lane wrote:

> Not sure if it applies to your real use-case, but if you can try doing
> the COPY from a local file instead of across the network link, it
> might go faster.
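
For reference, a server-side COPY of that sort might look like the following; the table name and file path are illustrative only, and the file has to be readable by the server process rather than the client:

```sql
-- Hypothetical table and path: when the filename form of COPY is used,
-- the backend reads the file directly from the server's filesystem,
-- bypassing the client connection entirely (PostgreSQL 8.x syntax).
COPY sensor_readings FROM '/var/tmp/readings.csv' WITH CSV;
```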

The fact that the inserts are reported as fast initially but slow as the
table and index size grow means it's probably a disk bottleneck rather
than anything related to the client itself.  If the network overhead was
the issue, I wouldn't expect it to start fast like that.  Ditto for
concerns about the random function being slow.  Either of those might
speed up the initial, fast period a bit, but they're not going to impact
the later slowdown.

Ciprian, the main interesting piece of data to collect here is a few
sample lines of output from "vmstat 1" during the initial, fast loading
period versus the slower one.  I think what's happening
to you is that maintaining the index blocks on the disk is turning into
increasingly random I/O as the size of the table grows, and your disks
just can't keep up with that.  What I'd expect is that initially the
waiting for I/O "wa" figure will be low, but it will creep up constantly
and at some point spike up hard after the working set of data operated on
exceeds memory.
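
Something along these lines is enough to capture that (the sample count is arbitrary; run it during both the fast and slow phases and compare):

```shell
# Print system-wide statistics once a second for ten samples.
# The "wa" column (toward the right) is the percentage of CPU time
# spent waiting on I/O; watch whether it climbs as the load slows down.
vmstat 1 10
```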

The fact that PostgreSQL performs badly here compared to the more
lightweight databases you've used isn't that much of a surprise.  There's
a fair amount of overhead for the write-ahead log and the MVCC
implementation in the database, and your application is suffering from all
that robustness overhead but not really gaining much of a benefit from it.
The main things that help in this sort of situation are increases in
shared_buffers and checkpoint_segments, so that more database information
is stored in RAM for longer rather than being pushed to disk too quickly,
but what Rafael suggested already got you most of the possible improvement
here.  You might get an extra bit of boost by adjusting the index
FILLFACTOR upwards (the default is 90; if you're never updating, you could
try 100).  I doubt that will be anything but a minor incremental
improvement though.
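
As a sketch, those knobs look like this; the values are illustrative only, and the table and column names are hypothetical:

```sql
-- postgresql.conf (illustrative values; changing shared_buffers
-- requires a server restart):
--   shared_buffers = 1GB
--   checkpoint_segments = 32

-- Rebuild the index with a full fillfactor if rows are never updated,
-- so no free space is reserved in each index page:
CREATE INDEX sensor_readings_idx ON sensor_readings (client, sensor, "timestamp")
    WITH (fillfactor = 100);
```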

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
