Re: Using Postgres to store high volume streams of sensor readings

From: Scott Marlowe
Subject: Re: Using Postgres to store high volume streams of sensor readings
Date:
Msg-id: dcc563d10811231023g4d9c91b2yebdec169ff2f2b4@mail.gmail.com
In reply to: Re: Using Postgres to store high volume streams of sensor readings  ("Ciprian Dorin Craciun" <ciprian.craciun@gmail.com>)
List: pgsql-general
On Sun, Nov 23, 2008 at 10:01 AM, Ciprian Dorin Craciun
<ciprian.craciun@gmail.com> wrote:
> On Sun, Nov 23, 2008 at 3:28 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> * Ciprian Dorin Craciun (ciprian.craciun@gmail.com) wrote:
>>> > Even better might be partitioning on the timestamp.  IF all access is
>>> > in a certain timestamp range it's usually a big win, especially
>>> > because he can move to a new table every hour / day / week or whatever
>>> > and merge the old one into a big "old data" table.
>>>
>>>     Yes, it would speed up the inserts tremendously... I've tested
>>> it and the insert speed is somewhere at 200k->100k.
>>>
>>>     But unfortunately the query speed is not good at all because most
>>> queries are for a specific client (and sensor) in a given time
>>> range...
>>
>> Have you set up your partitions correctly (eg, with appropriate CHECK
>> constraints and with constraint_exclusion turned on)?  Also, you'd want
>> to keep your indexes on the individual partitions, of course..  That
>> should improve query time quite a bit since it should only be hitting
>> the partitions where the data might be.
>>
>>        Stephen
>>
>
>    Well, now that I've read the previous two emails better, I
> understand what Scott and Stephen are talking about...
>
>    So if I understood it correctly: I should build indexes only for
> certain parts of the data (like the previous full hour and so on).
> But I see a problem: wouldn't this lead to a lot of indices being
> created (24 / day, ~150 / week, ...)?

No, not exactly what I'm talking about.  I'm talking about
pre-creating partitions that the data will soon go into (let's say a
new one every hour) with indexes in place, and having a trigger that
fires on insert to put the data into the right partition.  Once that
partition is no longer being inserted into, and we aren't running a
bunch of queries on it, we migrate it to a historical partition.
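
In 8.x terms the setup looks roughly like this -- just a sketch, with
made-up table and column names rather than your actual schema:

  -- Parent table; all inserts and queries go through it.
  CREATE TABLE readings (
      client integer   NOT NULL,
      sensor integer   NOT NULL,
      ts     timestamp NOT NULL,
      value  bigint    NOT NULL
  );

  -- Pre-create the next hour's partition, with its index already in
  -- place, before any data arrives for it.
  CREATE TABLE readings_2008_11_23_18 (
      CHECK (ts >= '2008-11-23 18:00' AND ts < '2008-11-23 19:00')
  ) INHERITS (readings);

  CREATE INDEX readings_2008_11_23_18_idx
      ON readings_2008_11_23_18 (client, sensor, ts);

  -- Trigger routes each insert into the partition whose CHECK matches.
  CREATE OR REPLACE FUNCTION readings_insert() RETURNS trigger AS $$
  BEGIN
      IF NEW.ts >= '2008-11-23 18:00' AND NEW.ts < '2008-11-23 19:00' THEN
          INSERT INTO readings_2008_11_23_18 VALUES (NEW.*);
      ELSE
          RAISE EXCEPTION 'no partition for timestamp %', NEW.ts;
      END IF;
      RETURN NULL;  -- the row went into a child, skip the parent
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER readings_partition
      BEFORE INSERT ON readings
      FOR EACH ROW EXECUTE PROCEDURE readings_insert();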

So, your table looks something like this all the time:

|**|^^|##|##|##|##|$$$$$$$$$$$$$$...|

Where:

** is a partition we have created in advance of needing it.
^^ is the partition we are currently writing to
## are the partitions we're still using in select queries a lot
$$$... are the old data stuffed into the monolithic history table.

When it's time to switch to writing to the new partition (i.e. **) we
make a new one ahead of that, and the trigger starts writing to what
was a ** partition but is now the new ^^, and the ^^ becomes a ##.  At
the end of the day / week whatever, we take all the old ## partitions
and move their data into the $$$ and drop the ## partitions.
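
The archive step itself is just a couple of statements, e.g. (same
made-up names as in the sketch above):

  -- one-time setup: the monolithic history table (the $$$ above)
  CREATE TABLE readings_history (LIKE readings);

  BEGIN;
  -- fold an old hourly partition into the history table, then drop it
  INSERT INTO readings_history SELECT * FROM readings_2008_11_23_18;
  DROP TABLE readings_2008_11_23_18;
  COMMIT;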

Note that the only reason to fold old data into a monolithic archive
partition is to keep from having hundreds or thousands of partitions.
Somewhere past a few hundred partitions things start getting slow
again due to planner overhead.
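
And like Stephen said, make sure constraint_exclusion is on, then
check with EXPLAIN that a time-ranged query only scans the partitions
it needs:

  SET constraint_exclusion = on;

  -- the plan should show the (empty) parent plus just the one
  -- matching hourly child, not every partition
  EXPLAIN SELECT * FROM readings
   WHERE client = 42 AND sensor = 7
     AND ts >= '2008-11-23 18:00' AND ts < '2008-11-23 19:00';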
