Re: Streaming large data into postgres [WORM like applications]

From: Dhaval Shah
Subject: Re: Streaming large data into postgres [WORM like applications]
Date:
Msg-id: 565237760705121749r4b331fa5v81cf235f3a371d0@mail.gmail.com
In reply to: Re: Streaming large data into postgres [WORM like applications]  (Lincoln Yeoh <lyeoh@pop.jaring.my>)
Replies: Re: Streaming large data into postgres [WORM like applications]  (Kevin Hunter <hunteke@earlham.edu>)
         Re: Streaming large data into postgres [WORM like applications]  (Ron Johnson <ron.l.johnson@cox.net>)
List: pgsql-general
Consolidating my responses in one email.

1. The total expected volume is some 1-1.5 TB of data per day. 75% of
it arrives within a 10-hour window; the remaining 25% arrives over the
other 14 hours. There are of course ways to smooth the load pattern,
but the current scenario is as described.
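
As a rough back-of-the-envelope check (taking the high end, 1.5 TB/day,
with 75% landing in the 10 busy hours):

    0.75 * 1.5 TB / 10 h  =  1.125 TB / 36,000 s  ~  31 MB/s sustained

At 50-100k rows/s that works out to roughly 300-600 bytes per row, and
that is before index and WAL overhead.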

2. I do expect the customer to roll in something like a NAS/SAN with
terabytes of disk space. The idea is to retain the data for a set
period and then offload it to tape.

That leads to the question: can the data be compressed? Since the rows
are very similar, compression should yield something like 6x-10x. Is
there a way to identify which partitions live in which data files and
keep those files compressed until they are actually read?
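
For the mapping itself, here is a minimal sketch against the system
catalogs, assuming inheritance-based partitioning under a hypothetical
parent table named master_table (stock PostgreSQL has no
compressed-until-read mode, so compressing the underlying files would
have to happen with the partitions offline or at the filesystem level):

    -- Map each child partition of master_table to its on-disk relation
    -- file. Files live under $PGDATA/base/<database oid>/<relfilenode>.
    SELECT c.relname                               AS partition,
           c.relfilenode                           AS data_file,
           pg_size_pretty(pg_relation_size(c.oid)) AS on_disk_size
    FROM pg_inherits i
    JOIN pg_class c ON c.oid = i.inhrelid
    WHERE i.inhparent = 'master_table'::regclass
    ORDER BY c.relname;

contrib/oid2name gives the same mapping from the command line, and
newer releases add pg_relation_filepath() for the full path.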

Regards
Dhaval

On 5/12/07, Lincoln Yeoh <lyeoh@pop.jaring.my> wrote:
> At 04:43 AM 5/12/2007, Dhaval Shah wrote:
>
> >1. A large number of streamed rows, on the order of 50-100k rows per
> >second. I was thinking the rows could be written to a file, the file
> >copied into a temp table using COPY, those rows appended to the master
> >table, and then the index dropped and recreated very lazily [during
> >the first query hit or something like that].
>
> Is it one process inserting or can it be many processes?
>
> Is it just a short (relatively) high burst or is that rate sustained
> for a long time? If it's sustained I don't see the point of doing so
> many copies.
>
> How many bytes per row? If the rate is sustained and the rows are big
> then you are going to need LOTS of disks (e.g. a large RAID10).
>
> When do you need to do the reads, and how up to date do they need to be?
>
> Regards,
> Link.
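
For reference, a minimal sketch of the staged-COPY path described in
the quoted message above (the table, index, and file names are
hypothetical):

    -- Load one batch into an index-less staging table, then move it
    -- into the master table in a single pass.
    CREATE TEMP TABLE staging (LIKE master_table);
    COPY staging FROM '/var/spool/feed/batch-0001.dat';  -- hypothetical path
    INSERT INTO master_table SELECT * FROM staging;
    DROP TABLE staging;

    -- Lazy index maintenance: drop before the load window, rebuild on
    -- the first read.
    -- DROP INDEX master_table_ts_idx;
    -- CREATE INDEX master_table_ts_idx ON master_table (ts);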


--
Dhaval Shah
