Re: Simple (hopefully) throughput question?

Поиск
Список
Период
Сортировка
От Robert Klemme
Тема Re: Simple (hopefully) throughput question?
Дата
Msg-id ib1f0e$f73$1@dough.gmane.org
обсуждение исходный текст
Ответ на Simple (hopefully) throughput question?  (Nick Matheson <Nick.D.Matheson@noaa.gov>)
Список pgsql-performance
On 11/03/2010 04:52 PM, Nick Matheson wrote:

> We have an application that needs to do bulk reads of ENTIRE
> Postgres tables very quickly (i.e. select * from table). We have
> observed that such sequential scans run two orders of magnitude slower
> than observed raw disk reads (5 MB/s versus 100 MB/s). Part of this is
> due to the storage overhead we have observed in Postgres. In the
> example below, it takes 1 GB to store 350 MB of nominal data. However
> that suggests we would expect to get 35 MB/s bulk read rates.

> Our business logic does operations on the resulting data such that
> the output is several orders of magnitude smaller than the input. So
> we had hoped that by putting our business logic into stored procedures
> (and thus drastically reducing the amount of data flowing to the
> client) our throughput would go way up. This did not happen.

Can you disclose what kinds of manipulations you want to do on the data?
  I am asking because maybe there is a fancy query (possibly using
windowing functions and / or aggregation functions) that gets you the
speed that you need without transferring the whole data set to the client.

> So our questions are as follows:
>
> Is there any way using stored procedures (maybe C code that calls
> SPI directly) or some other approach to get close to the expected 35
> MB/s doing these bulk reads? Or is this the price we have to pay for
> using SQL instead of some NoSQL solution. (We actually tried Tokyo
> Cabinet and found it to perform quite well. However it does not measure
> up to Postgres in terms of replication, data interrogation, community
> support, acceptance, etc).

Kind regards

    robert

В списке pgsql-performance по дате отправления:

Предыдущее
От: Mladen Gogala
Дата:
Сообщение: Re: Running PostgreSQL as fast as possible no matter the consequences
Следующее
От: Josh Berkus
Дата:
Сообщение: Major Linux performance regression; shouldn't we be worried about RHEL6?