Re: Advice requested on structuring aggregation queries

Поиск
Список
Период
Сортировка
От Joe Conway
Тема Re: Advice requested on structuring aggregation queries
Дата
Msg-id 4B83664D.3030401@joeconway.com
обсуждение исходный текст
Ответ на Advice requested on structuring aggregation queries  (Dave Crooke <dcrooke@gmail.com>)
Список pgsql-performance
On 02/22/2010 07:01 PM, Dave Crooke wrote:
> The original data is keyed by a 3-ary tuple of strings .... to keep the
> row size down, in the new data model I'm actually storing 32-bit int's
> in Postgres. The new schema for each table looks like this:
>
> (a integer,
>  b integer,
>  c integer,
>  ts timestamp without timezone,
>  value double precision)
>
> with two indexes: (a, b, ts) and (b, ts)

[...snip...]

> There are about 60 different values of b, and for each such value there
> is a exactly one type of rollup. The old code is doing the rollups in
> Postgres with 60 bulk "insert into .... select" statements, hence the
> need for the second index.

[...snip...]

> For better scalability, I am partitioning these tables by time .... I am
> not using PG's table inheritance and triggers to do the partitioning,
> but instead dynamically generating the SQL and table names in the
> application code (Java). In most cases, the rollups will still happen
> from a single source "data_table" and I plan to continue using the
> existing SQL, but I have a few cases where the source "data_table" rows
> may actually come from two adjacent tables.

Without going through your very long set of questions in detail, it
strikes me that you might be better off if you:

1) use PostgreSQL partitioning (constraint exclusion)
2) partition by ts range
3) consider also including b in your partitioning scheme
4) create one index as (ts, a)
5) use dynamically generated SQL and table names in the application
   code to create (conditionally) and load the tables

But of course test both this and your proposed method and compare ;-)

Also you might consider PL/R for some of your analysis (e.g. mode would
be simple, but perhaps not as fast):
  http://www.joeconway.com/web/guest/pl/r

HTH,

Joe


Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: Dave Crooke
Дата:
Сообщение: Advice requested on structuring aggregation queries
Следующее
От: david@lang.hm
Дата:
Сообщение: Re: SSD + RAID