Re: partitioning for speed, but query planner ignores

Поиск

Список

Период

Сортировка

От	Kevin Grittner
Тема	Re: partitioning for speed, but query planner ignores
Дата	2 октября 2013 г. 15:20:09
Msg-id	1380727198.99522.YahooMailNeo@web162901.mail.bf1.yahoo.com обсуждение
Ответ на	partitioning for speed, but query planner ignores (David Rysdam <drysdam@ll.mit.edu>)
Ответы	Re: partitioning for speed, but query planner ignores
Список	pgsql-general

Дерево обсуждения

David Rysdam <drysdam@ll.mit.edu> wrote:

> We have a by-our-standards large table (about 40e6 rows). Since it is
> the bottleneck in some places, I thought I'd experiment with
> partitioning.

In my personal experience I have gone into hundreds of millions of
rows with good performance without partitioning.  It's all about
designing good indexes for the workload.

I have only seen partitioning help in two cases:
(1)  There will be bulk deletes of rows, and you know at insert
time which bulk delete the row belongs with.  Dropping a partition
table is a very fast way to delete a large number of rows.
(2)  The bulk of activity will be on a relatively small subset of
the rows at any one time, and you can partition such that the set
of active rows will be in a small number of partitions.

In all other cases, I have only seen partitioning harm performance.
 There is no reason to think that checking the table-level
constraints on every partition table will be faster than descending
through an index tree level.

> The table holds data about certain objects, each of which has an object
> number and some number of historical entries (like account activity at a
> bank, say). The typical usage pattern is: relatively rare inserts that
> happen in the background via an automated process (meaning I don't care
> if they take a little longer) and frequent querying, including some
> where a human is sitting in front of it (i.e. I'd like it to be a lot
> faster).
>
> Our most frequent queries either select "all history for object N" or
> "most recent item for some subset of objects".
>
> Because object number figure so prominently, I thought I'd partition on
> that. To me, it makes the most sense from a load-balancing perspective

Load balancing?  Hitting a single partition more heavily improves
your cache hit ratio.  What sort of benefit are you expecting from
spreading the reads across all the partitions?  *Maybe* that could
help if you carefully placed each partition table on a separate set
of spindles, but usually you are better off having one big RAID so
that every partition is spread across all the spindles
automatically.

> Lower numbers are going to be queried much less often than higher
> numbers.

This suggests to me that you *might* get a performance boost if you
define partitions on object number *ranges*.  It still seems a bit
dubious, but it has a chance.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-general по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: partitioning for speed, but query planner ignores