Re: very very slow inserts into very large table

Поиск
Список
Период
Сортировка
От Jon Nelson
Тема Re: very very slow inserts into very large table
Дата
Msg-id CAKuK5J2i5XF1828+fB6Z=Yfkxz1iq9OiR-AkcAAuDA+kgfQ7LQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: very very slow inserts into very large table  (Samuel Gendler <sgendler@ideasculptor.com>)
Ответы Re: very very slow inserts into very large table
Re: very very slow inserts into very large table
Список pgsql-performance
On Mon, Jul 16, 2012 at 12:35 PM, Samuel Gendler
<sgendler@ideasculptor.com> wrote:
> On Mon, Jul 16, 2012 at 7:06 AM, Mark Thornton <mthornton@optrak.com> wrote:
>>>
>>>
>> Every insert updates four indexes, so at least 3 of those will be in
>> random order. The indexes don't fit in memory, so all those updates will
>> involve reading most of the relevant b-tree pages from disk (or at least the
>> leaf level). A total of 10ms of random read from disk (per inserted row)
>> wouldn't surprise me ... which adds up to more than 10 days for your 93
>> million rows.
>
>
> Which is the long way of saying that you will likely benefit from
> partitioning that table into a number of smaller tables, especially if
> queries on that table tend to access only a subset of the data that can be
> defined to always fit into a smaller number of partitions than the total.
> At the very least, inserts will be faster because individual indexes will be
> smaller.  But unless all queries can't be constrained to fit within a subset
> of partitions, you'll also see improved performance on selects.

Acknowledged. My data is actually partitioned into individual tables,
but this was an experiment to see what the performance was like. I was
expecting that effectively appending all of the individual tables into
a great big table would result in less redundant information being
stored in indices and, therefore, a bit more speed and efficiency.
However, I have to admit I was very surprised at the performance
reduction.

What is the greater lesson to take away, here? If you are working with
data that is larger (substantially larger) than available memory, is
the architecture and design of postgresql such that the only real
approach is some type of data partitioning? It is not my intent to
insult or even disparage my favorite software, but it took less time
to *build* the indices for 550GB of data than it would have to insert
1/20th as much. That doesn't seem right.

--
Jon

В списке pgsql-performance по дате отправления:

Предыдущее
От: Samuel Gendler
Дата:
Сообщение: Re: very very slow inserts into very large table
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: very very slow inserts into very large table