Re: Why creating GIN table index is so slow than inserting data into empty table with the same index?

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: Why creating GIN table index is so slow than inserting data into empty table with the same index?
Дата	24 марта 2009 г. 00:49:57
Msg-id	18581.1237865712@sss.pgh.pa.us обсуждение
Ответ на	Why creating GIN table index is so slow than inserting data into empty table with the same index? (Sergey Burladyan <eshkinkot@gmail.com>)
Ответы	Re: Why creating GIN table index is so slow than inserting data into empty table with the same index?
Список	pgsql-performance

Дерево обсуждения

Sergey Burladyan <eshkinkot@gmail.com> writes:
> show maintenance_work_mem ;
>  maintenance_work_mem
> ----------------------
>  128MB

> create table a (i1 int, i2 int, i3 int, i4 int, i5 int, i6 int);
> insert into a select n, n, n, n, n, n from generate_series(1, 100000) as n;
> create index arr_gin on a using gin ( (array[i1, i2, i3, i4, i5, i6]) );

[ takes forever ]

Seems the reason this is so awful is that the incoming data is exactly
presorted, meaning that the tree structure that ginInsertEntry() is
trying to build degenerates to a linear list (ie, every new element
becomes the right child of the prior one).  So the processing becomes
O(N^2) up till you reach maintenance_work_mem and flush the tree.  With
a large setting for maintenance_work_mem it gets spectacularly slow.

I think a reasonable solution for this might be to keep an eye on
maxdepth and force a flush if that gets too large (more than a few
hundred, perhaps?).  Something like this:

    /* If we've maxed out our available memory, dump everything to the index */
+   /* Also dump if the tree seems to be getting too unbalanced */
-   if (buildstate->accum.allocatedMemory >= maintenance_work_mem * 1024L)
+   if (buildstate->accum.allocatedMemory >= maintenance_work_mem * 1024L ||
+       buildstate->accum.maxdepth > DEPTH_LIMIT)
    {

The new fast-insert code likely will need a similar defense.

Comments?

            regards, tom lane

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Why creating GIN table index is so slow than inserting data into empty table with the same index?