Re: WIP: Fast GiST index build

From	Alexander Korotkov
Subject	Re: WIP: Fast GiST index build
Date	30 August 2011, 10:38:37
Msg-id	CAPpHfdu=BUTTqk-t04DrqTyWH1MHH2JPJZwsNLjbDAk8SH5EyQ@mail.gmail.com
In reply to	Re: WIP: Fast GiST index build (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses	Re: WIP: Fast GiST index build, Re: WIP: Fast GiST index build
List	pgsql-hackers

On Tue, Aug 30, 2011 at 1:08 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

Thanks. Meanwhile, I hacked together my own set of test scripts, and let them run over the weekend. I'm still running tests with ordered data, but here are some preliminary results:

          testname           |   nrows   |    duration     | accesses
-----------------------------+-----------+-----------------+----------
 points unordered auto       | 250000000 | 08:08:39.174956 |  3757848
 points unordered buffered   | 250000000 | 09:29:16.47012  |  4049832
 points unordered unbuffered | 250000000 | 03:48:10.999861 |  4564986

As you can see, the results are very disappointing :-(. The buffered builds take a lot *longer* than unbuffered ones. I was expecting the buffering to be very helpful at least in these unordered tests. On the positive side, the buffering made index quality somewhat better (accesses column, smaller is better), but that's not what we're aiming at.

What's going on here? This data set was large enough to not fit in RAM, the table was about 8.5 GB in size (and I think the index is even larger than that), and the box has 4GB of RAM. Does the buffering only help with even larger indexes that exceed the cache size even more?

This seems pretty strange to me. The time of the unbuffered index build suggests that IO is not the bottleneck. That differs radically from my experiments. I'm going to try your test script on my test setup.

For now my only guess is that the random function used to generate the data is somewhat poor, so the unordered dataset behaves like an ordered one. Could you rerun the tests on your setup with the dataset generated on the backend, like this?

CREATE TABLE points AS (SELECT point(random(), random()) FROM generate_series(1,10000000));
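
For illustration only, here is a sketch of how the three build modes from the table above might be compared on such a table. It assumes the `buffering` storage parameter as spelled in released PostgreSQL (9.2 and later); the WIP patch under discussion may have used a different switch. The index names are hypothetical, and the column name `point` relies on an unaliased function call taking the function's name as its column name.

```sql
-- Assumption: "buffering" is the storage parameter of released PostgreSQL
-- (9.2+); the WIP patch discussed in this thread may differ.
-- Assumption: the table's only column is named "point" (the default name
-- of an unaliased point(...) call). Index names are hypothetical.

-- Let the server decide whether to switch to buffered build.
CREATE INDEX points_gist_auto ON points USING gist (point) WITH (buffering = auto);

-- Force the buffered build.
CREATE INDEX points_gist_buffered ON points USING gist (point) WITH (buffering = on);

-- Force the plain (unbuffered) build.
CREATE INDEX points_gist_unbuffered ON points USING gist (point) WITH (buffering = off);
```

Running these in psql with `\timing on` would report a duration for each build, comparable to the duration column above.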

------
With best regards,
Alexander Korotkov.
