Re: WIP: Fast GiST index build

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: WIP: Fast GiST index build
Дата
Msg-id CAPpHfdsKWB44vQhL99puFxfNwFODEGtoV9vhk5npxC4SYAf97Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: WIP: Fast GiST index build  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Список pgsql-hackers
On Tue, Aug 30, 2011 at 9:29 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
On 30.08.2011 13:29, Alexander Korotkov wrote:
On Tue, Aug 30, 2011 at 1:13 PM, Heikki Linnakangas<
heikki.linnakangas@enterprisedb.com>  wrote:

So, over 50% of the CPU time is spent in choosing a block from the
temporary files. That should be pretty easy to improve..

Yes, probably we can just remove free blocks sorting.

Ok, the first results are in for that:

        testname          |   nrows   |    duration     | accesses
---------------------------+-----------+-----------------+----------
 points unordered buffered | 250000000 | 06:00:23.707579 |  4049832

>From the previous test runs, the unbuffered index build took under 4 hours, so even though this is a lot better than with the sorting, it's still a loss compared to non-buffered build.

I had vmstat running during most of this index build. At a quick glance, it doesn't seem to be CPU bound anymore. I suspect the buffers in the temporary file gets very fragmented. Or, we're reading it in backwards order because the buffers work in a LIFO fashion. The system seems to be doing about 5 MB/s of I/O during the build, which sounds like a figure you'd get for more or less random I/O.

So, we still have two questions:
1) Why buffering build algorithm doesn't show any benefit on these tests?
2) Why test results on your test setup differs from test results on my test setup?

I can propose following answers now:
1) I think it's because high overlaps in the tree. As I mentioned before high overlaps can cause only fraction of the tree to be used for actual inserts. For comparison, with my split algorithm (which produce almost no overlaps on uniform dataset) buffering index build took 4 hours, while regular build is still running (already more than 8 days = 192 hours)!
2) Probably it's because different behavour of OS cache. For example, on my test setup OS displace unused pages from cache too slowly. Thereby buffering algorithm showed benefit nevertheless.

Also it seems to me that I start to understand problem of new linear splitting algorithm. On dataset with 1M rows it produces almost no overlaps while it produces significant overlaps already on 10M rows (drama!). Probably nobody tested it on large enough datasets (neither while original research or before commit). I'll dig it in more details and provide some testing results.

------
With best regards,
Alexander Korotkov. 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: spinlocks on HP-UX
Следующее
От: Tom Lane
Дата:
Сообщение: Re: pg_upgrade automatic testing