Re: Parallel Seq Scan
From | Heikki Linnakangas
Subject | Re: Parallel Seq Scan
Date |
Msg-id | 54C88ADD.5010205@vmware.com
In response to | Re: Parallel Seq Scan (Robert Haas <robertmhaas@gmail.com>)
Responses | Re: Parallel Seq Scan (Amit Kapila <amit.kapila16@gmail.com>)
          | Re: Parallel Seq Scan (Robert Haas <robertmhaas@gmail.com>)
          | Re: Parallel Seq Scan (Jeff Janes <jeff.janes@gmail.com>)
List | pgsql-hackers
On 01/28/2015 04:16 AM, Robert Haas wrote:
> On Tue, Jan 27, 2015 at 6:00 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Now, when you did what I understand to be the same test on the same
>> machine, you got times ranging from 9.1 seconds to 35.4 seconds.
>> Clearly, there is some difference between our test setups. Moreover,
>> I'm kind of suspicious about whether your results are actually
>> physically possible. Even in the best case where you somehow had the
>> maximum possible amount of data - 64 GB on a 64 GB machine - cached,
>> leaving no space for cache duplication between PG and the OS and no
>> space for the operating system or postgres itself - the table is 120
>> GB, so you've got to read *at least* 56 GB from disk. Reading 56 GB
>> from disk in 9 seconds represents an I/O rate of >6 GB/s. I grant that
>> there could be some speedup from issuing I/O requests in parallel
>> instead of serially, but that is a 15x speedup over dd, so I am a
>> little suspicious that there is some problem with the test setup,
>> especially because I cannot reproduce the results.
>
> So I thought about this a little more, and I realized after some
> poking around that hydra's disk subsystem is actually six disks
> configured in a software RAID5[1]. So one advantage of the
> chunk-by-chunk approach you are proposing is that you might be able to
> get all of the disks chugging away at once, because the data is
> presumably striped across all of them. Reading one block at a time,
> you'll never have more than 1 or 2 disks going, but if you do
> sequential reads from a bunch of different places in the relation, you
> might manage to get all 6. So that's something to think about.
>
> One could imagine an algorithm like this: as long as there are more
> 1GB segments remaining than there are workers, each worker tries to
> chug through a separate 1GB segment. When there are not enough 1GB
> segments remaining for that to work, then they start ganging up on the
> same segments. That way, you get the benefit of spreading out the I/O
> across multiple files (and thus hopefully multiple members of the RAID
> group) when the data is coming from disk, but you can still keep
> everyone busy until the end, which will be important when the data is
> all in-memory and you're just limited by CPU bandwidth.

OTOH, spreading the I/O across multiple files is not a good thing, if
you don't have a RAID setup like that. With a single spindle, you'll
just induce more seeks.

Perhaps the OS is smart enough to read in large-enough chunks that the
occasional seek doesn't hurt much. But then again, why isn't the OS
smart enough to read in large-enough chunks to take advantage of the
RAID even when you read just a single file?

- Heikki
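
For illustration, below is a minimal standalone C sketch of the segment-assignment
idea quoted above. It is not PostgreSQL source: the counts, the shared state, and
names like claim_segment() are made up for the example, and the real coordination
would of course happen through shared memory among the parallel workers. While more
unfinished 1GB segments remain than there are workers, each worker claims a segment
of its own; once fewer remain, workers start sharing the least-crowded ones.

/*
 * Standalone sketch (not PostgreSQL source) of the segment-assignment
 * strategy described above.  All names and numbers are illustrative.
 */
#include <stdio.h>
#include <stdbool.h>

#define NSEGMENTS 10            /* 1GB segments of a ~10GB relation */
#define NWORKERS   4

static bool seg_done[NSEGMENTS];        /* segment fully scanned? */
static int  seg_workers[NSEGMENTS];     /* workers currently on each segment */

static int
segments_remaining(void)
{
    int n = 0;

    for (int i = 0; i < NSEGMENTS; i++)
        if (!seg_done[i])
            n++;
    return n;
}

/* Pick the next segment for a worker, or -1 if the whole scan is done. */
static int
claim_segment(void)
{
    int best = -1;

    if (segments_remaining() >= NWORKERS)
    {
        /* Enough segments left: take one nobody else is reading, so the
         * reads are spread across different files (and RAID members). */
        for (int i = 0; i < NSEGMENTS; i++)
            if (!seg_done[i] && seg_workers[i] == 0)
            {
                seg_workers[i]++;
                return i;
            }
    }

    /* Tail of the scan: gang up on the least-crowded unfinished segment
     * so that no worker goes idle before the scan completes. */
    for (int i = 0; i < NSEGMENTS; i++)
        if (!seg_done[i] && (best < 0 || seg_workers[i] < seg_workers[best]))
            best = i;
    if (best >= 0)
        seg_workers[best]++;
    return best;
}

int
main(void)
{
    /* Each worker grabs its own segment while plenty remain ... */
    for (int w = 0; w < NWORKERS; w++)
        printf("worker %d starts on segment %d\n", w, claim_segment());

    /* ... then, near the end, only two segments are left unfinished,
     * so the workers start sharing them. */
    for (int i = 0; i < NSEGMENTS - 2; i++)
    {
        seg_done[i] = true;
        seg_workers[i] = 0;
    }
    for (int w = 0; w < NWORKERS; w++)
        printf("worker %d now helps with segment %d\n", w, claim_segment());

    return 0;
}

The point of the two-phase behaviour is visible in the output: at first each worker
reads a different file, which is what spreads the I/O across the RAID members, and
at the end all four workers converge on the last two segments so none of them sits
idle while the scan finishes.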