Re: Parallel Seq Scan

From: Amit Kapila
Subject: Re: Parallel Seq Scan
Date:
Msg-id: CAA4eK1JdhxgwPH+kdq6fKuJkjcNZ0jCjvHRQX+D1PPHX9oMAfg@mail.gmail.com
In reply to: Re: Parallel Seq Scan  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List: pgsql-hackers
On Wed, Jan 28, 2015 at 12:38 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>
> On 01/28/2015 04:16 AM, Robert Haas wrote:
>>
>> On Tue, Jan 27, 2015 at 6:00 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>
>>> Now, when you did what I understand to be the same test on the same
>>> machine, you got times ranging from 9.1 seconds to 35.4 seconds.
>>> Clearly, there is some difference between our test setups.  Moreover,
>>> I'm kind of suspicious about whether your results are actually
>>> physically possible.  Even in the best case where you somehow had the
>>> maximum possible amount of data - 64 GB on a 64 GB machine - cached,
>>> leaving no space for cache duplication between PG and the OS and no
>>> space for the operating system or postgres itself - the table is 120
>>> GB, so you've got to read *at least* 56 GB from disk.  Reading 56 GB
>>> from disk in 9 seconds represents an I/O rate of >6 GB/s. I grant that
>>> there could be some speedup from issuing I/O requests in parallel
>>> instead of serially, but that is a 15x speedup over dd, so I am a
>>> little suspicious that there is some problem with the test setup,
>>> especially because I cannot reproduce the results.
>>
>>
>> So I thought about this a little more, and I realized after some
>> poking around that hydra's disk subsystem is actually six disks
>> configured in a software RAID5[1].  So one advantage of the
>> chunk-by-chunk approach you are proposing is that you might be able to
>> get all of the disks chugging away at once, because the data is
>> presumably striped across all of them.  Reading one block at a time,
>> you'll never have more than 1 or 2 disks going, but if you do
>> sequential reads from a bunch of different places in the relation, you
>> might manage to get all 6.  So that's something to think about.
>>
>> One could imagine an algorithm like this: as long as there are more
>> 1GB segments remaining than there are workers, each worker tries to
>> chug through a separate 1GB segment.  When there are not enough 1GB
>> segments remaining for that to work, then they start ganging up on the
>> same segments.  That way, you get the benefit of spreading out the I/O
>> across multiple files (and thus hopefully multiple members of the RAID
>> group) when the data is coming from disk, but you can still keep
>> everyone busy until the end, which will be important when the data is
>> all in-memory and you're just limited by CPU bandwidth.
>
>
> OTOH, spreading the I/O across multiple files is not a good thing, if you don't have a RAID setup like that. With a single spindle, you'll just induce more seeks.
>

Yeah, if that happens then the user is unlikely to see much benefit
from a parallel sequential scan unless the qualification expressions
or other expressions used in the statement are costly.  So one option
is for the user to configure the parallel sequential scan parameters
such that a parallel scan is chosen only when it is likely to be
beneficial (for example by increasing parallel_tuple_comm_cost, or we
could add another parameter for this), or simply to disable parallel
sequential scan altogether (parallel_seqscan_degree = 0).
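
For illustration only, here is a toy costing model of the trade-off such
a knob controls (this is not the patch's actual costing code; the
parameter names, values and formula below are assumptions made up for the
example): a parallel scan pays off only when the per-tuple qualification
work saved by splitting the scan outweighs the per-tuple cost of shipping
result tuples back to the master.

/* toy model, compile with: cc costmodel.c && ./a.out */
#include <stdio.h>

int
main(void)
{
    double ntuples = 1e8;           /* tuples in the relation                */
    double qual_cost = 0.01;        /* per-tuple qualification cost          */
    double tuple_comm_cost = 0.1;   /* per returned tuple, worker -> master  */
    double selectivity = 0.001;     /* fraction of tuples that pass the qual */
    int    degree = 8;              /* number of parallel workers            */

    double serial_cost = ntuples * qual_cost;
    double parallel_cost = ntuples * qual_cost / degree
        + ntuples * selectivity * tuple_comm_cost;

    printf("serial:   %.0f\n", serial_cost);
    printf("parallel: %.0f\n", parallel_cost);
    printf("parallel scan %s\n",
           parallel_cost < serial_cost ? "chosen" : "rejected");
    return 0;
}

Raising tuple_comm_cost (or lowering the qual cost) in this toy model makes
the parallel plan lose, which is the behaviour the knob is meant to give
the user.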
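
And just to make Robert's segment-assignment idea upthread concrete, a
toy, single-threaded sketch of that policy (again, names and numbers are
made up for illustration, not code from the patch): each worker keeps
claiming a whole 1GB segment to itself while enough segments remain, and
only the leftover segments are shared block by block.

#include <stdio.h>

#define NUM_SEGMENTS 10   /* e.g. a ~10GB relation split into 1GB segments */
#define NUM_WORKERS   4

int
main(void)
{
    int next_segment = 0;
    int worker = 0;

    /* Phase 1: while more segments remain than there are workers, each
     * worker that needs work claims a whole 1GB segment for itself. */
    while (NUM_SEGMENTS - next_segment >= NUM_WORKERS)
    {
        printf("segment %d: scanned exclusively by worker %d\n",
               next_segment, worker);
        next_segment++;
        worker = (worker + 1) % NUM_WORKERS;
    }

    /* Phase 2: too few segments left for that, so the workers gang up
     * and share the blocks of each remaining segment. */
    for (; next_segment < NUM_SEGMENTS; next_segment++)
        printf("segment %d: blocks handed out to all %d workers\n",
               next_segment, NUM_WORKERS);

    return 0;
}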


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
