Re: Parallel Seq Scan

From Jim Nasby
Subject Re: Parallel Seq Scan
Date
Msg-id 54C822A4.7040106@BlueTreble.com
In response to Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On 1/26/15 11:11 PM, Amit Kapila wrote:
> On Tue, Jan 27, 2015 at 3:18 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>  >
>  > On 1/23/15 10:16 PM, Amit Kapila wrote:
>  >>
>  >> Further, if we want to just get the benefit of parallel I/O, then
>  >> I think we can get that by parallelising partition scan where different
>  >> table partitions reside on different disk partitions, however that is
>  >> a matter of separate patch.
>  >
>  >
>  > I don't think we even have to go that far.
>  >
>  >
>  > We'd be a lot less sensitive to IO latency.
>  >
>  > I wonder what kind of gains we would see if every SeqScan in a query spawned a worker just to read tuples and
>  > shove them in a queue (or shove a pointer to a buffer in the queue).
 
>  >
>
> Here IIUC, you want to say that just get the read done by one parallel
> worker and then all expression calculation (evaluation of qualification
> and target list) in the main backend, it seems to me that by doing it
> that way, the benefit of parallelisation will be lost due to tuple
> communication overhead (may be the overhead is less if we just
> pass a pointer to buffer but that will have another kind of problems
> like holding buffer pins for a longer period of time).
>
> I could see the advantage of testing on lines as suggested by Tom Lane,
> but that seems to be not directly related to what we want to achieve by
> this patch (parallel seq scan) or if you think otherwise then let me know?

There's some low-hanging fruit when it comes to improving our IO performance (or more specifically, decreasing our
sensitivity to IO latency). Perhaps the way to do that is with the parallel infrastructure, perhaps not. But I think
it's premature to look at parallelism for increasing IO performance, or to worry about things like how many IO threads
we should have, before we at least look at simpler things we could do. We shouldn't assume there's nothing to be gained
short of a full parallelization implementation.
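
As a rough illustration of the reader-worker idea quoted above (one worker that only reads tuples and shoves them in a queue, with qualification evaluation left to the main backend), here is a minimal sketch. It is not PostgreSQL code: Python threads stand in for backend processes, a list of lists stands in for heap pages, and the names reader, seq_scan, and queue_size are hypothetical. It only shows the shape of the handoff, including the per-tuple queue traffic that Amit is worried about.

```python
import queue
import threading

_DONE = object()  # sentinel: the reader has reached the end of the relation


def reader(pages, out_q):
    """Hypothetical reader worker: fetches tuples and enqueues them, nothing else."""
    for page in pages:        # each iteration stands in for a blocking page read
        for tup in page:
            out_q.put(tup)    # blocks while the queue is full (back-pressure)
    out_q.put(_DONE)


def seq_scan(pages, qual, queue_size=64):
    """Consumer side: evaluate the qualification in the 'main backend'."""
    q = queue.Queue(maxsize=queue_size)
    worker = threading.Thread(target=reader, args=(pages, q), daemon=True)
    worker.start()
    while True:
        tup = q.get()
        if tup is _DONE:
            break
        if qual(tup):
            yield tup
    worker.join()


if __name__ == "__main__":
    pages = [[1, 2, 3], [4, 5], [6]]
    print(list(seq_scan(pages, lambda t: t % 2 == 0)))
```

The bounded queue is the design choice that matters here: the reader can run ahead of the consumer by at most queue_size tuples, so the consumer waits on IO only when the queue runs dry. Passing buffer pointers instead of tuples would reduce the copying but, as noted in the quoted reply, would mean holding buffer pins for longer.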
 

That's not to say there's nothing else we could use parallelism for. Sort, merge and hash operations come to mind.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


