Re: Parallel Seq Scan

Поиск
Список
Период
Сортировка
От Jim Nasby
Тема Re: Parallel Seq Scan
Дата
Msg-id 54C6B637.9050408@BlueTreble.com
обсуждение исходный текст
Ответ на Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Parallel Seq Scan  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On 1/23/15 10:16 PM, Amit Kapila wrote:
> Further, if we want to just get the benefit of parallel I/O, then
> I think we can get that by parallelising partition scan where different
> table partitions reside on different disk partitions, however that is
> a matter of separate patch.

I don't think we even have to go that far.

My experience with Postgres is that it is *very* sensitive to IO latency (not bandwidth). I believe this is the case
becausecomplex queries tend to interleave CPU intensive code in-between IO requests. So we see this pattern:
 

Wait 5ms on IO
Compute for a few ms
Wait 5ms on IO
Compute for a few ms
...

We blindly assume that the kernel will magically do read-ahead for us, but I've never seen that work so great. It
certainlyfalls apart on something like an index scan.
 

If we could instead do this:

Wait for first IO, issue second IO request
Compute
Already have second IO request, issue third
...

We'd be a lot less sensitive to IO latency.

I wonder what kind of gains we would see if every SeqScan in a query spawned a worker just to read tuples and shove
themin a queue (or shove a pointer to a buffer in the queue). Similarly, have IndexScans have one worker reading the
indexand another worker taking index tuples and reading heap tuples...
 
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: New CF app deployment
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: proposal: plpgsql - Assert statement