Re: CUDA Sorting

From: Gaetano Mendola
Subject: Re: CUDA Sorting
Date:
Msg-id: 4F37125B.8080605@gmail.com
In reply to: Re: CUDA Sorting  (Greg Smith <greg@2ndQuadrant.com>)
Responses: Re: CUDA Sorting  (Oleg Bartunov <oleg@sai.msu.su>)
           Re: CUDA Sorting  (Greg Smith <greg@2ndQuadrant.com>)
List: pgsql-hackers

On 19/09/2011 16:36, Greg Smith wrote:
> On 09/19/2011 10:12 AM, Greg Stark wrote:
>> With the GPU I'm curious to see how well
>> it handles multiple processes contending for resources, it might be a
>> flashy feature that gets lots of attention but might not really be
>> very useful in practice. But it would be very interesting to see.
>
> The main problem here is that the sort of hardware commonly used for
> production database servers doesn't have any serious enough GPU to
> support CUDA/OpenCL available. The very clear trend now is that all
> systems other than gaming ones ship with motherboard graphics chipsets
> more than powerful enough for any task but that. I just checked the 5
> most popular configurations of server I see my customers deploy
> PostgreSQL onto (a mix of Dell and HP units), and you don't get a
> serious GPU from any of them.
>
> Intel's next generation Ivy Bridge chipset, expected for the spring of
> 2012, is going to add support for OpenCL to the built-in motherboard
> GPU. We may eventually see that trickle into the server hardware side of
> things too.


The trend is toward servers that can run CUDA by attaching GPUs as
external hardware (a PCI Express interface with PCI Express switches);
see, for example, the PowerEdge C410x PCIe Expansion Chassis from Dell.

I ran some experiments timing the sort done with CUDA against the sort
done with pg_qsort:

                       CUDA        pg_qsort
33 million integers:   ~ 900 ms    ~ 6000 ms
 1 million integers:   ~  21 ms    ~  162 ms
100k integers:         ~   2 ms    ~   13 ms

The CUDA times already include the copy operations (host->device, device->host).

The GPU was a C2050, and the CPU running pg_qsort was an
Intel(R) Xeon(R) CPU X5650  @ 2.67GHz

Copy operations and kernel runs (the sort for instance) can run in 
parallel, so while you are sorting a batch of data, you can copy the 
next batch in parallel.

As you can see the boost is not negligible.

The next Nvidia hardware (the Kepler family) is PCI Express 3 ready, so
expect the "bottleneck" of the host->device->host copies to have less
impact in the near future.

I strongly believe there is room to give a modern database engine
a way to offload sorts to the GPU.

> I've never seen a PostgreSQL server capable of running CUDA, and I
> don't expect that to change.

That sounds like:

"I think there is a world market for maybe five computers."
- IBM Chairman Thomas Watson, 1943

Regards
Gaetano Mendola


