Re: Queries runs slow on GPU with PG-Strom

From: Kouhei Kaigai
Subject: Re: Queries runs slow on GPU with PG-Strom
Msg-id: 9A28C8860F777E439AA12E8AEA7694F80111B9CB@BPXM15GP.gisp.nec.co.jp
In reply to: Queries runs slow on GPU with PG-Strom (YANG <stonetable@outlook.com>)
List: pgsql-hackers

Hi Yang,

> I've performed some tests on pg_strom according to the wiki. But it seems that
> queries run slower on GPU than CPU. Can someone shed light on what's wrong
> with my settings? My setup was Quadro K620 + CUDA 7.0 (For Ubuntu 14.10) +
> Ubuntu 15.04. And the results were:
>         ,----
>         | LOG:  CUDA Runtime version: 7.0.0
>         | LOG:  NVIDIA driver version: 346.59
>         | LOG:  GPU0 Quadro K620 (384 CUDA cores, 1124MHz), L2 2048KB, RAM 2047MB
> (128bits, 900KHz), capability 5.0
>         | LOG:  NVRTC - CUDA Runtime Compilation vertion 7.0
>         | LOG:  redirecting log output to logging collector process
>         | HINT:  Future log output will appear in directory "pg_log".
>         `----
>
It looks like your GPU has poor memory access performance, so the
preprocessing of aggregation (GpuPreAgg), which makes heavy use of atomic
operations on global memory, consumes the majority of the processing time.
Please try the query with: SET pg_strom.enable_gpupreagg = off;
GpuJoin uses fewer atomic operations, so it has an advantage.
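
As a minimal sketch of that suggestion, the setting can be changed for the
current session only and the two test queries rerun with EXPLAIN ANALYZE.
Only pg_strom.enable_gpupreagg comes from this thread; the queries are the
ones from the original report, and the RESET at the end is ordinary
PostgreSQL housekeeping rather than anything PG-Strom specific.

```sql
-- Disable only the GPU pre-aggregation node for this session
SET pg_strom.enable_gpupreagg = off;

-- Rerun the two queries from the original report and compare timings
EXPLAIN ANALYZE SELECT count(*) FROM t0 WHERE sqrt((x-25.6)^2 + (y-12.8)^2) < 10;
EXPLAIN ANALYZE SELECT cat, AVG(x) FROM t0 NATURAL JOIN t1 GROUP BY cat;

-- Restore the default afterwards
RESET pg_strom.enable_gpupreagg;
```

If the diagnosis is right, the GpuPreAgg node should no longer appear in the
plans, and the execution times can be compared with the CPU-only numbers.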

A GPU's two major advantages are its massive number of cores and its higher
memory bandwidth than a CPU, so, fundamentally, I'd recommend using a
better GPU board.
According to NVIDIA, the K620 uses DDR3 DRAM, so there is no advantage
in memory access speed. How about a GTX750Ti (no external power is
needed, like the K620) or AWS's g2.2xlarge instance type?
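
To make the bandwidth point concrete, here is a rough back-of-the-envelope
calculation. It assumes that the "128bits, 900KHz" in the startup log means
a 128-bit bus clocked at 900 MHz, that DDR3 transfers data twice per clock,
and it compares against a hypothetical host with dual-channel DDR3-1600.
None of these figures are measurements from the machine in the report.

```sql
-- Peak bandwidth in GB/s = bus width in bytes * clock in Hz * transfers per clock / 1e9
SELECT (128 / 8) * 900e6 * 2 / 1e9 AS k620_gb_per_s,  -- assumed: 128-bit, 900 MHz, DDR; about 28.8
       2 * (64 / 8) * 1600e6 / 1e9 AS host_gb_per_s;  -- hypothetical: 2 channels, 64-bit, 1600 MT/s; about 25.6
```

Under these assumptions the two figures are of the same order, which is
consistent with the K620 having no real memory bandwidth advantage over
the CPU.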

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of YANG
> Sent: Thursday, July 23, 2015 12:16 AM
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] Queries runs slow on GPU with PG-Strom
>
>
> Hello,
>
> I've performed some tests on pg_strom according to the wiki. But it seems that
> queries run slower on GPU than CPU. Can someone shed light on what's wrong
> with my settings? My setup was Quadro K620 + CUDA 7.0 (For Ubuntu 14.10) +
> Ubuntu 15.04. And the results were:
>
> with pg_strom
> =============
>
> explain analyze SELECT count(*) FROM t0 WHERE sqrt((x-25.6)^2 + (y-12.8)^2) < 10;
>
>
> QUERY PLAN
> ----------------------------------------------------------------------------
> ----------------------------------------------------------------------------
> -----------------------
>  Aggregate  (cost=190993.70..190993.71 rows=1 width=0) (actual
> time=18792.236..18792.236 rows=1 loops=1)
>    ->  Custom Scan (GpuPreAgg)  (cost=7933.07..184161.18 rows=86 width=108)
> (actual time=4249.656..18792.074 rows=77 loops=1)
>          Bulkload: On (density: 100.00%)
>          Reduction: NoGroup
>          Device Filter: (sqrt((((x - '25.6'::double precision) ^ '2'::double
> precision) + ((y - '12.8'::double precision) ^ '2'::double precision))) <
> '10'::double precision)
>          ->  Custom Scan (BulkScan) on t0  (cost=6933.07..182660.32
> rows=10000060 width=0) (actual time=139.399..18499.246 rows=10000000 loops=1)
>  Planning time: 0.262 ms
>  Execution time: 19268.650 ms
> (8 rows)
>
>
>
> explain analyze SELECT cat, AVG(x) FROM t0 NATURAL JOIN t1 GROUP BY cat;
>
>                                                                     QUERY
> PLAN
> ----------------------------------------------------------------------------
> ----------------------------------------------------------------------
>  HashAggregate  (cost=298541.48..298541.81 rows=26 width=12) (actual
> time=11311.568..11311.572 rows=26 loops=1)
>    Group Key: t0.cat
>    ->  Custom Scan (GpuPreAgg)  (cost=5178.82..250302.07 rows=1088 width=52)
> (actual time=3304.727..11310.021 rows=2307 loops=1)
>          Bulkload: On (density: 100.00%)
>          Reduction: Local + Global
>          ->  Custom Scan (GpuJoin)  (cost=4178.82..248541.18 rows=10000060
> width=12) (actual time=923.417..2661.113 rows=10000000 loops=1)
>                Bulkload: On (density: 100.00%)
>                Depth 1: Logic: GpuHashJoin, HashKeys: (aid), JoinQual: (aid =
> aid), nrows_ratio: 1.00000000
>                ->  Custom Scan (BulkScan) on t0  (cost=0.00..242858.60
> rows=10000060 width=16) (actual time=6.980..871.431 rows=10000000 loops=1)
>                ->  Seq Scan on t1  (cost=0.00..734.00 rows=40000 width=4)
> (actual time=0.204..7.309 rows=40000 loops=1)
>  Planning time: 47.834 ms
>  Execution time: 11355.103 ms
> (12 rows)
>
>
> without pg_strom
> ================
>
> test=# explain analyze SELECT count(*) FROM t0 WHERE sqrt((x-25.6)^2 +
> (y-12.8)^2) < 10;
>
> QUERY PLAN
> ----------------------------------------------------------------------------
> ----------------------------------------------------------------------------
> ----------------
>  Aggregate  (cost=426193.03..426193.04 rows=1 width=0) (actual
> time=3880.379..3880.379 rows=1 loops=1)
>    ->  Seq Scan on t0  (cost=0.00..417859.65 rows=3333353 width=0) (actual
> time=0.075..3859.200 rows=314063 loops=1)
>          Filter: (sqrt((((x - '25.6'::double precision) ^ '2'::double precision)
> + ((y - '12.8'::double precision) ^ '2'::double precision))) < '10'::double
> precision)
>          Rows Removed by Filter: 9685937
>  Planning time: 0.411 ms
>  Execution time: 3880.445 ms
> (6 rows)
>
> t=# explain analyze SELECT cat, AVG(x) FROM t0 NATURAL JOIN t1 GROUP BY cat;
>                                                           QUERY PLAN
> ----------------------------------------------------------------------------
> --------------------------------------------------
>  HashAggregate  (cost=431593.73..431594.05 rows=26 width=12) (actual
> time=4960.810..4960.812 rows=26 loops=1)
>    Group Key: t0.cat
>    ->  Hash Join  (cost=1234.00..381593.43 rows=10000060 width=12) (actual
> time=20.859..3367.510 rows=10000000 loops=1)
>          Hash Cond: (t0.aid = t1.aid)
>          ->  Seq Scan on t0  (cost=0.00..242858.60 rows=10000060 width=16)
> (actual time=0.021..895.908 rows=10000000 loops=1)
>          ->  Hash  (cost=734.00..734.00 rows=40000 width=4) (actual
> time=20.567..20.567 rows=40000 loops=1)
>                Buckets: 65536  Batches: 1  Memory Usage: 1919kB
>                ->  Seq Scan on t1  (cost=0.00..734.00 rows=40000 width=4)
> (actual time=0.017..11.013 rows=40000 loops=1)
>  Planning time: 0.567 ms
>  Execution time: 4961.029 ms
> (10 rows)
>
>
>
> Here are the details of how I installed pg_strom:
>
> 1. download postgresql 9.5alpha1 and compile it with
>
>     ,----
>     | ./configure --prefix=/export/pg-9.5 --enable-debug --enable-cassert
>     | make -j8 all
>     | make install
>     `----
>
> 2. install cuda-7.0 (ubuntu 14.10 package from nvidia website)
>
> 3. download and compile pg_strom with pg_config in /export/pg-9.5/bin
>
>         ,----
>         | make
>         | make install
>         `----
>
>
> 4. create a db with --no-locale
>
>         ,----
>         | initdb --no-locale 9.5
>         `----
>
> 5. change postgresql.conf
>
>         ,----
>         | shared_buffers=1GB
>         | shared_preload_libraries='pg_strom.so'
>         | logging_collector = on
>         | log_filename='postgresql-%d.log'
>         | pg_strom.enabled=on
>         `----
>
>
> 6. start postgres
>
>         ,----
>         | pg_ctl -D 9.5 start
>         `----
>
>    and got the following outputs
>
>         ,----
>         | LOG:  CUDA Runtime version: 7.0.0
>         | LOG:  NVIDIA driver version: 346.59
>         | LOG:  GPU0 Quadro K620 (384 CUDA cores, 1124MHz), L2 2048KB, RAM 2047MB
> (128bits, 900KHz), capability 5.0
>         | LOG:  NVRTC - CUDA Runtime Compilation vertion 7.0
>         | LOG:  redirecting log output to logging collector process
>         | HINT:  Future log output will appear in directory "pg_log".
>         `----
>
>
>
> 7. import testdb
>
>         ,----
>         | createdb test
>         | psql test < ~/devel/pg_strom/test/testdb.sql
>         | psql test -c 'create extension pg_strom'
>         `----
>
>


