Discussion: HDD vs SSD without explanation


HDD vs SSD without explanation

From
Neto pr
Date:
Dear all

Could someone help me analyze the two execution plans below (EXPLAIN ANALYZE was used)? It is query 9 of the TPC-H benchmark [1].
I'm using an HP server with an Intel Xeon 2.8GHz/4-core, 8GB of memory, an HDD SAS 320GB 15Krpm, and an SSD Samsung EVO 500GB.
My DBMS parameters in postgresql.conf are the defaults, but on the SSD I changed random_page_cost = 1.0.
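For reference, a minimal sketch of applying that same change from psql instead of editing postgresql.conf directly (an assumption about the workflow, not necessarily how it was done here); 'tblpgssd' is the SSD tablespace name that appears later in this thread:

-- instance-wide, equivalent to editing random_page_cost in postgresql.conf (PostgreSQL 9.4+)
ALTER SYSTEM SET random_page_cost = 1.0;
SELECT pg_reload_conf();

-- or scoped to the SSD tablespace only, leaving the HDD defaults untouched
ALTER TABLESPACE tblpgssd SET (random_page_cost = 1.0);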

I do not understand why the query takes half the time when running on the HDD. To explain better: on the HDD the query execution takes about 12 minutes on average, while on the SSD it takes about 26 minutes.
I think maybe the execution plan is doing more write operations, and that is why the HDD SAS 15Krpm has been faster.
Anyway, I always thought that an SSD would be equal or faster, but in this case, and in four more cases we have here, it lost to the HDD by a lot.

Any help in understanding this is welcome.

Best Regards
Neto

----------------- Query execution Time on SSD ---------------
execution 1: 00:23:29
execution 2: 00:28:38
execution 3: 00:27:32
execution 4: 00:27:54
execution 5: 00:27:35
execution 6: 00:26:19
Average: 26 minutes 54 seconds

------------ Query execution Time on HDD ------------
execution 1: 00:12:44
execution 2: 00:12:30
execution 3: 00:12:47
execution 4: 00:13:02
execution 5: 00:13:00
execution 6: 00:12:47
Average: 12 minutes 48 seconds

---------------------------------- Execution Plan on SSD Storage ----------------------------------
Finalize GroupAggregate  (cost=15694362.41..15842178.65 rows=60150 width=66) (actual time=1670577.649..1674717.444 rows=175 loops=1)
  Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
  ->  Gather Merge  (cost=15694362.41..15839923.02 rows=120300 width=66) (actual time=1670552.446..1674716.748 rows=525 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Partial GroupAggregate  (cost=15693362.39..15825037.39 rows=60150 width=66) (actual time=1640482.164..1644619.574 rows=175 loops=3)
              Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
              ->  Sort  (cost=15693362.39..15709690.19 rows=6531119 width=57) (actual time=1640467.384..1641511.970 rows=4344197 loops=3)
                    Sort Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)) DESC
                    Sort Method: external merge  Disk: 319512kB
                    ->  Hash Join  (cost=4708869.23..14666423.78 rows=6531119 width=57) (actual time=1366753.586..1634128.122 rows=4344197 loops=3)
                          Hash Cond: (lineitem.l_suppkey = supplier.s_suppkey)
                          ->  Hash Join  (cost=4683027.67..14400582.74 rows=6531119 width=43) (actual time=1328019.213..1623919.675 rows=4344197 loops=3)
                                Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                                ->  Hash Join  (cost=1993678.29..11279593.98 rows=6531119 width=47) (actual time=245906.330..1316201.213 rows=4344197 loops=3)
                                      Hash Cond: ((lineitem.l_suppkey = partsupp.ps_suppkey) AND (lineitem.l_partkey = partsupp.ps_partkey))
                                      ->  Hash Join  (cost=273200.59..9157211.71 rows=6531119 width=45) (actual time=5103.563..1007657.993 rows=4344197 loops=3)
                                            Hash Cond: (lineitem.l_partkey = part.p_partkey)
                                            ->  Parallel Seq Scan on lineitem  (cost=0.00..5861332.93 rows=100005093 width=41) (actual time=3.494..842667.110 rows=80004097 loops=3)
                                            ->  Hash  (cost=263919.95..263919.95 rows=565651 width=4) (actual time=4973.807..4973.807 rows=434469 loops=3)
                                                  Buckets: 131072  Batches: 8  Memory Usage: 2933kB
                                                  ->  Seq Scan on part  (cost=0.00..263919.95 rows=565651 width=4) (actual time=11.810..4837.287 rows=434469 loops=3)
                                                        Filter: ((p_name)::text ~~ '%orchid%'::text)
                                                        Rows Removed by Filter: 7565531
                                      ->  Hash  (cost=1052983.08..1052983.08 rows=31999708 width=22) (actual time=240711.936..240711.936 rows=32000000 loops=3)
                                            Buckets: 65536  Batches: 512  Memory Usage: 3941kB
                                            ->  Seq Scan on partsupp  (cost=0.00..1052983.08 rows=31999708 width=22) (actual time=0.033..228828.149 rows=32000000 loops=3)
                                ->  Hash  (cost=1704962.28..1704962.28 rows=60000728 width=8) (actual time=253669.242..253669.242 rows=60000000 loops=3)
                                      Buckets: 131072  Batches: 1024  Memory Usage: 3316kB
                                      ->  Seq Scan on orders  (cost=0.00..1704962.28 rows=60000728 width=8) (actual time=0.038..237545.226 rows=60000000 loops=3)
                          ->  Hash  (cost=18106.56..18106.56 rows=400000 width=30) (actual time=277.283..277.283 rows=400000 loops=3)
                                Buckets: 65536  Batches: 8  Memory Usage: 3549kB
                                ->  Hash Join  (cost=1.56..18106.56 rows=400000 width=30) (actual time=45.155..205.372 rows=400000 loops=3)
                                      Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
                                      ->  Seq Scan on supplier  (cost=0.00..13197.00 rows=400000 width=12) (actual time=45.094..129.333 rows=400000 loops=3)
                                      ->  Hash  (cost=1.25..1.25 rows=25 width=30) (actual time=0.038..0.038 rows=25 loops=3)
                                            Buckets: 1024  Batches: 1  Memory Usage: 10kB
                                            ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=30) (actual time=0.026..0.029 rows=25 loops=3)
Planning time: 2.251 ms
Execution time: 1674790.954 ms


-------------------------------------------------- Execution Plan on HDD Storage --------------------------------------------------
Finalize GroupAggregate  (cost=14865093.59..14942715.87 rows=60150 width=66) (actual time=763039.932..767231.344 rows=175 loops=1)
  Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
  ->  Gather Merge  (cost=14865093.59..14940460.24 rows=120300 width=66) (actual time=763014.187..767230.826 rows=525 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Partial GroupAggregate  (cost=14864093.57..14925574.61 rows=60150 width=66) (actual time=758405.567..762576.512 rows=175 loops=3)
              Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
              ->  Sort  (cost=14864093.57..14871647.12 rows=3021421 width=57) (actual time=758348.786..759400.608 rows=4344197 loops=3)
                    Sort Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)) DESC
                    Sort Method: external merge  Disk: 324568kB
                    ->  Hash Join  (cost=4703389.12..14311687.00 rows=3021421 width=57) (actual time=474033.697..736861.120 rows=4344197 loops=3)
                          Hash Cond: (lineitem.l_suppkey = supplier.s_suppkey)
                          ->  Hash Join  (cost=4677547.56..14173154.89 rows=3030463 width=43) (actual time=420246.635..728731.259 rows=4344197 loops=3)
                                Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                                ->  Hash Join  (cost=1988224.59..11157928.33 rows=3030463 width=47) (actual time=92246.411..545600.522 rows=4344197 loops=3)
                                      Hash Cond: ((lineitem.l_suppkey = partsupp.ps_suppkey) AND (lineitem.l_partkey = partsupp.ps_partkey))
                                      ->  Hash Join  (cost=267897.64..9150646.81 rows=3030463 width=45) (actual time=9247.722..368140.568 rows=4344197 loops=3)
                                            Hash Cond: (lineitem.l_partkey = part.p_partkey)
                                            ->  Parallel Seq Scan on lineitem  (cost=0.00..5861333.40 rows=100005140 width=41) (actual time=41.805..224438.909 rows=80004097 loops=3)
                                            ->  Hash  (cost=263920.35..263920.35 rows=242423 width=4) (actual time=9181.407..9181.407 rows=434469 loops=3)
                                                  Buckets: 131072 (originally 131072)  Batches: 8 (originally 4)  Memory Usage: 3073kB
                                                  ->  Seq Scan on part  (cost=0.00..263920.35 rows=242423 width=4) (actual time=5.608..9027.871 rows=434469 loops=3)
                                                        Filter: ((p_name)::text ~~ '%orchid%'::text)
                                                        Rows Removed by Filter: 7565531
                                      ->  Hash  (cost=1052934.38..1052934.38 rows=31994838 width=22) (actual time=82524.045..82524.045 rows=32000000 loops=3)
                                            Buckets: 65536  Batches: 512  Memory Usage: 3941kB
                                            ->  Seq Scan on partsupp  (cost=0.00..1052934.38 rows=31994838 width=22) (actual time=0.037..37865.003 rows=32000000 loops=3)
                                ->  Hash  (cost=1704952.32..1704952.32 rows=59999732 width=8) (actual time=98182.919..98182.919 rows=60000000 loops=3)
                                      Buckets: 131072  Batches: 1024  Memory Usage: 3316kB
                                      ->  Seq Scan on orders  (cost=0.00..1704952.32 rows=59999732 width=8) (actual time=0.042..43977.490 rows=60000000 loops=3)
                          ->  Hash  (cost=18106.56..18106.56 rows=400000 width=30) (actual time=555.225..555.225 rows=400000 loops=3)
                                Buckets: 65536  Batches: 8  Memory Usage: 3549kB
                                ->  Hash Join  (cost=1.56..18106.56 rows=400000 width=30) (actual time=1.748..484.203 rows=400000 loops=3)
                                      Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
                                      ->  Seq Scan on supplier  (cost=0.00..13197.00 rows=400000 width=12) (actual time=1.718..408.463 rows=400000 loops=3)
                                      ->  Hash  (cost=1.25..1.25 rows=25 width=30) (actual time=0.019..0.019 rows=25 loops=3)
                                            Buckets: 1024  Batches: 1  Memory Usage: 10kB
                                            ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=30) (actual time=0.007..0.010 rows=25 loops=3)
Planning time: 12.145 ms
Execution time: 767503.736 ms


-- Query SQL ------------------

select
    nation,
    o_year,
    sum(amount) as sum_profit
from
    (
        select
            n_name as nation,
            extract(year from o_orderdate) as o_year,
            l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
        from
            part,
            supplier,
            lineitem,
            partsupp,
            orders,
            nation
        where
            s_suppkey = l_suppkey
            and ps_suppkey = l_suppkey
            and ps_partkey = l_partkey
            and p_partkey = l_partkey
            and o_orderkey = l_orderkey
            and s_nationkey = n_nationkey
            and p_name like '%orchid%'
    ) as profit
group by
    nation,
    o_year
order by
    nation,
    o_year desc

Re: HDD vs SSD without explanation

From
Justin Pryzby
Date:
On Sun, Jan 14, 2018 at 12:44:00PM -0800, Neto pr wrote:
> Dear all
> 
> Someone help me analyze the two execution plans below (Explain ANALYZE
> used), is the  query 9 of TPC-H benchmark [1].
>
> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
> 15 Krpm AND SSD Sansung EVO 500GB.
>
> I think maybe the execution plan is using more write operations, and so the
> HDD SAS 15Krpm has been faster.

The query plan is all garbled by mail, could you resend?  Or post a link from
https://explain.depesz.com/

To see if the query is causing many writes (due to dirty pages, sorts, etc),
run with explain(analyze,buffers) 
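For example, a minimal sketch wrapping the Q9 text that is posted further down in this thread:

EXPLAIN (ANALYZE, BUFFERS)
select
    nation,
    o_year,
    sum(amount) as sum_profit
from
    (
        select
            n_name as nation,
            extract(year from o_orderdate) as o_year,
            l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
        from
            part, supplier, lineitem, partsupp, orders, nation
        where
            s_suppkey = l_suppkey
            and ps_suppkey = l_suppkey
            and ps_partkey = l_partkey
            and p_partkey = l_partkey
            and o_orderkey = l_orderkey
            and s_nationkey = n_nationkey
            and p_name like '%orchid%'
    ) as profit
group by
    nation, o_year
order by
    nation, o_year desc;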

But from what I could tell, your problems are here:

->  Parallel Seq Scan on lineitem  (cost=0.00..5861332.93 rows=100005093 width=41) (actual TIME=3.494..842667.110 rows=80004097 loops=3)
vs
->  Parallel Seq Scan on lineitem  (cost=0.00..5861333.40 rows=100005140 width=41) (actual TIME=41.805..224438.909 rows=80004097 loops=3)

->  Seq Scan on partsupp  (cost=0.00..1052983.08 rows=31999708 width=22) (actual TIME=0.033..228828.149 rows=32000000 loops=3)
vs
->  Seq Scan on partsupp  (cost=0.00..1052934.38 rows=31994838 width=22) (actual TIME=0.037..37865.003 rows=32000000 loops=3)
 

Can you reproduce the speed difference using dd?
time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size 

Or: bonnie++ -f -n0

What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?  partitions?
readahead?  blockdev --getra

If you're running under linux, maybe you can just send the output of:
for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
or: tail
/sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}

Justin


Re: HDD vs SSD without explanation

From
Neto pr
Date:
Thanks for the reply.
I'll try to upload the execution plan with EXPLAIN (ANALYZE, BUFFERS) to the
website: https://explain.depesz.com/

I'm running an experiment for scientific research, and this is what I find
strange; explaining better: the HDD performance far outweighs the
performance of the SSD.

Do you think that if I run a VACUUM FULL the performance with the
SSD will be better than with a 15Krpm SAS HDD?

Best Regards
Neto

2018-01-14 19:40 GMT-02:00 Justin Pryzby <pryzby@telsasoft.com>:
> On Sun, Jan 14, 2018 at 12:44:00PM -0800, Neto pr wrote:
>> Dear all
>>
>> Someone help me analyze the two execution plans below (Explain ANALYZE
>> used), is the  query 9 of TPC-H benchmark [1].
>>
>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
>> 15 Krpm AND SSD Sansung EVO 500GB.
>>
>> I think maybe the execution plan is using more write operations, and so the
>> HDD SAS 15Krpm has been faster.
>
> The query plan is all garbled by mail , could you resend?  Or post a link from
> https://explain.depesz.com/
>
> To see if the query is causing many writes (due to dirty pages, sorts, etc),
> run with explain(analyze,buffers)
>
> But from what I could tell, your problems are here:
>
> ->  Parallel Seq Scan on lineitem  (cost=0.00..5861332.93 rows=100005093 width=41) (actual TIME=3.494..842667.110
rows=80004097loops=3) 
> vs
> ->  Parallel Seq Scan on lineitem  (cost=0.00..5861333.40 rows=100005140 width=41) (actual TIME=41.805..224438.909
rows=80004097loops=3) 
>
> ->  Seq Scan on partsupp (cost=0.00..1052983.08 rows=31999708 width=22) (actual TIME=0.033..228828.149 rows=32000000
loops=3)
> vs
> ->  Seq Scan on partsupp  (cost=0.00..1052934.38 rows=31994838 width=22) (actual TIME=0.037..37865.003 rows=32000000
loops=3)
>
> Can you reproduce the speed difference using dd ?
> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>
> Or: bonnie++ -f -n0
>
> What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?  partitions?
> readahead?  blockdev --getra
>
> If you're running under linux, maybe you can just send the output of:
> for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
> or: tail
/sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>
> Justin


Re: HDD vs SSD without explanation

From
Neto pr
Date:
2018-01-14 13:40 GMT-08:00 Justin Pryzby <pryzby@telsasoft.com>:
> On Sun, Jan 14, 2018 at 12:44:00PM -0800, Neto pr wrote:
>> Dear all
>>
>> Someone help me analyze the two execution plans below (Explain ANALYZE
>> used), is the  query 9 of TPC-H benchmark [1].
>>
>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
>> 15 Krpm AND SSD Sansung EVO 500GB.
>>
>> I think maybe the execution plan is using more write operations, and so the
>> HDD SAS 15Krpm has been faster.
>
> The query plan is all garbled by mail , could you resend?  Or post a link from
> https://explain.depesz.com/
>
> To see if the query is causing many writes (due to dirty pages, sorts, etc),
> run with explain(analyze,buffers)
>
> But from what I could tell, your problems are here:
>
> ->  Parallel Seq Scan on lineitem  (cost=0.00..5861332.93 rows=100005093 width=41) (actual TIME=3.494..842667.110
rows=80004097loops=3)
 
> vs
> ->  Parallel Seq Scan on lineitem  (cost=0.00..5861333.40 rows=100005140 width=41) (actual TIME=41.805..224438.909
rows=80004097loops=3)
 
>
> ->  Seq Scan on partsupp (cost=0.00..1052983.08 rows=31999708 width=22) (actual TIME=0.033..228828.149 rows=32000000
loops=3)
> vs
> ->  Seq Scan on partsupp  (cost=0.00..1052934.38 rows=31994838 width=22) (actual TIME=0.037..37865.003 rows=32000000
loops=3)
>
> Can you reproduce the speed difference using dd ?
> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>
> Or: bonnie++ -f -n0
>
> What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?  partitions?
> readahead?  blockdev --getra

OS = Debian 8 64-bit - kernel 3.16.0-4

See the disk and filesystem layout below --------------------------------
root@hp2ml110deb:/# fdisk -l
Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 26F5EB21-30DB-44E4-B9E2-E8105846B6C4

Device          Start        End    Sectors   Size Type
/dev/sda1        2048    1050623    1048576   512M EFI System
/dev/sda2     1050624 1937274879 1936224256 923.3G Linux filesystem
/dev/sda3  1937274880 1953523711   16248832   7.8G Linux swap

Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
----------------------------------------------------------------------------

The DBMS and the users' tablespace are installed on /dev/sdb, the SSD.
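A quick way to double-check from psql which tablespace the big TPC-H tables actually sit in (a minimal sketch; an empty tablespace column means the database's default tablespace):

select tablename, tablespace
from pg_tables
where tablename in ('lineitem', 'orders', 'partsupp', 'part', 'supplier', 'nation');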

> If you're running under linux, maybe you can just send the output of:
> for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
> or: tail
/sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>
> Justin


Re: HDD vs SSD without explanation

From
Neto pr
Date:
2018-01-14 15:59 GMT-08:00 Neto pr <netopr9@gmail.com>:
> Thanks for the reply.
> I'll try upload the execution plan with Explain (analyse, buffer) for
> website:  https://explain.depesz.com/
>

Below is a new execution plan, with ANALYZE, BUFFERS. This time, without
changing anything in the configuration of the DBMS, I just restarted the
DBMS and obtained a time of 16 minutes, against the 26 minutes of the
other execution. But it still has not managed to beat the execution time
on the HDD SAS 15Krpm.
I was not able to upload to the site, because I'm saving the execution
plan in the database, and when I retrieve it, it loses the line breaks,
and the depesz site does not allow uploading it that way.


------------------- Execution Plan with BUFFERS, executed on SSD Storage -------------------

Finalize GroupAggregate  (cost=15822228.33..15980046.69 rows=60150 width=66) (actual time=969248.287..973686.679 rows=175 loops=1)
  Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
  Buffers: shared hit=1327602 read=2305013, temp read=1183857 written=1180940
  ->  Gather Merge  (cost=15822228.33..15977791.06 rows=120300 width=66) (actual time=969222.164..973685.582 rows=525 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=1327602 read=2305013, temp read=1183857 written=1180940
        ->  Partial GroupAggregate  (cost=15821228.31..15962905.44 rows=60150 width=66) (actual time=941985.137..946403.344 rows=175 loops=3)
              Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
              Buffers: shared hit=3773802 read=7120852, temp read=3550293 written=3541542
              ->  Sort  (cost=15821228.31..15838806.37 rows=7031225 width=57) (actual time=941954.595..943119.850 rows=4344197 loops=3)
                    Sort Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)) DESC
                    Sort Method: external merge  Disk: 320784kB
                    Buffers: shared hit=3773802 read=7120852, temp read=3550293 written=3541542
                    ->  Hash Join  (cost=4708859.28..14719466.13 rows=7031225 width=57) (actual time=619996.638..933725.615 rows=4344197 loops=3)
                          Hash Cond: (lineitem.l_suppkey = supplier.s_suppkey)
                          Buffers: shared hit=3773732 read=7120852, temp read=3220697 written=3211409
                          ->  Hash Join  (cost=4683017.71..14434606.65 rows=7071075 width=43) (actual time=579893.395..926348.061 rows=4344197 loops=3)
                                Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                                Buffers: shared hit=3758207 read=7108695, temp read=3114271 written=3105025
                                ->  Hash Join  (cost=1993687.71..11297331.33 rows=7071075 width=47) (actual time=79741.803..805259.856 rows=4344197 loops=3)
                                      Hash Cond: ((lineitem.l_suppkey = partsupp.ps_suppkey) AND (lineitem.l_partkey = partsupp.ps_partkey))
                                      Buffers: shared hit=1754251 read=5797780, temp read=2369849 written=2366741
                                      ->  Hash Join  (cost=273201.71..9157213.44 rows=7071075 width=45) (actual time=5363.078..672302.517 rows=4344197 loops=3)
                                            Hash Cond: (lineitem.l_partkey = part.p_partkey)
                                            Buffers: shared hit=325918 read=5027133, temp read=1742658 written=1742616
                                            ->  Parallel Seq Scan on lineitem  (cost=0.00..5861333.20 rows=100005120 width=41) (actual time=0.129..536226.436 rows=80004097 loops=3)
                                                  Buffers: shared hit=2 read=4861280
                                            ->  Hash  (cost=263921.00..263921.00 rows=565657 width=4) (actual time=5362.100..5362.100 rows=434469 loops=3)
                                                  Buckets: 131072  Batches: 8  Memory Usage: 2933kB
                                                  Buffers: shared hit=325910 read=165853, temp written=3327
                                                  ->  Seq Scan on part  (cost=0.00..263921.00 rows=565657 width=4) (actual time=0.025..5279.959 rows=434469 loops=3)
                                                        Filter: ((p_name)::text ~~ '%orchid%'::text)
                                                        Rows Removed by Filter: 7565531
                                                        Buffers: shared hit=325910 read=165853
                                      ->  Hash  (cost=1052986.00..1052986.00 rows=32000000 width=22) (actual time=74231.061..74231.061 rows=32000000 loops=3)
                                            Buckets: 65536  Batches: 512  Memory Usage: 3941kB
                                            Buffers: shared hit=1428311 read=770647, temp written=513846
                                            ->  Seq Scan on partsupp  (cost=0.00..1052986.00 rows=32000000 width=22) (actual time=0.037..66316.652 rows=32000000 loops=3)
                                                  Buffers: shared hit=1428311 read=770647
                                ->  Hash  (cost=1704955.00..1704955.00 rows=60000000 width=8) (actual time=46310.630..46310.630 rows=60000000 loops=3)
                                      Buckets: 131072  Batches: 1024  Memory Usage: 3316kB
                                      Buffers: shared hit=2003950 read=1310915, temp written=613128
                                      ->  Seq Scan on orders  (cost=0.00..1704955.00 rows=60000000 width=8) (actual time=0.033..34352.493 rows=60000000 loops=3)
                                            Buffers: shared hit=2003950 read=1310915
                          ->  Hash  (cost=18106.56..18106.56 rows=400000 width=30) (actual time=226.360..226.360 rows=400000 loops=3)
                                Buckets: 65536  Batches: 8  Memory Usage: 3549kB
                                Buffers: shared hit=15437 read=12157, temp written=6396
                                ->  Hash Join  (cost=1.56..18106.56 rows=400000 width=30) (actual time=0.037..145.779 rows=400000 loops=3)
                                      Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
                                      Buffers: shared hit=15437 read=12157
                                      ->  Seq Scan on supplier  (cost=0.00..13197.00 rows=400000 width=12) (actual time=0.014..63.768 rows=400000 loops=3)
                                            Buffers: shared hit=15434 read=12157
                                      ->  Hash  (cost=1.25..1.25 rows=25 width=30) (actual time=0.015..0.015 rows=25 loops=3)
                                            Buckets: 1024  Batches: 1  Memory Usage: 10kB
                                            Buffers: shared hit=3
                                            ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=30) (actual time=0.006..0.008 rows=25 loops=3)
                                                  Buffers: shared hit=3
Planning time: 16.668 ms
Execution time: 973799.430 ms


------------------------------------------------------------------------------------------------------------------------------------------------

> I'm make an experiment for a scientific research and this is what I
> find strange, explaining better, strange HDD performance far outweigh
> the performance of an SSD.
>
> Do you think that if you run a VACUMM FULL the performance with the
> SSD will be better than a 15Krpm SAS HDD?
>
> Best Regards
> Neto
>
> 2018-01-14 19:40 GMT-02:00 Justin Pryzby <pryzby@telsasoft.com>:
>> On Sun, Jan 14, 2018 at 12:44:00PM -0800, Neto pr wrote:
>>> Dear all
>>>
>>> Someone help me analyze the two execution plans below (Explain ANALYZE
>>> used), is the  query 9 of TPC-H benchmark [1].
>>>
>>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
>>> 15 Krpm AND SSD Sansung EVO 500GB.
>>>
>>> I think maybe the execution plan is using more write operations, and so the
>>> HDD SAS 15Krpm has been faster.
>>
>> The query plan is all garbled by mail , could you resend?  Or post a link from
>> https://explain.depesz.com/
>>
>> To see if the query is causing many writes (due to dirty pages, sorts, etc),
>> run with explain(analyze,buffers)
>>
>> But from what I could tell, your problems are here:
>>
>> ->  Parallel Seq Scan on lineitem  (cost=0.00..5861332.93 rows=100005093 width=41) (actual TIME=3.494..842667.110
rows=80004097loops=3) 
>> vs
>> ->  Parallel Seq Scan on lineitem  (cost=0.00..5861333.40 rows=100005140 width=41) (actual TIME=41.805..224438.909
rows=80004097loops=3) 
>>
>> ->  Seq Scan on partsupp (cost=0.00..1052983.08 rows=31999708 width=22) (actual TIME=0.033..228828.149 rows=32000000
loops=3)
>> vs
>> ->  Seq Scan on partsupp  (cost=0.00..1052934.38 rows=31994838 width=22) (actual TIME=0.037..37865.003 rows=32000000
loops=3)
>>
>> Can you reproduce the speed difference using dd ?
>> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>>
>> Or: bonnie++ -f -n0
>>
>> What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?  partitions?
>> readahead?  blockdev --getra
>>
>> If you're running under linux, maybe you can just send the output of:
>> for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
>> or: tail
/sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>>
>> Justin


Re: HDD vs SSD without explanation

From
Justin Pryzby
Date:
On Sun, Jan 14, 2018 at 06:25:40PM -0800, Neto pr wrote:
> > The query plan is all garbled by mail , could you resend?  Or post a link from
> > https://explain.depesz.com/

On Sun, Jan 14, 2018 at 06:36:02PM -0800, Neto pr wrote:
> I was not able to upload to the site, because I'm saving the execution
> plan in the database, and when I retrieve it, it loses the line breaks,

That's why it's an issue for me, too...

> > What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?  partitions?
> 
> See below the Disk FileSystem --------------------------------
> root@hp2ml110deb:/# fdisk -l
> Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
> 
> Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> ----------------------------------------------------------------------------
What about sdb partitions/FS?

On Sun, Jan 14, 2018 at 06:25:40PM -0800, Neto pr wrote:
> The DBMS and tablespace of users is installed in /dev/sdb  SSD.

Is that also a temp_tablespace?  Or are your hashes spilling to the HDD instead?

Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
Buffers: shared hit=3773802 read=7120852, temp read=3550293 written=3541542
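If the temp files should also land on the SSD, a minimal sketch would be (not something actually run in this thread; 'tblpgssd' is the tablespace name that shows up in the \db+ listing later on):

-- per session: spill sort/hash temp files to the SSD tablespace instead of the default
set temp_tablespaces = 'tblpgssd';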

Is your SSD being used for anything else?

What about these?

> > readahead?  blockdev --getra

> > If you're running under linux, maybe you can just send the output of:
> > for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
> > or: tail
/sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}

> > Can you reproduce the speed difference using dd ?
> > time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
> >
> > Or: bonnie++ -f -n0

Justin


Re: HDD vs SSD without explanation

From
Neto pr
Date:
2018-01-14 19:09 GMT-08:00 Justin Pryzby <pryzby@telsasoft.com>:
> On Sun, Jan 14, 2018 at 06:25:40PM -0800, Neto pr wrote:
>> > The query plan is all garbled by mail , could you resend?  Or post a link from
>> > https://explain.depesz.com/
>
> On Sun, Jan 14, 2018 at 06:36:02PM -0800, Neto pr wrote:
>> I was not able to upload to the site, because I'm saving the execution
>> plan in the database, and when I retrieve it, it loses the line breaks,
>
> That's why it's an issue for me, too..
>
>> > What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?  partitions?
>>
>> See below the Disk FileSystem --------------------------------
>> root@hp2ml110deb:/# fdisk -l
>> Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
>>
>> Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
>> Units: sectors of 1 * 512 = 512 bytes
>> Sector size (logical/physical): 512 bytes / 512 bytes
>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>> ----------------------------------------------------------------------------
> What about sdb partitions/FS?

I used the EXT4 filesystem on Debian.

>
> On Sun, Jan 14, 2018 at 06:25:40PM -0800, Neto pr wrote:
>> The DBMS and tablespace of users is installed in /dev/sdb  SSD.
>
> Is that also a temp_tablespace ?  Or are your hashes spilling to HDD instead ?
>

How can I find out where my temp_tablespace is?
The command \db+ (see below) does not show the location. I installed the
DBMS on the SSD, but how can I find out the exact location of the
temp_tablespace? (See also the SQL sketch after the listing below.)

----------------------------------------------------------------------------
tpch40gnorssd=# \db+
                                       List of tablespaces
    Name    |  Owner   |            Location            | Access privileges | Options |  Size  | Description
------------+----------+--------------------------------+-------------------+---------+--------+-------------
 pg_default | postgres |                                |                   |         | 21 MB  |
 pg_global  | postgres |                                |                   |         | 573 kB |
 tblpgssd   | postgres | /media/ssd500gb/dados/pg101ssd |                   |         | 206 GB |
(3 rows)
------------------------------------------------------------------------------
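A minimal sketch of checking the relevant locations from psql (pg_tablespace_location() exists in PostgreSQL 9.2 and later):

show data_directory;       -- pg_default and pg_global live under this path
show temp_tablespaces;     -- empty means temp files go to the database's default tablespace
select spcname, pg_tablespace_location(oid) as location from pg_tablespace;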

> Group Key: nation.n_name, (date_part(_year_::text, (orders.o_orderdate)::timestamp without time zone))
> Buffers: shared hit=3773802 read=7120852, temp read=3550293 written=3541542
>
> Are your SSD being used for anything else ?
>
> What about these?
>
>> > readahead?  blockdev --getra
>

Regarding whether the SSD is being used by another process, I will
still execute the command and send the result.

But I can say that the SSD is only used by the DBMS.
Explaining better: my server has an HDD and an SSD. The Debian OS is
installed on the HDD, and I installed the DBMS on the SSD, with the
data tablespace also on the SSD.
The server is dedicated to the DBMS, and when I execute the queries,
nothing else is running. I still cannot understand how an HDD can be
faster than an SSD.
I ran the queries again on the SSD and the results were not good; see:

execution 1- 00:16:13
execution 2- 00:25:30
execution 3- 00:28:09
execution 4- 00:24:33
execution 5- 00:24:38

Regards
Neto




>> > If you're running under linux, maybe you can just send the output of:
>> > for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
>> > or: tail
/sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>
>> > Can you reproduce the speed difference using dd ?
>> > time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>> >
>> > Or: bonnie++ -f -n0
>
> Justin


Re: HDD vs SSD without explanation

From
Neto pr
Date:


2018-01-15 3:04 GMT-08:00 Neto pr <netopr9@gmail.com>:
> 2018-01-14 19:09 GMT-08:00 Justin Pryzby <pryzby@telsasoft.com>:
>> On Sun, Jan 14, 2018 at 06:25:40PM -0800, Neto pr wrote:
>>> > The query plan is all garbled by mail , could you resend?  Or post a link from
>>> > https://explain.depesz.com/
>>
>> On Sun, Jan 14, 2018 at 06:36:02PM -0800, Neto pr wrote:
>>> I was not able to upload to the site, because I'm saving the execution
>>> plan in the database, and when I retrieve it, it loses the line breaks,
>>
>> That's why it's an issue for me, too..
>>
>>> > What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?  partitions?
>>>
>>> See below the Disk FileSystem --------------------------------
>>> root@hp2ml110deb:/# fdisk -l
>>> Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
>>>
>>> Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
>>> Units: sectors of 1 * 512 = 512 bytes
>>> Sector size (logical/physical): 512 bytes / 512 bytes
>>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>>> ----------------------------------------------------------------------------
>> What about sdb partitions/FS?
>
> I used EXT4 filesystem in Debian SO.
>
>>
>> On Sun, Jan 14, 2018 at 06:25:40PM -0800, Neto pr wrote:
>>> The DBMS and tablespace of users is installed in /dev/sdb  SSD.
>>
>> Is that also a temp_tablespace ?  Or are your hashes spilling to HDD instead ?
>>
>
> How can I find out where my temp_tablesapce is?
> With the command \db+ (see below) does not show the location. But the
> DBMS I asked to install inside the SSD, but how can I find out the
> exact location of the temp_tablespace ?
>
> ----------------------------------------------------------------------------
> tpch40gnorssd=# \db+
>                                              List of tablespaces
>     Name    |  Owner   |            Location            | Access
> privileges | Options |  Size  | Description
> ------------+----------+--------------------------------+-------------------+---------+--------+-------------
>  pg_default | postgres |                                |
>      |         | 21 MB  |
>  pg_global  | postgres |                                |
>      |         | 573 kB |
>  tblpgssd   | postgres | /media/ssd500gb/dados/pg101ssd |
>      |         | 206 GB |
> (3 rows)
> ------------------------------------------------------------------------------
>

I checked that the temporary files go to the default tablespace (pg_default), which is on the SSD: running SHOW temp_tablespaces in psql returns an empty value, and according to the documentation,
https://www.postgresql.org/docs/current/static/runtime-config-client.html#GUC-TEMP-TABLESPACES, they are then created in the default tablespace, under the directory where I installed the DBMS: /media/ssd500gb/opt/pgv101norssd/data.

The server where I executed the query with the SAS HDD is not the same one where I executed the query with the SSD, but they are identical servers (HP Proliant ML110), with the same model and configuration; only the disks are not the same, see:

Server 1
- HDD SAS 15 Krpm - 320 GB (Location where O.S. Debian and Postgresql are installed)

Server 2
- Samsung Evo SSD 500 GB (Location where Postgresql is Installed)
- HDD SATA 7500 rpm - 1TB (Location where O.S. Debian is installed)


>> Group Key: nation.n_name, (date_part(_year_::text, (orders.o_orderdate)::timestamp without time zone))
>> Buffers: shared hit=3773802 read=7120852, temp read=3550293 written=3541542
>>
>> Are your SSD being used for anything else ?
>>
>> What about these?
>>
>>> > readahead?  blockdev --getra
>>
>
> About knowing if the SSD is being used by another process, I will
> still execute the command and send the result.
>
> But I can say that the SSD is only used by the DBMS.
> Explaining better, My server has an HDD and an SSD. The Debian OS is
> installed on the HDD and I installed the DBMS inside the SSD and the
> data tablespace also inside the SSD .
> The server is dedicated to the DBMS and when I execute the queries,
> nothing else is executed. I still can not understand how an HDD is
> faster than an SSD.
> I ran queries again on the SSD and the results were not good see:
>
> execution 1- 00:16:13
> execution 2- 00:25:30
> execution 3- 00:28:09
> execution 4- 00:24:33
> execution 5- 00:24:38
>
> Regards
> Neto
>
>
>
>
>>> > If you're running under linux, maybe you can just send the output of:
>>> > for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
>>> > or: tail /sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>>
>>> > Can you reproduce the speed difference using dd ?
>>> > time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>>> >
>>> > Or: bonnie++ -f -n0
>>
>> Justin

Re: HDD vs SSD without explanation

From
"Georg H."
Date:
Hello Neto

Am 14.01.2018 um 21:44 schrieb Neto pr:
> Dear all
>
> Someone help me analyze the two execution plans below (Explain ANALYZE 
> used), is the  query 9 of TPC-H benchmark [1].
> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 
> 320GB 15 Krpm AND SSD Sansung EVO 500GB.
> My DBMS parameters presents in postgresql.conf is default, but in SSD 
> I have changed random_page_cost = 1.0.
>
you are comparing a SAS drive against a SATA SSD. Their interfaces provide 
completely different bandwidth.
While a SAS-3 device does 12 Gbit/s, a SATA-3 device is only able to 
transfer 6 Gbit/s (a current SAS-4 reaches 22.5 Gbit/s).
Do a short bit of research on SAS vs SATA and then use a SAS SSD for comparison :)

regards
Georg


Re: HDD vs SSD without explanation

From
Brian Busch
Date:
We've had the same experience here with older SATA 2 (3Gbps): in spite of the SSD having no spin latency, the bus speed itself was half of the SAS-2 (6Gbps) we were using at the time, which negated the SSD perf in this area. The HDD was about the same perf as the SSD for us.

Brian

On Mon, Jan 15, 2018 at 1:32 PM, Georg H. <georg-h@silentrunner.de> wrote:

Hello Neto

Am 14.01.2018 um 21:44 schrieb Neto pr:
Dear all

Someone help me analyze the two execution plans below (Explain ANALYZE used), is the  query 9 of TPC-H benchmark [1].
I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB 15 Krpm AND SSD Sansung EVO 500GB.
My DBMS parameters presents in postgresql.conf is default, but in SSD I have changed random_page_cost = 1.0.

you are comparing a SAS Drive against a SATA SSD. Their interfaces serve a completely different bandwidth.
While a SAS-3 device does 12 Gbit/s  SATA-3 device  is only able to transfer 6 Gbit/s  (a current SAS-4 reaches 22.5 Gbit/s)
Do a short research on SAS vs SATA and then use a SAS SSD for comparison :)

regards
Georg


Re: HDD vs SSD without explanation

From
Fernando Hevia
Date:


2018-01-15 15:32 GMT-03:00 Georg H. <georg-h@silentrunner.de>:

Hello Neto

Am 14.01.2018 um 21:44 schrieb Neto pr:
Dear all

Someone help me analyze the two execution plans below (Explain ANALYZE used), is the  query 9 of TPC-H benchmark [1].
I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB 15 Krpm AND SSD Sansung EVO 500GB.
My DBMS parameters presents in postgresql.conf is default, but in SSD I have changed random_page_cost = 1.0.

you are comparing a SAS Drive against a SATA SSD. Their interfaces serve a completely different bandwidth.
While a SAS-3 device does 12 Gbit/s  SATA-3 device  is only able to transfer 6 Gbit/s  (a current SAS-4 reaches 22.5 Gbit/s)
Do a short research on SAS vs SATA and then use a SAS SSD for comparison :)

The query being all read operations, both drives should perform somewhat similarly. Therefore, either the SAS drive has some special sauce to it (a.k.a. a very fast built-in cache) or there is something else going on in these systems. Otherwise he shouldn't be stressing the 6 Gbit/s interface limit with a single drive, be that the SATA or the SAS drive.

Neto, you have been asked to provide a number of command outputs to learn more about your system. Testing the raw read throughput of both your drives should be first on your list.

Cheers.




Re: HDD vs SSD without explanation

From
Neto pr
Date:
Hi Georg,
I believe your answer has revealed the real problem.
I looked at the specifications of my SATA SSD and of my SAS HDD, and I
saw that the SAS has 12 Gb/s versus 6 Gb/s for the SSD.

SSD: Samsung 500 GB SATA III 6Gb/s - Model: 850 Evo
http://www.samsung.com/semiconductor/minisite/ssd/product/consumer/850evo/

HDD: HPE 300GB 12G SAS Part-Number: 737261-B21
https://h20195.www2.hpe.com/v2/GetPDF.aspx%2Fc04111744.pdf

I saw that the SAS bandwidth is double, and I believe that is the reason
for the difference in performance.

Another question: if I instead compare the SAS HDD below, which has a
transfer rate of 6Gb/s, the same as the SATA SSD's 6Gb/s, do you think the
SSD would be faster in this case?
HDD: HP 450GB 6G SAS 15K rpm LFF (3.5-inch) Part-Number: 652615-B21

best Regards
Neto

2018-01-15 16:32 GMT-02:00 Georg H. <georg-h@silentrunner.de>:
>
> Hello Neto
>
> Am 14.01.2018 um 21:44 schrieb Neto pr:
>>
>> Dear all
>>
>> Someone help me analyze the two execution plans below (Explain ANALYZE
>> used), is the  query 9 of TPC-H benchmark [1].
>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
>> 15 Krpm AND SSD Sansung EVO 500GB.
>> My DBMS parameters presents in postgresql.conf is default, but in SSD I
>> have changed random_page_cost = 1.0.
>>
> you are comparing a SAS Drive against a SATA SSD. Their interfaces serve a
> completely different bandwidth.
> While a SAS-3 device does 12 Gbit/s  SATA-3 device  is only able to transfer
> 6 Gbit/s  (a current SAS-4 reaches 22.5 Gbit/s)
> Do a short research on SAS vs SATA and then use a SAS SSD for comparison :)
>
> regards
> Georg
>


Re: HDD vs SSD without explanation

From
Neto pr
Date:
2018-01-15 17:55 GMT-02:00 Fernando Hevia <fhevia@gmail.com>:
>
>
> 2018-01-15 15:32 GMT-03:00 Georg H. <georg-h@silentrunner.de>:
>>
>>
>> Hello Neto
>>
>> Am 14.01.2018 um 21:44 schrieb Neto pr:
>>>
>>> Dear all
>>>
>>> Someone help me analyze the two execution plans below (Explain ANALYZE
>>> used), is the  query 9 of TPC-H benchmark [1].
>>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
>>> 15 Krpm AND SSD Sansung EVO 500GB.
>>> My DBMS parameters presents in postgresql.conf is default, but in SSD I
>>> have changed random_page_cost = 1.0.
>>>
>> you are comparing a SAS Drive against a SATA SSD. Their interfaces serve a
>> completely different bandwidth.
>> While a SAS-3 device does 12 Gbit/s  SATA-3 device  is only able to
>> transfer 6 Gbit/s  (a current SAS-4 reaches 22.5 Gbit/s)
>> Do a short research on SAS vs SATA and then use a SAS SSD for comparison
>> :)
>
>
> The query being all read operations both drives should perform somewhat
> similarly. Therefore, either the SAS drive has some special sauce to it
> (a.k.a very fast built-in cache) or there is something else going on these
> systems. Otherwise he shouldn't be stressing the 6 Gbit/s interface limit
> with a single drive, be that the SATA or the SAS drive.
>
> Neto, you have been suggested to provide a number of command outputs to know
> more about your system. Testing the raw read throughput of both your drives
> should be first on your list.
>


Guys, sorry for the top post, I forgot...

Fernando, I think the difference between the SSD's 6 Gb/s and the SAS's
12 Gb/s is what caused the difference in query execution time.
Looking at the execution plans and the cost estimates, I did not see
many differences, in access methods or anything else.
Regarding the query, neither plan uses indexes, since I did a first
test without indexes.
If I instead compare the SAS HDD below, which has a transfer rate of
6Gb/s, the same as the SATA SSD's 6Gb/s, do you think the SSD would be
faster in this case?

HDD: HP 450GB 6G SAS 15K rpm LFF (3.5-inch) Part-Number: 652615-B21

Neto

> Cheers.
>
>
>
>


Re: HDD vs SSD without explanation

From
Fernando Hevia
Date:


2018-01-15 20:25 GMT-03:00 Neto pr <netopr9@gmail.com>:
2018-01-15 17:55 GMT-02:00 Fernando Hevia <fhevia@gmail.com>:
>
>
> 2018-01-15 15:32 GMT-03:00 Georg H. <georg-h@silentrunner.de>:
>>
>>
>> Hello Neto
>>
>> Am 14.01.2018 um 21:44 schrieb Neto pr:
>>>
>>> Dear all
>>>
>>> Someone help me analyze the two execution plans below (Explain ANALYZE
>>> used), is the  query 9 of TPC-H benchmark [1].
>>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
>>> 15 Krpm AND SSD Sansung EVO 500GB.
>>> My DBMS parameters presents in postgresql.conf is default, but in SSD I
>>> have changed random_page_cost = 1.0.
>>>
>> you are comparing a SAS Drive against a SATA SSD. Their interfaces serve a
>> completely different bandwidth.
>> While a SAS-3 device does 12 Gbit/s  SATA-3 device  is only able to
>> transfer 6 Gbit/s  (a current SAS-4 reaches 22.5 Gbit/s)
>> Do a short research on SAS vs SATA and then use a SAS SSD for comparison
>> :)
>
>
> The query being all read operations both drives should perform somewhat
> similarly. Therefore, either the SAS drive has some special sauce to it
> (a.k.a very fast built-in cache) or there is something else going on these
> systems. Otherwise he shouldn't be stressing the 6 Gbit/s interface limit
> with a single drive, be that the SATA or the SAS drive.
>
> Neto, you have been suggested to provide a number of command outputs to know
> more about your system. Testing the raw read throughput of both your drives
> should be first on your list.
>


Guys, sorry for the Top Post, I forgot ....

Fernando, I think the difference of 6 Gb/s to 12 Gb/s from SAS is what
caused the difference in query execution time.
Because looking at the execution plans and the cost estimate, I did
not see many differences, in methods of access among other things.
Regarding the query, none of them use indexes, since I did a first
test without indexes.
Do you think that if I compare the disk below HDD SAS that has a
transfer rate of 6Gb/s equal to the SSD SATA 6Gb/s, do you think the
SSD would be more agile in this case?

HDD: HP 450GB 6G SAS 15K rpm LFF (3.5-inch) Part-Number: 652615-B21

Neto

The 6 Gb/s interface is capable of a maximum throughput of around 600 MB/s. None of your drives can achieve that, so I don't think you are limited by the interface speed. The 12 Gb/s interface speed advantage kicks in when there are several drives installed, and it won't make a difference in a single-drive or even a two-drive system.

But don't take my word for it. Test your drives' throughput with the command Justin suggested so you know exactly what each drive is capable of:

Can you reproduce the speed difference using dd ?
time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size

While common sense says the SSD drive should outperform the mechanical one, your test scenario (large-volume sequential reads) evens out the field a lot. Still, I would have expected somewhat similar results in the outcome, so yes, it is weird that the SAS drive doubles the SSD performance. That is why I think there must be something else going on during your tests on the SSD server. It could also be that the SSD isn't working properly or that you are running a suboptimal OS+server+controller configuration for the drive.

Re: HDD vs SSD without explanation

From
Neto pr
Date:
2018-01-15 16:18 GMT-08:00 Fernando Hevia <fhevia@gmail.com>:
>
>
> 2018-01-15 20:25 GMT-03:00 Neto pr <netopr9@gmail.com>:
>>
>> 2018-01-15 17:55 GMT-02:00 Fernando Hevia <fhevia@gmail.com>:
>> >
>> >
>> > 2018-01-15 15:32 GMT-03:00 Georg H. <georg-h@silentrunner.de>:
>> >>
>> >>
>> >> Hello Neto
>> >>
>> >> Am 14.01.2018 um 21:44 schrieb Neto pr:
>> >>>
>> >>> Dear all
>> >>>
>> >>> Someone help me analyze the two execution plans below (Explain ANALYZE
>> >>> used), is the  query 9 of TPC-H benchmark [1].
>> >>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS
>> >>> 320GB
>> >>> 15 Krpm AND SSD Sansung EVO 500GB.
>> >>> My DBMS parameters presents in postgresql.conf is default, but in SSD
>> >>> I
>> >>> have changed random_page_cost = 1.0.
>> >>>
>> >> you are comparing a SAS Drive against a SATA SSD. Their interfaces
>> >> serve a
>> >> completely different bandwidth.
>> >> While a SAS-3 device does 12 Gbit/s  SATA-3 device  is only able to
>> >> transfer 6 Gbit/s  (a current SAS-4 reaches 22.5 Gbit/s)
>> >> Do a short research on SAS vs SATA and then use a SAS SSD for
>> >> comparison
>> >> :)
>> >
>> >
>> > The query being all read operations both drives should perform somewhat
>> > similarly. Therefore, either the SAS drive has some special sauce to it
>> > (a.k.a very fast built-in cache) or there is something else going on
>> > these
>> > systems. Otherwise he shouldn't be stressing the 6 Gbit/s interface
>> > limit
>> > with a single drive, be that the SATA or the SAS drive.
>> >
>> > Neto, you have been suggested to provide a number of command outputs to
>> > know
>> > more about your system. Testing the raw read throughput of both your
>> > drives
>> > should be first on your list.
>> >
>>
>>
>> Guys, sorry for the Top Post, I forgot ....
>>
>> Fernando, I think the difference of 6 Gb/s to 12 Gb/s from SAS is what
>> caused the difference in query execution time.
>> Because looking at the execution plans and the cost estimate, I did
>> not see many differences, in methods of access among other things.
>> Regarding the query, none of them use indexes, since I did a first
>> test without indexes.
>> Do you think that if I compare the disk below HDD SAS that has a
>> transfer rate of 6Gb/s equal to the SSD SATA 6Gb/s, do you think the
>> SSD would be more agile in this case?
>>
>> HDD: HP 450GB 6G SAS 15K rpm LFF (3.5-inch) Part-Number: 652615-B21
>>
>> Neto
>
>
> The 6 Gb/s interface is capable of a maximum throughput of around 600 Mb/s.
> None of your drives can achieve that so I don't think you are limited to the
> interface speed. The 12 Gb/s interface speed advantage kicks in when there
> are several drives installed and it won't make a diference in a single drive
> or even a two drive system.
>
> But don't take my word for it. Test your drives throughput with the command
> Justin suggested so you know exactly what each drive is capable of:
>
>> Can you reproduce the speed difference using dd ?
>> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>> skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>
>
> While common sense says SSD drive should outperform the mechanical one, your
> test scenario (large volume sequential reads) evens out the field a lot.
> Still I would have expected somewhat similar results in the outcome, so yes,
> it is weird that the SAS drive doubles the SSD performance. That is why I
> think there must be something else going on during your tests on the SSD
> server. It can also be that the SSD isn't working properly or you are
> running an suboptimal OS+server+controller configuration for the drive.

Ok.

Can you help me analyze the output of the command: dd if=/dev/sdX
of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to
optimal_io_size
I had a heavy query running in the DBMS and ran the 'time sudo dd'
command three times for each environment (SAS HDD and SATA SSD); see
below that the SSD reached 412, 325 and 120 MB/s.
The SAS HDD reached 183, 176 and 183 MB/s ... strange that in the end the
SAS HDD can execute the query faster ... is there something else
to analyze in the output below?

-------============  SAS HDD 320 GB 12 Gb/s ==========--------------
root@deb:/etc#  time sudo dd if=/dev/sda2 of=/dev/null bs=1M count=32K
skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 188.01 s, 183 MB/s

real    3m8.473s
user    0m0.076s
sys     0m23.628s
root@deb:/etc#  time sudo dd if=/dev/sda2 of=/dev/null bs=1M count=32K
skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 195.582 s, 176 MB/s

real    3m16.304s
user    0m0.056s
sys     0m19.632s
root@deb:/etc#  time sudo dd if=/dev/sda2 of=/dev/null bs=1M count=32K
skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 187.822 s, 183 MB/s

real    3m8.457s
user    0m0.032s
sys     0m20.668s
root@deb:/etc#

-------============ SATA SSD 500 GB 6 Gb/s =========----------------
root@hp2ml110deb:/etc/postgresql/10# time sudo dd if=/dev/sdb
of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to
optimal_io_size

32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 83.4281 s, 412 MB/s

real    1m23.693s
user    0m0.056s
sys     0m19.300s

root@hp2ml110deb:/etc/postgresql/10# time sudo dd if=/dev/sdb
of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to
optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 105.88 s, 325 MB/s

real    1m46.301s
user    0m0.020s
sys     0m14.676s

root@hp2ml110deb:/etc/postgresql/10# time sudo dd if=/dev/sdb
of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to
optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 285.959 s, 120 MB/s

real    4m46.283s
user    0m0.036s
sys     0m15.444s

------------------------------------- END -----------------------------


>


Re: HDD vs SSD without explanation

From
Justin Pryzby
Date:
On Mon, Jan 15, 2018 at 05:19:59PM -0800, Neto pr wrote:
> >> Can you reproduce the speed difference using dd ?
> >> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
> >> skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
> >
> > Still I would have expected somewhat similar results in the outcome, so yes,
> > it is weird that the SAS drive doubles the SSD performance. That is why I
> > think there must be something else going on during your tests on the SSD
> > server. It can also be that the SSD isn't working properly or you are
> > running an suboptimal OS+server+controller configuration for the drive.
> 
> Ok.
> 
> Can you help me to analyze the output of the command: dd if=/dev/sdX
> of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to
> optimal_io_size

You should run the "dd" without the DB or anything else using the drive.  That
gives you the performance of the drive itself, without the DB.

You should probably rerun the "dd" command using /dev/sdb1 if there's a
partition table on top (??).
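
A minimal sketch of what I mean (assuming the SSD is /dev/sdb and the
rotational disk is /dev/sda; the iflag=direct flag is an extra here, it
bypasses the page cache so repeated runs are comparable):

lsblk -o NAME,TYPE,SIZE,MOUNTPOINT /dev/sdb   # is there a partition table / filesystem on sdb?
blockdev --getra /dev/sda                     # current readahead, in 512-byte sectors
blockdev --getra /dev/sdb
time sudo dd if=/dev/sdb of=/dev/null bs=1M count=32K iflag=direct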

I'm still wondering about these:

On Sun, Jan 14, 2018 at 09:09:41PM -0600, Justin Pryzby wrote:
> What about sdb partitions/FS?

> > > readahead?  blockdev --getra
> 
> > > If you're running under linux, maybe you can just send the output of:
> > > for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
> > > or: tail
/sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}

Justin


Re: HDD vs SSD without explanation

From
Neto pr
Date:
2018-01-15 17:58 GMT-08:00 Justin Pryzby <pryzby@telsasoft.com>:
> On Mon, Jan 15, 2018 at 05:19:59PM -0800, Neto pr wrote:
>> >> Can you reproduce the speed difference using dd ?
>> >> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>> >> skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>> >
>> > Still I would have expected somewhat similar results in the outcome, so yes,
>> > it is weird that the SAS drive doubles the SSD performance. That is why I
>> > think there must be something else going on during your tests on the SSD
>> > server. It can also be that the SSD isn't working properly or you are
>> > running an suboptimal OS+server+controller configuration for the drive.
>>
>> Ok.
>>
>> Can you help me to analyze the output of the command: dd if=/dev/sdX
>> of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to
>> optimal_io_size
>
> You should run the "dd" without the DB or anything else using the drive.  That
> gets peformance of the drive, without the DB.

Oh, that's an important observation.

>
> You should probably rerun the "dd" command using /dev/sdb1 if there's an
> partition table on top (??).
>
> I'm still wondering about these:

See Below:
------------========= SSD SATA 500GB 6 Gb/s
=======------------------------------
root@hp2ml110deb:/etc# time sudo dd if=/dev/sdb of=/dev/null bs=1M
count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 71.0047 s, 484 MB/s

real    1m11.109s
user    0m0.008s
sys     0m16.584s
root@hp2ml110deb:/etc# time sudo dd if=/dev/sdb of=/dev/null bs=1M
count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 70.937 s, 484 MB/s

real    1m11.089s
user    0m0.012s
sys     0m16.312s
root@hp2ml110deb:/etc#


------------========= HDD SAS 300GB 12 Gb/s
=======------------------------------
root@deb:/home/user1#  time sudo dd if=/dev/sda2 of=/dev/null bs=1M
count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size

32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 147.232 s, 233 MB/s

real    2m27.277s
user    0m0.036s
sys     0m23.096s
root@deb:/home/user1#
root@deb:/home/user1#  time sudo dd if=/dev/sda2 of=/dev/null bs=1M
count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 153.698 s, 224 MB/s

real    2m33.766s
user    0m0.032s
sys     0m22.812s
root@deb:/home/user1#
---------------------------------------------  END
---------------------------------------------------

I had not mentioned this before, but my SAS HDD is connected to the HBA
controller through a SATA adapter, because the correct cable kit was not
available from the supplier, so they shipped the SAS HDD with a SATA
adapter. I find it strange that the throughput of the SAS drive is below
the SSD's, and yet it can execute the query much faster.
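
A rough sketch of how the negotiated link speed could be double-checked
(assuming smartmontools is installed and the drives appear as /dev/sda
and /dev/sdb; the output format differs between SATA drives and SAS
drives behind an HBA):

sudo smartctl -i /dev/sdb     # on SATA drives prints e.g. "SATA Version is: ... (current: 6.0 Gb/s)"
sudo smartctl -i /dev/sda     # on SAS drives it shows the transport protocol instead
dmesg | grep -i 'link up'     # libata logs the negotiated SATA link speed at boot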



>
> On Sun, Jan 14, 2018 at 09:09:41PM -0600, Justin Pryzby wrote:
>> What about sdb partitions/FS?
>
>> > > readahead?  blockdev --getra
>>
>> > > If you're running under linux, maybe you can just send the output of:
>> > > for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
>> > > or: tail
/sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>
> Justin


Re: HDD vs SSD without explanation

From
Mark Kirkwood
Date:
On 16/01/18 13:18, Fernando Hevia wrote:

>
>
>
> The 6 Gb/s interface is capable of a maximum throughput of around 600 
> Mb/s. None of your drives can achieve that so I don't think you are 
> limited to the interface speed. The 12 Gb/s interface speed advantage 
> kicks in when there are several drives installed and it won't make a 
> diference in a single drive or even a two drive system.
>
> But don't take my word for it. Test your drives throughput with the 
> command Justin suggested so you know exactly what each drive is 
> capable of:
>
>     Can you reproduce the speed difference using dd ?
>     time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>     skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>
>
> While common sense says SSD drive should outperform the mechanical 
> one, your test scenario (large volume sequential reads) evens out the 
> field a lot. Still I would have expected somewhat similar results in 
> the outcome, so yes, it is weird that the SAS drive doubles the SSD 
> performance. That is why I think there must be something else going on 
> during your tests on the SSD server. It can also be that the SSD isn't 
> working properly or you are running an suboptimal OS+server+controller 
> configuration for the drive.
>

I would second the analysis above - unless you see your read MB/s 
slammed up against 580-600 MB/s continuously, the interface speed is 
not the issue. We have some similar servers where we replaced 12x SAS 
with 1x SATA 6 Gbit/s (Intel DC S3710) SSD...and the latter way 
outperforms the original 12 SAS drives.

I suspect the problem is the particular SSD you have - I have 
benchmarked the 256GB EVO variant and was underwhelmed by the 
performance. These (budget) triple-cell NAND SSDs seem to have highly 
variable read and write performance (the write side is all about when the 
SLC NAND cache gets full)...the reads I'm not so sure about - but it could 
be a crappy chipset/firmware combination. In short, I'd recommend *not* 
using that particular SSD for a database workload. I'd recommend one of 
the Intel Datacenter DC range (FWIW I'm not affiliated with Intel in any 
way...but their DC stuff works well).

regards

Mark


Re: HDD vs SSD without explanation

From
Neto pr
Date:
2018-01-15 20:04 GMT-08:00 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
> On 16/01/18 13:18, Fernando Hevia wrote:
>
>>
>>
>>
>> The 6 Gb/s interface is capable of a maximum throughput of around 600
>> Mb/s. None of your drives can achieve that so I don't think you are limited
>> to the interface speed. The 12 Gb/s interface speed advantage kicks in when
>> there are several drives installed and it won't make a diference in a single
>> drive or even a two drive system.
>>
>> But don't take my word for it. Test your drives throughput with the
>> command Justin suggested so you know exactly what each drive is capable of:
>>
>>     Can you reproduce the speed difference using dd ?
>>     time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>>     skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>>
>>
>> While common sense says SSD drive should outperform the mechanical one,
>> your test scenario (large volume sequential reads) evens out the field a
>> lot. Still I would have expected somewhat similar results in the outcome, so
>> yes, it is weird that the SAS drive doubles the SSD performance. That is why
>> I think there must be something else going on during your tests on the SSD
>> server. It can also be that the SSD isn't working properly or you are
>> running an suboptimal OS+server+controller configuration for the drive.
>>
>
> I would second the analysis above - unless you see your read MB/s slammed up
> against 580-600MB/s contunuously then the interface speed is not the issue.
> We have some similar servers that we replaced 12x SAS with 1x SATA 6 GBit/s
> (Intel DC S3710) SSD...and the latter way outperforms the original 12 SAS
> drives.
>
> I suspect the problem is the particular SSD you have - I have benchmarked
> the 256GB EVO variant and was underwhelmed by the performance. These
> (budget) triple cell nand SSD seem to have highly variable read and write
> performance (the write is all about when the SLC nand cache gets
> full)...read I'm not so sure of - but it could be crappy chipset/firmware
> combination. In short I'd recommend *not* using that particular SSD for a
> database workload. I'd recommend one of the Intel Datacenter DC range (FWIW
> I'm not affiliated with Intel in any way...but their DC stuff works well).
>
> regards
>
> Mark

Hi Mark
In other forums, someone told me that on the Samsung EVO the partition
should be aligned to 3072 rather than the default 2048, so that it
starts on an erase-block boundary, and that the filesystem block size
should be 8 kB. I am studying this too. Some DBAs have reported, in
other situations, that SSDs become very slow when they are nearly full.
Mine is 85% full, so maybe that is also a factor. I'm disappointed with
this Samsung SSD because, in theory, the read speed of an SSD should be
more than 300 times faster than an HDD's, and that is not happening.
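
For what it's worth, a minimal sketch of how the alignment could be
checked (assuming the SSD is /dev/sdb and GNU parted is available; a
start sector of 3072 corresponds to 3072 * 512 B = 1.5 MiB):

sudo parted /dev/sdb unit s print            # shows the start sector of each partition
sudo parted /dev/sdb align-check optimal 1   # reports whether partition 1 is optimally aligned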

regards
Neto


Re: HDD vs SSD without explanation

From
Nicolas Charles
Date:
On 16/01/2018 at 11:14, Neto pr wrote:
> 2018-01-15 20:04 GMT-08:00 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
>> On 16/01/18 13:18, Fernando Hevia wrote:
>>>
>>>
>>>
>>> The 6 Gb/s interface is capable of a maximum throughput of around 600
>>> Mb/s. None of your drives can achieve that so I don't think you are limited
>>> to the interface speed. The 12 Gb/s interface speed advantage kicks in when
>>> there are several drives installed and it won't make a diference in a single
>>> drive or even a two drive system.
>>>
>>> But don't take my word for it. Test your drives throughput with the
>>> command Justin suggested so you know exactly what each drive is capable of:
>>>      Can you reproduce the speed difference using dd ?
>>>      time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>>>      skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>>>
>>>
>>> While common sense says SSD drive should outperform the mechanical one,
>>> your test scenario (large volume sequential reads) evens out the field a
>>> lot. Still I would have expected somewhat similar results in the outcome, so
>>> yes, it is weird that the SAS drive doubles the SSD performance. That is why
>>> I think there must be something else going on during your tests on the SSD
>>> server. It can also be that the SSD isn't working properly or you are
>>> running an suboptimal OS+server+controller configuration for the drive.
>>
>> I would second the analysis above - unless you see your read MB/s slammed up
>> against 580-600MB/s contunuously then the interface speed is not the issue.
>> We have some similar servers that we replaced 12x SAS with 1x SATA 6 GBit/s
>> (Intel DC S3710) SSD...and the latter way outperforms the original 12 SAS
>> drives.
>>
>> I suspect the problem is the particular SSD you have - I have benchmarked
>> the 256GB EVO variant and was underwhelmed by the performance. These
>> (budget) triple cell nand SSD seem to have highly variable read and write
>> performance (the write is all about when the SLC nand cache gets
>> full)...read I'm not so sure of - but it could be crappy chipset/firmware
>> combination. In short I'd recommend *not* using that particular SSD for a
>> database workload. I'd recommend one of the Intel Datacenter DC range (FWIW
>> I'm not affiliated with Intel in any way...but their DC stuff works well).
>>
>> regards
>>
>> Mark
> Hi Mark
> In other forums one person said me that on samsung evo should be
> partition aligned to 3072 not  default 2048 , to start on erase block
> bounduary .  And fs block should be 8kb. I am studing this too. Some
> DBAs have reported in other situations that the SSDs when they are
> full, are very slow. Mine is 85% full, so maybe that is also
> influencing. I'm disappointed with this SSD from Samsung, because in
> theory, the read speed of an SSD should be more than 300 times faster
> than an HDD and this is not happening.
>
> regards
> Neto

Hi Neto,

Unfortunately, the Samsung 850 Evo is not a particularly fast SSD - in
particular, it is not really consistent in terms of performance (see
https://www.anandtech.com/show/8747/samsung-ssd-850-evo-review/5 and
https://www.anandtech.com/bench/product/1913). This is not a product for
professional usage, and you should not expect great performance from it -
as reported by these benchmarks, you can see a 34 ms latency under very
intensive usage:
ATSB - The Destroyer, 99th percentile write latency (in microseconds,
lower is better): 34923

Even the average write latency of the Samsung 850 Evo is 3.3 ms under an
intensive workload, while the HPE 300 GB 12G SAS is reported to average
2.9 ms, and it won't suffer from write amplification.

As long as you stick with light usage, this SSD will probably be more
than capable, but if you want to host a database, you should really look
at the PRO drives.

Kind regards
Nicolas

Re: HDD vs SSD without explanation

From
Mark Kirkwood
Date:

On 16/01/18 23:14, Neto pr wrote:
> 2018-01-15 20:04 GMT-08:00 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
>> On 16/01/18 13:18, Fernando Hevia wrote:
>>
>>>
>>>
>>> The 6 Gb/s interface is capable of a maximum throughput of around 600
>>> Mb/s. None of your drives can achieve that so I don't think you are limited
>>> to the interface speed. The 12 Gb/s interface speed advantage kicks in when
>>> there are several drives installed and it won't make a diference in a single
>>> drive or even a two drive system.
>>>
>>> But don't take my word for it. Test your drives throughput with the
>>> command Justin suggested so you know exactly what each drive is capable of:
>>>
>>>      Can you reproduce the speed difference using dd ?
>>>      time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>>>      skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>>>
>>>
>>> While common sense says SSD drive should outperform the mechanical one,
>>> your test scenario (large volume sequential reads) evens out the field a
>>> lot. Still I would have expected somewhat similar results in the outcome, so
>>> yes, it is weird that the SAS drive doubles the SSD performance. That is why
>>> I think there must be something else going on during your tests on the SSD
>>> server. It can also be that the SSD isn't working properly or you are
>>> running an suboptimal OS+server+controller configuration for the drive.
>>>
>> I would second the analysis above - unless you see your read MB/s slammed up
>> against 580-600MB/s contunuously then the interface speed is not the issue.
>> We have some similar servers that we replaced 12x SAS with 1x SATA 6 GBit/s
>> (Intel DC S3710) SSD...and the latter way outperforms the original 12 SAS
>> drives.
>>
>> I suspect the problem is the particular SSD you have - I have benchmarked
>> the 256GB EVO variant and was underwhelmed by the performance. These
>> (budget) triple cell nand SSD seem to have highly variable read and write
>> performance (the write is all about when the SLC nand cache gets
>> full)...read I'm not so sure of - but it could be crappy chipset/firmware
>> combination. In short I'd recommend *not* using that particular SSD for a
>> database workload. I'd recommend one of the Intel Datacenter DC range (FWIW
>> I'm not affiliated with Intel in any way...but their DC stuff works well).
>>
>> regards
>>
>> Mark
> Hi Mark
> In other forums one person said me that on samsung evo should be
> partition aligned to 3072 not  default 2048 , to start on erase block
> bounduary .  And fs block should be 8kb. I am studing this too. Some
> DBAs have reported in other situations that the SSDs when they are
> full, are very slow. Mine is 85% full, so maybe that is also
> influencing. I'm disappointed with this SSD from Samsung, because in
> theory, the read speed of an SSD should be more than 300 times faster
> than an HDD and this is not happening.
>
>

Interesting - I didn't try changing the alignment. However, I could get 
the rated write and read performance on simple benchmarks (provided it 
was in a PCIe V3 slot)...so I figured it was ok with the default 
alignment. However, once more complex workloads were attempted 
(databases and a distributed object store), the performance was disappointing.

If the SSD is 85% full, that will not help either (also look at the 
expected lifetime of these EVOs - not that great for a server)!

One thing worth trying is messing about with the I/O scheduler: if you 
are using noop, then try deadline (like I said, crappy firmware)...
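
As a minimal sketch (assuming the SSD is /dev/sdb and you are root; the
change does not persist across reboots, and on newer blk-mq kernels the
scheduler is called mq-deadline):

cat /sys/block/sdb/queue/scheduler               # the scheduler shown in [brackets] is active
echo deadline > /sys/block/sdb/queue/scheduler   # switch this device to deadline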

Realistically, I'd recommend getting an enterprise/DC SSD (put the EVO 
in your workstation, it will be quite nice there)!

Cheers
Mark


Re: HDD vs SSD without explanation

From
Neto pr
Date:
Thanks all, but I still have not figured it out.
This is really strange, because the tests were done on the same machine
(an HP ProLiant ML110 with 8 GB RAM and a Xeon 2.8 GHz processor, 4
cores) and PostgreSQL 10.1.
- Only the mentioned query was running at the time of the test.
- I repeated the query 7 times and the results did not change.
- Before running each batch of 7 executions, I dropped the operating
system cache and restarted the DBMS, like this (see the sketch after
this list):
echo 3 > /proc/sys/vm/drop_caches

Disks:
- 2 Samsung Evo SSD units, 500 GB each (in RAID 0)
- 2 SATA 7,500 rpm HDD units, 1 TB each (in RAID 0)

- The operating system and the PostgreSQL DBMS are installed on the SSD disk.
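
The full per-run reset, as a sketch (the service name here is an
assumption; on Debian it may be postgresql@10-main or handled via
pg_ctlcluster):

sync                                 # flush dirty pages first
echo 3 > /proc/sys/vm/drop_caches    # drop page cache, dentries and inodes
systemctl restart postgresql         # restart the DBMS so shared_buffers start cold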

Best Regards
[ ]`s Neto


2018-01-16 13:24 GMT-08:00 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
>
>
> On 16/01/18 23:14, Neto pr wrote:
>>
>> 2018-01-15 20:04 GMT-08:00 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
>>>
>>> On 16/01/18 13:18, Fernando Hevia wrote:
>>>
>>>>
>>>>
>>>> The 6 Gb/s interface is capable of a maximum throughput of around 600
>>>> Mb/s. None of your drives can achieve that so I don't think you are
>>>> limited
>>>> to the interface speed. The 12 Gb/s interface speed advantage kicks in
>>>> when
>>>> there are several drives installed and it won't make a diference in a
>>>> single
>>>> drive or even a two drive system.
>>>>
>>>> But don't take my word for it. Test your drives throughput with the
>>>> command Justin suggested so you know exactly what each drive is capable
>>>> of:
>>>>
>>>>      Can you reproduce the speed difference using dd ?
>>>>      time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>>>>      skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>>>>
>>>>
>>>> While common sense says SSD drive should outperform the mechanical one,
>>>> your test scenario (large volume sequential reads) evens out the field a
>>>> lot. Still I would have expected somewhat similar results in the
>>>> outcome, so
>>>> yes, it is weird that the SAS drive doubles the SSD performance. That is
>>>> why
>>>> I think there must be something else going on during your tests on the
>>>> SSD
>>>> server. It can also be that the SSD isn't working properly or you are
>>>> running an suboptimal OS+server+controller configuration for the drive.
>>>>
>>> I would second the analysis above - unless you see your read MB/s slammed
>>> up
>>> against 580-600MB/s contunuously then the interface speed is not the
>>> issue.
>>> We have some similar servers that we replaced 12x SAS with 1x SATA 6
>>> GBit/s
>>> (Intel DC S3710) SSD...and the latter way outperforms the original 12 SAS
>>> drives.
>>>
>>> I suspect the problem is the particular SSD you have - I have benchmarked
>>> the 256GB EVO variant and was underwhelmed by the performance. These
>>> (budget) triple cell nand SSD seem to have highly variable read and write
>>> performance (the write is all about when the SLC nand cache gets
>>> full)...read I'm not so sure of - but it could be crappy chipset/firmware
>>> combination. In short I'd recommend *not* using that particular SSD for a
>>> database workload. I'd recommend one of the Intel Datacenter DC range
>>> (FWIW
>>> I'm not affiliated with Intel in any way...but their DC stuff works
>>> well).
>>>
>>> regards
>>>
>>> Mark
>>
>> Hi Mark
>> In other forums one person said me that on samsung evo should be
>> partition aligned to 3072 not  default 2048 , to start on erase block
>> bounduary .  And fs block should be 8kb. I am studing this too. Some
>> DBAs have reported in other situations that the SSDs when they are
>> full, are very slow. Mine is 85% full, so maybe that is also
>> influencing. I'm disappointed with this SSD from Samsung, because in
>> theory, the read speed of an SSD should be more than 300 times faster
>> than an HDD and this is not happening.
>>
>>
>
> Interesting - I didn't try changing the alignment. However I could get the
> rated write and read performance on simple benchmarks (provided it was in a
> PCIe V3 slot)...so figured it was ok with the default aligning. However once
> more complex workloads were attempted (databases and distributed object
> store) the performance was disappointing.
>
> If the SSD is 85% full that will not help either (also look at the expected
> lifetime of these EVO's - not that great for a server)!
>
> One thing worth trying is messing about with the IO scheduler: if you are
> using noop, then try deadline (like I said crappy firmware)...
>
> Realistically, I'd recommend getting an enterprise/DC SSD (put the EVO in
> your workstation, it will be quite nice there)!
>
> Cheers
> Mark