Thread: SSD gives disappointing speed up on OSX

SSD gives disappointing speed up on OSX

From
Bill Ross
Date:
I have a program that inserts 50M records of about 30 bytes each, with
some simple indexing, using about 5 GB of disk, layout shown below. When
I run the program without the inserts, it takes a few seconds to do just
the calculation part.

With inserts, it takes about 90 minutes to run on my macbook pro (2012)
with a spinning disk and 8G memory. Since CPU was running at 40% idle, I
figured this must be due to waiting on disk, so I swapped in an SSD. Now
on the console I see 6 Gb/s negotiated link speed on disk, vs. 3 Gb/s
before.

Surprise: raw insert speed is slower at first. CPU idle remains the
same. Comparing the outer loop of my insert, at i=2160 of 8400:

prev:  29 minutes
now:  31 minutes

But by the end, the SSD has pulled ahead:

prev: 88 minutes
now: 66 minutes

And in the next phase, which is all queries, it goes much faster than
spinning disk, for total real  72m15.728s.
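
As a rough back-of-the-envelope check on these timings, the query below turns them into an average insert rate. It is only a sketch: the row counts are the ones given in the table comments further down, and it assumes the 88 and 66 minute figures cover the whole insert phase for both tables.

```sql
-- Average insert rate implied by the end-of-run timings.
-- Assumption: 88 min (spinning disk) and 66 min (SSD) cover all inserts.
SELECT
    total_rows,
    round(total_rows / (88 * 60.0)) AS rows_per_sec_spinning,
    round(total_rows / (66 * 60.0)) AS rows_per_sec_ssd
FROM (SELECT 41480732 + 7929126 AS total_rows) AS t;
```

That works out to roughly 9,400 rows per second on the spinning disk and roughly 12,500 on the SSD.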

Now I suspect the limit is OSX throttling per-process CPU.
Does this sound right?

Thanks,
Bill

Here is my postgres config:

shared_buffers = 2048MB
temp_buffers = 32MB
work_mem = 8MB
checkpoint_segments = 32

--- 41480732 of these records

CREATE TABLE IF NOT EXISTS pic_pic_color
(
     id1        VARCHAR(10),
     id2        VARCHAR(10),
     cd        REAL
);
CREATE INDEX  color_id1_idx ON pic_pic_color (id1);
CREATE INDEX  color_id2_idx ON pic_pic_color (id2);
CREATE INDEX  color_e_idx ON pic_pic_color (cd);

--- 7929126 of these records

CREATE TABLE IF NOT EXISTS pic_pic_kwd
(
         coder        SMALLINT,
         id1        VARCHAR(10),
         id2        VARCHAR(10),
         closeness    INTEGER
);
CREATE INDEX  kwd_seq1_idx ON pic_pic_kwd (coder, id1);
CREATE INDEX  kwd_close_idx ON pic_pic_kwd (closeness DESC);

(Curious about the application? http://phobrain.com)



Re: SSD gives disappointing speed up on OSX

From
Josh Berkus
Date:
On 02/02/2016 07:32 PM, Bill Ross wrote:
> I have a program that inserts 50M records of about 30 bytes each, with
> some simple indexing, using about 5 GB of disk, layout shown below. When
> I run the program without the inserts, it takes a few seconds to do just
> the calculation part.
>
> With inserts, it takes about 90 minutes to run on my macbook pro (2012)
> with a spinning disk and 8G memory. Since CPU was running at 40% idle, I
> figured this must be due to waiting on disk, so I swapped in an SSD. Now
> on the console I see 6 Gb/s negotiated link speed on disk, vs. 3 Gb/s
> before.

So, your basic problem is going to be that OSX doesn't have a decent
filesystem to offer.  HFS+ was created 18 years ago, and it was a hack
on top of an older HFS.  At its heart, it's still basically a DOS-era
16-bit filesystem.  This means that as long as you are on Mac OS, you
can assume that writes will be slow no matter how good your hardware is.

Supposedly you can still run ZFS on OSX, which might help you.  I
haven't done it, though.

--
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: SSD gives disappointing speed up on OSX

From
Chris Mair
Date:
> I have a program that inserts 50M records of about 30 bytes each [..]


> Now I suspect the limit is OSX throttling per-process CPU.
> Does this sound right?

Mmm... I don't think so.

How do you perform the inserts?

- Single inserts per transaction?
- Bundled inserts in transactions (with or without prepared statements?)
- COPY?

Also, have you tried doing the inserts without the indexes and creating the indexes afterwards?

And finally, you might want to try synchronous_commit = off
( http://www.postgresql.org/docs/9.4/static/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT ).
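
The three suggestions above could be combined roughly as follows for the pic_pic_color table. This is a minimal sketch, not a tested recipe: the file path is hypothetical, the data is assumed to be in COPY's default tab-separated text format, and a server-side COPY from a file needs suitable privileges (psql's \copy is the client-side alternative).

```sql
-- Relax commit durability for this session only; a crash may lose the
-- most recent commits, but it does not corrupt the database.
SET synchronous_commit = off;

-- Drop the indexes so the load does not have to maintain them row by row.
DROP INDEX IF EXISTS color_id1_idx, color_id2_idx, color_e_idx;

-- Bulk load with COPY instead of individual INSERT statements.
-- '/path/to/pic_pic_color.tsv' is a placeholder, not a path from this thread.
COPY pic_pic_color (id1, id2, cd) FROM '/path/to/pic_pic_color.tsv';

-- Rebuild the indexes once, after all rows are in place.
CREATE INDEX color_id1_idx ON pic_pic_color (id1);
CREATE INDEX color_id2_idx ON pic_pic_color (id2);
CREATE INDEX color_e_idx ON pic_pic_color (cd);

-- Refresh planner statistics for the query phase.
ANALYZE pic_pic_color;
```

If the program has to keep using INSERT, grouping many rows into each transaction is the closest equivalent to the COPY step.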

Bye,
Chris.





Re: SSD gives disappointing speed up on OSX

From
FarjadFarid (ChkNet)
Date:
While the point about the old file system is correct, maybe you want to wring the best out of your current setup.

There are a few things you can do to improve these performance figures.

1) Turn off logging during the insert.
2) Ensure logging is performed on a different disk from the one holding the db files.
3) Finally, ensure that reads of the insert statements and write operations are on *different* disk drives.

The last option should improve the performance figures as it *reduces* flushing of the disk's cache in between operations.
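
One possible reading of point 1, assuming "logging" means PostgreSQL's write-ahead log rather than the server's text log, is to load into an unlogged table. The sketch below uses the pic_pic_kwd definition from the original post; note that an unlogged table is emptied after a crash, and the final ALTER TABLE step exists only in PostgreSQL 9.5 and later.

```sql
-- Assumption: "logging" is taken to mean write-ahead logging (WAL).
-- An unlogged table skips WAL, so it is truncated after a crash;
-- use this only for data that can be regenerated.
CREATE UNLOGGED TABLE IF NOT EXISTS pic_pic_kwd
(
    coder     SMALLINT,
    id1       VARCHAR(10),
    id2       VARCHAR(10),
    closeness INTEGER
);

-- ... run the inserts here ...

-- PostgreSQL 9.5 and later only: make the table crash-safe again.
ALTER TABLE pic_pic_kwd SET LOGGED;
```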



-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Josh berkus
Sent: 03 February 2016 03:53
To: Bill Ross; pgsql-general@postgresql.org
Subject: Re: [GENERAL] SSD gives disappointing speed up on OSX

On 02/02/2016 07:32 PM, Bill Ross wrote:
> I have a program that inserts 50M records of about 30 bytes each, with
> some simple indexing, using about 5 GB of disk, layout shown below.
> When I run the program without the inserts, it takes a few seconds to
> do just the calculation part.
>
> With inserts, it takes about 90 minutes to run on my macbook pro
> (2012) with a spinning disk and 8G memory. Since CPU was running at
> 40% idle, I figured this must be due to waiting on disk, so I swapped
> in an SSD. Now on the console I see 6 Gb/s negotiated link speed on
> disk, vs. 3 Gb/s before.

So, your basic problem is going to be that OSX doesn't have a decent filesystem to offer.  HFS+ was created 18 years
ago, and it was a hack on top of an older HFS.  At its heart, it's still basically a DOS-era 16-bit filesystem.  This
means that as long as you are on Mac OS, you can assume that writes will be slow no matter how good your hardware is.

Supposedly you can still run ZFS on OSX, which might help you.  I haven't done it, though.

--
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general