Postgres bulk insert/ETL performance on high speed servers - test results

From: Mike Sofen
Subject: Postgres bulk insert/ETL performance on high speed servers - test results
Date:
Msg-id: 04b701d204c2$0bd7e330$2387a990$@runbox.com
Replies: Re: Postgres bulk insert/ETL performance on high speed servers - test results
List: pgsql-performance
High level summary:  server ram has a significant impact on batch processing
performance (no surprise), and AWS processing can largely compete with local
servers IF the AWS network connection is optimized.

With the recent threads about insert performance (and performance in
general), I thought I'd share some numbers that could assist some other
Postgres users in planning their environments.

I am currently running a Postgres dev server in AWS, and we are evaluating a
high powered physical server for our data center, for which we received a
demo unit from Cisco for testing.  It was a great opportunity to test a range
of pinch points that could restrict scalability and performance, comparing
how two very different servers behave under a heavy bulk loading/transform
scenario.  The scenario is that I'm migrating mysql data ("v1", eventually
20tb of genomics data) over to a new Postgres server ("v2").

[As a side note, I'm attempting to get a third server spun up, being a high
powered AWS EC2 instance (an r3.4xlarge with 122gb ram, 16 cores, 6tb SSD
EBS Optimized with 16k guaranteed IOPS).  When I finish the testing against
the 3rd server, I'll report again.]

Landscape:
Source mysql server:  Dell physical 24 cores at 2.8ghz, 32gb ram, 1gbe
networking, Percona/mysql v5.5.3 on linux in our data center
AWS:  EC2 m4.xlarge instance with 16 gb ram, 4 cores at 2.4ghz, 3tb SSD.  PG
v9.5.1 on Red Hat 4.8.5-4 64 bit, on a 10gb Direct Connect link from our
data center to AWS.
Cisco:  Hyperflex HX240c M4 node with UCS B200 M4 blade, with 256gb ram, 48
cores at 2.2ghz, 4tb direct attached Intel flash (SSD) for the OS, 10tb of
NetApp Filer SSD storage via 4gb HBA cards.  PG v9.5.1 on Red Hat 4.8.5-4 64
bit, 10gbe networking but has to throttle down to 1gbe when talking to the
mysql source server.

PASS 1:
Process:      Extract (pull the raw v1 data over the network to the 32 v2
staging tables - a rough code sketch of this step follows below)
Num Source Rows:      8,232,673 (Small Test)
Rowcount Compression:      1.0 (1:1 copy)
AWS Time in Secs:      1,516**
Cisco Time in Secs:      376
Difference:      4.0x
Comment:      AWS:  5.7k rows/sec    Cisco:  21.9k rows/sec
(**network speed appears to be the factor, see notes below)
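
To make the Extract step concrete, here is a rough Python sketch of that kind
of pull-and-COPY loop.  It is an illustration only, not the actual pipeline
code: the connection settings, table and column names are invented, and it
covers just one of the 32 staging tables.

    # Illustrative extract loop: stream rows from the mysql v1 server and
    # bulk-load them into a Postgres v2 staging table via COPY.  All names
    # and settings below are hypothetical.
    import csv
    import io

    import pymysql      # any DB-API mysql driver would do
    import psycopg2

    BATCH = 50_000      # rows per COPY round trip (tuning assumption)

    src = pymysql.connect(host="mysql-v1", user="etl", password="...",
                          db="genomics")
    dst = psycopg2.connect(host="pg-v2", user="etl", password="...",
                           dbname="genomics_v2")

    with src.cursor() as read_cur, dst.cursor() as write_cur:
        read_cur.execute("SELECT id, sample_id, payload FROM v1_results")
        while True:
            rows = read_cur.fetchmany(BATCH)
            if not rows:
                break
            # Render the batch as CSV in memory and hand it to COPY, which
            # is far cheaper than row-by-row INSERTs (NULL handling omitted).
            buf = io.StringIO()
            csv.writer(buf).writerows(rows)
            buf.seek(0)
            write_cur.copy_expert(
                "COPY staging.v1_results (id, sample_id, payload) "
                "FROM STDIN WITH (FORMAT csv)",
                buf,
            )
        dst.commit()

COPY is generally the cheapest way to get bulk rows into Postgres; with this
kind of loop, the pull side over the 1gbe link to the source server is the
likelier limiter, which matches the network notes below.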

Process:      Transform/Load (all work local to the server - read,
transform, write as a single batch - a rough code sketch follows below)
Num Source Rows:      5,575,255 (many smaller batches from the source
tables, all writes going to a single target table)
Avg Rowcount Compression:      10.3 (jsonb row compression resulting in 10x
fewer rows)
AWS Time in Secs:      408
Cisco Time in Secs:      294
Difference:      1.4x  (the Cisco is 40% faster...not a huge difference)
Comment:      AWS:  13.6k rows/sec   Cisco:  19k rows/sec
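
The Transform/Load step is essentially a grouped rewrite of many narrow
staging rows into one wide jsonb row per logical key, which is where the
rowcount compression above comes from.  Below is a simplified sketch of that
kind of statement, run entirely on the server side; table, column, and key
names are again invented, and the actual transform logic isn't shown here.

    # Illustrative transform/load: collapse N staging rows into 1 jsonb row
    # per key, entirely inside Postgres.  Names are hypothetical.
    import psycopg2

    dst = psycopg2.connect(host="pg-v2", user="etl", password="...",
                           dbname="genomics_v2")

    TRANSFORM_SQL = """
        INSERT INTO v2.results (sample_id, attributes)
        SELECT sample_id,
               jsonb_object_agg(attr_name, attr_value)  -- many rows -> one jsonb doc
        FROM   staging.v1_results
        GROUP  BY sample_id;
    """

    with dst, dst.cursor() as cur:   # psycopg2: 'with conn' commits on success
        cur.execute(TRANSFORM_SQL)
        print("wrote", cur.rowcount, "jsonb rows")

Keeping the read-transform-write inside a single SQL statement avoids
shipping rows back to the client, which is the point of "all work local to
the server" above.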

Notes:  The testing has revealed an issue with the networking in our data
center, which appears to be causing abnormally slow transfer speed to AWS.
That is being investigated.  So if we look at just the Transform/Load
process, we can see that both AWS and the local Cisco server have comparable
processing speeds on the small dataset.

However, when I moved to a medium-sized dataset of 204m rows, a different
pattern emerged.  I'm including just the Transform/Load process here,
testing just ONE table out of the batch:

PASS 2:
Process:      Transform/Load (all work local to the server - read,
transform, write as a single batch)
Num Source Rows:      10,554,800 (one batch from just a single source table
going to a single target table)
Avg Rowcount Compression:      31.5 (jsonb row compression resulting in
31.5x fewer rows)
AWS Time in Secs:      2,493 (41.5 minutes)
Cisco Time in Secs:      661 (11 minutes)
Difference:      3.8x
Comment:      AWS:  4.2k rows/sec   Cisco:  16k rows/sec

It's obvious the size of the batch exceeded the AWS server's memory,
resulting in profoundly slower processing.  Pass 1 and Pass 2 were a true
apples-to-apples comparison: average row lengths were within 7% of each
other (1121 vs 1203), the table structures and processing code were
identical, and within each pass the only difference was the target server.

I'm happy to answer questions about these results.

Mike Sofen (USA)


