Обсуждение: Very slow checkpoints

Поиск

Список

Период

Сортировка

Very slow checkpoints

От

Steven Jones

Дата:

18 марта 2015 г., 14:21:15

Hi,

We have a so far (to us) unexplainable issue on our production systems after we roughly doubled the amount of data we import daily. We should be ok on pure theoretical hardware performance, but we are seeing some weird IO counters when the actual throughput of the writes is very low. The use case is as follows:

- typical DW - relatively constant periodic data loads - i.e. heavy write

- we receive large CSV files ~ 5-10Gb every 15 minutes spread out across 5-7 minutes

- Custom ETL scripts process and filter files within < 30 seconds down to about 5Gb CSV ready to load

- 2 loader queues load the files, picking off a file one-by-one

- tables are partitioned daily, indexed on a primary key + timestamp

- system is HP blade; 128Gb RAM, 2x 8-core, 12x 10k RPM RAID1+0 (database) on first controller, 2x 15k RAID1 (xlog) on a different controller

- DB size is ~2.5Tb; rotating load of 30 days keeps the database stable

- filesystem: zfs with lz4 compression

- raw throughput of the database disk is > 700Mbytes/sec sequential and >150Mbytes random for read and roughly half for write in various benchmarks

- CPU load is minimal when copy loads are taking place (i.e. after ETL has finished)

The issue is that the system is constantly checkpointing regardless of various kernel and postgres settings. Having read through most of the history of this list and most of the recommendations on various blogs, we have been unable to find an answer why the checkpoints are being written so slowly. Even when we disable all import processes or if index is dropped, the checkpoint is still taking > 1hour. Stats are pointing to checkpoint sizes of roughly 7Gb which should take < 1min even with full random reads; so even when imports are fully disabled, what is not making sense is why would the checkpointing be taking well over an hour?

One other thing that's noticed, but not measured, i.e. mostly anecdotal is that for a period of <1hr when postgres is restarted, the system performs mostly fine and checkpoints are completing in <5min; so it may be that after a while some (OS/postgres) buffers are filling up and causing this issue?

Full iostat/iotop, configuration, checkpoint stats, etc. are pasted below for completeness. Highlights are:

checkpoint_segments=512

shared_buffers=16GB

checkpoint_timeout=15min

checkpoint_completion_target=0.1

Regards,

Steve

---

Checkpoint stats:

db=# select * from pg_stat_bgwriter;

checkpoints_timed 6

checkpoints_req 3

checkpoint_write_time 26346184

checkpoint_sync_time 142

buffers_checkpoint 4227065

buffers_clean 4139841

maxwritten_clean 8261

buffers_backend 9128583

buffers_backend_fsync 0

buffers_alloc 9311478

stats_reset 2015-03-17 11:14:21.5649

---

postgres log file - checkpoint log entries:

2015-03-17 11:25:25 LOG: checkpoint complete: wrote 855754 buffers (40.8%); 0 transaction log file(s) added, 0 removed, 500 recycled; write=2988.185 s, sync=0.044 s, total=2988.331 s; sync files=110, longest=0.003 s, average=0.000 s

2015-03-17 11:25:25 LOG: checkpoint starting: xlog time

2015-03-17 11:59:54 LOG: parameter "checkpoint_completion_target" changed to "0.9"

2015-03-17 13:30:20 LOG: checkpoint complete: wrote 1012112 buffers (48.3%); 0 transaction log file(s) added, 0 removed, 512 recycled; write=7494.228 s, sync=0.021 s, total=7494.371 s; sync files=119, longest=0.001 s, average=0.000 s

2015-03-17 13:30:20 LOG: checkpoint starting: xlog time

2015-03-17 14:21:53 LOG: parameter "checkpoint_completion_target" changed to "0.1"

2015-03-17 16:00:58 LOG: checkpoint complete: wrote 1411979 buffers (67.3%); 0 transaction log file(s) added, 696 removed, 900 recycled; write=9036.898 s, sync=0.020 s, total=9038.538 s; sync files=109, longest=0.000 s, average=0.000 s

2015-03-17 16:00:58 LOG: checkpoint starting: time

2015-03-17 16:28:40 LOG: checkpoint complete: wrote 345183 buffers (16.5%); 0 transaction log file(s) added, 2001 removed, 0 recycled; write=1660.333 s, sync=0.018 s, total=1661.816 s; sync files=93, longest=0.002 s, average=0.000 s

2015-03-17 17:28:40 LOG: checkpoint starting: time

2015-03-17 18:54:47 LOG: checkpoint complete: wrote 602037 buffers (28.7%); 0 transaction log file(s) added, 0 removed, 500 recycled; write=5166.540 s, sync=0.039 s, total=5166.657 s; sync files=122, longest=0.003 s, average=0.000 s

2015-03-17 18:54:47 LOG: checkpoint starting: xlog time

---

iostat -x snapshot:

avg-cpu: %user %nice %system %iowait %steal %idle

0.50 0.00 2.35 15.09 0.00 82.05

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util

sda 0.00 0.00 0.00 5.00 0.00 2056.00 822.40 0.00 0.00 0.00 0.00 0.00 0.00

sdb 0.00 0.00 1055.00 549.00 41166.50 22840.00 79.81 5.28 3.28 4.94 0.10 0.62 100.00

---

vmstat 60 output

# vmstat 60

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----

r b swpd free buff cache si so bi bo in cs us sy id wa

5 3 877508 1251152 74476 98853728 0 0 87 1891 0 0 1 5 92 2

6 5 877508 915044 74940 99237840 0 0 46588 41857 6993 41784 8 4 76 12

2 4 877508 1676008 75292 98577936 0 0 46847 34540 4778 17175 9 3 75 13

---

sysctl settings for dirty pages

vm.dirty_background_bytes = 0

vm.dirty_background_ratio = 5

vm.dirty_bytes = 0

vm.dirty_expire_centisecs = 3000

vm.dirty_ratio = 10

vm.dirty_writeback_centisecs = 500

---

# free -m

total used free shared buffers cached

Mem: 128905 126654 2250 0 70 95035

-/+ buffers/cache: 31549 97355

Swap: 15255 856 14399

---

postgres settings:

# cat postgresql.conf |grep check

checkpoint_segments = 512 # in logfile segments, min 1, 16MB each

checkpoint_timeout = 15min # range 30s-1h

checkpoint_completion_target = 0.1 # checkpoint target duration, 0.0 - 1.0

checkpoint_warning = 10min # 0 disables

log_checkpoints = on

shared_buffers = 16384MB

bgwriter_lru_maxpages = 500

#bgwriter_lru_multiplier = 2.0 # 0-10.0 multipler on buffers scanned/round

wal_level = 'archive'

archive_mode = on

archive_command = 'cd .' # we can also use exit 0, anything that

max_wal_senders = 0

wal_keep_segments = 500

hot_standby = off

---

iotop snapshot:

Total DISK READ: 41.63 M/s | Total DISK WRITE: 31.43 M/s

TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND

32101 be/4 postgres 10.25 M/s 1085.86 K/s 0.00 % 96.80 % postgres: checkpointer process

56661 be/4 postgres 6.84 M/s 591.61 K/s 0.00 % 90.91 % postgres: dbauser db [local] COPY

56751 be/4 postgres 6.97 M/s 838.73 K/s 0.00 % 88.00 % postgres: dbauser db [local] COPY

56744 be/4 postgres 6.13 M/s 958.55 K/s 0.00 % 85.48 % postgres: dbauser db [local] COPY

56621 be/4 postgres 6.77 M/s 1288.05 K/s 0.00 % 83.96 % postgres: dbauser db [local] COPY

32102 be/4 postgres 8.05 M/s 1340.47 K/s 0.00 % 82.47 % postgres: writer process

1005 be/0 root 0.00 B/s 0.00 B/s 0.00 % 5.81 % [txg_sync]

32103 be/4 postgres 0.00 B/s 10.41 M/s 0.00 % 0.52 % postgres: wal writer process

---

Re: Very slow checkpoints

От

Steven Jones

Дата:

18 марта 2015 г., 14:25:36

Apologies about the formatting; resending again as plain-text. Regards, Steve From: steven.jones1201@outlook.com To: pgsql-performance@postgresql.org Subject: [PERFORM] Very slow checkpoints Date: Wed, 18 Mar 2015 11:21:08 +0000 Hi, We have a so far (to us) unexplainable issue on our production systems after we roughly doubled the amount of data we import daily. We should be ok on pure theoretical hardware performance, but we are seeing some weird IO counters when the actual throughput of the writes is very low. The use case is as follows: - typical DW - relatively constant periodic data loads - i.e. heavy write - we receive large CSV files ~ 5-10Gb every 15 minutes spread out across 5-7 minutes - Custom ETL scripts process and filter files within < 30 seconds down to about 5Gb CSV ready to load - 2 loader queues load the files, picking off a file one-by-one - tables are partitioned daily, indexed on a primary key + timestamp - system is HP blade; 128Gb RAM, 2x 8-core, 12x 10k RPM RAID1+0 (database) on first controller, 2x 15k RAID1 (xlog) on a different controller - DB size is ~2.5Tb; rotating load of 30 days keeps the database stable - filesystem: zfs with lz4 compression - raw throughput of the database disk is> 700Mbytes/sec sequential and>150Mbytes random for read and roughly half for write in various benchmarks - CPU load is minimal when copy loads are taking place (i.e. after ETL has finished) The issue is that the system is constantly checkpointing regardless of various kernel and postgres settings. Having read through most of the history of this list and most of the recommendations on various blogs, we have been unable to find an answer why the checkpoints are being written so slowly. Even when we disable all import processes or if index is dropped, the checkpoint is still taking> 1hour. Stats are pointing to checkpoint sizes of roughly 7Gb which should take < 1min even with full random reads; so even when imports are fully disabled, what is not making sense is why would the checkpointing be taking well over an hour? One other thing that's noticed, but not measured, i.e. mostly anecdotal is that for a period of <1hr when postgres is restarted, the system performs mostly fine and checkpoints are completing in <5min; so it may be that after a while some (OS/postgres) buffers are filling up and causing this issue? Full iostat/iotop, configuration, checkpoint stats, etc. are pasted below for completeness. Highlights are: checkpoint_segments=512 shared_buffers=16GB checkpoint_timeout=15min checkpoint_completion_target=0.1 Regards, Steve --- Checkpoint stats: db=# select * from pg_stat_bgwriter; checkpoints_timed 6 checkpoints_req 3 checkpoint_write_time 26346184 checkpoint_sync_time 142 buffers_checkpoint 4227065 buffers_clean 4139841 maxwritten_clean 8261 buffers_backend 9128583 buffers_backend_fsync 0 buffers_alloc 9311478 stats_reset 2015-03-17 11:14:21.5649 --- postgres log file - checkpoint log entries: 2015-03-17 11:25:25 LOG: checkpoint complete: wrote 855754 buffers (40.8%); 0 transaction log file(s) added, 0 removed, 500 recycled; write=2988.185 s, sync=0.044 s, total=2988.331 s; sync files=110, longest=0.003 s, average=0.000 s 2015-03-17 11:25:25 LOG: checkpoint starting: xlog time 2015-03-17 11:59:54 LOG: parameter "checkpoint_completion_target" changed to "0.9" 2015-03-17 13:30:20 LOG: checkpoint complete: wrote 1012112 buffers (48.3%); 0 transaction log file(s) added, 0 removed, 512 recycled; write=7494.228 s, sync=0.021 s, total=7494.371 s; sync files=119, longest=0.001 s, average=0.000 s 2015-03-17 13:30:20 LOG: checkpoint starting: xlog time 2015-03-17 14:21:53 LOG: parameter "checkpoint_completion_target" changed to "0.1" 2015-03-17 16:00:58 LOG: checkpoint complete: wrote 1411979 buffers (67.3%); 0 transaction log file(s) added, 696 removed, 900 recycled; write=9036.898 s, sync=0.020 s, total=9038.538 s; sync files=109, longest=0.000 s, average=0.000 s 2015-03-17 16:00:58 LOG: checkpoint starting: time 2015-03-17 16:28:40 LOG: checkpoint complete: wrote 345183 buffers (16.5%); 0 transaction log file(s) added, 2001 removed, 0 recycled; write=1660.333 s, sync=0.018 s, total=1661.816 s; sync files=93, longest=0.002 s, average=0.000 s 2015-03-17 17:28:40 LOG: checkpoint starting: time 2015-03-17 18:54:47 LOG: checkpoint complete: wrote 602037 buffers (28.7%); 0 transaction log file(s) added, 0 removed, 500 recycled; write=5166.540 s, sync=0.039 s, total=5166.657 s; sync files=122, longest=0.003 s, average=0.000 s 2015-03-17 18:54:47 LOG: checkpoint starting: xlog time --- iostat -x snapshot: avg-cpu: %user %nice %system %iowait %steal %idle 0.50 0.00 2.35 15.09 0.00 82.05 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.00 0.00 5.00 0.00 2056.00 822.40 0.00 0.00 0.00 0.00 0.00 0.00 sdb 0.00 0.00 1055.00 549.00 41166.50 22840.00 79.81 5.28 3.28 4.94 0.10 0.62 100.00 --- vmstat 60 output # vmstat 60 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 5 3 877508 1251152 74476 98853728 0 0 87 1891 0 0 1 5 92 2 6 5 877508 915044 74940 99237840 0 0 46588 41857 6993 41784 8 4 76 12 2 4 877508 1676008 75292 98577936 0 0 46847 34540 4778 17175 9 3 75 13 --- sysctl settings for dirty pages vm.dirty_background_bytes = 0 vm.dirty_background_ratio = 5 vm.dirty_bytes = 0 vm.dirty_expire_centisecs = 3000 vm.dirty_ratio = 10 vm.dirty_writeback_centisecs = 500 --- # free -m total used free shared buffers cached Mem: 128905 126654 2250 0 70 95035 -/+ buffers/cache: 31549 97355 Swap: 15255 856 14399 --- postgres settings: # cat postgresql.conf |grep check checkpoint_segments = 512 # in logfile segments, min 1, 16MB each checkpoint_timeout = 15min # range 30s-1h checkpoint_completion_target = 0.1 # checkpoint target duration, 0.0 - 1.0 checkpoint_warning = 10min # 0 disables log_checkpoints = on # cat postgresql.conf |egrep -e 'wal|arch|hot|lru|shared' shared_buffers = 16384MB bgwriter_lru_maxpages = 500 #bgwriter_lru_multiplier = 2.0 # 0-10.0 multipler on buffers scanned/round wal_level = 'archive' archive_mode = on archive_command = 'cd .' # we can also use exit 0, anything that max_wal_senders = 0 wal_keep_segments = 500 hot_standby = off --- iotop snapshot: Total DISK READ: 41.63 M/s | Total DISK WRITE: 31.43 M/s TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND 32101 be/4 postgres 10.25 M/s 1085.86 K/s 0.00 % 96.80 % postgres: checkpointer process 56661 be/4 postgres 6.84 M/s 591.61 K/s 0.00 % 90.91 % postgres: dbauser db [local] COPY 56751 be/4 postgres 6.97 M/s 838.73 K/s 0.00 % 88.00 % postgres: dbauser db [local] COPY 56744 be/4 postgres 6.13 M/s 958.55 K/s 0.00 % 85.48 % postgres: dbauser db [local] COPY 56621 be/4 postgres 6.77 M/s 1288.05 K/s 0.00 % 83.96 % postgres: dbauser db [local] COPY 32102 be/4 postgres 8.05 M/s 1340.47 K/s 0.00 % 82.47 % postgres: writer process 1005 be/0 root 0.00 B/s 0.00 B/s 0.00 % 5.81 % [txg_sync] 32103 be/4 postgres 0.00 B/s 10.41 M/s 0.00 % 0.52 % postgres: wal writer process ---

Re: Very slow checkpoints

От

Ilya Kosmodemiansky

Дата:

18 марта 2015 г., 14:43:34

Hi Steven,

On Wed, Mar 18, 2015 at 12:21 PM, Steven Jones
<steven.jones1201@outlook.com> wrote:
>   - system is HP blade; 128Gb RAM, 2x 8-core, 12x 10k RPM RAID1+0 (database)

Have you BBU on your controller? And how your controller configured, I
mean cache mode, io mode, disk write cache mode. You have 15K SAS
(which form factor?) under WAL and 10K SAS under database, am I
correct?

> Full iostat/iotop, configuration, checkpoint stats, etc. are pasted below
> for completeness. Highlights are:
> checkpoint_segments=512
> shared_buffers=16GB
> checkpoint_timeout=15min
> checkpoint_completion_target=0.1

It looks like your checkpoint settings are a bit strange besides of
everything else. If you chose high value for checkpoint_segments, your
aim is to avoid checkpoints by timeout (or vice verse). If you have
checkpoint_segments=512, your checkpoint_timeout should be about
60min. And anyway - checkpoint_completion_target=0.9 or 0.7 in order
to spread disk load between checkpoints.

> ---
> sysctl settings for dirty pages
>
> vm.dirty_background_bytes = 0
> vm.dirty_background_ratio = 5
> vm.dirty_bytes = 0
> vm.dirty_expire_centisecs = 3000
> vm.dirty_ratio = 10
> vm.dirty_writeback_centisecs = 500

Values for this settings are really dependent of RAID (and BBU size).

And about further problem description: have you any graphical
representation of your % disc utilization?

Best regards,
Ilya

--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com

Re: Very slow checkpoints

От

Steven Jones

Дата:

18 марта 2015 г., 14:58:50

Hi Ilya,

Thank you for the response.

Yes BBU is on the controller; 1024Mb. It is a HP P410i controller, with write caching turned on the controller; off on
disklevel. 

2x 15k SAS SFF for WAL and 12x 10k SAS SFF for DB

We have tried longer settings for checkpoint_timeout, but not 1hr; so we will try that as well.

We don't at this stage have any graphs, but we will set it up over the next 24hrs at least.

Regards,
Steve

----------------------------------------
> From: ilya.kosmodemiansky@postgresql-consulting.com
> Date: Wed, 18 Mar 2015 12:42:43 +0100
> Subject: Re: [PERFORM] Very slow checkpoints
> To: steven.jones1201@outlook.com
> CC: pgsql-performance@postgresql.org
>
> Hi Steven,
>
> On Wed, Mar 18, 2015 at 12:21 PM, Steven Jones
> <steven.jones1201@outlook.com> wrote:
>> - system is HP blade; 128Gb RAM, 2x 8-core, 12x 10k RPM RAID1+0 (database)
>
> Have you BBU on your controller? And how your controller configured, I
> mean cache mode, io mode, disk write cache mode. You have 15K SAS
> (which form factor?) under WAL and 10K SAS under database, am I
> correct?
>
>> Full iostat/iotop, configuration, checkpoint stats, etc. are pasted below
>> for completeness. Highlights are:
>> checkpoint_segments=512
>> shared_buffers=16GB
>> checkpoint_timeout=15min
>> checkpoint_completion_target=0.1
>
> It looks like your checkpoint settings are a bit strange besides of
> everything else. If you chose high value for checkpoint_segments, your
> aim is to avoid checkpoints by timeout (or vice verse). If you have
> checkpoint_segments=512, your checkpoint_timeout should be about
> 60min. And anyway - checkpoint_completion_target=0.9 or 0.7 in order
> to spread disk load between checkpoints.
>
>
>
>> 
>> sysctl settings for dirty pages
>>
>> vm.dirty_background_bytes = 0
>> vm.dirty_background_ratio = 5
>> vm.dirty_bytes = 0
>> vm.dirty_expire_centisecs = 3000
>> vm.dirty_ratio = 10
>> vm.dirty_writeback_centisecs = 500
>
> Values for this settings are really dependent of RAID (and BBU size).
>
> And about further problem description: have you any graphical
> representation of your % disc utilization?
>
> Best regards,
> Ilya
>
> --
> Ilya Kosmodemiansky,
>
> PostgreSQL-Consulting.com
> tel. +14084142500
> cell. +4915144336040
> ik@postgresql-consulting.com
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

Re: Very slow checkpoints

От

Ilya Kosmodemiansky

Дата:

18 марта 2015 г., 15:14:25

On Wed, Mar 18, 2015 at 12:58 PM, Steven Jones
<steven.jones1201@outlook.com> wrote:
> Yes BBU is on the controller; 1024Mb. It is a HP P410i controller, with write caching turned on the controller; off
ondisk level. 


vm.dirty_background_bytes=67108864 and vm.dirty_bytes=536870912 looks
resonable for 512MB BBU, you can calculate them for 1024 or
recalculate them for dirty_background_ratio

By the way, which kernel do you use?

> We don't at this stage have any graphs, but we will set it up over the next 24hrs at least.

Do not forget to have iostat statistics on them, at least latency,
%iowait and %util, such parameters are very helpful.

And I am always suspicious about zfs under heavy writes. It is
reliable and quite comfortable in terms of configuration, but for
speed ext4 or xfs with disabled barrier looks more reasonable

>>
>> --
>> Ilya Kosmodemiansky,
>>
>> PostgreSQL-Consulting.com
>> tel. +14084142500
>> cell. +4915144336040
>> ik@postgresql-consulting.com
>>
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>



--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com

Re: Very slow checkpoints

От

Ilya Kosmodemiansky

Дата:

18 марта 2015 г., 15:18:02

On Wed, Mar 18, 2015 at 12:21 PM, Steven Jones
<steven.jones1201@outlook.com> wrote:
>   - typical DW - relatively constant periodic data loads - i.e. heavy write
>   - we receive large CSV files ~ 5-10Gb every 15 minutes spread out across
> 5-7 minutes
>   - DB size is ~2.5Tb; rotating load of 30 days keeps the database stable

And an important addition: how your autovacuum is configured?


--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com

Re: Very slow checkpoints

От

Steven Jones

Дата:

18 марта 2015 г., 15:19:17

Thanks.

Kernel is: 3.5.0-36-generic #57~precise1-Ubuntu

Also - xlog is on ext4; db partition is zfs (on top of hardware RAID1+0).

Regards,
Steve

----------------------------------------
> From: ilya.kosmodemiansky@postgresql-consulting.com
> Date: Wed, 18 Mar 2015 13:13:36 +0100
> Subject: Re: [PERFORM] Very slow checkpoints
> To: steven.jones1201@outlook.com
> CC: pgsql-performance@postgresql.org
>
> On Wed, Mar 18, 2015 at 12:58 PM, Steven Jones
> <steven.jones1201@outlook.com> wrote:
>> Yes BBU is on the controller; 1024Mb. It is a HP P410i controller, with write caching turned on the controller; off
ondisk level. 
>
>
> vm.dirty_background_bytes=67108864 and vm.dirty_bytes=536870912 looks
> resonable for 512MB BBU, you can calculate them for 1024 or
> recalculate them for dirty_background_ratio
>
> By the way, which kernel do you use?
>
>> We don't at this stage have any graphs, but we will set it up over the next 24hrs at least.
>
> Do not forget to have iostat statistics on them, at least latency,
> %iowait and %util, such parameters are very helpful.
>
> And I am always suspicious about zfs under heavy writes. It is
> reliable and quite comfortable in terms of configuration, but for
> speed ext4 or xfs with disabled barrier looks more reasonable
>
>>>
>>> --
>>> Ilya Kosmodemiansky,
>>>
>>> PostgreSQL-Consulting.com
>>> tel. +14084142500
>>> cell. +4915144336040
>>> ik@postgresql-consulting.com
>>>
>>>
>>> --
>>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-performance
>>
>
>
>
> --
> Ilya Kosmodemiansky,
>
> PostgreSQL-Consulting.com
> tel. +14084142500
> cell. +4915144336040
> ik@postgresql-consulting.com
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

Re: Very slow checkpoints

От

Steven Jones

Дата:

18 марта 2015 г., 15:21:23

Autovacuum - default:

#autovacuum = on                        # Enable autovacuum subprocess?  'on'
#log_autovacuum_min_duration = -1       # -1 disables, 0 logs all actions and
#autovacuum_max_workers = 3             # max number of autovacuum subprocesses
#autovacuum_naptime = 1min              # time between autovacuum runs
autovacuum_vacuum_threshold = 500       # min number of row updates before
autovacuum_analyze_threshold = 500      # min number of row updates before
#autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1  # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000  # maximum XID age before forced vacuum
#autovacuum_vacuum_cost_delay = 20ms    # default vacuum cost delay for
                                        # autovacuum, in milliseconds;
#autovacuum_vacuum_cost_limit = -1      # default vacuum cost limit for
                                        # autovacuum, -1 means use

Regards,
Steve

----------------------------------------
> From: ilya.kosmodemiansky@postgresql-consulting.com
> Date: Wed, 18 Mar 2015 13:17:14 +0100
> Subject: Re: [PERFORM] Very slow checkpoints
> To: steven.jones1201@outlook.com
> CC: pgsql-performance@postgresql.org
>
> On Wed, Mar 18, 2015 at 12:21 PM, Steven Jones
> <steven.jones1201@outlook.com> wrote:
>> - typical DW - relatively constant periodic data loads - i.e. heavy write
>> - we receive large CSV files ~ 5-10Gb every 15 minutes spread out across
>> 5-7 minutes
>> - DB size is ~2.5Tb; rotating load of 30 days keeps the database stable
>
> And an important addition: how your autovacuum is configured?
>
>
> --
> Ilya Kosmodemiansky,
>
> PostgreSQL-Consulting.com
> tel. +14084142500
> cell. +4915144336040
> ik@postgresql-consulting.com
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

Re: Very slow checkpoints

От

Ilya Kosmodemiansky

Дата:

18 марта 2015 г., 15:29:12

On Wed, Mar 18, 2015 at 1:21 PM, Steven Jones
<steven.jones1201@outlook.com> wrote:
> #autovacuum = on                        # Enable autovacuum subprocess?  'on'
> #log_autovacuum_min_duration = -1       # -1 disables, 0 logs all actions and
> #autovacuum_max_workers = 3             # max number of autovacuum subprocesses
> #autovacuum_naptime = 1min              # time between autovacuum runs
> autovacuum_vacuum_threshold = 500       # min number of row updates before
> autovacuum_analyze_threshold = 500      # min number of row updates before
> #autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before vacuum
> #autovacuum_analyze_scale_factor = 0.1  # fraction of table size before analyze

If I were you, Ill use _scale_factor settings instead of threshold,
because it makes your autovacuum aggressive enough (you need it on
such workload) without firing too frequently (vacuuming has its
price).  autovacuum_vacuum_scale_factor = 0.01 and
autovacuum_analyze_scale_factor = 0.05 will be OK

And if you see all your autovacuum workers active all the time (more
than 80% of the time for example) it is a reason to increase
autovacuum_max_workers

--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com

Re: Very slow checkpoints

От

didier

Дата:

18 марта 2015 г., 18:27:43

Hi,

On Wed, Mar 18, 2015 at 12:21 PM, Steven Jones
<steven.jones1201@outlook.com> wrote:
> Hi,

> iostat -x snapshot:
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.50    0.00    2.35   15.09    0.00   82.05
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz
> avgqu-sz   await r_await w_await  svctm  %util
> sda               0.00     0.00    0.00    5.00     0.00  2056.00   822.40
> 0.00    0.00    0.00    0.00   0.00   0.00
> sdb               0.00     0.00 1055.00  549.00 41166.50 22840.00    79.81
> 5.28    3.28    4.94    0.10   0.62 100.00
Your sdb is saturated...


> ---
> iotop snapshot:
> Total DISK READ:      41.63 M/s | Total DISK WRITE:      31.43 M/s
>   TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
> 32101 be/4 postgres   10.25 M/s 1085.86 K/s  0.00 % 96.80 % postgres:
> checkpointer process
> 56661 be/4 postgres    6.84 M/s  591.61 K/s  0.00 % 90.91 % postgres:
> dbauser db [local] COPY

> 32102 be/4 postgres    8.05 M/s 1340.47 K/s  0.00 % 82.47 % postgres: writer
> process
>  1005 be/0 root        0.00 B/s    0.00 B/s  0.00 %  5.81 % [txg_sync]
> 32103 be/4 postgres    0.00 B/s   10.41 M/s  0.00 %  0.52 % postgres: wal
> writer process

Why are checkpointer process and writer process reading at 18 MB/s ?
I have no experience with zfs but could it be related to COW and
recordsize? I have no idea if these reads are counted in iotop output
though.

Re: Very slow checkpoints

От

Steven Jones

Дата:

19 марта 2015 г., 08:28:54

Hi,

>>
>> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz
>> avgqu-sz await r_await w_await svctm %util
>> sda 0.00 0.00 0.00 5.00 0.00 2056.00 822.40
>> 0.00 0.00 0.00 0.00 0.00 0.00
>> sdb 0.00 0.00 1055.00 549.00 41166.50 22840.00 79.81
>> 5.28 3.28 4.94 0.10 0.62 100.00
> Your sdb is saturated...

Yes that's what iostat seems to indicate, but it's weird because at the same time it is reporting 100% io utilization I
canhit the disk write (seq) at> 250Mbyte/sec: 

# sync;time bash -c "(dd if=/dev/sda1 of=bf bs=8k count=500000; sync)"
500000+0 records in
500000+0 records out
4096000000 bytes (4.1 GB) copied, 14.8575 s, 276 MB/s

real    0m14.896s
user    0m0.068s
sys    0m10.157s


> Why are checkpointer process and writer process reading at 18 MB/s ?
> I have no experience with zfs but could it be related to COW and
> recordsize? I have no idea if these reads are counted in iotop output
> though.

In general, some random disk write benchmarks and varying block sizes don't have a huge effect.  But for some reason
thecheckpointing process is just simply writing checkpoints too slowly. In the meantime the COPY is piling up logs
whilethe previous checkpoint is still being written, so the next one starts straight away and no setting is able to
splitthem up. 


Regards,
Steve

Re: Very slow checkpoints

От

Joao Junior

Дата:

19 марта 2015 г., 22:40:52

Would not be the case slightly decrease the shared buffer, lower your checkpoints_timeout for eg 5 minutes or decrease checkpoints_segments and set the checkpoint_completation_target to 0.5 to not mess up the next checkpoints?

What logs tell me is that a checkpoint occurs immediately to the other.

When you have a large shared_buffer and checkpoint_completation_target value near the maximum time between checkpoints, which I imagine is very spread given for a very long time interval
.
As your shared_buffer is relatively large, more data should be handled by the checkpoint. The checkpoints are being triggered by time (15 minutes) accumulating a good amount of data.

All this happens every time you load your CVS, and when the writing process of checkpoints reaches the end occurs immediately another.

It seems reasonable high disk activity performed by checkpoint and writing processes, performing theirs job duties amid a high load on the shared_buffer.

Regards

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Very slow checkpoints