Re: wal-size limited to 16MB - Performance issue for subsequent backup

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: wal-size limited to 16MB - Performance issue for subsequent backup
Дата
Msg-id 20141020194940.GE7176@awork2.anarazel.de
обсуждение исходный текст
Ответ на Re: wal-size limited to 16MB - Performance issue for subsequent backup  (jesper@krogh.cc)
Ответы Re: wal-size limited to 16MB - Performance issue for subsequent backup
Список pgsql-hackers
On 2014-10-20 21:41:26 +0200, jesper@krogh.cc wrote:
> 
> > On 2014-10-20 21:03:59 +0200, jesper@krogh.cc wrote:
> >> One of our "production issues" is that the system generates lots of
> >> wal-files, lots is like 151952 files over the last 24h, which is about
> >> 2.4TB worth of WAL files. I wouldn't say that isn't an issue by itself,
> >> but the system does indeed work fine. We do subsequently gzip the files
> >> to limit actual disk-usage, this makes the files roughly 30-50% in size.

I'm a bit doubtful that 16MB vs., say, 64MB files really changes
anything substantial for you. If it indeed is a problem, it's simple
enough to join the files temporarily?

> > Have you analyzed what the source of that volume is? Which version of
> > postgres are you using? What's your checkpoint_timeout/segments
> > settings?
> 
> Suggestions are surely welcome.

Once you're on 9.3 I'd suggest using pg_xlogdump --stats on it. There's
a backport of the facility for 9.3 (looking somewhat different than what
is now in 9.5) at
http://archives.postgresql.org/message-id/CABRT9RAzGowqLFcEE8aF6VdPoFEy%2BP9gmu7ktGRzw0dgRwVr9Q%40mail.gmail.com

That'd tell you a fair bit more. It's noticeably harder to backport to <
9.3.

> I do suspect the majority is from 30 concurrent processes updating an
> 506GB GIN index, but it would be nice to confirm that. There is also a
> message-queue in the DB with a fairly high turnaround.

A 506GB GIN index? Uh, interesting :). What's it used for? Trigrams?

I'd suspect that the message queue isn't the primary culprit, but it's
hard to say for sure.

> Currently PG 9.2 moving to 9.3 hopefully before end-of-year,
> checkpoint_timeout = 30min, checkpoint_segments = 4096.

Generally a high checkpoint_timeout can significantly reduce the WAL
volume because of fewer full page writes. I've seen cases where spacing
checkpoint further apart by a factor of two reduced the overall WAL
volume by more than two.

> According to logs checkpoints are roughly 15 minutes apart.

Can you show log_checkpoints output?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: jesper@krogh.cc
Дата:
Сообщение: Re: wal-size limited to 16MB - Performance issue for subsequent backup
Следующее
От: jesper@krogh.cc
Дата:
Сообщение: Re: wal-size limited to 16MB - Performance issue for subsequent backup