Re: [PERFORM] Backup taking long time !!!

From: Stephen Frost
Subject: Re: [PERFORM] Backup taking long time !!!
Date:
Msg-id: 20170120150646.GP18360@tamriel.snowman.net
In response to: Re: [PERFORM] Backup taking long time !!!  (Vladimir Borodin)
Responses: Re: [PERFORM] Backup taking long time !!!  (Jim Nasby)
Re: [PERFORM] Backup taking long time !!!  (David Steele)
List: pgsql-performance

Vladimir,

* Vladimir Borodin wrote:
> > On 20 Jan 2017, at 16:40, Stephen Frost wrote:
> >> Increments in pgbackrest are done at the file level, which is not really efficient. We have implemented
> >> parallelism, compression and page-level increments (9.3+) in a barman fork [1], but unfortunately the
> >> folks at 2ndquadrant-it are in no hurry to work on it.
> >
> > We're looking at page-level incremental backup in pgbackrest also.  For
> > larger systems, we've not heard too much complaining about it being
> > file-based though, which is why it hasn't been a priority.  Of course,
> > the OP is on 9.1 too, so.
>
> Well, we forked barman and implemented all of the above simply because we needed ~2 PB of disk space to
> store backups of our ~300 TB of data (our recovery window is 7 days). And on a 5 TB database it took a
> long time to make or restore a backup.

Right, without incremental or compressed backups, you'd have to have
room for 7 full copies of your database.  Have you looked at what your
incrementals would be like with file-level incrementals and compression?
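
As a rough back-of-the-envelope check of those numbers (a sketch only: the
300 TB and the 7-day window come from the quoted message, while one retained
full backup per day, uncompressed, is an assumption):

```python
# Back-of-the-envelope repository sizing with uncompressed full backups.
# ASSUMPTION: one full backup per day is kept for the whole recovery window.
DATA_TB = 300      # total data size, from the quoted message
WINDOW_DAYS = 7    # recovery window, from the quoted message


def full_copies_tb(data_tb, copies):
    """Space needed to keep `copies` uncompressed full backups."""
    return data_tb * copies


total_tb = full_copies_tb(DATA_TB, WINDOW_DAYS)
print(f"{total_tb} TB = {total_tb / 1000:.1f} PB")
```

That works out to 2100 TB, which lines up with the ~2 PB quoted above.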

Single-process backup/restore is definitely going to be slow.  We've
seen pgbackrest doing as much as 3TB/hr with 32 cores handling
compression.  Of course, your i/o, network, et al, need to be able to
handle it.

> > As for your fork, well, I can't say I really blame the barman folks for
> > being cautious- that's usually a good thing in your backup software. :)
>
> The reason seems to be not caution but a lack of time to work on it. But yes, it took us half a year to
> deploy our fork everywhere. And it would have taken much longer if we didn’t have a system for checking
> backup consistency.

How are you testing your backups..?  Do you have page-level checksums
enabled on your database?  pgbackrest recently added the ability to
check PG page-level checksums during a backup and report issues.  We've
also been looking at how to use pgbackrest to do backup/restore+replay
page-level difference analysis but there are still a number of things
which can cause differences, so it's a bit difficult to do.

Of course, doing a pgbackrest-restore-replay+pg_dump+pg_restore is
pretty easy to do and we do use that in some places to validate
backups.

> > I'm curious how you're handling compressed page-level incremental
> > backups though.  I looked through barman-incr and it wasn't obvious to
> > me what was going on wrt how the incrementals are stored, are they ending
> > up as sparse files, or are you actually copying/overwriting the prior
> > file in the backup repository?
>
> No, we store each file in the following way. First you write a map of the changed pages. Then you write
> the changed pages themselves. The compression is streaming, so you don’t need much memory for it, but the
> downside of this approach is that you read each datafile twice (we rely on the page cache here).

Ah, yes, I noticed that you passed over the file twice but wasn't quite
sure what functools.partial() was doing and a quick read of the docs
made me think you were doing seeking there.
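
For reference, the usual reason to see functools.partial() wrapped around a
file read in Python is the two-argument form of iter(), which keeps calling a
zero-argument function until it returns a sentinel value.  That is a plain
sequential read in fixed-size chunks, with no seeking involved.  A minimal
sketch of that idiom follows; it is not taken from barman-incr, the function
name is made up, and the 8192-byte page size is simply PostgreSQL's default
block size:

```python
import functools
import io

PAGE_SIZE = 8192  # PostgreSQL's default block size; an assumption here


def iter_pages(fileobj, page_size=PAGE_SIZE):
    """Return an iterator over consecutive page_size-byte chunks of fileobj."""
    # partial(fileobj.read, page_size) is a zero-argument callable;
    # iter(callable, sentinel) calls it repeatedly until it returns b"".
    return iter(functools.partial(fileobj.read, page_size), b"")


# Usage with an in-memory stand-in for a three-page datafile.
data = io.BytesIO(b"a" * PAGE_SIZE + b"b" * PAGE_SIZE + b"c" * PAGE_SIZE)
pages = list(iter_pages(data))
print(len(pages), [p[:1] for p in pages])
```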

All the pages are the same size, so I'm surprised you didn't consider
just having a format along the lines of: magic+offset+page,
magic+offset+page, magic+offset+page, etc...
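
A minimal sketch of what such a self-describing record stream could look
like, so that the datafile only needs one pass.  The 4-byte magic value, the
64-bit little-endian offset and the 8192-byte page size are all assumptions
for illustration, not the actual format of barman or pgbackrest:

```python
import struct

PAGE_SIZE = 8192                 # assumed PostgreSQL block size
MAGIC = b"PGIN"                  # hypothetical 4-byte record marker
HEADER = struct.Struct("<4sQ")   # magic + 64-bit byte offset, 12 bytes


def write_records(changed):
    """Serialize {offset: page_bytes} as magic+offset+page records."""
    out = bytearray()
    for offset in sorted(changed):
        page = changed[offset]
        if len(page) != PAGE_SIZE:
            raise ValueError("every page must be exactly PAGE_SIZE bytes")
        out += HEADER.pack(MAGIC, offset) + page
    return bytes(out)


def read_records(blob):
    """Parse the records back into {offset: page_bytes}."""
    result = {}
    step = HEADER.size + PAGE_SIZE
    for pos in range(0, len(blob), step):
        magic, offset = HEADER.unpack_from(blob, pos)
        if magic != MAGIC:
            raise ValueError(f"bad magic at byte {pos}")
        result[offset] = blob[pos + HEADER.size:pos + step]
    return result


# Round-trip two changed pages: the first page and the fourth page.
changed = {0: b"\x01" * PAGE_SIZE, 3 * PAGE_SIZE: b"\x02" * PAGE_SIZE}
blob = write_records(changed)
assert read_records(blob) == changed
```

Because every record carries its own offset, the writer never needs to know
the full set of changed pages up front.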

I'd have to defer to David on this, but I think he was considering
having some kind of a bitmap to indicate which pages changed instead
of storing the full offset as, again, all the pages are the same size.
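
To illustrate the bitmap idea (a sketch only, not the actual pgbackrest
design; the function names are hypothetical, and 8 kB pages in 1 GB segment
files are PostgreSQL's defaults): one bit per page costs 16 kB for a whole
1 GB segment, versus 8 bytes per changed page for a stored 64-bit offset.

```python
PAGE_SIZE = 8192  # assumed PostgreSQL block size


def pages_to_bitmap(changed_pages, total_pages):
    """Pack a set of changed page numbers into a bitmap, one bit per page."""
    bitmap = bytearray((total_pages + 7) // 8)
    for page_no in changed_pages:
        bitmap[page_no // 8] |= 1 << (page_no % 8)
    return bytes(bitmap)


def bitmap_to_pages(bitmap):
    """Recover the sorted list of changed page numbers from the bitmap."""
    return [
        byte_idx * 8 + bit
        for byte_idx, byte in enumerate(bitmap)
        for bit in range(8)
        if byte & (1 << bit)
    ]


# A 1 GB segment holds 131072 pages of 8 kB, so the bitmap is 16384 bytes.
total = (1024 ** 3) // PAGE_SIZE
bm = pages_to_bitmap({0, 9, 131071}, total)
print(len(bm), bitmap_to_pages(bm))
```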

> >  Apologies, python isn't my first
> > language, but the lack of any comment anywhere in that file doesn't
> > really help.
>
> Not a problem. Actually, it would be much easier to understand if it were a series of commits rather than a
> single commit that we amend and force-push after each rebase on vanilla barman. We should add comments.

Both would make it easier to understand, though the comments would be
more helpful for me as I don't actually know the barman code all that
well.

Thanks!

Stephen
