Обсуждение: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

Поиск
Список
Период
Сортировка

Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Jerry Sievers
Дата:
Has anyone found a clever way to $subject that doesn't involved
calculating which are all of the  files unnecessary to send and then
running rsync with an exclude list?

I'm pondering this right now.

ISTM for each unlogged table/index/toast  in pg_class where
relpersistence='u'...

rsync --exclude-from... (and exclude-from-file contains)

*/base/$dbid/$relfilenode
*/base/$dbid/$relfilenode.*
*/base/$dbid/$relfilenode_*
...

The leading * will match the old/new data directoryes being sent to the
standby.

MAster upgrade is done using hard-links and the rsync method generally
as per the pg docs.

I have among others, a 5TB system  w/about 40G of unlogged data.

The rsync method of upgrade takes only several seconds during a snapshot
based test however as the snap was off a live system, the crash recovery
startup trunc'd all the unlogged data.

This will not be the case during the real upgrade and I just as soon
avoid sending the unlogged data only to have it trunc'd away as soon as
the standby is started after the upgrade.

Thanks

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 312.241.7800


Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Bruce Momjian
Дата:
On Fri, Jul 22, 2016 at 06:29:05PM -0500, Jerry Sievers wrote:
> Has anyone found a clever way to $subject that doesn't involved
> calculating which are all of the  files unnecessary to send and then
> running rsync with an exclude list?
>
> I'm pondering this right now.
>
> ISTM for each unlogged table/index/toast  in pg_class where
> relpersistence='u'...
>
> rsync --exclude-from... (and exclude-from-file contains)
>
> */base/$dbid/$relfilenode
> */base/$dbid/$relfilenode.*
> */base/$dbid/$relfilenode_*
> ...
>
> The leading * will match the old/new data directoryes being sent to the
> standby.
>
> MAster upgrade is done using hard-links and the rsync method generally
> as per the pg docs.
>
> I have among others, a 5TB system  w/about 40G of unlogged data.
>
> The rsync method of upgrade takes only several seconds during a snapshot
> based test however as the snap was off a live system, the crash recovery
> startup trunc'd all the unlogged data.
>
> This will not be the case during the real upgrade and I just as soon
> avoid sending the unlogged data only to have it trunc'd away as soon as
> the standby is started after the upgrade.

Good question --- I don't think there is an easy answer except to
truncate the unlogged tables before the upgrade.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +


Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Jerry Sievers
Дата:
Bruce Momjian <bruce@momjian.us> writes:

> On Fri, Jul 22, 2016 at 06:29:05PM -0500, Jerry Sievers wrote:
>
>> Has anyone found a clever way to $subject that doesn't involved
>> calculating which are all of the  files unnecessary to send and then
>> running rsync with an exclude list?
>>
>> I'm pondering this right now.
>>
>> ISTM for each unlogged table/index/toast  in pg_class where
>> relpersistence='u'...
>>
>> rsync --exclude-from... (and exclude-from-file contains)
>>
>> */base/$dbid/$relfilenode
>> */base/$dbid/$relfilenode.*
>> */base/$dbid/$relfilenode_*
>> ...
>>
>> The leading * will match the old/new data directoryes being sent to the
>> standby.
>>
>> MAster upgrade is done using hard-links and the rsync method generally
>> as per the pg docs.
>>
>> I have among others, a 5TB system  w/about 40G of unlogged data.
>>
>> The rsync method of upgrade takes only several seconds during a snapshot
>> based test however as the snap was off a live system, the crash recovery
>> startup trunc'd all the unlogged data.
>>
>> This will not be the case during the real upgrade and I just as soon
>> avoid sending the unlogged data only to have it trunc'd away as soon as
>> the standby is started after the upgrade.
>
> Good question --- I don't think there is an easy answer except to
> truncate the unlogged tables before the upgrade.

Yup, as I suspected.

My data warehouse team *rather* not have to dirive the unlogged data
again since it takes time...  and I rather not needlessly replicate it
during upgrade (er, because that takes time.)

At my company, if we choose to not avoid replicating unlogged data *and*
run for a short interval in production without a standby, snapshot the
halted master and take the rsync off of that meanwhile the master is
back online.

I'll figure something out.

Thanks!

--
Jerry Sievers
e: jerry.sievers@comcast.net
p: 312.241.7800


Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Stephen Frost
Дата:
* Jerry Sievers (gsievers19@comcast.net) wrote:
> Has anyone found a clever way to $subject that doesn't involved
> calculating which are all of the  files unnecessary to send and then
> running rsync with an exclude list?
>
> I'm pondering this right now.

Certainly an interesting issue to consider.  I have to admit that I
hadn't considered unlogged tables when working out how to do the
hard-link-based approach.

> ISTM for each unlogged table/index/toast  in pg_class where
> relpersistence='u'...
>
> rsync --exclude-from... (and exclude-from-file contains)
>
> */base/$dbid/$relfilenode
> */base/$dbid/$relfilenode.*
> */base/$dbid/$relfilenode_*
> ...
>
> The leading * will match the old/new data directoryes being sent to the
> standby.

You would have to address tablespaces too, if you have any.

> I have among others, a 5TB system  w/about 40G of unlogged data.

Honestly, sending an extra 40G out of 5TB (we're talking less than 1%
here...) might be worth it to keep the process simple.

Thanks!

Stephen

Вложения

Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Stephen Frost
Дата:
* Stephen Frost (sfrost@snowman.net) wrote:
> > I have among others, a 5TB system  w/about 40G of unlogged data.
>
> Honestly, sending an extra 40G out of 5TB (we're talking less than 1%
> here...) might be worth it to keep the process simple.

Of course, I realize right after sending this that you aren't sending
the 5TB. :)

Depends on your network speed if the 40G would be an issue or not then.
In some cases, it might only add a bit of time but keep the process
simple, other cases, it might take longer.  I'm not really thrilled with
the idea of complicating the rsync in that way, but I can understand why
you're considering it.

Thanks!

Stephen

Вложения

Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Jerry Sievers
Дата:
Stephen Frost <sfrost@snowman.net> writes:

> * Stephen Frost (sfrost@snowman.net) wrote:
>
>> > I have among others, a 5TB system  w/about 40G of unlogged data.
>>
>> Honestly, sending an extra 40G out of 5TB (we're talking less than 1%
>> here...) might be worth it to keep the process simple.
>
> Of course, I realize right after sending this that you aren't sending
> the 5TB. :)

Roger that!  Really, the hard link pg-upgrade option and rsync method
for standbys is pure beauty :-)

We have only the post-analyze and this unlogged data  problem to deal
with.

The warehouse is our only system with any substantial amount of unlogged
data.

Reducing post-analyze delay was solved by me here with our groovy
threaded AnalyzMaTron.  I can get a >5TB system done in about 45 minutes
w/20 threads.

Thanks!

> Depends on your network speed if the 40G would be an issue or not then.
> In some cases, it might only add a bit of time but keep the process
> simple, other cases, it might take longer.  I'm not really thrilled with
> the idea of complicating the rsync in that way, but I can understand why
> you're considering it.
>
> Thanks!
>
> Stephen

--
Jerry Sievers
e: jerry.sievers@comcast.net
p: 312.241.7800


Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Bruce Momjian
Дата:
On Mon, Jul 25, 2016 at 12:34:45PM -0500, Jerry Sievers wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>
> > * Stephen Frost (sfrost@snowman.net) wrote:
> >
> >> > I have among others, a 5TB system  w/about 40G of unlogged data.
> >>
> >> Honestly, sending an extra 40G out of 5TB (we're talking less than 1%
> >> here...) might be worth it to keep the process simple.
> >
> > Of course, I realize right after sending this that you aren't sending
> > the 5TB. :)
>
> Roger that!  Really, the hard link pg-upgrade option and rsync method
> for standbys is pure beauty :-)
>
> We have only the post-analyze and this unlogged data  problem to deal
> with.
>
> The warehouse is our only system with any substantial amount of unlogged
> data.
>
> Reducing post-analyze delay was solved by me here with our groovy
> threaded AnalyzMaTron.  I can get a >5TB system done in about 45 minutes
> w/20 threads.

The only thing I can suggest is to document this limitation, if people
think it is worth it.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +


Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Jerry Sievers
Дата:
Bruce Momjian <bruce@momjian.us> writes:

> On Mon, Jul 25, 2016 at 12:34:45PM -0500, Jerry Sievers wrote:
>
>> Stephen Frost <sfrost@snowman.net> writes:
>>
>> > * Stephen Frost (sfrost@snowman.net) wrote:
>> >
>> >> > I have among others, a 5TB system  w/about 40G of unlogged data.
>> >>
>> >> Honestly, sending an extra 40G out of 5TB (we're talking less than 1%
>> >> here...) might be worth it to keep the process simple.
>> >
>> > Of course, I realize right after sending this that you aren't sending
>> > the 5TB. :)
>>
>> Roger that!  Really, the hard link pg-upgrade option and rsync method
>> for standbys is pure beauty :-)
>>
>> We have only the post-analyze and this unlogged data  problem to deal
>> with.
>>
>> The warehouse is our only system with any substantial amount of unlogged
>> data.
>>
>> Reducing post-analyze delay was solved by me here with our groovy
>> threaded AnalyzMaTron.  I can get a >5TB system done in about 45 minutes
>> w/20 threads.
>
> The only thing I can suggest is to document this limitation, if people
> think it is worth it.

+1

A couple sentences right near the rsync piece in pg_upgrade docs would
 do it.

Caveat:UNNLogged data will be unnecessarily transfered via rsync and
then immediately discarded from standbys when the standbys are started
unless those tables are truncated on the master before the upgrade.

--
Jerry Sievers
e: jerry.sievers@comcast.net
p: 312.241.7800


Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Stephen Frost
Дата:
* Bruce Momjian (bruce@momjian.us) wrote:
> On Mon, Jul 25, 2016 at 12:34:45PM -0500, Jerry Sievers wrote:
> > Stephen Frost <sfrost@snowman.net> writes:
> >
> > > * Stephen Frost (sfrost@snowman.net) wrote:
> > >
> > >> > I have among others, a 5TB system  w/about 40G of unlogged data.
> > >>
> > >> Honestly, sending an extra 40G out of 5TB (we're talking less than 1%
> > >> here...) might be worth it to keep the process simple.
> > >
> > > Of course, I realize right after sending this that you aren't sending
> > > the 5TB. :)
> >
> > Roger that!  Really, the hard link pg-upgrade option and rsync method
> > for standbys is pure beauty :-)
> >
> > We have only the post-analyze and this unlogged data  problem to deal
> > with.
> >
> > The warehouse is our only system with any substantial amount of unlogged
> > data.
> >
> > Reducing post-analyze delay was solved by me here with our groovy
> > threaded AnalyzMaTron.  I can get a >5TB system done in about 45 minutes
> > w/20 threads.
>
> The only thing I can suggest is to document this limitation, if people
> think it is worth it.

It'd probably be good to document what happens with unlogged tables.

Thanks!

Stephen

Вложения

Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Bruce Momjian
Дата:
On Tue, Jul 26, 2016 at 07:11:29PM -0400, Stephen Frost wrote:
> * Bruce Momjian (bruce@momjian.us) wrote:
> > On Mon, Jul 25, 2016 at 12:34:45PM -0500, Jerry Sievers wrote:
> > > Stephen Frost <sfrost@snowman.net> writes:
> > >
> > > > * Stephen Frost (sfrost@snowman.net) wrote:
> > > >
> > > >> > I have among others, a 5TB system  w/about 40G of unlogged data.
> > > >>
> > > >> Honestly, sending an extra 40G out of 5TB (we're talking less than 1%
> > > >> here...) might be worth it to keep the process simple.
> > > >
> > > > Of course, I realize right after sending this that you aren't sending
> > > > the 5TB. :)
> > >
> > > Roger that!  Really, the hard link pg-upgrade option and rsync method
> > > for standbys is pure beauty :-)
> > >
> > > We have only the post-analyze and this unlogged data  problem to deal
> > > with.
> > >
> > > The warehouse is our only system with any substantial amount of unlogged
> > > data.
> > >
> > > Reducing post-analyze delay was solved by me here with our groovy
> > > threaded AnalyzMaTron.  I can get a >5TB system done in about 45 minutes
> > > w/20 threads.
> >
> > The only thing I can suggest is to document this limitation, if people
> > think it is worth it.
>
> It'd probably be good to document what happens with unlogged tables.

Attached patch applied to head.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +

Вложения

Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Peter Eisentraut
Дата:
On 8/4/16 6:55 PM, Bruce Momjian wrote:
>         <para>
> +        Unfortunately, <application>rsync</> also needlessly copies the
> +        files associated with temporary and unlogged tables.
>          If you have tablespaces, you will need to run a similar
>          <application>rsync</> command for each tablespace directory.  If you
>          have relocated <filename>pg_xlog</> outside the data directories,

That seems like an odd location to insert that.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Bruce Momjian
Дата:
On Fri, Aug  5, 2016 at 09:04:19AM -0400, Peter Eisentraut wrote:
> On 8/4/16 6:55 PM, Bruce Momjian wrote:
> >         <para>
> > +        Unfortunately, <application>rsync</> also needlessly copies the
> > +        files associated with temporary and unlogged tables.
> >          If you have tablespaces, you will need to run a similar
> >          <application>rsync</> command for each tablespace directory.  If you
> >          have relocated <filename>pg_xlog</> outside the data directories,
>
> That seems like an odd location to insert that.

Would you like to suggest another location as I could not find out.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +


Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Bruce Momjian
Дата:
On Fri, Aug  5, 2016 at 10:05:21AM -0400, Bruce Momjian wrote:
> On Fri, Aug  5, 2016 at 09:04:19AM -0400, Peter Eisentraut wrote:
> > On 8/4/16 6:55 PM, Bruce Momjian wrote:
> > >         <para>
> > > +        Unfortunately, <application>rsync</> also needlessly copies the
> > > +        files associated with temporary and unlogged tables.
> > >          If you have tablespaces, you will need to run a similar
> > >          <application>rsync</> command for each tablespace directory.  If you
> > >          have relocated <filename>pg_xlog</> outside the data directories,
> >
> > That seems like an odd location to insert that.
>
> Would you like to suggest another location as I could not find out.

Sorry, typo:  :-(

    Would you like to suggest another location as I could not find one.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +


Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Peter Eisentraut
Дата:
On 8/5/16 10:06 AM, Bruce Momjian wrote:
> On Fri, Aug  5, 2016 at 10:05:21AM -0400, Bruce Momjian wrote:
>> On Fri, Aug  5, 2016 at 09:04:19AM -0400, Peter Eisentraut wrote:
>>> On 8/4/16 6:55 PM, Bruce Momjian wrote:
>>>>         <para>
>>>> +        Unfortunately, <application>rsync</> also needlessly copies the
>>>> +        files associated with temporary and unlogged tables.
>>>>          If you have tablespaces, you will need to run a similar
>>>>          <application>rsync</> command for each tablespace directory.  If you
>>>>          have relocated <filename>pg_xlog</> outside the data directories,
>>>
>>> That seems like an odd location to insert that.
>>
>> Would you like to suggest another location as I could not find out.
>
> Sorry, typo:  :-(
>
>     Would you like to suggest another location as I could not find one.

I think it belongs to the previous paragraph, so I moved it there.


--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?

От
Bruce Momjian
Дата:
On Sun, Aug  7, 2016 at 09:29:20PM -0400, Peter Eisentraut wrote:
> On 8/5/16 10:06 AM, Bruce Momjian wrote:
> > On Fri, Aug  5, 2016 at 10:05:21AM -0400, Bruce Momjian wrote:
> >> On Fri, Aug  5, 2016 at 09:04:19AM -0400, Peter Eisentraut wrote:
> >>> On 8/4/16 6:55 PM, Bruce Momjian wrote:
> >>>>         <para>
> >>>> +        Unfortunately, <application>rsync</> also needlessly copies the
> >>>> +        files associated with temporary and unlogged tables.
> >>>>          If you have tablespaces, you will need to run a similar
> >>>>          <application>rsync</> command for each tablespace directory.  If you
> >>>>          have relocated <filename>pg_xlog</> outside the data directories,
> >>>
> >>> That seems like an odd location to insert that.
> >>
> >> Would you like to suggest another location as I could not find out.
> >
> > Sorry, typo:  :-(
> >
> >     Would you like to suggest another location as I could not find one.
>
> I think it belongs to the previous paragraph, so I moved it there.

OK, fine.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +