Обсуждение: Clarification on using pg_upgrade

Поиск
Список
Период
Сортировка

Clarification on using pg_upgrade

От
Tory M Blue
Дата:
Howdy

Postgres9.2 going to 9.4
CentOS 6.5

So in most of my environments, I use slony and thus use slony replication for my upgrades (Drop/add nodes etc).

But I've got a pretty big DB  just shy of a TB that is on a single node. A dump restore would take over 48 hours because of index creations etc, so thought maybe I would look at doing a upgrade via pg_upgrade.

There are some challenges, since I build my rpm's to a standard directory for binaries and then the data directory. So I will have to move/rename directories, but when that's done, I'm slightly confused on the pg_upgrade using link options.

If my data is located in /data

and I link to a new dir in /data1,  what actually happens. do I end up with 2 file systems and links and thus am not able to delete or cleanup any old data, or how does this work?

Also will the reindex creation still happen with this type of in-place upgrade, as if so, then it may not save too much time vs a dump/import.

I'm nervous about using pg_upgrade but it's really tough to recover from the jobs that backup during a dump/restore process (2-3 days), so really trying to wrap my head around pg_upgrade..

Suggestions, opinions on pg_upgrade vs dump/restore, the filesystem/mount below is what I'm working with.

Filesystem                       Size  Used Avail Use% Mounted on

/dev/sda6                        4.0T  1.1T  2.8T  29% /data


Thanks
Tory

Re: Clarification on using pg_upgrade

От
Justin Pryzby
Дата:
On Fri, Mar 04, 2016 at 02:27:59PM -0800, Tory M Blue wrote:
> If my data is located in /data
>
> and I link to a new dir in /data1,  what actually happens. do I end up with
> 2 file systems and links and thus am not able to delete or cleanup any old
> data, or how does this work?
>
> Also will the reindex creation still happen with this type of in-place
> upgrade, as if so, then it may not save too much time vs a dump/import.

Since you have the space, you can do a test upgrade; make a dump of the
essential tables (or the entire thing) and restore it to another instance,
perhaps even something run from your /home.

pg_upgrade --link makes hardlinks for tables and indices (same as cp -l), so
uses very little additional space.  Note, that means that both must be within
the filesystem (/data).  You should understand about hardinks and inodes
otherwise this will lead to confusion and mistakes.

Indexes don't need to be rebuilt afterwards.  I've upgraded ~35 customers to
9.5 already, some as big as 5TB.  So far the disruption has been at most 30min
(not counting ANALYZE afterwards).

When I use pg_upgrade, after stopping the old instance, I rename the data dir
(under centos, /var/lib/pgsql/9.4~).  Then pg_upgrade makes links in 9.5/.
Renaming has the advantage that the old instances can't be accidentally
started; and, makes it much easier to believe that it's safe to remove the 9.4~
afterwards.

Justin


Re: Clarification on using pg_upgrade

От
Jim Nasby
Дата:
On 3/4/16 4:58 PM, Justin Pryzby wrote:
> On Fri, Mar 04, 2016 at 02:27:59PM -0800, Tory M Blue wrote:
>> >If my data is located in /data
>> >
>> >and I link to a new dir in /data1,  what actually happens. do I end up with
>> >2 file systems and links and thus am not able to delete or cleanup any old
>> >data, or how does this work?
>> >
>> >Also will the reindex creation still happen with this type of in-place
>> >upgrade, as if so, then it may not save too much time vs a dump/import.
> Since you have the space, you can do a test upgrade; make a dump of the
> essential tables (or the entire thing) and restore it to another instance,
> perhaps even something run from your /home.

Since pg_upgrade operates at a binary level, if you want to test it I'd
recommend using a PITR backup and not pg_dump. It's theoretically
possible to have a database that will pg_dump correctly but that
pg_upgrade chokes on.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: Clarification on using pg_upgrade

От
Tory M Blue
Дата:
Thanks to all that responded

I successfully upgraded 800GB DB with pg_upgrade in about 2 hours.
This would have taken 2 days to dump/restore.

Slon is also starting to not be viable as it takes some indexes over 7
hours to complete. So this upgrade path seemed to really be nice.

Not sure how I can incorporate with my slon cluster, I guess that will
be the next thing I research.

Appreciate the responses and assistance.

Tory


Re: Clarification on using pg_upgrade

От
Jim Nasby
Дата:
On 3/24/16 12:43 PM, Tory M Blue wrote:
> Slon is also starting to not be viable as it takes some indexes over 7
> hours to complete. So this upgrade path seemed to really be nice.

If you're standing up a new replica from scratch on the latest version,
I'm not really sure why that matters?

> Not sure how I can incorporate with my slon cluster, I guess that will
> be the next thing I research.

Not sure I'm following, but you can pg_upgrade your replicas at the same
time as you do the master... or you can do them after the fact.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: Clarification on using pg_upgrade

От
Tory M Blue
Дата:
In line Jim

On Sun, Apr 3, 2016 at 10:13 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 3/24/16 12:43 PM, Tory M Blue wrote:
>>
>> Slon is also starting to not be viable as it takes some indexes over 7
>> hours to complete. So this upgrade path seemed to really be nice.
>
>
> If you're standing up a new replica from scratch on the latest version, I'm
> not really sure why that matters?

Not sure why the 7-13 hours causes an issue? Because if I'm upgrading
via slon process, I have to add and drop a node. If I'm dropping my
secondary (slave) I have to move reporting to the master, so now the
master is handing normal inserts and reports. Next item, I'm  without
a replica for 13+ hours, that's not good either.

>> Not sure how I can incorporate with my slon cluster, I guess that will
>> be the next thing I research.
>
>
> Not sure I'm following, but you can pg_upgrade your replicas at the same
> time as you do the master... or you can do them after the fact.
> --

I'm not sure how that statement is true. I'm fundamentally changing
the data in the master. My gut says you are thinking, just shut
everything down until you have upgraded all 4-5 servers.  I'm hoping
that's not what you are thinking here.

If I update my Master, my slave and query slaves are going to be
wondering what the heck is going on. Now I can stop slon, upgrade and
restart slon (if Postgres upgrade handles the weird pointers and stuff
that slon does on the slave nodes (inside the slon schema), but
depending on how long this process takes I'm down for a period of
time, that's not acceptable. so I have to upgrade my standby unit,
which now fundamentally is different than the master. This is what my
statement was referencing, with slon running, how do I use pg_upgrade
to upgrade the cluster without downtime. Again slon requires a drop
add if I'm rebuilding via slon but as I stated that's almost
unbearable at this juncture with how long indexes take..

Thanks
Tory


Re: Clarification on using pg_upgrade

От
Jim Nasby
Дата:
On 4/19/16 11:01 PM, Tory M Blue wrote:
>>> >> Slon is also starting to not be viable as it takes some indexes over 7
>>> >> hours to complete. So this upgrade path seemed to really be nice.
>> >
>> >
>> > If you're standing up a new replica from scratch on the latest version, I'm
>> > not really sure why that matters?
> Not sure why the 7-13 hours causes an issue? Because if I'm upgrading
> via slon process, I have to add and drop a node. If I'm dropping my
> secondary (slave) I have to move reporting to the master, so now the
> master is handing normal inserts and reports. Next item, I'm  without
> a replica for 13+ hours, that's not good either.

Don't drop and add a node, just do a master switchover. AFAIK that's
nearly instant as long as things are in sync.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


Re: Clarification on using pg_upgrade

От
Tory M Blue
Дата:
On Tue, Jun 14, 2016 at 2:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 4/19/16 11:01 PM, Tory M Blue wrote:
>>>>
>>>> >> Slon is also starting to not be viable as it takes some indexes over
>>>> >> 7
>>>> >> hours to complete. So this upgrade path seemed to really be nice.
>>>
>>> >
>>> >
>>> > If you're standing up a new replica from scratch on the latest version,
>>> > I'm
>>> > not really sure why that matters?
>>
>> Not sure why the 7-13 hours causes an issue? Because if I'm upgrading
>> via slon process, I have to add and drop a node. If I'm dropping my
>> secondary (slave) I have to move reporting to the master, so now the
>> master is handing normal inserts and reports. Next item, I'm  without
>> a replica for 13+ hours, that's not good either.
>
>
> Don't drop and add a node, just do a master switchover. AFAIK that's nearly
> instant as long as things are in sync.
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)   mobile: 512-569-9461

Right, that's what we do, but then to upgrade, we have to drop/add the
node, because it's being upgraded.  If I'm updating the underlying OS,
I have to kill it all. If I'm doing a postgres upgrade, using an old
version of slon, without using pg_upgrade, I have to drop the db,
recreate it, which requires a drop/add.

I'm trying to figure out how to best do it using pg_upgrade  instead
of the entire drop/add for postgres upgrades (which are needed if you
are using slon as an upgrade engine for your db).

Tory


Re: Clarification on using pg_upgrade

От
Glyn Astill
Дата:
----- Original Message -----

> From: Tory M Blue <tmblue@gmail.com>
> To: Jim Nasby <Jim.Nasby@bluetreble.com>
> Cc: "pgsql-performance@postgresql.org" <pgsql-performance@postgresql.org>
> Sent: Tuesday, 14 June 2016, 22:08

> Subject: Re: [PERFORM] Clarification on using pg_upgrade
>
> Right, that's what we do, but then to upgrade, we have to drop/add the
> node, because it's being upgraded.  If I'm updating the underlying OS,
> I have to kill it all. If I'm doing a postgres upgrade, using an old
> version of slon, without using pg_upgrade, I have to drop the db,
> recreate it, which requires a drop/add.
>
> I'm trying to figure out how to best do it using pg_upgrade  instead
> of the entire drop/add for postgres upgrades (which are needed if you
> are using slon as an upgrade engine for your db).
>


I've just skimmed through this thread, but I can't quite gather what it is you're trying to achieve.  Are you looking
tomove away from Slony? Upgrade by any means with or without Slony?  Or just find a "fast" way of doing a major upgrade
whilstkeeping Slony in-place as your replication method? 

If it's the latter, the easiest way is to have 2 or more subscribers subscribed to the same sets and one at a time;
dropa subscriber node, upgrade and re-initdb, then use clone node to recreate it from another subscriber.  If you're
intenton using pg_upgrade you might be able to fudge it as long as you can bump up current txid to be greater than what
itwas before the upgrade; in fact I've done similar before with a slony subscriber, but only as a test on a small
database.