Обсуждение: Clarification on using pg_upgrade
Howdy
Postgres9.2 going to 9.4
CentOS 6.5
So in most of my environments, I use slony and thus use slony replication for my upgrades (Drop/add nodes etc).
But I've got a pretty big DB just shy of a TB that is on a single node. A dump restore would take over 48 hours because of index creations etc, so thought maybe I would look at doing a upgrade via pg_upgrade.
There are some challenges, since I build my rpm's to a standard directory for binaries and then the data directory. So I will have to move/rename directories, but when that's done, I'm slightly confused on the pg_upgrade using link options.
If my data is located in /data
and I link to a new dir in /data1, what actually happens. do I end up with 2 file systems and links and thus am not able to delete or cleanup any old data, or how does this work?
Also will the reindex creation still happen with this type of in-place upgrade, as if so, then it may not save too much time vs a dump/import.
I'm nervous about using pg_upgrade but it's really tough to recover from the jobs that backup during a dump/restore process (2-3 days), so really trying to wrap my head around pg_upgrade..
Suggestions, opinions on pg_upgrade vs dump/restore, the filesystem/mount below is what I'm working with.
Filesystem Size Used Avail Use% Mounted on
/dev/sda6 4.0T 1.1T 2.8T 29% /data
Thanks
Tory
On Fri, Mar 04, 2016 at 02:27:59PM -0800, Tory M Blue wrote: > If my data is located in /data > > and I link to a new dir in /data1, what actually happens. do I end up with > 2 file systems and links and thus am not able to delete or cleanup any old > data, or how does this work? > > Also will the reindex creation still happen with this type of in-place > upgrade, as if so, then it may not save too much time vs a dump/import. Since you have the space, you can do a test upgrade; make a dump of the essential tables (or the entire thing) and restore it to another instance, perhaps even something run from your /home. pg_upgrade --link makes hardlinks for tables and indices (same as cp -l), so uses very little additional space. Note, that means that both must be within the filesystem (/data). You should understand about hardinks and inodes otherwise this will lead to confusion and mistakes. Indexes don't need to be rebuilt afterwards. I've upgraded ~35 customers to 9.5 already, some as big as 5TB. So far the disruption has been at most 30min (not counting ANALYZE afterwards). When I use pg_upgrade, after stopping the old instance, I rename the data dir (under centos, /var/lib/pgsql/9.4~). Then pg_upgrade makes links in 9.5/. Renaming has the advantage that the old instances can't be accidentally started; and, makes it much easier to believe that it's safe to remove the 9.4~ afterwards. Justin
On 3/4/16 4:58 PM, Justin Pryzby wrote: > On Fri, Mar 04, 2016 at 02:27:59PM -0800, Tory M Blue wrote: >> >If my data is located in /data >> > >> >and I link to a new dir in /data1, what actually happens. do I end up with >> >2 file systems and links and thus am not able to delete or cleanup any old >> >data, or how does this work? >> > >> >Also will the reindex creation still happen with this type of in-place >> >upgrade, as if so, then it may not save too much time vs a dump/import. > Since you have the space, you can do a test upgrade; make a dump of the > essential tables (or the entire thing) and restore it to another instance, > perhaps even something run from your /home. Since pg_upgrade operates at a binary level, if you want to test it I'd recommend using a PITR backup and not pg_dump. It's theoretically possible to have a database that will pg_dump correctly but that pg_upgrade chokes on. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Thanks to all that responded I successfully upgraded 800GB DB with pg_upgrade in about 2 hours. This would have taken 2 days to dump/restore. Slon is also starting to not be viable as it takes some indexes over 7 hours to complete. So this upgrade path seemed to really be nice. Not sure how I can incorporate with my slon cluster, I guess that will be the next thing I research. Appreciate the responses and assistance. Tory
On 3/24/16 12:43 PM, Tory M Blue wrote: > Slon is also starting to not be viable as it takes some indexes over 7 > hours to complete. So this upgrade path seemed to really be nice. If you're standing up a new replica from scratch on the latest version, I'm not really sure why that matters? > Not sure how I can incorporate with my slon cluster, I guess that will > be the next thing I research. Not sure I'm following, but you can pg_upgrade your replicas at the same time as you do the master... or you can do them after the fact. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
In line Jim On Sun, Apr 3, 2016 at 10:13 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > On 3/24/16 12:43 PM, Tory M Blue wrote: >> >> Slon is also starting to not be viable as it takes some indexes over 7 >> hours to complete. So this upgrade path seemed to really be nice. > > > If you're standing up a new replica from scratch on the latest version, I'm > not really sure why that matters? Not sure why the 7-13 hours causes an issue? Because if I'm upgrading via slon process, I have to add and drop a node. If I'm dropping my secondary (slave) I have to move reporting to the master, so now the master is handing normal inserts and reports. Next item, I'm without a replica for 13+ hours, that's not good either. >> Not sure how I can incorporate with my slon cluster, I guess that will >> be the next thing I research. > > > Not sure I'm following, but you can pg_upgrade your replicas at the same > time as you do the master... or you can do them after the fact. > -- I'm not sure how that statement is true. I'm fundamentally changing the data in the master. My gut says you are thinking, just shut everything down until you have upgraded all 4-5 servers. I'm hoping that's not what you are thinking here. If I update my Master, my slave and query slaves are going to be wondering what the heck is going on. Now I can stop slon, upgrade and restart slon (if Postgres upgrade handles the weird pointers and stuff that slon does on the slave nodes (inside the slon schema), but depending on how long this process takes I'm down for a period of time, that's not acceptable. so I have to upgrade my standby unit, which now fundamentally is different than the master. This is what my statement was referencing, with slon running, how do I use pg_upgrade to upgrade the cluster without downtime. Again slon requires a drop add if I'm rebuilding via slon but as I stated that's almost unbearable at this juncture with how long indexes take.. Thanks Tory
On 4/19/16 11:01 PM, Tory M Blue wrote: >>> >> Slon is also starting to not be viable as it takes some indexes over 7 >>> >> hours to complete. So this upgrade path seemed to really be nice. >> > >> > >> > If you're standing up a new replica from scratch on the latest version, I'm >> > not really sure why that matters? > Not sure why the 7-13 hours causes an issue? Because if I'm upgrading > via slon process, I have to add and drop a node. If I'm dropping my > secondary (slave) I have to move reporting to the master, so now the > master is handing normal inserts and reports. Next item, I'm without > a replica for 13+ hours, that's not good either. Don't drop and add a node, just do a master switchover. AFAIK that's nearly instant as long as things are in sync. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
On Tue, Jun 14, 2016 at 2:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > On 4/19/16 11:01 PM, Tory M Blue wrote: >>>> >>>> >> Slon is also starting to not be viable as it takes some indexes over >>>> >> 7 >>>> >> hours to complete. So this upgrade path seemed to really be nice. >>> >>> > >>> > >>> > If you're standing up a new replica from scratch on the latest version, >>> > I'm >>> > not really sure why that matters? >> >> Not sure why the 7-13 hours causes an issue? Because if I'm upgrading >> via slon process, I have to add and drop a node. If I'm dropping my >> secondary (slave) I have to move reporting to the master, so now the >> master is handing normal inserts and reports. Next item, I'm without >> a replica for 13+ hours, that's not good either. > > > Don't drop and add a node, just do a master switchover. AFAIK that's nearly > instant as long as things are in sync. > -- > Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX > Experts in Analytics, Data Architecture and PostgreSQL > Data in Trouble? Get it in Treble! http://BlueTreble.com > 855-TREBLE2 (855-873-2532) mobile: 512-569-9461 Right, that's what we do, but then to upgrade, we have to drop/add the node, because it's being upgraded. If I'm updating the underlying OS, I have to kill it all. If I'm doing a postgres upgrade, using an old version of slon, without using pg_upgrade, I have to drop the db, recreate it, which requires a drop/add. I'm trying to figure out how to best do it using pg_upgrade instead of the entire drop/add for postgres upgrades (which are needed if you are using slon as an upgrade engine for your db). Tory
----- Original Message ----- > From: Tory M Blue <tmblue@gmail.com> > To: Jim Nasby <Jim.Nasby@bluetreble.com> > Cc: "pgsql-performance@postgresql.org" <pgsql-performance@postgresql.org> > Sent: Tuesday, 14 June 2016, 22:08 > Subject: Re: [PERFORM] Clarification on using pg_upgrade > > Right, that's what we do, but then to upgrade, we have to drop/add the > node, because it's being upgraded. If I'm updating the underlying OS, > I have to kill it all. If I'm doing a postgres upgrade, using an old > version of slon, without using pg_upgrade, I have to drop the db, > recreate it, which requires a drop/add. > > I'm trying to figure out how to best do it using pg_upgrade instead > of the entire drop/add for postgres upgrades (which are needed if you > are using slon as an upgrade engine for your db). > I've just skimmed through this thread, but I can't quite gather what it is you're trying to achieve. Are you looking tomove away from Slony? Upgrade by any means with or without Slony? Or just find a "fast" way of doing a major upgrade whilstkeeping Slony in-place as your replication method? If it's the latter, the easiest way is to have 2 or more subscribers subscribed to the same sets and one at a time; dropa subscriber node, upgrade and re-initdb, then use clone node to recreate it from another subscriber. If you're intenton using pg_upgrade you might be able to fudge it as long as you can bump up current txid to be greater than what itwas before the upgrade; in fact I've done similar before with a slony subscriber, but only as a test on a small database.