Обсуждение: Upgrade from PG12 to PG
Hello,
I want to upgrade from PG12 to P15.
I use /usr/lib/postgresql15/bin/pg_upgrade -v -p 5431 -P 5432
I takes a lot of time, more than 6 hours
Log file, al lot of  lines pg_restore: executing BLOB 1
Looking at the dump file: man many lines like SELECT 
pg_catalog.lo_unlink('100000');
I have the same issue with /usr/lib/postgresql15/bin/pg_upgrade -v -p 
5431 -P 5432 -k
Whats going on ?
Kind regards
			
		Hi Jef,
On Thu, Jul 20, 2023 at 1:23 PM Jef Mortelle <jefmortelle@gmail.com> wrote:
> Looking at the dump file: man many lines like SELECT
> pg_catalog.lo_unlink('100000');
>
>
> I have the same issue with /usr/lib/postgresql15/bin/pg_upgrade -v -p
> 5431 -P 5432 -k
>
>
> Whats going on ?
pg_upgrade is known to be problematic with large objects.
Please take a look here to start with:
https://www.postgresql.org/message-id/20210309200819.GO2021%40telsasoft.com
>
>
> Kind regards
>
>
>
			
		On Jul 20, 2023, at 7:23 AM, Jef Mortelle <jefmortelle@gmail.com> wrote: > I use /usr/lib/postgresql15/bin/pg_upgrade -v -p 5431 -P 5432 > > I takes a lot of time, more than 6 hours I'm in the middle of a similar database migration myself, but I'm coming from 10.23. Have you considered adding the --link and --jobs=NN (where NN is the number of CPU cores on your server minus 2 or somethinglike that) to your pg_upgrade command? I wonder if that would speed things up significantly. Also, if your file system supports reflinks (Linux kernel 4.5 and XFS with reflink support activated or Btrfs or macOS APFS),then use --clone instead of --link. Good luck, Ed
Hi, Alsready tried to use --link and --jobs, but you cannot ommit the "select lo_unlink ...." for every rows containing datatype text in your database that the pg_* program creates in the export/dump file. These select's takes a lot of time, hours ... The pg_dumpall => size of the dumpfile is much more than the size of exported database Kind regards On 20/07/2023 15:29, Edward J. Sabol wrote: > On Jul 20, 2023, at 7:23 AM, Jef Mortelle <jefmortelle@gmail.com> wrote: >> I use /usr/lib/postgresql15/bin/pg_upgrade -v -p 5431 -P 5432 >> >> I takes a lot of time, more than 6 hours > I'm in the middle of a similar database migration myself, but I'm coming from 10.23. > > Have you considered adding the --link and --jobs=NN (where NN is the number of CPU cores on your server minus 2 or somethinglike that) to your pg_upgrade command? I wonder if that would speed things up significantly. > > Also, if your file system supports reflinks (Linux kernel 4.5 and XFS with reflink support activated or Btrfs or macOSAPFS), then use --clone instead of --link. > > Good luck, > Ed >
Hi,
Many thanks for your answer.
So: not possible to have very little downtime if you have a database 
with al lot rows containing text  as datatype, as pg_upgrade needs 12hr 
for 24 milj rows in pg_largeobject.
Testing now with pg_dumpall en pg_restore ....
I think, postgresql should take this in high priority to  resolve this 
problem.
I have to make a choice in the near future: Postgres or Oracle, and that 
database would have a lot of datatype text.
Database would have 1 TB.
It seems me a little bit tricky/dangerous to use Postgres, just for 
being able to upgrade to a newer version.
Kind regards.
On 20/07/2023 13:43, Ilya Kosmodemiansky wrote:
> Hi Jef,
>
>
> On Thu, Jul 20, 2023 at 1:23 PM Jef Mortelle <jefmortelle@gmail.com> wrote:
>> Looking at the dump file: man many lines like SELECT
>> pg_catalog.lo_unlink('100000');
>>
>>
>> I have the same issue with /usr/lib/postgresql15/bin/pg_upgrade -v -p
>> 5431 -P 5432 -k
>>
>>
>> Whats going on ?
> pg_upgrade is known to be problematic with large objects.
> Please take a look here to start with:
> https://www.postgresql.org/message-id/20210309200819.GO2021%40telsasoft.com
>
>>
>> Kind regards
>>
>>
>>
			
		On Thu, 2023-07-20 at 15:46 +0200, Jef Mortelle wrote: > So: not possible to have very little downtime if you have a database > with al lot rows containing text as datatype, as pg_upgrade needs 12hr > for 24 milj rows in pg_largeobject. To clarify: "text" is no problem at all. Large objects are. Yours, Laurenz Albe
Yes,
 But create text allows you to store more data, 
8.3. Character Types
Table 8.4. Character Types
| Name | Description | 
|---|---|
| character varying(,varchar( | variable-length with limit | 
| character(,char( | fixed-length, blank padded | 
| text | variable unlimited length | 
On 20/07/2023 16:46, Laurenz Albe wrote:
On Thu, 2023-07-20 at 15:46 +0200, Jef Mortelle wrote:So: not possible to have very little downtime if you have a database with al lot rows containing text as datatype, as pg_upgrade needs 12hr for 24 milj rows in pg_largeobject.To clarify: "text" is no problem at all. Large objects are. Yours, Laurenz Albe
> On Jul 20, 2023, at 7:46 AM, Jef Mortelle <jefmortelle@gmail.com> wrote: > > So: not possible to have very little downtime if you have a database with al lot rows containing text as datatype, aspg_upgrade needs 12hr for 24 milj rows in pg_largeobject. We need to get terminology straight, as at the moment your posts are very confusing. In PostgreSQL large objects and textare not the same. Text is basically varchar without a specified length limit. Large object is a blob (but not what SQLcalls a BLOB)--it is kind of like a file stored outside the normal table mechanism, and provides facilities for partialreads, etc: https://www.postgresql.org/docs/15/largeobjects.html. There are a number of ways to wind up with referencesto large objects all deleted, but the orphaned large objects still in the database. First thing you should do: run lovacuum -n to find out if you have orphaned large objects. If so, start cleaning those up,then see how long pg_upgrade takes. Second, what's your hardware? I really don't see dump & restore of a 1TB database taking 6 hours. > Alsready tried to use --link and --jobs, but you cannot ommit the "select lo_unlink ...." for every rows containingdatatype text in your database that the pg_* program creates in the export/dump file. Terminology again, or are you conflating two different issues? pg_upgrade --link does not create a dump file.
> But create text allows you to store more data, not true varchar without length specified and text both have an upper limit around 1GB
Don't use pg_dumpall.  Use this instead:
pg_dump --format=directory --jobs=X --verbose
On 7/20/23 08:46, Jef Mortelle wrote:
> Hi,
>
> Many thanks for your answer.
>
> So: not possible to have very little downtime if you have a database with 
> al lot rows containing text  as datatype, as pg_upgrade needs 12hr for 24 
> milj rows in pg_largeobject.
>
> Testing now with pg_dumpall en pg_restore ....
>
>
> I think, postgresql should take this in high priority to  resolve this 
> problem.
>
> I have to make a choice in the near future: Postgres or Oracle, and that 
> database would have a lot of datatype text.
> Database would have 1 TB.
> It seems me a little bit tricky/dangerous to use Postgres, just for being 
> able to upgrade to a newer version.
>
> Kind regards.
>
> On 20/07/2023 13:43, Ilya Kosmodemiansky wrote:
>> Hi Jef,
>>
>>
>> On Thu, Jul 20, 2023 at 1:23 PM Jef Mortelle <jefmortelle@gmail.com> wrote:
>>> Looking at the dump file: man many lines like SELECT
>>> pg_catalog.lo_unlink('100000');
>>>
>>>
>>> I have the same issue with /usr/lib/postgresql15/bin/pg_upgrade -v -p
>>> 5431 -P 5432 -k
>>>
>>>
>>> Whats going on ?
>> pg_upgrade is known to be problematic with large objects.
>> Please take a look here to start with:
>> https://www.postgresql.org/message-id/20210309200819.GO2021%40telsasoft.com
>>
>>>
>>> Kind regards
>>>
>>>
>>>
>
>
-- 
Born in Arizona, moved to Babylonia.
			
		running /usr/lib/postgresql15/bin/pg_upgrade -v -p 5431 -P 5432 -k gives you in the output : ... "/usr/lib/postgresql15/bin/pg_dump" --host /pg/PG15 --port 5431 --username postgres --schema-only --quote-all-identifiers --binary-upgrade --format=custom --verbose --file="/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/dump/pg_upgrade_dump_16385.custom" 'dbname=dbname' >> "/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/log/pg_upgrade_dump_16385.log" 2>&1 ... so, yes pg_ugrade start a pg_dump session, even a pg_restore at the end, runs about 7 hours "/usr/lib/postgresql15/bin/pg_restore" --host /pg/PG15 --port 5432 --username postgres --create --exit-on-error --verbose --dbname template1 "/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/dump/pg_upgrade_dump_16385.custom" >> "/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/log/pg_upgrade_dump_16385.log" 2>&1 with a lot of lines in the logfiles like: pg_restore: executing BLOB 11108809 Server is a VM server, my VM has 64GB SuseSLES attached to a SAN with SSD disk (Hp3Par) 2)vacuumlo lovacum -n => I have vacuumlo, running this: vacuumlo dbname -v postgres@dbfluxd02:/pg/data> vacuumlo dflux -v Connected to database "dflux" Checking message_oid in fluxnaf.flux_message_content Checking payload_oid in fluxnaf.bridge_payload_content Successfully removed 0 large objects from database "dflux". and again same issues On 20/07/2023 16:51, Scott Ribe wrote: >> On Jul 20, 2023, at 7:46 AM, Jef Mortelle <jefmortelle@gmail.com> wrote: >> >> So: not possible to have very little downtime if you have a database with al lot rows containing text as datatype, aspg_upgrade needs 12hr for 24 milj rows in pg_largeobject. > We need to get terminology straight, as at the moment your posts are very confusing. In PostgreSQL large objects and textare not the same. Text is basically varchar without a specified length limit. Large object is a blob (but not what SQLcalls a BLOB)--it is kind of like a file stored outside the normal table mechanism, and provides facilities for partialreads, etc: https://www.postgresql.org/docs/15/largeobjects.html. There are a number of ways to wind up with referencesto large objects all deleted, but the orphaned large objects still in the database. > > First thing you should do: run lovacuum -n to find out if you have orphaned large objects. If so, start cleaning thoseup, then see how long pg_upgrade takes. > > Second, what's your hardware? I really don't see dump & restore of a 1TB database taking 6 hours. > >> Alsready tried to use --link and --jobs, but you cannot ommit the "select lo_unlink ...." for every rows containingdatatype text in your database that the pg_* program creates in the export/dump file. > Terminology again, or are you conflating two different issues? pg_upgrade --link does not create a dump file.
			
				 Jeff,
It does a "--schema-only" dump. The gianter the schema, the longer that "pg_dump --schema-only" takes.
Note also that there's a known issue with pg_upgrade and millions of Large Objects (not bytea or text, but lo_* columns).
		
	It does a "--schema-only" dump. The gianter the schema, the longer that "pg_dump --schema-only" takes.
Note also that there's a known issue with pg_upgrade and millions of Large Objects (not bytea or text, but lo_* columns).
On 7/20/23 12:05, Jef Mortelle wrote:
running /usr/lib/postgresql15/bin/pg_upgrade -v -p 5431 -P 5432 -k
gives you in the output :
...
"/usr/lib/postgresql15/bin/pg_dump" --host /pg/PG15 --port 5431 --username postgres --schema-only --quote-all-identifiers --binary-upgrade --format=custom --verbose --file="/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/dump/pg_upgrade_dump_16385.custom" 'dbname=dbname' >> "/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/log/pg_upgrade_dump_16385.log" 2>&1
...
so, yes pg_ugrade start a pg_dump session,
even a pg_restore at the end, runs about 7 hours
"/usr/lib/postgresql15/bin/pg_restore" --host /pg/PG15 --port 5432 --username postgres --create --exit-on-error --verbose --dbname template1 "/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/dump/pg_upgrade_dump_16385.custom" >> "/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/log/pg_upgrade_dump_16385.log" 2>&1
with a lot of lines in the logfiles like: pg_restore: executing BLOB 11108809
Server is a VM server, my VM has 64GB SuseSLES attached to a SAN with SSD disk (Hp3Par)
2)vacuumlo
lovacum -n => I have vacuumlo, running this:
vacuumlo dbname -v
postgres@dbfluxd02:/pg/data> vacuumlo dflux -v
Connected to database "dflux"
Checking message_oid in fluxnaf.flux_message_content
Checking payload_oid in fluxnaf.bridge_payload_content
Successfully removed 0 large objects from database "dflux".
and again same issues
On 20/07/2023 16:51, Scott Ribe wrote:On Jul 20, 2023, at 7:46 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:We need to get terminology straight, as at the moment your posts are very confusing. In PostgreSQL large objects and text are not the same. Text is basically varchar without a specified length limit. Large object is a blob (but not what SQL calls a BLOB)--it is kind of like a file stored outside the normal table mechanism, and provides facilities for partial reads, etc: https://www.postgresql.org/docs/15/largeobjects.html. There are a number of ways to wind up with references to large objects all deleted, but the orphaned large objects still in the database.
So: not possible to have very little downtime if you have a database with al lot rows containing text as datatype, as pg_upgrade needs 12hr for 24 milj rows in pg_largeobject.
First thing you should do: run lovacuum -n to find out if you have orphaned large objects. If so, start cleaning those up, then see how long pg_upgrade takes.
Second, what's your hardware? I really don't see dump & restore of a 1TB database taking 6 hours.Alsready tried to use --link and --jobs, but you cannot ommit the "select lo_unlink ...." for every rows containing datatype text in your database that the pg_* program creates in the export/dump file.Terminology again, or are you conflating two different issues? pg_upgrade --link does not create a dump file.
-- 
Born in Arizona, moved to Babylonia.
			
		Born in Arizona, moved to Babylonia.
> On Jul 20, 2023, at 11:05 AM, Jef Mortelle <jefmortelle@gmail.com> wrote: > > so, yes pg_ugrade start a pg_dump session, Only for the schema, which you can see in the output you posted. > Server is a VM server, my VM has 64GB SuseSLES attached to a SAN with SSD disk (Hp3Par) VM + SAN can perform well, or introduce all sorts of issues: busy neighbor, poor VM drivers, SAN only fast for large sequentialwrites, etc. > On Jul 20, 2023, at 11:22 AM, Ron <ronljohnsonjr@gmail.com> wrote: > > Note also that there's a known issue with pg_upgrade and millions of Large Objects (not bytea or text, but lo_* columns). Good to know, but it would be weird to have millions of large objects in a 1TB database. (Then again, I found an old postabout 3M large objects taking 5.5GB...) Try: time a run of that pg_dump command, then time a run of pg_restore of the schema only dump time a file copy of the db to a location on the SAN--purpose is not to produce a usable backup, but rather to check IOthroughput use the link option on pg_upgrade Searching on this subject turns up some posts about slow restore of large objects under much older versions of PG--not sureif any of it still applies. Finally given the earlier confusion between text and large objects, your apparent belief that text columns correlated tolarge objects, and that text could hold more data than varchar, it's worth asking: do you actually need large objects atall? (Is this even under your control?)
On 20/07/2023 16:51, Scott Ribe wrote:
Alsready tried to use --link and --jobs, but you cannot ommit the "select lo_unlink ...." for every rows containing datatype text in your database that the pg_* program creates in the export/dump file.Terminology again, or are you conflating two different issues? pg_upgrade --link does not create a dump file.
why does pg_dump/pg_restore for every row in pg_largeobject a "select lo_unlink ...." this takes a lot of time
On 20/07/2023 20:34, Scott Ribe wrote:
On Jul 20, 2023, at 11:05 AM, Jef Mortelle <jefmortelle@gmail.com> wrote: so, yes pg_ugrade start a pg_dump session,Only for the schema, which you can see in the output you posted.
=> the pg_restore of this pg_dump takes about 7 hours ... which is 99% used for executing the query like:  SELECT pg_catalog.lo_unlink('oid'); 
=> pg_dump schema_only, after RAM upgrade from 8GB up to 64GB (otherwise the query against pg_largeobject ends in a OUT of Memory error) runs in about 3-4 minutesGood to know, but it would be weird to have millions of large objects in a 1TB database. (Then again, I found an old post about 3M large objects taking 5.5GB...) Try: time a run of that pg_dump command, then time a run of pg_restore of the schema only dump
=> pg_restore takes 7 hours, which is 99% used for executing the query like: SELECT pg_catalog.lo_unlink('oid');
I used the link option in al my tests, and it takes the timesuse the link option on pg_upgrade
For some reason Postgres creates a new subdirectory for each PG version (I make use of tablespaces for each database in my PG cluster), also with using the link option.
So after some upgrade, it ends in a really mess with directory's?
The use of OID (large objects): it depends on the vendor of the software. I can ask the vendor to change to another type .... but honestly I don't believe it will changed in the near feature.Searching on this subject turns up some posts about slow restore of large objects under much older versions of PG--not sure if any of it still applies. Finally given the earlier confusion between text and large objects, your apparent belief that text columns correlated to large objects, and that text could hold more data than varchar, it's worth asking: do you actually need large objects at all? (Is this even under your control?)
Database is 95GB, so not so big ;-) but have ~25miljon large objects in it.
> On Jul 24, 2023, at 12:38 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:
> For some reason Postgres creates a new subdirectory for each PG version (I make use of tablespaces for each database
inmy PG cluster), also with using the link option. 
> So after some upgrade,  it ends in a really mess with directory's?
At the end of pg_upgrade, you can start up the old version against the old directory, or the new version against the
newdirectory. (With --link, only until writing into the db, then you are committed to the running version.) Once you
arecomfortable that everything is good with the new version, you should delete the old data. Alternatively, if there is
aproblem forcing you back to the old version, you delete the new data. 
> => pg_dump schema_only, after RAM upgrade from 8GB up to 64GB (otherwise the query against pg_largeobject ends in a
OUTof Memory error) runs in about 3-4 minutes 
> => pg_restore takes 7 hours, which is 99% used for executing the query like:  SELECT pg_catalog.lo_unlink('oid');
Given the tests you've run, it seems to me that it is doing something which it ought not when using --link.
> Database is 95GB, so not so big ;-) but have ~25miljon large objects in it.
I suppose the use of large objects here is an artifact of support for other databases which have much lower limits on
varcharcolumn length. 
			
		Hello
This is the syntax:
export PGDATA=/pg/PG15/system
export PATH=/usr/lib/postgresql15/bin:/bin:/usr/bin:/usr/local/bin
export PGDATAOLD=/pg/data
export PGDATANEW=/pg/PG15/system
export PGBINOLD=/usr/lib/postgresql12/bin
export PGBINNEW=/usr/lib/postgresql15/bin
/usr/lib/postgresql15/bin/pg_upgrade -r -v -p 5431 -P 5432 -k -j 8
  -r =--link
Kind regards
On 24/07/2023 14:52, Scott Ribe wrote:
>> On Jul 24, 2023, at 12:38 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:
>> For some reason Postgres creates a new subdirectory for each PG version (I make use of tablespaces for each database
inmy PG cluster), also with using the link option.
 
>> So after some upgrade,  it ends in a really mess with directory's?
> At the end of pg_upgrade, you can start up the old version against the old directory, or the new version against the
newdirectory. (With --link, only until writing into the db, then you are committed to the running version.) Once you
arecomfortable that everything is good with the new version, you should delete the old data. Alternatively, if there is
aproblem forcing you back to the old version, you delete the new data.
 
>
>> => pg_dump schema_only, after RAM upgrade from 8GB up to 64GB (otherwise the query against pg_largeobject ends in a
OUTof Memory error) runs in about 3-4 minutes
 
>> => pg_restore takes 7 hours, which is 99% used for executing the query like:  SELECT pg_catalog.lo_unlink('oid');
> Given the tests you've run, it seems to me that it is doing something which it ought not when using --link.
>
>> Database is 95GB, so not so big ;-) but have ~25miljon large objects in it.
> I suppose the use of large objects here is an artifact of support for other databases which have much lower limits on
varcharcolumn length.
 
>
>
			
		correction: -k = --link
On 24/07/2023 14:59, Jef Mortelle wrote:
> Hello
>
> This is the syntax:
>
> export PGDATA=/pg/PG15/system
> export PATH=/usr/lib/postgresql15/bin:/bin:/usr/bin:/usr/local/bin
>
> export PGDATAOLD=/pg/data
> export PGDATANEW=/pg/PG15/system
> export PGBINOLD=/usr/lib/postgresql12/bin
> export PGBINNEW=/usr/lib/postgresql15/bin
>
> /usr/lib/postgresql15/bin/pg_upgrade -r -v -p 5431 -P 5432 -k -j 8
>
>  -r =--link
>
> Kind regards
>
> On 24/07/2023 14:52, Scott Ribe wrote:
>>> On Jul 24, 2023, at 12:38 AM, Jef Mortelle <jefmortelle@gmail.com> 
>>> wrote:
>>> For some reason Postgres creates a new subdirectory for each PG 
>>> version (I make use of tablespaces for each database in my PG 
>>> cluster), also with using the link option.
>>> So after some upgrade,  it ends in a really mess with directory's?
>> At the end of pg_upgrade, you can start up the old version against 
>> the old directory, or the new version against the new directory. 
>> (With --link, only until writing into the db, then you are committed 
>> to the running version.) Once you are comfortable that everything is 
>> good with the new version, you should delete the old data. 
>> Alternatively, if there is a problem forcing you back to the old 
>> version, you delete the new data.
>>
>>> => pg_dump schema_only, after RAM upgrade from 8GB up to 64GB 
>>> (otherwise the query against pg_largeobject ends in a OUT of Memory 
>>> error) runs in about 3-4 minutes
>>> => pg_restore takes 7 hours, which is 99% used for executing the 
>>> query like:  SELECT pg_catalog.lo_unlink('oid');
>> Given the tests you've run, it seems to me that it is doing something 
>> which it ought not when using --link.
>>
>>> Database is 95GB, so not so big ;-) but have ~25miljon large objects 
>>> in it.
>> I suppose the use of large objects here is an artifact of support for 
>> other databases which have much lower limits on varchar column length.
>>
>>