Discussion: pg_upgrade with large pg_largeobject table


pg_upgrade with large pg_largeobject table

From:
Mate Varga
Date:
Hi,

Thanks in advance for any advice.

We have a PostgreSQL 9.5 database with 16 GB of physical RAM and ~1 TB of data, mostly stored in the pg_largeobject system table. That table currently has 250M rows. We're trying to upgrade to 10.x with an in-place upgrade. The command I'm using is:

sudo -u postgres /usr/lib/postgresql/10/bin/pg_upgrade -b /usr/lib/postgresql/9.5/bin -B /usr/lib/postgresql/10/bin -p 5433 -P 5434 -d /etc/postgresql/9.5/test -D /etc/postgresql/10/test --link

This command fails because of an OOM. The logs say:

command: "/usr/lib/postgresql/10/bin/pg_dump" --host /var/log/postgresql/pg_upgradecluster-9.5-10-test.wjNi --port 5433 --username postgres --schema-only --quote-all-identifiers --binary-upgrade --format=custom  --file="pg_upgrade_dump_31803.custom" 'dbname=tolven' >> "pg_upgrade_dump_31803.log" 2>&1
pg_dump: [archiver (db)] query failed: out of memory for query result
pg_dump: [archiver (db)] query was: SELECT oid, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = lomowner) AS rolname, lomacl, NULL AS rlomacl, NULL AS initlomacl, NULL AS initrlomacl  FROM pg_largeobject_metadata


Any ideas about how we could make this work on a server with only 16 GB of RAM (plus possibly a few dozen gigabytes of swap)?
Would it help if we inlined these large objects as e.g. text columns instead of storing them as LOBs?
Thanks,
Mate

Re: pg_upgrade with large pg_largeobject table

From:
Tom Lane
Date:
Mate Varga <m@matevarga.net> writes:
> We have a PSQL 9.5 DB with 16G physical RAM and ~ 1 TB data mostly stored
> in the pg_largeobject system table. This table has 250M rows at the moment.

You mean 250M rows in pg_largeobject itself, or 250M large objects
(that is, 250M rows in pg_largeobject_metadata)?
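
For reference, both are ordinary catalogs, so the two counts can be checked directly from psql — a quick sketch:

    -- one row per (by default) 2 kB chunk of large-object data
    SELECT count(*) FROM pg_largeobject;
    -- one row per large object, holding its owner and ACL
    SELECT count(*) FROM pg_largeobject_metadata;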

> This command
> fails because of an OOM. Logs say:
> pg_dump: [archiver (db)] query failed: out of memory for query result
> pg_dump: [archiver (db)] query was: SELECT oid, (SELECT rolname FROM
> pg_catalog.pg_roles WHERE oid = lomowner) AS rolname, lomacl, NULL AS
> rlomacl, NULL AS initlomacl, NULL AS initrlomacl  FROM
> pg_largeobject_metadata

The selected rows shouldn't be very wide here, so I'm a bit surprised
you are hitting OOM, even if there are 250M rows to read.  Are you sure
you're using a 64-bit build of pg_dump?

> Would it help if we'd inline these largeobjects as e.g. text cols (instead
> of storing them as lobs)?

Yes, because then they'd not have their own ownership and permissions
for pg_dump to keep track of.  (You might want bytea instead of text,
depending on what the data is.)

Using the large-object API for things that tend to not actually be very
large (which they aren't, if you've got hundreds of millions of 'em)
is an antipattern, I'm afraid.  You could get away with it before we
added per-largeobject permissions, but now it's a problem for pg_dump.

            regards, tom lane


Re: pg_upgrade with large pg_largeobject table

From:
Mate Varga
Date:
Thanks.

> You mean 250M rows in pg_largeobject itself, or 250M large objects
(that is, 250M rows in pg_largeobject_metadata)?

250M large objects.

> Are you sure you're using a 64-bit build of pg_dump?

 file /usr/lib/postgresql/10/bin/pg_dump
/usr/lib/postgresql/10/bin/pg_dump: ELF 64-bit LSB  shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=4ff6bf0dd628d151481256723994dd50785e54e9, stripped
Also, it really does eat all the memory.

> Using the large-object API for things that tend to not actually be very large (which they aren't, if you've got hundreds of millions of 'em)  is an antipattern, I'm afraid.

I know :( So maybe I need to do some refactoring in the application and inline the LOBs. The data is binary with very high entropy (encrypted stuff). Would you recommend bytea for that?

Thanks again.
Mate



Re: pg_upgrade with large pg_largeobject table

From:
Tom Lane
Date:
Mate Varga <m@matevarga.net> writes:
>> Using the large-object API for things that tend to not actually be very
>> large (which they aren't, if you've got hundreds of millions of 'em) is an
>> antipattern, I'm afraid.

> I know :( So maybe I need to do some refactoring in the application and
> inline the lobs. The data is binary data with very high entropy (encrypted
> stuff). Would you recommend bytea for that?

Yeah, it'd likely be less of a pain-in-the-neck than text.  You would need
some sort of encoding anyway to deal with zero bytes and sequences that
aren't valid per your encoding, so you might as well go with bytea's
solution.
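
As an illustration only, here's a minimal migration sketch, assuming a hypothetical table "docs" whose "content_oid" column holds large-object OIDs (lo_get() needs 9.4 or later, and with this many rows you'd want to batch the UPDATE):

    -- "docs" and its columns are hypothetical names, not from the thread
    ALTER TABLE docs ADD COLUMN content bytea;
    UPDATE docs SET content = lo_get(content_oid);
    -- after verifying the copy, drop the orphaned large objects and the old column
    SELECT lo_unlink(content_oid) FROM docs WHERE content_oid IS NOT NULL;
    ALTER TABLE docs DROP COLUMN content_oid;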

            regards, tom lane