Обсуждение: ERROR: invalid memory alloc in Pg 9.6.6

Поиск
Список
Период
Сортировка

ERROR: invalid memory alloc in Pg 9.6.6

От
Don Seiler
Дата:
Started seeing "invalid memory alloc" errors in a non-production DB (9.6.6) all of a sudden this afternoon. It's running on CentOS 7.4 on VMWare.

I ran a pg_dump on that DB and got this error almost immediately:

pg_dump: [archiver (db)] query failed: ERROR:  invalid memory alloc request size 8830452760576
pg_dump: [archiver (db)] query was: SELECT at.attname, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) AS attacl, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) AS rattacl, NULL AS initattacl, NULL AS initrattacl FROM pg_catalog.pg_attribute at JOIN pg_catalog.pg_class c ON (at.attrelid = c.oid) LEFT JOIN pg_catalog.pg_init_privs pip ON (at.attrelid = pip.objoid AND pip.classoid = 'pg_catalog.pg_class'::pg_catalog.regclass AND at.attnum = pip.objsubid) WHERE at.attrelid = '16445700'::pg_catalog.oid AND NOT at.attisdropped AND ((SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) IS NOT NULL OR (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) IS NOT NULL OR NULL IS NOT NULL OR NULL IS NOT NULL)ORDER BY at.attnum

I ran pg_catcheck and it came back just fine:
progress: done (0 inconsistencies, 0 warnings, 0 errors)

The web hits I found on this seem to expect the error to hit on the COPY command that dumps the data, but we seem to be hitting this error way ahead of that. I ran a pg_dump on the other DBs in the cluster and it finished without errors.

While this isn't production, I'd like to salvage things if possible (as well as try to determine why this happened). I can take a statement that hit the error from the log and run it myself in psql and it will return fine. I can only seem to hit this error when I run the pg_dump, but it's happening very frequently in this pre-prod DB.

Any hope here?

--
Don Seiler
www.seiler.us

Re: ERROR: invalid memory alloc in Pg 9.6.6

От
Don Seiler
Дата:
On Wed, Apr 11, 2018 at 2:44 PM, Don Seiler <don@seiler.us> wrote:

I can take a statement that hit the error from the log and run it myself in psql and it will return fine.

This includes the statement that hits the error on pg_dump. It errors every time on pg_dump but not when I run it manually from psql as the postgres user:

# SELECT at.attname, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) AS attacl, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) AS rattacl, NULL AS initattacl, NULL AS initrattacl FROM pg_catalog.pg_attribute at JOIN pg_catalog.pg_class c ON (at.attrelid = c.oid) LEFT JOIN pg_catalog.pg_init_privs pip ON (at.attrelid = pip.objoid AND pip.classoid = 'pg_catalog.pg_class'::pg_catalog.regclass AND at.attnum = pip.objsubid) WHERE at.attrelid = '16445700'::pg_catalog.oid AND NOT at.attisdropped AND ((SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) IS NOT NULL OR (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) IS NOT NULL OR NULL IS NOT NULL OR NULL IS NOT NULL)ORDER BY at.attnum;
 attname | attacl | rattacl | initattacl | initrattacl
---------+--------+---------+------------+-------------
(0 rows)

Time: 3.099 ms 



--
Don Seiler
www.seiler.us

Re: ERROR: invalid memory alloc in Pg 9.6.6

От
Jaime Soler
Дата:
I think you are getting the error while pg_dump is collecting acl's of table '16445700'::pg_catalog.oid  https://github.com/postgres/postgres/blob/REL9_6_6/src/bin/pg_dump/pg_dump.c#L15182-L15222 but I am not sure why you are running out of memory 

2018-04-11 21:47 GMT+02:00 Don Seiler <don@seiler.us>:
On Wed, Apr 11, 2018 at 2:44 PM, Don Seiler <don@seiler.us> wrote:

I can take a statement that hit the error from the log and run it myself in psql and it will return fine.

This includes the statement that hits the error on pg_dump. It errors every time on pg_dump but not when I run it manually from psql as the postgres user:

# SELECT at.attname, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) AS attacl, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) AS rattacl, NULL AS initattacl, NULL AS initrattacl FROM pg_catalog.pg_attribute at JOIN pg_catalog.pg_class c ON (at.attrelid = c.oid) LEFT JOIN pg_catalog.pg_init_privs pip ON (at.attrelid = pip.objoid AND pip.classoid = 'pg_catalog.pg_class'::pg_catalog.regclass AND at.attnum = pip.objsubid) WHERE at.attrelid = '16445700'::pg_catalog.oid AND NOT at.attisdropped AND ((SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) IS NOT NULL OR (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) IS NOT NULL OR NULL IS NOT NULL OR NULL IS NOT NULL)ORDER BY at.attnum;
 attname | attacl | rattacl | initattacl | initrattacl
---------+--------+---------+------------+-------------
(0 rows)

Time: 3.099 ms 



--
Don Seiler
www.seiler.us

Re: ERROR: invalid memory alloc in Pg 9.6.6

От
Stephen Frost
Дата:
Greetings,

* Don Seiler (don@seiler.us) wrote:
> Started seeing "invalid memory alloc" errors in a non-production DB (9.6.6)
> all of a sudden this afternoon. It's running on CentOS 7.4 on VMWare.
>
> I ran a pg_dump on that DB and got this error almost immediately:
>
> pg_dump: [archiver (db)] query failed: ERROR:  invalid memory alloc request
> size 8830452760576

That's interesting.  An invalid memory alloc request size complaint such
as that tends to be an indicator of corruption or a bug in the backend
possibly.

Certainly curious that you were able to run the query and not hit that..
Are you sure you were running it exactly the way pg_dump does?  There's
a bunch of setup that pg_dump does early on, maybe log all queries when
you run the pg_dump and then in a psql session try running them..?

If you can reliably reproduce this with either pg_dump or psql, then the
next step would perhaps be jumping in with gdb to see where that error
is being hit..

Thanks!

Stephen

Вложения

Re: ERROR: invalid memory alloc in Pg 9.6.6

От
Don Seiler
Дата:
On Fri, Apr 13, 2018 at 11:09 AM, Stephen Frost <sfrost@snowman.net> wrote:
If you can reliably reproduce this with either pg_dump or psql, then the
next step would perhaps be jumping in with gdb to see where that error
is being hit..

I forgot to follow-up yesterday. It seems like my issue was caused by a bug with the pg_qualstats extension that I was using in conjuction with the POWA extension. There was already a similar issue reported on their tracker at https://github.com/powa-team/powa/issues/104

It seems that high TPS stress testing on Wednesday afternoon triggered the bug. When the stress testing tailed off, the error messages stopped happening and I was able to run pg_dump without errors. I removed those extensions shortly after. The authors have put out a bug fix but I can't use this system to test their fix and my sandbox isn't available for the same HA server stress test that we were doing on Wednesday.

Don.

--
Don Seiler
www.seiler.us

Re: ERROR: invalid memory alloc in Pg 9.6.6

От
Adrien Nayrat
Дата:
On 04/13/2018 06:13 PM, Don Seiler wrote:
> On Fri, Apr 13, 2018 at 11:09 AM, Stephen Frost <sfrost@snowman.net
> <mailto:sfrost@snowman.net>> wrote:
>
>     If you can reliably reproduce this with either pg_dump or psql, then the
>     next step would perhaps be jumping in with gdb to see where that error
>     is being hit..
>
>
> I forgot to follow-up yesterday. It seems like my issue was caused by a bug with
> the pg_qualstats extension that I was using in conjuction with the POWA
> extension. There was already a similar issue reported on their tracker
> at https://github.com/powa-team/powa/issues/104
>
> It seems that high TPS stress testing on Wednesday afternoon triggered the bug.
> When the stress testing tailed off, the error messages stopped happening and I
> was able to run pg_dump without errors. I removed those extensions shortly
> after. The authors have put out a bug fix but I can't use this system to test
> their fix and my sandbox isn't available for the same HA server stress test that
> we were doing on Wednesday.
>
> Don.
>

Hello,

FIY, a new version including bug fix, has been released two days ago :
https://github.com/powa-team/pg_qualstats/releases/tag/1.0.4

Regards,


Вложения