Обсуждение: ERROR: invalid memory alloc request size when committing transaction
Hello Experts We have a large-ish (16T) database cluster which were are performing the following sequence on. - First we upgrade the whole cluster from pg11 to pg13, using pg_upgrade (this succeeds) - Next we run a migration script on each database in the cluster. The migration script converts a large number of tables from inheritance based partitioning to declarative partitioning. Unfortunately I am not at liberty to share the migration script. The second step succeeds for most of the databases in the cluster but fails on one of them. The migration is performed inside a transaction, and during the process of committing the transaction the following error is thrown: [2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 ERROR: invalid memory alloc request size 1073741824 [2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 STATEMENT: commit [2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 WARNING: AbortTransaction while in COMMIT state The transaction is rolled back. I have looked into the error message - it is very low level from the memory manager and occurs when a memory allocation of >= 1GB is requested. Most of the hits on google for this error indicate database corruption, however I am not sure this is the case for us as we have been able to do a complete pg_dump on the database without errors. We repeated the migration with all postgres debug logging enabled - however this did not provide any more detail than above. Luckily this has occurred while we were testing the procedure on a replica of the production system, not on the actual production system. We are using pg 13.0. We are currently re-testing this with huge pages disabled (on a hunch) and after that we plan to re-test it on 13.3. Any ideas as to what could be causing this problem, or any suggestions for troubleshooting steps we could take? Many thanks in advance, Cheers Mike.
Michael Harris <michael.harris@ericsson.com> writes: > - Next we run a migration script on each database in the cluster. The migration script > converts a large number of tables from inheritance based partitioning to declarative > partitioning. Unfortunately I am not at liberty to share the migration script. How many is "a large number"? > The migration is performed inside a transaction, and during the process of committing the > transaction the following error is thrown: > [2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 ERROR: invalid memory alloc request size 1073741824 > [2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 STATEMENT: commit > [2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 WARNING: AbortTransaction while in COMMIT state It's impossible to do more than speculate with this amount of information. However, a plausible speculation is that you're processing so many objects in one transaction as to overrun the limit on how large a WAL commit record, or something related to that, can be. Can you adjust your migration script to break it into multiple transactions that process fewer tables per transaction? regards, tom lane
Thanks Tom, > How many is "a large number"? 377k approx. I thought we had logged all the SQL statements last time we ran this, but we forgot to set log_statement. We'll do that nexttime so we can get a figure on exactly how many of different types of statement are in the transaction. > Can you adjust your migration script to break it into multiple transactions that process fewer tables per transaction? We are considering this. It is not ideal, because if conversion fails for any reason we would like the database to roll backto the pre-migration state, but maybe that's not possible with a database of this size. I am surprised by the error message though - I thought that if we'd hit some limit on a transaction we would get a more specificerror. Cheers Mike -----Original Message----- From: Tom Lane <tgl@sss.pgh.pa.us> Sent: Thursday, August 12, 2021 1:41 PM To: Michael Harris <michael.harris@ericsson.com> Cc: 'pgsql-general@lists.postgresql.org' <pgsql-general@lists.postgresql.org> Subject: Re: ERROR: invalid memory alloc request size when committing transaction Michael Harris <michael.harris@ericsson.com> writes: > - Next we run a migration script on each database in the cluster. The migration script > converts a large number of tables from inheritance based partitioning to declarative > partitioning. Unfortunately I am not at liberty to share the migration script. How many is "a large number"? > The migration is performed inside a transaction, and during the > process of committing the transaction the following error is thrown: > [2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 > ERROR: invalid memory alloc request size 1073741824 > [2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 > STATEMENT: commit > [2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 > WARNING: AbortTransaction while in COMMIT state It's impossible to do more than speculate with this amount of information. However, a plausible speculation is that you're processing so many objects in one transaction as to overrun the limit onhow large a WAL commit record, or something related to that, can be. Can you adjust your migration script to break itinto multiple transactions that process fewer tables per transaction? regards, tom lane
On Thu, 12 Aug 2021 at 06:42, Michael Harris <michael.harris@ericsson.com> wrote: > > Thanks Tom, > > > How many is "a large number"? > > 377k approx. I'm going to guess that it is the invalidation messages for all the DDL that is causing the memory allocation error. If you set wal_level = minimal then this might work. The total number indicates you are using too many partitions and should probably lower that by a factor of about 100. If you do that, you probably won't need to set wal_level. -- Simon Riggs http://www.EnterpriseDB.com/
Thanks Simon Setting wal_level to minimal has indeed allowed it to work. Thanks for the suggestion! We aren't easily able to change the number of partitions - it consists of a large number of partitioned tables each havinga reasonable number of partitions, but they multiply up to a huge number. Probably it needs to be broken into multiplesmaller databases, but that's easier said than done. Thanks again Cheers Mike -----Original Message----- From: Simon Riggs <simon.riggs@enterprisedb.com> Sent: Thursday, August 12, 2021 7:19 PM To: Michael Harris <michael.harris@ericsson.com> Cc: Tom Lane <tgl@sss.pgh.pa.us>; pgsql-general@lists.postgresql.org Subject: Re: ERROR: invalid memory alloc request size when committing transaction On Thu, 12 Aug 2021 at 06:42, Michael Harris <michael.harris@ericsson.com> wrote: > > Thanks Tom, > > > How many is "a large number"? > > 377k approx. I'm going to guess that it is the invalidation messages for all the DDL that is causing the memory allocation error. If youset wal_level = minimal then this might work. The total number indicates you are using too many partitions and should probably lower that by a factor of about 100. If you do that, you probably won't need to set wal_level. -- Simon Riggs http://www.EnterpriseDB.com/