Discussion: Troubles dumping a very large table.
(NOTE: I tried sending this email from my excite account and it appears to have been blocked for whatever reason. If the message does get double posted, sorry for the inconvenience.)

Hey all, Merry Christmas Eve, Happy Holidays, and all that good stuff.

At my work, I'm trying to upgrade my system from 8.1 to 8.3, and I'm dumping a few large static tables ahead of time to limit the amount of downtime during the upgrade. The trouble is, when I dump the largest table, which is 1.1 TB with indexes, I keep getting the following error at the same point in the dump:

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: invalid string enlargement request size 1
pg_dump: The command was: COPY public.large_table (id, data) TO stdout;

As you can see, the table is two columns: one column is an integer, and the other is bytea. Each cell in the data column can be as large as 600 MB (we had bigger rows as well, but we thought they were the source of the trouble and moved them elsewhere to be dealt with separately).

We are dumping the table using this command:

/var/lib/pgsql-8.3.5/bin/pg_dump -O -x -t large_table mydb | gzip -c > large_table.pgsql.gz

Originally we tried dumping the table with '/var/lib/pgsql-8.3.5/bin/pg_dump -O -x -t -F c > large_table.dump', but that was too CPU intensive and slowed down other db processes too much. It failed using that command as well, but I believe that was because we did not have enough postgres temp hard drive space. We have since symlinked the postgres temp space to a much bigger file system.

The stats of the db server are as follows:

Processors: 4x Opteron 2.4 GHz cores
Memory: 16 GB
Disks: 42x 15K SCSI 146 GB disks

Also, the large table has been vacuumed recently. Lastly, we are dumping the table over NFS to a very large SATA array.

Thanks again and Happy Holidays,
Ted
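(As an aside, a quick way to confirm a table's on-disk footprint before planning a dump like this is the size functions, which exist in both 8.1 and 8.3; the table name here is just the one from this thread:)

```sql
-- Total size of the table including its indexes and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('public.large_table'));
```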
Ted Allen <tallen@blackducksoftware.com> writes:
> during the upgrade. The trouble is, when I dump the largest table,
> which is 1.1 Tb with indexes, I keep getting the following error at the
> same point in the dump.
> pg_dump: SQL command failed
> pg_dump: Error message from server: ERROR: invalid string enlargement
> request size 1
> pg_dump: The command was: COPY public.large_table (id, data) TO stdout;
> As you can see, the table is two columns, one column is an integer, and
> the other is bytea. Each cell in the data column can be as large as
> 600mb (we had bigger rows as well but we thought they were the source of
> the trouble and moved them elsewhere to be dealt with separately.)

600mb measured how?  I have a feeling the problem is that the value
exceeds 1Gb when converted to text form...

                        regards, tom lane
600 MB measured by get_octet_length on data. If there is a better way to measure the row/cell size, please let me know, because we thought it was the >1 GB problem too. We thought we were being conservative by getting rid of the larger rows, but I guess we need to get rid of even more.

Thanks,
Ted

________________________________________
From: Tom Lane [tgl@sss.pgh.pa.us]
Sent: Wednesday, December 24, 2008 12:49 PM
To: Ted Allen
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Troubles dumping a very large table.

600mb measured how?  I have a feeling the problem is that the value
exceeds 1Gb when converted to text form...

                        regards, tom lane
Ted Allen <tallen@blackducksoftware.com> writes:
> 600mb measured by get_octet_length on data. If there is a better way to
> measure the row/cell size, please let me know because we thought it was
> the >1Gb problem too. We thought we were being conservative by getting
> rid of the larger rows but I guess we need to get rid of even more.

Yeah, the average expansion of bytea data in COPY format is about 3X :-(
So you need to get the max row length down to around 300mb.  I'm curious
how you got the data in to start with --- were the values assembled on
the server side?

                        regards, tom lane
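(The ~3X figure falls out of how COPY's text output escapes bytea. A rough back-of-the-envelope sketch of the pre-9.0 "escape" output rules, not PostgreSQL's actual code: printable ASCII passes through, a literal backslash costs 4 output characters after COPY doubles it, and every other byte becomes a doubled backslash plus three octal digits, i.e. 5 characters:)

```python
import os

def copy_text_len(data: bytes) -> int:
    """Approximate COPY-text length of a bytea value.

    A sketch of the escape-format rules, not PostgreSQL's real code:
    printable ASCII passes through (1 char), a backslash is emitted as
    '\\\\' after COPY doubles it (4 chars), and any other byte becomes
    a doubled backslash plus three octal digits (5 chars).
    """
    n = 0
    for b in data:
        if b == 0x5C:              # literal backslash -> '\\\\'
            n += 4
        elif 0x20 <= b <= 0x7E:    # printable ASCII passes through
            n += 1
        else:                      # '\\nnn' octal escape
            n += 5
    return n

# For random binary data this works out to roughly 3.5X expansion,
# which is why a ~300 MB bytea value can blow past the 1 GB limit
# on a single value in text form.
blob = os.urandom(100_000)
print(copy_text_len(blob) / len(blob))
```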
On Fri, Dec 26, 2008 at 12:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, the average expansion of bytea data in COPY format is about 3X :-(
> So you need to get the max row length down to around 300mb.  I'm curious
> how you got the data in to start with --- were the values assembled on
> the server side?

Wouldn't binary-style COPY be more forgiving in this regard? (If so, the OP might have better luck running COPY BINARY)...

This also goes for libpq traffic: large (>1 MB) bytea values definitely want to be passed using the binary switch in the protocol.

merlin
"Merlin Moncure" <mmoncure@gmail.com> writes:
> On Fri, Dec 26, 2008 at 12:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, the average expansion of bytea data in COPY format is about 3X :-(
>> So you need to get the max row length down to around 300mb.  I'm curious
>> how you got the data in to start with --- were the values assembled on
>> the server side?

> Wouldn't binary style COPY be more forgiving in this regard? (if so,
> the OP might have better luck running COPY BINARY)...

Yeah, if he's willing to use COPY BINARY directly.  AFAIR there is not
an option to get pg_dump to use it.  But maybe "pg_dump -s" together
with a manual dump of the table data is the right answer.  It probably
beats shoving some of the rows aside as he's doing now...

                        regards, tom lane
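(A sketch of what that split might look like; untested here, with the flags mirroring the ones already used in the thread and the output file names made up for illustration:)

```shell
# Schema only via pg_dump, table data via a manual binary COPY.
pg_dump -s -O -x mydb | gzip -c > mydb.schema.pgsql.gz
psql -d mydb -c "COPY public.large_table TO STDOUT WITH BINARY" \
  | gzip -c > large_table.copy.gz

# Later, after restoring the schema on the new cluster:
#   gunzip -c large_table.copy.gz \
#     | psql -d mydb -c "COPY public.large_table FROM STDIN WITH BINARY"
```

Note that a binary COPY dump is less portable than the text form (it is tied to the server's architecture and type layout), which is part of why pg_dump does not offer it.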
I was hoping to use pg_dump and not have to do a manual dump, but if the latest solution (moving rows >300 MB elsewhere and dealing with them later) does not work, I'll try that. Thanks everyone.

Merlin Moncure wrote:
> Wouldn't binary style COPY be more forgiving in this regard? (if so,
> the OP might have better luck running COPY BINARY)...
>
> This also goes for libpq traffic..large (>1mb) bytea definately want
> to be passed using the binary switch in the protocol.
>
> merlin
Hi,

On Friday, December 26, 2008, Tom Lane wrote:
> Yeah, if he's willing to use COPY BINARY directly.  AFAIR there is not
> an option to get pg_dump to use it.

Would it be possible to consider such an additional switch to pg_dump? Of course the DBA has to know when to use it safely, but if the plan is to be able to restore a later dump on the same machine to recover from some human error (oops, forgot the WHERE clause in this DELETE statement), it seems it would be a good idea.

Regards,
--
dim