Обсуждение: Patch to speed up pg_dump
Hi, We have a database with tens of millions of large objects, none of them with any comments. When running pg_dump it spends several hours looking for BLOB comments, finding none at the end but taxing the server so much that the simplest query takes seconds to complete. The attached patch fixes this by fetching the description only of those BLOBs that may have it. For those interested, some more info: This query takes about 2 hours to execute: sw2=# select count(*) from pg_largeobject; count ----------- 135807552 (1 row) I'm throttling the transfer rate on pg_dump's stdout so that it doesn't affect server performance a lot, but obviously it didn't help the function saving (looking for) BLOB comments. Regards, Tamas --- postgresql-8.3.7/src/bin/pg_dump/pg_dump.c.orig 2009-03-31 15:47:28.000000000 -0400 +++ postgresql-8.3.7/src/bin/pg_dump/pg_dump.c 2009-04-01 10:25:39.000000000 -0400 @@ -1759,7 +1759,13 @@ /* Cursor to get all BLOB comments */ if (AH->remoteVersion >= 70200) - blobQry = "DECLARE blobcmt CURSOR FOR SELECT loid, obj_description(loid, 'pg_largeobject') FROM (SELECT DISTINCTloid FROM pg_largeobject) ss"; + /* Get comments for BLOBs that have a matching pg_description row. When there are many + * (millions) of BLOBs without comments this avoids fetching and then ignoring them, + * potentionally saving hours of backup time. + * Note that it may still select BLOBs that have no comment if a pg_description row's objoid + * matches a BLOB's loid, but references an object contained in a different system catalog, + * thus the PQgetisnull() check below is still needed to ignore them. */ + blobQry = "DECLARE blobcmt CURSOR FOR SELECT loid, obj_description(loid, 'pg_largeobject') FROM (SELECT DISTINCTloid FROM pg_description JOIN pg_largeobject ON (pg_description.objoid = pg_largeobject.loid)) ss"; else if (AH->remoteVersion >= 70100) blobQry = "DECLARE blobcmt CURSOR FOR SELECT loid, obj_description(loid) FROM (SELECT DISTINCT loid FROM pg_largeobject)ss"; else
"Vincze, Tamas" <vincze@neb.com> writes: > + * Note that it may still select BLOBs that have no comment if a pg_description row's objoid > + * matches a BLOB's loid, but references an object contained in a different system catalog, ... seems like that would be easy to fix ... regards, tom lane
Tom Lane wrote: > "Vincze, Tamas" <vincze@neb.com> writes: >> + * Note that it may still select BLOBs that have no comment if a pg_description row's objoid >> + * matches a BLOB's loid, but references an object contained in a different system catalog, > > ... seems like that would be easy to fix ... Yes, it wasn't that hard. The revised patch is attached. Originally I didn't want to add more dependencies on the system catalogs. Also, I've left the DECLARE statements untouched for pre-v7.2 backends, so the NULL check on the comment is still needed for those cases and if the description itself is NULL. Regards, Tamas --- postgresql-8.3.7/src/bin/pg_dump/pg_dump.c.orig 2009-03-31 15:47:28.000000000 -0400 +++ postgresql-8.3.7/src/bin/pg_dump/pg_dump.c 2009-04-01 14:07:55.000000000 -0400 @@ -1759,7 +1759,18 @@ /* Cursor to get all BLOB comments */ if (AH->remoteVersion >= 70200) - blobQry = "DECLARE blobcmt CURSOR FOR SELECT loid, obj_description(loid, 'pg_largeobject') FROM (SELECT DISTINCTloid FROM pg_largeobject) ss"; + /* Get comments for BLOBs that have a matching pg_description row. When there are many + * (millions) of BLOBs without comments this avoids fetching and then ignoring them, + * potentionally saving hours of backup time. */ + blobQry = "DECLARE blobcmt CURSOR FOR SELECT loid, obj_description(loid, 'pg_largeobject') FROM (" + "SELECT DISTINCT loid FROM pg_description JOIN pg_largeobject " + "ON (pg_description.objoid = pg_largeobject.loid) " + "WHERE classoid = (" + "SELECT oid FROM pg_class WHERE relname = 'pg_largeobject' AND relnamespace = (" + "SELECT oid FROM pg_namespace WHERE nspname='pg_catalog'" + ")" + ") AND objsubid = 0" + ") ss"; else if (AH->remoteVersion >= 70100) blobQry = "DECLARE blobcmt CURSOR FOR SELECT loid, obj_description(loid) FROM (SELECT DISTINCT loid FROM pg_largeobject)ss"; else
"Vincze, Tamas" <vincze@neb.com> writes: > Tom Lane wrote: >> "Vincze, Tamas" <vincze@neb.com> writes: > + * Note that it may still select BLOBs that have no comment if a pg_description row's objoid > + * matches a BLOB's loid, but references an object contained in a different system catalog, >> >> ... seems like that would be easy to fix ... > Yes, it wasn't that hard. The revised patch is attached. Applied to HEAD and 8.3, using regclass cast to simplify ... regards, tom lane