I did a quick grep for PG_CATCH uses to see what we have along the lines
of this bug:
http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php
In current sources there are three places at risk:

  * btbulkdelete, as noted in the above message
  * pg_start_backup needs to reset forcePageWrites = false
  * createdb wants to UnlockSharedObject and delete any already-copied files

In createdb, the Unlock actually doesn't need to get cleaned up, since
transaction abort would release the lock anyway. But leaving a possibly
large mess of orphaned files doesn't seem nice.
ISTM that there will be more cases like this in future, so we need a
general solution anyway. I propose the following sort of code structure
for these situations:

	on_shmem_exit(cleanup_routine, arg);
	PG_TRY();
	{
		... do something ...
		cancel_shmem_exit(cleanup_routine, arg);
	}
	PG_CATCH();
	{
		cancel_shmem_exit(cleanup_routine, arg);
		cleanup_routine(arg);
		PG_RE_THROW();
	}
	PG_END_TRY();

where cancel_shmem_exit is defined to pop the latest shmem_exit_list
entry if it matches the passed arguments (I don't see any particular
need to search further than that in the list). This structure
guarantees that cleanup_routine will be called on the way out of
either a plain ERROR or a FATAL exit.
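
To make the control flow concrete, here is a toy standalone model of the proposed structure. It is only a sketch under stated assumptions, not backend code: setjmp/longjmp stands in for PG_TRY/PG_CATCH, a small array stands in for shmem_exit_list, and the names register_exit, cancel_exit, run_exit_callbacks, do_something and guarded_operation are all hypothetical. The callback signature is simplified to a single pointer argument, and the re-throw is omitted.

```c
#include <assert.h>
#include <setjmp.h>

/* Toy callback type; simplified for illustration. */
typedef void (*exit_callback) (void *arg);

#define MAX_EXIT_CALLBACKS 8

/* Toy stand-in for shmem_exit_list: a LIFO stack of (function, arg) pairs. */
static struct
{
	exit_callback fn;
	void	   *arg;
}			exit_list[MAX_EXIT_CALLBACKS];
static int	exit_list_len = 0;

/* Toy stand-in for on_shmem_exit: push a callback. */
static void
register_exit(exit_callback fn, void *arg)
{
	assert(exit_list_len < MAX_EXIT_CALLBACKS);
	exit_list[exit_list_len].fn = fn;
	exit_list[exit_list_len].arg = arg;
	exit_list_len++;
}

/* Toy stand-in for the proposed cancel_shmem_exit: pop the latest entry,
 * but only if it matches; no deeper search of the list. */
static void
cancel_exit(exit_callback fn, void *arg)
{
	if (exit_list_len > 0 &&
		exit_list[exit_list_len - 1].fn == fn &&
		exit_list[exit_list_len - 1].arg == arg)
		exit_list_len--;
}

/* Toy stand-in for a FATAL exit: run callbacks newest-first. A real
 * process would terminate afterwards; here we simply return. */
static void
run_exit_callbacks(void)
{
	while (exit_list_len > 0)
	{
		exit_list_len--;
		exit_list[exit_list_len].fn(exit_list[exit_list_len].arg);
	}
}

enum outcome { OUTCOME_OK, OUTCOME_ERROR, OUTCOME_FATAL };

static jmp_buf error_context;

/* Example cleanup: just counts how many times it was invoked. */
static void
cleanup_routine(void *arg)
{
	++*(int *) arg;
}

/* The "... do something ..." step, failing in the requested way. */
static void
do_something(enum outcome outcome)
{
	if (outcome == OUTCOME_ERROR)
		longjmp(error_context, 1);	/* models a plain ERROR */
	if (outcome == OUTCOME_FATAL)
		run_exit_callbacks();		/* models a FATAL exit */
}

static void
guarded_operation(enum outcome outcome, int *cleanup_calls)
{
	register_exit(cleanup_routine, cleanup_calls);
	if (setjmp(error_context) == 0)		/* models PG_TRY */
	{
		do_something(outcome);
		/* After the toy FATAL path the list is already empty, so this
		 * is a no-op there; on success it removes our entry. */
		cancel_exit(cleanup_routine, cleanup_calls);
	}
	else								/* models PG_CATCH */
	{
		cancel_exit(cleanup_routine, cleanup_calls);
		cleanup_routine(cleanup_calls);
		/* a real caller would re-throw here */
	}
}
```

Calling guarded_operation with each outcome and a zeroed counter shows the point of the structure: the counter stays unchanged on success and is bumped exactly once on either the ERROR or the FATAL path, and the list is empty again afterwards in all three cases.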
Thoughts? We clearly must do something about this before we can even
think of calling retail SIGTERM a supported feature ...
regards, tom lane