Обсуждение: [PATCH] Fix fd leak in pg_dump compression backends when dup()+fdopen() fails
[PATCH] Fix fd leak in pg_dump compression backends when dup()+fdopen() fails
От
Jianghua Yang
Дата:
Hi,
I found a file descriptor leak in the pg_dump compression backends
(gzip, lz4, zstd, and the uncompressed "none" backend), introduced
in commit e9960732a9 ("Introduce a generic pg_dump compression API",
PostgreSQL 16).
== The Bug ==
All four compression open functions use this pattern when an existing
file descriptor is passed in:
if (fd >= 0)
fp = fdopen(dup(fd), mode); /* or gzdopen() */
if (fp == NULL)
return false; /* dup'd fd is leaked here */
The problem is that dup(fd) and fdopen()/gzdopen() are two separate
steps, and their failure modes must be handled independently:
1. If dup() fails (returns -1), no fd is created -- this case was
not checked at all in the original code.
2. If dup() succeeds but fdopen()/gzdopen() fails (e.g., due to a
failed malloc(3) for the FILE structure), POSIX explicitly states:
"If the fdopen() function fails, the file descriptor is
not closed."
-- POSIX.1-2017, fdopen() specification
The duplicated fd therefore remains open with no owner, leaking
until the process exits.
== When Can dup() Fail? ==
The most realistic trigger for dup() returning EMFILE is parallel
pg_dump (pg_dump -j N) against a large database. Each worker opens
multiple file descriptors for tables, indexes, TOC files, and
compression streams simultaneously. On systems with a low per-process
fd limit (e.g., ulimit -n 1024), or when dumping a schema with a very
large number of objects, the process fd count can approach the limit.
At that point, dup() fails with EMFILE and the subsequent NULL-check
on fp returns false without any cleanup.
While fdopen() failure (EMFILE/OOM) is less common, it is equally
incorrect to ignore -- and it is precisely the case that POSIX calls
out as the caller's responsibility to close.
== The Fix ==
Save the result of dup() in a local variable. Check it immediately.
If fdopen()/gzdopen() subsequently fails, explicitly close the
duplicated fd before returning false.
if (fd >= 0)
{
int dup_fd = dup(fd);
if (dup_fd < 0)
return false;
fp = fdopen(dup_fd, mode);
if (fp == NULL)
{
close(dup_fd); /* POSIX: fdopen does not close on
failure */
return false;
}
}
else
{
fp = fopen(path, mode);
if (fp == NULL)
return false;
}
The zstd fix additionally ensures that the previously allocated
zstdcs structure is freed on all new failure paths.
== Affected Versions ==
PostgreSQL 16 and 17, and current master.
The compress_* files were introduced in commit e9960732a9 (Feb 2023).
== Patch ==
Patch attached. It applies cleanly to current master (as of
commit 18bcdb75d15).
Regards,
Jianghua Yang
Вложения
Jianghua Yang <yjhjstz@gmail.com> writes:
> == The Bug ==
> All four compression open functions use this pattern when an existing
> file descriptor is passed in:
> if (fd >= 0)
> fp = fdopen(dup(fd), mode); /* or gzdopen() */
> if (fp == NULL)
> return false; /* dup'd fd is leaked here */
> The problem is that dup(fd) and fdopen()/gzdopen() are two separate
> steps, and their failure modes must be handled independently:
Hmm. You're right that we could leak the dup'd FD, but would it matter?
I'm pretty sure all these programs will just exit immediately on
failure.
I'm not averse to improving the code, but I'm not sure there is
a live bug worth back-patching.
regards, tom lane
Re: [PATCH] Fix fd leak in pg_dump compression backends when dup()+fdopen() fails
От
Jianghua Yang
Дата:
You're correct. All callers invoke pg_fatal() on failure, so the
process exits immediately and the OS reclaims the fd. There is no
live bug worth back-patching on those grounds.
That said, the patch does fix a real diagnostic problem. In the
original code, when dup() fails with EMFILE, the -1 return value is
passed directly to fdopen(), which fails with EBADF. The user sees:
pg_dump: error: could not open output file: Bad file descriptor
which is misleading -- the actual cause is fd exhaustion, not a bad
descriptor. With the patch, errno is preserved correctly, so the
message becomes:
pg_dump: error: could not open output file: Too many open files
which gives the user actionable information.
I'm happy to limit this to HEAD only if back-patching is not
warranted.
Regards,
Jianghua Yang
process exits immediately and the OS reclaims the fd. There is no
live bug worth back-patching on those grounds.
That said, the patch does fix a real diagnostic problem. In the
original code, when dup() fails with EMFILE, the -1 return value is
passed directly to fdopen(), which fails with EBADF. The user sees:
pg_dump: error: could not open output file: Bad file descriptor
which is misleading -- the actual cause is fd exhaustion, not a bad
descriptor. With the patch, errno is preserved correctly, so the
message becomes:
pg_dump: error: could not open output file: Too many open files
which gives the user actionable information.
I'm happy to limit this to HEAD only if back-patching is not
warranted.
Regards,
Jianghua Yang
Tom Lane <tgl@sss.pgh.pa.us> 于2026年3月19日周四 10:08写道:
Jianghua Yang <yjhjstz@gmail.com> writes:
> == The Bug ==
> All four compression open functions use this pattern when an existing
> file descriptor is passed in:
> if (fd >= 0)
> fp = fdopen(dup(fd), mode); /* or gzdopen() */
> if (fp == NULL)
> return false; /* dup'd fd is leaked here */
> The problem is that dup(fd) and fdopen()/gzdopen() are two separate
> steps, and their failure modes must be handled independently:
Hmm. You're right that we could leak the dup'd FD, but would it matter?
I'm pretty sure all these programs will just exit immediately on
failure.
I'm not averse to improving the code, but I'm not sure there is
a live bug worth back-patching.
regards, tom lane
Jianghua Yang <yjhjstz@gmail.com> writes:
> That said, the patch does fix a real diagnostic problem. In the
> original code, when dup() fails with EMFILE, the -1 return value is
> passed directly to fdopen(), which fails with EBADF. The user sees:
> pg_dump: error: could not open output file: Bad file descriptor
> which is misleading -- the actual cause is fd exhaustion, not a bad
> descriptor. With the patch, errno is preserved correctly, so the
> message becomes:
> pg_dump: error: could not open output file: Too many open files
> which gives the user actionable information.
Fair point. Still, this is such an unlikely edge-case that
I don't think it's worth a back-patch. Let's just do HEAD.
regards, tom lane
I wrote:
> Fair point. Still, this is such an unlikely edge-case that
> I don't think it's worth a back-patch. Let's just do HEAD.
Pushed. I also looked around for other instances of the same
pattern, and couldn't find any.
regards, tom lane
Re: [PATCH] Fix fd leak in pg_dump compression backends when dup()+fdopen() fails
От
Jianghua Yang
Дата:
Thanks for pushing and for the extra scan.
Regards,
Jianghua Yang
Regards,
Jianghua Yang
Tom Lane <tgl@sss.pgh.pa.us> 于2026年3月19日周四 11:26写道:
I wrote:
> Fair point. Still, this is such an unlikely edge-case that
> I don't think it's worth a back-patch. Let's just do HEAD.
Pushed. I also looked around for other instances of the same
pattern, and couldn't find any.
regards, tom lane