Обсуждение: IO wait events for COPY FROM/TO PROGRAM or file
Hi hackers,
Following up on the discussion about wait event coverage for COPY operations [1], here's a tiny patch that adds two new IO wait events:
This enables diagnosing for cases like:
- storage I/O bottlenecks during bulk loads slow exports to files (COPY FROM/TO '/path/to/file')
- pipe buffer congestion in ETL pipelines (COPY FROM/TO PROGRAM)
Following up on the discussion about wait event coverage for COPY operations [1], here's a tiny patch that adds two new IO wait events:
- COPY_DATA_READ: COPY FROM blocking on file/program read
- COPY_DATA_WRITE: COPY TO blocking on file/program write
- storage I/O bottlenecks during bulk loads slow exports to files (COPY FROM/TO '/path/to/file')
- pipe buffer congestion in ETL pipelines (COPY FROM/TO PROGRAM)
Note: as it turned out, COPY FROM/TO STDIN/STDOUT already have coverage via Client/ClientRead and Client/ClientWrite at the protocol layer (thus, pg_dump/pg_restore are already covered). So only file/program cases needed instrumentation.
Nikx
Вложения
On Thu, Jan 08, 2026 at 01:53:17PM -0800, Nikolay Samokhvalov wrote:
> Following up on the discussion about wait event coverage for COPY
> operations [1], here's a tiny patch that adds two new IO wait events:
> - COPY_DATA_READ: COPY FROM blocking on file/program read
> - COPY_DATA_WRITE: COPY TO blocking on file/program write
case COPY_FILE:
+ pgstat_report_wait_start(WAIT_EVENT_COPY_DATA_WRITE);
if (fwrite(fe_msgbuf->data, fe_msgbuf->len, 1,
cstate->copy_file) != 1 ||
ferror(cstate->copy_file))
@@ -486,6 +487,7 @@ CopySendEndOfRow(CopyToState cstate)
(errcode_for_file_access(),
errmsg("could not write to COPY file: %m")));
}
+ pgstat_report_wait_end();
Hmm. Makes sense to me (perhaps you have played with an LLM to find
out fread() and fwrite() patterns that would need an event?), but I
don't think that the part about COPY TO is done right. The wait event
should be set during the fwrite() call only, similarly to the COPY
FROM case, by saving the result returned by fwrite() in a separate
variable.
--
Michael
Вложения
Hi,
Thanks for this patch. I have a few comments.
> Hmm. Makes sense to me (perhaps you have played with an LLM to find
> out fread() and fwrite() patterns that would need an event?), but I
> don't think that the part about COPY TO is done right. The wait event
> should be set during the fwrite() call only, similarly to the COPY
> FROM case, by saving the result returned by fwrite() in a separate
> variable.
I see other cases such as inside copy_file() in which the read() is wrapped
by wait_start and wait_end, but in the write() case, the wait_end() is after
the error handling ( as the attached v1 ).
```
pgstat_report_wait_start(WAIT_EVENT_COPY_FILE_READ);
nbytes = read(srcfd, buffer, COPY_BUF_SIZE);
pgstat_report_wait_end();
.....
..........
pgstat_report_wait_start(WAIT_EVENT_COPY_FILE_WRITE);
if ((int) write(dstfd, buffer, nbytes) != nbytes)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tofile)));
}
pgstat_report_wait_end();
```
another example is write_relmap_file():
```
/* Write new data to the file. */
pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write file \"%s\": %m",
maptempfilename)));
}
pgstat_report_wait_end();
```
I don't see any issue with this approach for fwrite().
Another comment. Wouldn't be better to use "COPY FROM" and "COPY TO" in the
names to make it more obvious they are related to the COPY command?
so instead of:
+COPY_DATA_READ "Waiting for a read from a file or program during COPY FROM."
+COPY_DATA_WRITE "Waiting for a write to a file or program during COPY TO."
this:
COPY_FROM_READ: "Waiting to read data from a file or program during COPY FROM."
COPY_TO_WRITE: "Waiting to write data to a file or program during COPY TO."
--
Sami Imseih
Amazon Web Services (AWS)
On Fri, Jan 9, 2026 at 8:52 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> Hi,
>
> Thanks for this patch. I have a few comments.
>
> > Hmm. Makes sense to me (perhaps you have played with an LLM to find
> > out fread() and fwrite() patterns that would need an event?), but I
> > don't think that the part about COPY TO is done right. The wait event
> > should be set during the fwrite() call only, similarly to the COPY
> > FROM case, by saving the result returned by fwrite() in a separate
> > variable.
>
> I see other cases such as inside copy_file() in which the read() is wrapped
> by wait_start and wait_end, but in the write() case, the wait_end() is after
> the error handling ( as the attached v1 ).
>
> ```
> pgstat_report_wait_start(WAIT_EVENT_COPY_FILE_READ);
> nbytes = read(srcfd, buffer, COPY_BUF_SIZE);
> pgstat_report_wait_end();
> .....
> ..........
> pgstat_report_wait_start(WAIT_EVENT_COPY_FILE_WRITE);
> if ((int) write(dstfd, buffer, nbytes) != nbytes)
> {
> /* if write didn't set errno, assume problem is no disk space */
> if (errno == 0)
> errno = ENOSPC;
> ereport(ERROR,
> (errcode_for_file_access(),
> errmsg("could not write to file \"%s\": %m", tofile)));
> }
> pgstat_report_wait_end();
> ```
> another example is write_relmap_file():
>
> ```
> /* Write new data to the file. */
> pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
> if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
> {
> /* if write didn't set errno, assume problem is no disk space */
> if (errno == 0)
> errno = ENOSPC;
> ereport(ERROR,
> (errcode_for_file_access(),
> errmsg("could not write file \"%s\": %m",
> maptempfilename)));
> }
> pgstat_report_wait_end();
> ```
>
> I don't see any issue with this approach for fwrite().
>
> Another comment. Wouldn't be better to use "COPY FROM" and "COPY TO" in the
> names to make it more obvious they are related to the COPY command?
>
> so instead of:
>
> +COPY_DATA_READ "Waiting for a read from a file or program during COPY FROM."
> +COPY_DATA_WRITE "Waiting for a write to a file or program during COPY TO."
>
> this:
>
> COPY_FROM_READ: "Waiting to read data from a file or program during COPY FROM."
> COPY_TO_WRITE: "Waiting to write data to a file or program during COPY TO."
IMHO this suggestion makes sense to avoid confusion with similar wait
event names like COPY_FILE_READ etc. So using COPY_FROM and COPY_TO
is clearer for the purpose.
--
Regards,
Dilip Kumar
Google