Re: parallel pg_restore blocks on heavy random read I/O on all children processes
From | Dimitrios Apostolou |
---|---|
Subject | Re: parallel pg_restore blocks on heavy random read I/O on all children processes |
Date | |
Msg-id | 1cbb9bd6-60cd-92cb-c3c2-4cf4fd8a7b64@gmx.net |
In reply to | Re: parallel pg_restore blocks on heavy random read I/O on all children processes (Dimitrios Apostolou <jimis@gmx.net>) |
Responses | Re: parallel pg_restore blocks on heavy random read I/O on all children processes |
List | pgsql-performance |
Hello again,

I traced the seek-then-read behaviour of parallel pg_restore to _skipData(), when it is called from _PrintTocData(). Since most of today's I/O devices (both rotating and solid state) can read 1MB sequentially faster than they can seek and read 4KB, I tried the following change:

diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 55107b20058..262ba509829 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -618,31 +618,31 @@ _skipLOs(ArchiveHandle *AH)
  * Skip data from current file position.
  * Data blocks are formatted as an integer length, followed by data.
  * A zero length indicates the end of the block.
  */
 static void
 _skipData(ArchiveHandle *AH)
 {
 	lclContext *ctx = (lclContext *) AH->formatData;
 	size_t		blkLen;
 	char	   *buf = NULL;
 	int			buflen = 0;
 
 	blkLen = ReadInt(AH);
 	while (blkLen != 0)
 	{
-		if (ctx->hasSeek)
+		if (ctx->hasSeek && blkLen > 1024 * 1024)
 		{
 			if (fseeko(AH->FH, blkLen, SEEK_CUR) != 0)
 				pg_fatal("error during file seek: %m");
 		}
 		else
 		{
 			if (blkLen > buflen)
 			{
 				free(buf);
 				buf = (char *) pg_malloc(blkLen);
 				buflen = blkLen;
 			}
 			if (fread(buf, 1, blkLen, AH->FH) != blkLen)
 			{
 				if (feof(AH->FH))

This simple change immensely speeds up (perhaps 10x, depending on the number of workers) the offset-table building phase of the parallel restore.

A remaining problem is that this offset-table building phase is done in every worker process, which means that all workers scan the whole archive almost in parallel. A more intrusive improvement would be to move this phase to the parent process, before spawning the children.

What do you think?

Regards,
Dimitris

P.S. I also have a simple change that makes the -j1 switch mean "parallel but with one worker process", which I made for debugging purposes. Not sure if it is of interest here.
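For readers who want to try the idea outside the pg_dump source tree, the read-instead-of-seek heuristic can be sketched as a standalone helper. This is only an illustration under stated assumptions, not the patch itself: `skip_bytes` and `SKIP_READ_THRESHOLD` are hypothetical names, the 1MB cutoff simply mirrors the value used in the diff, and the fixed 8KB scratch buffer replaces the grow-on-demand buffer that `_skipData()` allocates.

```c
#define _POSIX_C_SOURCE 200809L	/* for fseeko() and off_t */

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical cutoff, mirroring the 1MB value in the patch above. */
#define SKIP_READ_THRESHOLD ((size_t) 1024 * 1024)

/*
 * Advance the stream by len bytes.
 *
 * Blocks larger than the threshold are skipped with fseeko() when the
 * stream is seekable; smaller blocks are read and discarded, so the
 * access pattern stays sequential.
 *
 * Returns 0 on success, -1 on a seek error or a short read.
 */
static int
skip_bytes(FILE *fp, size_t len, int has_seek)
{
	char		scratch[8192];	/* discarded data; fixed size is an assumption */

	assert(fp != NULL);

	if (has_seek && len > SKIP_READ_THRESHOLD)
		return (fseeko(fp, (off_t) len, SEEK_CUR) == 0) ? 0 : -1;

	while (len > 0)
	{
		size_t		want = (len < sizeof(scratch)) ? len : sizeof(scratch);

		if (fread(scratch, 1, want, fp) != want)
			return -1;			/* EOF or read error */
		len -= want;
	}
	return 0;
}
```

A caller would invoke `skip_bytes(fp, blkLen, hasSeek)` once per data block. Whether 1MB is the right cutoff probably depends on the storage device and on the block sizes in the archive, so it is best treated as a tunable rather than a constant.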