O(n) tasks cause lengthy startups and checkpoints

Поиск
Список
Период
Сортировка
От Bossart, Nathan
Тема O(n) tasks cause lengthy startups and checkpoints
Дата
Msg-id C1EE64B0-D4DB-40F3-98C8-0CED324D34CB@amazon.com
обсуждение исходный текст
Ответы Re: O(n) tasks cause lengthy startups and checkpoints  (SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com>)
Re: O(n) tasks cause lengthy startups and checkpoints  (Andres Freund <andres@anarazel.de>)
Re: O(n) tasks cause lengthy startups and checkpoints  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
Hi hackers,

Thanks to 61752af, SyncDataDirectory() can make use of syncfs() to
avoid individually syncing all database files after a crash.  However,
as noted earlier this year [0], there are still a number of O(n) tasks
that affect startup and checkpointing that I'd like to improve.
Below, I've attempted to summarize each task and to offer ideas for
improving matters.  I'll likely split each of these into its own
thread, given there is community interest for such changes.

1) CheckPointSnapBuild(): This function loops through
   pg_logical/snapshots to remove all snapshots that are no longer
   needed.  If there are many entries in this directory, this can take
   a long time.  The note above this function indicates that this is
   done during checkpoints simply because it is convenient.  IIUC
   there is no requirement that this function actually completes for a
   given checkpoint.  My current idea is to move this to a new
   maintenance worker.
2) CheckPointLogicalRewriteHeap(): This function loops through
   pg_logical/mappings to remove old mappings and flush all remaining
   ones.  IIUC there is no requirement that the "remove old mappings"
   part must complete for a given checkpoint, but the "flush all
   remaining" portion allows replay after a checkpoint to only "deal
   with the parts of a mapping that have been written out after the
   checkpoint started."  Therefore, I think we should move the "remove
   old mappings" part to a new maintenance worker (probably the same
   one as for 1), and we should consider using syncfs() for the "flush
   all remaining" part.  (I suspect the main argument against the
   latter will be that it could cause IO spikes.)
3) RemovePgTempFiles(): This step can delay startup if there are many
   temporary files to individually remove.  This step is already
   optionally done after a crash via the remove_temp_files_after_crash
   GUC.  I propose that we have startup move the temporary file
   directories aside and create new ones, and then a separate worker
   (probably the same one from 1 and 2) could clean up the old files.
4) StartupReorderBuffer(): This step deletes logical slot data that
   has been spilled to disk.  This code appears to be written to avoid
   deleting different types of files in these directories, but AFAICT
   there shouldn't be any other files.  Therefore, I think we could do
   something similar to 3 (i.e., move the directories aside during
   startup and clean them up via a new maintenance worker).

I realize adding a new maintenance worker might be a bit heavy-handed,
but I think it would be nice to have somewhere to offload tasks that
really shouldn't impact startup and checkpointing.  I imagine such a
process would come in handy down the road, too.  WDYT?

Nathan

[0] https://postgr.es/m/32B59582-AA6C-4609-B08F-2256A271F7A5%40amazon.com


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: SKIP LOCKED assert triggered
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: Column Filtering in Logical Replication