Thread: Progress report removal of temp files and temp relation files using ereport_startup_progress
Progress report removal of temp files and temp relation files using ereport_startup_progress
From
Bharath Rupireddy
Date:
Hi,

At times, there can be many temp files (under pgsql_tmp) and temp relation files (under pg_tblspc) whose removal after a crash may take a long time, during which users have no clue about what's going on in the server before it comes online.

Here's a proposal to use ereport_startup_progress to report the progress of the file removal.

Thoughts?

Regards,
Bharath Rupireddy.
Attachments
Re: Progress report removal of temp files and temp relation files using ereport_startup_progress
From
Ashutosh Bapat
Date:
Hi Bharath,

On Sat, Apr 30, 2022 at 11:08 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Hi,
>
> At times, there can be many temp files (under pgsql_tmp) and temp
> relation files (under removal which after crash may take longer during
> which users have no clue about what's going on in the server before it
> comes up online.
>
> Here's a proposal to use ereport_startup_progress to report the
> progress of the file removal.
>
> Thoughts?

The patch looks good to me.

With this patch, the user would at least know which directory is being scanned and how much time has elapsed. It would be better to know how much work is remaining. I could not find a way to estimate the number of files in the directory so that we can extrapolate from the elapsed time and estimate the remaining time. Well, we could loop over the directory entries (opendir()/readdir()) twice, first to count them and then to do the actual work. This might actually work if the time to delete all the files is very high compared to the time it takes to scan all the files/directories.

Another possibility is to scan the directory entries in sorted order, using the current file name to estimate the remaining files in a very crude and inaccurate way. That doesn't look attractive either. I can't think of any better way to estimate the remaining time.

But at least with this patch, a user knows which files have been deleted and can guess how far into the directory structure the process has reached. S/he can then take a look at the remaining contents of the directory to estimate how long to wait.

--
Best Wishes,
Ashutosh Bapat
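As an illustration of the two-pass idea mentioned above (count first, then do the work), here is a minimal standalone sketch in plain POSIX C. It is not taken from the patch and does not use PostgreSQL's own directory helpers or ereport machinery; the function names count_regular_files() and remove_files_with_progress() and the report_every parameter are hypothetical, and messages simply go to stderr. It also only looks at regular files directly inside one directory, whereas the real cleanup walks several directories.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Pass 1: count regular files directly under dirpath; -1 if it cannot be opened. */
static long
count_regular_files(const char *dirpath)
{
    DIR        *dir;
    struct dirent *de;
    struct stat st;
    char        path[PATH_MAX];
    long        n = 0;

    assert(dirpath != NULL);
    dir = opendir(dirpath);
    if (dir == NULL)
        return -1;
    while ((de = readdir(dir)) != NULL)
    {
        snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name);
        if (lstat(path, &st) == 0 && S_ISREG(st.st_mode))
            n++;
    }
    closedir(dir);
    return n;
}

/*
 * Pass 2: unlink the regular files, reporting "done of total" after every
 * report_every removals. Returns the number of files removed, or -1 on error.
 * The total is only an estimate: files may appear or vanish between passes.
 */
static long
remove_files_with_progress(const char *dirpath, long report_every)
{
    long        total = count_regular_files(dirpath);
    DIR        *dir;
    struct dirent *de;
    struct stat st;
    char        path[PATH_MAX];
    long        done = 0;

    if (total < 0 || report_every <= 0)
        return -1;
    dir = opendir(dirpath);
    if (dir == NULL)
        return -1;
    while ((de = readdir(dir)) != NULL)
    {
        snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name);
        if (lstat(path, &st) != 0 || !S_ISREG(st.st_mode))
            continue;           /* skip ".", "..", subdirectories, etc. */
        if (unlink(path) != 0)
            continue;           /* a real implementation would log this */
        done++;
        if (done % report_every == 0)
            fprintf(stderr, "removed %ld of about %ld files in \"%s\"\n",
                    done, total, dirpath);
    }
    closedir(dir);
    return done;
}
```

The extra pass costs one more directory scan, so, as noted above, it only pays off when deleting is much slower than scanning.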
Re: Progress report removal of temp files and temp relation files using ereport_startup_progress
From
Bharath Rupireddy
Date:
On Mon, May 2, 2022 at 6:26 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi Bharath,
>
>
> On Sat, Apr 30, 2022 at 11:08 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Hi,
> >
> > At times, there can be many temp files (under pgsql_tmp) and temp
> > relation files (under removal which after crash may take longer during
> > which users have no clue about what's going on in the server before it
> > comes up online.
> >
> > Here's a proposal to use ereport_startup_progress to report the
> > progress of the file removal.
> >
> > Thoughts?
>
> The patch looks good to me.
>
> With this patch, the user would at least know which directory is being
> scanned and how much time has elapsed.

There's a problem with the patch: the postmaster process doesn't use the timeout mechanism. It doesn't call InitializeTimeouts() and doesn't register STARTUP_PROGRESS_TIMEOUT. I tried to make the postmaster do that (v2 patch attached), but make check fails.

Now I'm wondering whether it's a good idea to let the postmaster use timeouts at all.

> It would be better to know how
> much work is remaining. I could not find a way to estimate the number
> of files in the directory so that we can extrapolate elapsed time and
> estimate the remaining time. Well, we could loop the output of
> opendir() twice, first to estimate and then for the actual work. This
> might actually work, if the time to delete all the files is very high
> compared to the time it takes to scan all the files/directories.
>
> Another possibility is to scan the sorted output of opendir() thus
> using the current file name to estimate remaining files in a very
> crude and inaccurate way. That doesn't look attractive either. I can't
> think of any better way to estimate the remaining time.

I think "how much work/how many files remain to be processed" is a generic problem; it applies equally to, for instance, snapshot files, mapping files, old WAL file processing and so on. I don't think we can do much about it.

> But at least with this patch, a user knows which files have been
> deleted, guessing how far, in the directory structure, the process has
> reached. S/he can then take a look at the remaining contents of the
> directory to estimate how much it should wait.

I'm not sure we will be able to use the timeout mechanism within the postmaster. Another idea is to have a generic GUC, something like log_file_processing_traffic = {none, medium, high} (a similar idea is proposed for WAL file processing during replay/recovery at [1]). The default would be none. When set to medium, a log message is emitted for every, say, 128 or 256 files processed (an arbitrary number). When set to high, a log message is emitted for every file processed (too verbose). I think this generic GUC could be used in many other file-processing areas.

Thoughts?

[1] https://www.postgresql.org/message-id/CALj2ACVnhbx4pLZepvdqOfeOekvZXJ2F%3DwJeConGzok%2B6kgCVA%40mail.gmail.com

Regards,
Bharath Rupireddy.
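To make the proposed semantics concrete, here is a small standalone C sketch of count-based reporting driven by a three-level setting. It is only an illustration of the idea, not code from any patch: log_file_processing_traffic does not exist in PostgreSQL, and the enum FileTrafficLevel, the interval constant (128, one of the two example values mentioned above) and the functions should_log_file() and process_files() are all hypothetical names. Unlike the timeout-based approach, this needs no timer or signal handler, which is what would make it usable from the postmaster.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical values for the proposed log_file_processing_traffic GUC. */
typedef enum
{
    FILE_TRAFFIC_NONE,          /* default: never log */
    FILE_TRAFFIC_MEDIUM,        /* log once per interval */
    FILE_TRAFFIC_HIGH           /* log every file */
} FileTrafficLevel;

/* Assumed interval for "medium"; the thread suggests 128 or 256. */
#define FILE_TRAFFIC_MEDIUM_INTERVAL 128

/* Decide whether to emit a message after files_processed files are done. */
static bool
should_log_file(FileTrafficLevel level, unsigned long files_processed)
{
    switch (level)
    {
        case FILE_TRAFFIC_NONE:
            return false;
        case FILE_TRAFFIC_MEDIUM:
            return files_processed > 0 &&
                files_processed % FILE_TRAFFIC_MEDIUM_INTERVAL == 0;
        case FILE_TRAFFIC_HIGH:
            return true;
    }
    return false;
}

/*
 * Stand-in for a file-processing loop. Returns how many messages were
 * emitted, so the effect of each level is easy to check.
 */
static unsigned long
process_files(FileTrafficLevel level, unsigned long nfiles)
{
    unsigned long logged = 0;
    unsigned long i;

    assert(nfiles > 0);
    for (i = 1; i <= nfiles; i++)
    {
        /* real code would remove or otherwise process file number i here */
        if (should_log_file(level, i))
        {
            fprintf(stderr, "processed %lu files so far\n", i);
            logged++;
        }
    }
    return logged;
}
```

With these assumptions, processing 1000 files at the medium level emits 7 messages (at 128, 256, ..., 896), while none emits nothing and high emits one per file.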
Attachments
Re: Progress report removal of temp files and temp relation files using ereport_startup_progress
From
Bharath Rupireddy
Date:
On Thu, May 5, 2022 at 12:11 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, May 2, 2022 at 6:26 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > Hi Bharath,
> >
> >
> > On Sat, Apr 30, 2022 at 11:08 AM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > At times, there can be many temp files (under pgsql_tmp) and temp
> > > relation files (under removal which after crash may take longer during
> > > which users have no clue about what's going on in the server before it
> > > comes up online.
> > >
> > > Here's a proposal to use ereport_startup_progress to report the
> > > progress of the file removal.
> > >
> > > Thoughts?
> >
> > The patch looks good to me.
> >
> > With this patch, the user would at least know which directory is being
> > scanned and how much time has elapsed.
>
> There's a problem with the patch, the timeout mechanism isn't being
> used by the postmaster process. Postmaster doesn't
> InitializeTimeouts() and doesn't register STARTUP_PROGRESS_TIMEOUT, I
> tried to make postmaster do that (attached a v2 patch) but make check
> fails.
>
> Now, I'm thinking if it's a good idea to let postmaster use timeouts at all?

Here's the v3 patch, which adds progress reports for the removal of temp files under the pgsql_tmp directory and of temporary relation files under the pg_tblspc directory. The regression tests pass with it.

Regards,
Bharath Rupireddy.