Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.
От | Masahiro Ikeda |
---|---|
Тема | Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested. |
Дата | |
Msg-id | c96d8989100e4bce4fa586064aa7e0e9@oss.nttdata.com обсуждение исходный текст |
Ответ на | Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested. (Fujii Masao <masao.fujii@oss.nttdata.com>) |
Ответы |
Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.
(Fujii Masao <masao.fujii@oss.nttdata.com>)
|
Список | pgsql-hackers |
On 2021-03-18 13:37, Fujii Masao wrote: > On 2021/03/18 11:59, kuroda.hayato@fujitsu.com wrote: >> Dear Ikeda-san, >> >> I confirmed new patch and no problem was found. Thanks. >> (I'm not a native English speaker, so I cannot check your comments >> correctly, sorry) > > One user-visible side-effect by this change is; with the patch, the > stats is > cleared when only the stats collector is killed (with SIGQUIT) > accidentally > and restarted by postmaster later. Thanks for your comments. As you said, the temporary stats files don't removed if the stats collector is killed with SIGQUIT. So, if the user change the GUC parameter "stats_temp_directory" after immediate shutdown, temporary stats file can't be removed forever. But, I think this case is rarely happened and unimportant. Actually, pgstat_write_statsfiles() didn't check error of unlink() and the same problem is occurred if the server is crashed now. The documentation said following. I think it's enough. ``` For better performance, <varname>stats_temp_directory</varname> can be pointed at a RAM-based file system, decreasing physical I/O requirements. When the server shuts down cleanly, a permanent copy of the statistics data is stored in the <filename>pg_stat</filename> subdirectory, so that statistics can be retained across server restarts. When recovery is performed at server start (e.g., after immediate shutdown, server crash, and point-in-time recovery), all statistics counters are reset. ``` > On the other than, currently the stats is > written in that case and subsequently-restarted stats collector can use > that stats file. I'm not sure if we need to keep supporting this > behavior, though. I don't have any strong opinion this behaivor is useless too. Since the reinitialized phase is not executed when only the stats collector is crashed (since it didn't touch the shared memory), if the permanent stats file is exists, the stats collector can use it. But, IIUC, the case is rare. The case is happened by operation mistake which a operator sends the SIGQUIT signal to the stats collector. Please let me know if there are more important case. But, if SIGKILL is sent by operator, the stats can't be rescure now because the permanent stats files can't be written before exiting. Since the case which can rescure the stats is rare, I think it's ok to initialize the stats even if SIGQUIT is sent. If to support this feature, we need to implement the following first. > (2) As another aspect, it needs to change the behavior removing all > stats files because autovacuum > uses the stats. There are some ideas, for example writing the stats > files every X minutes > (using wal or another mechanism) and even if a crash occurs, the > startup process can restore > the stats with slightly low freshness. But, it needs to find a way > how to handle the stats files > when deleting on PITR rewind or stats collector crash happens. > When only the stats collector exits by SIGQUIT, with the patch > FreeWaitEventSet() is also skipped. Is this ok? Thanks, I fixed it. > - * Loop to process messages until we get SIGQUIT or detect ungraceful > - * death of our parent postmaster. > + * Loop to process messages until we get SIGTERM or SIGQUIT of our > parent > + * postmaster. > > "SIGTERM or SIGQUIT of our parent postmaster" should be > "SIGTERM, SIGQUIT, or detect ungraceful death of our parent > postmaster"? Yes, I fixed it. > +SignalHandlerForUnsafeExit(SIGNAL_ARGS) > > I don't think SignalHandlerForUnsafeExit is good name. Because that's > not > "unsafe" exit. No? Even after this signal handler is triggered, the > server is > still running normally and a process that exits will be restarted > later. What > about SignalHandlerForNonCrashExit or SignalHandlerForNonFatalExit? OK, I fixed. I changed to the SignalPgstatHandlerForNonCrashExit() to add FreeWaitEventSet() in the handler for the stats collector. Regards, -- Masahiro Ikeda NTT DATA CORPORATION
Вложения
В списке pgsql-hackers по дате отправления:
Следующее
От: Thomas MunroДата:
Сообщение: Re: fdatasync performance problem with large number of DB files