Re: Use durable_unlink for .ready and .done files for WAL segmentremoval

Поиск
Список
Период
Сортировка
От Bossart, Nathan
Тема Re: Use durable_unlink for .ready and .done files for WAL segmentremoval
Дата
Msg-id 35A15B11-2A0F-4DDE-81CD-CD00C6538512@amazon.com
обсуждение исходный текст
Ответ на Re: Use durable_unlink for .ready and .done files for WAL segmentremoval  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Use durable_unlink for .ready and .done files for WAL segmentremoval  (Andres Freund <andres@anarazel.de>)
Re: Use durable_unlink for .ready and .done files for WAL segmentremoval  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-hackers
On 11/21/18, 10:16 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
>> At Fri, 02 Nov 2018 14:47:08 +0000, Nathan Bossart
>> <bossartn@amazon.com> wrote in
>> <154117002849.5569.14588306221618961668.pgcf@coridan.postgresql.org>:
>>> Granted, any added delay from this patch is unlikely to be noticeable
>>> unless your archiver is way behind and archive_status has a huge
>>> number of files.  However, I have seen cases where startup is stuck on
>>> other tasks like SyncDataDirectory() and RemovePgTempFiles() for a
>>> very long time, so perhaps it is worth considering.
>
> What's the scale of the pg_wal partition and the amount of time things
> were stuck?  I would imagine that the sync phase hurts the most, and a
> fast startup time for crash recovery is always important.

I don't have exact figures to share, but yes, a huge number of calls
to sync_file_range() and fsync() can use up a lot of time.  Presumably
Postgres processes files individually instead of using sync() because
sync() may return before writing is done.  Also, sync() would affect
non-Postgres files.  However, it looks like Linux actually does wait
for writing to complete before returning from sync() [0].

For RemovePgTempFiles(), the documentation above the function
indicates that skipping temp file cleanup during startup would
actually be okay because collisions with existing temp file names
should be handled by OpenTemporaryFile().  I assume this cleanup is
done during startup because there isn't a great alternative besides
offloading the work to a new background worker or something.

On 11/27/18, 6:35 AM, "Michael Paquier" <michael@paquier.xyz> wrote:
> Attached is a patch showing shaped based on the idea of upthread.
> Thoughts?

I took a look at this patch.

+                       /*
+                        * In the event of a system crash, archive status files may be
+                        * left behind as their removals are not durable, cleaning up
+                        * orphan entries here is the cheapest method.  So check that
+                        * the segment trying to be archived still exists.
+                        */
+                       snprintf(pathname, MAXPGPATH, XLOGDIR "/%s", xlog);
+                       if (stat(pathname, &stat_buf) != 0)
+                       {

Don't we also need to check that errno is ENOENT here?

+                               StatusFilePath(xlogready, xlog, ".ready");
+                               if (durable_unlink(xlogready, WARNING) == 0)
+                                       ereport(WARNING,
+                                                       (errmsg("removed orphan archive status file %s",
+                                                                       xlogready)));
+                               return;

IIUC any time that the file does not exist, we will attempt to unlink
it.  Regardless of whether unlinking fails or succeeds, we then
proceed to give up archiving for now, but it's not clear why.  Perhaps
we should retry unlinking a number of times (like we do for
pgarch_archiveXlog()) when durable_unlink() fails and simply "break"
to move on to the next .ready file if durable_unlink() succeeds.

Nathan

[0] http://man7.org/linux/man-pages/man2/sync.2.html


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Justin Pryzby
Дата:
Сообщение: jit comments typos (Re: pg11.1 jit segv)
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Use durable_unlink for .ready and .done files for WAL segmentremoval