Re: .ready and .done files considered harmful

Поиск
Список
Период
Сортировка
От Dipesh Pandit
Тема Re: .ready and .done files considered harmful
Дата
Msg-id CAN1g5_GAoO43bgUVjw39D8HTG_1se4_6oFk5B0DZpqGOEQytvA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: .ready and .done files considered harmful  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Ответы Re: .ready and .done files considered harmful  ("Bossart, Nathan" <bossartn@amazon.com>)
Список pgsql-hackers
Hi,

Thanks for the feedback.

> I wonder if this can be simplified even further.  If we don't bother
> trying to catch out-of-order .ready files in XLogArchiveNotify() and
> just depend on the per-checkpoint/restartpoint directory scans, we can
> probably remove lastReadySegNo from archiver state completely.

If we agree that some extra delay in archiving these files is acceptable
then we don't require any special handling for this scenario otherwise
we may need to handle it separately.

> +       /* Initialize the current state of archiver */
> +       xlogState.lastSegNo = MaxXLogSegNo;
> +       xlogState.lastTli = MaxTimeLineID;
>
> It looks like we have two ways to force a directory scan.  We can
> either set force_dir_scan to true, or lastSegNo can be set to
> MaxXLogSegNo.  Why not just set force_dir_scan to true here so that we
> only have one way to force a directory scan?

make sense, I have updated it.

> Don't we also need to force a directory scan in the other cases we
> return early from pgarch_ArchiverCopyLoop()?  We will have already
> advanced the archiver state in pgarch_readyXlog(), so I think we'd end
> up skipping files if we didn't.  For example, if archive_command isn't
> set, we'll just return, and the next call to pgarch_readyXlog() might
> return the next file.

I agree, we should do it for all early return paths.

> nitpick: I think we should just call PgArchForceDirScan() here.

Yes, that's right.

> > > This is an interesting idea, but the "else" block here seems prone to
> > > race conditions.  I think we'd have to hold arch_lck to prevent that.
> > > But as I mentioned above, if we are okay with depending on the
> > > fallback directory scans, I think we can remove the "else" block
> > > completely.

Ohh I didn't realize the race condition here. The competing processes
can read the same value of lastReadySegNo.
 
> > Thinking further, we probably need to hold a lock even when we are
> > creating the .ready file to avoid race conditions.
>
> The race condition surely happens, but even if that happens, all
> competing processes except one of them detect out-of-order and will
> enforce directory scan.  But I'm not sure how it behaves under more
> complex situation so I'm not sure I like that behavior.
>
> We could just use another lock for the logic there, but instead
> couldn't we merge PgArchGetLastReadySegNo and PgArchSetLastReadySegNo
> into one atomic test-and-(check-and-)set function?  Like this.

I agree that we can merge the existing "Get" and "Set" functions into
an atomic test-and-check-and-set function to avoid a race condition.

I have incorporated these changes and updated a new patch. PFA patch.

Thanks,
Dipesh
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Nancarrow
Дата:
Сообщение: Re: Added schema level support for publication.
Следующее
От: "Drouvot, Bertrand"
Дата:
Сообщение: Re: Minimal logical decoding on standbys