Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation()

From: Heikki Linnakangas
Subject: Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation()
Date:
Msg-id: d37a0436-671b-ae7f-4e5f-501912102075@iki.fi
In reply to: Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation()  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On 27/03/18 21:02, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Mar 27, 2018 at 11:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> This is ignoring the possibility of damaged data in between, ie
>>> A ... B ... CHKPT ... C ...  a few zeroed pages ... D ... CHKPT ... E ... F
> 
>> It's hard for me to believe that this case matters very much.  If
>> you're trying to run pg_rewind on a system where the WAL segments
>> contain a few zeroed pages, you're probably going to be hosed anyway,
>> if not by this particular thing then by something else.
> 
> Well, the point of checkpoints is that WAL data before the last one
> should no longer matter anymore, isn't it?
> 
> But really this is just one illustration of the point, which is that
> changing the WAL header definition as proposed *does* cost us reliability.
> We can argue about whether better concurrency in WAL insertion is
> worth that price, but claiming that the price is zero is flat wrong.
> 
> For me, the more important issue is that checking xl_prev not only
> validates that the record is where it is supposed to be, but that you
> arrived at it correctly.  Replacing it with xl_curr would keep the
> guarantee of the former (in fact likely make it stronger); but we'd
> completely lose the latter.

Yeah, removing xl_prev would certainly remove a useful cross-check from 
the format. Maybe it would be worth it; I don't think it's that 
critical. And adding some extra information to the page or segment 
headers might alleviate that.

But let's go back to why we're considering this. The idea was to 
optimize this block:

>     /*
>      * The duration the spinlock needs to be held is minimized by minimizing
>      * the calculations that have to be done while holding the lock. The
>      * current tip of reserved WAL is kept in CurrBytePos, as a byte position
>      * that only counts "usable" bytes in WAL, that is, it excludes all WAL
>      * page headers. The mapping between "usable" byte positions and physical
>      * positions (XLogRecPtrs) can be done outside the locked region, and
>      * because the usable byte position doesn't include any headers, reserving
>      * X bytes from WAL is almost as simple as "CurrBytePos += X".
>      */
>     SpinLockAcquire(&Insert->insertpos_lck);
> 
>     startbytepos = Insert->CurrBytePos;
>     endbytepos = startbytepos + size;
>     prevbytepos = Insert->PrevBytePos;
>     Insert->CurrBytePos = endbytepos;
>     Insert->PrevBytePos = startbytepos;
> 
>     SpinLockRelease(&Insert->insertpos_lck);

One trick that we could do is to replace that with a 128-bit atomic 
compare-and-swap instruction. Modern 64-bit Intel systems have that, 
it's called CMPXCHG16B. Don't know about other architectures. An atomic 
fetch-and-add, as envisioned in the comment above, would presumably be 
better, but I suspect that a compare-and-swap would be good enough to 
move the bottleneck elsewhere again.

- Heikki

