Re: [HACKERS] Deadlock in XLogInsert at AIX

Поиск
Список
Период
Сортировка
От Konstantin Knizhnik
Тема Re: [HACKERS] Deadlock in XLogInsert at AIX
Дата
Msg-id 85a1966c-aeee-fb38-77bd-6cad9a90b35f@postgrespro.ru
обсуждение исходный текст
Ответ на Re: [HACKERS] Deadlock in XLogInsert at AIX  (Heikki Linnakangas <hlinnaka@iki.fi>)
Ответы Re: [HACKERS] Deadlock in XLogInsert at AIX  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Список pgsql-hackers
On 30.01.2017 19:21, Heikki Linnakangas wrote:
> On 01/24/2017 04:47 PM, Konstantin Knizhnik wrote:
> Interesting.. What should happen here is that for the backend's own 
> insertion slot, the "insertingat" value should be greater than the 
> requested flush point ('upto' variable). That's because before 
> GetXLogBuffer() calls AdvanceXLInsertBuffer(), it updates the 
> backend's insertingat value, to the position that it wants to insert 
> to. And AdvanceXLInsertBuffer() only calls 
> WaitXLogInsertionsToFinish() with value smaller than what was passed 
> as the 'upto' argument.
>
>> The comment to WaitXLogInsertionsToFinish says:
>>
>>   * Note: When you are about to write out WAL, you must call this 
>> function
>>   * *before* acquiring WALWriteLock, to avoid deadlocks. This 
>> function might
>>   * need to wait for an insertion to finish (or at least advance to next
>>   * uninitialized page), and the inserter might need to evict an old WAL
>> buffer
>>   * to make room for a new one, which in turn requires WALWriteLock.
>>
>> Which contradicts to the observed stack trace.
>
> Not AFAICS. In the stack trace you showed, the backend is not holding 
> WALWriteLock. It would only acquire it after the 
> WaitXLogInsertionsToFinish() call finished.
>
>

Hmmm, may be I missed something.
I am not telling about WALBufMappingLock which is required after return 
from XLogInsertionsToFinish.
But about lock obtained by WALInsertLockAcquire  at line 946 in 
XLogInsertRecord.
It will be release at line  1021 by  WALInsertLockRelease(). But 
CopyXLogRecordToWAL is invoked with this lock granted.


> This line in the stack trace is suspicious:
>
>> WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"
>
> AdvanceXLInsertBuffer() should only ever call 
> WaitXLogInsertionsToFinish() with an xlog position that points to a 
> page bounary, but that upto value points to the middle of a page.
>
> Perhaps the value stored in the stack trace is not what the caller 
> passed, but it was updated because it was past the 'reserveUpto' 
> value? That would explain the "request to flush past end
> of generated WAL" notices you saw in the log. Now, why would that 
> happen, I have no idea.
>
> If you can and want to provide me access to the system, I could have a 
> look myself. I'd also like to see if the attached additional 
> Assertions will fire.

I really get this assertion failed:

ExceptionalCondition(conditionName = "!(OldPageRqstPtr <= upto || 
opportunistic)", errorType = "FailedAssertion", fileName = "xlog.c", 
lineNumber = 1917), line 54 in "assert.c"
(dbx) up
unnamed block in AdvanceXLInsertBuffer(upto = 147439056632, 
opportunistic = '\0'), line 1917 in "xlog.c"
(dbx) p OldPageRqstPtr
147439058944
(dbx) p upto
147439056632
(dbx) p opportunistic
'\0'

Also , in another run, I encountered yet another assertion failure:

ExceptionalCondition(conditionName = "!((((NewPageBeginPtr) / 8192) % 
(XLogCtl->XLogCacheBlck + 1)) == nextidx)", errorType = 
"FailedAssertion", fileName = "xlog.c", lineNumber = 1950), line 54 in 
"assert.c"

nextidx equals to 1456, while expected value is 1457.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: [HACKERS] WIP: [[Parallel] Shared] Hash
Следующее
От: Abbas Butt
Дата:
Сообщение: Re: [HACKERS] An issue in remote query optimization