Re: There's some sort of race condition with the new FSM stuff

From	Heikki Linnakangas
Subject	Re: There's some sort of race condition with the new FSM stuff
Date	14 October 2008, 04:37:07
Msg-id	48F44C13.5000800@enterprisedb.com
In reply to	Re: There's some sort of race condition with the new FSM stuff (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: There's some sort of race condition with the new FSM stuff
List	pgsql-hackers

Tom Lane wrote:
> I wrote:
>> Two different buildfarm machines are currently showing the same failure:
>> ERROR:  could not fsync segment 0 of relation 1663/16384/29270/1: No such file or directory
>> ERROR:  checkpoint request failed
> 
> Some tests show that when the serial regression tests are run in a
> freshly initdb'd installation, HEAD assigns OID 29270 to "bmscantest"
> in the bitmapops test.  So that's been dropped some time before the
> failure occurs; which means that this isn't a narrow-window race
> condition; which raises the question of why we're not seeing it on more
> machines.  I notice now that kudu and dragonfly are actually the same
> machine ... could this be an OS-specific problem?  Kris, has there been
> any system-software change on that machine recently?

Must be because of this little missing line here:

--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -1012,6 +1012,7 @@ ForwardFsyncRequest(RelFileNode rnode, ForkNumber forknum, BlockNumber segno)
        }
        request = &BgWriterShmem->requests[BgWriterShmem->num_requests++];
        request->rnode = rnode;
+       request->forknum = forknum;
        request->segno = segno;
        LWLockRelease(BgWriterCommLock);
        return true;

Most fsync requests are for the main fork, not the FSM, so I guess what
usually happens because of that bug is that we skip the fsync on the FSM,
which is why we haven't noticed before.

I still wonder, though, why we're seeing the error consistently on kudu, 
and not on any other animal. Perhaps the forknum field that's left 
uninitialized gets a different value there than on other platforms.
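
To illustrate the kind of bug the patch fixes, here is a minimal standalone
sketch. It is not PostgreSQL code: every name in it (SketchQueue,
SketchRequest, enqueue, drain, fork_seen_after_reuse) is hypothetical, and
the assumption that a drained slot keeps its old contents is only one way an
unset field could end up with an unexpected value. It makes no claim about
what actually happened on kudu.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the real types. */
typedef enum { SKETCH_MAIN_FORK = 0, SKETCH_FSM_FORK = 1 } SketchFork;

typedef struct {
    unsigned   rel;    /* stands in for the relation identifier */
    SketchFork fork;   /* stands in for the fork number */
    unsigned   segno;  /* segment number */
} SketchRequest;

#define SKETCH_MAX_REQUESTS 4

typedef struct {
    int           num_requests;
    SketchRequest requests[SKETCH_MAX_REQUESTS];
} SketchQueue;

/*
 * Append a request. With set_fork == false this models the bug: the fork
 * field of the slot is never written, so it keeps whatever it held before.
 */
static bool enqueue(SketchQueue *q, unsigned rel, SketchFork fork,
                    unsigned segno, bool set_fork)
{
    SketchRequest *request;

    assert(q != NULL);
    if (q->num_requests >= SKETCH_MAX_REQUESTS)
        return false;

    request = &q->requests[q->num_requests++];
    request->rel = rel;
    if (set_fork)
        request->fork = fork;   /* the line the patch adds */
    request->segno = segno;
    return true;
}

/* Assumption: draining only resets the counter and does not clear slots. */
static void drain(SketchQueue *q)
{
    q->num_requests = 0;
}

/*
 * Queue a correct FSM-fork request, drain, then queue a main-fork request
 * into the same slot and report which fork the consumer would see.
 */
SketchFork fork_seen_after_reuse(bool set_fork)
{
    SketchQueue q;

    memset(&q, 0, sizeof q);
    enqueue(&q, 1, SKETCH_FSM_FORK, 0, true);
    drain(&q);
    enqueue(&q, 2, SKETCH_MAIN_FORK, 0, set_fork);
    return q.requests[0].fork;
}
```

In this sketch, fork_seen_after_reuse(false) returns the stale FSM fork left
over from the earlier request, while fork_seen_after_reuse(true) returns the
main fork that was actually requested.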

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
