Re: PANIC: could not fsync file "pg_multixact/..." since commit dee663f7843

From	Tomas Vondra
Subject	Re: PANIC: could not fsync file "pg_multixact/..." since commit dee663f7843
Date	November 5, 2020, 02:07:37
Msg-id	18580b87-4b80-60c9-16f0-ab8d98395855@enterprisedb.com
In reply to	Re: PANIC: could not fsync file "pg_multixact/..." since commit dee663f7843 (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Replies	Re: PANIC: could not fsync file "pg_multixact/..." since commit dee663f7843 (Thomas Munro <thomas.munro@gmail.com>)
List	pgsql-hackers

On 11/4/20 2:50 PM, Tomas Vondra wrote:
> On Wed, Nov 04, 2020 at 05:36:46PM +1300, Thomas Munro wrote:
>> On Wed, Nov 4, 2020 at 2:57 PM Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>> On Wed, Nov 04, 2020 at 02:49:24PM +1300, Thomas Munro wrote:
>>> >On Wed, Nov 4, 2020 at 2:32 PM Tomas Vondra
>>> ><tomas.vondra@2ndquadrant.com> wrote:
>>> >> After a while (~1h on my machine) the pg_multixact gets over 10GB, which
>>> >> triggers a more aggressive cleanup (per MultiXactMemberFreezeThreshold).
>>> >> My guess is that this discards some of the files, but checkpointer is
>>> >> not aware of that, or something like that. Not sure.
>>> >
>>> >Urgh.  Thanks.  Looks like perhaps the problem is that I have
>>> >RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true) in one codepath
>>> >that unlinks files, but not another.  Looking.
>>>
>>> Maybe. I didn't have time to investigate this more deeply, and it takes
>>> quite a bit of time to reproduce. I can try again with extra logging or
>>> test some proposed fixes, if you give me a patch.
>>
>> I think this should be fixed by doing all unlinking through a common
>> code path.  Does this pass your test?
> 
> Seems to be working - without the patch it failed after ~1h, now it's
> running for more than 2h without a crash. I'll let it run for a few more
> hours (on both machines).
> 

It's been running for hours on both machines, without any crashes etc.
While that's not definitive proof that the fix is correct, it certainly
behaves differently.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
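
The failure mode discussed in this thread can be sketched with a small in-memory model. This is a toy illustration, not PostgreSQL code: Segment, write_segment, unlink_segment_raw, unlink_segment and checkpoint are hypothetical names, and the sync_pending flag only stands in for a queued sync request of the kind a SYNC_FORGET_REQUEST would cancel. It also assumes the cause was as guessed in the quoted messages, namely one unlink path that did not cancel pending requests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: every name here is hypothetical, not a PostgreSQL API. */
#define MAX_SEGMENTS 16

typedef struct {
    char name[32];
    bool exists;        /* segment file is present "on disk" */
    bool sync_pending;  /* the checkpointer holds a queued fsync request */
} Segment;

static Segment segments[MAX_SEGMENTS];
static int nsegments = 0;

static Segment *find_segment(const char *name)
{
    for (int i = 0; i < nsegments; i++)
        if (strcmp(segments[i].name, name) == 0)
            return &segments[i];
    return NULL;
}

/* Writing a segment creates it if needed and queues an fsync request. */
static void write_segment(const char *name)
{
    Segment *seg = find_segment(name);

    if (seg == NULL) {
        assert(nsegments < MAX_SEGMENTS);
        seg = &segments[nsegments++];
        snprintf(seg->name, sizeof(seg->name), "%s", name);
    }
    seg->exists = true;
    seg->sync_pending = true;
}

/* Problematic path: removes the file but leaves the fsync request queued. */
static void unlink_segment_raw(const char *name)
{
    Segment *seg = find_segment(name);

    if (seg != NULL)
        seg->exists = false;
}

/* Common path: cancel the queued request first, then remove the file. */
static void unlink_segment(const char *name)
{
    Segment *seg = find_segment(name);

    if (seg == NULL)
        return;
    seg->sync_pending = false;  /* stands in for a "forget" request */
    seg->exists = false;
}

/*
 * Process all queued requests. Returns how many of them pointed at a file
 * that no longer exists; in the real server such a failure is a PANIC.
 */
static int checkpoint(void)
{
    int failures = 0;

    for (int i = 0; i < nsegments; i++) {
        if (!segments[i].sync_pending)
            continue;
        if (!segments[i].exists)
            failures++;
        segments[i].sync_pending = false;
    }
    return failures;
}
```

In this model, a segment removed through unlink_segment_raw makes the next checkpoint() report a failure, while one removed through unlink_segment does not. Routing every removal through a single function is what makes it hard for a caller to skip the cancellation step.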
