Re: Unlogged relations and WAL-logging

Поиск
Список
Период
Сортировка
От Peter Eisentraut
Тема Re: Unlogged relations and WAL-logging
Дата
Msg-id a5b13df4-a083-8119-5a27-63360a147ece@eisentraut.org
обсуждение исходный текст
Ответ на Re: Unlogged relations and WAL-logging  (Heikki Linnakangas <hlinnaka@iki.fi>)
Ответы Re: Unlogged relations and WAL-logging
Список pgsql-hackers
Is the patch 
0003-Remove-unnecessary-smgrimmedsync-when-creating-unlog.patch still 
relevant, or can this commitfest entry be closed?

On 23.08.23 16:40, Heikki Linnakangas wrote:
>>>> 5. In heapam_relation_set_new_filenode(), we do this:
>>>>
>>>>>
>>>>>         /*
>>>>>          * If required, set up an init fork for an unlogged table 
>>>>> so that it can
>>>>>          * be correctly reinitialized on restart.  An immediate 
>>>>> sync is required
>>>>>          * even if the page has been logged, because the write did 
>>>>> not go through
>>>>>          * shared_buffers and therefore a concurrent checkpoint may 
>>>>> have moved the
>>>>>          * redo pointer past our xlog record.  Recovery may as well 
>>>>> remove it
>>>>>          * while replaying, for example, XLOG_DBASE_CREATE or 
>>>>> XLOG_TBLSPC_CREATE
>>>>>          * record. Therefore, logging is necessary even if 
>>>>> wal_level=minimal.
>>>>>          */
>>>>>         if (persistence == RELPERSISTENCE_UNLOGGED)
>>>>>         {
>>>>>                 Assert(rel->rd_rel->relkind == RELKIND_RELATION ||
>>>>>                            rel->rd_rel->relkind == RELKIND_MATVIEW ||
>>>>>                            rel->rd_rel->relkind == 
>>>>> RELKIND_TOASTVALUE);
>>>>>                 smgrcreate(srel, INIT_FORKNUM, false);
>>>>>                 log_smgrcreate(newrnode, INIT_FORKNUM);
>>>>>                 smgrimmedsync(srel, INIT_FORKNUM);
>>>>>         }
>>>>
>>>> The comment doesn't make much sense, we haven't written nor WAL-logged
>>>> any page here, with nor without the buffer cache. It made more sense
>>>> before commit fa0f466d53.
>>>
>>> Well, it seems to me (and perhaps I am just confused) that complaining
>>> that there's no page written here might be a technicality. The point
>>> is that there's no synchronization between the work we're doing here
>>> -- which is creating a fork, not writing a page -- and any concurrent
>>> checkpoint. So we both need to log it, and also sync it immediately.
>>
>> I see. I pushed the fix from the other thread that makes smgrcreate()
>> call register_dirty_segment (commit 4b4798e13). I believe that makes
>> this smgrimmedsync() unnecessary. If a concurrent checkpoint happens
>> with a redo pointer greater than this WAL record, it must've received
>> the fsync request created by smgrcreate(). That depends on the fact that
>> we write the WAL record *after* smgrcreate(). Subtle..
>>
>> Hmm, we have a similar smgrimmedsync() call after index build, because
>> we have written pages directly with smgrextend(skipFsync=true). If no
>> checkpoints have occurred during the index build, we could call
>> register_dirty_segment() instead of smgrimmedsync(). That would avoid
>> the fsync() latency when creating an index on an empty or small index.
>>
>> This is all very subtle to get right though. That's why I'd like to
>> invent a new bulk-creation facility that would handle this stuff, and
>> make the callers less error-prone.
> 
> Having a more generic and less error-prone bulk-creation mechanism is 
> still on my long TODO list..
> 




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jelte Fennema
Дата:
Сообщение: Re: Should we use MemSet or {0} for struct initialization?
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: Commitfest 2023-09 starts soon