Re: Error with index on unlogged table

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: Error with index on unlogged table
Дата
Msg-id CAB7nPqTJGbRdexanOk2HUXqWU1MDTxv4tSQ9MSQtBCvUJ=m9DQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Error with index on unlogged table  (Andres Freund <andres@anarazel.de>)
Ответы Re: Error with index on unlogged table  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Fri, Dec 4, 2015 at 5:10 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-12-04 17:00:13 +0900, Michael Paquier wrote:
>> Andres Freud wrote:
>> >>  extern void InitXLogInsert(void);
>> >> diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
>> >> index ad1eb4b..91445f1 100644
>> >> --- a/src/include/catalog/pg_control.h
>> >> +++ b/src/include/catalog/pg_control.h
>> >> @@ -73,6 +73,7 @@ typedef struct CheckPoint
>> >>  #define XLOG_END_OF_RECOVERY                 0x90
>> >>  #define XLOG_FPI_FOR_HINT                            0xA0
>> >>  #define XLOG_FPI                                             0xB0
>> >> +#define XLOG_FPI_FOR_SYNC                            0xC0
>> >
>> >
>> > I'm not a big fan of the XLOG_FPI_FOR_SYNC name. Syncing is a bit too
>> > ambigous for my taste. How about either naming it XLOG_FPI_FLUSH or
>> > instead adding actual record data and a 'flags' field in there? I
>> > slightly prefer the latter - XLOG_FPI and XLOG_FPI_FOR_HINT really are
>> > different, XLOG_FPI_FOR_SYNC not so much.
>>
>> Let's go for XLOG_FPI_FLUSH.
>
> I think the other way is a bit better, because we can add new flags
> without changing the WAL format.

Hm. On the contrary, I think that it would make more sense to have a
flag as well for FOR_HINT honestly, those are really the same
operations, and FOR_HINT is just here for statistic purposes.

>> diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
>> index 99337b0..b646101 100644
>> --- a/src/backend/access/brin/brin.c
>> +++ b/src/backend/access/brin/brin.c
>> @@ -682,7 +682,15 @@ brinbuildempty(PG_FUNCTION_ARGS)
>>       brin_metapage_init(BufferGetPage(metabuf), BrinGetPagesPerRange(index),
>>                                          BRIN_CURRENT_VERSION);
>>       MarkBufferDirty(metabuf);
>> -     log_newpage_buffer(metabuf, false);
>> +
>> +     /*
>> +      * For unlogged relations, this page should be immediately flushed
>> +      * to disk after being replayed. This is necessary to ensure that the
>> +      * initial on-disk state of unlogged relations is preserved when
>> +      * they get reset at the end of recovery.
>> +      */
>> +     log_newpage_buffer(metabuf, false,
>> +             index->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED);
>>       END_CRIT_SECTION();
>
> Maybe write the last sentence as '... as the on disk files are copied
> directly at the end of recovery.'?

Check.

>> @@ -336,7 +336,8 @@ end_heap_rewrite(RewriteState state)
>>                                               MAIN_FORKNUM,
>>                                               state->rs_blockno,
>>                                               state->rs_buffer,
>> -                                             true);
>> +                                             true,
>> +                                             false);
>>               RelationOpenSmgr(state->rs_new_rel);
>>
>>               PageSetChecksumInplace(state->rs_buffer, state->rs_blockno);
>> @@ -685,7 +686,8 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
>>                                                       MAIN_FORKNUM,
>>                                                       state->rs_blockno,
>>                                                       page,
>> -                                                     true);
>> +                                                     true,
>> +                                                     false);
>
> Did you verify that that's ok when a unlogged table is clustered/vacuum
> full'ed?

Yep.

>> @@ -181,6 +183,9 @@ xlog_identify(uint8 info)
>>               case XLOG_FPI_FOR_HINT:
>>                       id = "FPI_FOR_HINT";
>>                       break;
>> +             case XLOG_FPI_FLUSH:
>> +                     id = "FPI_FOR_SYNC";
>> +                     break;
>>       }
>
> Old string.

Yeah, that's now completely removed.

>> --- a/src/backend/access/transam/xloginsert.c
>> +++ b/src/backend/access/transam/xloginsert.c
>> @@ -932,10 +932,13 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
>>   * If the page follows the standard page layout, with a PageHeader and unused
>>   * space between pd_lower and pd_upper, set 'page_std' to TRUE. That allows
>>   * the unused space to be left out from the WAL record, making it smaller.
>> + *
>> + * If 'is_flush' is set to TRUE, relation will be requested to flush
>> + * immediately its data at replay after replaying this full page image.
>>   */
>
> s/is_flush/flush_immed/? And maybe say that it 'will be flushed to the
> OS immediately after replaying the record'?

s/OS/stable storage?
--
Michael

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ildus Kurbangaliev
Дата:
Сообщение: Support of partial decompression for datums
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Support of partial decompression for datums