Re: [HACKERS] increasing the default WAL segment size

Поиск
Список
Период
Сортировка
От Beena Emerson
Тема Re: [HACKERS] increasing the default WAL segment size
Дата
Msg-id CAOG9ApFC2eUPQsEZ9=0PnTmC1fPq_P45f1sR4TrkWG7eXxXY6w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] increasing the default WAL segment size  (Andres Freund <andres@anarazel.de>)
Ответы Re: [HACKERS] increasing the default WAL segment size  (Beena Emerson <memissemerson@gmail.com>)
Re: [HACKERS] increasing the default WAL segment size  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
Hello Andres,

Thank you for your review.

On Fri, Jan 27, 2017 at 12:39 AM, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2017-01-23 11:35:11 +0530, Beena Emerson wrote:
> Please find attached an updated WIP patch. I have incorporated almost all
> comments. This is to be applied over Robert's patches. I will post
> performance results later on.
>
> 1. shift (>>) and AND (&) operations: The assign hook of wal_segment_size
> sets the WalModMask and WalShiftBit. All the modulo and division operations
> using XLogSegSize has been replaced with these. However, there are many
> preprocessors which divide with XLogSegSize in xlog_internal.h. I have not
> changed these because it would mean I will have to reassign the WalShiftBit
> along with XLogSegSize in all the modules which use these macros. That does
> not seem to be a good idea. Also, this means shift operator can be used
> only in couple of places.

I think it'd be better not to have XLogSegSize anymore. Silently
changing a macros behaviour from being a compile time constant to
something runtime configurable is a bad idea.

I dont think I understood u clearly. You mean convert the macros using XLogSegSize to functions?
 
> 8. Declaring XLogSegSize: There are 2 internal variables for the same
> parameter. In original code XLOG_SEG_SIZE is defined in the auto-generated
> file src/include/pg_config.h. And xlog_internal.h defines:
>
> #define XLogSegSize     ((uint32) XLOG_SEG_SIZE)
>
> To avoid renaming all parts of code, I made the following change in
> xlog_internal.h
>
> + extern uint32 XLogSegSize;
>
> +#define XLOG_SEG_SIZE XLogSegSize
>
>  would it be better to just use one variable XLogSegSize everywhere. But
> few external modules could be using XLOG_SEG_SIZE. Thoughts?

They'll quite possibly break with configurable size anyway.  So I'd
rather have those broken explicitly.

Ok. I will remove the XLOG_SEGS_SIZE variable then?


> +/*
> + * These variables are set in assign_wal_segment_size
> + *
> + * WalModMask: It is an AND mask for XLogSegSize to allow for faster modulo
> + *           operations using it.
> + *
> + * WalShiftBit: It is an shift bit for XLogSegSize to allow for faster
> + *           division operations using it.
> + *
> + * UsableBytesInSegment: It is the number of bytes in a WAL segment usable for
> + *           WAL data.
> + */
> +uint32               WalModMask;
> +static int   UsableBytesInSegment;
> +static int   WalShiftBit;

This could use some editorializing. "Faster modulo operations" isn't an
explaining how/why it's actually being used. Same for WalShiftBit.

 I will change these comments.
 

>  /*
>   * Private, possibly out-of-date copy of shared LogwrtResult.
> @@ -957,6 +975,7 @@ XLogInsertRecord(XLogRecData *rdata,
>       if (!XLogInsertAllowed())
>               elog(ERROR, "cannot make new WAL entries during recovery");
>
> +
>       /*----------
>        *

Spurious newline change.

>               if (ptr % XLOG_BLCKSZ == SizeOfXLogShortPHD &&
> -                     ptr % XLOG_SEG_SIZE > XLOG_BLCKSZ)
> +                     (ptr & WalModMask) > XLOG_BLCKSZ)
>                       initializedUpto = ptr - SizeOfXLogShortPHD;
>               else if (ptr % XLOG_BLCKSZ == SizeOfXLogLongPHD &&
> -                              ptr % XLOG_SEG_SIZE < XLOG_BLCKSZ)
> +                              (ptr & WalModMask) < XLOG_BLCKSZ)
>                       initializedUpto = ptr - SizeOfXLogLongPHD;
>               else
>                       initializedUpto = ptr;

How about we introduce a XLogSegmentOffset(XLogRecPtr) function like
macro in a first patch?  That'll reduce the amount of change in the
commit actually changing things quite noticeably, and makes it easier to
adjust things later.  I see very little benefit for in-place usage of
either % XLOG_SEG_SIZE or & WalModMask.

I will check this.
 


> @@ -1794,6 +1813,7 @@ XLogBytePosToRecPtr(uint64 bytepos)
>       uint32          seg_offset;
>       XLogRecPtr      result;
>
> +
>       fullsegs = bytepos / UsableBytesInSegment;
>       bytesleft = bytepos % UsableBytesInSegment;

spurious change.

> @@ -1878,7 +1898,7 @@ XLogRecPtrToBytePos(XLogRecPtr ptr)
>
>       XLByteToSeg(ptr, fullsegs);
>
> -     fullpages = (ptr % XLOG_SEG_SIZE) / XLOG_BLCKSZ;
> +     fullpages = (ptr & WalModMask) / XLOG_BLCKSZ;
>       offset = ptr % XLOG_BLCKSZ;
>
>       if (fullpages == 0)
> @@ -2043,7 +2063,7 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic)
>               /*
>                * If first page of an XLOG segment file, make it a long header.
>                */
> -             if ((NewPage->xlp_pageaddr % XLogSegSize) == 0)
> +             if ((NewPage->xlp_pageaddr & WalModMask) == 0)
>               {
>                       XLogLongPageHeader NewLongPage = (XLogLongPageHeader) NewPage;
>
> @@ -2095,6 +2115,7 @@ CalculateCheckpointSegments(void)
>        *        number of segments consumed between checkpoints.
>        *-------
>        */
> +
>       target = (double) max_wal_size / (2.0 + CheckPointCompletionTarget);

spurious change.


>  void
> +assign_wal_segment_size(int newval, void *extra)
> +{
> +     /*
> +      * During system initialization, XLogSegSize is not set so we use
> +      * DEFAULT_XLOG_SEG_SIZE instead.
> +      */
> +     int     WalSegSize = (XLogSegSize == 0) ? DEFAULT_XLOG_SEG_SIZE : XLOG_SEG_SIZE;
> +
> +     wal_segment_size = newval;
> +     UsableBytesInSegment = (wal_segment_size * UsableBytesInPage) -
> +                                                (SizeOfXLogLongPHD - SizeOfXLogShortPHD);
> +     WalModMask = WalSegSize - 1;
> +
> +     /* Set the WalShiftBit */
> +     WalShiftBit = 0;
> +     while (WalSegSize > 1)
> +     {
> +             WalSegSize = WalSegSize >> 1;
> +             WalShiftBit++;
> +     }
> +}

Hm. Are GUC hooks a good way to compute the masks?  Interdependent GUCs
are unfortunately not working well, and several GUCs might end up
depending on these.  I think it might be better to assign the variables
somewhere early in StartupXLOG() or such.

I am not sure about these interdependent GUCs. I need to study this better and make changes as required.


> +
> +void
> +assign_min_wal_size(int newval, void *extra)
> +{
> +     /*
> +      * During system initialization, XLogSegSize is not set so we use
> +      * DEFAULT_XLOG_SEG_SIZE instead.
> +      *
> +      * min_wal_size is in kB and XLogSegSize is in bytes and so it is
> +      * converted to kB for the calculation.
> +      */
> +     int     WalSegSize = (XLogSegSize == 0) ? (DEFAULT_XLOG_SEG_SIZE / 1024) :
> +                                                                               (XLOG_SEG_SIZE / 1024);
> +
> +     min_wal_size = newval / WalSegSize;
> +}
> +
> +void
>  assign_max_wal_size(int newval, void *extra)
>  {
> -     max_wal_size = newval;
> +     /*
> +      * During system initialization, XLogSegSize is not set so we use
> +      * DEFAULT_XLOG_SEG_SIZE instead.
> +      *
> +      * max_wal_size is in kB and XLogSegSize is in bytes and so it is
> +      * converted to bytes for the calculation.
> +      */
> +     int     WalSegSize = (XLogSegSize == 0) ? (DEFAULT_XLOG_SEG_SIZE / 1024) :
> +                                                                               (XLOG_SEG_SIZE / 1024);
> +
> +     max_wal_size = newval / WalSegSize;
>       CalculateCheckpointSegments();
>  }

I don't think it's a good idea to have GUCs that are initially set to
the wrong value and such.  How about just storing bytes, and converting
into segments upon use?

max_wal_size is used in CalculateCheckpointSegments and XLOGfileslop.
min_wal_size is used in XLOGfileslop only.

XLOGfileslop is called after the postgres has started up and would have XLogSegSize set by then but CalculateCheckpointSegments  would be a problem. assign_max_wal_size calls CalculateCheckpointSegments which will need the value as segment count not bytes. If we continue as bytes, then we will need to shift the WalSegSize adjustment in the CalculateCheckpointSegments. 
 

 
> @@ -2135,8 +2205,8 @@ XLOGfileslop(XLogRecPtr PriorRedoPtr)
>        * correspond to. Always recycle enough segments to meet the minimum, and
>        * remove enough segments to stay below the maximum.
>        */
> -     minSegNo = PriorRedoPtr / XLOG_SEG_SIZE + min_wal_size - 1;
> -     maxSegNo = PriorRedoPtr / XLOG_SEG_SIZE + max_wal_size - 1;
> +     minSegNo = (PriorRedoPtr >> WalShiftBit) + min_wal_size - 1;
> +     maxSegNo = (PriorRedoPtr >> WalShiftBit) + max_wal_size - 1;

I think a macro would be good here too (same prerequisite patch as above).

> @@ -4677,8 +4749,18 @@ XLOGShmemSize(void)
>        */
>       if (XLOGbuffers == -1)
>       {
> -             char            buf[32];
> -
> +             /*
> +              * The calculation of XLOGbuffers, requires the now run-time parameter
> +              * XLogSegSize from the ControlFile. The value determined here is
> +              * required to create the shared memory segment. Hence, temporarily
> +              * allocating space and reading ControlFile here.
> +              */

I don't like comments containing things like "the now run-time paramter"
much - they are likely going to still be there in 10 years, and will be
hard to understand.

you are right. 
 

But anyway, how about we simply remove the "max one segment" boundary
instead? I don't think it's actually very meaningful - several people
posted benchmarks with more than one segment being beneficial.


> diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
> index 31290d3..87efc3c 100644
> --- a/src/bin/pg_basebackup/streamutil.c
> +++ b/src/bin/pg_basebackup/streamutil.c
> @@ -238,6 +238,59 @@ GetConnection(void)
>  }
>
>  /*
> + * Run the SHOW_WAL_SEGMENT_SIZE command to set the XLogSegSize
> + */
> +bool
> +SetXLogSegSize(PGconn *conn)
> +{

I think this is a confusing function name, because it sounds like
you're setting the SegSize remotely or such. I think making it
XLogRecPtr RetrieveXLogSegSize(conn); or such would lead to better code.

I agree. I will do the needful.
 

> diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c
> index 963802e..4ceebdc 100644
> --- a/src/bin/pg_resetxlog/pg_resetxlog.c
> +++ b/src/bin/pg_resetxlog/pg_resetxlog.c
> @@ -57,6 +57,7 @@
>  #include "storage/large_object.h"
>  #include "pg_getopt.h"
>
> +uint32               XLogSegSize;

This seems like a bad idea - having the same local variable both in
frontend and backend programs seems like a recipe for disaster.

I had to use the same variable name because they were used in the macros specified in the xlog_internal.h,  So re-assigning this variable would automatically make the macros using XLogSegSize accessible in these programs. Else they will throw error "undefined reference to `XLogSegSize'"
 

--
Thank you, 

Beena Emerson

Have a Great Day!

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: [HACKERS] Failure in commit_ts tap tests
Следующее
От: Nikhil Sontakke
Дата:
Сообщение: Re: [HACKERS] Speedup twophase transactions