Re: [HACKERS] increasing the default WAL segment size

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: [HACKERS] increasing the default WAL segment size
Дата
Msg-id 20170126190925.pnrb5lysrz2op2fl@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: [HACKERS] increasing the default WAL segment size  (Beena Emerson <memissemerson@gmail.com>)
Ответы Re: [HACKERS] increasing the default WAL segment size  (Beena Emerson <memissemerson@gmail.com>)
Список pgsql-hackers
Hi,

On 2017-01-23 11:35:11 +0530, Beena Emerson wrote:
> Please find attached an updated WIP patch. I have incorporated almost all
> comments. This is to be applied over Robert's patches. I will post
> performance results later on.
>
> 1. shift (>>) and AND (&) operations: The assign hook of wal_segment_size
> sets the WalModMask and WalShiftBit. All the modulo and division operations
> using XLogSegSize has been replaced with these. However, there are many
> preprocessors which divide with XLogSegSize in xlog_internal.h. I have not
> changed these because it would mean I will have to reassign the WalShiftBit
> along with XLogSegSize in all the modules which use these macros. That does
> not seem to be a good idea. Also, this means shift operator can be used
> only in couple of places.

I think it'd be better not to have XLogSegSize anymore. Silently
changing a macros behaviour from being a compile time constant to
something runtime configurable is a bad idea.


> 8. Declaring XLogSegSize: There are 2 internal variables for the same
> parameter. In original code XLOG_SEG_SIZE is defined in the auto-generated
> file src/include/pg_config.h. And xlog_internal.h defines:
>
> #define XLogSegSize     ((uint32) XLOG_SEG_SIZE)
>
> To avoid renaming all parts of code, I made the following change in
> xlog_internal.h
>
> + extern uint32 XLogSegSize;
>
> +#define XLOG_SEG_SIZE XLogSegSize
>
>  would it be better to just use one variable XLogSegSize everywhere. But
> few external modules could be using XLOG_SEG_SIZE. Thoughts?

They'll quite possibly break with configurable size anyway.  So I'd
rather have those broken explicitly.



> +/*
> + * These variables are set in assign_wal_segment_size
> + *
> + * WalModMask: It is an AND mask for XLogSegSize to allow for faster modulo
> + *        operations using it.
> + *
> + * WalShiftBit: It is an shift bit for XLogSegSize to allow for faster
> + *        division operations using it.
> + *
> + * UsableBytesInSegment: It is the number of bytes in a WAL segment usable for
> + *        WAL data.
> + */
> +uint32        WalModMask;
> +static int    UsableBytesInSegment;
> +static int    WalShiftBit;

This could use some editorializing. "Faster modulo operations" isn't an
explaining how/why it's actually being used. Same for WalShiftBit.

>  /*
>   * Private, possibly out-of-date copy of shared LogwrtResult.
> @@ -957,6 +975,7 @@ XLogInsertRecord(XLogRecData *rdata,
>      if (!XLogInsertAllowed())
>          elog(ERROR, "cannot make new WAL entries during recovery");
>
> +
>      /*----------
>       *

Spurious newline change.

>          if (ptr % XLOG_BLCKSZ == SizeOfXLogShortPHD &&
> -            ptr % XLOG_SEG_SIZE > XLOG_BLCKSZ)
> +            (ptr & WalModMask) > XLOG_BLCKSZ)
>              initializedUpto = ptr - SizeOfXLogShortPHD;
>          else if (ptr % XLOG_BLCKSZ == SizeOfXLogLongPHD &&
> -                 ptr % XLOG_SEG_SIZE < XLOG_BLCKSZ)
> +                 (ptr & WalModMask) < XLOG_BLCKSZ)
>              initializedUpto = ptr - SizeOfXLogLongPHD;
>          else
>              initializedUpto = ptr;

How about we introduce a XLogSegmentOffset(XLogRecPtr) function like
macro in a first patch?  That'll reduce the amount of change in the
commit actually changing things quite noticeably, and makes it easier to
adjust things later.  I see very little benefit for in-place usage of
either % XLOG_SEG_SIZE or & WalModMask.


> @@ -1794,6 +1813,7 @@ XLogBytePosToRecPtr(uint64 bytepos)
>      uint32        seg_offset;
>      XLogRecPtr    result;
>
> +
>      fullsegs = bytepos / UsableBytesInSegment;
>      bytesleft = bytepos % UsableBytesInSegment;

spurious change.

> @@ -1878,7 +1898,7 @@ XLogRecPtrToBytePos(XLogRecPtr ptr)
>
>      XLByteToSeg(ptr, fullsegs);
>
> -    fullpages = (ptr % XLOG_SEG_SIZE) / XLOG_BLCKSZ;
> +    fullpages = (ptr & WalModMask) / XLOG_BLCKSZ;
>      offset = ptr % XLOG_BLCKSZ;
>
>      if (fullpages == 0)
> @@ -2043,7 +2063,7 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic)
>          /*
>           * If first page of an XLOG segment file, make it a long header.
>           */
> -        if ((NewPage->xlp_pageaddr % XLogSegSize) == 0)
> +        if ((NewPage->xlp_pageaddr & WalModMask) == 0)
>          {
>              XLogLongPageHeader NewLongPage = (XLogLongPageHeader) NewPage;
>
> @@ -2095,6 +2115,7 @@ CalculateCheckpointSegments(void)
>       *      number of segments consumed between checkpoints.
>       *-------
>       */
> +
>      target = (double) max_wal_size / (2.0 + CheckPointCompletionTarget);

spurious change.


>  void
> +assign_wal_segment_size(int newval, void *extra)
> +{
> +    /*
> +     * During system initialization, XLogSegSize is not set so we use
> +     * DEFAULT_XLOG_SEG_SIZE instead.
> +     */
> +    int    WalSegSize = (XLogSegSize == 0) ? DEFAULT_XLOG_SEG_SIZE : XLOG_SEG_SIZE;
> +
> +    wal_segment_size = newval;
> +    UsableBytesInSegment = (wal_segment_size * UsableBytesInPage) -
> +                           (SizeOfXLogLongPHD - SizeOfXLogShortPHD);
> +    WalModMask = WalSegSize - 1;
> +
> +    /* Set the WalShiftBit */
> +    WalShiftBit = 0;
> +    while (WalSegSize > 1)
> +    {
> +        WalSegSize = WalSegSize >> 1;
> +        WalShiftBit++;
> +    }
> +}

Hm. Are GUC hooks a good way to compute the masks?  Interdependent GUCs
are unfortunately not working well, and several GUCs might end up
depending on these.  I think it might be better to assign the variables
somewhere early in StartupXLOG() or such.


> +
> +void
> +assign_min_wal_size(int newval, void *extra)
> +{
> +    /*
> +     * During system initialization, XLogSegSize is not set so we use
> +     * DEFAULT_XLOG_SEG_SIZE instead.
> +     *
> +     * min_wal_size is in kB and XLogSegSize is in bytes and so it is
> +     * converted to kB for the calculation.
> +     */
> +    int    WalSegSize = (XLogSegSize == 0) ? (DEFAULT_XLOG_SEG_SIZE / 1024) :
> +                                          (XLOG_SEG_SIZE / 1024);
> +
> +    min_wal_size = newval / WalSegSize;
> +}
> +
> +void
>  assign_max_wal_size(int newval, void *extra)
>  {
> -    max_wal_size = newval;
> +    /*
> +     * During system initialization, XLogSegSize is not set so we use
> +     * DEFAULT_XLOG_SEG_SIZE instead.
> +     *
> +     * max_wal_size is in kB and XLogSegSize is in bytes and so it is
> +     * converted to bytes for the calculation.
> +     */
> +    int    WalSegSize = (XLogSegSize == 0) ? (DEFAULT_XLOG_SEG_SIZE / 1024) :
> +                                          (XLOG_SEG_SIZE / 1024);
> +
> +    max_wal_size = newval / WalSegSize;
>      CalculateCheckpointSegments();
>  }

I don't think it's a good idea to have GUCs that are initially set to
the wrong value and such.  How about just storing bytes, and converting
into segments upon use?



> @@ -2135,8 +2205,8 @@ XLOGfileslop(XLogRecPtr PriorRedoPtr)
>       * correspond to. Always recycle enough segments to meet the minimum, and
>       * remove enough segments to stay below the maximum.
>       */
> -    minSegNo = PriorRedoPtr / XLOG_SEG_SIZE + min_wal_size - 1;
> -    maxSegNo = PriorRedoPtr / XLOG_SEG_SIZE + max_wal_size - 1;
> +    minSegNo = (PriorRedoPtr >> WalShiftBit) + min_wal_size - 1;
> +    maxSegNo = (PriorRedoPtr >> WalShiftBit) + max_wal_size - 1;

I think a macro would be good here too (same prerequisite patch as above).

> @@ -4677,8 +4749,18 @@ XLOGShmemSize(void)
>       */
>      if (XLOGbuffers == -1)
>      {
> -        char        buf[32];
> -
> +        /*
> +         * The calculation of XLOGbuffers, requires the now run-time parameter
> +         * XLogSegSize from the ControlFile. The value determined here is
> +         * required to create the shared memory segment. Hence, temporarily
> +         * allocating space and reading ControlFile here.
> +         */

I don't like comments containing things like "the now run-time paramter"
much - they are likely going to still be there in 10 years, and will be
hard to understand.


But anyway, how about we simply remove the "max one segment" boundary
instead? I don't think it's actually very meaningful - several people
posted benchmarks with more than one segment being beneficial.


> diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
> index 31290d3..87efc3c 100644
> --- a/src/bin/pg_basebackup/streamutil.c
> +++ b/src/bin/pg_basebackup/streamutil.c
> @@ -238,6 +238,59 @@ GetConnection(void)
>  }
>
>  /*
> + * Run the SHOW_WAL_SEGMENT_SIZE command to set the XLogSegSize
> + */
> +bool
> +SetXLogSegSize(PGconn *conn)
> +{

I think this is a confusing function name, because it sounds like
you're setting the SegSize remotely or such. I think making it
XLogRecPtr RetrieveXLogSegSize(conn); or such would lead to better code.

> diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c
> index 963802e..4ceebdc 100644
> --- a/src/bin/pg_resetxlog/pg_resetxlog.c
> +++ b/src/bin/pg_resetxlog/pg_resetxlog.c
> @@ -57,6 +57,7 @@
>  #include "storage/large_object.h"
>  #include "pg_getopt.h"
>
> +uint32        XLogSegSize;

This seems like a bad idea - having the same local variable both in
frontend and backend programs seems like a recipe for disaster.


Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] increasing the default WAL segment size
Следующее
От: Andres Freund
Дата:
Сообщение: Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal