Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)

Поиск
Список
Период
Сортировка
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jul 16, 2012 at 3:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, while we are on the subject: hasn't this split completely broken
>> the statistics about backend-initiated writes?

> Yes, it seems to have done just that.

So I went to fix this in the obvious way (attached), but while testing
it I found that the number of buffers_backend events reported during
a regression test run barely changed; which surprised the heck out of
me, so I dug deeper.  The cause turns out to be extremely scary:
ForwardFsyncRequest isn't getting called at all in the bgwriter process,
because the bgwriter process has a pendingOpsTable.  So it just queues
its fsync requests locally, and then never acts on them, since it never
runs any checkpoints anymore.

This implies that nobody has done pull-the-plug testing on either HEAD
or 9.2 since the checkpointer split went in (2011-11-01), because even
a modicum of such testing would surely have shown that we're failing to
fsync a significant fraction of our write traffic.

Furthermore, I would say that any performance testing done since then,
if it wasn't looking at purely read-only scenarios, isn't worth the
electrons it's written on.  In particular, any performance gain that
anybody might have attributed to the checkpointer splitup is very
probably hogwash.

This is not giving me a warm feeling about our testing practices.

As far as fixing the bug is concerned, the reason for the foulup
is that mdinit() looks to IsBootstrapProcessingMode() to decide
whether to create a pendingOpsTable.  That probably was all right
when it was coded, but what it means today is that *any* process
started via AuxiliaryProcessMain will have one; thus not only do
bgwriters have one, but so do walwriter and walreceiver processes;
which might not represent a bug today but it's pretty scary anyway.
I think we need to fix that so it's more directly dependent on the
auxiliary process type.  We can't use flags set by the respective
FooMain() functions, such as am_bg_writer, because mdinit is called
from BaseInit() which happens before reaching those functions.
My suggestion is that bootstrap.c ought to make the process's
AuxProcType value available and then mdinit should consult that to
decide what to do.  (Having done that, we might consider getting rid
of the "retail" process-type flags am_bg_writer etc.)

            regards, tom lane

diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 5f93fccbfab1bbb8306f5de4ad228f3cb48b0862..41a7b2be4f680db08556d948eaaa002ed50119c5 100644
*** a/src/backend/postmaster/bgwriter.c
--- b/src/backend/postmaster/bgwriter.c
*************** BackgroundWriterMain(void)
*** 341,346 ****
--- 341,357 ----
  }


+ /*
+  * IsBackgroundWriterProcess
+  *        Return true if running in background writer process.
+  */
+ bool
+ IsBackgroundWriterProcess(void)
+ {
+     return am_bg_writer;
+ }
+
+
  /* --------------------------------
   *        signal handler routines
   * --------------------------------
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 92fd4276cd1b3be81d1ac741f9f6ea09d241ea52..bd1db4811d661d75e616bc77157c4a9e9b7e92fc 100644
*** a/src/backend/postmaster/checkpointer.c
--- b/src/backend/postmaster/checkpointer.c
*************** bool
*** 1124,1129 ****
--- 1124,1130 ----
  ForwardFsyncRequest(RelFileNode rnode, ForkNumber forknum, BlockNumber segno)
  {
      CheckpointerRequest *request;
+     bool        am_bg_writer;
      bool        too_full;

      if (!IsUnderPostmaster)
*************** ForwardFsyncRequest(RelFileNode rnode, F
*** 1131,1141 ****

      if (am_checkpointer)
          elog(ERROR, "ForwardFsyncRequest must not be called in checkpointer");

      LWLockAcquire(CheckpointerCommLock, LW_EXCLUSIVE);

      /* Count all backend writes regardless of if they fit in the queue */
!     CheckpointerShmem->num_backend_writes++;

      /*
       * If the checkpointer isn't running or the request queue is full, the
--- 1132,1144 ----

      if (am_checkpointer)
          elog(ERROR, "ForwardFsyncRequest must not be called in checkpointer");
+     am_bg_writer = IsBackgroundWriterProcess();

      LWLockAcquire(CheckpointerCommLock, LW_EXCLUSIVE);

      /* Count all backend writes regardless of if they fit in the queue */
!     if (!am_bg_writer)
!         CheckpointerShmem->num_backend_writes++;

      /*
       * If the checkpointer isn't running or the request queue is full, the
*************** ForwardFsyncRequest(RelFileNode rnode, F
*** 1150,1156 ****
           * Count the subset of writes where backends have to do their own
           * fsync
           */
!         CheckpointerShmem->num_backend_fsync++;
          LWLockRelease(CheckpointerCommLock);
          return false;
      }
--- 1153,1160 ----
           * Count the subset of writes where backends have to do their own
           * fsync
           */
!         if (!am_bg_writer)
!             CheckpointerShmem->num_backend_fsync++;
          LWLockRelease(CheckpointerCommLock);
          return false;
      }
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 2e97e6aea5551f338939e680cb58e903fa409dc4..7deb62cf24a5d86248019ded418c360cb6f2800a 100644
*** a/src/include/postmaster/bgwriter.h
--- b/src/include/postmaster/bgwriter.h
*************** extern double CheckPointCompletionTarget
*** 28,33 ****
--- 28,35 ----
  extern void BackgroundWriterMain(void) __attribute__((noreturn));
  extern void CheckpointerMain(void) __attribute__((noreturn));

+ extern bool IsBackgroundWriterProcess(void);
+
  extern void RequestCheckpoint(int flags);
  extern void CheckpointWriteDelay(int flags, double progress);


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: Using pg_upgrade on log-shipping standby servers
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)