Re: statistics for shared catalogs not updated when autovacuum is off

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: statistics for shared catalogs not updated when autovacuum is off
Дата
Msg-id 4625.1464202586@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: statistics for shared catalogs not updated when autovacuum is off  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
I wrote:
> In short, I think the attached is enough to fix it.  There's some cosmetic
> cleanup that could be done here: a lot of these functions could use better
> comments, and I'm not happy about the autovac launcher depending on
> MyDatabaseId being zero.  But those are not functional problems.

While re-reading this, I realized that there's a second serious bug in the
same area: commit 187492b6c basically broke the mechanism that prevents
duplicated writes when two backends send inquiries for the same DB at
about the same time.  The intention is that if an inquiry arrives with a
cutoff_time older than when we last wrote the DB's stats, we don't have to
do anything.  But as the code stands we will make a DBWriteRequest
unconditionally, because the previous request has been removed from the
list.  It would only succeed in merging requests if the second one arrives
before we have executed a write in response to the first one, which looks
to me to be impossible given the coding of the message-processing loop in
PgstatCollectorMain.

In the attached expanded patch, I fixed pgstat_recv_inquiry() to not make
a write request unless the cutoff-time restriction is met.  I also removed
DBWriteRequest.request_time, which was not being consulted at all.  At
this point the DBWriteRequest data structure is just overkill and could be
replaced by an OID list at very substantial notational savings, but I've
not done that yet.

Comments/objections?

            regards, tom lane

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 41f4374..66f0e9c 100644
*** a/src/backend/postmaster/pgstat.c
--- b/src/backend/postmaster/pgstat.c
*************** static PgStat_GlobalStats globalStats;
*** 224,231 ****
  /* Write request info for each database */
  typedef struct DBWriteRequest
  {
!     Oid            databaseid;        /* OID of the database to write */
!     TimestampTz request_time;    /* timestamp of the last write request */
      slist_node    next;
  } DBWriteRequest;

--- 224,230 ----
  /* Write request info for each database */
  typedef struct DBWriteRequest
  {
!     Oid            databaseid;        /* OID of database to write (0 => shared) */
      slist_node    next;
  } DBWriteRequest;

*************** pgstat_recv_inquiry(PgStat_MsgInquiry *m
*** 4838,4850 ****
      elog(DEBUG2, "received inquiry for database %u", msg->databaseid);

      /*
!      * Find the last write request for this DB.  If it's older than the
!      * request's cutoff time, update it; otherwise there's nothing to do.
       *
       * Note that if a request is found, we return early and skip the below
       * check for clock skew.  This is okay, since the only way for a DB
       * request to be present in the list is that we have been here since the
!      * last write round.
       */
      slist_foreach(iter, &last_statrequests)
      {
--- 4836,4848 ----
      elog(DEBUG2, "received inquiry for database %u", msg->databaseid);

      /*
!      * If there's already a write request for this DB, there's nothing to do.
       *
       * Note that if a request is found, we return early and skip the below
       * check for clock skew.  This is okay, since the only way for a DB
       * request to be present in the list is that we have been here since the
!      * last write round.  It seems sufficient to check for clock skew once per
!      * write round.
       */
      slist_foreach(iter, &last_statrequests)
      {
*************** pgstat_recv_inquiry(PgStat_MsgInquiry *m
*** 4853,4873 ****
          if (req->databaseid != msg->databaseid)
              continue;

-         if (msg->cutoff_time > req->request_time)
-             req->request_time = msg->cutoff_time;
          return;
      }

      /*
!      * There's no request for this DB yet, so create one.
!      */
!     newreq = palloc(sizeof(DBWriteRequest));
!
!     newreq->databaseid = msg->databaseid;
!     newreq->request_time = msg->clock_time;
!     slist_push_head(&last_statrequests, &newreq->next);
!
!     /*
       * If the requestor's local clock time is older than stats_timestamp, we
       * should suspect a clock glitch, ie system time going backwards; though
       * the more likely explanation is just delayed message receipt.  It is
--- 4851,4864 ----
          if (req->databaseid != msg->databaseid)
              continue;

          return;
      }

      /*
!      * Check to see if we last wrote this database at a time >= the requested
!      * cutoff time.  If so, this is a stale request that was generated before
!      * we updated the DB file, and we don't need to do it again.
!      *
       * If the requestor's local clock time is older than stats_timestamp, we
       * should suspect a clock glitch, ie system time going backwards; though
       * the more likely explanation is just delayed message receipt.  It is
*************** pgstat_recv_inquiry(PgStat_MsgInquiry *m
*** 4876,4882 ****
       * to update the stats file for a long time.
       */
      dbentry = pgstat_get_db_entry(msg->databaseid, false);
!     if ((dbentry != NULL) && (msg->clock_time < dbentry->stats_timestamp))
      {
          TimestampTz cur_ts = GetCurrentTimestamp();

--- 4867,4881 ----
       * to update the stats file for a long time.
       */
      dbentry = pgstat_get_db_entry(msg->databaseid, false);
!     if (dbentry == NULL)
!     {
!         /*
!          * We have no data for this DB!  We might choose to do nothing here,
!          * but instead enter a request so that at least the global stats get
!          * written in response to the inquiry.
!          */
!     }
!     else if (msg->clock_time < dbentry->stats_timestamp)
      {
          TimestampTz cur_ts = GetCurrentTimestamp();

*************** pgstat_recv_inquiry(PgStat_MsgInquiry *m
*** 4897,4907 ****
                   writetime, mytime, dbentry->databaseid);
              pfree(writetime);
              pfree(mytime);
!
!             newreq->request_time = cur_ts;
!             dbentry->stats_timestamp = cur_ts - 1;
          }
      }
  }


--- 4896,4921 ----
                   writetime, mytime, dbentry->databaseid);
              pfree(writetime);
              pfree(mytime);
!         }
!         else
!         {
!             /* No, it's just a stale request, ignore it */
!             return;
          }
      }
+     else if (msg->cutoff_time <= dbentry->stats_timestamp)
+     {
+         /* Stale request, ignore it */
+         return;
+     }
+
+     /*
+      * We need to write this DB, so create a request.
+      */
+     newreq = (DBWriteRequest *) palloc(sizeof(DBWriteRequest));
+
+     newreq->databaseid = msg->databaseid;
+     slist_push_head(&last_statrequests, &newreq->next);
  }


*************** pgstat_db_requested(Oid databaseid)
*** 5504,5510 ****
  {
      slist_iter    iter;

!     /* Check the databases if they need to refresh the stats. */
      slist_foreach(iter, &last_statrequests)
      {
          DBWriteRequest *req = slist_container(DBWriteRequest, next, iter.cur);
--- 5518,5533 ----
  {
      slist_iter    iter;

!     /*
!      * If any requests are outstanding at all, we should write the stats for
!      * shared catalogs (the "database" with OID 0).  This ensures that
!      * backends will see up-to-date stats for shared catalogs, even though
!      * they send inquiry messages mentioning only their own DB.
!      */
!     if (databaseid == InvalidOid && !slist_is_empty(&last_statrequests))
!         return true;
!
!     /* Search to see if there's an open request to write this database. */
      slist_foreach(iter, &last_statrequests)
      {
          DBWriteRequest *req = slist_container(DBWriteRequest, next, iter.cur);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 4be09fe..a078b61 100644
*** a/src/include/pgstat.h
--- b/src/include/pgstat.h
*************** typedef struct PgStat_MsgDummy
*** 219,225 ****

  /* ----------
   * PgStat_MsgInquiry            Sent by a backend to ask the collector
!  *                                to write the stats file.
   * ----------
   */

--- 219,238 ----

  /* ----------
   * PgStat_MsgInquiry            Sent by a backend to ask the collector
!  *                                to write the stats file(s).
!  *
!  * Ordinarily, an inquiry message prompts writing of the global stats file,
!  * the stats file for shared catalogs, and the stats file for the specified
!  * database.  If databaseid is InvalidOid, only the first two are written.
!  *
!  * New file(s) will be written only if the existing file has a timestamp
!  * older than the specified cutoff_time; this prevents duplicated effort
!  * when multiple requests arrive at nearly the same time, assuming that
!  * backends send requests with cutoff_times a little bit in the past.
!  *
!  * clock_time should be the requestor's current local time; the collector
!  * uses this to check for the system clock going backward, but it has no
!  * effect unless that occurs.  We assume clock_time >= cutoff_time, though.
   * ----------
   */

*************** typedef struct PgStat_MsgInquiry
*** 228,234 ****
      PgStat_MsgHdr m_hdr;
      TimestampTz clock_time;        /* observed local clock time */
      TimestampTz cutoff_time;    /* minimum acceptable file timestamp */
!     Oid            databaseid;        /* requested DB (InvalidOid => all DBs) */
  } PgStat_MsgInquiry;


--- 241,247 ----
      PgStat_MsgHdr m_hdr;
      TimestampTz clock_time;        /* observed local clock time */
      TimestampTz cutoff_time;    /* minimum acceptable file timestamp */
!     Oid            databaseid;        /* requested DB (InvalidOid => shared only) */
  } PgStat_MsgInquiry;



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Is the unfair lwlock behavior intended?
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Changed SRF in targetlist handling