Re: statistics for shared catalogs not updated when autovacuum is off
От | Tom Lane |
---|---|
Тема | Re: statistics for shared catalogs not updated when autovacuum is off |
Дата | |
Msg-id | 4625.1464202586@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: statistics for shared catalogs not updated when autovacuum is off (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-hackers |
I wrote: > In short, I think the attached is enough to fix it. There's some cosmetic > cleanup that could be done here: a lot of these functions could use better > comments, and I'm not happy about the autovac launcher depending on > MyDatabaseId being zero. But those are not functional problems. While re-reading this, I realized that there's a second serious bug in the same area: commit 187492b6c basically broke the mechanism that prevents duplicated writes when two backends send inquiries for the same DB at about the same time. The intention is that if an inquiry arrives with a cutoff_time older than when we last wrote the DB's stats, we don't have to do anything. But as the code stands we will make a DBWriteRequest unconditionally, because the previous request has been removed from the list. It would only succeed in merging requests if the second one arrives before we have executed a write in response to the first one, which looks to me to be impossible given the coding of the message-processing loop in PgstatCollectorMain. In the attached expanded patch, I fixed pgstat_recv_inquiry() to not make a write request unless the cutoff-time restriction is met. I also removed DBWriteRequest.request_time, which was not being consulted at all. At this point the DBWriteRequest data structure is just overkill and could be replaced by an OID list at very substantial notational savings, but I've not done that yet. Comments/objections? regards, tom lane diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 41f4374..66f0e9c 100644 *** a/src/backend/postmaster/pgstat.c --- b/src/backend/postmaster/pgstat.c *************** static PgStat_GlobalStats globalStats; *** 224,231 **** /* Write request info for each database */ typedef struct DBWriteRequest { ! Oid databaseid; /* OID of the database to write */ ! TimestampTz request_time; /* timestamp of the last write request */ slist_node next; } DBWriteRequest; --- 224,230 ---- /* Write request info for each database */ typedef struct DBWriteRequest { ! Oid databaseid; /* OID of database to write (0 => shared) */ slist_node next; } DBWriteRequest; *************** pgstat_recv_inquiry(PgStat_MsgInquiry *m *** 4838,4850 **** elog(DEBUG2, "received inquiry for database %u", msg->databaseid); /* ! * Find the last write request for this DB. If it's older than the ! * request's cutoff time, update it; otherwise there's nothing to do. * * Note that if a request is found, we return early and skip the below * check for clock skew. This is okay, since the only way for a DB * request to be present in the list is that we have been here since the ! * last write round. */ slist_foreach(iter, &last_statrequests) { --- 4836,4848 ---- elog(DEBUG2, "received inquiry for database %u", msg->databaseid); /* ! * If there's already a write request for this DB, there's nothing to do. * * Note that if a request is found, we return early and skip the below * check for clock skew. This is okay, since the only way for a DB * request to be present in the list is that we have been here since the ! * last write round. It seems sufficient to check for clock skew once per ! * write round. */ slist_foreach(iter, &last_statrequests) { *************** pgstat_recv_inquiry(PgStat_MsgInquiry *m *** 4853,4873 **** if (req->databaseid != msg->databaseid) continue; - if (msg->cutoff_time > req->request_time) - req->request_time = msg->cutoff_time; return; } /* ! * There's no request for this DB yet, so create one. ! */ ! newreq = palloc(sizeof(DBWriteRequest)); ! ! newreq->databaseid = msg->databaseid; ! newreq->request_time = msg->clock_time; ! slist_push_head(&last_statrequests, &newreq->next); ! ! /* * If the requestor's local clock time is older than stats_timestamp, we * should suspect a clock glitch, ie system time going backwards; though * the more likely explanation is just delayed message receipt. It is --- 4851,4864 ---- if (req->databaseid != msg->databaseid) continue; return; } /* ! * Check to see if we last wrote this database at a time >= the requested ! * cutoff time. If so, this is a stale request that was generated before ! * we updated the DB file, and we don't need to do it again. ! * * If the requestor's local clock time is older than stats_timestamp, we * should suspect a clock glitch, ie system time going backwards; though * the more likely explanation is just delayed message receipt. It is *************** pgstat_recv_inquiry(PgStat_MsgInquiry *m *** 4876,4882 **** * to update the stats file for a long time. */ dbentry = pgstat_get_db_entry(msg->databaseid, false); ! if ((dbentry != NULL) && (msg->clock_time < dbentry->stats_timestamp)) { TimestampTz cur_ts = GetCurrentTimestamp(); --- 4867,4881 ---- * to update the stats file for a long time. */ dbentry = pgstat_get_db_entry(msg->databaseid, false); ! if (dbentry == NULL) ! { ! /* ! * We have no data for this DB! We might choose to do nothing here, ! * but instead enter a request so that at least the global stats get ! * written in response to the inquiry. ! */ ! } ! else if (msg->clock_time < dbentry->stats_timestamp) { TimestampTz cur_ts = GetCurrentTimestamp(); *************** pgstat_recv_inquiry(PgStat_MsgInquiry *m *** 4897,4907 **** writetime, mytime, dbentry->databaseid); pfree(writetime); pfree(mytime); ! ! newreq->request_time = cur_ts; ! dbentry->stats_timestamp = cur_ts - 1; } } } --- 4896,4921 ---- writetime, mytime, dbentry->databaseid); pfree(writetime); pfree(mytime); ! } ! else ! { ! /* No, it's just a stale request, ignore it */ ! return; } } + else if (msg->cutoff_time <= dbentry->stats_timestamp) + { + /* Stale request, ignore it */ + return; + } + + /* + * We need to write this DB, so create a request. + */ + newreq = (DBWriteRequest *) palloc(sizeof(DBWriteRequest)); + + newreq->databaseid = msg->databaseid; + slist_push_head(&last_statrequests, &newreq->next); } *************** pgstat_db_requested(Oid databaseid) *** 5504,5510 **** { slist_iter iter; ! /* Check the databases if they need to refresh the stats. */ slist_foreach(iter, &last_statrequests) { DBWriteRequest *req = slist_container(DBWriteRequest, next, iter.cur); --- 5518,5533 ---- { slist_iter iter; ! /* ! * If any requests are outstanding at all, we should write the stats for ! * shared catalogs (the "database" with OID 0). This ensures that ! * backends will see up-to-date stats for shared catalogs, even though ! * they send inquiry messages mentioning only their own DB. ! */ ! if (databaseid == InvalidOid && !slist_is_empty(&last_statrequests)) ! return true; ! ! /* Search to see if there's an open request to write this database. */ slist_foreach(iter, &last_statrequests) { DBWriteRequest *req = slist_container(DBWriteRequest, next, iter.cur); diff --git a/src/include/pgstat.h b/src/include/pgstat.h index 4be09fe..a078b61 100644 *** a/src/include/pgstat.h --- b/src/include/pgstat.h *************** typedef struct PgStat_MsgDummy *** 219,225 **** /* ---------- * PgStat_MsgInquiry Sent by a backend to ask the collector ! * to write the stats file. * ---------- */ --- 219,238 ---- /* ---------- * PgStat_MsgInquiry Sent by a backend to ask the collector ! * to write the stats file(s). ! * ! * Ordinarily, an inquiry message prompts writing of the global stats file, ! * the stats file for shared catalogs, and the stats file for the specified ! * database. If databaseid is InvalidOid, only the first two are written. ! * ! * New file(s) will be written only if the existing file has a timestamp ! * older than the specified cutoff_time; this prevents duplicated effort ! * when multiple requests arrive at nearly the same time, assuming that ! * backends send requests with cutoff_times a little bit in the past. ! * ! * clock_time should be the requestor's current local time; the collector ! * uses this to check for the system clock going backward, but it has no ! * effect unless that occurs. We assume clock_time >= cutoff_time, though. * ---------- */ *************** typedef struct PgStat_MsgInquiry *** 228,234 **** PgStat_MsgHdr m_hdr; TimestampTz clock_time; /* observed local clock time */ TimestampTz cutoff_time; /* minimum acceptable file timestamp */ ! Oid databaseid; /* requested DB (InvalidOid => all DBs) */ } PgStat_MsgInquiry; --- 241,247 ---- PgStat_MsgHdr m_hdr; TimestampTz clock_time; /* observed local clock time */ TimestampTz cutoff_time; /* minimum acceptable file timestamp */ ! Oid databaseid; /* requested DB (InvalidOid => shared only) */ } PgStat_MsgInquiry;
В списке pgsql-hackers по дате отправления: