Re: inefficient loop in StandbyReleaseLockList()

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: inefficient loop in StandbyReleaseLockList()
Дата
Msg-id 1311319.1635804078@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: inefficient loop in StandbyReleaseLockList()  ("Bossart, Nathan" <bossartn@amazon.com>)
Ответы Re: inefficient loop in StandbyReleaseLockList()  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Список pgsql-hackers
"Bossart, Nathan" <bossartn@amazon.com> writes:
> On 10/31/21, 1:55 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> 2. I think we almost certainly have a problem in SyncPostCheckpoint.

> This one doesn't look as straightforward.  It looks like we might need
> a list_delete_first_n() to delete the first N entries all at once to
> improve this one.

Yeah.  We don't absolutely need a new list primitive: we could use
list_copy_tail() and then free the old list.  But the extra palloc
traffic involved makes this sound like a bad idea.  It does have
the advantage that we could shorten the List's storage even when
it doesn't go to empty, but I'm not sure that's worth anything.
If the List isn't going to empty, that implies that we're getting
a steady stream of unlink requests, meaning we'd probably just
fill it up again.

The minimum-change patch would have us truncating the list before
each AbsorbSyncRequests call, so that the list state meets that
function's expectations.  However, as long as UNLINKS_PER_ABSORB
is only 10, I don't think that gets us out of the O(N^2) woods.
So what I did in the attached is add a "canceled" flag to
PendingUnlinkEntry, which lets us deal with canceled or finished
entries without having to delete them from the list right away.
Then we only need to physically clean up the list once per
SyncPostCheckpoint call.

            regards, tom lane

diff --git a/src/backend/nodes/list.c b/src/backend/nodes/list.c
index 94fb236daf..12847e35ea 100644
--- a/src/backend/nodes/list.c
+++ b/src/backend/nodes/list.c
@@ -906,6 +906,72 @@ list_delete_last(List *list)
     return list_truncate(list, list_length(list) - 1);
 }

+/*
+ * Delete the first N cells of the list.
+ *
+ * The List is pfree'd if the request causes all cells to be deleted.
+ */
+List *
+list_delete_first_n(List *list, int n)
+{
+    check_list_invariants(list);
+
+    /* No-op request? */
+    if (n <= 0)
+        return list;
+
+    /* Delete whole list? */
+    if (n >= list_length(list))
+    {
+        list_free(list);
+        return NIL;
+    }
+
+    /*
+     * Otherwise, we normally just collapse out the removed elements.  But for
+     * debugging purposes, move the whole list contents someplace else.
+     *
+     * (Note that we *must* keep the contents in the same memory context.)
+     */
+#ifndef DEBUG_LIST_MEMORY_USAGE
+    memmove(&list->elements[0], &list->elements[n],
+            (list->length - n) * sizeof(ListCell));
+    list->length -= n;
+#else
+    {
+        ListCell   *newelems;
+        int            newmaxlen = list->length - n;
+
+        newelems = (ListCell *)
+            MemoryContextAlloc(GetMemoryChunkContext(list),
+                               newmaxlen * sizeof(ListCell));
+        memcpy(newelems, &list->elements[n], newmaxlen * sizeof(ListCell));
+        if (list->elements != list->initial_elements)
+            pfree(list->elements);
+        else
+        {
+            /*
+             * As in enlarge_list(), clear the initial_elements[] space and/or
+             * mark it inaccessible.
+             */
+#ifdef CLOBBER_FREED_MEMORY
+            wipe_mem(list->initial_elements,
+                     list->max_length * sizeof(ListCell));
+#else
+            VALGRIND_MAKE_MEM_NOACCESS(list->initial_elements,
+                                       list->max_length * sizeof(ListCell));
+#endif
+        }
+        list->elements = newelems;
+        list->max_length = newmaxlen;
+        list->length = newmaxlen;
+        check_list_invariants(list);
+    }
+#endif
+
+    return list;
+}
+
 /*
  * Generate the union of two lists. This is calculated by copying
  * list1 via list_copy(), then adding to it all the members of list2
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index 4a2ed414b0..023d7413e2 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -69,6 +69,7 @@ typedef struct
 {
     FileTag        tag;            /* identifies handler and file */
     CycleCtr    cycle_ctr;        /* checkpoint_cycle_ctr when request was made */
+    bool        canceled;        /* true if request has been canceled */
 } PendingUnlinkEntry;

 static HTAB *pendingOps = NULL;
@@ -195,11 +196,12 @@ void
 SyncPostCheckpoint(void)
 {
     int            absorb_counter;
+    ListCell   *lc;

     absorb_counter = UNLINKS_PER_ABSORB;
-    while (pendingUnlinks != NIL)
+    foreach(lc, pendingUnlinks)
     {
-        PendingUnlinkEntry *entry = (PendingUnlinkEntry *) linitial(pendingUnlinks);
+        PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
         char        path[MAXPGPATH];

         /*
@@ -214,6 +216,10 @@ SyncPostCheckpoint(void)
         if (entry->cycle_ctr == checkpoint_cycle_ctr)
             break;

+        /* Skip over any canceled entries */
+        if (entry->canceled)
+            continue;
+
         /* Unlink the file */
         if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
                                                           path) < 0)
@@ -231,15 +237,13 @@ SyncPostCheckpoint(void)
                          errmsg("could not remove file \"%s\": %m", path)));
         }

-        /* And remove the list entry */
-        pendingUnlinks = list_delete_first(pendingUnlinks);
-        pfree(entry);
+        /* Mark the list entry as canceled, just in case */
+        entry->canceled = true;

         /*
          * As in ProcessSyncRequests, we don't want to stop absorbing fsync
          * requests for a long time when there are many deletions to be done.
-         * We can safely call AbsorbSyncRequests() at this point in the loop
-         * (note it might try to delete list entries).
+         * We can safely call AbsorbSyncRequests() at this point in the loop.
          */
         if (--absorb_counter <= 0)
         {
@@ -247,6 +251,26 @@ SyncPostCheckpoint(void)
             absorb_counter = UNLINKS_PER_ABSORB;
         }
     }
+
+    /*
+     * If we reached the end of the list, we can just remove the whole list
+     * (remembering to pfree all the PendingUnlinkEntry objects).  Otherwise,
+     * we must keep the entries at or after "lc".
+     */
+    if (lc == NULL)
+    {
+        list_free_deep(pendingUnlinks);
+        pendingUnlinks = NIL;
+    }
+    else
+    {
+        int            ntodelete = list_cell_number(pendingUnlinks, lc);
+
+        for (int i = 0; i < ntodelete; i++)
+            pfree(list_nth(pendingUnlinks, i));
+
+        pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
+    }
 }

 /*
@@ -486,17 +510,14 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
                 entry->canceled = true;
         }

-        /* Remove matching unlink requests */
+        /* Cancel matching unlink requests */
         foreach(cell, pendingUnlinks)
         {
             PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(cell);

             if (entry->tag.handler == ftag->handler &&
                 syncsw[ftag->handler].sync_filetagmatches(ftag, &entry->tag))
-            {
-                pendingUnlinks = foreach_delete_current(pendingUnlinks, cell);
-                pfree(entry);
-            }
+                entry->canceled = true;
         }
     }
     else if (type == SYNC_UNLINK_REQUEST)
@@ -508,6 +529,7 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
         entry = palloc(sizeof(PendingUnlinkEntry));
         entry->tag = *ftag;
         entry->cycle_ctr = checkpoint_cycle_ctr;
+        entry->canceled = false;

         pendingUnlinks = lappend(pendingUnlinks, entry);

diff --git a/src/include/nodes/pg_list.h b/src/include/nodes/pg_list.h
index 30f98c4595..c3f47db888 100644
--- a/src/include/nodes/pg_list.h
+++ b/src/include/nodes/pg_list.h
@@ -564,6 +564,7 @@ extern pg_nodiscard List *list_delete_int(List *list, int datum);
 extern pg_nodiscard List *list_delete_oid(List *list, Oid datum);
 extern pg_nodiscard List *list_delete_first(List *list);
 extern pg_nodiscard List *list_delete_last(List *list);
+extern pg_nodiscard List *list_delete_first_n(List *list, int n);
 extern pg_nodiscard List *list_delete_nth_cell(List *list, int n);
 extern pg_nodiscard List *list_delete_cell(List *list, ListCell *cell);


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Fix C4819 warning in MSVC
Следующее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Missing include in be-secure-openssl.c?