Re: Sync Rep v17

Поиск
Список
Период
Сортировка
От Yeb Havinga
Тема Re: Sync Rep v17
Дата
Msg-id 4D67CDC4.3010203@gmail.com
обсуждение исходный текст
Ответ на Re: Sync Rep v17  (Jaime Casanova <jaime@2ndquadrant.com>)
Ответы Re: Sync Rep v17  (Jaime Casanova <jaime@2ndquadrant.com>)
Re: Sync Rep v17  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Sync Rep v17  (Simon Riggs <simon@2ndQuadrant.com>)
Список pgsql-hackers
On 2011-02-22 20:43, Jaime Casanova wrote:
>
> you can make this happen more easily, i just run "pgbench -n -c10 -j10
> test" and qot that warning and sometimes a segmentation fault and
> sometimes a failed assertion
>
> and the problematic code starts at
> src/backend/replication/syncrep.c:277, here my suggestions on that
> code.
> still i get a failed assertion because of the second Assert (i think
> we should just remove that one)
>
> *************** SyncRepRemoveFromQueue(void)
> *** 288,299 ****
>
>                          if (proc->lwWaitLink == NULL)
>                                  elog(WARNING, "could not locate
> ourselves on wait queue");
> !                       proc = proc->lwWaitLink;
>                  }
>
>                  if (proc->lwWaitLink == NULL)   /* At tail */
>                  {
> !                       Assert(proc == MyProc);
>                          /* Remove ourselves from tail of queue */
>                          Assert(queue->tail == MyProc);
>                          queue->tail = proc;
> --- 288,300 ----
>
>                          if (proc->lwWaitLink == NULL)
>                                  elog(WARNING, "could not locate
> ourselves on wait queue");
> !                       else
> !                               proc = proc->lwWaitLink;
>                  }
>
>                  if (proc->lwWaitLink == NULL)   /* At tail */
>                  {
> !                       Assert(proc != MyProc);
>                          /* Remove ourselves from tail of queue */
>                          Assert(queue->tail == MyProc);
>                          queue->tail = proc;
>
I also did some initial testing on this patch and got the queue related 
errors with > 1 clients. With the code change from Jaime above I still 
got a lot of 'not on queue warnings'.

I tried to understand how the queue was supposed to work - resulting in 
the changes below that also incorporates a suggestion from Fujii 
upthread, to early exit when myproc was found.

With the changes below all seems to work without warnings. I now see 
that the note about the list invariant is too short, better was: "if 
queue length = 1 then head = tail"

--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -274,6 +274,8 @@ SyncRepRemoveFromQueue(void)        }        else        {
+               bool found = false;
+                while (proc->lwWaitLink != NULL)                {                        /* Are we the next proc in
ourtraversal of the 
 
queue? */
@@ -284,17 +286,19 @@ SyncRepRemoveFromQueue(void)                                 * No need to touch head or tail.
                           */                                proc->lwWaitLink = MyProc->lwWaitLink;
 
+                               found = true;
+                               break;                        }

-                       if (proc->lwWaitLink == NULL)
-                               elog(WARNING, "could not locate 
ourselves on wait queue");                        proc = proc->lwWaitLink;                }
+               if (!found)
+                       elog(WARNING, "could not locate ourselves on 
wait queue");

-               if (proc->lwWaitLink == NULL)   /* At tail */
+               /* If MyProc was removed from the tail, maintain list 
invariant head==tail */
+               if (proc->lwWaitLink == NULL)                {
-                       Assert(proc == MyProc);
-                       /* Remove ourselves from tail of queue */
+                       Assert(proc != MyProc); /* impossible since that 
is the head=MyProc branch above */                        Assert(queue->tail == MyProc);
queue->tail= proc;                        proc->lwWaitLink = NULL;
 

I needed to add this to make the documentation compile

--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2010,6 +2010,9 @@ SET ENABLE_SEQSCAN TO OFF;         You should also consider setting
<varname>hot_standby_feedback</>        as an alternative to using this parameter.
 
</para>
+ </listitem>
+ </varlistentry>
+ </variablelist></sect2>

<sect2 id="runtime-config-sync-rep">

regards,
Yeb Havinga



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Kevin Grittner"
Дата:
Сообщение: Re: wCTE: why not finish sub-updates at the end, not the beginning?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: wCTE: why not finish sub-updates at the end, not the beginning?