Re: pgbench bug candidate: negative "initial connection time"

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: pgbench bug candidate: negative "initial connection time"
Дата
Msg-id 20210618.172603.409047378314636142.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: pgbench bug candidate: negative "initial connection time"  (Fabien COELHO <coelho@cri.ensmp.fr>)
Ответы Re: pgbench bug candidate: negative "initial connection time"  (Fabien COELHO <coelho@cri.ensmp.fr>)
Список pgsql-hackers
At Thu, 17 Jun 2021 11:52:04 +0200 (CEST), Fabien COELHO <coelho@cri.ensmp.fr> wrote in 
> > Ok. I gave up to change the state in threadRun. Instead, I changed the
> > condition at the end of bench, which enables to report abortion due to
> > socket errors.
> >
> > +@@ -6480,7 +6490,7 @@ main(int argc, char **argv)
> > + #endif                            /* ENABLE_THREAD_SAFETY */
> > +
> > +         for (int j = 0; j < thread->nstate; j++)
> > +-            if (thread->state[j].state == CSTATE_ABORTED)
> > ++            if (thread->state[j].state != CSTATE_FINISHED)
> > +                 exit_code = 2;
> > +
> > +         /* aggregate thread level stats */
> >
> > Does this make sense?
> 
> Yes, definitely.

I sought for a simple way to enforce all client finishes with the
states abort or finished but I didn't find.  So +1 for the
change. However, as a matter of style. if we touch the code maybe we
want to enclose the if statement.

Doing this means we regard any state other than CSTATE_FINISHED as
aborted. So, the current goto's to done in threadRun are effectively
aborting a part or the all clients running on the thread. So for
example the following place:

pgbench.c:6713
>        /* must be something wrong */
>        pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
>        goto done;

Should say such like "thread %d aborted: %s() failed: ...".


I'm not sure what is the consensus here about the case where aborted
client can recoonect to the same server.  This patch doesn't that.  However, I think causing reconnection needs more
workthan accepted as a fix while beta.
 


====

+            /* as the bench is already running, we do not abort */
             pg_log_error("client %d aborted while establishing connection", st->id);
             st->state = CSTATE_ABORTED;

The comment looks strange that it is saying "we don't abort" while
setting the state to CSTATE_ABORT then showing "client %d aborted".


====
   if ((con = doConnect()) == NULL)
+  {
+    pg_log_fatal("connection for initialization failed");
     exit(1);

doConnect() prints an error emssage given from libpq.  So The
additional messaget is redundant.


====
   errno = THREAD_BARRIER_INIT(&barrier, nthreads);
   if (errno != 0)
+  {
     pg_log_fatal("could not initialize barrier: %m");
+    exit(1);

This is a run-time error. Maybe we should return 2 in that case.


===
     if (thread->logfile == NULL)
     {
       pg_log_fatal("could not open logfile \"%s\": %m", logpath);
-      goto done;
+      exit(1);

Maybe we should exit with 2 this case.  If we exit this case, we might
also want to exit when fclose() fails.  (Currently the error of
fclose() is ignored.)



===
+        /* coldly abort on connection failure */
+        pg_log_fatal("cannot create connection for thread %d client %d",
+               thread->tid, i);
+        exit(1);

It seems to me that the "thread %d client %d(not clinent id but the
client index within the thread)" doesn't make sense to users.  Even if
we showed a message like that, it should show only the global clinent
id (cstate->id).

I think that we should return with 2 here but we return with 1
in another place for the same reason..


===
         /* must be something wrong */
-        pg_log_fatal("%s() failed: %m", SOCKET_WAIT_METHOD);
+        pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
         goto done;

Why doesn't a fatal error cause an immediate exit?  (And if we change
this to fatal, we also need to change similar errors in the same
function to fatal.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Refactor ECPGconnect and allow IPv6 connection
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: [PoC] Federated Authn/z with OAUTHBEARER