Re: pgbench bug candidate: negative "initial connection time"
| От | Kyotaro Horiguchi |
|---|---|
| Тема | Re: pgbench bug candidate: negative "initial connection time" |
| Дата | |
| Msg-id | 20210618.172603.409047378314636142.horikyota.ntt@gmail.com обсуждение исходный текст |
| Ответ на | Re: pgbench bug candidate: negative "initial connection time" (Fabien COELHO <coelho@cri.ensmp.fr>) |
| Ответы |
Re: pgbench bug candidate: negative "initial connection time"
|
| Список | pgsql-hackers |
At Thu, 17 Jun 2021 11:52:04 +0200 (CEST), Fabien COELHO <coelho@cri.ensmp.fr> wrote in
> > Ok. I gave up to change the state in threadRun. Instead, I changed the
> > condition at the end of bench, which enables to report abortion due to
> > socket errors.
> >
> > +@@ -6480,7 +6490,7 @@ main(int argc, char **argv)
> > + #endif /* ENABLE_THREAD_SAFETY */
> > +
> > + for (int j = 0; j < thread->nstate; j++)
> > +- if (thread->state[j].state == CSTATE_ABORTED)
> > ++ if (thread->state[j].state != CSTATE_FINISHED)
> > + exit_code = 2;
> > +
> > + /* aggregate thread level stats */
> >
> > Does this make sense?
>
> Yes, definitely.
I sought for a simple way to enforce all client finishes with the
states abort or finished but I didn't find. So +1 for the
change. However, as a matter of style. if we touch the code maybe we
want to enclose the if statement.
Doing this means we regard any state other than CSTATE_FINISHED as
aborted. So, the current goto's to done in threadRun are effectively
aborting a part or the all clients running on the thread. So for
example the following place:
pgbench.c:6713
> /* must be something wrong */
> pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
> goto done;
Should say such like "thread %d aborted: %s() failed: ...".
I'm not sure what is the consensus here about the case where aborted
client can recoonect to the same server. This patch doesn't that. However, I think causing reconnection needs more
workthan accepted as a fix while beta.
====
+ /* as the bench is already running, we do not abort */
pg_log_error("client %d aborted while establishing connection", st->id);
st->state = CSTATE_ABORTED;
The comment looks strange that it is saying "we don't abort" while
setting the state to CSTATE_ABORT then showing "client %d aborted".
====
if ((con = doConnect()) == NULL)
+ {
+ pg_log_fatal("connection for initialization failed");
exit(1);
doConnect() prints an error emssage given from libpq. So The
additional messaget is redundant.
====
errno = THREAD_BARRIER_INIT(&barrier, nthreads);
if (errno != 0)
+ {
pg_log_fatal("could not initialize barrier: %m");
+ exit(1);
This is a run-time error. Maybe we should return 2 in that case.
===
if (thread->logfile == NULL)
{
pg_log_fatal("could not open logfile \"%s\": %m", logpath);
- goto done;
+ exit(1);
Maybe we should exit with 2 this case. If we exit this case, we might
also want to exit when fclose() fails. (Currently the error of
fclose() is ignored.)
===
+ /* coldly abort on connection failure */
+ pg_log_fatal("cannot create connection for thread %d client %d",
+ thread->tid, i);
+ exit(1);
It seems to me that the "thread %d client %d(not clinent id but the
client index within the thread)" doesn't make sense to users. Even if
we showed a message like that, it should show only the global clinent
id (cstate->id).
I think that we should return with 2 here but we return with 1
in another place for the same reason..
===
/* must be something wrong */
- pg_log_fatal("%s() failed: %m", SOCKET_WAIT_METHOD);
+ pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
goto done;
Why doesn't a fatal error cause an immediate exit? (And if we change
this to fatal, we also need to change similar errors in the same
function to fatal.)
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
В списке pgsql-hackers по дате отправления: