Re: pgbench bug candidate: negative "initial connection time"

Поиск
Список
Период
Сортировка
От Yugo NAGATA
Тема Re: pgbench bug candidate: negative "initial connection time"
Дата
Msg-id 20210614170518.5c0fb098a6f9a159d920e326@sraoss.co.jp
обсуждение исходный текст
Ответ на pgbench bug candidate: negative "initial connection time"  ("kuroda.hayato@fujitsu.com" <kuroda.hayato@fujitsu.com>)
Список pgsql-hackers
Hello Kuroda-san,

On Fri, 11 Jun 2021 08:58:45 +0000
"kuroda.hayato@fujitsu.com" <kuroda.hayato@fujitsu.com> wrote:

> Hi Hackers,
> 
> I played pgbench with wrong parameters, and I found bug-candidate.
> The latest commit in my source is 3a09d75.
> 
> 1. Do initdb and start.
> 2. Initialize schema and data with "scale factor" = 1.
> 3. execute following command many times:
> 
> $ pgbench -c 101 -j 10 postgres
> 
> Then, sometimes the negative " initial connection time" was returned.
> Lateyncy average is also strange.
> 
> ```
> $ pgbench -c 101 -j 10 postgres
> starting vacuum...end.
> pgbench: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL:  sorry, too many clients already
> pgbench (PostgreSQL) 14.0
> transaction type: <builtin: TPC-B (sort of)>
> scaling factor: 1
> query mode: simple
> number of clients: 101
> number of threads: 10
> number of transactions per client: 10
> number of transactions actually processed: 910/1010
> latency average = 41387686.662 ms
> initial connection time = -372896921.586 ms
> tps = 0.002440 (without initial connection time)
> ```
> 
> I sought pgbench.c and found a reason.
> When a thread failed to get some connections, they do not fill any values to thread->bench_start in threadRun().
> And if the failure is caused in the final thread (this means threads[nthreads - 1]->bench_start is zero),
> the following if-statement sets bench_start to zero.
> 
> ```
>    6494                 /* first recorded benchmarking start time */
>    6495                 if (bench_start == 0 || thread->bench_start < bench_start)
>    6496                         bench_start = thread->bench_start;
> ```
> 
> The wrong bench_start propagates to printResult() and then some invalid values are appered.
> 
> ```
>    6509         printResults(&stats, pg_time_now() - bench_start, conn_total_duration,
>    6510                                  bench_start - start_time, latency_late);
> ```
> 
> I cannot distinguish whether we have to fix it, but I attache the patch.
> This simply ignores a result when therad->bench_start is zero.


 +        /* skip if the thread faild to get connection */
+        if (thread->bench_start == 0)
+            continue;

It detects if a thread failed to get the initial connection by thread->bench_start == 0, but this assumes the initial
valueis zero. For ensuring this, I think it is better to initialize it in an early state, for example like this.
 

@@ -6419,6 +6419,7 @@ main(int argc, char **argv)
                initRandomState(&thread->ts_throttle_rs);
                initRandomState(&thread->ts_sample_rs);
                thread->logfile = NULL; /* filled in later */
+               thread->bench_start = 0;
                thread->latency_late = 0;
                initStats(&thread->stats, 0)

typo: faild -> failed

Regards,
Yugo Nagata
-- 
Yugo NAGATA <nagata@sraoss.co.jp>



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Yugo NAGATA
Дата:
Сообщение: Re: Avoid stuck of pbgench due to skipped transactions
Следующее
От: Yugo NAGATA
Дата:
Сообщение: Re: pgbench bug candidate: negative "initial connection time"