Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

From Mengxing Liu
Subject Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
Date
Msg-id 6b51db2e.3a65.15c8619446f.Coremail.liu-mx15@mails.tsinghua.edu.cn
In reply to Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions  (Kevin Grittner <kgrittn@gmail.com>)
List pgsql-hackers

> From: "Kevin Grittner" <kgrittn@gmail.com>
> <liu-mx15@mails.tsinghua.edu.cn> wrote:
> 
> > "vmstat 1" output is as follows. Because I used only 30 cores (1/4 of all), CPU user time on those cores should be about 12*4 = 48%.
> > There seems to be no process blocked by IO.
> >
> > procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
> >  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
> > 28  0      0 981177024 315036 70843760    0    0     0     9    0    0  1  0 99  0  0
> > 21  1      0 981178176 315036 70843784    0    0     0     0 25482 329020 12  3 85  0  0
> > 18  1      0 981179200 315036 70843792    0    0     0     0 26569 323596 12  3 85  0  0
> > 17  0      0 981175424 315036 70843808    0    0     0     0 25374 322992 12  4 85  0  0
> > 12  0      0 981174208 315036 70843824    0    0     0     0 24775 321577 12  3 85  0  0
> >  8  0      0 981179328 315036 70845336    0    0     0     0 13115 199020  6  2 92  0  0
> > 13  0      0 981179200 315036 70845792    0    0     0     0 22893 301373 11  3 87  0  0
> > 11  0      0 981179712 315036 70845808    0    0     0     0 26933 325728 12  4 84  0  0
> > 30  0      0 981178304 315036 70845824    0    0     0     0 23691 315821 11  4 85  0  0
> > 12  1      0 981177600 315036 70845832    0    0     0     0 29485 320166 12  4 84  0  0
> > 32  0      0 981180032 315036 70845848    0    0     0     0 25946 316724 12  4 84  0  0
> > 21  0      0 981176384 315036 70845864    0    0     0     0 24227 321938 12  4 84  0  0
> > 21  0      0 981178880 315036 70845880    0    0     0     0 25174 326943 13  4 83  0  0
> 
> This machine has 120 cores?  Is hyperthreading enabled?  If so, what
> you are showing might represent a total saturation of the 30 cores.
> Context switches of about 300,000 per second is pretty high.  I can't
> think of when I've seen that except when there is high spinlock
> contention.
> 

Yes, the machine has 120 cores, and hyper-threading is disabled.

> Just to put the above in context, how did you limit the test to 30
> cores?  How many connections were open during the test?
> 

I used numactl to limit the test to the first two sockets (15 cores per socket).
There were 90 concurrent connections.
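
To make the scaling in the vmstat numbers explicit, here is a minimal sketch of the arithmetic. It assumes that all of the load runs on the 30 cores selected with numactl and that vmstat reports percentages over all 120 cores; the function names are only illustrative.

```python
def parse_vmstat_row(row):
    """Return (us, sy, idle) from one data row of `vmstat 1` output.

    Assumes the last five columns are us, sy, id, wa, st, as in the
    header shown in the quoted output.
    """
    fields = row.split()
    us, sy, idle, _wa, _st = (int(x) for x in fields[-5:])
    return us, sy, idle


def scale_to_used_cores(percent, total_cores=120, used_cores=30):
    """Rescale a machine-wide percentage to the cores actually in use."""
    return percent * total_cores / used_cores


row = "21  1      0 981178176 315036 70843784    0    0     0     0 25482 329020 12  3 85  0  0"
us, sy, idle = parse_vmstat_row(row)
print("user %:", scale_to_used_cores(us))    # 12 * 4
print("system %:", scale_to_used_cores(sy))  # 3 * 4
```

With us = 12 and sy = 3, this gives about 48% user and 12% system time on the 30 cores, so the cores are busy but not fully saturated under this assumption.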

> > The flame graph is attached. I used 'perf' to generate the flame graph. Only the CPUs running the PG server are profiled.
> > I'm not familiar with the other parts of PG. Can you find anything unusual in the graph?
> 
> Two SSI functions stand out:
> 10.86% PredicateLockTuple
>  3.51% CheckForSerializableConflictIn
> 
> In both cases, most of that seems to go to lightweight locking.  Since
> you said this is a CPU graph, that again suggests spinlock contention
> issues.
> 
> -- 

Yes. Are there any other kinds of locks besides spinlocks? I'm reading about locks in PG now. If all locks were spinlocks,
CPU usage should be 100%. But now only about 50% of the CPU is used.
 
I'm afraid extra time is spent waiting on mutexes or semaphores.
These SSI functions probably cost more time than reported, because perf doesn't record the time spent sleeping
and waiting for locks.
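
As a rough illustration of why an on-CPU profile can under-report lock waits, the sketch below compares wall-clock time with per-thread CPU time for a thread that blocks on a held lock. It is only an analogy in Python, not PostgreSQL's locking code, and `measure_blocked_acquire` is a hypothetical helper name.

```python
import threading
import time


def measure_blocked_acquire(hold_seconds=0.2):
    """Return (wall_seconds, cpu_seconds) a worker spends acquiring a held lock."""
    lock = threading.Lock()
    result = {}
    lock.acquire()  # the main thread holds the lock first

    def worker():
        wall_start = time.perf_counter()
        cpu_start = time.thread_time()
        with lock:  # sleeps here until the main thread releases the lock
            pass
        result["wall"] = time.perf_counter() - wall_start
        result["cpu"] = time.thread_time() - cpu_start

    t = threading.Thread(target=worker)
    t.start()
    time.sleep(hold_seconds)  # keep the lock held while the worker waits
    lock.release()
    t.join()
    return result["wall"], result["cpu"]


wall, cpu = measure_blocked_acquire()
print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```

The worker's wall time is close to the hold time while its CPU time stays near zero. A profiler that samples only running CPUs would attribute almost nothing to this wait.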
 
CheckForSerializableConflictIn takes 10% of the running time (see my last email).

--
Mengxing Liu










