Re: BufferAlloc: don't take two simultaneous locks

From: Yura Sokolov
Subject: Re: BufferAlloc: don't take two simultaneous locks
Msg-id: 0b35f32057974441f30d5f94aef87e319498c260.camel@postgrespro.ru
In reply to: Re: BufferAlloc: don't take two simultaneous locks  (Yura Sokolov <y.sokolov@postgrespro.ru>)
Replies: Re: BufferAlloc: don't take two simultaneous locks  (Yura Sokolov <y.sokolov@postgrespro.ru>)
List: pgsql-hackers

Good day, hackers.

This is a continuation of the BufferAlloc saga.

This time I've tried to implement the following approach:
- if there's no buffer, insert a placeholder
- then find a victim
- if another backend wants to insert the same buffer, it waits on a
  ConditionVariable.

The patch makes a separate ConditionVariable per backend, and the
placeholder contains the backend id. So waiters don't suffer from
collisions on a partition: they wait for exactly the buffer they need.
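
The protocol described above can be modelled in a few lines. What follows
is a toy Python sketch, not the patch's C code: the names BufferTable,
get_buffer and find_victim are hypothetical, a single lock stands in for
the partition locks, and threads stand in for backends. It only
illustrates the idea "insert a placeholder carrying the backend id, find
the victim without holding the lock, let others wait on that backend's
condition variable".

```python
import threading
import time


class BufferTable:
    """Toy stand-in for the buffer mapping table (one lock, no partitions)."""

    def __init__(self, n_backends):
        self.lock = threading.Lock()  # stands in for a partition lock
        # tag -> ("placeholder", backend_id) or ("buffer", buf_id)
        self.entries = {}
        # One condition variable per backend, all sharing the table lock.
        self.cvs = [threading.Condition(self.lock) for _ in range(n_backends)]
        self.loads = 0  # how many times a victim was actually searched for

    def get_buffer(self, backend_id, tag, find_victim):
        with self.lock:
            while True:
                entry = self.entries.get(tag)
                if entry is None:
                    # No buffer yet: insert a placeholder with our backend id.
                    self.entries[tag] = ("placeholder", backend_id)
                    break
                kind, value = entry
                if kind == "buffer":
                    return value
                # Another backend is loading this tag: wait on *its* CV,
                # then re-check the entry.
                self.cvs[value].wait()
        # Find the victim while holding no table lock.
        buf_id = find_victim(tag)
        with self.lock:
            self.entries[tag] = ("buffer", buf_id)
            self.loads += 1
            self.cvs[backend_id].notify_all()
        return buf_id


def demo(n_backends=8):
    """All backends ask for the same tag; only one should find a victim."""
    table = BufferTable(n_backends)
    results = [None] * n_backends

    def find_victim(tag):
        time.sleep(0.01)  # pretend victim search takes a while
        return 42

    def worker(i):
        results[i] = table.get_buffer(i, "rel/1/block/7", find_victim)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_backends)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.loads, results
```

In this model the waiters never contend with backends working on other
tags, because each of them sleeps on the condition variable of the one
backend that owns the placeholder.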

This patch doesn't contain any dynahash changes, since the order of
operations doesn't change: "insert then delete". So there is no way to
"reserve" an entry.

But it contains changes to ConditionVariable:

- adds ConditionVariableSleepOnce, which doesn't reinsert the process
  back onto the CV's proclist.
  Unlike ConditionVariableSleep, this function cannot be used in a loop,
  and ConditionVariablePrepareSleep must be called before it.

- adds ConditionVariableBroadcastFast, an improvement over the regular
  ConditionVariableBroadcast that wakes processes in batches.
  So CVBroadcastFast doesn't acquire/release the CV's spinlock mutex for
  every proclist entry, but rather once per batch of entries.

  I believe it could safely replace ConditionVariableBroadcast, though
  I haven't yet tried to replace it and check.
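
To illustrate why waking in batches reduces lock traffic, here is a small
Python model. It is only a sketch under stated assumptions: CountingLock,
broadcast_one_by_one and broadcast_batched are hypothetical names, the
batch size of 16 is arbitrary, and the model deliberately ignores the
details of the real ConditionVariableBroadcast. It just counts how often
the lock protecting the wait list is taken.

```python
from collections import deque


class CountingLock:
    """Counts acquisitions; stands in for the CV's spinlock in this model."""

    def __init__(self):
        self.acquisitions = 0

    def __enter__(self):
        self.acquisitions += 1
        return self

    def __exit__(self, *exc):
        return False


def broadcast_one_by_one(waiters, lock, wake):
    """Take the lock once for every wait-list entry that is removed."""
    while True:
        with lock:
            if not waiters:
                return
            proc = waiters.popleft()
        wake(proc)


def broadcast_batched(waiters, lock, wake, batch_size=16):
    """Take the lock once per batch, then wake the batch outside the lock."""
    while True:
        with lock:
            count = min(batch_size, len(waiters))
            batch = [waiters.popleft() for _ in range(count)]
        if not batch:
            return
        for proc in batch:
            wake(proc)


def count_acquisitions(broadcast, n_waiters=100):
    """Run one broadcast over n_waiters entries; return (acquisitions, woken)."""
    lock = CountingLock()
    woken = []
    broadcast(deque(range(n_waiters)), lock, woken.append)
    return lock.acquisitions, woken
```

With 100 waiters, the per-entry variant takes the lock 101 times in this
model (100 removals plus the final empty check), while the batched
variant with a batch size of 16 takes it 8 times.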

Tests:
- tests done on a 2-socket Xeon 5220 2.20GHz with turbo boost disabled
  (i.e. max frequency is 2.20GHz)
- runs on 1 socket or 2 sockets using numactl
- pgbench scale 100 - 1.5GB of data
- shared_buffers : 128MB, 1GB (and 2GB)
- variations of simple_select with 1 key per query, 3 keys per query
  and 10 keys per query.

1 socket 1 key

  conns |  master 128M |     v12 128M |    master 1G |       v12 1G 
--------+--------------+--------------+--------------+--------------
      1 |        25670 |        24926 |        29491 |        28858 
      2 |        50157 |        48894 |        58356 |        57180 
      3 |        75036 |        72904 |        87152 |        84869 
      5 |       124479 |       120720 |       143550 |       140799 
      7 |       168586 |       164277 |       199360 |       195578 
     17 |       319943 |       314010 |       364963 |       358550 
     27 |       423617 |       420528 |       491493 |       485139 
     53 |       491357 |       490994 |       574477 |       571753 
     83 |       487029 |       486750 |       571057 |       566335 
    107 |       478429 |       479862 |       565471 |       560115 
    139 |       467953 |       469981 |       556035 |       551056 
    163 |       459467 |       463272 |       548976 |       543660 
    191 |       448420 |       456105 |       540881 |       534556 
    211 |       440229 |       458712 |       545195 |       535333 
    239 |       431754 |       471373 |       547111 |       552591 
    271 |       421767 |       473479 |       544014 |       557910 
    307 |       408234 |       474285 |       539653 |       556629 
    353 |       389360 |       472491 |       534719 |       554696 
    397 |       377063 |       471513 |       527887 |       554383 

1 socket 3 keys

  conns |  master 128M |     v12 128M |    master 1G |       v12 1G 
--------+--------------+--------------+--------------+--------------
      1 |        15277 |        14917 |        20109 |        19564 
      2 |        29587 |        28892 |        39430 |        36986 
      3 |        44204 |        43198 |        58993 |        57196 
      5 |        71471 |        68703 |        96923 |        92497 
      7 |        98823 |        97823 |       133173 |       130134 
     17 |       201351 |       198865 |       258139 |       254702 
     27 |       254959 |       255503 |       338117 |       339044 
     53 |       277048 |       291923 |       384300 |       390812 
     83 |       251486 |       287247 |       376170 |       385302 
    107 |       232037 |       281922 |       365585 |       380532 
    139 |       210478 |       276544 |       352430 |       373815 
    163 |       193875 |       271842 |       341636 |       368034 
    191 |       179544 |       267033 |       334408 |       362985 
    211 |       172837 |       269329 |       330287 |       366478 
    239 |       162647 |       272046 |       322646 |       371807 
    271 |       153626 |       271423 |       314017 |       371062 
    307 |       144122 |       270540 |       305358 |       370462 
    353 |       129544 |       268239 |       292867 |       368162 
    397 |       123430 |       267112 |       284394 |       366845 
    
1 socket 10 keys

  conns |  master 128M |     v12 128M |    master 1G |       v12 1G 
--------+--------------+--------------+--------------+--------------
      1 |         6824 |         6735 |        10475 |        10220 
      2 |        13037 |        12628 |        20382 |        19849 
      3 |        19416 |        19043 |        30369 |        29554 
      5 |        31756 |        30657 |        49402 |        48614 
      7 |        42794 |        42179 |        67526 |        65071 
     17 |        91443 |        89772 |       139630 |       139929 
     27 |       107751 |       110689 |       165996 |       169955 
     53 |        97128 |       120621 |       157670 |       184382 
     83 |        82344 |       117814 |       142380 |       183863 
    107 |        70764 |       115841 |       134266 |       182426 
    139 |        57561 |       112528 |       125090 |       180121 
    163 |        50490 |       110443 |       119932 |       178453 
    191 |        45143 |       108583 |       114690 |       175899 
    211 |        42375 |       107604 |       111444 |       174109 
    239 |        39861 |       106702 |       106253 |       172410 
    271 |        37398 |       105819 |       102260 |       170792 
    307 |        35279 |       105355 |        97164 |       168313 
    353 |        33427 |       103537 |        91629 |       166232 
    397 |        31778 |       101793 |        87230 |       164381 
    
2 sockets 1 key

  conns |  master 128M |     v12 128M |    master 1G |       v12 1G 
--------+--------------+--------------+--------------+--------------
      1 |        24839 |        24386 |        29246 |        28361 
      2 |        46655 |        45265 |        55942 |        54327 
      3 |        69278 |        68332 |        83984 |        81608 
      5 |       115263 |       112746 |       139012 |       135426 
      7 |       159881 |       155119 |       193846 |       188399 
     17 |       373808 |       365085 |       456463 |       441603 
     27 |       503663 |       495443 |       600335 |       584741 
     53 |       708849 |       744274 |       900923 |       908488 
     83 |       593053 |       862003 |       985953 |      1038033 
    107 |       431806 |       875704 |       957115 |      1075172 
    139 |       328380 |       879890 |       881652 |      1069872 
    163 |       288339 |       874792 |       824619 |      1064047 
    191 |       255666 |       870532 |       790583 |      1061124 
    211 |       241230 |       865975 |       764898 |      1058473 
    239 |       227344 |       857825 |       732353 |      1049745 
    271 |       216095 |       848240 |       703729 |      1043182 
    307 |       206978 |       833980 |       674711 |      1031533 
    353 |       198426 |       803830 |       633783 |      1018479 
    397 |       191617 |       744466 |       599170 |      1006134 
    
2 sockets 3 keys

  conns |  master 128M |     v12 128M |    master 1G |       v12 1G 
--------+--------------+--------------+--------------+--------------
      1 |        14688 |        14088 |        18912 |        18905 
      2 |        26759 |        25925 |        36817 |        35924 
      3 |        40002 |        38658 |        54765 |        53266 
      5 |        63479 |        63041 |        90521 |        87496 
      7 |        88561 |        87101 |       123425 |       121877 
     17 |       199411 |       196932 |       289555 |       282146 
     27 |       270121 |       275950 |       386884 |       383019
     53 |       202918 |       374848 |       395967 |       501648 
     83 |       149599 |       363623 |       335815 |       478628 
    107 |       126501 |       348125 |       311617 |       472473 
    139 |       106091 |       331350 |       279843 |       466408 
    163 |        95497 |       321978 |       260884 |       461688 
    191 |        87427 |       312815 |       241189 |       458252 
    211 |        82783 |       307261 |       231435 |       454327 
    239 |        78930 |       299661 |       219655 |       451826 
    271 |        74081 |       294233 |       211555 |       448412 
    307 |        71352 |       288133 |       202838 |       446143 
    353 |        67872 |       279948 |       193354 |       441929 
    397 |        66178 |       275784 |       185556 |       438330 

2 sockets 10 keys

  conns |  master 128M |     v12 128M |    master 1G |       v12 1G 
--------+--------------+--------------+--------------+--------------
      1 |         6200 |         6108 |        10163 |         9563 
      2 |        11196 |        10871 |        18373 |        17827 
      3 |        16479 |        16129 |        26807 |        26584 
      5 |        26750 |        26241 |        44291 |        43409 
      7 |        36501 |        35433 |        60508 |        59379 
     17 |        77320 |        77451 |       130413 |       128452 
     27 |        91833 |       105643 |       147259 |       156833 
     53 |        57138 |       115793 |       119306 |       150647 
     83 |        44435 |       108850 |       105454 |       148006 
    107 |        38031 |       105199 |        95108 |       146162 
    139 |        31697 |       101096 |        84011 |       143281 
    163 |        28826 |        98255 |        78411 |       141375 
    191 |        26223 |        96224 |        74256 |       139646 
    211 |        24933 |        94815 |        71542 |       137834 
    239 |        23626 |        92849 |        69289 |       137235 
    271 |        22664 |        90938 |        66431 |       136080 
    307 |        21691 |        89358 |        64661 |       133166 
    353 |        20712 |        88239 |        61619 |       133339 
    397 |        20374 |        86708 |        58937 |       130684 

Well, as you can see, there is some regression at low connection counts.
I don't understand where it comes from.

Moreover, it is present even with 2GB shared buffers, when all data
fits into the buffer cache and the new code isn't exercised at all.
(Apart from this incomprehensible regression, there's no difference in
 performance with 2GB shared buffers.)

For example, 2GB shared buffers, 1 socket, 3 keys:
  conns |    master 2G |       v12 2G 
--------+--------------+--------------
      1 |        23491 |        22621 
      2 |        46436 |        44851 
      3 |        69265 |        66844 
      5 |       112432 |       108801 
      7 |       158859 |       150247 
     17 |       297600 |       291605 
     27 |       390041 |       384590 
     53 |       448384 |       447588 
     83 |       445582 |       442048 
    107 |       440544 |       438200 
    139 |       433893 |       430818 
    163 |       427436 |       424182 
    191 |       420854 |       417045 
    211 |       417228 |       413456 
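
To put a number on the regression, the relative difference between the
two columns can be computed directly. The snippet below is a small helper
sketch (relative_change is a hypothetical name) using the rows of the 2GB
table above; it shows a gap of a few percent up to 27 connections and
under one percent from 53 connections on.

```python
# (conns, master 2G, v12 2G) rows copied from the 2GB table above
rows = [
    (1, 23491, 22621),
    (2, 46436, 44851),
    (3, 69265, 66844),
    (5, 112432, 108801),
    (7, 158859, 150247),
    (17, 297600, 291605),
    (27, 390041, 384590),
    (53, 448384, 447588),
    (83, 445582, 442048),
    (107, 440544, 438200),
    (139, 433893, 430818),
    (163, 427436, 424182),
    (191, 420854, 417045),
    (211, 417228, 413456),
]


def relative_change(master, patched):
    """Percentage change of the patched result relative to master."""
    return (patched - master) / master * 100.0


for conns, master, v12 in rows:
    print(f"{conns:>7} | {relative_change(master, v12):+6.2f}%")
```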

Perhaps something changes in the memory layout due to the array of CVs,
or the compiler lays out/optimizes functions differently. I can't find
the reason ;-( I would appreciate help with this.


regards

---

Yura Sokolov
