Buffer Allocation Concurrency Limits

From: Jason Petersen
Subject: Buffer Allocation Concurrency Limits
Message-ID: 7359EE56-1AEF-4C37-9818-0BB58EC72C5C@citusdata.com
Replies: Re: Buffer Allocation Concurrency Limits (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
In December, Metin (a coworker of mine) discussed an inability to scale a simple task (parallel scans of many independent tables) to many cores (it’s here). As a ramp-up task at Citus, I was asked to figure out what the heck was going on.

I have a pretty extensive writeup here (whose length is more a result of my inexperience with the workings of PostgreSQL than anything else) and was looking for some feedback.

In short, my conclusion is that a working set larger than memory results in backends piling up on BufFreelistLock. As much as possible I removed anything that could be blamed for this:

  • Hyper-Threading is disabled
  • zone reclaim mode is disabled
  • numactl was used to ensure interleaved allocation
  • kernel.sched_migration_cost was set high to effectively disable migration
  • kernel.sched_autogroup_enabled was disabled
  • transparent hugepage support was disabled

For a way forward, I was thinking the buffer allocation sections could use some of the atomics Andres added here. Rather than workers grabbing BufFreelistLock to iterate the clock hand until they find a victim, the algorithm could be rewritten in a lock-free style, allowing workers to move the clock hand in tandem.
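
To make the lock-free idea concrete, here is a rough sketch using C11 atomics as a stand-in for the PostgreSQL atomics mentioned above. It is not PostgreSQL code: the names (SketchBufferDesc, sketch_clock_tick, sketch_get_victim), the pool size, and the split into separate usage and pin counters are all assumptions made for illustration. Each worker advances the hand with a single fetch-and-add, so no two workers inspect the same tick, and a victim is claimed with a compare-and-swap on its pin count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NBUFFERS  8   /* hypothetical pool size */
#define MAX_USAGE 5   /* hypothetical cap on the usage counter */

static_assert(NBUFFERS > 0, "pool must not be empty");

/* Hypothetical descriptor: only the fields the sweep needs. */
typedef struct {
    atomic_uint usage_count;  /* clock-sweep popularity counter */
    atomic_uint refcount;     /* nonzero while some worker has it pinned */
} SketchBufferDesc;

static SketchBufferDesc buffers[NBUFFERS];
static atomic_uint_fast64_t clock_hand;   /* shared, monotonically increasing */

/* Advance the hand by one tick without a lock and return the buffer
 * index the hand pointed at before the increment. */
uint32_t sketch_clock_tick(void)
{
    uint_fast64_t ticket = atomic_fetch_add(&clock_hand, 1);
    return (uint32_t) (ticket % NBUFFERS);
}

/* Sweep until a victim is claimed. Returns its index (pinned by the
 * caller), or -1 if nothing could be claimed within a bounded number
 * of ticks, e.g. because every buffer is pinned. */
int sketch_get_victim(void)
{
    for (int tries = NBUFFERS * (MAX_USAGE + 1); tries > 0; tries--) {
        uint32_t idx = sketch_clock_tick();
        SketchBufferDesc *buf = &buffers[idx];

        if (atomic_load(&buf->refcount) != 0)
            continue;                       /* pinned: skip it */

        unsigned usage = atomic_load(&buf->usage_count);
        if (usage > 0) {
            /* Give the buffer one less chance. If the CAS fails, another
             * worker changed the counter first; just move on. */
            atomic_compare_exchange_strong(&buf->usage_count, &usage, usage - 1);
            continue;
        }

        /* usage_count is zero: try to claim it by pinning 0 -> 1. */
        unsigned expected = 0;
        if (atomic_compare_exchange_strong(&buf->refcount, &expected, 1))
            return (int) idx;
    }
    return -1;
}
```

The sketch leaves a window between reading usage_count and pinning, during which another worker could touch the buffer; a real implementation would presumably have to close that gap, for instance by keeping both counters in one atomic word.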

Alternatively, the clock iteration could be moved off to a background process, similar to what Amit Kapila proposed here.
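
A rough sketch of that alternative, again with made-up names (BgBuffer, bg_refill_freelist, backend_get_buffer) and sizes rather than anything taken from PostgreSQL or from Amit's patch: a single reclaimer runs the clock sweep and keeps a small freelist topped up, so a backend only holds the lock long enough to pop one entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define POOL_SIZE       8   /* hypothetical pool size */
#define FREELIST_TARGET 3   /* hypothetical high-water mark */

typedef struct {
    atomic_uint usage_count;  /* bumped by backends on access */
    atomic_bool reclaimed;    /* true while the buffer sits on the freelist */
} BgBuffer;

static BgBuffer pool[POOL_SIZE];
static int freelist[POOL_SIZE];
static int freelist_len;                  /* protected by freelist_lock */
static unsigned sweep_hand;               /* touched only by the reclaimer */
static pthread_mutex_t freelist_lock = PTHREAD_MUTEX_INITIALIZER;

/* Push one index unless the list is already at its target size. */
static bool freelist_push(int idx)
{
    bool pushed = false;

    assert(idx >= 0 && idx < POOL_SIZE);
    pthread_mutex_lock(&freelist_lock);
    if (freelist_len < FREELIST_TARGET) {
        freelist[freelist_len++] = idx;
        pushed = true;
    }
    pthread_mutex_unlock(&freelist_lock);
    return pushed;
}

/* Reclaimer side: one bounded sweep of the clock. The lock is taken only
 * for each push, never while scanning. Returns how many buffers were added. */
int bg_refill_freelist(void)
{
    int added = 0;

    for (int ticks = 0; ticks < POOL_SIZE; ticks++) {
        int idx = (int) sweep_hand;
        BgBuffer *buf = &pool[idx];

        sweep_hand = (sweep_hand + 1) % POOL_SIZE;

        if (atomic_load(&buf->reclaimed))
            continue;                         /* already on the freelist */
        if (atomic_load(&buf->usage_count) > 0) {
            /* Only the reclaimer decrements, so this cannot underflow. */
            atomic_fetch_sub(&buf->usage_count, 1);
            continue;
        }
        atomic_store(&buf->reclaimed, true);
        if (!freelist_push(idx)) {
            atomic_store(&buf->reclaimed, false);
            break;                            /* list is full: stop sweeping */
        }
        added++;
    }
    return added;
}

/* Backend side: constant-time pop. Returns -1 if the list is empty. */
int backend_get_buffer(void)
{
    int idx = -1;

    pthread_mutex_lock(&freelist_lock);
    if (freelist_len > 0)
        idx = freelist[--freelist_len];
    pthread_mutex_unlock(&freelist_lock);

    if (idx >= 0)
        atomic_store(&pool[idx].reclaimed, false);
    return idx;
}
```

The sketch ignores what happens when a backend touches a buffer that is already on the freelist, and what a backend should do when the list is empty; both would need an answer before this could be more than an illustration.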

Is this assessment accurate? I know 9.4 changes a lot about lock organization, but last I looked I didn’t see anything that could alleviate this contention: are there any plans to address this?

—Jason
