[HACKERS] Re: [GSOC][weekly report 9] Eliminate O(N^2) scaling fromrw-conflict tracking in serializable transactions

Поиск
Список
Период
Сортировка
От Mengxing Liu
Тема [HACKERS] Re: [GSOC][weekly report 9] Eliminate O(N^2) scaling fromrw-conflict tracking in serializable transactions
Дата
Msg-id 619f23c1.5ee5.15dc0c4c09b.Coremail.liu-mx15@mails.tsinghua.edu.cn
обсуждение исходный текст
Ответ на [HACKERS] Re: [GSOC][weekly report 9] Eliminate O(N^2) scaling fromrw-conflict tracking in serializable transactions  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
> From: "Alvaro Herrera" <alvherre@2ndquadrant.com>

> * I wonder why did you settle on a skip list as the best optimization
>   path for this.  Did you consider other data structures?  (We don't
>   seem to have much that would be usable in shared memory, I fear.)
>
There are three typical alternative data structures: 1) unordered linked list, with O(N) searching and O(1)
inserting/removing. 
2) tree-like data structures,  with O(log(N)) inserting/searching/removing. Skip list just likes a tree.
3) hash table , with O(1) inserting/searching.  But constant is much bigger than linked list.
Any data structures can be classified into one of data structures above.

In PG serializable module,  number of conflicts is much smaller than we excepted, which means N is small.
So we should be very careful about constants before Big O notation.
Sometimes O(N) complexity of linked list is faster than O(1) complexity of hash table.
For example, when N is smaller than 5, HTAB is slower than SHM_QUEUE when searching an element.
And in all cases, remove operation of hash table is slower than that of linked list,
which would result in more serious problem: more contention on SeriliazableFinishedListLock.

Skip list is better than HTAB because its remove operation has O(1) complexity.
I think any other tree-like data structures won't do better.

Shared memory is also the reason why I chose hash table and skip list at the beginning.
It's quite difficult to develop a totally new data structure in shared memory,
while there were well tested hash table and linked list code already.

> * I wonder if a completely different approach to the finished xact
>   maintenance problem would be helpful.  For instance, in
>   ReleasePredicateLocks we call ClearOldPredicateLocks()
>   inconditionally, and there we grab the SerializableFinishedListLock
>   lock inconditionally for the whole duration of that routine, but
>   perhaps it would be better to skip the whole ClearOld stuff if the
>   SerializableFinishedListLock is not available?  Find out some way to
>   ensure that the cleanup is always called later on, but maybe skipping
>   it once in a while improves overall performance.
>

Yes, that sounds nice. Actually, for a special backend worker, it's OK to skip the ClearOldPredicateLocks.
Because other workers will do the clean job later. I'll try it later.

> * the whole predicate.c stuff is written using SHM_QUEUE.  I suppose
>   SHM_QUEUE works just fine, but predicate.c was being written at about
>   the same time (or a bit earlier) than the newer ilist.c interface was
>   being created, which I think had more optimization work thrown in.
>   Maybe it would be good for predicate.c to ditch use of SHM_QUEUE and
>   use ilist.c interfaces instead?  We could even think about being less
>   strict about holding exclusive lock on SerializableFinished for the
>   duration of ClearOldPredicateLocks, i.e. use only a share lock and
>   only exchange for exclusive if a list modification is needed.
>

I read the code in ilist.c but I didn't see too much difference with SHM_QUEUE.
Anyway, I'll do a small test to compare the performance of these two data structures.

> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sincerely


Mengxing Liu











В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: [HACKERS] why not parallel seq scan for slow functions