Re: Patch: ResourceOwner optimization for tables with many partitions

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Patch: ResourceOwner optimization for tables with many partitions
Дата
Msg-id CA+TgmoZPa2pOqGm_cPws6JYWWKXumYngXJSAWyd7pY86TYjUSg@mail.gmail.com
обсуждение исходный текст
Ответ на Patch: ResourceOwner optimization for tables with many partitions  (Aleksander Alekseev <a.alekseev@postgrespro.ru>)
Ответы Re: Patch: ResourceOwner optimization for tables with many partitions  (Aleksander Alekseev <a.alekseev@postgrespro.ru>)
Список pgsql-hackers
On Fri, Dec 4, 2015 at 7:15 AM, Aleksander Alekseev
<a.alekseev@postgrespro.ru> wrote:
> Current implementation of ResourceOwner uses arrays to store resources
> like TupleDescs, Snapshots, etc. When we want to release one of these
> resources ResourceOwner finds it with linear scan. Granted, resource
> array are usually small so its not a problem most of the time. But it
> appears to be a bottleneck when we are working with tables which have a
> lot of partitions.
>
> To reproduce this issue:
> 1. run `./gen.pl 10000 | psql my_database postgres`
> 2. run `pgbench -j 8 -c 8 -f q.sql -T 100 my_database`
> 3. in second terminal run `sudo perf top -u postgres`
>
> Both gen.pl and q.sql are attached to this message.
>
> You will see that postgres spends a lot of time in ResourceOwnerForget*
> procedures:
>
>  32.80%  postgres               [.] list_nth
>  20.29%  postgres               [.] ResourceOwnerForgetRelationRef
>  12.87%  postgres               [.] find_all_inheritors
>   7.90%  postgres               [.] get_tabstat_entry
>   6.68%  postgres               [.] ResourceOwnerForgetTupleDesc
>   1.17%  postgres               [.] hash_search_with_hash_value
>  ... < 1% ...
>
> I would like to suggest a patch (see attachment) witch fixes this
> bottleneck. Also I discovered that there is a lot of code duplication in
> ResourceOwner. Patch fixes this too. The idea is quite simple. I just
> replaced arrays with something that could be considered hash tables,
> but with some specific optimizations.
>
> After applying this patch we can see that bottleneck is gone:
>
>  42.89%  postgres               [.] list_nth
>  18.30%  postgres               [.] find_all_inheritors
>  10.97%  postgres               [.] get_tabstat_entry
>   1.82%  postgres               [.] hash_search_with_hash_value
>   1.21%  postgres               [.] SearchCatCache
>  ... < 1% ...
>
> For tables with thousands partitions we gain in average 32.5% more TPS.
>
> As far as I can see in the same time patch doesn't make anything worse.
> `make check` passes with asserts enabled and disabled. There is no
> performance degradation according to both standard pgbench benchmark
> and benchmark described above for tables with 10 and 100 partitions.

I do think it's interesting to try to speed up the ResourceOwner
mechanism when there are many resources owned.  We've had some forays
in that direction in the past, and I think it's useful to continue
that work.  But I also think that this patch, while interesting, is
not something we can seriously consider committing in its current
form, because:

1. It lacks adequate comments and submission notes to explain the
general theory of operation of the patch.

2. It seems to use uint8 * rather freely as a substitute for char * or
void *, which doesn't seem like a great idea.

3. It doesn't follow PostgreSQL formatting conventions (uncuddled
curly braces, space after if and before open paren, etc.).

4. It mixes together multiple ideas in a single patch, not only
introducing a hashing concept but also striping a brand-new layer of
abstraction across the resource-owner mechanism.  I am not sure that
layer of abstraction is a very good idea, but if it needs to be done,
I think it should be a separate patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: Include ppc64le build type for back branches
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: PostgresNode::_update_pid using undefined variables in tap tests