Re: Further pg_upgrade analysis for many tables

From	Tom Lane
Subject	Re: Further pg_upgrade analysis for many tables
Date	20 January 2013, 19:08:13
Msg-id	26604.1358708885@sss.pgh.pa.us
In reply to	Re: Further pg_upgrade analysis for many tables (Jeff Janes <jeff.janes@gmail.com>)
List	pgsql-hackers

Jeff Janes <jeff.janes@gmail.com> writes:
> [ patch for AtEOXact_RelationCache ]

I've reviewed and committed this with some mostly-cosmetic adjustments,
notably:

* Applied it to AtEOSubXact cleanup too.  AFAICS that's just as
idempotent, and it seemed weird to not use the same technique both
places.

* Dropped the hack to force a full-table scan in Assert mode.  Although
that's a behavioral change that I suspect Jeff felt was above his pay
grade, it seemed to me that not exercising the now-normal hash_search
code path in assert-enabled testing was a bad idea.  Also, the value of
exhaustive checking for relcache reference leaks is vastly lower than it
once was, because those refcounts are managed mostly automatically now.

* Redid the representation of the overflowed state a bit --- the way
that n_eoxact_list worked seemed a bit too cute/complicated for my
taste.
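
As a rough illustration of the technique described in the points above (a bounded list of entries needing end-of-transaction cleanup, plus a separate "overflowed" flag that falls back to a full-table scan), here is a minimal self-contained sketch. It is not the committed code: the toy table stands in for the relcache hash table, a direct array index stands in for hash_search(), and every name in it (`toy_table`, `eoxact_list`, `toy_at_eoxact`, and so on) is an assumption made for illustration. Only the limit of 32 comes from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EOXACT_LIST 32      /* the limit mentioned in this message */
#define TOY_TABLE_SIZE  256     /* hypothetical stand-in for the relcache hash table */

typedef unsigned int Oid;

/* Hypothetical toy entry; "dirty" means it needs end-of-transaction cleanup. */
typedef struct ToyEntry
{
    bool        in_use;
    bool        dirty;
} ToyEntry;

/* Indexed directly by Oid; a real cache would do a hash lookup instead. */
static ToyEntry toy_table[TOY_TABLE_SIZE];

static Oid  eoxact_list[MAX_EOXACT_LIST];
static int  eoxact_list_len = 0;
static bool eoxact_list_overflowed = false;

static int  entries_visited = 0;    /* instrumentation for this sketch only */

/* Remember that an entry needs cleanup; stop tracking once the list is full. */
static void
eoxact_list_add(Oid id)
{
    if (eoxact_list_len < MAX_EOXACT_LIST)
        eoxact_list[eoxact_list_len++] = id;
    else
        eoxact_list_overflowed = true;
}

/* Mark an entry as touched in the current transaction. */
static void
toy_mark_dirty(Oid id)
{
    assert(id < TOY_TABLE_SIZE);
    toy_table[id].in_use = true;
    toy_table[id].dirty = true;
    eoxact_list_add(id);        /* duplicates are harmless: cleanup is idempotent */
}

/* Per-entry cleanup; calling it twice on the same entry changes nothing. */
static void
toy_cleanup_one(ToyEntry *entry)
{
    entries_visited++;
    entry->dirty = false;
}

/* End-of-transaction cleanup: targeted lookups, or a full scan on overflow. */
static void
toy_at_eoxact(void)
{
    if (eoxact_list_overflowed)
    {
        for (size_t i = 0; i < TOY_TABLE_SIZE; i++)
        {
            if (toy_table[i].in_use)
                toy_cleanup_one(&toy_table[i]);
        }
    }
    else
    {
        for (int i = 0; i < eoxact_list_len; i++)
            toy_cleanup_one(&toy_table[eoxact_list[i]]);
    }

    /* Reset the tracking state for the next transaction. */
    eoxact_list_len = 0;
    eoxact_list_overflowed = false;
}
```

Calling `toy_mark_dirty()` on five entries and then `toy_at_eoxact()` visits only those five; marking more than 32 entries sets the overflow flag, and the next cleanup walks every in-use entry instead. Because the per-entry cleanup is idempotent, the same structure could plausibly serve a subtransaction-level cleanup as well.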

> On Wednesday, January 9, 2013, Simon Riggs wrote:
>> Why does the list not grow as needed?

> It would increase the code complexity for no concretely-known benefit.

Actually there's a better argument for that: at some point a long list
is actively counterproductive, because N hash_search lookups will cost
more than the full-table scan would.

I did some simple measurements that told me that with 100-odd entries
in the hashtable (which seems to be about the minimum for an active
backend), the hash_seq_search() traversal is about 40x more expensive
than one hash_search() lookup.  (I find this number slightly
astonishing, but that's the answer I got.)  So the crossover point
is at least 40 and probably quite a bit more, since (1) my measurement
did not count the cost of uselessly doing the actual relcache-entry
cleanup logic on non-targeted entries, and (2) if the list is that
long there are probably more than 100-odd entries in the hash table,
and hash table growth hurts the seqscan approach much more than the
search approach.
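
The crossover reasoning above can be written down as a tiny cost comparison. This is only a sketch under the simplifying assumption that all costs are expressed in units of one hash_search() lookup; the function name and parameters are hypothetical, and the 40x figure is the measurement quoted above, which by the argument just given understates the real cost of the scan.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if doing one lookup per list entry is expected to be cheaper
 * than one sequential traversal of the whole hash table.
 *
 * seqscan_cost_in_lookups is the traversal cost expressed as a multiple of
 * a single lookup (about 40 in the measurement with 100-odd entries).
 */
static bool
targeted_cleanup_is_cheaper(int n_list_entries, double seqscan_cost_in_lookups)
{
    assert(n_list_entries >= 0);
    return (double) n_list_entries < seqscan_cost_in_lookups;
}
```

With a ratio of 40, a list of 32 entries stays on the cheaper side, while a list of 41 or more would not. A larger hash table raises the ratio and pushes the crossover further out.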

Now on the other side, simple single-command transactions are very
unlikely to have created more than a few list entries anyway.  So
it's probably not worth getting very tense about the exact limit
as long as it's at least a couple dozen.  I set the limit to 32
as committed, because that seemed like a nice round number in the
right general area.

BTW, this measurement also convinced me that the patch is a win
even when the hashtable is near minimum size, even though there's
no practical way to isolate the cost of AtEOXact_RelationCache in
vivo in such cases.  It's good to know that we're not penalizing
simple cases to speed up the huge-number-of-relations case, even
if the penalty would be small.

			regards, tom lane
