Re: [HACKERS] Protect syscache from bloating with negative cacheentries

Поиск
Список
Период
Сортировка
От Jim Nasby
Тема Re: [HACKERS] Protect syscache from bloating with negative cacheentries
Дата
Msg-id ecd7ba88-0484-54be-5b8a-c48848438aea@BlueTreble.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Protect syscache from bloating with negative cacheentries  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Ответы Re: [HACKERS] Protect syscache from bloating with negative cache entries  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [HACKERS] Protect syscache from bloating with negative cacheentries  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Список pgsql-hackers
On 12/26/16 2:31 AM, Kyotaro HORIGUCHI wrote:
> The points of discussion are the following, I think.
>
> 1. The first patch seems working well. It costs the time to scan
>    the whole of a catcache that have negative entries for other
>    reloids. However, such negative entries are created by rather
>    unusual usages. Accesing to undefined columns, and accessing
>    columns on which no statistics have created. The
>    whole-catcache scan occurs on ATTNAME, ATTNUM and
>    STATRELATTINH for every invalidation of a relcache entry.

I took a look at this. It looks sane, though I've got a few minor 
comment tweaks:

+ *    Remove negative cache tuples maching a partial key.
s/maching/matching/

+/* searching with a paritial key needs scanning the whole cache */

s/needs/means/

+ * a negative cache entry cannot be referenced so we can remove

s/referenced/referenced,/

I was wondering if there's a way to test the performance impact of 
deleting negative entries.

> 2. The second patch also works, but flushing negative entries by
>    hash values is inefficient. It scans the bucket corresponding
>    to given hash value for OIDs, then flushing negative entries
>    iterating over all the collected OIDs. So this costs more time
>    than 1 and flushes involving entries that is not necessary to
>    be removed. If this feature is valuable but such side effects
>    are not acceptable, new invalidation category based on
>    cacheid-oid pair would be needed.

I glanced at this and it looks sane. Didn't go any farther since this 
one's pretty up in the air. ISTM it'd be better to do some kind of aging 
instead of patch 2.

The other (possibly naive) question I have is how useful negative 
entries really are? Will Postgres regularly incur negative lookups, or 
will these only happen due to user activity? I can't think of a case 
where an app would need to depend on fast negative lookup (in other 
words, it should be considered a bug in the app). I can see where 
getting rid of them completely might be problematic, but maybe we can 
just keep a relatively small number of them around. I'm thinking a 
simple LRU list of X number of negative entries; when that fills you 
reuse the oldest one. You'd have to pay the LRU maintenance cost on 
every negative hit, but if those shouldn't be that common it shouldn't 
be bad.

That might well necessitate another GUC, but it seems a lot simpler than 
most of the other ideas.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jim Nasby
Дата:
Сообщение: Re: [HACKERS] Checksums by default?
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: [HACKERS] remote_apply for logical replication?