RE: Protect syscache from bloating with negative cache entries

Поиск
Список
Период
Сортировка
От Ideriha, Takeshi
Тема RE: Protect syscache from bloating with negative cache entries
Дата
Msg-id 4E72940DA2BF16479384A86D54D0988A6F3D09C1@G01JPEXMBKW04
обсуждение исходный текст
Ответ на Re: Protect syscache from bloating with negative cache entries  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Ответы RE: Protect syscache from bloating with negative cache entries  ("Ideriha, Takeshi" <ideriha.takeshi@jp.fujitsu.com>)
Список pgsql-hackers
Hello,

Sorry for delay. 
The detailed comments for the source code will be provided later.

>> I just thought that the pair of ageclass and nentries can be
>> represented as json or multi-dimensional array but in virtual they are
>> all same and can be converted each other using some functions. So I'm not sure
>which representaion is better one.
>
>Multi dimentional array in any style sounds reasonable. Maybe array is preferable in
>system views as it is a basic type than JSON. In the attached, it looks like the follows:
>
>=# select * from pg_stat_syscache  where ntuples > 100; -[ RECORD
>1 ]--------------------------------------------------
>pid         | 1817
>relname     | pg_class
>cache_name  | pg_class_oid_index
>size        | 2048
>ntuples     | 189
>searches    | 1620
>hits        | 1431
>neg_hits    | 0
>ageclass    | {{30,189},{60,0},{600,0},{1200,0},{1800,0},{0,0}}
>last_update | 2018-11-27 19:22:00.74026+09

Thanks, cool. That seems better to me.

>
>> >3. non-transactional GUC setting (in 0003)
>> >
>> >It allows setting GUC variable set by the action
>> >GUC_ACTION_NONXACT(the name requires condieration) survive beyond
>> >rollback. It is required by remote guc setting to work sanely.
>> >Without the feature a remote-set value within a trasction will
>> >disappear involved in rollback. The only local interface for the
>> >NONXACT action is set_config(name, value, is_local=false, is_nonxact = true).
>pg_set_backend_guc() below works on this feature.
>>
>> TBH, I'm not familiar with around this and I may be missing something.
>> In order to change the other backend's GUC value, is ignoring
>> transactional behevior always necessary? When transaction of GUC
>> setting is failed and rollbacked, if the error message is supposeed to
>> be reported I thought just trying the transaction again is enough.
>
>The target backend can be running frequent transaction.  The invoker backend cannot
>know whether the remote change happend during a transaction and whether the
>transaction if any is committed or aborted, no error message sent to invoker backend.
>We could wait for the end of a trasaction but that doesn't work with long transactions.
>
>Maybe we don't need the feature in GUC system but adding another similar feature
>doesn't seem reasonable. This would be useful for some other tracking features.

Thank you for the clarification. 


>> >4. pg_set_backend_guc() function.
>> >
>> >Of course syscache statistics recording consumes significant amount
>> >of time so it cannot be turned on usually. On the other hand since
>> >this feature is turned on by GUC, it is needed to grab the active
>> >client connection to turn on/off the feature(but we cannot). Instead, I provided a
>means to change GUC variables in another backend.
>> >
>> >pg_set_backend_guc(pid, name, value) sets the GUC variable "name"
>> >on the backend "pid" to "value".
>> >
>> >
>> >
>> >With the above tools, we can inspect catcache statistics of seemingly bloated
>process.
>> >
>> >A. Find a bloated process pid using ps or something.
>> >
>> >B. Turn on syscache stats on the process.
>> >  =# select pg_set_backend_guc(9984, 'track_syscache_usage_interval',
>> >'10000');
>> >
>> >C. Examine the statitics.
>> >
>> >=# select pid, relname, cache_name, size from pg_stat_syscache order
>> >by size desc limit 3;
>> > pid  |   relname    |            cache_name            |   size
>> >------+--------------+----------------------------------+----------
>> > 9984 | pg_statistic | pg_statistic_relid_att_inh_index | 32154112
>> > 9984 | pg_cast      | pg_cast_source_target_index      |     4096
>> > 9984 | pg_operator  | pg_operator_oprname_l_r_n_index  |     4096
>> >
>> >
>> >=# select * from pg_stat_syscache where cache_name =
>> >'pg_statistic_relid_att_inh_index'::regclass;
>> >-[ RECORD 1 ]---------------------------------
>> >pid         | 9984
>> >relname     | pg_statistic
>> >cache_name  | pg_statistic_relid_att_inh_index
>> >size        | 11026176
>> >ntuples     | 77950
>> >searches    | 77950
>> >hits        | 0
>> >neg_hits    | 0
>> >ageclass    | {30,60,600,1200,1800,0}
>> >nentries    | {17630,16950,43370,0,0,0}
>> >last_update | 2018-10-17 15:58:19.738164+09
>>
>> The output of this view seems good to me.
>>
>> I can imagine this use case. Does the use case of setting GUC locally never happen?
>> I mean can the setting be locally changed?
>
>Syscahe grows through a life of a backend/session. No other client cannot connect to
>it at the same time. So the variable must be set at the start of a backend using ALTER
>USER/DATABASE, or the client itself is obliged to deliberitely turn on the feature at a
>convenient time. I suppose that in most use cases one wants to turn on this feature
>after he sees another session is eating memory more and more.
>
>The attached is the rebased version that has multidimentional ageclass.

Thank you! That's convenient.
How about splitting this non-xact guc and remote guc setting feature as another commit fest entry? 
I'm planning to review 001 and 002 patch in more detail and hopefully turn it to 'ready for committer' 
and review remote guc feature later. 
Related to the feature division why have you discarded pruning of relcache and plancache?
Personally I want relcache one as well as catcache because regarding memory bloat there is some correlation between
them.

Regards,
Takeshi Ideriha



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Switching to 64-bit Bitmapsets
Следующее
От: Tom Lane
Дата:
Сообщение: Re: slow queries over information schema.tables