RE: error "can only drop stats once" brings down database

Поиск
Список
Период
Сортировка
От Floris Van Nee
Тема RE: error "can only drop stats once" brings down database
Дата
Msg-id 47ca6fcc687e408287871f7eaaaf3c3b@Optiver.com
обсуждение исходный текст
Ответ на Re: error "can only drop stats once" brings down database  (Andres Freund <andres@anarazel.de>)
Ответы Re: error "can only drop stats once" brings down database
Список pgsql-bugs
> I suspect that the "high value" oids are related to the issue. I bet you're
> running into OID wraparound for objects at a much higher frequency than
> most users.  I suspect the reason this causes problems is the problem I just
> found and described here:

I definitely suspect oid wraparound has something to do with it. However, given what I found on the other thread (
https://www.postgresql.org/message-id/flat/17947-b9554521ad963c9c%40postgresql.org ), I don't think the two have the
exactsame root cause. The other thread seems a clear case of 'forgot to call gc' when it should. 

>
> The fact that we're not triggering "stats object garbage collection" for most
> drops also explains why you're much more likely to see this on a standby
> than on a primary. An a primary it's going to be rare to have a single backend
> live long enough to observe an oid wraparound leading to one backend
> accessing stats for the same object type with the same oid after that object
> previously having been dropped. But most stats accesses on a standby are
> going to be by the same process, the startup process.  You'd still need some
> other accesses to prevent the object from being dropped "immediately", but
> that could be due to shorter lived processes.
>

As mentioned on other thread, I do think garbage collection gets called as part of pgstat_execute_transactional_drops
inrecovery. 

There's still the case of the "skip gc of entry if pending" though, however I struggle to understand how this can lead
tosuch an error in the recovery process after wraparound. I'd expect the gc to happen relatively frequently (at least
muchmore frequently than a wraparound) as tables get created/dropped quite frequently. Next to that, because the
recoveryprocess is the only one creating/dropping relations on standby, it should be impossible to get in a state where
therecovery "local" stats cache has a non-gc'd entry that should be dropped? Because when dropping, at least it removes
thelocal cache entry (even if it cannot remove the shared one). So later, when creating it again in
pgstat_get_entry_ref,it should always hit the pgstat_reinit_entry path? 

Would it make sense to at least commit your patch to enhance the error message a bit?


-Floris




В списке pgsql-bugs по дате отправления:

Предыдущее
От: Sjors Gielen
Дата:
Сообщение: Re: BUG #18365: Inconsistent cost function between materialized and non-materialized CTE
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error