Calculage avg. width when operator = is missing

Поиск
Список
Период
Сортировка
От Shulgin, Oleksandr
Тема Calculage avg. width when operator = is missing
Дата
Msg-id CACACo5QGXRc5daiUytveu99qfM=xkV=2SxEkZwuqCOqoz1wwaA@mail.gmail.com
обсуждение исходный текст
Ответы Re: Calculage avg. width when operator = is missing  (Andrew Dunstan <andrew@dunslane.net>)
Список pgsql-hackers
Hi Hackers,

I've recently stumbled upon a problem with table bloat estimation in case there are columns of type JSON.

The quick bloat estimation queries use sum over pg_statistic.stawidth of table's columns, but in case of JSON the corresponding entry is never created by the ANALYZE command due to equality comparison operator missing.  I understand why there is no such operator defined for this particular type, but shouldn't we still try to produce meaningful average width estimation?

In my case the actual bloat is around 40% as verified with pgstattuple, while the bloat reported by quick estimate can be between 75% and 95%(!) in three instances of this problem.  We're talking about some hundreds of GB of miscalculation.

Attached patch against master makes the std_typanalyze still try to compute the minimal stats even if there is no "=" operator.  Makes sense?

I could also find this report in archives that talks about similar problem, but due to all values being over the analyze threshold:


I think we could try harder, otherwise any estimate relying on average width can be way off in such cases.

--
Alex

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: Improving test coverage of extensions with pg_dump
Следующее
От: Robert Haas
Дата:
Сообщение: Re: 9.5: Can't connect with PGSSLMODE=require on Windows