Re: Extended Statistics set/restore/clear functions.

From Corey Huinker
Subject Re: Extended Statistics set/restore/clear functions.
Date
Msg-id CADkLM=cT2rqtw12JX6+hD4gL=wSKR=jt040QGGsurZiqpZ6ZLQ@mail.gmail.com
In response to Re: Extended Statistics set/restore/clear functions.  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
> But, if we don't care about the order of the combinations, I also don't
> think we need to expose the functions at all. We know exactly how many
> combinations there should be for any N attributes as each attribute must be
> unique. So if we have the right number of unique combinations, and they're
> all subsets of the first-longest, then we must have a complete set.
> Thoughts on that?
>
> Getting _too_ tight with the ordering and contents makes me concerned for
> the day when the format might change. We don't want to _fail_ an upgrade
> because some of the combinations were in the wrong order.

That's fair.  The planner costing code pulling the stats numbers based
on the attributes was smart enough to not care much about the ordering
as far as I recall, but I'd rather make sure of that first.  This
needs some careful lookup.

I've done some experiments, creating extended stats objects up to the 8 attribute limit.

The big takeaway is that I wasn't imagining it: the number of dependency combinations is NOT deterministic:

/*
 * if the dependency seems entirely invalid, don't store it
 */
if (degree == 0.0)
     continue;

So, in theory, an empty (i.e. '[]') pg_dependencies is valid.

The number of pg_ndistinct items is deterministic for now, but I'm even less sure that'll stay true in the future.
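
To put a number on "deterministic", here is a minimal sketch of the arithmetic, assuming one pg_ndistinct item is stored for every subset of at least two of the N attributes (which gives 2^N - N - 1). The function name is made up for illustration and is not something in the tree:

```c
#include <assert.h>

/*
 * Hypothetical helper, not PostgreSQL code.
 *
 * Assumption: one ndistinct item exists for every subset of two or more
 * attributes.  Out of the 2^N subsets of N attributes we drop the empty
 * set (1) and the single-attribute sets (N).
 */
static int
expected_ndistinct_items(int nattrs)
{
    /* extended statistics objects take between 2 and 8 attributes */
    assert(nattrs >= 2 && nattrs <= 8);

    return (1 << nattrs) - nattrs - 1;
}
```

Under that assumption, expected_ndistinct_items(2) is 1 and expected_ndistinct_items(8) is 247. Nothing comparable can be asserted for pg_dependencies, since items with degree 0.0 are skipped.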

We can definitely rely on the attnums being all the positive numbers in ascending order first, followed by the negative numbers in descending order, but that's about it. Which raises the question of how we describe the error when attnums are out of order.
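
A rough sketch of what that ordering check could look like, just to make the rule concrete. The function name is hypothetical, and two details are my assumptions rather than settled behavior: attnum 0 is rejected as invalid, and duplicates are rejected because each attribute must be unique within a combination:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check, not PostgreSQL code.
 *
 * Returns true if attnums lists all positive numbers first, strictly
 * ascending, followed by all negative numbers, strictly descending
 * (e.g. 1, 2, 5, -1, -2).
 */
static bool
attnums_in_expected_order(const int *attnums, size_t n)
{
    assert(attnums != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        int cur = attnums[i];

        if (cur == 0)
            return false;       /* assumption: 0 is never a valid attnum */

        if (i == 0)
            continue;           /* nothing to compare the first entry to */

        int prev = attnums[i - 1];

        if (prev > 0 && cur > 0)
        {
            if (cur <= prev)
                return false;   /* positives must be strictly ascending */
        }
        else if (prev < 0 && cur < 0)
        {
            if (cur >= prev)
                return false;   /* negatives must be strictly descending */
        }
        else if (prev < 0 && cur > 0)
            return false;       /* a positive may not follow a negative */

        /* prev > 0 && cur < 0 is the one allowed sign change */
    }

    return true;
}
```

A check like this only says yes or no; it does not settle how the error should be worded when the order is wrong.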

We know that the deserialize functions take the data's word for it as to how many items to unpack, so I don't see the impact of not caring how many might be missing. That even sort of feeds into Tom's idea that stats import was in some sense a fuzzing tool.
 
I'd try to look at the bits related to pg_dependencies and
pg_ndistinct as two separate concepts, at the end.  They're sort of
alike, but have too many differences already.

Based on the above, I think we can't really add anything beyond the attnum order, and we have to relax some existing restrictions on pg_dependencies...
