Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> Hm. I am wrong about this, since it's the fact that consumers are taking
> stanullfrac into account that makes the value wrong in the first place.

Also, the way that the value is calculated in the samples-not-all-distinct
case corresponds to the way I have it in the patch. What you want to do
would correspond to leaving these edge cases alone and changing all the
other ANALYZE cases instead (*plus* changing the consumers). I find that
a tad scary.

> But I think the fix is still wrong, because it changes the meaning of
> ALTER TABLE ... ALTER col SET (n_distinct=...) in a non-useful way; it
> is no longer possible to nail down a useful negative n_distinct value if
> the null fraction of the column is variable.

I think that argument is bogus. If we change the way that
get_variable_numdistinct (and other consumers) use the value, that will
break all existing custom settings of n_distinct, because they will no
longer mean what they did before. There have been exactly zero field
complaints that people could not get the results they wanted, so I do
not think that's justified.
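
To make the compatibility concern concrete, here is a rough Python sketch (not PostgreSQL source code). It assumes the documented convention that a negative n_distinct is a multiplier on the total row count, and contrasts it with a hypothetical redefinition in which the multiplier applies to non-null rows only. The function names and the sample numbers are invented for illustration.

```python
def ndistinct_current(stadistinct, ntuples):
    """Existing convention (assumed): a negative stadistinct is a
    multiplier on the total number of rows; a positive value is an
    absolute count of distinct values."""
    if stadistinct < 0:
        return -stadistinct * ntuples
    return stadistinct


def ndistinct_redefined(stadistinct, ntuples, stanullfrac):
    """Hypothetical redefinition: a negative stadistinct is a
    multiplier on the number of non-null rows instead."""
    if stadistinct < 0:
        return -stadistinct * ntuples * (1.0 - stanullfrac)
    return stadistinct


def analyze_all_distinct(stanullfrac):
    """Patched ANALYZE edge case (sketch): when every sampled non-null
    value is distinct, scale -1.0 by the non-null fraction so the
    existing consumer formula still yields the right count."""
    return -1.0 * (1.0 - stanullfrac)


if __name__ == "__main__":
    ntuples, nullfrac = 1000, 0.25
    custom = -0.5  # an existing ALTER TABLE ... SET (n_distinct = -0.5)
    print(ndistinct_current(custom, ntuples))              # meaning today
    print(ndistinct_redefined(custom, ntuples, nullfrac))  # meaning after a redefinition
    print(ndistinct_current(analyze_all_distinct(nullfrac), ntuples))
```

With 1000 rows of which a quarter are null, an existing override of -0.5 would quietly shrink from 500 to 375 distinct values under the redefinition, whereas scaling the value ANALYZE stores leaves the consumer-side formula untouched.
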
In short, what you want to do constitutes a redefinition of stadistinct,
while my patch doesn't. That is far more invasive and I fear it will
break things that are not broken today.

regards, tom lane