Обсуждение: "Bug" in statistics for v7.2?
Okay, if I'm understanding pg_stats at all, which I may not be, n_distinct should represent # of distinct values in that row, no? But, I have one field that has 5 distinct values: iwantu=# select distinct(profiles_faith) from iwantu_profiles;profiles_faith ---------------- 0 1 2 7 8 (5 rows) But pg_stats is reporting 1: tablename | attname | avg_width | n_distinct -----------------+------------------------+-----------+------------iwantu_profiles | profiles_faith | 2 | 1 So am I reading n_distinct wrong?
"Marc G. Fournier" <scrappy@hub.org> writes: > Okay, if I'm understanding pg_stats at all, which I may not be, n_distinct > should represent # of distinct values in that row, no? > But, I have one field that has 5 distinct values: > But pg_stats is reporting 1: The pg_stats values are only, um, statistical. If 99.9% of the table is the same value and the other four values appear only once or twice, it's certainly possible for ANALYZE's sample to include only the common value and miss the rare ones. AFAIK that will not break anything; if you have an example where the planner seems to be fooled because of this, let's see it. regards, tom lane
That explains it ... profiles_faith | count ----------------+-------- 0 | 485938 1 | 2 2 | 6 7 | 2 8 | 21 (5 rows) Cool, another waste of space *sigh* thanks ... On Wed, 13 Feb 2002, Tom Lane wrote: > "Marc G. Fournier" <scrappy@hub.org> writes: > > Okay, if I'm understanding pg_stats at all, which I may not be, n_distinct > > should represent # of distinct values in that row, no? > > But, I have one field that has 5 distinct values: > > But pg_stats is reporting 1: > > The pg_stats values are only, um, statistical. If 99.9% of the table is > the same value and the other four values appear only once or twice, it's > certainly possible for ANALYZE's sample to include only the common value > and miss the rare ones. AFAIK that will not break anything; if you have > an example where the planner seems to be fooled because of this, let's > see it. > > regards, tom lane >
> That explains it ... > > profiles_faith | count > ----------------+-------- > 0 | 485938 > 1 | 2 > 2 | 6 > 7 | 2 > 8 | 21 > (5 rows) > > Cool, another waste of space *sigh* > > thanks ... > > > On Wed, 13 Feb 2002, Tom Lane wrote: > > > "Marc G. Fournier" <scrappy@hub.org> writes: > > > Okay, if I'm understanding pg_stats at all, which I may not be, n_distinct > > > should represent # of distinct values in that row, no? > > > But, I have one field that has 5 distinct values: > > > But pg_stats is reporting 1: > > > > The pg_stats values are only, um, statistical. If 99.9% of the table is > > the same value and the other four values appear only once or twice, it's > > certainly possible for ANALYZE's sample to include only the common value > > and miss the rare ones. AFAIK that will not break anything; if you have > > an example where the planner seems to be fooled because of this, let's > > see it. Hmm ? How about select * from xxx where profiles_faith = 7 would estimate all rows, no ? Instead of 2. That is why I think a bin for "very uncommon" values could also be useful sometimes. Andreas
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: > The pg_stats values are only, um, statistical. If 99.9% of the table is > the same value and the other four values appear only once or twice, it's > certainly possible for ANALYZE's sample to include only the common value > and miss the rare ones. AFAIK that will not break anything; if you have > an example where the planner seems to be fooled because of this, let's > see it. > Hmm ? How about select * from xxx where profiles_faith = 7 > would estimate all rows, no ? Instead of 2. Not in 7.2 ... nor in previous versions AFAIR. > That is why I think a bin for "very uncommon" values could also be > useful sometimes. Perhaps you should experiment or read the code before opining... regards, tom lane