Discussion: change sample size for statistics
Hi,
is there a way to change the sample size for statistics (that analyze gathers)?
It is said to be 10%. I would like to raise that, because we are getting bad estimates for n_distinct.
Cheers,
WBL
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
On 6/10/11 5:15 AM, Willy-Bas Loos wrote:
> Hi,
>
> is there a way to change the sample size for statistics (that analyze
> gathers)?
> It is said to be 10%. I would like to raise that, because we are getting bad
> estimations for n_distinct.

It's not 10%. We use a fixed sample size, which is configurable on the system, table, or column basis.

Some reading (read all these pages to understand what you're doing):

http://www.postgresql.org/docs/9.0/static/planner-stats.html
http://www.postgresql.org/docs/9.0/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-OTHER
http://www.postgresql.org/docs/9.0/static/planner-stats-details.html
http://www.postgresql.org/docs/9.0/static/sql-altertable.html

(scroll down to "set storage" on that last page)

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
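As a rough illustration of the settings mentioned above, the statistics target can be inspected and raised per session or per column. The table `orders` and column `customer_id` are hypothetical names used only for this sketch. To my understanding, ANALYZE samples roughly 300 rows per unit of statistics target, so raising the target also enlarges the sample, but treat that factor as something to verify against your version's documentation.

```sql
-- Hypothetical table and column names, for illustration only.

-- Show the current system-wide default statistics target.
SHOW default_statistics_target;

-- Raise the default for the current session only.
SET default_statistics_target = 500;

-- Raise the target for a single column; this overrides the default
-- for that column in every later ANALYZE.
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;

-- Re-gather statistics so the new target takes effect.
ANALYZE orders;
```

A larger target makes ANALYZE slower and planning slightly more expensive, so it is usually raised only on the columns whose estimates are actually off.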
[ Sorry, forgot to cc list ]

>> It is said to be 10%. I would like to raise that, because we are getting bad
>> estimations for n_distinct.
>
> More to the point, the estimator we use is going to be biased for many
> (probably most) distributions no matter how large your sample size
> is.
>
> If you need to fix ndistinct, a better approach may be to do it manually.
>
> Best,
> Nathan
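To see how far off the estimate is before deciding to fix it manually, one option is to compare the stored estimate with an exact count. This is only a sketch: `public.orders` and `customer_id` are hypothetical names, and the exact count scans the whole table, which can be slow on large tables.

```sql
-- Hypothetical schema, table, and column names, for illustration only.

-- The planner's current estimate. A negative value means
-- "this fraction of the row count", e.g. -0.05 is 5% of the rows.
SELECT n_distinct
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'orders'
  AND attname    = 'customer_id';

-- The actual number of distinct non-null values, for comparison.
SELECT count(DISTINCT customer_id) AS actual_distinct,
       count(*)                    AS total_rows
FROM orders;
```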
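On Fri, Jun 10, 2011 at 9:58 PM, Josh Berkus <josh@agliodbs.com> wrote:
> It's not 10%. We use a fixed sample size, which is configurable on the system, table, or column basis.

It seems that you are referring to "alter column set statistics" and "default_statistics_target", which are the number of percentiles in the histogram (and MCVs).
I mean the number of records that are scanned by analyze to come to the statistics for the planner, especially n_distinct.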
On Fri, Jun 10, 2011 at 10:06 PM, Nathan Boley <npboley@gmail.com> wrote:
> If you need to fix ndistinct, a better approach may be to do it manually.

That would be nice, but how do I prevent the analyzer from overwriting n_distinct without blocking the generation of new histogram values etc. for that column?
We use version 8.4 at the moment (on Debian squeeze).
Cheers,
WBL
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
On Mon, Jun 13, 2011 at 6:33 PM, Willy-Bas Loos <willybas@gmail.com> wrote:
> On Fri, Jun 10, 2011 at 9:58 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>
>> It's not 10%. We use a fixed sample size, which is configurable on the
>> system, table, or column basis.
>
> It seems that you are referring to "alter column set statistics" and
> "default_statistics_target", which are the number of percentiles in the
> histogram (and MCVs).
> I mean the number of records that are scanned by analyze to come to the
> statistics for the planner, especially n_distinct.

In 9.0+ you can do ALTER TABLE .. ALTER COLUMN .. SET (n_distinct = ...);

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
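Spelled out, the 9.0+ override might look like the sketch below. The table `orders`, the column `customer_id`, and the values are hypothetical. As I understand it, the override is picked up at the next ANALYZE and only replaces the distinct-value estimate, while the histogram and most-common-values list are still recomputed; it is worth checking `pg_stats` afterwards to confirm.

```sql
-- Hypothetical table, column, and values, for illustration only.
-- Requires PostgreSQL 9.0 or later.

-- Fix the estimate at an absolute number of distinct values...
ALTER TABLE orders ALTER COLUMN customer_id SET (n_distinct = 50000);

-- ...or as a fraction of the row count (negative value, here 5%),
-- which keeps scaling as the table grows.
ALTER TABLE orders ALTER COLUMN customer_id SET (n_distinct = -0.05);

-- The override is applied the next time statistics are gathered.
ANALYZE orders;

-- Remove the override and return to the sampled estimate.
ALTER TABLE orders ALTER COLUMN customer_id RESET (n_distinct);
```

The two SET statements are alternatives; the second one replaces the first if both are run.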