Re: [HACKERS] multivariate statistics (v19)

From	Alvaro Herrera
Subject	Re: [HACKERS] multivariate statistics (v19)
Date	7 February 2017, 01:11:57
Msg-id	20170206221157.54lzliw3wjhskb6w@alvherre.pgsql
In response to	Re: [HACKERS] multivariate statistics (v19) (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List	pgsql-hackers

Looking at 0003, I notice that gram.y is changed to add a WITH ( .. )
clause.  If it's not specified, an error is raised.  If you create
stats with (ndistinct) then you can't alter it later to add
"dependencies" or whatever; unless I misunderstand, you have to drop the
statistics and create another one.  Probably in a forthcoming patch we
should have ALTER support to add a stats type.

Also, why isn't the default to build everything, rather than nothing?

BTW, almost everything in the backend could be inside "utils/", so let's
not do that -- let's just create src/backend/statistics/ for all your
code.

Here are a few notes from reading README.dependencies -- some typos, two
questions.

diff --git a/src/backend/utils/mvstats/README.dependencies b/src/backend/utils/mvstats/README.dependencies
index 908f094..7f3ed3d 100644
--- a/src/backend/utils/mvstats/README.dependencies
+++ b/src/backend/utils/mvstats/README.dependencies
@@ -36,7 +36,7 @@ design choice to model the dataset in denormalized way, either because of
 performance or to make querying easier.
 
-soft dependencies
+Soft dependencies
 -----------------
 
 Real-world data sets often contain data errors, either because of data entry
@@ -48,7 +48,7 @@ rendering the approach mostly useless even for slightly noisy data sets, or
 result in sudden changes in behavior depending on minor differences between
 samples provided to ANALYZE.
 
-For this reason the statistics implementes "soft" functional dependencies,
+For this reason the statistics implements "soft" functional dependencies,
 associating each functional dependency with a degree of validity (a number
 number between 0 and 1). This degree is then used to combine selectivities
 in a smooth manner.
@@ -75,6 +75,7 @@ The algorithm also requires a minimum size of the group to consider it
 consistent (currently 3 rows in the sample). Small groups make it less likely
 to break the consistency.
 
+## What is it that we store in the catalog?
 
 Clause reduction (planner/optimizer)
 ------------------------------------
@@ -95,12 +96,12 @@ example for (a,b,c) we first use (a,b=>c) to break the computation into
 and then apply (a=>b) the same way on P(a=?,b=?).
 
-Consistecy of clauses
+Consistency of clauses
 ---------------------
 
 Functional dependencies only express general dependencies between columns,
 without referencing particular values. This assumes that the equality clauses
-are in fact consistent with the functinal dependency, i.e. that given a
+are in fact consistent with the functional dependency, i.e. that given a
 dependency (a=>b), the value in (b=?) clause is the value determined by (a=?).
 If that's not the case, the clauses are "inconsistent" with the functional
 dependency and the result will be over-estimation.
 
@@ -111,6 +112,7 @@ set will be empty, but we'll estimate the selectivity using the ZIP condition.
 In this case the default estimation based on AVIA principle happens to work
 better, but mostly by chance.
 
+## what is AVIA principle?
 This issue is the price for the simplicity of functional dependencies. If the
 application frequently constructs queries with clauses inconsistent with
 

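For readers following the "soft" dependency text quoted in the diff, here is a rough Python sketch of how a degree of validity between 0 and 1 could be computed from a sample. It is only an illustration under assumptions and is not taken from the patch: the function name dependency_degree, the row representation, and the "fraction of rows in supporting groups" definition are all hypothetical. Only the minimum group size of 3 comes from the README text.

```python
from collections import defaultdict

# "currently 3 rows in the sample", per the quoted README text
MIN_GROUP_SIZE = 3


def dependency_degree(rows, determinant, dependent,
                      min_group_size=MIN_GROUP_SIZE):
    """Return a degree of validity in [0, 1] for determinant => dependent.

    Hypothetical illustration, not the patch's code. Rows are grouped by the
    determinant columns; a group supports the dependency when it has at least
    min_group_size rows and exactly one distinct dependent value. The degree
    is the fraction of sampled rows that fall into supporting groups.
    """
    if not rows:
        return 0.0

    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in determinant)
        groups[key].append(row[dependent])

    supporting_rows = sum(
        len(values)
        for values in groups.values()
        if len(values) >= min_group_size and len(set(values)) == 1
    )
    return supporting_rows / len(rows)


# Made-up sample: zip => city holds except for one data-entry error.
sample = (
    [{"zip": "10001", "city": "New York"}] * 4
    + [{"zip": "94105", "city": "San Francisco"}] * 3
    + [{"zip": "60601", "city": "Chicago"}] * 2
    + [{"zip": "60601", "city": "Chicgo"}]  # typo breaks this group
)

degree = dependency_degree(sample, ["zip"], "city")
print(degree)
```

With a hard (all-or-nothing) dependency check, the single misspelled row would invalidate zip => city entirely; with a degree, the two clean groups still count and only the noisy group is discounted.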
-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
