Re: multivariate statistics / patch v7

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: multivariate statistics / patch v7
Дата
Msg-id 20150703.143034.132752108.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: multivariate statistics / patch v7  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: multivariate statistics / patch v7  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
Hello, I started to work on this patch.

> attached is v7 of the multivariate stats patch. The main improvement
> is major refactoring of the clausesel.c portion - splitting the
> awfully long spaghetti-style functions into smaller pieces, making it
> much more understandable etc.

Thank you, it looks clearer. I have some comment for the brief
look at this. This patchset is relatively large so I will comment
on "per-notice" basis.. which means I'll send comment before
examining the entire of this patchset. Sorry in advance for the
desultory comments.

=======
General comments:

- You included unnecessary stuffs such like regression.diffs in these patches.

- Now OID 3307 is used by pg_stat_file. I moved pg_mv_stats_dependencies_info/show to 3311/3312.

- Single-variate stats have a mechanism to inject arbitrary values as statistics, that is, get_relation_stats_hook and
thesimilar stuffs. I want the similar mechanism for multivariate statistics, too.
 

0001:

- I also don't think it is right thing for expression_tree_walker to recognize RestrictInfo since it is not a part of
expression.

0003:

- In clauselist_selectivity, find_stats is uselessly called for single clause. This should be called after the
clauselistfound to consist more than one clause.
 

- Searching vars to be compared with mv-stat columns which find_stats does should stop at disjunctions. But this patch
doesn'tbehave so and it should be an unwanted behavior. The following steps shows that.
 

=====# CREATE TABLE t1 (a int, b int, c int);=# INSERT INTO t1 (SELECT a, a * 2, a * 3 FROM generate_series(0, 9999)
a);=#EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3; Seq Scan on t1  (cost=0.00..230.00 rows=1 width=12)=#
ALTERTABLE t1 ADD STATISTICS (HISTOGRAM) ON (a, b, c);=# ANALZYE t1;=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2
ORc = 3; Seq Scan on t1  (cost=0.00..230.00 rows=268 width=12)
 
====Rows changed unwantedly.
It seems not so simple thing as your code assumes.

> I do assume some of those pieces are unnecessary because there already
> is a helper function with the same purpose (but I'm not aware of
> that). But IMHO this piece of code begins to look reasonable
> (especially when compared to the previous state).

Year, such kind of work should be done later:p This patch is
not-so-invasive so as to make it undoable.

> The other major improvement it review of the comments (including
> FIXMEs and TODOs), and removal of the obsolete / misplaced ones. And
> there was plenty of those ...
> 
> These changes made this version ~20k smaller than v6.
> 
> The patch also rebases to current master, which I assume shall be
> quite stable - so hopefully no more duplicate OIDs for a while.
> 
> There are 6 files attached, but only 0002-0006 are actually part of
> the multivariate statistics patch itself. The first part makes it
> possible to use pull_varnos() with expression trees containing
> RestrictInfo nodes, but maybe this is not the right way to fix this
> (there's another thread where this was discussed).

As mentioned above, checking if mv stats can be applied would be
more complex matter than now you are assuming. I also will
consider that.

> Also, the regression tests testing plan choice with multivariate stats
> (e.g. that a bitmap index scan is chosen instead of index scan) fail
> from time to time. I suppose this happens because the invalidation
> after ANALYZE is not processed before executing the query, so the
> optimizer does not see the stats, or something like that.

I saw that occurs, but have no idea how it occurs so far..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Langote
Дата:
Сообщение: Re: [PROPOSAL] VACUUM Progress Checker.
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: WAL logging problem in 9.4.3?