Re: Additional improvements to extended statistics
От | Tomas Vondra |
---|---|
Тема | Re: Additional improvements to extended statistics |
Дата | |
Msg-id | 20200309000157.ig5tcrynvaqu4ixd@development обсуждение исходный текст |
Ответ на | Re: Additional improvements to extended statistics (Dean Rasheed <dean.a.rasheed@gmail.com>) |
Ответы |
Re: Additional improvements to extended statistics
Re: Additional improvements to extended statistics |
Список | pgsql-hackers |
On Sun, Mar 08, 2020 at 07:17:10PM +0000, Dean Rasheed wrote: >On Fri, 6 Mar 2020 at 12:58, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: >> >> Here is a rebased version of this patch series. I've polished the first >> two parts a bit - estimation of OR clauses and (Var op Var) clauses. >> > >Hi, > >I've been looking over the first patch (OR list support). It mostly >looks reasonable to me, except there's a problem with the way >statext_mcv_clauselist_selectivity() combines multiple stat_sel values >into the final result -- in the OR case, it needs to start with sel = >0, and then apply the OR formula to factor in each new estimate. I.e., >this isn't right for an OR list: > > /* Factor the estimate from this MCV to the oveall estimate. */ > sel *= stat_sel; > >(Oh and there's a typo in that comment: s/oveall/overall/). > >For example, with the regression test data, this isn't estimated well: > > SELECT * FROM mcv_lists_multi WHERE a = 0 OR b = 0 OR c = 0 OR d = 0; > >Similarly, if no extended stats can be applied it needs to return 0 >not 1, for example this query on the test data: > > SELECT * FROM mcv_lists WHERE a = 1 OR a = 2 OR d IS NOT NULL; > Ah, right. Thanks for noticing this. Attaches is an updated patch series with parts 0002 and 0003 adding tests demonstrating the issue and then fixing it (both shall be merged to 0001). >It might also be worth adding a couple more regression test cases like these. Agreed, 0002 adds a couple of relevant tests. Incidentally, I've been working on improving test coverage for extended stats over the past few days (it has ~80% lines covered, which is not bad nor great). I haven't submitted that to hackers yet, because it's mostly mechanical and it's would interfere with the two existing threads about extended stats ... Speaking of which, would you take a look at [1]? I think supporting SAOP is fine, but I wonder if you agree with my conclusion we can't really support inclusion @> as explained in [2]. [1] https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy@pierred-pdoc [2] https://www.postgresql.org/message-id/20200202184134.swoqkqlqorqolrqv%40development regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
В списке pgsql-hackers по дате отправления: