Re: [HACKERS] PATCH: multivariate histograms and MCV lists
От | Tomas Vondra |
---|---|
Тема | Re: [HACKERS] PATCH: multivariate histograms and MCV lists |
Дата | |
Msg-id | c2406825-6ebe-26e6-91b3-ba94026546a6@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] PATCH: multivariate histograms and MCV lists (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Ответы |
Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Re: [HACKERS] PATCH: multivariate histograms and MCV lists |
Список | pgsql-hackers |
On 3/14/19 12:56 PM, Kyotaro HORIGUCHI wrote: > At Wed, 13 Mar 2019 19:37:45 +1300, David Rowley <david.rowley@2ndquadrant.com> wrote in <CAKJS1f_6qDQj9m2H0jF4bRkZVLpfc7O9E+MxdXrq0wgv0z1NrQ@mail.gmail.com> >> On Wed, 13 Mar 2019 at 17:20, Kyotaro HORIGUCHI >> <horiguchi.kyotaro@lab.ntt.co.jp> wrote: >>> bms_member_index seems working differently than maybe expected. >>> >>> bms_member_index((2, 4), 0) => 0, (I think) should be -1 >>> bms_member_index((2, 4), 1) => 0, should be -1 >>> bms_member_index((2, 4), 2) => 0, should be 0 >>> bms_member_index((2, 4), 3) => 1, should be -1 >>> bms_member_index((2, 4), 4) => 1, should be 1 >>> bms_member_index((2, 4), 5) => 2, should be -1 >>> bms_member_index((2, 4), 6) => 2, should be -1 >>> ... >>> bms_member_index((2, 4), 63) => 2, should be -1 >>> bms_member_index((2, 4), 64) => -1, correct >>> >>> It works correctly only when x is a member - the way the function >>> is maybe actually used in this patch -, or needs to change the >>> specifiction (or the comment) of the function. >> >> Looks like: >> >> + if (wordnum >= a->nwords) >> + return -1; >> >> should be: >> >> + if (wordnum >= a->nwords || >> + (a->word[wordnum] & ((bitmapword) 1 << bitnum)) == 0) >> + return -1; > > Yeah, seems right. > Yep, that was broken. The attached patch fixes this by simply calling bms_is_member, instead of copying the checks into bms_member_index. I've also reworked the regression tests to use a function extracting the cardinality estimates, as proposed by Dean and David. I have not reduced the size of data sets yet, so the tests are not much faster, but we no longer check the exact query plan. That's probably a good idea anyway. Actually - the tests are a bit faster because it allows removing indexes that were used for the query plans. FWIW I've noticed an annoying thing when modifying type of column not included in a statistics. Consider this: create table t (a int, b int, c text); insert into t select mod(i,10), mod(i,10), '' from generate_series(1,10000) s(i); create statistics s (dependencies) on a,b from t; analyze t; explain analyze select * from t where a = 1 and b = 1; QUERY PLAN --------------------------------------------------------------------- Seq Scan on t (cost=0.00..205.00 rows=1000 width=9) (actual time=0.014..1.910 rows=1000 loops=1) Filter: ((a = 1) AND (b = 1)) Rows Removed by Filter: 9000 Planning Time: 0.119 ms Execution Time: 2.234 ms (5 rows) alter table t alter c type varchar(61); explain analyze select * from t where a = 1 and b = 1; QUERY PLAN --------------------------------------------------------------------- Seq Scan on t (cost=0.00..92.95 rows=253 width=148) (actual time=0.020..2.420 rows=1000 loops=1) Filter: ((a = 1) AND (b = 1)) Rows Removed by Filter: 9000 Planning Time: 0.128 ms Execution Time: 2.767 ms (5 rows) select stxdependencies from pg_statistic_ext; stxdependencies ------------------------------------------ {"1 => 2": 1.000000, "2 => 1": 1.000000} (1 row) That is, we don't remove the statistics, but the estimate still changes. But that's because the ALTER TABLE also resets reltuples/relpages: select relpages, reltuples from pg_class where relname = 't'; relpages | reltuples ----------+----------- 0 | 0 (1 row) That's a bit unfortunate, and it kinda makes the whole effort to not drop the statistics unnecessarily kinda pointless :-( regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
В списке pgsql-hackers по дате отправления:
Следующее
От: Shawn DebnathДата:
Сообщение: Re: Introduce timeout capability for ConditionVariableSleep