Обсуждение: [HACKERS] PATCH: multivariate histograms and MCV lists

Поиск

Список

Период

Сортировка

[HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

14 августа 2017 г., 04:48:27

Hi all,

For PostgreSQL 10 we managed to get the basic CREATE STATISTICS bits in 
(grammar, infrastructure, and two simple types of statistics). See:

     https://commitfest.postgresql.org/13/852/

This patch presents a rebased version of the remaining parts, adding 
more complex statistic types (MCV lists and histograms), and hopefully 
some additional improvements.

The code was rebased on top of current master, and I've made various 
improvements to match how the committed parts were reworked. So the 
basic idea and shape remains the same, the tweaks are mostly small.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Adrien Nayrat

Дата:

17 августа 2017 г., 16:06:32

On 08/14/2017 12:48 AM, Tomas Vondra wrote:
> Hi all,
>
> For PostgreSQL 10 we managed to get the basic CREATE STATISTICS bits in
> (grammar, infrastructure, and two simple types of statistics). See:
>
>     https://commitfest.postgresql.org/13/852/
>
> This patch presents a rebased version of the remaining parts, adding more
> complex statistic types (MCV lists and histograms), and hopefully some
> additional improvements.
>
> The code was rebased on top of current master, and I've made various
> improvements to match how the committed parts were reworked. So the basic idea
> and shape remains the same, the tweaks are mostly small.
>
>
> regards
>
>
>
>

Hello,

There is no check of "statistics type/kind" in pg_stats_ext_mcvlist_items and
pg_histogram_buckets.

select stxname,stxkind from pg_statistic_ext ;
  stxname  | stxkind
-----------+---------
 stts3     | {h}
 stts2     | {m}

So you can call :

SELECT * FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext WHERE stxname
= 'stts3'));

SELECT * FROM pg_histogram_buckets((SELECT oid FROM pg_statistic_ext WHERE
stxname = 'stts2'), 0);

Both crashes.

Unfotunately, I don't have the knowledge to produce a patch :/

Small fix in documentation, patch attached.


Thanks!

--
Adrien NAYRAT

http://dalibo.com - http://dalibo.org

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

26 августа 2017 г., 20:14:24


On 08/17/2017 12:06 PM, Adrien Nayrat wrote:>
> Hello,
> 
> There is no check of "statistics type/kind" in
> pg_stats_ext_mcvlist_items and pg_histogram_buckets.
> 
> select stxname,stxkind from pg_statistic_ext ; stxname  | stxkind 
> -----------+--------- stts3     | {h} stts2     | {m}
> 
> So you can call :
> 
> SELECT * FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext
> WHERE stxname = 'stts3'));
> 
> SELECT * FROM pg_histogram_buckets((SELECT oid FROM pg_statistic_ext
> WHERE stxname = 'stts2'), 0);
> 
> Both crashes.
> 

Thanks for the report, this is clearly a bug. I don't think we need to
test the stxkind, but rather a missing check that the requested type is
actually built.

> Unfotunately, I don't have the knowledge to produce a patch :/
> 
> Small fix in documentation, patch attached.
> 

Thanks, will fix.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

08 сентября 2017 г., 02:07:47

Hi,

Attached is an updated version of the patch, fixing the issues reported
by Adrien Nayrat, and also a bunch of issues pointed out by valgrind.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

13 сентября 2017 г., 03:06:18

Attached is an updated version of the patch, dealing with fallout of
821fb8cdbf700a8aadbe12d5b46ca4e61be5a8a8 which touched the SGML
documentation for CREATE STATISTICS.

regards

On 09/07/2017 10:07 PM, Tomas Vondra wrote:
> Hi,
> 
> Attached is an updated version of the patch, fixing the issues reported
> by Adrien Nayrat, and also a bunch of issues pointed out by valgrind.
> 
> regards
> 

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Mark Dilger

Дата:

10 ноября 2017 г., 21:17:15

> On Sep 12, 2017, at 2:06 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> Attached is an updated version of the patch, dealing with fallout of
> 821fb8cdbf700a8aadbe12d5b46ca4e61be5a8a8 which touched the SGML
> documentation for CREATE STATISTICS.

Your patches need updating.

Tom's commit 471d55859c11b40059aef7dd82f82b3a0dc338b1 changed
src/bin/psql/describe.c, which breaks your 0001-multivariate-MCV-lists.patch.gz
file.

I reviewed the patch a few months ago, and as I recall, it looked good to me.
I should review it again before approving it, though.

mark

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

19 ноября 2017 г., 02:28:58

Hi,

Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hi,

On 11/25/2017 05:15 PM, Mark Dilger wrote:
> 
>> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> Hi,
>>
>> Attached is an updated version of the patch, adopting the psql describe
>> changes introduced by 471d55859c11b.
>>
>> regards
>>
>> -- 
>> Tomas Vondra                  http://www.2ndQuadrant.com
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>> <0001-multivariate-MCV-lists.patch.gz><0002-multivariate-histograms.patch.gz>
> 
> Thanks, Tomas, again for your work on this feature.
> 
> Applying just the 0001-multivariate-MCV-lists.patch to the current master, and
> then extending the stats_ext.sql test as follows, I am able to trigger an error,
> "ERROR:  operator 4294934272 is not a valid ordering operator".
> 

Ah, that's a silly bug ...

The code assumes that VacAttrStats->extra_data is always StdAnalyzeData,
and attempts to extract the ltopr from that. But for arrays that's of
course not true (array_typanalyze uses ArrayAnalyzeExtraData instead).

The reason why this only fails after the second INSERT is that we need
at least two occurrences of a value before considering it eligible for
MCV list. So after the first INSERT we don't even call the serialize.

Attached is a fix that should resolve this in MCV lists by looking up
the operator using lookup_type_cache() when serializing the MCV.

FWIW histograms have the same issue, but on more places (not just in
serialize, but also when building the histogram).

I'll send a properly updated patch series shortly, with tests checking
correct behavior with arrays.

Thanks for the report.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

0001-MCV-fix.patch

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Mark Dilger

Дата:

26 ноября 2017 г., 02:23:17

> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> Hi,
>
> Attached is an updated version of the patch, adopting the psql describe
> changes introduced by 471d55859c11b.
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> <0001-multivariate-MCV-lists.patch.gz><0002-multivariate-histograms.patch.gz>

Hello Tomas,

After applying both your patches, I get a warning:

histogram.c:1284:10: warning: taking the absolute value of unsigned type 'uint32' (aka 'unsigned int') has no effect
[-Wabsolute-value]      delta = fabs(data->numrows);               ^ 
histogram.c:1284:10: note: remove the call to 'fabs' since unsigned values cannot be negative       delta =
fabs(data->numrows);              ^~~~ 
1 warning generated.

Looking closer at this section, there is some odd integer vs. floating point arithmetic happening
that is not necessarily wrong, but might be needlessly inefficient:
   delta = fabs(data->numrows);   split_value = values[0].value;
   for (i = 1; i < data->numrows; i++)   {       if (values[i].value != values[i - 1].value)       {           /* are
wecloser to splitting the bucket in half? */           if (fabs(i - data->numrows / 2.0) < delta)           {
   /* let's assume we'll use this value for the split */               split_value = values[i].value;
delta= fabs(i - data->numrows / 2.0);               nrows = i;           }       }   } 

I'm not sure the compiler will be able to optimize out the recomputation of data->numrows / 2.0
each time through the loop, since the compiler might not be able to prove to itself that data->numrows
does not get changed.  Perhaps you should compute it just once prior to entering the outer loop,
store it in a variable of integer type, round 'delta' off and store in an integer, and do integer comparisons
within the loop?  Just a thought....

mark

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Mark Dilger

Дата:

26 ноября 2017 г., 02:25:49

> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> Hi,
>
> Attached is an updated version of the patch, adopting the psql describe
> changes introduced by 471d55859c11b.
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> <0001-multivariate-MCV-lists.patch.gz><0002-multivariate-histograms.patch.gz>

In src/backend/statistics/mcv.c, you have a few typos:

+ * there bo be a lot of duplicate values. But perhaps that's not true and we

+       /* Now it's safe to access the dimention info. */

+        * Nowe we know the total expected MCV size, including all the pieces

+                       /* pased by reference, but fixed length (name, tid, ...) */


In src/include/statistics/statistics.h, there is some extraneous whitespace that needs
removing.


mark

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Mark Dilger

Дата:

26 ноября 2017 г., 03:01:06

> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> Hi,
>
> Attached is an updated version of the patch, adopting the psql describe
> changes introduced by 471d55859c11b.

Hi Tomas,

In src/backend/statistics/dependencies.c, you have introduced a comment:

+       /*
+        * build an array of SortItem(s) sorted using the multi-sort support
+        *
+        * XXX This relies on all stats entries pointing to the same tuple
+        * descriptor. Not sure if that might not be the case.
+        */

Would you mind explaining that a bit more for me?  I don't understand exactly what
you mean here, but it sounds like the sort of thing that needs to be clarified/fixed
before it can be committed.  Am I misunderstanding this?

In src/backend/statistics/mcv.c, you have comments:

+ * FIXME: Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we want to
+ * check the most frequent items first.
+ *
+ * TODO: We're using Datum (8B), even for data types (e.g. int4 or float4).
+ * Maybe we could save some space here, but the bytea compression should
+ * handle it just fine.
+ *
+ * TODO: This probably should not use the ndistinct directly (as computed from
+ * the table, but rather estimate the number of distinct values in the
+ * table), no?

Do you intend these to be fixed/implemented prior to committing this patch?

Further down in function statext_mcv_build, you have two loops, the first allocating
memory and the second initializing the memory.  There is no clear reason why this
must be done in two loops.  I tried combining the two loops into one, and it worked
just fine, but did not look any cleaner to me.  Feel free to disregard this paragraph
if you like it better the way you currently have it organized.

Further down in statext_mcv_deserialize, you have some elogs which might need to be
ereports.  It is unclear to me whether you consider these deserialize error cases to be
"can't happen" type errors.  If so, you might add that fact to the comments rather than
changing the elogs to ereports.

mark

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

26 ноября 2017 г., 05:18:11

Hi,

On 11/25/2017 09:23 PM, Mark Dilger wrote:
> 
>> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> Hi,
>>
>> Attached is an updated version of the patch, adopting the psql describe
>> changes introduced by 471d55859c11b.
>>
>> regards
>>
>> -- 
>> Tomas Vondra                  http://www.2ndQuadrant.com
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>> <0001-multivariate-MCV-lists.patch.gz><0002-multivariate-histograms.patch.gz>
> 
> Hello Tomas,
> 
> After applying both your patches, I get a warning:
> 
> histogram.c:1284:10: warning: taking the absolute value of unsigned type 'uint32' (aka 'unsigned int') has no effect
[-Wabsolute-value]
>         delta = fabs(data->numrows);
>                 ^
> histogram.c:1284:10: note: remove the call to 'fabs' since unsigned values cannot be negative
>         delta = fabs(data->numrows);
>                 ^~~~
> 1 warning generated.
> 

Hmm, yeah. The fabs() call is unnecessary, and probably a remnant from
some previous version where the field was not uint32.

I wonder why you're getting the warning and I don't, though. What
compiler are you using?

> 
> Looking closer at this section, there is some odd integer vs. floating point arithmetic happening
> that is not necessarily wrong, but might be needlessly inefficient:
> 
>     delta = fabs(data->numrows);
>     split_value = values[0].value;
> 
>     for (i = 1; i < data->numrows; i++)
>     {
>         if (values[i].value != values[i - 1].value)
>         {
>             /* are we closer to splitting the bucket in half? */
>             if (fabs(i - data->numrows / 2.0) < delta)
>             {
>                 /* let's assume we'll use this value for the split */
>                 split_value = values[i].value;
>                 delta = fabs(i - data->numrows / 2.0);
>                 nrows = i;
>             }
>         }
>     }
> 
> I'm not sure the compiler will be able to optimize out the recomputation of data->numrows / 2.0
> each time through the loop, since the compiler might not be able to prove to itself that data->numrows
> does not get changed.  Perhaps you should compute it just once prior to entering the outer loop,
> store it in a variable of integer type, round 'delta' off and store in an integer, and do integer comparisons
> within the loop?  Just a thought....
> 

Yeah, that's probably right. But I wonder if the loop is needed at all,
or whether we should start at i=(data->numrows/2.0) instead, and walk to
the closest change of value in both directions. That would probably save
more CPU than computing numrows/2.0 only once.

The other issue in that block of code seems to be that we compare the
values using simple inequality. That probably works for passbyval data
types, but we should use proper comparator (e.g. compare_datums_simple).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

26 ноября 2017 г., 05:33:46


On 11/25/2017 10:01 PM, Mark Dilger wrote:
> 
>> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> Hi,
>>
>> Attached is an updated version of the patch, adopting the psql describe
>> changes introduced by 471d55859c11b.
> 
> Hi Tomas,
> 
> In src/backend/statistics/dependencies.c, you have introduced a comment:
> 
> +       /*
> +        * build an array of SortItem(s) sorted using the multi-sort support
> +        *
> +        * XXX This relies on all stats entries pointing to the same tuple
> +        * descriptor. Not sure if that might not be the case.
> +        */
> 
> Would you mind explaining that a bit more for me?  I don't understand exactly what
> you mean here, but it sounds like the sort of thing that needs to be clarified/fixed
> before it can be committed.  Am I misunderstanding this?
> 

The call right after that comment is
   items = build_sorted_items(numrows, rows, stats[0]->tupDesc,                              mss, k, attnums_dep);

That method processes an array of tuples, and the structure is defined
by "tuple descriptor" (essentially a list of attribute info - data type,
length, ...). We get that from stats[0] and assume all the entries point
to the same tuple descriptor. That's generally safe assumption, I think,
because all the stats entries relate to columns from the same table.

> 
> In src/backend/statistics/mcv.c, you have comments:
> 
> + * FIXME: Single-dimensional MCV is sorted by frequency (descending). We
> + * should do that too, because when walking through the list we want to
> + * check the most frequent items first.
> + *
> + * TODO: We're using Datum (8B), even for data types (e.g. int4 or float4).
> + * Maybe we could save some space here, but the bytea compression should
> + * handle it just fine.
> + *
> + * TODO: This probably should not use the ndistinct directly (as computed from
> + * the table, but rather estimate the number of distinct values in the
> + * table), no?
> 
> Do you intend these to be fixed/implemented prior to committing this patch?
> 

Actually, the first FIXME is obsolete, as build_distinct_groups returns
the groups sorted by frequency. I'll remove that.

I think the rest is more a subject for discussion, so I'd need to hear
some feedback.

> 
> Further down in function statext_mcv_build, you have two loops, the first allocating
> memory and the second initializing the memory.  There is no clear reason why this
> must be done in two loops.  I tried combining the two loops into one, and it worked
> just fine, but did not look any cleaner to me.  Feel free to disregard this paragraph
> if you like it better the way you currently have it organized.
> 

I did it this way because of readability. I don't think this is a major
efficiency issue, as the maximum number of items is fairly limited, and
it happens only once at the end of the MCV list build (and the sorts and
comparisons are likely much more CPU expensive).

> 
> Further down in statext_mcv_deserialize, you have some elogs which might need to be
> ereports.  It is unclear to me whether you consider these deserialize error cases to be
> "can't happen" type errors.  If so, you might add that fact to the comments rather than
> changing the elogs to ereports.
> 

I might be missing something, but why would ereport be more appropriate
than elog? Ultimately, there's not much difference between elog(ERROR)
and ereport(ERROR) - both will cause a failure.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Mark Dilger

Дата:

26 ноября 2017 г., 07:17:44

> On Nov 25, 2017, at 3:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
>
>
> On 11/25/2017 10:01 PM, Mark Dilger wrote:
>>
>>> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>>
>>> Hi,
>>>
>>> Attached is an updated version of the patch, adopting the psql describe
>>> changes introduced by 471d55859c11b.
>>
>> Hi Tomas,
>>
>> In src/backend/statistics/dependencies.c, you have introduced a comment:
>>
>> +       /*
>> +        * build an array of SortItem(s) sorted using the multi-sort support
>> +        *
>> +        * XXX This relies on all stats entries pointing to the same tuple
>> +        * descriptor. Not sure if that might not be the case.
>> +        */
>>
>> Would you mind explaining that a bit more for me?  I don't understand exactly what
>> you mean here, but it sounds like the sort of thing that needs to be clarified/fixed
>> before it can be committed.  Am I misunderstanding this?
>>
>
> The call right after that comment is
>
>    items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
>                               mss, k, attnums_dep);
>
> That method processes an array of tuples, and the structure is defined
> by "tuple descriptor" (essentially a list of attribute info - data type,
> length, ...). We get that from stats[0] and assume all the entries point
> to the same tuple descriptor. That's generally safe assumption, I think,
> because all the stats entries relate to columns from the same table.

Right, I got that, and tried mocking up some code to test that in an Assert.
I did not pursue that far enough to reach any conclusion, however.  You
seem to be indicating in the comment some uncertainty about whether the
assumption is safe.  Do we need to dig into that further?

>>
>> In src/backend/statistics/mcv.c, you have comments:
>>
>> + * FIXME: Single-dimensional MCV is sorted by frequency (descending). We
>> + * should do that too, because when walking through the list we want to
>> + * check the most frequent items first.
>> + *
>> + * TODO: We're using Datum (8B), even for data types (e.g. int4 or float4).
>> + * Maybe we could save some space here, but the bytea compression should
>> + * handle it just fine.
>> + *
>> + * TODO: This probably should not use the ndistinct directly (as computed from
>> + * the table, but rather estimate the number of distinct values in the
>> + * table), no?
>>
>> Do you intend these to be fixed/implemented prior to committing this patch?
>>
>
> Actually, the first FIXME is obsolete, as build_distinct_groups returns
> the groups sorted by frequency. I'll remove that.

Ok, good.  That's the one I understood least.

> I think the rest is more a subject for discussion, so I'd need to hear
> some feedback.

In terms of storage efficiency, you are using float8 for the frequency, which is consistent
with what other stats work uses, but may be overkill.  A float4 seems sufficient to me.
The extra four bytes for a float8 may be pretty small compared to the size of the arrays
being stored, so I'm not sure it matters.  Also, this might have been discussed before,
and I am not asking for a reversal of decisions the members of this mailing list may
already have reached.

As for using arrays of something smaller than Datum, you'd need some logic to specify
what the size is in each instance, and that probably complicates the code rather a lot.
Maybe someone else has a technique for doing that cleanly?

>>
>> Further down in function statext_mcv_build, you have two loops, the first allocating
>> memory and the second initializing the memory.  There is no clear reason why this
>> must be done in two loops.  I tried combining the two loops into one, and it worked
>> just fine, but did not look any cleaner to me.  Feel free to disregard this paragraph
>> if you like it better the way you currently have it organized.
>>
>
> I did it this way because of readability. I don't think this is a major
> efficiency issue, as the maximum number of items is fairly limited, and
> it happens only once at the end of the MCV list build (and the sorts and
> comparisons are likely much more CPU expensive).

I defer to your judgement here.  It seems fine the way you did it.

>> Further down in statext_mcv_deserialize, you have some elogs which might need to be
>> ereports.  It is unclear to me whether you consider these deserialize error cases to be
>> "can't happen" type errors.  If so, you might add that fact to the comments rather than
>> changing the elogs to ereports.
>>
>
> I might be missing something, but why would ereport be more appropriate
> than elog? Ultimately, there's not much difference between elog(ERROR)
> and ereport(ERROR) - both will cause a failure.

I understand project policy to allow elog for error conditions that will be reported
in "can't happen" type situations, similar to how an Assert would be used.  For
conditions that can happen through (mis)use by the user, ereport is appropriate.
Not knowing whether you thought these elogs were reporting conditions that a
user could cause, I did not know if you should change them to ereports, or if you
should just add a brief comment along the lines of /* should not be possible */.

I may misunderstand project policy.  If so, I'd gratefully accept correction on this
matter.

mark

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

26 ноября 2017 г., 08:20:26


On 11/26/2017 02:17 AM, Mark Dilger wrote:
> 
>> On Nov 25, 2017, at 3:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>>
>>
>> On 11/25/2017 10:01 PM, Mark Dilger wrote:
>>>
>>>> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Attached is an updated version of the patch, adopting the psql describe
>>>> changes introduced by 471d55859c11b.
>>>
>>> Hi Tomas,
>>>
>>> In src/backend/statistics/dependencies.c, you have introduced a comment:
>>>
>>> +       /*
>>> +        * build an array of SortItem(s) sorted using the multi-sort support
>>> +        *
>>> +        * XXX This relies on all stats entries pointing to the same tuple
>>> +        * descriptor. Not sure if that might not be the case.
>>> +        */
>>>
>>> Would you mind explaining that a bit more for me?  I don't understand exactly what
>>> you mean here, but it sounds like the sort of thing that needs to be clarified/fixed
>>> before it can be committed.  Am I misunderstanding this?
>>>
>>
>> The call right after that comment is
>>
>>    items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
>>                               mss, k, attnums_dep);
>>
>> That method processes an array of tuples, and the structure is defined
>> by "tuple descriptor" (essentially a list of attribute info - data type,
>> length, ...). We get that from stats[0] and assume all the entries point
>> to the same tuple descriptor. That's generally safe assumption, I think,
>> because all the stats entries relate to columns from the same table.
> 
> Right, I got that, and tried mocking up some code to test that in an Assert.
> I did not pursue that far enough to reach any conclusion, however.  You
> seem to be indicating in the comment some uncertainty about whether the
> assumption is safe.  Do we need to dig into that further?
> 

I don't think it's worth the effort, really. I don't think we can really
get mismatching tuple descriptors here - that could only happen with
columns coming from different tables, or something similarly obscure.

>>>
>>> In src/backend/statistics/mcv.c, you have comments:
>>>
>>> + * FIXME: Single-dimensional MCV is sorted by frequency (descending). We
>>> + * should do that too, because when walking through the list we want to
>>> + * check the most frequent items first.
>>> + *
>>> + * TODO: We're using Datum (8B), even for data types (e.g. int4 or float4).
>>> + * Maybe we could save some space here, but the bytea compression should
>>> + * handle it just fine.
>>> + *
>>> + * TODO: This probably should not use the ndistinct directly (as computed from
>>> + * the table, but rather estimate the number of distinct values in the
>>> + * table), no?
>>>
>>> Do you intend these to be fixed/implemented prior to committing this patch?
>>>
>>
>> Actually, the first FIXME is obsolete, as build_distinct_groups returns
>> the groups sorted by frequency. I'll remove that.
> 
> Ok, good.  That's the one I understood least.
> 
>> I think the rest is more a subject for discussion, so I'd need to hear
>> some feedback.
> 
> In terms of storage efficiency, you are using float8 for the frequency, which is consistent
> with what other stats work uses, but may be overkill.  A float4 seems sufficient to me.
> The extra four bytes for a float8 may be pretty small compared to the size of the arrays
> being stored, so I'm not sure it matters.  Also, this might have been discussed before,
> and I am not asking for a reversal of decisions the members of this mailing list may
> already have reached.
> 
> As for using arrays of something smaller than Datum, you'd need some logic to specify
> what the size is in each instance, and that probably complicates the code rather a lot.
> Maybe someone else has a technique for doing that cleanly? 
> 

Note that this is not about storage efficiency. The comment is before
statext_mcv_build, so it's actually related to in-memory representation.
If you look into statext_mcv_serialize, it does use typlen to only copy
the number of bytes needed for each column.

>>>
>>> Further down in function statext_mcv_build, you have two loops, the first allocating
>>> memory and the second initializing the memory.  There is no clear reason why this
>>> must be done in two loops.  I tried combining the two loops into one, and it worked
>>> just fine, but did not look any cleaner to me.  Feel free to disregard this paragraph
>>> if you like it better the way you currently have it organized.
>>>
>>
>> I did it this way because of readability. I don't think this is a major
>> efficiency issue, as the maximum number of items is fairly limited, and
>> it happens only once at the end of the MCV list build (and the sorts and
>> comparisons are likely much more CPU expensive).
> 
> I defer to your judgement here.  It seems fine the way you did it.
> 
>>> Further down in statext_mcv_deserialize, you have some elogs which might need to be
>>> ereports.  It is unclear to me whether you consider these deserialize error cases to be
>>> "can't happen" type errors.  If so, you might add that fact to the comments rather than
>>> changing the elogs to ereports.
>>>
>>
>> I might be missing something, but why would ereport be more appropriate
>> than elog? Ultimately, there's not much difference between elog(ERROR)
>> and ereport(ERROR) - both will cause a failure.
> 
> I understand project policy to allow elog for error conditions that will be reported
> in "can't happen" type situations, similar to how an Assert would be used.  For
> conditions that can happen through (mis)use by the user, ereport is appropriate.
> Not knowing whether you thought these elogs were reporting conditions that a
> user could cause, I did not know if you should change them to ereports, or if you
> should just add a brief comment along the lines of /* should not be possible */.
> 
> I may misunderstand project policy.  If so, I'd gratefully accept correction on this
> matter.
> 

I don't know - I always considered "elog" old interface, and "ereport"
is the new one. In any case, those are "should not happen" cases. It
would mean some sort of data corruption, or so.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tom Lane

Дата:

26 ноября 2017 г., 08:35:35

Mark Dilger <hornschnorter@gmail.com> writes:
>> On Nov 25, 2017, at 3:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> I might be missing something, but why would ereport be more appropriate
>> than elog? Ultimately, there's not much difference between elog(ERROR)
>> and ereport(ERROR) - both will cause a failure.

The core technical differences are (1) an ereport message is exposed for
translation, normally, while an elog is not; and (2) with ereport you can
set the errcode, whereas with elog it's always going to be XX000
(ERRCODE_INTERNAL_ERROR).

> I understand project policy to allow elog for error conditions that will be reported
> in "can't happen" type situations, similar to how an Assert would be used.  For
> conditions that can happen through (mis)use by the user, ereport is appropriate.

The project policy about this is basically that elog should only be used
for things that are legitimately "internal errors", ie not user-facing.
If there's a deterministic way for a user to trigger the error, or if
it can reasonably be expected to occur during normal operation, it should
definitely have an ereport (and a non-default errcode).
        regards, tom lane

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Alvaro Herrera

Дата:

26 ноября 2017 г., 09:32:08

Mark Dilger wrote:


> I understand project policy to allow elog for error conditions that will be reported
> in "can't happen" type situations, similar to how an Assert would be used.  For
> conditions that can happen through (mis)use by the user, ereport is appropriate.
> Not knowing whether you thought these elogs were reporting conditions that a
> user could cause, I did not know if you should change them to ereports, or if you
> should just add a brief comment along the lines of /* should not be possible */.

Two things dictate that policy:

1. messages are translated by default for ereport but not for elog.
Both things can be overridden, but we tend not to do it unless there's
no choice.

2. you can assign SQLSTATE only with ereport.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

27 ноября 2017 г., 22:47:13

Hi,

Attached is an updated version of the patch series, fixing the issues
reported by Mark Dilger:

1) Fix fabs() issue in histogram.c.

2) Do not rely on extra_data being StdAnalyzeData, and instead lookup
the LT operator explicitly. This also adds a simple regression tests to
make sure ANALYZE on arrays works fine, but perhaps we should invent
some simple queries too.

3) I've removed / clarified some of the comments mentioned by Mark.

4) I haven't changed how the statistics kinds are defined in relation.h,
but I agree there should be a comment explaining how STATS_EXT_INFO_*
relate to StatisticExtInfo.kinds.

5) The most significant change happened histograms. There used to be two
structures for histograms:

  - MVHistogram - expanded (no deduplication etc.), result of histogram
    build and never used for estimation

  - MVSerializedHistogram - deduplicated to save space, produced from
    MVHistogram before storing in pg_statistic_ext and never used for
    estimation

So there wasn't really any reason to expose the "non-serialized" version
outside histogram.c. It was just confusing and unnecessary, so I've
moved MVHistogram to histogram.c (and renamed it to MVHistogramBuild),
and renamed MVSerializedHistogram. And same for the MVBucket stuff.

So now we only deal with MVHistogram everywhere, except in histogram.c.

6) I've also made MVHistogram to include a varlena header directly (and
be packed as a bytea), which allows us to store it without having to
call any serialization functions).

I guess if we should do (5) and (6) for the MCV lists too, it seems more
convenient than the current approach. And perhaps even for the
statistics added to 9.6 (it does not change the storage format).


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Michael Paquier

Дата:

30 ноября 2017 г., 10:11:28

On Tue, Nov 28, 2017 at 1:47 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Attached is an updated version of the patch series, fixing the issues
> reported by Mark Dilger:

Moved to next CF.
-- 
Michael

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Mark Dilger

Дата:

20 декабря 2017 г., 01:17:41

> On Nov 27, 2017, at 8:47 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> Hi,
>
> Attached is an updated version of the patch series, fixing the issues
> reported by Mark Dilger:
>
> 1) Fix fabs() issue in histogram.c.
>
> 2) Do not rely on extra_data being StdAnalyzeData, and instead lookup
> the LT operator explicitly. This also adds a simple regression tests to
> make sure ANALYZE on arrays works fine, but perhaps we should invent
> some simple queries too.
>
> 3) I've removed / clarified some of the comments mentioned by Mark.
>
> 4) I haven't changed how the statistics kinds are defined in relation.h,
> but I agree there should be a comment explaining how STATS_EXT_INFO_*
> relate to StatisticExtInfo.kinds.
>
> 5) The most significant change happened histograms. There used to be two
> structures for histograms:
>
>  - MVHistogram - expanded (no deduplication etc.), result of histogram
>    build and never used for estimation
>
>  - MVSerializedHistogram - deduplicated to save space, produced from
>    MVHistogram before storing in pg_statistic_ext and never used for
>    estimation
>
> So there wasn't really any reason to expose the "non-serialized" version
> outside histogram.c. It was just confusing and unnecessary, so I've
> moved MVHistogram to histogram.c (and renamed it to MVHistogramBuild),
> and renamed MVSerializedHistogram. And same for the MVBucket stuff.
>
> So now we only deal with MVHistogram everywhere, except in histogram.c.
>
> 6) I've also made MVHistogram to include a varlena header directly (and
> be packed as a bytea), which allows us to store it without having to
> call any serialization functions).
>
> I guess if we should do (5) and (6) for the MCV lists too, it seems more
> convenient than the current approach. And perhaps even for the
> statistics added to 9.6 (it does not change the storage format).

I tested your latest patches on my mac os x laptop and got one test
failure due to the results of 'explain' coming up differently.  For the record,
I followed these steps:

cd postgresql/
git pull
# this got my directory up to 8526bcb2df76d5171b4f4d6dc7a97560a73a5eff with no local changes
patch -p 1 < ../0001-multivariate-MCV-lists.patch
patch -p 1 < ../0002-multivariate-histograms.patch
./configure --prefix=/Users/mark/master/testinstall --enable-cassert --enable-tap-tests --enable-depend && make -j4 &&
makecheck-world 

mark

On 12/20/2017 02:44 AM, Mark Dilger wrote:
> 
>> On Dec 19, 2017, at 4:31 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> Hi,
>>
>> On 12/19/2017 08:17 PM, Mark Dilger wrote:
>>>
>>> I tested your latest patches on my mac os x laptop and got one test
>>> failure due to the results of 'explain' coming up differently.  For the record,
>>> I followed these steps:
>>>
>>> cd postgresql/
>>> git pull
>>> # this got my directory up to 8526bcb2df76d5171b4f4d6dc7a97560a73a5eff with no local changes
>>> patch -p 1 < ../0001-multivariate-MCV-lists.patch
>>> patch -p 1 < ../0002-multivariate-histograms.patch
>>> ./configure --prefix=/Users/mark/master/testinstall --enable-cassert --enable-tap-tests --enable-depend && make -j4
&&make check-world

>>>
>>
>> Yeah, those steps sounds about right.
>>
>> Apparently this got broken by ecc27d55f4, although I don't quite
>> understand why - but it works fine before. Can you try if it works fine
>> on 9f4992e2a9 and fails with ecc27d55f4?
> 
> It succeeds with 9f4992e2a9.  It fails with ecc27d55f4.  The failures look
> to be the same as I reported previously.
> 

Gah, this turned out to be a silly bug. The ecc27d55f4 commit does:

    ... and fix dependencies_clauselist_selectivity() so that
    estimatedclauses actually is a pure output argument as stated by
    its API contract.

which does bring the code in line with the comment stating that
'estimatedclauses' is an output parameter. It wasn't meant to be
strictly output, though, but an input/output one instead (to pass
information about already estimated clauses when applying multiple
statistics).

With only dependencies it did not matter, but with the new MCV and
histogram patches we do this:

   Bitmapset *estimatedclauses = NULL;

   s1 *= statext_clauselist_selectivity(..., &estimatedclauses);

   s1 *= dependencies_clauselist_selectivity(..., &estimatedclauses);

Since ecc27d55f4, the first thing dependencies_clauselist_selectivity
does is resetting estimatedclauses to NULL, throwing away  information
about which clauses were estimated by MCV and histogram stats.

Of course, that's something ecc27d55f4 could not predict, but the reset
of estimatedclauses also makes the first loop over clauses rather
confusing, as it also checks the estimatedclauses bitmapset:

    listidx = 0;
    foreach(l, clauses)
    {
        Node *clause = (Node *) lfirst(l);

        if (!bms_is_member(listidx, *estimatedclauses))
        {
            ...
        }

        listidx++;
    }

Of course, the index can never be part of the bitmapset - we've just
reset it to NULL, and it's the first loop. This does not break anything,
but it's somewhat confusing.

Attached is an updated patch series, where the first patch fixes this by
removing the reset of estimatedclauses (and tweaking the comment).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Thomas Munro

Дата:

12 января 2018 г., 06:48:58

On Thu, Jan 4, 2018 at 1:12 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Attached is an updated patch series, where the first patch fixes this by
> removing the reset of estimatedclauses (and tweaking the comment).

Hi Tomas,

FYI, from the annoying robot department:

ref/create_statistics.sgml:170: parser error : Opening and ending tag
mismatch: structname line 170 and unparseable
   Create table <structname>t2</> with two perfectly correlated columns
                                 ^
ref/create_statistics.sgml:195: parser error : Opening and ending tag
mismatch: structname line 195 and unparseable
   Create table <structname>t3</> with two strongly correlated columns, and
                                 ^
ref/create_statistics.sgml:213: parser error : StartTag: invalid element name
EXPLAIN ANALYZE SELECT * FROM t3 WHERE (a < 500) AND (b > 500);
                                           ^
ref/create_statistics.sgml:216: parser error : StartTag: invalid element name
EXPLAIN ANALYZE SELECT * FROM t3 WHERE (a < 400) AND (b > 600);
                                           ^
ref/create_statistics.sgml:239: parser error : chunk is not well balanced
reference.sgml:116: parser error : Failure to process entity createStatistics
   &createStatistics;
                     ^
reference.sgml:116: parser error : Entity 'createStatistics' not defined
   &createStatistics;
                     ^
reference.sgml:293: parser error : chunk is not well balanced
postgres.sgml:231: parser error : Failure to process entity reference
 &reference;
            ^
postgres.sgml:231: parser error : Entity 'reference' not defined
 &reference;
            ^

-- 
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

13 января 2018 г., 04:51:05


On 01/12/2018 01:48 AM, Thomas Munro wrote:
> On Thu, Jan 4, 2018 at 1:12 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Attached is an updated patch series, where the first patch fixes this by
>> removing the reset of estimatedclauses (and tweaking the comment).
> 
> Hi Tomas,
> 
> FYI, from the annoying robot department:
> 
> ref/create_statistics.sgml:170: parser error : Opening and ending tag
> mismatch: structname line 170 and unparseable
>    Create table <structname>t2</> with two perfectly correlated columns
>                                  ^
> ref/create_statistics.sgml:195: parser error : Opening and ending tag
> mismatch: structname line 195 and unparseable
>    Create table <structname>t3</> with two strongly correlated columns, and
>                                  ^

Thanks. Attached is an updated patch fixing all the doc issues.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

25 февраля 2018 г., 04:01:59

Hi,

Attached is an updated version of the patch, fixing some minor bitrot
and duplicate OIDs.

Sadly, this patch series does not seem to move forward very much, and
I'm not sure how to change that :-/ But I'd like to point out that while
those new statistics are still per-table, we might use them to improve
join estimates (which is kinda the elephant in the room, when it comes
to estimates) similarly to what eqjoinsel() does with per-column stats.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

An updated patch version, fixing the breakage caused by fd1a421fe6
twiddling with pg_proc.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

OK, here is an updated patch fixing breakage caused by 5564c11815.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

15 марта 2018 г., 06:31:19

Hi,

On 03/10/2018 06:19 PM, Alvaro Herrera wrote:
> On 0002:
> 
> In terms of docs, I think it's better not to have anything
> user-facing in the README. Consider that users are going to be
> reading the HTML docs only, and many of them may not have the README
> available at all. So anything that could be useful to users must be
> in the XML docs only; keep in the README only stuff that would be
> useful to a developer (a section such as "not yet implemented" would
> belong there, for example). Stuff that's in the XML should not appear
> in the README (because DRY). For the same reason, having the XML docs
> end with "see the README" seems a bad idea to me.
> 

I do agree with this in general, but I'm not sure which "user-facing"
bits in the READMEs you mean. I'll go through the docs, but it would be
easier to start with some hints.

> UPDATE_RESULT() is a bit weird to me. I think after staring at it for
> a while it looks okay, but why was it such a shock? In 0002 it's
> only used in one place so I would suggest to have it expanded, but I
> see you use it in 0003 also, three times I think. IMO for clarity it
> seems better to just have the expanded code rather than the macro.
> 

I don't quite see why expanding the macro would make the code clearer,
to be honest. I mean, expanding all

    UPDATE_RESULT(matches[i], tmp[i], is_or)

calls to

    matches[i]
        = is_or ? Max(matches[i],tmp[i]) : Min(matches[i], tmp[i]);

does not convey the intent of the code very well, I think. But I'm not
going to fight for it very hard.

That being said, perhaps the name of the macro is not very clear, and
something like MERGE_MATCH would be a better fit.

> find_ext_attnums (and perhaps other places) have references to 
> renamed columns, "starelid" and others.
Will fix.

>  Also there is this comment:
> /* Prepare to scan pg_statistic_ext for entries having indrelid = this rel. */
> which is outdated since it uses syscache, not a scan.  Just remove the
> comment ...
> 

Will fix.

> Please add a comment on what does build_attnums() do.
> 
> pg_stats_ext_mcvlist_items is odd. I suppose you made it take oid to 
> avoid having to deal with a malicious bytea?

That is one reason, yes. The other reason is that we also need to do

    getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
                      &outfuncs[i], &isvarlena);

so that we can format the MCV items as text. Which means we need
additional information about the extended statistic, so that we can
determine data types. Maybe we could simply store OIDs into the
statistic, similarly to arrays.

That wouldn't solve the issue of malicious values, but maybe we could
make it accept just pg_mcv_list - that should be safe, as casts from
bytea are not supported.

> The query in docs is pretty odd-looking,
> 
> SELECT * FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext WHERE stxname = 'stts2'));
> If we keep the function as is, I would suggest to use LATERAL instead,
>   SELECT m.* FROM pg_statistic_ext, pg_mcv_list_items(oid) m WHERE stxname = 'stts2';
> but seems like it should be more like this instead:
>   SELECT m.* FROM pg_statistic_ext, pg_mcv_list_items(stxmcv) m WHERE stxname = 'stts2';
> and not have the output formatting function load the data again from the
> table.  It'd be a bit like a type-specific UNNEST.
> 

OK, I'll look into that while reviewing the docs.

> There are a few elog(ERROR) messages.  The vast majority seem to be just
> internal messages so they're okay, but there is one that should be
> ereport:
> 
> +   if (total_length > (1024 * 1024))
> +       elog(ERROR, "serialized MCV list exceeds 1MB (%ld)", total_length);
> I think we have some precedent for better wording, such as
>     errmsg("index row size %zu exceeds maximum %zu for index \"%s\""
> so I would say
>    errmsg("serialized MCV list size %zu exceedes maximum %zu" )
> though I wonder when is this error thrown -- if this is detected during
> analyze for example, what happens?
> 

Actually, do we need/want to enforce such limit? It seemed like a good
idea back then, but perhaps having a limit with a mostly arbitrary value
is not such a great idea after all.

> There is this FIXME:
> +    * FIXME Should skip already estimated clauses (using the estimatedclauses
> +    * bitmap).
> Are you planning on implementing this before commit?
> 

Actually, in the MCV patch this is not really needed, because it gets
applied before functional dependencies (and those do skip already
estimated clauses).

Moreover the 0003 patch (histograms) reworks this part of the code a bit
(because MCV and histograms are somewhat complementary). So I think this
shouldn't really be a FIXME, but more a comment "We're not handling
this, because it's not needed."

But let me look at this a bit - it might make sense to move some of the
code from 0003 to 0002, which would fix this limitation, of course.

> There are other FIXMEs also.  This in particular caught my attention:
> 
> +           /* merge the bitmap into the existing one */
> +           for (i = 0; i < mcvlist->nitems; i++)
> +           {
> +               /*
> +                * Merge the result into the bitmap (Min for AND, Max for OR).
> +                *
> +                * FIXME this does not decrease the number of matches
> +                */
> +               UPDATE_RESULT(matches[i], or_matches[i], is_or);
> +           }
> 
> We come back to UPDATE_RESULT again ... and note how the comment makes
> no sense unless you know what UPDATE_RESULT does internally.  This is
> one more indication that the macro is not a great thing to have.  Let's
> lose it.  But while at it, what to do about the FIXME?
> 

Hmmm, not sure that's really a fault of the UPDATE_RESULT macro.

Sorting out the FIXME should not be difficult, I think - just remember
the original values of matches[i], and update the number of matches if
it gets flipped from true to false.

> You also have this
> +   /* XXX syscache contains OIDs of deleted stats (not invalidated) */
> +   if (!HeapTupleIsValid(htup))
> +       return NULL;
> but what does it mean?  Is it here to cover for some unknown bug?
> Should we maybe not have this at all?
> 

Yeah, that's a bogus/obsolete FIXME, from before we had proper cache
invalidations in RemoveStatistics I think.

> Another XXX comment says
> + * XXX All the memory is allocated in a single chunk, so that the caller
> + * can simply pfree the return value to release all of it.
> 
> but I would say just remove the XXX and leave the rest of the comment.
> 

OK, makes sense.

> There is another XXX comment that says "this is useless", and I agree.
> Just take it all out ...
> 

You mean the check with UINT16_MAX assert? Yeah, I'll get rid of that.

Thanks for the review!

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

19 марта 2018 г., 05:57:25

Hi,

Attached is an updated version of the patch series, addressing issues
pointed out by Alvaro. Let me go through the main changes:


1) I've updated / reworked the docs, updating the XML docs. There were
some obsolete references to functions that got renamed later, and I've
also reworked some of the user-facing docs with the aim to meet Alvaro's
suggestions. I've removed the references to READMEs etc, and at this
point I'm not sure I have a good idea how to improve this further ...


2) I got rid of the UPDATE_RESULT macro, along with counting the
matches. Initially I intended to just expand the macro and fix the match
counting (as mentioned in the FIXME), but I came to the conclusion it's
not really worth the complexity.

The idea was that by keeping the count of matching MCV items / histogram
buckets, we can terminate early in some cases. For example when
evaluating AND-clause, we can just terminate when (nmatches==0). But I
have no numbers demonstrating this actually helps, and furthermore it
was not implemented in histograms (well, we still counted the matches
but never terminated).

So I've just ripped that out and we can put it back later if needed.


3) Regarding the pg_mcv_list_items() and pg_histogram_buckets()
functions, it occurred to me that people can't really inject malicious
values because are no casts to the custom data types used to store MCV
lists and histograms in pg_statistic_ext.

The other issue was the lack of knowledge of data types for values
stored in the statistics. The code used OID of the statistic to get this
information (by looking at the relation). But it occurred to me this
could be solved the same way the regular statistics solve this - by
storing OID of the types. The anyarray does this automatically, but
there's no reason we can't do that too in pg_mcv_list and pg_histogram.

So I've done that, and the functions now take the custom data types
instead of the OID. I've also tweaked the documentation to use the
lateral syntax (good idea!) and added a new section into funcs.sgml.


4) I've merged the 0001 and 0002 patches. The 0001 was not really a bug
fix, and it was a behavior change required by introducing the MCV list,
so merging it seems right.


5) I've moved some changes from the histogram patch to MCV. The original
patch series was structured so that it introduced some code in mcv.c and
them moved it into extended_statistic.c so that it can be shared. Now
it's introduced in mcv.c right away, which makes it easier to understand
and reduces size of the patches.


6) I've fixed a bunch of comments, obsolete FIXMEs, etc.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hi Dean,

Here is an updated patch (hopefully) fixing the bugs you've reported so
far. In particular, it fixes this:

1) mostly harmless memset bug in UpdateStatisticsForTypeChange

2) passing the right list (stat_clauses) to mcv_clauselist_selectivity

3) corrections to a couple of outdated comments

4) handling of NOT clauses in MCV lists (and in histograms)

The query you posted does not fail anymore, but there's a room for
improvement. We should be able to handle queries like this:

    select * from foo where a=1 and not b=1;

But we don't, because we only recognize F_EQSEL, F_SCALARLTSEL and
F_SCALARGTSEL, but F_NEQSEL (which is what "not b=1" uses). Should be
simple to fix, I believe.

5) handling of mcv_lowsel in statext_clauselist_selectivity

I do believe the new behavior is correct - as I suspected, I broke this
during the last rebase, where I also moved some stuff from the histogram
part to the MCV part. I've also added the (sum of MCV frequencies), as
you suggested.

I think we could improve the estimate further by computing ndistinct
estimate, and then using that to compute average frequency of non-MCV
items. Essentially what var_eq_const does:

    if (otherdistinct > 1)
        selec /= otherdistinct;

Not sure how to do that when there are not just equality clauses.

BTW I think there's a bug in handling the fullmatch flag - it should not
be passed to AND/OR subclauses the way it is, because then

    WHERE a=1 OR (a=2 AND b=2)

will probably set it to 'true' because of (a=2 AND b=2). Which will
short-circuit the statext_clauselist_selectivity, forcing it to ignore
the non-MCV part.

But that's something I need to look at more closely tomorrow. Another
thing I probably need to do is to add more regression tests, protecting
against bugs similar to those you found.

Thanks for the feedback so far!


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

On 03/27/2018 07:34 PM, Tomas Vondra wrote:
> On 03/27/2018 07:03 PM, Dean Rasheed wrote:
>> On 27 March 2018 at 14:58, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>>> On 27 March 2018 at 01:36, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>>> 4) handling of NOT clauses in MCV lists (and in histograms)
>>>>
>>>> The query you posted does not fail anymore...
>>>>
>>> Ah, it turns out the previous query wasn't actually failing for the
>>> reason I thought it was -- it was failing because it had a
>>> ScalarArrayOpExpr that was being passed to
>>> mcv_clauselist_selectivity() because of the wrong list being passed to
>>> it. I could see from the code that a NOT clause would have tripped it
>>> up, but most NOT clauses actually get rewritten by negate_clause() so
>>> they end up not being NOT clauses.
>>>
>>
>> Thinking about that some, I think that the only NOT clauses this needs
>> to actually worry about are NOTs of boolean Vars. Anything else that
>> this code supports will have been transformed into something other
>> than a NOT before reaching this point. Thus it might be much simpler
>> to handle that as a special case in statext_is_compatible_clause() and
>> mcv_update_match_bitmap(), rather than trying to support general NOT
>> clauses, and going through a recursive call to
>> mcv_update_match_bitmap(), and then having to merge bitmaps. NOT of a
>> boolean Var could then be treated just like var=false, setting the
>> appropriate attribute match entry if it's found in the MCV list. This
>> would allow clauses like (a=1 and NOT b) to be supported, which I
>> don't think currently works, because fullmatch won't get set.
>>
> 
> Yes, I came to the same conclusion ;-) I'll send an updated patch later
> today.
> 

Attached is a patch fixing this. In the end I've decided to keep both
branches - one handling boolean Vars and one for NOT clauses. I think
you're right we can only see (NOT var) cases, but I'm not sure about that.

For example, what if an operator does not have a negator? Then we can't
transform NOT (a AND b) => (NOT a OR NOT b), I guess. So I kept this for
now, and we can remove this later.

I've added scalarneqsel, scalarlesel and scalargesel so that we
recognize those cases correctly. This fixes surprising behavior where
"obviously compatible" clauses like (a=1 AND b<1) became incompatible
when NOT was used, because

    NOT (a=1 AND b<1) = (a!=1 OR b>=1)

In my defense, the scalarlesel/scalargesel were introduced fairly
recently, I think.

I've also realized that the "fullmatch" flag is somewhat confused,
because some places interpreted it as "there is equality on each
attribute" but in fact it also required an actual MCV match. So when the
value was rare (not in MCV), it was always false.

There's a WIP part 0002, which should eventually be merged into 0001. It
should properly detect the case when each column has an equality, simply
by counting the top-level equality clauses (I'm not sure about the more
complex cases yet).

Another improvement done in this part is the ndistinct estimate. It
simply extracts Vars (from the top--level equality clauses, because it's
directly related to the fullmatch semantics), and uses that to compute
average frequency of non-MCV items. The function is just a simplified
version of the estimate_num_groups(), for this single-relation case.

BTW an unsquashed tag with those fixes is here:

    https://github.com/tvondra/postgres/tree/mvstats-20180328

it may be more convenient to quickly check the differences than
comparing the patches.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hi,

The attached patch version modifies how the non-MCV selectivity is
computed, along the lines explained in the previous message.

The comments in statext_clauselist_selectivity() explain it in far more
detail, but we this:

1) Compute selectivity using the MCV (s1).

2) To compute the non-MCV selectivity (s2) we do this:

2a) See how many top-level equalities are there (and compute ndistinct
estimate for those attributes).

2b) If there is an equality on each column, we know there can only be a
single matching item. If we found it in the MCV (i.e. s1 > 0) we're
done, and 's1' is the answer.

2c) If only some columns have equalities, we estimate the selectivity
for equalities as

    s2 = ((1 - mcv_total_sel) / ndistinct)

If there are no remaining conditions, we're done.

2d) To estimate the non-equality clauses (on non-MCV part only), we
either repeat the whole process by calling clauselist_selectivity() or
approximating s1 to the non-MCV part. This needs a bit of care to
prevent infinite loops.


Of course, with 0002 this changes slightly, because we may try using a
histogram to estimate the non-MCV part. But that's just an extra step
right before (2a).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hi all,

Attached is a rebased version of this patch series, mostly just fixing
the breakage caused by reworked format of initial catalog data.

Aside from that, the MCV building now adopts the logic introduced by
commit b5db1d93d2 for single-column MCV lists. The new algorithm seems
pretty good and I don't see why multi-column MCV lists should use
something special.

I'm sure there are plenty of open questions to discuss, particularly
stuff related to combining the various types of statistics to the final
estimate (a lot of that was already improved based on Dean's reviews).

On thing that occurred to me while comparing the single-column logic (as
implemented in selfuncs.c) and the new multi-column stuff, is dealing
with partially-matching histogram buckets.

In the single-column case, we pretty much assume uniform distribution in
each bucket, and linearly interpolate the selectivity. So for a bucket
with boundaries [0, 10] and condition "x <= 5" we return 0.5, for "x <
7" we return 0.7 and so on. This is what convert_to_scalar() does.

In the multi-column case, we simply count each matching bucket as 0.5,
without any attempts to linearly interpolate. It would not be difficult
to call "convert_to_scalar" for each condition (essentially repeating
the linear interpolation for each column), but then what? We could
simply compute a product of those results, of course, but that only
works assuming independence. And that's exactly the wrong thing to
assume here, considering the extended statistics are meant for cases
where the columns are not independent.

So I'd argue the 0.5 estimate for partially-matching buckets is the
right thing to do here, as it's minimizing the average error.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

On 17 July 2018 at 14:03, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> For equalities it's going to be hard. The only thing I can think of at the
> moment is checking if there are any matching buckets at all, and using that
> to decide whether to extrapolate the MCV selectivity to the non-MCV part or
> not (or perhaps to what part of the non-MCV part).
>

So I decided to play a little more with this, experimenting with a
much simpler approach -- this is for MCV's only at the moment, see the
attached (very much WIP) patch (no doc or test updates, and lots of
areas for improvement).

The basic idea when building the MCV stats is to not just record the
frequency of each combination of values, but also what I'm calling the
"base frequency" -- that is the frequency that that combination of
values would have if the columns were independent (i.e., the product
of each value's individual frequency).

The reasoning then, is that if we find an MCV entry matching the query
clauses, the difference (frequency - base_frequency) can be viewed as
a correction to be applied to the selectivity returned by
clauselist_selectivity_simple(). If all possible values were covered
by matching MCV entries, the sum of the base frequencies of the
matching MCV entries would approximately cancel out with the simple
selectivity, and only the MCV frequencies would be left (ignoring
second order effects arising from the fact that
clauselist_selectivity_simple() doesn't just sum up disjoint
possibilities). For partial matches, it will use what multivariate
stats are available to improve upon the simple selectivity.

I wondered about just storing the difference (frequency -
base_frequency) in the stats, but it's actually useful to have both
values, because then the total of all the MCV frequencies can be used
to set an upper bound on the non-MCV part.

The advantage of this approach is that it is very simple, and in
theory ought to be reasonably applicable to arbitrary combinations of
clauses. Also, it naturally falls back to the univariate-based
estimate when there are no matching MCV entries. In fact, even when
there are no matching MCV entries, it can still improve upon the
univariate estimate by capping it to 1-total_mcv_sel.

I tested it with the same data posted previously and a few simple
queries, and the initial results are quite encouraging. Where the
previous patch sometimes gave noticeable over- or under-estimates,
this patch generally did better:

Query  Actual rows  Est (HEAD)  Est (24 Jun patch)  Est (new patch)
  Q1       50000      12625           48631              49308
  Q2       40000       9375           40739              38710
  Q3       90000      21644          172688              88018
  Q4      140000      52048          267528             138228
  Q5      140000      52978          267528             138228
  Q6      140000      52050          267528             138228
  Q7      829942     777806          149886             822788
  Q8      749942     748302          692686             747922
  Q9       15000      40989           27595              14131
 Q10       15997      49853           27595              23121

Q1: a=1 and b=1
Q2: a=1 and b=2
Q3: a=1 and (b=1 or b=2)
Q4: (a=1 or a=2) and (b=1 or b=2)
Q5: (a=1 or a=2) and (b<=2)
Q6: (a=1 or a=2 or a=4) and (b=1 or b=2)
Q7: (a=1 or a=2) and not (b=2)
Q8: (a=1 or a=2) and not (b=1 or b=2)
Q9: a=3 and b>0 and b<3
Q10: a=3 and b>0 and b<1000

I've not tried anything with histograms. Possibly the histograms could
be used as-is, to replace the non-MCV part (other_sel). Or, a similar
approach could be used, recording the base frequency of each histogram
bucket, and then using that to refine the other_sel estimate. Either
way, I think it would be necessary to exclude equality clauses from
the histograms, otherwise MCVs might end up being double-counted.

Regards,
Dean

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

06 августа 2018 г., 20:34:10


On 08/03/2018 04:24 PM, Dean Rasheed wrote:
> On 17 July 2018 at 14:03, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> For equalities it's going to be hard. The only thing I can think of at the
>> moment is checking if there are any matching buckets at all, and using that
>> to decide whether to extrapolate the MCV selectivity to the non-MCV part or
>> not (or perhaps to what part of the non-MCV part).
>>
> 
> So I decided to play a little more with this, experimenting with a
> much simpler approach -- this is for MCV's only at the moment, see the
> attached (very much WIP) patch (no doc or test updates, and lots of
> areas for improvement).
> 
> The basic idea when building the MCV stats is to not just record the
> frequency of each combination of values, but also what I'm calling the
> "base frequency" -- that is the frequency that that combination of
> values would have if the columns were independent (i.e., the product
> of each value's individual frequency).
> 
> The reasoning then, is that if we find an MCV entry matching the query
> clauses, the difference (frequency - base_frequency) can be viewed as
> a correction to be applied to the selectivity returned by
> clauselist_selectivity_simple(). If all possible values were covered
> by matching MCV entries, the sum of the base frequencies of the
> matching MCV entries would approximately cancel out with the simple
> selectivity, and only the MCV frequencies would be left (ignoring
> second order effects arising from the fact that
> clauselist_selectivity_simple() doesn't just sum up disjoint
> possibilities). For partial matches, it will use what multivariate
> stats are available to improve upon the simple selectivity.
> 
> I wondered about just storing the difference (frequency -
> base_frequency) in the stats, but it's actually useful to have both
> values, because then the total of all the MCV frequencies can be used
> to set an upper bound on the non-MCV part.
> 
> The advantage of this approach is that it is very simple, and in
> theory ought to be reasonably applicable to arbitrary combinations of
> clauses. Also, it naturally falls back to the univariate-based
> estimate when there are no matching MCV entries. In fact, even when
> there are no matching MCV entries, it can still improve upon the
> univariate estimate by capping it to 1-total_mcv_sel.
> 
> I tested it with the same data posted previously and a few simple
> queries, and the initial results are quite encouraging. Where the
> previous patch sometimes gave noticeable over- or under-estimates,
> this patch generally did better:
> 
> 
> Query  Actual rows  Est (HEAD)  Est (24 Jun patch)  Est (new patch)
>    Q1       50000      12625           48631              49308
>    Q2       40000       9375           40739              38710
>    Q3       90000      21644          172688              88018
>    Q4      140000      52048          267528             138228
>    Q5      140000      52978          267528             138228
>    Q6      140000      52050          267528             138228
>    Q7      829942     777806          149886             822788
>    Q8      749942     748302          692686             747922
>    Q9       15000      40989           27595              14131
>   Q10       15997      49853           27595              23121
> 
> Q1: a=1 and b=1
> Q2: a=1 and b=2
> Q3: a=1 and (b=1 or b=2)
> Q4: (a=1 or a=2) and (b=1 or b=2)
> Q5: (a=1 or a=2) and (b<=2)
> Q6: (a=1 or a=2 or a=4) and (b=1 or b=2)
> Q7: (a=1 or a=2) and not (b=2)
> Q8: (a=1 or a=2) and not (b=1 or b=2)
> Q9: a=3 and b>0 and b<3
> Q10: a=3 and b>0 and b<1000
> 

Interesting idea, and the improvements certainly seem encouraging.

I wonder what a counter-example would look like - I think the MCV and 
non-MCV parts would have to behave very differently (one perfectly 
dependent, the other perfectly independent). But that does seem very 
likely, and even if it was there's not much we can do about such cases.

> 
> I've not tried anything with histograms. Possibly the histograms could
> be used as-is, to replace the non-MCV part (other_sel). Or, a similar
> approach could be used, recording the base frequency of each histogram
> bucket, and then using that to refine the other_sel estimate. Either
> way, I think it would be necessary to exclude equality clauses from
> the histograms, otherwise MCVs might end up being double-counted.
> 

I do have an idea about histograms. I didn't have time to hack on it 
yet, but I think it could work in combination with your MCV algorithm.

Essentially there are two related issues with histograms:


1) equality conditions

Histograms work nicely with inequalities, not that well for equalities. 
For equality clauses, we can estimate the selectivity as 1/ndistinct, 
similarly to what we do in 1-D cases (we can use ndistinct coefficients 
if we have them, and MCV tracking the common combinations).

If there are both equalities and inequalities, we can then use the 
equality clauses merely as condition (to limit the set of buckets), and 
evaluate the inequalities for those buckets only. Essentially compute

     P(equals + inequals) = P(equals) * P(inequals | equals)

IMHO that should help with estimating selectivity of equality clauses.


2) estimating bucket selectivity

The other question is how to combine selectivities of multiple clauses 
for a single bucket. I think the linear approximation (convert_scalar or 
something like that) and computing geometric mean (as you proposed) is a 
solid plan.

I do have this on my TODO list for this week, unless something urgent 
comes up.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

03 сентября 2018 г., 05:17:15

Hi,

Attached is an updated version of the patch series, adopting a couple of
improvements - both for MCV lists and histograms.


MCV
---

For the MCV list part, I've adopted the approach proposed by Dean, using
base selectivity and using it to correct the non-MCV part. I agree the
simplicity of the approach is a nice feature, and it seems to produce
better estimates. I'm not sure I understand the approach perfectly, but
I've tried to add comments explaining how it works etc.

I've also changed how we build the MCV lists, particularly how we decide
how many / which items store in the MCV list. In the previous version
I've adopted the same algorithm we use for per-column MCV lists, but in
certain cases that turned out to be too restrictive.

Consider for example a table with multiple perfectly correlated columns,
with very few combinations. That is, something like this:

    CREATE TABLE t (a int, b int);

    INSERT INTO t SELECT mod(i,50), mod(i,50)
      FROM generate_series(1,1e6) s(i);

    CREATE STATISTICS s (mcv) ON a,b FROM t;

Now, the data distribution is very simple - uniform, with 50 distinct
combinations, each representing 2% of data (and the random sample should
be pretty close to that).

In these cases, analyze_mcv_list decides it does not need any MCV list,
because the frequency for each value is pretty much 1/ndistinct. For
single column that's reasonable, but for multiple correlated columns
it's rather problematic. We might use the same ndistinct approach
(assuming we have the ndistinct coefficients), but that still does not
allow us to decide which combinations are "valid" with respect to the
data. For example we can't decide (1,10) does not appear in the data.

So I'm not entirely sure adopting the same algorithm analyze_mcv_list
algorithm both for single-column and multi-column stats. It may make
sense to keep more items in the multi-column case for reasons that are
not really valid for a single single-column.

For now I've added a trivial condition to simply keep all the groups
when possible. This probably needs more thought.

BTW Dean's patch also modified how the maximum number of items on a MCV
list is determined - instead of the shaky defaults I used before, it
derives the size from attstattarget values for the columns, keeping the
maximum value. That seems to make sense, so I've kept this.


histograms
----------

For histograms, I've made the two improvements I mentioned previously.

Firstly, simple equality conditions (of the form "var = const") are
estimated using as 1/ndistinct (possibly using ndistinct coefficients
when available), and then used only as "conditions" (in the "conditional
probability" sense) when estimating the rest of the clauses using the
histogram.

That is P(clauses) is split into two parts

    P(clauses) = P(equalities) * P(remaining|clauses)

where the first part is estimated as 1/ndistinct, the second part is
estimated using histogram.

I'm sure this needs more thought, particularly when combining MCV and
histogram estimates. But in general it seems to work quite nicely.

The second improvement is about estimating what fraction of a bucket
matches the conditions. Instead of using the rough 1/2-bucket estimate,
I've adopted the convert_to_scalar approach, computing a geometric mean
for all the clauses (at a bucket level).

I'm not entirely sure the geometric mean is the right approach (or
better than simply using 1/2 the bucket) because multiplying the
per-clause frequencies is mostly equal to assumption of independence at
the bucket level. Which is rather incompatible with the purpose of
multi-column statistics, which are meant to be used exactly when the
columns are not independent.


measurements
------------

I think we need to maintain a set of tests (dataset + query), so that we
can compare impact of various changes in the algorithm. So far we've
used mostly ad-hoc queries, often created as counter-examples, and that
does not seem very practical.

So I'm attaching a simple SQL script that I consider an initial version
of that. It has a couple of synthetic data sets, and queries estimated
with and without extended statistics.

I'm also attaching a spreadsheet with results for (a) the original
version of the patch series, as submitted on 6/24, (b) the new version
attached here and (c) the new version using the per-bucket estimates
directly, without the geometric mean.

Overall, the new versions seem to perform better than the version from
6/24, and also compared to only per-column statistics. There are cases
where extended statistic produce over-estimates, but I find it somewhat
natural due to lower resolution of the multi-column stats.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attached is an updated version of the patch - rebased and fixing the
warnings reported by Thomas Munro.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

On Fri, 11 Jan 2019, 21:18 Tomas Vondra <tomas.vondra@2ndquadrant.com wrote:

On 1/10/19 4:20 PM, Dean Rasheed wrote:
> ...
>
> So perhaps what we should do for multivariate stats is simply use the
> relative standard error approach (i.e., reuse the patch in [2] with a
> 20% RSE cutoff). That had a lot of testing at the time, against a wide
> range of data distributions, and proved to be very good, not to
> mention being very simple.
>
> That approach would encompass both groups more and less common than
> the base frequency, because it relies entirely on the group appearing
> enough times in the sample to infer that any errors on the resulting
> estimates will be reasonably well controlled. It wouldn't actually
> look at the base frequency at all in deciding which items to keep.
>

I've been looking at this approach today, and I'm a bit puzzled. That
patch essentially uses SRE to compute mincount like this:

mincount = n*(N-n) / (N-n+0.04*n*(N-1))

and then includes all items more common than this threshold.

Right.

How could
that handle items significantly less common than the base frequency?

Well what I meant was that it will *allow* items significantly less common than the base frequency, because it's not even looking at the base frequency. For example, if the table size were N=100,000 and we sampled n=10,000 rows from that, mincount would work out as 22. So it's easy to construct allowed items more common than that and still significantly less common than their base frequency.

A possible refinement would be to say that if there are more than stats_target items more common than this mincount threshold, rather than excluding the least common ones to get the target number of items, exclude the ones closest to their base frequencies, on the grounds that those are the ones for which the MCV stats will make the least difference. That might complicate the code somewhat though -- I don't have it in front of me, so I can't remember if it even tracks more than stats_target items.

Regards,

Dean

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

13 января 2019 г., 03:04:00

On 1/12/19 8:49 AM, Dean Rasheed wrote:
> On Fri, 11 Jan 2019, 21:18 Tomas Vondra <tomas.vondra@2ndquadrant.com
> <mailto:tomas.vondra@2ndquadrant.com> wrote:
> 
> 
>     On 1/10/19 4:20 PM, Dean Rasheed wrote:
>     > ...
>     >
>     > So perhaps what we should do for multivariate stats is simply use the
>     > relative standard error approach (i.e., reuse the patch in [2] with a
>     > 20% RSE cutoff). That had a lot of testing at the time, against a wide
>     > range of data distributions, and proved to be very good, not to
>     > mention being very simple.
>     >
>     > That approach would encompass both groups more and less common than
>     > the base frequency, because it relies entirely on the group appearing
>     > enough times in the sample to infer that any errors on the resulting
>     > estimates will be reasonably well controlled. It wouldn't actually
>     > look at the base frequency at all in deciding which items to keep.
>     >
> 
>     I've been looking at this approach today, and I'm a bit puzzled. That
>     patch essentially uses SRE to compute mincount like this:
> 
>         mincount = n*(N-n) / (N-n+0.04*n*(N-1))
> 
>     and then includes all items more common than this threshold.
> 
> 
> Right.
> 
>     How could
>     that handle items significantly less common than the base frequency?
> 
> 
> Well what I meant was that it will *allow* items significantly less
> common than the base frequency, because it's not even looking at the
> base frequency. For example, if the table size were N=100,000 and we
> sampled n=10,000 rows from that, mincount would work out as 22. So it's
> easy to construct allowed items more common than that and still
> significantly less common than their base frequency.
> 

OK, understood. I agree that's a sensible yet simple approach, so I've
adopted it in the next version of the patch.

> A possible refinement would be to say that if there are more than
> stats_target items more common than this mincount threshold, rather than
> excluding the least common ones to get the target number of items,
> exclude the ones closest to their base frequencies, on the grounds that
> those are the ones for which the MCV stats will make the least
> difference. That might complicate the code somewhat though -- I don't
> have it in front of me, so I can't remember if it even tracks more than
> stats_target items.
> 

Yes, the patch does limit the number of items to stats_target (a maximum
of per-attribute stattarget values, to be precise). IIRC that's a piece
you've added sometime last year ;-)

I've been experimenting with removing items closest to base frequencies
today, and I came to the conclusion that it's rather tricky for a couple
of reasons.

1) How exactly do you measure "closeness" to base frequency? I've tried
computing the error in different ways, including:

  * Max(freq/base, base/freq)
  * abs(freq - base)

but this does not seem to affect the behavior very much, TBH.

2) This necessarily reduces mcv_totalsel, i.e. it increases the part not
covered by MCV. And estimates on this part are rather crude.

3) It does nothing for "impossible" items, i.e. combinations that do not
exist at all. Clearly, those won't be part of the sample, and so can't
be included in the MCV no matter which error definition we pick. And for
very rare combinations it might lead to sudden changes, depending on
whether the group gets sampled or not.

So IMHO it's better to stick to the simple SRE approach for now.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

13 января 2019 г., 03:31:30

On 1/10/19 6:09 PM, Dean Rasheed wrote:
> On Wed, 26 Dec 2018 at 22:09, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> Attached is an updated version of the patch - rebased and fixing the
>> warnings reported by Thomas Munro.
>>
> 
> Here are a few random review comments based on what I've read so far:
> 
> 
> On the CREATE STATISTICS doc page, the syntax in the new examples
> added to the bottom of the page is incorrect. E.g., instead of
> 
> CREATE STATISTICS s2 WITH (mcv) ON (a, b) FROM t2;
> 
> it should read
> 
> CREATE STATISTICS s2 (mcv) ON a, b FROM t2;
> 

Fixed.

> I think perhaps there should be also be a short explanatory sentence
> after each example (as in the previous one) just to explain what the
> example is intended to demonstrate. E.g., for the new MCV example,
> perhaps say
> 
>    These statistics give the planner more detailed information about the
>    specific values that commonly appear in the table, as well as an upper
>    bound on the selectivities of combinations of values that do not appear in
>    the table, allowing it to generate better estimates in both cases.
> 
> I don't think there's a need for too much detail there, since it's
> explained more fully elsewhere, but it feels like it needs a little
> more just to explain the purpose of the example.
> 

I agree, this part of docs can be quite terse. I've adopted the wording
you proposed, and I've done something similar for the histogram patch,
which needs to add something too. It's a bit repetitive, though.

> 
> There is additional documentation in perform.sgml that needs updating
> -- about what kinds of stats the planner keeps. Those docs are
> actually quite similar to the ones on planstats.sgml. It seems the
> former focus more one what stats the planner stores, while the latter
> focus on how the planner uses those stats.
> 

OK, I've expanded this part a bit too.

> 
> In func.sgml, the docs for pg_mcv_list_items need extending to include
> the base frequency column. Similarly for the example query in
> planstats.sgml.
> 

Fixed.

> 
> Tab-completion for the CREATE STATISTICS statement should be extended
> for the new kinds.
> 

Fixed.

> 
> Looking at mcv_update_match_bitmap(), it's called 3 times (twice
> recursively from within itself), and I think the pattern for calling
> it is a bit messy. E.g.,
> 
>             /* by default none of the MCV items matches the clauses */
>             bool_matches = palloc0(sizeof(char) * mcvlist->nitems);
> 
>             if (or_clause(clause))
>             {
>                 /* OR clauses assume nothing matches, initially */
>                 memset(bool_matches, STATS_MATCH_NONE, sizeof(char) *
> mcvlist->nitems);
>             }
>             else
>             {
>                 /* AND clauses assume everything matches, initially */
>                 memset(bool_matches, STATS_MATCH_FULL, sizeof(char) *
> mcvlist->nitems);
>             }
> 
>             /* build the match bitmap for the OR-clauses */
>             mcv_update_match_bitmap(root, bool_clauses, keys,
>                                     mcvlist, bool_matches,
>                                     or_clause(clause));
> 
> the comment for the AND case directly contradicts the initial comment,
> and the final comment is wrong because it could be and AND clause. For
> a NOT clause it does:
> 
>             /* by default none of the MCV items matches the clauses */
>             not_matches = palloc0(sizeof(char) * mcvlist->nitems);
> 
>             /* NOT clauses assume nothing matches, initially */
>             memset(not_matches, STATS_MATCH_FULL, sizeof(char) *
> mcvlist->nitems);
> 
>             /* build the match bitmap for the NOT-clause */
>             mcv_update_match_bitmap(root, not_args, keys,
>                                     mcvlist, not_matches, false);
> 
> so the second comment is wrong. I understand the evolution that lead
> to this function existing in this form, but I think that it can now be
> refactored into a "getter" function rather than an "update" function.
> I.e., something like mcv_get_match_bitmap() which first allocates the
> array to be returned and initialises it based on the passed-in value
> of is_or. That way, all the calling sites can be simplified to
> one-liners like
> 
>             /* get the match bitmap for the AND/OR clause */
>             bool_matches = mcv_get_match_bitmap(root, bool_clauses, keys,
>                                     mcvlist, or_clause(clause));
> 

Yes, I agree. I've reworked the function per your proposal, and I've
done the same for the histogram too.

> 
> In the previous discussion around UpdateStatisticsForTypeChange(), the
> consensus appeared to be that we should just unconditionally drop all
> extended statistics when ALTER TABLE changes the type of an included
> column (just as we do for per-column stats), since such a type change
> can rewrite the data in arbitrary ways, so there's no reason to assume
> that the old stats are still valid. I think it makes sense to extract
> that as a separate patch to be committed ahead of these ones, and I'd
> also argue for back-patching it.
> 

Wasn't the agreement to keep stats that don't include column values
(functional dependencies and ndistinct coefficients), and reset only
more complex stats? That's what happens in master and how it's extended
by the patch for MCV lists and histograms.

> 
> That's it for now. I'll try to keep reviewing if time permits.
> 

Thanks!


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

On 1/14/19 4:31 PM, Tomas Vondra wrote:
> 
> On 1/14/19 12:20 PM, Dean Rasheed wrote:
>> (Removing Adrien from the CC list, because messages to that address
>> keep bouncing)
>>
>> On Sun, 13 Jan 2019 at 00:31, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>>
>>> On 1/10/19 6:09 PM, Dean Rasheed wrote:
>>>>
>>>> In the previous discussion around UpdateStatisticsForTypeChange(), the
>>>> consensus appeared to be that we should just unconditionally drop all
>>>> extended statistics when ALTER TABLE changes the type of an included
>>>> column (just as we do for per-column stats), since such a type change
>>>> can rewrite the data in arbitrary ways, so there's no reason to assume
>>>> that the old stats are still valid. I think it makes sense to extract
>>>> that as a separate patch to be committed ahead of these ones, and I'd
>>>> also argue for back-patching it.
>>>
>>> Wasn't the agreement to keep stats that don't include column values
>>> (functional dependencies and ndistinct coefficients), and reset only
>>> more complex stats? That's what happens in master and how it's extended
>>> by the patch for MCV lists and histograms.
>>>
>>
>> Ah OK, I misremembered the exact conclusion reached last time. In that
>> case the logic in UpdateStatisticsForTypeChange() looks wrong:
>>
>>     /*
>>      * If we can leave the statistics as it is, just do minimal cleanup
>>      * and we're done.
>>      */
>>     if (!attribute_referenced && reset_stats)
>>     {
>>         ReleaseSysCache(oldtup);
>>         return;
>>     }
>>
>> That should be "|| !reset_stats", or have more parentheses.
> 
> Yeah, it should have been
> 
>     if (!(attribute_referenced && reset_stats))
> 
> i.e. there's a parenthesis missing. Thanks for noticing this. I guess a
> regression test for this would be useful.
> 
>> In fact, I think that computing attribute_referenced is unnecessary
>> because the dependency information includes the columns that the
>> stats are for and ATExecAlterColumnType() uses that, so
>> attribute_referenced will always be true.
> Hmmm. I'm pretty sure I came to the conclusion it's in fact necessary,
> but I might be wrong. Will check.
> 

Turns out you were right - the attribute_referenced piece was quite
unnecessary. So I've removed it. I've also extended the regression tests
to verify changing type of another column does not reset the stats.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hi,

thanks for the review. The attached patches address most of the issues
mentioned in the past several messages, both in the MCV and histogram parts.

A couple of items remains:

> 15. I see you broke out the remainder of the code from
> clauselist_selectivity() into clauselist_selectivity_simple().  The
> comment looks like just a copy and paste from the original.  That
> seems like quite a bit of duplication. Is it better to maybe trim down
> the original one?

I don't follow - where do you see the code duplication? Essentially, we
have clauselist_selectivity and clauselist_selectivity_simple, but the
first one calls the second one. The "simple" version is needed because
in some cases we need to perform estimation without multivariate stats
(e.g. to prevent infinite loop due to recursion).

> 18. In dependencies_clauselist_selectivity() there seem to be a new
> bug introduced.  We do:
>
> /* mark this one as done, so we don't touch it again. */
> *estimatedclauses = bms_add_member(*estimatedclauses, listidx);
>
> but the bms_is_member() check that skipped these has been removed.
>
> It might be easier to document if we just always do:
>
>  if (bms_is_member(listidx, *estimatedclauses))
> continue;
>
> at the start of both loops. list_attnums can just be left unset for
> the originally already estimatedclauses.

This was already discussed - I don't think there's any bug, but I'll
look into refactoring the code somehow to make it clear.

> 21. Comment does not really explain what the function does or what the
> arguments mean:
>
> /*
>  * statext_is_compatible_clause_internal
>  * Does the heavy lifting of actually inspecting the clauses for
>  * statext_is_compatible_clause.
>  */

Isn't it explained in the statext_is_compatible_clause comment?

> 25. Does statext_is_compatible_clause_internal)_ need to skip over
> RelabelTypes?

I don't think it should, because while RelabelType nodes represent casts
to binary-compatible types, there's no guarantee the semantics actually
is compatible. So for example if you do this:

  create table t (a int, b int);
  insert into t select mod(i,100), mod(i,100)
                  from generate_series(1,1000000) s(i);
  create statistics s (mcv) on a, b from t;
  analyze t;
  explain analyze select * from t where a = 1::oid and b = 1::oid;

then there will be a RelabelType nodes casting each column from int4 to
oid. So the estimation will be made following oid semantics. But the MCV
list contains int4 values, and is built using int4-specific operators.

I admit this int4/oid example is fairly trivial, but it's not clear to
me we can assume all RelabelType will behave like that. The types may be
binary-coerible, but may use vastly different operators - think about
citext vs. text, for example.

> 35. The evaluation order of this macro is wrong.
>
> #define ITEM_SIZE(ndims) \
> (ndims * (sizeof(uint16) + sizeof(bool)) + 2 * sizeof(double))
>

Nope, as mentioned by Dean, it's actually correct.

> 36. Could do with some comments in get_mincount_for_mcv_list(). What's
> magic about 0.04?

That was copied from another patch, but I've removed the comment
explaining the details - I've now added it back, which I think should be
more than enough.

> 40. The comment in the above item seems to indicate the condition for
> when all items can fit in the number of groups, but the if condition
> does not seem to allow for an exact match?
>
> if (ngroups > nitems)
>
> if you want to check if the number of items can fit in the number of
> groups should it be: if (ngroups >= nitems) or if (nitems <= ngroups)
> ? Perhaps I've misunderstood. The comment is a little confusing as I'm
> not sure where the "Otherwise" code is located.

No, the whole point of that block is to decide how many groups to keep
if there are more groups than we have space for (based on stats target).
So if (ngroups == nitems) or (ngrouos < nitems) then we can keep all of
them.

> 41. I don't think palloc0() is required here. palloc() should be fine
> since you're initialising each element in the loop.
>
> ...
>
> I think I agree with the comment above that chunk about reducing the
> number of pallocs, even if it's just allocating the initial array as
> MCVItems instead of pointers to MCVItems

I've left this as it is for now. The number of extra pallocs() is fairly
low anyway, so I don't think it's worth the extra complexity.

> 47. Wondering about the logic behind the variation between elog() and
> ereport() in statext_mcv_deserialize(). They all looks like "can't
> happen" type errors.

That's mostly random, I'll review and fix that. All "can't happen" cases
should use elog().

> 3. I've not really got into understanding how the new statistics types
> are applied yet, but I found this:
>
> * If asked to build both MCV and histogram, first build the MCV part
> * and then histogram on the remaining rows.
>
> I guess that means we'll get different estimates with:
>
> create statistic a_stats (mcv,histogram) on a,b from t;
>
> vs
>
> create statistic a_stats1 (mcv) on a,b from t;
> create statistic a_stats2 (histogram) on a,b from t;
>
> Is that going to be surprising to people?

Well, I don't have a good answer to this, except for mentioning this in
the SGML docs.

> 5. serialize_histogram() and statext_histogram_deserialize(), should
> these follow the same function naming format?

Perhaps, although serialize_histogram() is static and so it's kinda
internal API.

> 8. Looking at statext_clauselist_selectivity() I see it calls
> choose_best_statistics() passing requiredkinds as STATS_EXT_INFO_MCV |
> STATS_EXT_INFO_HISTOGRAM, do you think the function now needs to
> attempt to find the best match plus the one with the most statistics
> kinds?
>
> It might only matter if someone had:
>
> create statistic a_stats1 (mcv) on a,b from t;
> create statistic a_stats2 (histogram) on a,b from t;
> create statistic a_stats3 (mcv,histogram) on a,b from t;
>
> Is it fine to just return a_stats1 and ignore the fact that a_stats3
> is probably better? Or too corner case to care?

I don't know. My assumption is people will not create such overlapping
statics.

> 9. examine_equality_clause() assumes it'll get a Var. I see we should
> only allow clauses that pass statext_is_compatible_clause_internal(),
> so maybe it's worth an Assert(IsA(var, Var)) along with a comment to
> mention anything else could not have been allowed.

Maybe.

> 10. Does examine_equality_clause need 'root' as an argument?

Probably not. I guess it's a residue some older version.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

On Sun, Feb 03, 2019 at 02:43:24AM -0800, Andres Freund wrote:
> Are you planning to update the patch, or should the entry be marked as
> RWF?

Moved the patch to next CF for now, waiting on author as the last
review happened not so long ago.
--
Michael

Вложения

signature.asc

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

04 февраля 2019 г., 13:02:40


On 2/4/19 5:53 AM, Michael Paquier wrote:
> On Sun, Feb 03, 2019 at 02:43:24AM -0800, Andres Freund wrote:
>> Are you planning to update the patch, or should the entry be marked as
>> RWF?
> 
> Moved the patch to next CF for now, waiting on author as the last
> review happened not so long ago.

Thanks. Yes, I intend to send a new patch version soon.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Alvaro Herrera

Дата:

06 февраля 2019 г., 04:21:40

On 2019-Feb-04, Tomas Vondra wrote:

> On 2/4/19 5:53 AM, Michael Paquier wrote:

> > Moved the patch to next CF for now, waiting on author as the last
> > review happened not so long ago.
> 
> Thanks. Yes, I intend to send a new patch version soon.

I wonder what should we be doing with this series -- concretely, should
the effort concentrate on one of the two patches, and leave the other
for pg13, to increase the chances of the first one being in pg12?  I
would favor that approach, since it's pretty late in the cycle by now
and it seems dubious that both will be ready.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

David Rowley

Дата:

07 февраля 2019 г., 00:59:47

On Thu, 7 Feb 2019 at 03:16, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> I wonder what should we be doing with this series -- concretely, should
> the effort concentrate on one of the two patches, and leave the other
> for pg13, to increase the chances of the first one being in pg12?  I
> would favor that approach, since it's pretty late in the cycle by now
> and it seems dubious that both will be ready.

I mostly have been reviewing the MCV patch with the thoughts that one
is better than none in PG12.  I don't see any particular reason that
we need both in the one release.


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

07 февраля 2019 г., 02:44:08

On 2/6/19 10:59 PM, David Rowley wrote:
> On Thu, 7 Feb 2019 at 03:16, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> I wonder what should we be doing with this series -- concretely, should
>> the effort concentrate on one of the two patches, and leave the other
>> for pg13, to increase the chances of the first one being in pg12?  I
>> would favor that approach, since it's pretty late in the cycle by now
>> and it seems dubious that both will be ready.
> 
> I mostly have been reviewing the MCV patch with the thoughts that one
> is better than none in PG12.  I don't see any particular reason that
> we need both in the one release.
> 

I agree with that, although most of the complexity likely lies in
integrating the stats into the selectivity estimation - if we get that
right for the MCV patch, adding histogram seems comparably simpler.

But yeah, let's focus on the MCV part.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Dean Rasheed

Дата:

07 февраля 2019 г., 09:41:03

On Wed, 6 Feb 2019 at 23:44, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On 2/6/19 10:59 PM, David Rowley wrote:
> > On Thu, 7 Feb 2019 at 03:16, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >> I wonder what should we be doing with this series -- concretely, should
> >> the effort concentrate on one of the two patches, and leave the other
> >> for pg13, to increase the chances of the first one being in pg12?  I
> >> would favor that approach, since it's pretty late in the cycle by now
> >> and it seems dubious that both will be ready.
> >
> > I mostly have been reviewing the MCV patch with the thoughts that one
> > is better than none in PG12.  I don't see any particular reason that
> > we need both in the one release.
> >
>
> I agree with that, although most of the complexity likely lies in
> integrating the stats into the selectivity estimation - if we get that
> right for the MCV patch, adding histogram seems comparably simpler.
>
> But yeah, let's focus on the MCV part.
>

Agreed. I think the overall approach of the MCV patch is sound and
it's getting closer to being committable. David's review comments were
excellent. I'll try to review it as well when you post your next
update.

I have some more fundamental doubts about the histogram patch, to do
with the way it integrates with selectivity estimation, and some vague
half-formed ideas about how that could be improved, but nothing clear
enough that I can express right now.

So yes, let's focus on the MCV patch for now.

Regards,
Dean

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

28 февраля 2019 г., 22:56:08

Hi,

Attached is an updated version of this patch series. I've decided to
rebase and send both parts (MCV and histograms), although we've agreed
to focus on the MCV part for now. I don't want to leave the histogram to
lag behind, because (a) then it'd be much more work to update it, and
(b) I think it's an useful feedback about likely future changes.

This should address most of the issues pointed out by David in his
recent reviews. Briefly:

1) It fixes/updates a number of comments and docs on various places,
removes redundant comments etc. In most cases I've simply adopted the
wording proposed by David, with minor tweaks in a couple of cases.

2) Reverts changes that exposed analyze_mcv_list - this was a leftover
from the attempt to reuse the single-column algorithm, but we've since
agreed it's not the right approach. So this change is unnecessary.

3) I've tweaked the code to accept RelabelType nodes as supported,
similarly to what examine_variable() does. Previously I concluded we
can't support RelabelType, but it seems that reasoning was bogus. I've
slightly tweaked the regression tests by changing one of the columns to
varchar, so that the queries actualy trigger this.

4) I've tweaked a couple of places (UpdateStatisticsForTypeChange,
statext_clauselist_selectivity and estimate_num_groups_simple) per
David's suggestions. Those were fairly straightforward simplifications.

5) I've removed mcv_count from statext_mcv_build(). As David pointed
out, this was not actually needed - it was another remnant of the
attempt to re-use analyze_mcv_list() which needs such array. But without
it we can access the groups directly.

6) One of the review questions was about the purpose of this code:

  for (i = 0; i < nitems; i++)
  {
      if (groups[i].count < mincount)
      {
          nitems = i;
          break;
      }
  }

It's quite simple - we want to include groups with more occurrences than
mincount, and the groups are sorted by the count (in descending order).
So we simply find the first group with count below mincount, and the
index is the number of groups to keep. I've tried to explain that in a
comment.

7) I've fixed a bunch of format patters in statext_mcv_deserialize(),
particularly those that confused %d and %u. We can't however use %d for
VARSIZE_ANY_EXHDR, because that macro expands into offsetof() etc. So
that would trigger compiler warnings.

8) Yeah, pg_stats_ext_mcvlist_items was broken. The issue was that one
of the output parameters is defined as boolean[], but the function was
building just string. Originally it used BuildTupleFromCStrings(), but
then it switched to heap_form_tuple() without building a valid array.

I've decided to simply revert back to BuildTupleFromCStrings(). It's not
going to be used very frequently, so the small performance difference is
not important.

I've also fixed the formatting issues, pointed out by David.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

David Rowley

Дата:

02 марта 2019 г., 16:01:50

On Fri, 1 Mar 2019 at 08:56, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> Attached is an updated version of this patch series.

I made a quick pass over the 0001 patch. I edited a few small things
along the way; patch attached.

I'll try to do a more in-depth review soon.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

mcv_stats_small_fixups_for_0001.patch

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Dean Rasheed

Дата:

09 марта 2019 г., 21:33:56

On Thu, 28 Feb 2019 at 19:56, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> Attached is an updated version of this patch series.

Here are some random review comments. I'll add more later, but I'm out
of energy for today.

1). src/test/regress/expected/type_sanity.out has bit-rotted.

2). Duplicate OIDs (3425).

3). It looks a bit odd that clauselist_selectivity() calls
statext_clauselist_selectivity(), which does MCV stats and will do
histograms, but it doesn't do dependencies, so
clauselist_selectivity() has to then separately call
dependencies_clauselist_selectivity(). It would seem neater if
statext_clauselist_selectivity() took care of calling
dependencies_clauselist_selectivity(), since dependencies are just
another kind of extended stats.

4). There are no tests for pg_mcv_list_items(). Given a table with a
small enough amount of data, so that it's all sampled, it ought to be
possible to get predictable MCV stats.

5). It's not obvious what some of the new test cases in the
"stats_ext" tests are intended to show. For example, the first test
creates a table with 5000 rows and a couple of indexes, does a couple
of queries, builds some MCV stats, and then repeats the queries, but
the results seem to be the same with and without the stats.

I wonder if it's possible to write smaller, more targeted tests.
Currently "stats_ext" is by far the slowest test in its group, and I'm
not sure that some of those tests add much. It ought to be possible to
write a function that calls EXPLAIN and returns a query's row
estimate, and then you could write tests to confirm the effect of the
new stats by verifying the row estimates change as expected.

6). This enum isn't needed for MCVs:

/*
 * Degree of how much MCV item matches a clause.
 * This is then considered when computing the selectivity.
 */
#define STATS_MATCH_NONE        0    /* no match at all */
#define STATS_MATCH_PARTIAL        1    /* partial match */
#define STATS_MATCH_FULL        2    /* full match */

STATS_MATCH_PARTIAL is never used for MCVs, so you may as well just
use booleans instead of this enum. If those are needed for histograms,
they can probably be made local to histogram.c.

7). estimate_num_groups_simple() isn't needed in this patch.

8). In README.mcv,
s/clauselist_mv_selectivity_mcvlist/mcv_clauselist_selectivity/.

9). In the list of supported clause types that follows (e) is the same
a (c), but with a more general description.

10). It looks like most of the subsequent description of the algorithm
is out of date and needs rewriting. All the stuff about full matches
and the use of ndistinct is now obsolete.

Regards,
Dean

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Dean Rasheed

Дата:

10 марта 2019 г., 16:09:13

On Sat, 9 Mar 2019 at 18:33, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Thu, 28 Feb 2019 at 19:56, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> > Attached is an updated version of this patch series.
>
> Here are some random review comments. I'll add more later, but I'm out
> of energy for today.
>

Here are some more comments:

11). In dependency_degree():

-    /* sort the items so that we can detect the groups */
-    qsort_arg((void *) items, numrows, sizeof(SortItem),
-              multi_sort_compare, mss);
+    /*
+     * build an array of SortItem(s) sorted using the multi-sort support
+     *
+     * XXX This relies on all stats entries pointing to the same tuple
+     * descriptor. Not sure if that might not be the case.
+     */
+    items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
+                               mss, k, attnums_dep);

That XXX comment puzzled me for a while. Actually it's OK though,
unless/until we try to support stats across multiple relations, which
will require a much larger refactoring of this code. For now though,
The stats entries all point to the same tuple descriptor from the
onerel passed to BuildRelationExtStatistics(), so it's OK to just use
the first tuple descriptor in this way. The comment should be updated
to explain that.

12). bms_member_index() should surely be in bitmapset.c. It could be
more efficient by just traversing the bitmap words and making use of
bmw_popcount(). Also, its second argument should be of type 'int' for
consistency with other bms_* functions.

13). estimate_ndistinct() has been moved from mvdistinct.c to
extended_stats.c and changed from static to extern, but it is only
called from mvdistinct.c, so that change is unnecessary (at least as
far as this patch is concerned).

14). The attnums Bitmapset passed to
statext_is_compatible_clause_internal() is an input/output argument
that it updates. That should probably be documented. When it calls
itself recursively for AND/OR/NOT clauses, it could just pass the
original Bitmapset through to be updated (rather than creating a new
one and merging it), as it does for other types of clause.

On the other hand, the outer function statext_is_compatible_clause()
does need to return a new bitmap, which may or may not be used by its
caller, so it would be cleaner to make that a strictly "out" parameter
and initialise it to NULL in that function, rather than in its caller.

15). As I said yesterday, I don't think that there is a clean
separator of concerns between the functions clauselist_selectivity(),
statext_clauselist_selectivity(),
dependencies_clauselist_selectivity() and
mcv_clauselist_selectivity(), I think things could be re-arranged as
follows:

statext_clauselist_selectivity() - as the name suggests - should take
care of *all* extended stats estimation, not just MCVs and histograms.
So make it a fairly small function, calling
mcv_clauselist_selectivity() and
dependencies_clauselist_selectivity(), and histograms when that gets
added.

Most of the current code in statext_clauselist_selectivity() is really
MCV-specific, so move that to mcv_clauselist_selectivity(). Amongst
other things, that would move the call to choose_best_statistics() to
mcv_clauselist_selectivity() (just as
dependencies_clauselist_selectivity() calls choose_best_statistics()
to get the best dependencies statistics). Then, when histograms are
added later, you won't have the problem pointed out before where it
can't apply both MCV and histogram stats if they're on different
STATISTICS objects.

Most of the comments for statext_clauselist_selectivity() are also
MCV-related. Those too would move to mcv_clauselist_selectivity().

Regards,
Dean

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Dean Rasheed

Дата:

10 марта 2019 г., 18:57:55

On Sun, 10 Mar 2019 at 13:09, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> Here are some more comments:
>

One more thing --- the comment for statext_clauselist_selectivity() says:

 * So (simple_selectivity - base_selectivity) may be seen as a correction for
 * the part not covered by the MCV list.

That's not quite right. It should really say that (simple_selectivity
- base_selectivity) is an estimate for the part not covered by the MCV
list, or that (mcv_selectivity - base_selectivity) is a correction for
the part covered by the MCV list. Those 2 statements are actually
equivalent, and different from what you wrote.

Perhaps the easiest way to see it is to work through a simple example:

Suppose you had the following clauses:

  a = 1 AND b >= 0 AND b <= 10

The per-column stats might be expected to give reasonable independent
estimates for the following 2 things:

  P(a = 1)

  P(b >= 0 AND b <= 10)  -- in general, greater than P(b >= 0) * P(b <= 10)

but the overall estimate produced by clauselist_selectivity_simple()
would then just be the product of those 2 things:

  simple_sel = P(a = 1) * P(b >= 0 AND b <= 10)

which might not be so good if the columns were correlated.

Now suppose you had MCV stats, which included MCV items for the
following specific values:

  (a=1,b=1), (a=1,b=2), (a=1,b=3)

but no other relevant MCV entries. (There might be lots of other MCV
items that don't match the original clauses, but they're irrelavent
for this discssion.) That would mean that we could get reasonable
estimates for the following 2 quantities:

  mcv_sel = P(a = 1 AND b IN (1,2,3))
          = P(a = 1 AND b = 1) + P(a = 1 AND b = 2) + P(a = 1 AND b = 3)

  mcv_basesel = base_freq(a = 1 AND b IN (1,2,3))
              = P(a = 1) * (P(b = 1) + P(b = 2) + P(b = 3))

So how is that useful? Well, returning to the quantity that we're
actually trying to compute, it can be split into MCV and non-MCV
parts, and since they're mutually exclusive possibilities, their
probabilities just add up. Thus we can write:

  P(a = 1 AND b >= 0 AND b <= 10)

    = P(a = 1 AND b IN (1,2,3))                             -- MCV part
    + P(a = 1 AND b >= 0 AND b <= 10 AND b NOT IN (1,2,3))  -- non-MCV part

    = mcv_sel + other_sel

So the first term is easy -- it's just mcv_sel, from above. The second
term is trickier though, since we have no information about the
correlation between a and b in the non-MCV region. Just about the best
we can do is assume that they're independent, which gives:

  other_sel = P(a = 1 AND b >= 0 AND b <= 10 AND b NOT IN (1,2,3))
           ~= P(a = 1) * P(b >= 0 AND b <= 10 AND b NOT IN (1,2,3))

and that can now be written in terms of things that we know

  other_sel ~= P(a = 1) * P(b >= 0 AND b <= 10 AND b NOT IN (1,2,3))

             = P(a = 1) * P(b >= 0 AND b <= 10)
             - P(a = 1) * P(b IN (1,2,3))  -- mutually exclusive possibilities

             = simple_sel - mcv_basesel

So, as I said above, (simple_selectivity - base_selectivity) is an
estimate for the part not covered by the MCV list.

Another way to look at it is to split the original per-column estimate
up into MCV and non-MCV parts, and correct the MCV part using the MCV
stats:

  simple_sel = P(a = 1) * P(b >= 0 AND b <= 10)

             = P(a = 1) * P(b IN (1,2,3))
             + P(a = 1) * P(b >= 0 AND b <= 10 AND b NOT IN (1,2,3))

The first term is just mcv_basesel, so we can define other_sel to be
the other term, giving

  simple_sel = mcv_basesel  -- MCV part
             + other_sel    -- non-MCV part

Clearly mcv_basesel isn't the best estimate for the MCV part, and it
should really be mcv_sel, so we can improve upon simple_sel by
applying a correction of (mcv_sel - basesel) to it:

  better estimate = simple_sel + (mcv_sel - mcv_basesel)
                  = mcv_sel + other_sel

                    (where other_sel = simple_sel - mcv_basesel)

Of course, that's totally equivalent, but looking at it this way
(mcv_selectivity - base_selectivity) can be seen as a correction for
the part covered by the MCV list.

All of that generalises to arbitrary clauses, because the matching
items in the MCV list are independent possibilities that sum up, and
the MCV and non-MCV parts are mutually exclusive. That's also why the
basesel calculation in mcv_clauselist_selectivity() must only include
matching MCV items, and the following XXX comment is wrong:

+   for (i = 0; i < mcv->nitems; i++)
+   {
+       *totalsel += mcv->items[i]->frequency;
+
+       if (matches[i] != STATS_MATCH_NONE)
+       {
+           /* XXX Shouldn't the basesel be outside the if condition? */
+           *basesel += mcv->items[i]->base_frequency;
+           s += mcv->items[i]->frequency;
+       }
+   }

So I believe that that code is correct, as written.

Regards,
Dean

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

10 марта 2019 г., 20:36:55

Hi Dean,

Thanks for the review. I'll post a patch fixing most of the stuff soon,
but a few comments/questions regarding some of the issues:

On 3/9/19 7:33 PM, Dean Rasheed wrote:
> 5). It's not obvious what some of the new test cases in the
> "stats_ext" tests are intended to show. For example, the first test
> creates a table with 5000 rows and a couple of indexes, does a couple
> of queries, builds some MCV stats, and then repeats the queries, but
> the results seem to be the same with and without the stats.
>

Hmmm. I thought those tests are testing that we get the right plan, but
maybe I broke that somehow during the rebases. Will check.

> I wonder if it's possible to write smaller, more targeted tests.
> Currently "stats_ext" is by far the slowest test in its group, and I'm
> not sure that some of those tests add much. It ought to be possible to
> write a function that calls EXPLAIN and returns a query's row
> estimate, and then you could write tests to confirm the effect of the
> new stats by verifying the row estimates change as expected.

Sure, if we can write more targeted tests, that would be good. But it's
not quite clear to me how wrapping EXPLAIN in a function makes those
tests any faster?

On 3/10/19 2:09 PM, Dean Rasheed wrote:
> 12). bms_member_index() should surely be in bitmapset.c. It could be
> more efficient by just traversing the bitmap words and making use of
> bmw_popcount(). Also, its second argument should be of type 'int' for
> consistency with other bms_* functions.

Yes, moving to bitmapset.c definitely makes sense. I don't see how it
could use bms_popcount() though.

On 3/10/19 2:09 PM, Dean Rasheed wrote:
> 14). The attnums Bitmapset passed to
> statext_is_compatible_clause_internal() is an input/output argument
> that it updates. That should probably be documented. When it calls
> itself recursively for AND/OR/NOT clauses, it could just pass the
> original Bitmapset through to be updated (rather than creating a new
> one and merging it), as it does for other types of clause.
>

I don't think it's really possible, because the AND/OR/NOT clause is
considered compatible only when all the pieces are compatible. So we
can't tweak the original bitmapset directly in case the incompatible
clause is not the very first one.

> On the other hand, the outer function statext_is_compatible_clause()
> does need to return a new bitmap, which may or may not be used by its
> caller, so it would be cleaner to make that a strictly "out" parameter
> and initialise it to NULL in that function, rather than in its caller.


On 3/10/19 2:09 PM, Dean Rasheed wrote:
> 15). As I said yesterday, I don't think that there is a clean
> separator of concerns between the functions clauselist_selectivity(),
> statext_clauselist_selectivity(),
> dependencies_clauselist_selectivity() and
> mcv_clauselist_selectivity(), I think things could be re-arranged as
> follows:
>
> statext_clauselist_selectivity() - as the name suggests - should take
> care of *all* extended stats estimation, not just MCVs and histograms.
> So make it a fairly small function, calling
> mcv_clauselist_selectivity() and
> dependencies_clauselist_selectivity(), and histograms when that gets
> added.
>
> Most of the current code in statext_clauselist_selectivity() is really
> MCV-specific, so move that to mcv_clauselist_selectivity(). Amongst
> other things, that would move the call to choose_best_statistics() to
> mcv_clauselist_selectivity() (just as
> dependencies_clauselist_selectivity() calls choose_best_statistics()
> to get the best dependencies statistics). Then, when histograms are
> added later, you won't have the problem pointed out before where it
> can't apply both MCV and histogram stats if they're on different
> STATISTICS objects.

I agree clauselist_selectivity() shouldn't care about various types of
extended statistics (MCV vs. functional dependencies). But I'm not sure
the approach you suggested (moving stuff to mcv_clauselist_selectivity)
would work particularly well because most of it is not specific to MCV
lists. It'll also need to care about histograms, for example.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

David Rowley

Дата:

11 марта 2019 г., 01:27:54

On Mon, 11 Mar 2019 at 06:36, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On 3/9/19 7:33 PM, Dean Rasheed wrote:
> > I wonder if it's possible to write smaller, more targeted tests.
> > Currently "stats_ext" is by far the slowest test in its group, and I'm
> > not sure that some of those tests add much. It ought to be possible to
> > write a function that calls EXPLAIN and returns a query's row
> > estimate, and then you could write tests to confirm the effect of the
> > new stats by verifying the row estimates change as expected.
>
> Sure, if we can write more targeted tests, that would be good. But it's
> not quite clear to me how wrapping EXPLAIN in a function makes those
> tests any faster?

I've not looked at the tests in question, but if they're executing an
inferior plan is used when no extended stats exists, then maybe that's
why they're slow.

I think Dean might mean to create a function similar to
explain_parallel_append() in partition_prune.sql then write tests that
check the row estimate with EXPLAIN (COSTS ON) but strip out the other
costing stuff instead of validating that the poor plan was chosen.

> On 3/10/19 2:09 PM, Dean Rasheed wrote:
> > 12). bms_member_index() should surely be in bitmapset.c. It could be
> > more efficient by just traversing the bitmap words and making use of
> > bmw_popcount(). Also, its second argument should be of type 'int' for
> > consistency with other bms_* functions.
>
> Yes, moving to bitmapset.c definitely makes sense. I don't see how it
> could use bms_popcount() though.

I think it could be done by first checking if the parameter is a
member of the set, and then if so, count all the bits that come on and
before that member. You can use bmw_popcount() for whole words before
the specific member's word then just bitwise-and a bit mask of a
bitmapword that has all bits set for all bits on and before your
parameter's BITNUM(), and add the bmw_popcount of the final word
bitwise-anding the mask. bms_add_range() has some masking code you
could copy.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

11 марта 2019 г., 05:59:33


On 3/10/19 11:27 PM, David Rowley wrote:
> On Mon, 11 Mar 2019 at 06:36, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> On 3/9/19 7:33 PM, Dean Rasheed wrote:
>>> I wonder if it's possible to write smaller, more targeted tests.
>>> Currently "stats_ext" is by far the slowest test in its group, and I'm
>>> not sure that some of those tests add much. It ought to be possible to
>>> write a function that calls EXPLAIN and returns a query's row
>>> estimate, and then you could write tests to confirm the effect of the
>>> new stats by verifying the row estimates change as expected.
>>
>> Sure, if we can write more targeted tests, that would be good. But it's
>> not quite clear to me how wrapping EXPLAIN in a function makes those
>> tests any faster?
> 
> I've not looked at the tests in question, but if they're executing an
> inferior plan is used when no extended stats exists, then maybe that's
> why they're slow.
> 

I don't think the tests are executing any queries - the tests merely
generate execution plans, without executing them.

> I think Dean might mean to create a function similar to
> explain_parallel_append() in partition_prune.sql then write tests that
> check the row estimate with EXPLAIN (COSTS ON) but strip out the other
> costing stuff instead of validating that the poor plan was chosen.
> 

I'm not opposed to doing that, of course. I'm just not sure it's a way
to make the tests faster. Will investigate.

>> On 3/10/19 2:09 PM, Dean Rasheed wrote:
>>> 12). bms_member_index() should surely be in bitmapset.c. It could be
>>> more efficient by just traversing the bitmap words and making use of
>>> bmw_popcount(). Also, its second argument should be of type 'int' for
>>> consistency with other bms_* functions.
>>
>> Yes, moving to bitmapset.c definitely makes sense. I don't see how it
>> could use bms_popcount() though.
> 
> I think it could be done by first checking if the parameter is a
> member of the set, and then if so, count all the bits that come on and
> before that member. You can use bmw_popcount() for whole words before
> the specific member's word then just bitwise-and a bit mask of a
> bitmapword that has all bits set for all bits on and before your
> parameter's BITNUM(), and add the bmw_popcount of the final word
> bitwise-anding the mask. bms_add_range() has some masking code you
> could copy.
> 

Ah, right - that would work.


cheers

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Dean Rasheed

Дата:

11 марта 2019 г., 11:15:41

On Sun, 10 Mar 2019 at 17:36, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 3/10/19 2:09 PM, Dean Rasheed wrote:
> > 14). The attnums Bitmapset passed to
> > statext_is_compatible_clause_internal() is an input/output argument
> > that it updates. That should probably be documented. When it calls
> > itself recursively for AND/OR/NOT clauses, it could just pass the
> > original Bitmapset through to be updated (rather than creating a new
> > one and merging it), as it does for other types of clause.
>
> I don't think it's really possible, because the AND/OR/NOT clause is
> considered compatible only when all the pieces are compatible. So we
> can't tweak the original bitmapset directly in case the incompatible
> clause is not the very first one.
>

In the case where the overall clause is incompatible, you don't
actually care about the attnums returned. Right now it will return an
empty set (NULL). With this change it would return all the attnums
encountered before the incompatible piece, but that wouldn't matter.
In fact, you could easily preserve the current behaviour just by
having the outer statext_is_compatible_clause() function set attnums
back to NULL if the result is false.

Regards,
Dean

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Dean Rasheed

Дата:

11 марта 2019 г., 11:35:36

On Sun, 10 Mar 2019 at 22:28, David Rowley <david.rowley@2ndquadrant.com> wrote:
>
> On Mon, 11 Mar 2019 at 06:36, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> >
> > On 3/9/19 7:33 PM, Dean Rasheed wrote:
> > > I wonder if it's possible to write smaller, more targeted tests.
> > > Currently "stats_ext" is by far the slowest test in its group, and I'm
> > > not sure that some of those tests add much. It ought to be possible to
> > > write a function that calls EXPLAIN and returns a query's row
> > > estimate, and then you could write tests to confirm the effect of the
> > > new stats by verifying the row estimates change as expected.
> >
> > Sure, if we can write more targeted tests, that would be good. But it's
> > not quite clear to me how wrapping EXPLAIN in a function makes those
> > tests any faster?
>
> I've not looked at the tests in question, but if they're executing an
> inferior plan is used when no extended stats exists, then maybe that's
> why they're slow.
>
> I think Dean might mean to create a function similar to
> explain_parallel_append() in partition_prune.sql then write tests that
> check the row estimate with EXPLAIN (COSTS ON) but strip out the other
> costing stuff instead of validating that the poor plan was chosen.
>

Yeah that's the sort of thing I was thinking of. I think it might be
possible to write simpler and faster tests by inserting far fewer rows
and relying on ANALYSE having sampled everything, so the row estimates
should be predictable. It may be the case that, with just a handful of
rows, the extended stats don't affect the plan, but you'd still see a
difference in the row estimates, and that could be a sufficient test I
think.

> > On 3/10/19 2:09 PM, Dean Rasheed wrote:
> > > 12). bms_member_index() should surely be in bitmapset.c. It could be
> > > more efficient by just traversing the bitmap words and making use of
> > > bmw_popcount(). Also, its second argument should be of type 'int' for
> > > consistency with other bms_* functions.
> >
> > Yes, moving to bitmapset.c definitely makes sense. I don't see how it
> > could use bms_popcount() though.
>
> I think it could be done by first checking if the parameter is a
> member of the set, and then if so, count all the bits that come on and
> before that member. You can use bmw_popcount() for whole words before
> the specific member's word then just bitwise-and a bit mask of a
> bitmapword that has all bits set for all bits on and before your
> parameter's BITNUM(), and add the bmw_popcount of the final word
> bitwise-anding the mask. bms_add_range() has some masking code you
> could copy.

Yep, that's what I was imagining. Except I think that to get a 0-based
index result you'd want the mask to have all bits set for bits
*before* the parameter's BITNUM(), rather than on and before. So I
think the mask would simply be

  ((bitmapword) 1 << bitnum) - 1

Regards,
Dean

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

13 марта 2019 г., 04:25:40

Hi,

attached is an updated version of the patch, addressing most of the
issues raised in the recent reviews. There are two main exceptions:

1) I haven't reworked the regression tests to use a function to check
cardinality estimates and making them faster.

2) Review handling of bitmap in statext_is_compatible_clause_internal
when processing AND/OR/NOT clauses.

I plan to look into those items next, but I don't want block review of
other parts of the patch unnecessarily.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On 3/14/19 12:56 PM, Kyotaro HORIGUCHI wrote:
> At Wed, 13 Mar 2019 19:37:45 +1300, David Rowley <david.rowley@2ndquadrant.com> wrote in
<CAKJS1f_6qDQj9m2H0jF4bRkZVLpfc7O9E+MxdXrq0wgv0z1NrQ@mail.gmail.com>
>> On Wed, 13 Mar 2019 at 17:20, Kyotaro HORIGUCHI
>> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>>> bms_member_index seems working differently than maybe expected.
>>>
>>>  bms_member_index((2, 4), 0) => 0, (I think) should be -1
>>>  bms_member_index((2, 4), 1) => 0, should be -1
>>>  bms_member_index((2, 4), 2) => 0, should be 0
>>>  bms_member_index((2, 4), 3) => 1, should be -1
>>>  bms_member_index((2, 4), 4) => 1, should be 1
>>>  bms_member_index((2, 4), 5) => 2, should be -1
>>>  bms_member_index((2, 4), 6) => 2, should be -1
>>> ...
>>>  bms_member_index((2, 4), 63) => 2, should be -1
>>>  bms_member_index((2, 4), 64) => -1, correct
>>>
>>> It works correctly only when x is a member - the way the function
>>> is maybe actually used in this patch -, or needs to change the
>>> specifiction (or the comment) of the function.
>>
>> Looks like:
>>
>> + if (wordnum >= a->nwords)
>> + return -1;
>>
>> should be:
>>
>> + if (wordnum >= a->nwords ||
>> + (a->word[wordnum] & ((bitmapword) 1 << bitnum)) == 0)
>> + return -1;
> 
> Yeah, seems right.
> 

Yep, that was broken. The attached patch fixes this by simply calling
bms_is_member, instead of copying the checks into bms_member_index.

I've also reworked the regression tests to use a function extracting the
cardinality estimates, as proposed by Dean and David. I have not reduced
the size of data sets yet, so the tests are not much faster, but we no
longer check the exact query plan. That's probably a good idea anyway.
Actually - the tests are a bit faster because it allows removing indexes
that were used for the query plans.

FWIW I've noticed an annoying thing when modifying type of column not
included in a statistics. Consider this:

create table t (a int, b int, c text);
insert into t select mod(i,10), mod(i,10), ''
  from generate_series(1,10000) s(i);
create statistics s (dependencies) on a,b from t;
analyze t;

explain analyze select * from t where a = 1 and b = 1;

                                QUERY PLAN
---------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..205.00 rows=1000 width=9)
                (actual time=0.014..1.910 rows=1000 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9000
 Planning Time: 0.119 ms
 Execution Time: 2.234 ms
(5 rows)

alter table t alter c type varchar(61);

explain analyze select * from t where a = 1 and b = 1;

                              QUERY PLAN
---------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..92.95 rows=253 width=148)
                (actual time=0.020..2.420 rows=1000 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9000
 Planning Time: 0.128 ms
 Execution Time: 2.767 ms
(5 rows)

select stxdependencies from pg_statistic_ext;

             stxdependencies
------------------------------------------
 {"1 => 2": 1.000000, "2 => 1": 1.000000}
(1 row)

That is, we don't remove the statistics, but the estimate still changes.
But that's because the ALTER TABLE also resets reltuples/relpages:

select relpages, reltuples from pg_class where relname = 't';

 relpages | reltuples
----------+-----------
        0 |         0
(1 row)

That's a bit unfortunate, and it kinda makes the whole effort to not
drop the statistics unnecessarily kinda pointless :-(


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

On 3/16/19 11:55 AM, Dean Rasheed wrote:
> On Fri, 15 Mar 2019 at 00:06, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> I've noticed an annoying thing when modifying type of column not
>> included in a statistics...
>>
>> That is, we don't remove the statistics, but the estimate still changes.
>> But that's because the ALTER TABLE also resets reltuples/relpages:
>>
>> That's a bit unfortunate, and it kinda makes the whole effort to not
>> drop the statistics unnecessarily kinda pointless :-(
>>
> 
> Well not entirely. Repeating that test with 100,000 rows, I get an
> initial estimate of 9850 (actual 10,000), which then drops to 2451
> after altering the column. But if you drop the dependency statistics,
> the estimate drops to 241, so clearly there is some benefit in keeping
> them in that case.
> 

Sure. What I meant is that to correct the relpages/reltuples estimates
you need to do ANALYZE, which rebuilds the statistics anyway. Although
VACUUM also fixes the estimates, without the stats rebuild.

> Besides, I thought there was no extra effort in keeping the extended
> statistics in this case -- isn't it just using the column
> dependencies, so in this case UpdateStatisticsForTypeChange() never
> gets called anyway?
> 

Yes, it does not get called at all. My point was that I was a little bit
confused because the test says "check change of unrelated column type
does not reset the MCV statistics" yet the estimates do actually change.

I wonder why we reset the relpages/reltuples to 0, instead of retaining
the original values, though. That would likely give us better density
estimates in estimate_rel_size, I think.

So I've tried doing that, and I've included it as 0001 into the patch
series. It seems to work, but I suppose the reset is there for a reason.
In any case, this is a preexisting issue, independent of what this patch
does or changes.

I've discovered another issue, though. Currently, clauselist_selectivity
has this as the very beginning:

    /*
     * If there's exactly one clause, just go directly to
     * clause_selectivity(). None of what we might do below is relevant.
     */
    if (list_length(clauses) == 1)
        return clause_selectivity(root, (Node *) linitial(clauses),
                                  varRelid, jointype, sjinfo);

Which however fails with queries like this:

    WHERE (a = 1 OR b = 1)

because clauselist_selectivity sees it as a single clause, passes it to
clause_selectivity and the OR-clause handling simply relies on

    (s1 + s2 - s1 * s2)

which entirely ignores the multivariate stats. The other similar places
in clause_selectivity() simply call clauselist_selectivity() so that's
OK, but OR-clauses don't do that.

For functional dependencies this is not a huge issue because those apply
only to AND-clauses. But there were proposals to maybe apply them to
other types of clauses, in which case it might become issue.

I think the best fix is moving the optimization after the multivariate
stats are applied. The only alternative I can think of is modifying
clauselist_selectivity so that it can be executed on OR-clauses. But
that seems much more complicated than the former option for almost no
other advantages.

I've also changed how statext_is_compatible_clause_internal() handles
the attnums bitmapset - you were right in your 3/10 message that we can
just pass the value, without creating a local bitmapset. So I've just
done that.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Dean Rasheed

Дата:

17 марта 2019 г., 00:26:40

On Fri, 15 Mar 2019 at 00:06, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> ... attached patch ...

Some more review comments, carrying on from where I left off:

16). This regression test fails for me:

@@ -654,11 +654,11 @@
 -- check change of unrelated column type does not reset the MCV statistics
 ALTER TABLE mcv_lists ALTER COLUMN d TYPE VARCHAR(64);
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a =
1 AND b = ''1''');
  estimated | actual
 -----------+--------
-        50 |     50
+        11 |     50
 (1 row)

Maybe that's platform-dependent, given what you said about
reltuples/relpages being reset. An easy workaround for this would be
to modify this test (and perhaps the one that follows) to just query
pg_statistic_ext to see if the MCV statistics have been reset.

17). I'm definitely preferring the new style of tests because they're
much neater and easier to read, and to directly see the effect of the
extended statistics. One thing I'd consider adding is a query of
pg_statistic_ext using pg_mcv_list_items() after creating the MCV
stats, both to test that function, and to show that the MCV lists have
the expected contents (provided that output isn't too large).

18). Spurious whitespace added to src/backend/statistics/mvdistinct.c.

19). In the function comment for statext_mcv_clauselist_selectivity(),
the name at the top doesn't match the new function name. Also, I think
it should mention MCV in the initial description. I.e., instead of

+/*
+ * mcv_clauselist_selectivity
+ *        Estimate clauses using the best multi-column statistics.

it should say:

+/*
+ * statext_mcv_clauselist_selectivity
+ *        Estimate clauses using the best multi-column MCV statistics.

20). Later in the same comment, this part should now be deleted:

+ *
+ * So (simple_selectivity - base_selectivity) may be seen as a correction for
+ * the part not covered by the MCV list.

21). For consistency with other bms_ functions, I think the name of
the Bitmapset argument for bms_member_index() should just be called
"a". Nitpicking, I'd also put bms_member_index() immediately after
bms_is_member() in the source, to match the header.

22). mcv_get_match_bitmap() should really use an array of bool rather
than an array of char. Note that a bool is guaranteed to be of size 1,
so it won't make things any less efficient, but it will allow some
code to be made neater. E.g., all clauses like "matches[i] == false"
and "matches[i] != false" can just be made "!matches[i]" or
"matches[i]". Also the Min/Max expressions on those match flags can be
replaced with the logical operators && and ||.

23). Looking at this code in statext_mcv_build():

        /* store info about data type OIDs */
        i = 0;
        j = -1;
        while ((j = bms_next_member(attrs, j)) >= 0)
        {
            VacAttrStats *colstat = stats[i];

            mcvlist->types[i] = colstat->attrtypid;
            i++;
        }

it isn't actually making use of the attribute numbers (j) from attrs,
so this could be simplified to:

        /* store info about data type OIDs */
        for (i = 0; i < numattrs; i++)
            mcvlist->types[i] = stats[i]->attrtypid;

24). Later in that function, the following comment doesn't appear to
make sense. Is this possibly from an earlier version of the code?

            /* copy values from the _previous_ group (last item of) */

25). As for (23), in build_mss(), the loop over the Bitmapset of
attributes never actually uses the attribute numbers (j), so that
could just be a loop from i=0 to numattrs-1, and then that function
doesn't need to be passed the Bitmapset at all -- it could just be
passed the integer numattrs.

26). build_distinct_groups() looks like it makes an implicit
assumption that the counts of the items passed in are all zero. That
is indeed the case, if they've come from build_sorted_items(), because
that does a palloc0(), but that feels a little fragile. I think it
would be better if build_distinct_groups() explicitly set the count
each time it detects a new group.

27). In statext_mcv_serialize(), the TODO comment says

 * TODO: Consider packing boolean flags (NULL) for each item into a single char
 * (or a longer type) instead of using an array of bool items.

A more efficient way to save space might be to do away with the
boolean null flags entirely, and just use a special index value like
0xffff to signify a NULL value.

28). I just spotted the 1MB limit on the serialised MCV list size. I
think this is going to be too limiting. For example, if the stats
target is at its maximum of 10000, that only leaves around 100 bytes
for each item's values, which is easily exceeded. In fact, I think
this approach for limiting the MCV list size isn't a good one --
consider what would happen if there were lots of very large values.
Would it run out of memory before getting to that test? Even if not,
it would likely take an excessive amount of time.

I think this part of the patch needs a bit of a rethink. My first
thought is to do something similar to what happens for per-column
MCVs, and set an upper limit on the size of each value that is ever
considered for inclusion in the stats (c.f. WIDTH_THRESHOLD and
toowide_cnt in analyse.c). Over-wide values should be excluded early
on, and it will need to track whether or not any such values were
excluded, because then it wouldn't be appropriate to treat the stats
as complete and keep the entire list, without calling
get_mincount_for_mcv_list().

That's it for now.

Regards,
Dean

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

17 марта 2019 г., 02:44:23


On 3/16/19 10:26 PM, Dean Rasheed wrote:
> On Fri, 15 Mar 2019 at 00:06, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> ... attached patch ...
> 
> Some more review comments, carrying on from where I left off:
> 
> 16). This regression test fails for me:
> 
> @@ -654,11 +654,11 @@
>  -- check change of unrelated column type does not reset the MCV statistics
>  ALTER TABLE mcv_lists ALTER COLUMN d TYPE VARCHAR(64);
>  SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a =
> 1 AND b = ''1''');
>   estimated | actual
>  -----------+--------
> -        50 |     50
> +        11 |     50
>  (1 row)
> 
> Maybe that's platform-dependent, given what you said about
> reltuples/relpages being reset. An easy workaround for this would be
> to modify this test (and perhaps the one that follows) to just query
> pg_statistic_ext to see if the MCV statistics have been reset.
> 

Ah, sorry for not explaining this bit - the failure is expected, due to
the reset of relpages/reltuples I mentioned. We do keep the extended
stats, but the relsize estimate changes a bit. It surprised me a bit,
and this test made the behavior apparent. The last patchset included a
piece that changes that - if we decide not to change this, I think we
can simply accept the actual output.

> 17). I'm definitely preferring the new style of tests because they're
> much neater and easier to read, and to directly see the effect of the
> extended statistics. One thing I'd consider adding is a query of
> pg_statistic_ext using pg_mcv_list_items() after creating the MCV
> stats, both to test that function, and to show that the MCV lists have
> the expected contents (provided that output isn't too large).
> 

OK, will do.

> 18). Spurious whitespace added to src/backend/statistics/mvdistinct.c.
> 

fixed

> 19). In the function comment for statext_mcv_clauselist_selectivity(),
> the name at the top doesn't match the new function name. Also, I think
> it should mention MCV in the initial description. I.e., instead of
> 
> +/*
> + * mcv_clauselist_selectivity
> + *        Estimate clauses using the best multi-column statistics.
> 
> it should say:
> 
> +/*
> + * statext_mcv_clauselist_selectivity
> + *        Estimate clauses using the best multi-column MCV statistics.
> 

fixed

> 20). Later in the same comment, this part should now be deleted:
> 
> + *
> + * So (simple_selectivity - base_selectivity) may be seen as a correction for
> + * the part not covered by the MCV list.
> 

fixed

> 21). For consistency with other bms_ functions, I think the name of
> the Bitmapset argument for bms_member_index() should just be called
> "a". Nitpicking, I'd also put bms_member_index() immediately after
> bms_is_member() in the source, to match the header.
> 

I think I've already done the renames in the last patch I submitted (are
you looking at an older version of the code, perhaps?). I've moved it
right after bms_is_member - good idea.

> 22). mcv_get_match_bitmap() should really use an array of bool rather
> than an array of char. Note that a bool is guaranteed to be of size 1,
> so it won't make things any less efficient, but it will allow some
> code to be made neater. E.g., all clauses like "matches[i] == false"
> and "matches[i] != false" can just be made "!matches[i]" or
> "matches[i]". Also the Min/Max expressions on those match flags can be
> replaced with the logical operators && and ||.
> 

fixed

> 23). Looking at this code in statext_mcv_build():
> 
>         /* store info about data type OIDs */
>         i = 0;
>         j = -1;
>         while ((j = bms_next_member(attrs, j)) >= 0)
>         {
>             VacAttrStats *colstat = stats[i];
> 
>             mcvlist->types[i] = colstat->attrtypid;
>             i++;
>         }
> 
> it isn't actually making use of the attribute numbers (j) from attrs,
> so this could be simplified to:
> 
>         /* store info about data type OIDs */
>         for (i = 0; i < numattrs; i++)
>             mcvlist->types[i] = stats[i]->attrtypid;
> 

yep, fixed

> 24). Later in that function, the following comment doesn't appear to
> make sense. Is this possibly from an earlier version of the code?
> 
>             /* copy values from the _previous_ group (last item of) */
> 

yep, seems like a residue from an older version, fixed

> 25). As for (23), in build_mss(), the loop over the Bitmapset of
> attributes never actually uses the attribute numbers (j), so that
> could just be a loop from i=0 to numattrs-1, and then that function
> doesn't need to be passed the Bitmapset at all -- it could just be
> passed the integer numattrs.
> 

fixed

> 26). build_distinct_groups() looks like it makes an implicit
> assumption that the counts of the items passed in are all zero. That
> is indeed the case, if they've come from build_sorted_items(), because
> that does a palloc0(), but that feels a little fragile. I think it
> would be better if build_distinct_groups() explicitly set the count
> each time it detects a new group.
> 

good idea, fixed

> 27). In statext_mcv_serialize(), the TODO comment says
> 
>  * TODO: Consider packing boolean flags (NULL) for each item into a single char
>  * (or a longer type) instead of using an array of bool items.
> 
> A more efficient way to save space might be to do away with the
> boolean null flags entirely, and just use a special index value like
> 0xffff to signify a NULL value.
> 

Hmmm, maybe. I think there's a room for improvement.

> 28). I just spotted the 1MB limit on the serialised MCV list size. I
> think this is going to be too limiting. For example, if the stats
> target is at its maximum of 10000, that only leaves around 100 bytes
> for each item's values, which is easily exceeded. In fact, I think
> this approach for limiting the MCV list size isn't a good one --
> consider what would happen if there were lots of very large values.
> Would it run out of memory before getting to that test? Even if not,
> it would likely take an excessive amount of time.
> 

True. I don't have a very good argument for a specific value, or even
having an explicit limit at all. I've initially added it mostly as a
safety for development purposes, but I think you're right we can just
get rid of it. I don't think it'd run out of memory before hitting the
limit, but I haven't tried very hard (but I recall running into the 1MB
limit in the past).

> I think this part of the patch needs a bit of a rethink. My first
> thought is to do something similar to what happens for per-column
> MCVs, and set an upper limit on the size of each value that is ever
> considered for inclusion in the stats (c.f. WIDTH_THRESHOLD and
> toowide_cnt in analyse.c). Over-wide values should be excluded early
> on, and it will need to track whether or not any such values were
> excluded, because then it wouldn't be appropriate to treat the stats
> as complete and keep the entire list, without calling
> get_mincount_for_mcv_list().
> 
Which part? Serialization / deserialization? Or how we handle long
values when building the MCV list?

cheers

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hi,

On 3/17/19 12:47 PM, Dean Rasheed wrote:
> On Sat, 16 Mar 2019 at 23:44, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>>> 28). I just spotted the 1MB limit on the serialised MCV list size. I
>>> think this is going to be too limiting. For example, if the stats
>>> target is at its maximum of 10000, that only leaves around 100 bytes
>>> for each item's values, which is easily exceeded. In fact, I think
>>> this approach for limiting the MCV list size isn't a good one --
>>> consider what would happen if there were lots of very large values.
>>> Would it run out of memory before getting to that test? Even if not,
>>> it would likely take an excessive amount of time.
>>>
>>
>> True. I don't have a very good argument for a specific value, or even
>> having an explicit limit at all. I've initially added it mostly as a
>> safety for development purposes, but I think you're right we can just
>> get rid of it. I don't think it'd run out of memory before hitting the
>> limit, but I haven't tried very hard (but I recall running into the 1MB
>> limit in the past).
>>
> 
> I've just been playing around a little with this and found that it
> isn't safely dealing with toasted values. For example, consider the
> following test:
> 
> create or replace function random_string(x int) returns text
> as $$
>   select substr(string_agg(md5(random()::text), ''), 1, x)
>     from generate_series(1,(x+31)/32);
> $$ language sql;
> 
> drop table if exists t;
> create table t(a int, b text);
> insert into t values (1, random_string(10000000));
> create statistics s (mcv) on a,b from t;
> analyse t;
> 
> select length(b), left(b,5), right(b,5) from t;
> select length(stxmcv), length((m.values::text[])[2]),
>        left((m.values::text[])[2], 5), right((m.values::text[])[2],5)
>   from pg_statistic_ext, pg_mcv_list_items(stxmcv) m
>  where stxrelid = 't'::regclass;
> 
> The final query returns the following:
> 
>  length |  length  | left  | right
> --------+----------+-------+-------
>     250 | 10000000 | c2667 | 71492
> (1 row)
> 
> suggesting that there's something odd about the stxmcv value. Note,
> also, that it doesn't hit the 1MB limit, even though the value is much
> bigger than that.
> 
> If I then delete the value from the table, without changing the stats,
> and repeat the final query, it falls over:
> 
> delete from t where a=1;
> select length(stxmcv), length((m.values::text[])[2]),
>        left((m.values::text[])[2], 5), right((m.values::text[])[2],5)
>   from pg_statistic_ext, pg_mcv_list_items(stxmcv) m
>  where stxrelid = 't'::regclass;
> 
> ERROR:  unexpected chunk number 5008 (expected 0) for toast value
> 16486 in pg_toast_16480
> 
> So I suspect it was using the toast data from the table t, although
> I've not tried to investigate further.
> 

Yes, it was using the toasted value directly. The attached patch
detoasts the value explicitly, similarly to the per-column stats, and it
also removes the 1MB limit.

> 
>>> I think this part of the patch needs a bit of a rethink. My first
>>> thought is to do something similar to what happens for per-column
>>> MCVs, and set an upper limit on the size of each value that is ever
>>> considered for inclusion in the stats (c.f. WIDTH_THRESHOLD and
>>> toowide_cnt in analyse.c). Over-wide values should be excluded early
>>> on, and it will need to track whether or not any such values were
>>> excluded, because then it wouldn't be appropriate to treat the stats
>>> as complete and keep the entire list, without calling
>>> get_mincount_for_mcv_list().
>>>
>> Which part? Serialization / deserialization? Or how we handle long
>> values when building the MCV list?
>>
> 
> I was thinking (roughly) of something like the following:
> 
> * When building the values array for the MCV list, strip out rows with
> values wider than some threshold (probably something like the
> WIDTH_THRESHOLD = 1024 from analyse.c would be reasonable).
> 
> * When building the MCV list, if some over-wide values were previously
> stripped out, always go into the get_mincount_for_mcv_list() block,
> even if nitems == ngroups (for the same reason a similar thing happens
> for per-column stats -- if some items were stripped out, we're already
> saying that not all items will go in the MCV list, and it's not safe
> to assume that the remaining items are common enough to give accurate
> estimates).
> 

Yes, that makes sense I guess.

> * In the serialisation code, remove the size limit entirely. We know
> that each value is now at most 1024 bytes, and there are at most 10000
> items, and at most 8 columns, so the total size is already reasonably
> well bounded. In the worst case, it might be around 80MB, but in
> practice, it's always likely to be much much smaller than that.
> 

Yep, I've already removed the limit from the current patch.


cheers

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

On 3/21/19 4:05 PM, David Rowley wrote:
> On Mon, 18 Mar 2019 at 02:18, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> Yes, it was using the toasted value directly. The attached patch
>> detoasts the value explicitly, similarly to the per-column stats, and it
>> also removes the 1MB limit.
> 
> I just made a pass over 0001 and 0002.
> 
> 0002 is starting to look pretty good, but I did note down a few things
> while looking.  Some things below might just me being unclear on how
> something works. Perhaps that means more comments are needed, but it
> might also mean I need a brain upgrade. I'm hoping it's the former.
> 

That's good to hear. Thanks for the review.

> 0001:
> 
> 1. Could you write a full commit message for this patch. Without
> reading the backlog on this ticket it's not all that obvious what the
> patch aims to fix. (I have read the backlog, so I know, but the next
> person might not have)
> 
> 2. Should all the relpages variables be BlockNumber rather than double?
> 

Probably. But I think the conclusion from the discussion with Dean was
that tweaking the relpages/reltuples reset should really be a matter for
a separate patch. So I've removed it from this patch series and the
tests were modified to check the stats are still there.

> 0002:
> 
> 3. I'm not sure what the following is trying to say:
> 
> * Estimate selectivity on any clauses applicable by stats tracking
> * actual values first, then apply functional dependencies on the
> * remaining clauses.
> 
> can you reword it?
> 

It was supposed to say we first try to apply the more complicated stats
(those that track dependencies between values) before applying the
simpler ones that only track dependencies between columns. I've reworked
and simplified comments in this part of the code.

> 4. This seems out of date:
> 
> * clauses that we've already estimated for.  Each selectivity
> * function will set the appropriate bit in the bitmapset to mark that
> * no further estimation is required for that list item.
> 
> We're only passing estimatedclauses to 1 function before
> clauselist_selectivity_simple is called for the remainder.
> 

True. I've simplified/reworded this. The old wording was mostly a
residue of how this worked in previous patch versions.

> 5. In build_attnums_array there's
> Assert(AttrNumberIsForUserDefinedAttr(j)); I just wanted to point out
> that this could only possibly trigger of the bitmapset had a 0 member.
> It cannot have negative members.  Maybe it would be worth adding a
> comment to acknowledge that as it looks a bit misguided otherwise.
> 

Right. I've added an explanation, and another assert checking the
maximum value (because bitmaps store integers, but we only expect
attnums here).

> 6. In build_attnums_array(), what's the reason to return int *, rather
> than an AttrNumber * ? Likewise in the code that calls that function.
> 

Laziness, I guess. Also, bitmaps work with int members, so it was kinda
natural. But you're right AttrNumber is a better choice, so fixed.

> 7. Not properly indented. Should be two tabs.
> 
>  * build sorted array of SortItem with values from rows
> 
> Should also be "a sorted array"
> 

Fixed.

> 8. This comment seems to duplicate what is just mentioned in the
> header comment for the function.
> 
> /*
> * We won't allocate the arrays for each item independenly, but in one
> * large chunk and then just set the pointers. This allows the caller to
> * simply pfree the return value to release all the memory.
> */
> 
> Also, typo "independenly" -> "independently"
> 

Fixed. I've removed this comment, the function comment is enough.

> 9. Not properly indented:
> 
> /*
>  * statext_is_compatible_clause_internal
>  * Does the heavy lifting of actually inspecting the clauses for
>  * statext_is_compatible_clause. It needs to be split like this because
>  * of recursion.  The attnums bitmap is an input/output parameter collecting
>  * attribute numbers from all compatible clauses (recursively).
>  */
> 

Fixed. It might be a tad too similar to statext_is_compatible_clause
comment, though.

> 10. Header comment for get_mincount_for_mcv_list() ends with
> *---------- but does not start with that.
> 

Fixed.

> 11. In get_mincount_for_mcv_list() it's probably better to have the
> numerical literals of 0.0 instead of just 0.
> 

Why?

> 12. I think it would be better if you modified build_attnums_array()
> to add an output argument that sets the size of the array. It seems
> that most places you call this function you perform bms_num_members()
> to determine the array size.
> 

Hmmm. I've done this, but I'm not sure I like it very much - there's no
protection the value passed in is the right one, so the array might be
allocated either too small or too large. I think it might be better to
make it work the other way, i.e. pass the value out instead.

> 13. This comment seems to be having a fight with itself:
> 
> * Preallocate Datum/isnull arrays (not as a single chunk, as we will
> * pass the result outside and thus it needs to be easy to pfree().
> *
> * XXX On second thought, we're the only ones dealing with MCV lists,
> * so we might allocate everything as a single chunk to reduce palloc
> * overhead (chunk headers, etc.) without significant risk. Not sure
> * it's worth it, though, as we're not re-building stats very often.
> 

Yes, I've reworded/simplified the comment.

> 14. The following might be easier to read if you used a local variable
> instead of counts[dim].
> 
> for (i = 0; i < mcvlist->nitems; i++)
> {
> /* skip NULL values - we don't need to deduplicate those */
> if (mcvlist->items[i]->isnull[dim])
> continue;
> 
> values[dim][counts[dim]] = mcvlist->items[i]->values[dim];
> counts[dim] += 1;
> }
> 
> Then just assign the value of the local variable to counts[dim] at the end.
> 

I've tried that, but it didn't seem like an improvement so I've kept the
current code.

> 15. Why does this not use stats[dim]->attrcollid ?
> 
> ssup[dim].ssup_collation = DEFAULT_COLLATION_OID;
> 

Hmmm, that's a good question. TBH I don't recall why I used the default
collation here, but I think it's mostly harmless because it's used only
during serialization. But I'll check, it seems suspicious.

But that made me revisit how collations are handled when building the
MCV list, and I see it's using type->typcollation, which I seems wrong
as the column might use a different collation.

But if this is wrong, it's already wrong in dependencies and mvdistinct
statistics ...

> 16. The following:
> 
> else if (info[dim].typlen == -2) /* cstring */
> {
> info[dim].nbytes = 0;
> for (i = 0; i < info[dim].nvalues; i++)
> {
> values[dim][i] = PointerGetDatum(PG_DETOAST_DATUM(values[dim][i]));
> info[dim].nbytes += strlen(DatumGetCString(values[dim][i]));
> }
> }
> 
> seems to conflict with:
> 
> else if (info[dim].typlen == -2) /* cstring */
> {
> memcpy(data, DatumGetCString(v), strlen(DatumGetCString(v)) + 1);
> data += strlen(DatumGetCString(v)) + 1; /* terminator */
> }
> 
> It looks like you'll reserve 1 byte too few for each cstring value.
> 
> (Might also be nicer to assign the strlen to a local variable rather
> than leave it up to the compiler to optimize out the 2nd strlen call
> in the latter of the two code fragments above.)
> 

Good catch! Fixed.

> 17. I wonder if some compilers will warn about this:
> 
> ITEM_INDEXES(item)[dim] = (value - values[dim]);
> 
> Probably a cast to uint16 might fix them if they do.
> 

Possibly. I've added the explicit cast.

> 18. statext_mcv_deserialize: I don't think "Size" should have a
> capaital 'S' here:
> 
> elog(ERROR, "invalid MCV Size %ld (expected at least %zu)",
> VARSIZE_ANY_EXHDR(data), offsetof(MCVList, items));
> 
> Also, the following should likely use the same string to reduce the
> number of string constants:
> 
> elog(ERROR, "invalid MCV size %ld (expected %zu)",
> VARSIZE_ANY_EXHDR(data), expected_size);
> 

Yeah, it should have been "size". But I don't think reusing the same
string is a good idea, because those are two separate/different issues.

> 19. statext_mcv_deserialize: There seems to be a mix of ereports and
> elogs for "shouldn't happen" cases. Any reason to use ereport instead
> of elog for these?
> 
> I also really wonder if you need so many different error messages. I
> imagine if anyone complains about hitting this case then we'd just be
> telling them to run ANALYZE again.
> 

Yeah, it seems a bit of a mess. As those are really "should not happen"
issues, likely caused by some form of data corruption, I think we can
reduce it to fewer checks with one or two error messages.

> 20. Isn't this only needed for modules?
> 
> PG_FUNCTION_INFO_V1(pg_stats_ext_mcvlist_items);
> 

Yep, fixed.

> 21. Do you think it would be better to declare
> pg_stats_ext_mcvlist_items() to accept the oid of the pg_statistic_ext
> row rather than the stxmcv column? (However, I do see you have a mcv
> type, so perhaps you might want other types in the future?)
> 

I don't think so, I don't see what advantages would it have.

> 22. I see lots of usages of DEFAULT_COLLATION_OID in
> mcv_get_match_bitmap. Can you add a comment to explain why that's
> okay? I imagined the collation should match the column's collation.
> 

Yeah, same thing as above. Have to check.

> 23. Are these comments left over from a previous version?
> 
> /* OR - was MATCH_NONE, but will be MATCH_FULL */
> /* AND - was MATC_FULL, but will be MATCH_NONE */
> /* if the clause mismatches the MCV item, set it as MATCH_NONE */
> 

Fixed.

> 24. I think the following comment needs explained a bit better:
> 
> /*
>  * mcv_clauselist_selectivity
>  * Return the selectivity estimate of clauses using MCV list.
>  *
>  * It also produces two interesting selectivities - total selectivity of
>  * all the MCV items combined, and selectivity of the least frequent item
>  * in the list.
>  */
> Selectivity
> mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
>    List *clauses, int varRelid,
>    JoinType jointype, SpecialJoinInfo *sjinfo,
>    RelOptInfo *rel,
>    Selectivity *basesel, Selectivity *totalsel)
> 
> I see 3 possible selectivities.  What's different with *totalsel and
> the return value of the function?
> 
> (I can see from looking at the actual code that it's not, but I don't
> really know why it has to be different)
> 

Well, it returns the selectivity estimate (matching the clauses), and
then two additional selectivities:

1) total - a sum of frequencies for all MCV items (essentially, what
fraction of data is covered by the MCV list), which is then used to
estimate the non-MCV part

2) base - a sum of base frequencies for matching items (which is used
for correction of the non-MCV part)

I'm not sure I quite understand what's unclear here.

> 25. In README.mcv, I don't quite understand this:
> 
> TODO Currently there's no logic to consider building only an MCV list (and not
>      building the histogram at all), except for doing this decision manually in
>      ADD STATISTICS.
> 
> Not sure why histograms are mentioned and also not sure what ADD STATISTICS is.
> 

Yeah, that's obsolete. Removed.

> 26. I don't quite understand the "to defend against malicious input" part in.
> 
> It accepts one parameter - a pg_mcv_list value (which can only be obtained
> from pg_statistic_ext catalog, to defend against malicious input), and
> returns these columns:
> 
> It kinda sounds like there's some sort of magic going on to ensure the
> function can only be called using stxmcv, but it's just that it
> requires a pg_mcv_list type and that type has an input function that
> just errors out, so it could only possibly be set from C code.
> 

Yeah, the idea is that if it was possible to supply arbitrary binary
data as a MCV list, someone could inject arbitrarily broken value. By
only allowing values from the catalog (which we must have built) that's
no longer an issue.

> 27. This looks like an unintended change:
> 
>   /*
> - * Get the numdistinct estimate for the Vars of this rel.  We
> - * iteratively search for multivariate n-distinct with maximum number
> - * of vars; assuming that each var group is independent of the others,
> - * we multiply them together.  Any remaining relvarinfos after no more
> - * multivariate matches are found are assumed independent too, so
> - * their individual ndistinct estimates are multiplied also.
> + * Get the numdistinct estimate for the Vars of this rel.
> + *
> + * We iteratively search for multivariate n-distinct with the maximum
> + * number of vars; assuming that each var group is independent of the
> + * others, we multiply them together.  Any remaining relvarinfos after
> + * no more multivariate matches are found are assumed independent too,
> + * so their individual ndistinct estimates are multiplied also.
>   *
> 

Right. Reverted.

> 28. Can you explain what this is?
> 
> uint32 type; /* type of MCV list (BASIC) */
> 
> I see: #define STATS_MCV_TYPE_BASIC 1 /* basic MCV list type */
> 
> but it's not really clear to me what else could exist. Maybe the
> "type" comment can explain there's only one type for now, but more
> might exist in the future?
> 

It's the same idea as for dependencies/mvdistinct stats, i.e.
essentially a version number for the data structure so that we can
perhaps introduce some improved version of the data structure in the future.

But now that I think about it, it seems a bit pointless. We would only
do that in a major version anyway, and we don't keep statistics during
upgrades. So we could just as well introduce the version/flag/... if
needed. We can't do this for regular persistent data, but for stats it
does not matter.

So I propose we just remove this thingy from both the existing stats and
this patch.

> 29. Looking at the tests I see you're testing that you get bad
> estimates without extended stats.  That does not really seem like
> something that should be done in tests that are meant for extended
> statistics.
>

True, it might be a bit unnecessary. Initially the tests were meant to
show old/new estimates for development purposes, but it might not be
appropriate for regression tests. I don't think it's a big issue, it's
not like it'd slow down the tests significantly. Opinions?


This patch version also does two additional changes:

1) It moves the "single-clause" optimization until after the extended
statistics are applied. This addresses the issue I've explained in [1].

2) The other change is ignoring values that exceed WIDTH_THRESHOLD, as
proposed by Dean in [2]. The idea is similar to per-column stats, so
I've used the same value (1024). It turned out to be pretty simple
change in build_sorted_items, which means it affects both the old and
new statistic types (which is correct).

FWIW while looking at the code I think the existing statistics may be
broken for toasted values, as there's not a single detoast call. I'll
investigate/fix once this commitfest is over.


[1]
https://www.postgresql.org/message-id/d207e075-9fb3-3a95-7811-8e0ab5292b2a%402ndquadrant.com

[2]
https://www.postgresql.org/message-id/CAEZATCVazGRDjbZpRF6r-Asiv_-U8vcT-VA0oSZribhhmDUQHQ%40mail.gmail.com

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hi,

Attached is an updated patch, fixing all the issues pointed out so far.
Unless there are some objections, I plan to commit the 0001 part by the
end of this CF. Part 0002 is a matter for PG13, as previously agreed.


On 3/24/19 1:17 AM, David Rowley wrote:
> On Sun, 24 Mar 2019 at 12:41, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> On 3/21/19 4:05 PM, David Rowley wrote:
>>> 11. In get_mincount_for_mcv_list() it's probably better to have the
>>> numerical literals of 0.0 instead of just 0.
>>
>> Why?
> 
> Isn't it what we do for float and double literals?
> 

OK, fixed.

>>
>>> 12. I think it would be better if you modified build_attnums_array()
>>> to add an output argument that sets the size of the array. It seems
>>> that most places you call this function you perform bms_num_members()
>>> to determine the array size.
>>
>> Hmmm. I've done this, but I'm not sure I like it very much - there's no
>> protection the value passed in is the right one, so the array might be
>> allocated either too small or too large. I think it might be better to
>> make it work the other way, i.e. pass the value out instead.
> 
> When I said "that sets the size", I meant "that gets set to the size",
> sorry for the confusion.  I mean, if you're doing bms_num_members()
> inside build_attnums_array() anyway, then this will save you from
> having to do that in the callers too.
> 

OK, I've done this now, and I'm fairly happy with it.

>>> 28. Can you explain what this is?
>>>
>>> uint32 type; /* type of MCV list (BASIC) */
>>>
>>> I see: #define STATS_MCV_TYPE_BASIC 1 /* basic MCV list type */
>>>
>>> but it's not really clear to me what else could exist. Maybe the
>>> "type" comment can explain there's only one type for now, but more
>>> might exist in the future?
>>>
>>
>> It's the same idea as for dependencies/mvdistinct stats, i.e.
>> essentially a version number for the data structure so that we can
>> perhaps introduce some improved version of the data structure in the future.
>>
>> But now that I think about it, it seems a bit pointless. We would only
>> do that in a major version anyway, and we don't keep statistics during
>> upgrades. So we could just as well introduce the version/flag/... if
>> needed. We can't do this for regular persistent data, but for stats it
>> does not matter.
>>
>> So I propose we just remove this thingy from both the existing stats and
>> this patch.
> 
> I see. I wasn't aware that existed for the other types.   It certainly
> gives some wiggle room if some mistakes were discovered after the
> release, but thinking about it, we could probably just change the
> "magic" number and add new code in that branch only to ignore the old
> magic number, perhaps with a WARNING to analyze the table again. The
> magic field seems sufficiently early in the struct that we could do
> that.  In the master branch we'd just error if the magic number didn't
> match, since we wouldn't have to deal with stats generated by the
> previous version's bug.
> 

OK. I've decided to keep the field for now, for sake of consistency with
the already existing statistic types. I think we can rethink that in the
future, if needed.

>>> 29. Looking at the tests I see you're testing that you get bad
>>> estimates without extended stats.  That does not really seem like
>>> something that should be done in tests that are meant for extended
>>> statistics.
>>>
>>
>> True, it might be a bit unnecessary. Initially the tests were meant to
>> show old/new estimates for development purposes, but it might not be
>> appropriate for regression tests. I don't think it's a big issue, it's
>> not like it'd slow down the tests significantly. Opinions?
> 
> My thoughts were that if someone did something to improve non-MV
> stats, then is it right for these tests to fail? What should the
> developer do in the case? update the expected result? remove the test?
> It's not so clear.
> 

I think such changes would affect a number of other places in regression
tests (changing plans, ...), so I don't see why fixing these tests would
be any different.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hi,

I've now committed the MCV part, after addressing the last two issues
raised by Dean:

* The MCV build now always uses the mincount to decide which of the
items to keep in the list.

* Both the MCV build and deserialization now uses the same maximum number
of list items (10k).

Unfortunately, I forgot to merge these two fixes before pushing, so I had
to commit them separately. Sorry about that :/

Attached are the remaining parts of this patch series - the multivariate
histograms, and also a new patch tweaking regression tests for the old
statistic types (ndistinct, dependencies) to adopt the function-based
approach instead of the regular EXPLAIN.

But those are clearly a matter for the future (well, maybe it'd make sense
to commit the regression test change now).


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

I believe I found a typo in mcv.c, fix attached.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

mcv-comment-fix.patch

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

31 марта 2019 г., 04:33:42

On Sun, Mar 31, 2019 at 08:50:53AM +0800, John Naylor wrote:
>I believe I found a typo in mcv.c, fix attached.
>

Thanks, pushed.

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Michael Paquier

Дата:

09 апреля 2019 г., 05:29:18

On Sat, Mar 30, 2019 at 09:13:01PM +0100, Tomas Vondra wrote:
> Hmmm, what's the right status in the CF app when a part of a patch was
> committed and the rest should be moved to the next CF? Committed, Moved
> to next CF, or something else?

This stuff has been around for nine commit fests, and you have been
able to finish the basic work.  So I think that committed is most
appropriate so as you can start later on with a new concept, new patch
sets, perhaps a new thread, and surely a new CF entry.
--
Michael

Вложения

signature.asc

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

09 апреля 2019 г., 19:05:18

On Tue, Apr 09, 2019 at 11:29:18AM +0900, Michael Paquier wrote:
>On Sat, Mar 30, 2019 at 09:13:01PM +0100, Tomas Vondra wrote:
>> Hmmm, what's the right status in the CF app when a part of a patch was
>> committed and the rest should be moved to the next CF? Committed, Moved
>> to next CF, or something else?
>
>This stuff has been around for nine commit fests, and you have been
>able to finish the basic work.  So I think that committed is most
>appropriate so as you can start later on with a new concept, new patch
>sets, perhaps a new thread, and surely a new CF entry.

OK, makes sense. I'll start a new thread for the remaining pieces.

cheers

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Alvaro Herrera

Дата:

09 апреля 2019 г., 19:14:47

On 2019-Mar-27, Tomas Vondra wrote:

> Attached are the remaining parts of this patch series - the multivariate
> histograms, and also a new patch tweaking regression tests for the old
> statistic types (ndistinct, dependencies) to adopt the function-based
> approach instead of the regular EXPLAIN.
> 
> But those are clearly a matter for the future (well, maybe it'd make sense
> to commit the regression test change now).

IMO it makes sense to get the test patch pushed now.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

От

Tomas Vondra

Дата:

09 апреля 2019 г., 19:23:39

On Tue, Apr 09, 2019 at 12:14:47PM -0400, Alvaro Herrera wrote:
>On 2019-Mar-27, Tomas Vondra wrote:
>
>> Attached are the remaining parts of this patch series - the multivariate
>> histograms, and also a new patch tweaking regression tests for the old
>> statistic types (ndistinct, dependencies) to adopt the function-based
>> approach instead of the regular EXPLAIN.
>>
>> But those are clearly a matter for the future (well, maybe it'd make sense
>> to commit the regression test change now).
>
>IMO it makes sense to get the test patch pushed now.
>

OK, I'll take care of that soon.


cheers

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: [HACKERS] PATCH: multivariate histograms and MCV lists

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения