Re: [HACKERS] PATCH: multivariate histograms and MCV lists

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Дата
Msg-id CAKJS1f-YHJWaTx8pharjcU15j4btuGt1ZCXJmgi8zkALRK96RQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] PATCH: multivariate histograms and MCV lists  (David Rowley <david.rowley@2ndquadrant.com>)
Ответы Re: [HACKERS] PATCH: multivariate histograms and MCV lists  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers
On Wed, 23 Jan 2019 at 03:43, David Rowley <david.rowley@2ndquadrant.com> wrote:
> I made another pass over the 0001 patch. I've not read through mcv.c
> again yet. Will try to get to that soon.
>
> 0001-multivariate-MCV-lists-20190117.patch

I started on mcv.c this morning. I'm still trying to build myself a
picture of how it works, but I have noted a few more things while I'm
reading.

24. These macros are still missing parenthesis around the arguments:

#define ITEM_INDEXES(item) ((uint16*)item)
#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))

While I don't see any reason to put parenthesis around the macro's
argument when passing it to another macro, since it should do it...
There is a good reason to have the additional parenthesis when it's
not passed to another macro.

Also, there's a number of places, including with these macros that
white space is not confirming to project standard. e.g.
((uint16*)item) should be ((uint16 *) (item))  (including fixing the
missing parenthesis)

25. In statext_mcv_build() I'm trying to figure out what the for loop
does below the comment:

* If we can fit all the items onto the MCV list, do that. Otherwise
* use get_mincount_for_mcv_list to decide which items to keep in the
* MCV list, based on the number of occurences in the sample.

The comment explains only as far as the get_mincount_for_mcv_list()
call so the following is completely undocumented:

for (i = 0; i < nitems; i++)
{
if (mcv_counts[i] < mincount)
{
nitems = i;
break;
}
}

I was attempting to figure out if the break should be there, or if the
code should continue and find the 'i' for the smallest mcv_counts, but
I don't really understand what the code is meant to be doing.

Also:  occurences  -> occurrences

26. Again statext_mcv_build() I'm a bit puzzled to why mcv_counts
needs to exist at all. It's built from:

mcv_counts = (int *) palloc(sizeof(int) * nitems);

for (i = 0; i < nitems; i++)
mcv_counts[i] = groups[i].count;

Then only used in the loop mentioned in #25 above.  Can't you just use
groups[i].count?

(Stopped in statext_mcv_build(). Need to take a break)

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Bossart, Nathan"
Дата:
Сообщение: Re: A few new options for vacuumdb
Следующее
От: Amit Langote
Дата:
Сообщение: Re: Typo: llvm*.cpp files identified as llvm*.c