Re: WIP: collect frequency statistics for arrays

From Alexander Korotkov
Subject Re: WIP: collect frequency statistics for arrays
Date
Msg-id BANLkTikpkO1kkqDscmR_bWPqBrawhnmTAw@mail.gmail.com
In reply to Re: WIP: collect frequency statistics for arrays  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: WIP: collect frequency statistics for arrays
List pgsql-hackers
On Fri, Jun 10, 2011 at 9:03 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Initial comments are that the code is well structured and I doubt
> there will be problems at the code level. Looks like a good patch.
I'm worried about the performance of "column <@ const" estimation. It takes O(m*(n+m)) time, where m is the length of the constant array and n is the statistics target. It may be too slow in some cases.
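
To give a feel for how that bound grows, here is a rough sketch. It only evaluates the m*(n+m) expression from the paragraph above; the function name is hypothetical, the constant factors hidden by the O() notation are ignored, and n = 100 is just an assumed statistics target, so the numbers are relative step counts and not timings.

```python
def containment_estimate_cost(m: int, n: int) -> int:
    """Evaluate m * (n + m), the bound quoted for "column <@ const".

    m: length of the constant array
    n: statistics target
    """
    return m * (n + m)


# Assumed statistics target, used for illustration only.
n = 100
for m in (10, 100, 1000):
    print(f"m={m:5d}  steps={containment_estimate_cost(m, n)}")
```

With n fixed at 100, going from m = 10 to m = 1000 raises the count from 1,100 to 1,100,000, so for long constant arrays the quadratic term in m dominates.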
 
> At the moment I see no tests. If this code will be exercised by
> existing tests then you should put some notes with the patch to
> explain that, or at least provide some pointers as to how I might test
> this.
I didn't find any existing tests that check selectivity estimation accuracy. I also found it difficult to create them, because regression tests give a binary result while estimation accuracy is a quantitative value. The existing regression tests cover the case where a typanalyze or selectivity estimation function crashes. I've added an "ANALYZE array_op_test;" line to the array test so that these tests cover the crash case for this patch's functions too.
It seems that selectivity estimation accuracy should be tested manually on various distributions. I've done only a very small number of such tests. Unfortunately, a few months passed before I got the idea for the "column <@ const" case, and now I don't have enough time for it because of my GSoC project. It would be great if you could help me with these tests.
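
One possible way to turn such manual checks into a number is sketched below. This is not part of the patch: it assumes the estimated and actual row counts are read by hand from EXPLAIN ANALYZE output, and the `q_error` helper and the sample values are hypothetical, made up purely for illustration.

```python
def q_error(estimated_rows: float, actual_rows: float) -> float:
    """Return the factor by which an estimate is off (always >= 1.0)."""
    # Clamp both sides to one row so empty results do not divide by zero.
    est = max(float(estimated_rows), 1.0)
    act = max(float(actual_rows), 1.0)
    return max(est / act, act / est)


# Hypothetical (estimated, actual) row counts, not measured results.
samples = [(100, 120), (5, 50), (1000, 1000)]
worst = max(q_error(est, act) for est, act in samples)
print(f"worst q-error: {worst:.1f}")
```

A symmetric ratio like this treats overestimates and underestimates alike, which makes results from different data distributions easier to compare than raw row differences.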
 
> Also, I'd like to see some more explanation. Either in comments, or
> just as a post to hackers. That saves me time, but we need to be clear
> about what this does and does not do, what it might do in the future
> etc.. 3+ years from now we need to be able to remember what the code
> was supposed to do. You will forget yourself in time, if you write
> enough patches. Based on this, I think you'll be writing quite a few
> more.
I've added some more comments. I'm afraid they will have to be completely rewritten before committing because of my English. If any particular points need more clarification, please point them out.
 
> And of course, a few lines for the docs also.
I found that the statistics patch for tsvector only changed the documentation entry for the pg_stats view. I've adjusted that entry a little as well.

------
With best regards,
Alexander Korotkov.