Re: [HACKERS] Bad n_distinct estimation; hacks suggested?

From: Andrew Dunstan
Subject: Re: [HACKERS] Bad n_distinct estimation; hacks suggested?
Date:
Msg-id: 426EB580.9040606@dunslane.net
In reply to: Re: [HACKERS] Bad n_distinct estimation; hacks suggested?  (Simon Riggs)
List: pgsql-performance


Simon Riggs wrote:

>The comment
>     * Every value in the sample appeared more than once.  Assume
>     * the column has just these values.
>doesn't seem to apply when using larger samples, as Josh is using.
>
>Looking at Josh's application it does seem likely that when taking a
>sample, all site visitors clicked more than once during their session,
>especially if they include home page, adverts, images etc for each page.
>
>Could it be that we have overlooked this simple explanation and that the
>Haas and Stokes equation is actually quite good, but just not being
>applied?

No, it is being applied.  If every value in the sample appears more than
once, then f1 (the number of values seen exactly once in the sample) is 0
in the formula, and the result is then just d, the number of distinct
values in the sample.
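
To make the arithmetic concrete, here is a minimal sketch, assuming the
estimator has the form n*d / (n - f1 + f1*n/N) that the comments in
analyze.c attribute to Haas and Stokes.  The function name and the sample
numbers are hypothetical, picked only for illustration; n is the sample
size and N the total number of rows in the table.

```python
def haas_stokes_estimate(n, N, d, f1):
    """Estimate distinct values in a table: n*d / (n - f1 + f1*n/N).

    n  = rows in the sample
    N  = rows in the table
    d  = distinct values seen in the sample
    f1 = values seen exactly once in the sample
    """
    return (n * d) / (n - f1 + f1 * n / N)


# Hypothetical numbers, chosen only for illustration.
no_singletons = haas_stokes_estimate(n=3000, N=1_000_000, d=500, f1=0)
some_singletons = haas_stokes_estimate(n=3000, N=1_000_000, d=500, f1=100)

print(no_singletons)    # equals d when f1 == 0
print(some_singletons)  # larger than d once some values appear only once
```

With f1 = 0 the denominator collapses to n, so the n cancels and only d is
left, whatever the sample size is.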

cheers

andrew

