Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

От: Gurmeet Manku
Тема: Re: [PERFORM] Bad n_distinct estimation; hacks suggested?
Дата: ,
Msg-id: Pine.LNX.4.44.0504261443340.12051-100000@xenon.Stanford.EDU
(см: обсуждение, исходный текст)
Ответ на: Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Simon Riggs)
Список: pgsql-hackers

Скрыть дерево обсуждения

Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Josh Berkus, )
 Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Josh Berkus, )
 Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  ("Andrew Dunstan", )
 Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Tom Lane, )
  Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Andrew Dunstan, )
   Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Josh Berkus, )
   Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Josh Berkus, )
    Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Tom Lane, )
  Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Marko Ristola, )
  Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Simon Riggs, )
   Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Josh Berkus, )
   Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Andrew Dunstan, )
 Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Simon Riggs, )
  Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Tom Lane, )
   Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Simon Riggs, )
    Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Josh Berkus, )
     Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Josh Berkus, )
     Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Andrew Dunstan, )
      Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Mischa Sandberg, )
       Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Andrew Dunstan, )
        Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Josh Berkus, )
         Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Mischa Sandberg, )
         Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Markus Schaber, )
          Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Mischa Sandberg, )
           Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Josh Berkus, )
            Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Josh Berkus, )
            Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Mischa Sandberg, )
            Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (John A Meinel, )
        Re: [PERFORM] Distinct-Sampling (Gibbons paper) for Postgres  (Josh Berkus, )
        Re: Distinct-Sampling (Gibbons paper) for Postgres  (, )
    Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Tom Lane, )
     Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Simon Riggs, )
      Re: [PERFORM] Bad n_distinct estimation; hacks suggested?  (Gurmeet Manku, )
      Citation for "Bad n_distinct estimation; hacks suggested?"  (Gurmeet Manku, )

 Hi everybody!

 Perhaps the following papers are relevant to the discussion here
 (their contact authors have been cc'd):


 1. The following proposes effective algorithms for using block-level
    sampling for n_distinct estimation:

 "Effective use of block-level sampling in statistics estimation"
 by Chaudhuri, Das and Srivastava, SIGMOD 2004.

 http://www-db.stanford.edu/~usriv/papers/block-sampling.pdf


 2. In a single scan, it is possible to estimate n_distinct by using
    a very simple algorithm:

 "Distinct sampling for highly-accurate answers to distinct value
  queries and event reports" by Gibbons, VLDB 2001.

 http://www.aladdin.cs.cmu.edu/papers/pdfs/y2001/dist_sampl.pdf


 3. In fact, Gibbon's basic idea has been extended to "sliding windows"
    (this extension is useful in streaming systems like Aurora / Stream):

 "Distributed streams algorithms for sliding windows"
 by Gibbons and Tirthapura, SPAA 2002.

 http://home.eng.iastate.edu/~snt/research/tocs.pdf


 Thanks,
 Gurmeet

 ----------------------------------------------------
 Gurmeet Singh Manku                      Google Inc.
 http://www.cs.stanford.edu/~manku    (650) 967 1890
 ----------------------------------------------------



В списке pgsql-hackers по дате сообщения:

От: Bruce Momjian
Дата:
Сообщение: Re: [PATCHES] Continue transactions after errors in psql
От: Alvaro Herrera
Дата:
Сообщение: Re: Behavior of shared/exclusive row locks