Re: Gsoc2012 idea, tablesample

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Gsoc2012 idea, tablesample
Дата
Msg-id CA+TgmoaQ6w=VcEwwVbBkN7GH4eVvaPEVfLKioHZbXcezgG4Eig@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Gsoc2012 idea, tablesample  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Ответы Re: Gsoc2012 idea, tablesample  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: Gsoc2012 idea, tablesample  (Florian Pflug <fgp@phlo.org>)
Список pgsql-hackers
On Fri, May 11, 2012 at 11:04 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> We
> will be probing random page numbers 1..P for random tuple indexes
> 1..M.  So how many random probes by ctid does that require? The
> chance of a hit on each probe is ((T/P)/M) -- the average number of
> tuples per page divided by the number of tuple index values allowed.
> So we need (S*T)/((T/P)/M) probes.  Simplifying that winds up being
> S*M*P the product of the sample size as a percentage, the maximum
> tuple index on a page, and the number of pages.  (A calculation some
> may have jumped to as intuitively obvious.)
>
> So let's call the number of probes N.  We randomly generate N
> distinct ctid values, where each is a random page number 1..P and a
> random index 1..M.  We attempt to read each of these in block number
> order, not following HOT chains.  For each, if the tuple exists and
> is visible, it is part of our result set.
>
> Since T cancels out of that equation, we don't need to worry about
> estimating it.  Results will be correct for any value of M which is
> *at least* as large as the maximum tuple index in the table,
> although values of M larger than that will degrade performance.  The
> same holds true for the number of pages.

The trouble is, AFAICS, that you can't bound M very well without
scanning the whole table.  I mean, it's bounded by theoretical limit,
but that's it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Kevin Grittner"
Дата:
Сообщение: Re: Gsoc2012 idea, tablesample
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Gsoc2012 idea, tablesample