Re: Hash Index Build Patch

From Simon Riggs
Subject Re: Hash Index Build Patch
Date
Msg-id 1190880557.4194.22.camel@ebony.site
In reply to Re: Hash Index Build Patch  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
On Wed, 2007-09-26 at 16:06 -0400, Tom Lane wrote:
> Tom Raney <twraney@comcast.net> writes:
> > Alvaro Herrera wrote:
> >> Just wondering, wouldn't it be enough to obtain a tuple count estimate
> >> by using reltuples / relpages * RelationGetNumberOfBlocks, like the
> >> planner does?
>
> > We thought of that and the verdict is still out whether it is more
> > costly to scan the entire relation to get the accurate count or use the
> > estimate and hope for the best with the possibility of splits occurring
> > during the build.   If we use the estimate and it is completely wrong
> > (with the actual tuple count being much higher) the sort will provide no
> > benefit and it will behave as did the original code.
>
> I think this argument is *far* too weak to justify an extra pass over
> the relation.  The planner-style calculation is quite unlikely to give a
> major underestimate of the rowcount.  It might overestimate, eg if the
> relation is bloated by dead tuples, but an error in that direction won't
> kill you.

Agreed. Given the uncertainty in the hashing, calculating an exact
number of rows seems fruitless, whereas we know an extra scan will
certainly hurt. It might not show up in tests, but it will on life-size
tables.
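
For reference, the estimate Alvaro describes above can be sketched roughly
as follows. This is only an illustration of the arithmetic: the function
name and its parameters are hypothetical, and it is not code from the patch
or from the planner. It assumes reltuples and relpages are the values last
stored in pg_class, and cur_blocks is what RelationGetNumberOfBlocks would
report now.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): scale the last recorded tuple
 * density, reltuples / relpages, by the relation's current size in blocks.
 */
static double
estimate_tuple_count(double reltuples, uint32_t relpages, uint32_t cur_blocks)
{
    double density;

    assert(reltuples >= 0.0);

    /* No statistics recorded yet; a real caller would need a fallback. */
    if (relpages == 0)
        return 0.0;

    density = reltuples / (double) relpages;    /* tuples per page */
    return density * (double) cur_blocks;
}
```

If the table is bloated by dead tuples, the block count is inflated and the
result errs on the high side, which is the harmless direction Tom mentions.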

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

