Re: I: About "Our CLUSTER implementation is pessimal" patch

From Itagaki Takahiro
Subject Re: I: About "Our CLUSTER implementation is pessimal" patch
Date
Msg-id AANLkTini6r3EvJ6XkLk9tPkSt-+g55SF52wwNa6gfrj5@mail.gmail.com
In reply to Re: I: About "Our CLUSTER implementation is pessimal" patch  (Josh Kupershmidt <schmiddy@gmail.com>)
Responses Re: I: About "Our CLUSTER implementation is pessimal" patch  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
On Wed, Sep 29, 2010 at 12:53 PM, Josh Kupershmidt <schmiddy@gmail.com> wrote:
> I thought this paragraph was a little confusing:

Thanks for checking.

> !     In the second case, a full table scan is followed by a sort operation.
> !     The method is faster than the first one when the table is highly
> fragmented.
> !     You need twice disk space of the sum in the case. In addition to the free
> !     space needed by the previous case, this approach may also need a temporary
> !     disk sort file which can be as big as the original table.
>
> I think the worst-case disk space could be made a little more clear
> here, and maybe some general wordsmithing as well. I wasn't sure what
> "twice disk space of the sum" was in this description -- sum of what
> (table and all indexes?).

To be exact, it's quite complicated.
While the table is being rebuilt, CLUSTER requires about twice the disk space of
the old table (for the sort tapes and the new table).
After sorting the table, CLUSTER performs a REINDEX, which needs
{the size of the new table} + {twice the disk space of the new indexes}.
Also, the new relations will be the same size as the old relations if the old
ones have no free space.

So I think "twice the disk space of the sum of the table and indexes" would be
the simplest explanation that leaves a safe margin.
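
As a rough illustration of why the simple rule is a safe margin, the arithmetic above can be sketched in Python. This is only a back-of-the-envelope model, not anything CLUSTER computes: the function names are hypothetical, the sizes are in arbitrary units, and it assumes the worst case described above, where the new table and indexes end up the same size as the old ones.

```python
def cluster_phase_estimates(table_size, index_size):
    """Return (rebuild_phase, reindex_phase) extra disk space estimates.

    Assumption: new relations are as large as the old ones (no free space).
    """
    # Rebuild phase: sort tapes (up to the table size) plus the new table.
    rebuild_phase = 2 * table_size
    # REINDEX phase: the new table plus twice the size of the new indexes.
    reindex_phase = table_size + 2 * index_size
    return rebuild_phase, reindex_phase


def cluster_disk_margin(table_size, index_size):
    """Simple documented rule: twice the sum of the table and its indexes."""
    return 2 * (table_size + index_size)


if __name__ == "__main__":
    table, indexes = 10, 3  # e.g. GB; made-up numbers for illustration
    rebuild, reindex = cluster_phase_estimates(table, indexes)
    margin = cluster_disk_margin(table, indexes)
    print(f"rebuild phase: {rebuild}, reindex phase: {reindex}, margin: {margin}")
    # The simple rule is never smaller than either per-phase estimate.
    assert margin >= max(rebuild, reindex)
```

Under this model the simple rule always covers both phases, since 2 * table and table + 2 * indexes are each at most 2 * (table + indexes).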

> Also, AIUI, this second clustering method is similar to the older
> idiom of CREATE TABLE new AS SELECT * FROM old ORDER BY col; Since the
> paragraph describing this older idiom is being removed, perhaps a
> brief mention in the documentation could be made of this similarity.

Good idea.

> Some more wordsmithing: change
> !      The planner tries to choose a faster method in them base on the
> information
> to:
> !      The planner tries to choose the fastest method based on the information

Thanks.

--
Itagaki Takahiro

