Re: [HACKERS] CLUSTER command progress monitor

From Antonin Houska
Subject Re: [HACKERS] CLUSTER command progress monitor
Msg-id 15653.1511197525@localhost
In response to Re: [HACKERS] CLUSTER command progress monitor  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] CLUSTER command progress monitor
Re: [HACKERS] CLUSTER command progress monitor
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Aug 30, 2017 at 10:12 PM, Tatsuro Yamada
> <yamada.tatsuro@lab.ntt.co.jp> wrote:
> >   1. scanning heap
> >   2. sort tuples
>
> These two phases overlap, though. I believe progress reporting for
> sorts is really hard.  In the simple case where the data fits in
> work_mem, none of the work of the sort gets done until all the data is
> read.  Once you switch to an external sort, you're writing batch
> files, so a lot of the work is now being done during data loading.
> But as the number of batch files grows, the final merge at the end
> becomes an increasingly noticeable part of the cost, and eventually
> you end up needing multiple merge passes.  I think we need some smart
> way to report on sorts so that we can tell how much of the work has
> really been done, but I don't know how to do it.

Whatever complexity is hidden in the sort, cost_sort() should have taken it
into consideration when called via plan_cluster_use_sort(). Thus I think that
once we have both startup and total cost, the current progress of the sort
stage can be estimated from the current number of input and output
rows. Please correct me if this proposal is too simplistic.
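
To make the idea concrete, here is a rough sketch of one way the estimate could be computed. It is only an illustration, not code from any patch: sort_progress_fraction() and clamp01() are hypothetical names, and the model is an assumption on my side, namely that the startup cost is spent in proportion to the input rows consumed and the remaining run cost (total minus startup) in proportion to the output rows returned. The estimated row count is assumed to be the planner's estimate for the sort input.

```c
#include <assert.h>

/* Hypothetical helper: clamp a ratio into [0, 1]. */
static double
clamp01(double x)
{
    if (x < 0.0)
        return 0.0;
    if (x > 1.0)
        return 1.0;
    return x;
}

/*
 * Hypothetical sketch, not a PostgreSQL function.
 *
 * startup_cost, total_cost: costs estimated for the sort
 *                           (e.g. as produced by cost_sort())
 * est_rows:                 estimated number of rows to sort
 * rows_in:                  rows fed into the sort so far
 * rows_out:                 rows already returned by the sort
 *
 * Assumed model: startup cost is consumed while reading input,
 * run cost (total - startup) is consumed while emitting output.
 * Returns the estimated fraction of work done, in [0, 1].
 */
double
sort_progress_fraction(double startup_cost, double total_cost,
                       double est_rows, double rows_in, double rows_out)
{
    double in_frac;
    double out_frac;
    double cost_done;

    assert(startup_cost >= 0.0 && total_cost >= startup_cost);

    /* Nothing sensible to report without a positive estimate. */
    if (total_cost <= 0.0 || est_rows <= 0.0)
        return 0.0;

    /* Row estimates can be off, so never report more than 100%. */
    in_frac = clamp01(rows_in / est_rows);
    out_frac = clamp01(rows_out / est_rows);

    cost_done = startup_cost * in_frac +
        (total_cost - startup_cost) * out_frac;

    return cost_done / total_cost;
}
```

With a startup cost of 80 and a total cost of 100, this reports 40% after half of the input has been read and 100% once all rows have been returned. The clamping is there because the row estimate can be wrong; in that case the reported value simply stalls at the phase boundary instead of exceeding it.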

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

