Re: add_path optimization

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: add_path optimization
Дата
Msg-id 603c8f070902011335xa94a333iaff960d000af9f7f@mail.gmail.com
обсуждение исходный текст
Ответ на Re: add_path optimization  (Grzegorz Jaskiewicz <gj@pointblue.com.pl>)
Ответы Re: add_path optimization
Список pgsql-hackers
On Sun, Feb 1, 2009 at 3:25 PM, Grzegorz Jaskiewicz <gj@pointblue.com.pl> wrote:
> I don't like the fact that you hardcoded that here. I know that you are
> trying to pass on few calls in one go here, but still... ugly.

Well, I think you'll find that using a dynamically sized data
structure destroys the possibility of squeezing any additional
performance out of this part of the code.  The nice thing about
fixed-size data structures is that they cost essentially nothing to
stack-allocate; you just move the stack pointer and away you go.  We
should in fact be looking for MORE places where we can avoid the use
of constructs like List, since the second-highest CPU hog in my tests
was AllocSetAlloc(), beaten out only by add_path().  With this patch
applied, AllocSetAlloc() moves up to first.

(It would really be rather better to include all the paths generated
in each pass of the loop in the call to add_similar_path(), but that
looked rather more complicated because we can't predict how many of
them there will be, and so adding a fixed-size data structure is not
so easy.  Plus, the code would all have to be rewritten not to assume
that "continue" was the right way to move on to the next iteration of
the loop.  What would potentially be better still is to try to figure
out which nested loop will be the winner without allocating all of the
NestPath nodes in the first place, but that didn't seem possible
without much more invasive changes, and it's not clear that you would
actually still have a winner by the time you got done beating on it.)

I am also somewhat mystified as to why using an array of size 5 to
hold up to 5 data structure allocated in nearly-consecutive lines of C
code would qualify as ugly (completely apart from the fact that it's a
clear performance win).

> static int
> compare_fuzzy_path_costs(Path *path1, Path *path2, int *startup_cost)
> {
> ....
>        *startup_cost = (s == 0) ? t : s;
>
> Why not *startup_cost = s, and let the caller decide which value it wants to
> use ?
> or just return both, from single call (which would ?
> ...
>
>        return t;
> }

You're essentially suggesting removing logic from
compare_fuzzy_path_costs() and duplicating it at the two call sites of
that function.
If there's a point to that, I don't see it.  You might also take a
look at the widely used function compare_path_costs().

> To be fair, I don't see compare_fuzzy_path_costs change to save too much of
> time in planner.

Hmm, well I didn't either, but there's this handy tool called gprof
that you might want to try out.  I wouldn't be wasting my time
patching this part of the code if it didn't make a difference, and in
fact if you do 10% of the amount of benchmarking that I did in the
process of creating this patch, you will find that it in fact does
make a difference.

> I would myself probably convert that function into two defines/inline funcs,
> but that's just me.

It's already static to that .c file, so the compiler likely will
inline it.  In fact, I suspect you will find that removing the static
keyword from the implementation of that function in CVS HEAD is itself
sufficient to produce a small but measurable slowdown in planning of
large join trees, exactly because it will defeat inlining.

...Robert


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Grzegorz Jaskiewicz
Дата:
Сообщение: Re: add_path optimization
Следующее
От: Andrew Dunstan
Дата:
Сообщение: new buildfarm client code feature release