Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
Дата
Msg-id 28180.1358109898@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)  (Andres Freund <andres@2ndquadrant.com>)
Ответы Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)  (Andres Freund <andres@2ndquadrant.com>)
Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Список pgsql-hackers
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-01-13 14:17:52 -0500, Tom Lane wrote:
>> I find these numbers pretty hard to credit.

> There are quite some elog(DEBUG*)s in the backend, and those are taken,
> so I don't find it unreasonable that doing less work in those cases is
> measurable.

Meh.  If there are any elog(DEBUG)s in time-critical places, they should
be changed to ereport's anyway, so as to eliminate most of the overhead
when they're not firing.

> And if you look at the disassembly of ERROR codepaths:

I think your numbers are being twisted by -fno-omit-frame-pointer.
What I get, with the __builtin_unreachable version of the macro,
looks more like

MemoryContextAlloc:cmpq    $1073741823, %rsipushq    %rbxmovq    %rsi, %rbxja    .L53movq    8(%rdi), %raxmovb    $0,
48(%rdi)popq   %rbxmovq    (%rax), %raxjmp    *%rax
 
.L53:movl    $__func__.5262, %edxmovl    $576, %esimovl    $.LC2, %edicall    elog_startmovq    %rbx, %rdxmovl
$.LC3,%esimovl    $20, %edixorl    %eax, %eaxcall    elog_finish
 

With -fno-omit-frame-pointer it's a little worse, but still not what you
show --- in particular, for me gcc still pushes the elog calls out of
the main code path.  I don't think that the main path will get any
better with one elog function instead of two.  It could easily get worse.
On many machines, the single-function version would be worse because of
needing to use more parameter registers, which would translate into more
save/restore work in the calling function, which is overhead that would
likely be paid whether elog actually gets called or not.  (As an
example, in the above code gcc evidently isn't noticing that it doesn't
need to save/restore rbx so far as the main path is concerned.)

In any case, results from a single micro-benchmark on a single platform
with a single compiler version (and single set of compiler options)
don't convince me a whole lot here.

Basically, the aspects of this that I think are likely to be
reproducible wins across different platforms are (a) teaching the
compiler that elog(ERROR) doesn't return, and (b) reducing code size as
much as possible.  The single-function change isn't going to help on
either ground --- maybe it would have helped on (b) without the errno
problem, but that's going to destroy any possible code size savings.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dimitri Fontaine
Дата:
Сообщение: Re: ToDo: log plans of cancelled queries
Следующее
От: Hannu Krosing
Дата:
Сообщение: Re: logical changeset generation v3 - comparison to Postgres-R change set format