Thread: Calling conventions

Calling conventions

From
Matthew Wakeling
Date:
I'm considering rewriting a postgres extension (GiST index bioseg) to make
it use version 1 calling conventions rather than version 0.

Does anyone have any ideas/opinions/statistics on what the performance
difference is between the two calling conventions?

Matthew

--
 Patron: "I am looking for a globe of the earth."
 Librarian: "We have a table-top model over here."
 Patron: "No, that's not good enough. Don't you have a life-size?"
 Librarian: (pause) "Yes, but it's in use right now."

Re: Calling conventions

From
Peter Eisentraut
Date:
On Friday 17 July 2009 16:40:40 Matthew Wakeling wrote:
> I'm considering rewriting a postgres extension (GiST index bioseg) to make
> it use version 1 calling conventions rather than version 0.
>
> Does anyone have any ideas/opinions/statistics on what the performance
> difference is between the two calling conventions?

Version 1 is technically slower if you count the number of instructions, but
considering that everyone else, including PostgreSQL itself, uses version 1,
and version 0 has been deprecated for years and will break on some
architectures, it should be a no-brainer.

Re: Calling conventions

From
Matthew Wakeling
Date:
On Fri, 17 Jul 2009, Peter Eisentraut wrote:
> On Friday 17 July 2009 16:40:40 Matthew Wakeling wrote:
>> I'm considering rewriting a postgres extension (GiST index bioseg) to make
>> it use version 1 calling conventions rather than version 0.
>>
>> Does anyone have any ideas/opinions/statistics on what the performance
>> difference is between the two calling conventions?
>
> Version 1 is technically slower if you count the number of instructions, but
> considering that everyone else, including PostgreSQL itself, uses version 1,
> and version 0 has been deprecated for years and will break on some
> architectures, it should be a no-brainer.

Is that so?

Well, here's my problem. I have a GiST index type called bioseg. I have
implemented the very same algorithm in both a Postgres GiST extension and
as a standalone Java program. In general, the standalone Java program
performs about 100 times faster than Postgres when running a large
index-based nested loop join.

I profiled Postgres a few weeks back, and found a large amount of time
being spent in fmgr_oldstyle.

On Thu, 11 Jun 2009, Tom Lane wrote:
> Matthew Wakeling <matthew@flymine.org> writes:
>> Anyway, running opannotate seems to make it clear that time *is* spent in
>> the gistnext function, but almost all of that is in children of the
>> function. Lots of time is actually spent in fmgr_oldstyle though.
>
> So it'd be worth converting your functions to V1 style.

Are you saying that it would spend just as much time in fmgr_newstyle (or
whatever the correct symbol is)?

Matthew

--
 Contrary to popular belief, Unix is user friendly. It just happens to be
 very selective about who its friends are.                 -- Kyle Hearn

Re: Calling conventions

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
>> Does anyone have any ideas/opinions/statistics on what the performance
>> difference is between the two calling conventions?

> Version 1 is technically slower if you count the number of instructions,

That would be true if you compare version-0-to-version-0 calls (ie,
plain old C function call) to version-1-to-version-1 calling.  But
what is actually happening, since everything in the backend assumes
version 1, is that you have version-1-to-version-0 via an interface
layer.  Which is the worst of all possible worlds --- you have all
the overhead of a version-1 call plus the interface layer.

            regards, tom lane
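
The layering described above can be illustrated with a small stand-alone model. This is only a sketch, not PostgreSQL's actual fmgr code: `CallInfo`, `call_function`, `oldstyle_adapter`, `add_v0` and `add_v1` are hypothetical names invented for the illustration, and `Datum` is simplified to a `long`. It shows a caller that only knows the struct-based (version-1 style) signature, so a plain positional (version-0 style) function can be reached only through an adapter that has the struct-based signature itself and then makes a second call.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 4

typedef long Datum;                       /* simplified stand-in, an assumption */

typedef struct CallInfo CallInfo;
typedef Datum (*V1Func)(CallInfo *info);      /* uniform struct-based signature */
typedef Datum (*V0Func2)(Datum a, Datum b);   /* plain C, two positional args */

struct CallInfo {
    int     nargs;
    Datum   args[MAX_ARGS];
    V0Func2 old_fn;    /* used only by the adapter: the wrapped old-style function */
};

/* Version-1 style: arguments are unpacked from the struct. */
Datum add_v1(CallInfo *info)
{
    return info->args[0] + info->args[1];
}

/* Version-0 style: an ordinary C function. */
Datum add_v0(Datum a, Datum b)
{
    return a + b;
}

/* Interface layer: called like a version-1 function, then unpacks the
 * arguments and makes a second, positional call to the old-style function. */
Datum oldstyle_adapter(CallInfo *info)
{
    assert(info->nargs == 2 && info->old_fn != NULL);
    return info->old_fn(info->args[0], info->args[1]);
}

/* The caller packs arguments into the struct and knows only V1Func. */
Datum call_function(V1Func fn, V0Func2 old_fn, Datum a, Datum b)
{
    CallInfo info = { 2, { a, b }, old_fn };
    return fn(&info);
}
```

In this model, `call_function(add_v1, NULL, 2, 3)` costs one struct-based call, while `call_function(oldstyle_adapter, add_v0, 2, 3)` pays for the same struct-based call plus the adapter's extra unpacking and call.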

Re: Calling conventions

From
"Kevin Grittner"
Date:
Matthew Wakeling <matthew@flymine.org> wrote:

> I have implemented the very same algorithm in both a Postgres GiST
> extension and as a standalone Java program. In general, the
> standalone Java program performs about 100 times faster than
> Postgres when running a large index-based nested loop join.
>
> I profiled Postgres a few weeks back, and found a large amount of
> time being spent in fmgr_oldstyle.

I've seen the code in Java outperform the same code in optimized C,
because the "just in time" compiler can generate native code optimized
for the actual code paths being taken rather than a compile-time guess
at that, but a factor of 100?  Something else has to be going on here
beyond an interface layer.  Is this all in RAM with the Java code,
versus having disk access in PostgreSQL?

-Kevin

Re: Calling conventions

From
Matthew Wakeling
Date:
On Fri, 17 Jul 2009, Kevin Grittner wrote:
> I've seen the code in Java outperform the same code in optimized C,
> because the "just in time" compiler can generate native code optimized
> for the actual code paths being taken rather than a compile-time guess
> at that, but a factor of 100?  Something else has to be going on here
> beyond an interface layer.  Is this all in RAM with the Java code,
> versus having disk access in PostgreSQL?

Yeah, it does seem a little excessive. The Java code runs all in RAM,
versus Postgres running all from OS cache or Postgres shared buffer (bit
hard to tell which of those two it is - there is no hard drive activity
anyway). The Java code does no page locking, whereas Postgres does loads.
The Java code is emulating just the index, whereas Postgres is fetching
the whole row as well. However, I still struggle to accept the 100 times
performance difference.

Matthew

--
 What goes up must come down. Ask any system administrator.

Re: Calling conventions

From
"Kevin Grittner"
Date:
Matthew Wakeling <matthew@flymine.org> wrote:

> On Fri, 17 Jul 2009, Kevin Grittner wrote:

>> but a factor of 100?

> The Java code runs all in RAM, versus Postgres running all from OS
> cache or Postgres shared buffer (bit hard to tell which of those
> two it is - there is no hard drive activity anyway). The Java code
> does no page locking, whereas Postgres does loads.  The Java code is
> emulating just the index, whereas Postgres is fetching the whole row
> as well.

Oh, well, if you load all the data into Java's heap and are accessing
it through HashMap or similar, I guess a factor of 100 is about right.
I see the big difference as the fact that the Java implementation is
dealing with everything already set up in RAM, versus needing to deal
with a "disk image" format, even if it is cached.  Try serializing
those Java objects to disk and storing the file name in the HashMap,
retrieving and de-serializing the object for each reference.  Even if
it's all cached, I expect you'd be running about 100 times slower.

The Java heap isn't a very safe place to persist long-term data,
however.

-Kevin
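
The distinction drawn above, between data already laid out as native objects and data held in a "disk image" format, can be sketched in C. This is a hedged illustration only: the `Seg` type, the page layout (a table of 2-byte item offsets at the start of the page, with items stored at those offsets) and all function names are assumptions made for the example, not the bioseg or PostgreSQL page format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192             /* hypothetical page size */

typedef struct { int32_t start; int32_t end; } Seg;   /* hypothetical interval */

/* In-memory form: the item is already a native struct, one indexed load away. */
int32_t seg_length_native(const Seg *segs, size_t i)
{
    return segs[i].end - segs[i].start;
}

/* Store an item in the page image: record its offset in the slot table,
 * then copy the item bytes to that offset. The caller is responsible for
 * choosing offsets that do not overlap the slot table or other items. */
void page_put(uint8_t *page, uint16_t slot, uint16_t offset, Seg s)
{
    assert(offset + sizeof s <= SKETCH_PAGE_SIZE);
    memcpy(page + slot * sizeof(uint16_t), &offset, sizeof offset);
    memcpy(page + offset, &s, sizeof s);
}

/* Page-image form: look up the offset, then copy the item out of the raw
 * bytes before it can be used. These extra steps happen on every access. */
int32_t seg_length_from_page(const uint8_t *page, uint16_t slot)
{
    uint16_t offset;
    Seg s;
    memcpy(&offset, page + slot * sizeof(uint16_t), sizeof offset);
    memcpy(&s, page + offset, sizeof s);
    return s.end - s.start;
}
```

Both functions return the same answer for the same item; the second simply does more work per access. Whether that extra work could account for anything close to a factor of 100 is exactly what the thread leaves open.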

Re: Calling conventions

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Oh, well, if you load all the data into Java's heap and are accessing
> it through HashMap or similar, I guess a factor of 100 is about right.
> I see the big difference as the fact that the Java implementation is
> dealing with everything already set up in RAM, versus needing to deal
> with a "disk image" format, even if it is cached.

Eliminating interprocess communication overhead might have something
to do with it, too ...

            regards, tom lane

Re: Calling conventions

From
Matthew Wakeling
Date:
On Mon, 20 Jul 2009, Kevin Grittner wrote:
> Oh, well, if you load all the data into Java's heap and are accessing
> it through HashMap or similar, I guess a factor of 100 is about right.

No, that's not what I'm doing. Like I said, I have implemented the very
same algorithm as in Postgres, emulating index pages and all. A HashMap
would be unable to answer the query I am executing, but if it could it
would obviously be very much faster.

> I see the big difference as the fact that the Java implementation is
> dealing with everything already set up in RAM, versus needing to deal
> with a "disk image" format, even if it is cached.

The Java program uses as near an on-disc format as Postgres does - just
held in memory instead of in OS cache.

Matthew

--
 Okay, I'm weird! But I'm saving up to be eccentric.

Re: Calling conventions

From
"Kevin Grittner"
Date:
Matthew Wakeling <matthew@flymine.org> wrote:

> I have implemented the very same algorithm as in Postgres, emulating
> index pages and all.

> The Java program uses as near an on-disc format as Postgres does -
> just held in memory instead of in OS cache.

Interesting.  Hard to explain without a lot more detail.  Are they
close enough in code structure for a comparison of profiling output
for both to make any sense?  Have you tried switching the calling
convention yet; and if so, what impact did that have?

-Kevin