Re: Shared memory

Поиск

Список

Период

Сортировка

От	Thomas Hallgren
Тема	Re: Shared memory
Дата	28 марта 2006 г. 11:48:01
Msg-id	44295AB0.5060203@tada.se обсуждение исходный текст
Ответ на	Re: Shared memory (Simon Riggs <simon@2ndquadrant.com>)
Ответы	Re: Shared memory Re: Shared memory Re: Shared memory
Список	pgsql-hackers

Дерево обсуждения

Hi Simon,
Thanks for your input. All good points. I actually did some work using Java stored 
procedures on DB2 a while back but I had managed to forget (or repress :-) ) all about the 
FENCED/NOT FENCED stuff. The current discussion definitely puts it in a different 
perspective. I think PL/Java has a pretty good 'NOT FENCED' implementation, as does many 
other PL's, but no PL has yet come up with a FENCED solution.

This FENCED/NOT FENCED terminology would be a good way to differentiate between the two 
approaches. Any chance of that syntax making it into the PostgreSQL grammar, should the need 
arise?

Some more comments inline:

Simon Riggs wrote:
> Just some thoughts from afar: DB2 supports in-process and out-of-process
> external function calls (UDFs) that it refers to as UNFENCED and FENCED
> procedures. For Java only, IBM have moved to supporting *only* FENCED
> procedures for Java functions, i.e. having a single JVM for all
> connections.>
Are you sure about this? As I recall it a FENCED stored procedure executed in a remote JVM 
of it's own. A parameter could be used that either caused a new JVM to be instantiated for 
each stored procedure call or to be kept for the duration of the session. The former would 
yield really horrible performance but keep memory utilization at a minimum. The latter would 
get a more acceptable performance but waste more memory (in par with PL/Java today).

> Each connection's Java function runs as a thread on a
> single dedicated JVM-only process. 
> 
If that was true, then different threads could share dirty session data. I wanted to do that 
using DB2 but found it impossible. That was a while back though.

> That approach definitely does increase the invocation time, but it
> significantly reduces the resources associated with the JVM, as well as
> allowing memory management to be more controllable (bliss...). So the
> overall picture could be more CPU and memory resources for each
> connection in the connection pool.
> 
My very crude measurements indicate that the overhead of using a separate JVM is between 
6-15MB of real memory per connection. Today, you get about 10MB/$ and servers configured 
with 4GB RAM or more are not uncommon.

I'm not saying that the overhead doesn't matter. Of course it does. But the time when you 
needed to be extremely conservative with memory usage has passed. It might be far less 
expensive to buy some extra memory then to invest in SMP architectures to minimize IPC overhead.

My point is, even fairly large app-servers (using connection pools with up to 200 
simultaneous connections) can run using relatively inexpensive boxes such as an AMD64 based 
server with 4GB RAM and show very good throughput with the current implementation.

> If you have a few small Java functions centralisation would not be good,
> but if you have a whole application architecture with many connections
> executing reasonable chunks of code then this can be a win.
> 
One thing to remembered is that a 'chunk of code' that executes in a remote JVM and uses 
JDBC will be hit by the IPC overhead on each interaction over the JDBC connection. I.e. the 
overhead is not just limited to the actual call of the UDF, it's also imposed on all 
database accesses that the UDF makes in turn.

> In that environment we used Java for major database functions, with SQL
> functions for small extensions.
> 
My guess is that those major database functions did a fair amount of JDBC. Am I right?

> Also the Java invocation time we should be celebrating is that by having
> Java in the database the Java<->DB time is much less than it would be if
> we had a Java stack sitting on another server.
> 

I think the cases when you have a Tomcat or JBoss sitting on the same physical server as the 
actual database are very common. One major reason being that you don't want network overhead 
between the middle tier and the backend. Moving logic into the database instead of keeping 
it in the middle tier is often done to get rid of the last hurdle, the overhead of IPC.

Regards,
Thomas Hallgren

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Shared memory