Обсуждение: Performance/Issues with CMP and JBoss

Поиск
Список
Период
Сортировка

Performance/Issues with CMP and JBoss

От
"Andy Dale"
Дата:
Hi,

We currently have an JBoss web application that persists a byte array it recieves (Using the EJB persistence API), and then tries to read it from the Database again from further parsing.  The application works really well with the default Hypersonic datasource, but it will not work correctly when using postgres as the datasource. 

The current problem we seem to have is that the data is persisted ok (or at least it seems to be in there with pgadmin), but cannot be read back out of the database all the time (in fact for about 90% of the time), the current behaviour of the application suggests it is trying to read it back out of the database (using EntityManager.find()) before it has really been saved, and thus fails to find the data.  Do i have to tweak some settings in the postgres.conf file ? i have tried turning off fsync (i do not want to do this, for reliability reasons) and it performed far better.  Can anyone advise me on the changes i need to make to speed up the inserting of data, i know that turning  autocommit off is supposed to increase performance.

Thanks in advance,

Andy

Re: Performance/Issues with CMP and JBoss

От
Douglas McNaught
Дата:
"Andy Dale" <andy.dale@gmail.com> writes:

> The current problem we seem to have is that the data is persisted ok
> (or at least it seems to be in there with pgadmin), but cannot be
> read back out of the database all the time (in fact for about 90% of
> the time), the current behaviour of the application suggests it is
> trying to read it back out of the database (using
> EntityManager.find()) before it has really been saved, and thus
> fails to find the data.  Do i have to tweak some settings in the
> postgres.conf file ? i have tried turning off fsync (i do not want
> to do this, for reliability reasons) and it performed far better.
> Can anyone advise me on the changes i need to make to speed up the
> inserting of data, i know that turning autocommit off is supposed to
> increase performance.

This is almost certainly a problem with your persistence layer rather
than with Postgres.  If you can see the data with PGAdmin then it's in
the database.  It may be that the transaction that saves the object is
not committing quickly, and so other connections don't see the object
until the commit happens.  But that's not the fault of Postgres.

-Doug

Re: Performance/Issues with CMP and JBoss

От
"Andy Dale"
Дата:
Hi,

I have performed some tests earlier on today, and i think the problem lies with Postgres and it's bad performance when being used with container managed persistence.  I am covinced of it being an issue with postgres because it works really well with MySQL and Hypersonic, but not with Postgres.  I have been reading around on the internet and it seems that Postgres does not work too well with container managed persistence, but surely all i have to do is just change some configuration settings, or can Postgres not be used with container managed persistence.

Thanks,

Andy

On 8/1/06, Douglas McNaught <doug@mcnaught.org> wrote:
"Andy Dale" <andy.dale@gmail.com> writes:

> The current problem we seem to have is that the data is persisted ok
> (or at least it seems to be in there with pgadmin), but cannot be
> read back out of the database all the time (in fact for about 90% of
> the time), the current behaviour of the application suggests it is
> trying to read it back out of the database (using
> EntityManager.find ()) before it has really been saved, and thus
> fails to find the data.  Do i have to tweak some settings in the
> postgres.conf file ? i have tried turning off fsync (i do not want
> to do this, for reliability reasons) and it performed far better.
> Can anyone advise me on the changes i need to make to speed up the
> inserting of data, i know that turning autocommit off is supposed to
> increase performance.

This is almost certainly a problem with your persistence layer rather
than with Postgres.  If you can see the data with PGAdmin then it's in
the database.  It may be that the transaction that saves the object is
not committing quickly, and so other connections don't see the object
until the commit happens.  But that's not the fault of Postgres.

-Doug

Re: Performance/Issues with CMP and JBoss

От
Richard Huxton
Дата:
Andy Dale wrote:
> Hi,
>
> I have performed some tests earlier on today, and i think the problem lies
> with Postgres and it's bad performance when being used with container
> managed persistence.

Is your problem performance or an error? It sounded like you were
getting errors in your first post.

 > I am covinced of it being an issue with postgres
> because it works really well with MySQL and Hypersonic, but not with
> Postgres.

*What* works really well? Can you tell us what query/queries are giving
you problems?

 > I have been reading around on the internet and it seems that
> Postgres does not work too well with container managed persistence, but
> surely all i have to do is just change some configuration settings, or can
> Postgres not be used with container managed persistence.

If you generate valid SQL then PostgreSQL can certainly process the
queries. As to whether tuning will help, nobody can say because you
haven't supplied details of:
1. Hardware
2. Operating System
3. Queries giving problems
4. Concurrency details
5. Current configuration settings
6. System activity (is CPU/RAM/IO maxed?)

Without at least *some* of these facts nobody can say anything useful.

--
   Richard Huxton
   Archonet Ltd

Re: Performance/Issues with CMP and JBoss

От
Richard Huxton
Дата:
Andy Dale wrote:
> Hi,
>
> I will explain in more details what the test (extremely simple) program is
> actually doing.  A session bean receives some data (roughly 3K) and then
> persists this data to the database with EntityManager.persist() (using the
> EJB 3.0 Persistence API that comes with JBoss).  Once the persist method
> returns a message is sent to a message driven bean where it tries to get
> the
> previously persisted data from the database using the EntityManager.find()
> method, this is where we run into problems with Postgres, it only seems to
> find the persisted object 5% of the time.  I don't know if you class it
> as a
> performance problem or an error but seems as it works in other databases i
> am more inclined to classify this as an error.

Well that's easy to decide.
1. Did the transaction that stored the object complete successfully?
2. Should the results of that transaction be visible from the reading
transaction?

If Yes & Yes, but you can't find the object there's an error.

> When i say works really well, i mean it in the sense that a simple query
> that the EntityManager produces for the find operation is something as
> simple as "select <columns> from table where <primary key> = ?", this is
> causing problems for Postgres, the query is in the form of a prepared
> statement so could this be causing any problems ?

Prepared queries mean you can't do certain optimisations, but for a
single-table fetch on columns with a unique index I'd expect an index to
be used (assuming analyse has been run recently).

> As far as hardware goes my testing machine is:
>
> P4 3.0 GHz
> 1GB RAM
> 20GB of HD (IDE)
>
> But we intend to use a HP Prolient server with the following spec:
>
> Intel Xeon 3.3 GHz
> 2 GB RAM
> 146GB Ultra SCSI 320
>
> The Operating System being run on both is Fedora Core 5
>
> The server/machine also needs to run other programs and processes so we
> don't want the database to hog to much of the resources, about 10 - 20 %
> RAM
> (and how to configure it) and CPU, the current config as defined in the
> postgres.conf file is as so:

Squeezing PostgreSQL and cache-space for its data into 256MB is going to
depend on how large your DB is. Oh, and if you have a lot of updates
then disk will probably be the limiting factor.

> # -----------------------------
> # PostgreSQL configuration file
> # -----------------------------
> #
> # This file consists of lines of the form:
> #

We don't really need to see all the comment lines

> #---------------------------------------------------------------------------
>
> # RESOURCE USAGE (except WAL)
> #---------------------------------------------------------------------------
>
>
> # - Memory -
>
> shared_buffers = 1000            # min 16 or max_connections*2, 8KB each
> #temp_buffers = 1000            # min 100, 8KB each
> #max_prepared_transactions = 5        # can be 0 or more
> # note: increasing max_prepared_transactions costs ~600 bytes of shared
> memory
> # per transaction slot, plus lock space (see max_locks_per_transaction).
> #work_mem = 1024            # min 64, size in KB
> #maintenance_work_mem = 16384        # min 1024, size in KB
> #max_stack_depth = 2048            # min 100, size in KB

OK, so you haven't actually done any configuration. PG *will* crawl with
the default settings, it's setup so you can install it on your 5-year
old laptop without keeling over. I recommend you read the short article
at the following URL and start from there.
   http://www.powerpostgresql.com/PerfList

Oh, and if you don't know what vacuum, analyse and the autovacuum tool
are you'll want to read the relevant parts of the manual.

--
   Richard Huxton
   Archonet Ltd

Re: Performance/Issues with CMP and JBoss

От
James Robinson
Дата:
Diagnosing JBossCMP issues is not for the faint of heart, in that one
of its main raison d'etre's is to hide SQL knowledge away from the
casual coder. Add into the mix the concurrency issues which naturally
occur since an EJB container is multithreaded and the overly
complicated JTA stuff and you've a real mess to diagnose when things
don't work just so.

When we ran CMP, we'd also run postgres in debugging mode so as to
have it emit queries onto stdout so that we could watch exactly what
the CMP was doing. Try starting up postgres ala:

    /usr/local/pgsql/bin/postmaster -d 2 -i -D /usr/local/pgsql/data
2>&1 | grep LOG

Your mileage may vary, as would your PGDATA dir etc. Read the docs on
the postmaster. But the end result is the ability to watch each query
fly by on your development machine -- letting you see the order of
which updates, inserts, selects, and commits happen from the postgres
backend's perspective.

I suspect the original poster has code issues being tickled by java
threading issues interacting poorly with their default transaction
isolation level of READ COMMITTED -- their inserts are being done in
one thread / JTA transaction, while the read is being done in another
and is loosing the race -- the inserting thread has not committed
yet. Running the backend in debugging mode should let you see the
select happening _before_ the first thread has committed its
transaction. Postgres is doing exactly what it is being told -- if
the inserting transaction has not yet committed, and the reading
transaction's isolation level is set to READ COMMITTED, then postgres
_will_not_ return anything to the reading connection / thread which
has not yet been committed.

Good luck with reading and following all of the EJB and CMP
specifications, the JBossCMP documentation, the JTA spec, and then
swallowing all of postgres [ or any other SQL backend ]. If you don't
have all of 'em fully understood yet, you will have to one day if you
continue with all that fat tech which was supposed to make things
easy for you. CMP is a very leaky overcomplicated abstraction.

----
James Robinson
Socialserve.com