Re: Unanswered questions about Postgre

From: Joe Kislo
Subject: Re: Unanswered questions about Postgre
Date:
Msg-id: 3A3555C4.62663AA6@athenium.com
In reply to: Re: Unanswered questions about Postgre  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses: Re: Unanswered questions about Postgre  (Bruce Momjian <pgman@candle.pha.pa.us>)
           Re: Unanswered questions about Postgre  (Peter Eisentraut <peter_e@gmx.net>)
           Re: Unanswered questions about Postgre  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-general
> What I think we _really_ need is a large object interface to TOAST data.
> We already have a nice API, and even psql local large object handling.
>
> If I have a file that I want loaded in/out of a TOAST column, we really
> should make a set of functions to do it, just like we do with large
> objects.
>
> This is an obvious way to load files in/out of TOAST columns, and I am not
> sure why it has not been done yet.  I am afraid we are going to get
> criticized if we don't have it soon.

    Okay, let me criticize you now then :)  (just kidding)  Over the past
month I've been trying out Postgre for two reasons.  I've posted a
number of questions to this mailing list, and the Postgre community has
been extremely responsive and helpful.  Kudos to everybody working on
Postgre.  Most of my questions have been along the lines of asking why a
particular feature works differently than in other databases, or why
Postgre seemed to act in an illogical fashion (such as corrupting my
database).

    First, I was evaluating Postgre for a medium-scale application I will
be working on for my current employer.  Technically this is re-architecting
a current application built on MySQL and Python.  I plan to move the
application to Java servlets and some database other than MySQL,
preferably open source.  Postgre, with its reputation, was obviously at
the top of the short list of databases to look at.  Unfortunately I
quickly discovered this lack of BLOB support.  I understand that the C
API can read/write -files- off the server's filesystem and load them
into the database.  Unfortunately we would absolutely require true
over-the-wire BLOB support through JDBC.  AFAIK, even with these "toast"
columns, it still wouldn't fill that need.  The need here is to load
binary data from the client, transfer it over the JDBC wire, and store
it in the database.  Some people earlier suggested a shared NFS
partition, with the server then using the existing BLOB support to load
the files off disk.  That's really not an acceptable solution.  So as
for using Postgre in this upcoming application, it's really a no-go at
this point without that ability.  I actually suspect a number of other
people also have a need to store BLOBs in a database, but maybe it's
not as important as I think.
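
Just to make that requirement concrete, here's roughly the client-side
code we need to be able to write (only a sketch; the images table, the
column names, and the connection details are all invented):

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class BlobInsert {
        public static void main(String[] args) throws Exception {
            File file = new File("photo.jpg");
            // Connection URL and credentials are placeholders.
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://dbhost/mydb", "user", "secret");
                 InputStream data = new FileInputStream(file);
                 PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO images (name, content) VALUES (?, ?)")) {
                ps.setString(1, file.getName());
                // The bytes are read on the *client* and streamed over
                // the JDBC connection -- no server-side file access.
                ps.setBinaryStream(2, data, (int) file.length());
                ps.executeUpdate();
            }
        }
    }

The point is that the bytes originate on the client and travel over the
same JDBC connection as everything else; no server-side file access or
shared filesystem is involved.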

    The second reason why I've been working with Postgre is that I'm about
to release as open source a Java-based object database abstraction
layer.  This layer maps Java objects to a relational database by storing
their primitives in database primitives, and using Java reflection to
reconstitute objects from the database.  This allows you to perform
complex joins and such in the -database- and then map the results to the
actual Java objects.
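
As a rough illustration of the reflection half of that (this is not the
actual layer's code; the class-to-table naming is invented, and real
code would use bind parameters instead of string concatenation):

    import java.lang.reflect.Field;
    import java.util.StringJoiner;

    public class ObjectMapper {
        // Build an INSERT from an object's declared fields.  This only
        // shows the shape; real code handles types and quoting properly.
        public static String insertStatementFor(Object obj)
                throws IllegalAccessException {
            Class<?> cls = obj.getClass();
            StringJoiner cols = new StringJoiner(", ");
            StringJoiner vals = new StringJoiner(", ");
            for (Field f : cls.getDeclaredFields()) {
                f.setAccessible(true);
                cols.add(f.getName());
                vals.add("'" + f.get(obj) + "'");
            }
            return "INSERT INTO " + cls.getSimpleName().toLowerCase()
                 + " (" + cols + ") VALUES (" + vals + ")";
        }
    }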
    When you attach a particular class to a database, you choose the
appropriate database adapter (such as one for Oracle or Postgre).  These
DBAdapters take care of all the DB-specific things, such as native
column types, handling auto-incrementing columns (generators or "serial
columns"), creating tables, altering tables when class definitions
change, database-independent indexing, and BLOBs.  Programmers mostly
work at the object layer and don't really worry about the particulars
of the underlying database (although they can execute raw SQL if they
really need to).  So this truly allows an application to be written
independent of any particular underlying database (and to my dismay,
there appear to be very big differences between these databases!).  This
allows you to change your underlying database easily, which means you
can choose the database server on its merits, and not because it's been
grandfathered into your application :)
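
The adapter boundary looks something like this (a heavily cut-down
sketch; the method names here are invented for illustration):

    // Each backend supplies its own answers to the DB-specific questions.
    public interface DBAdapter {
        String columnTypeFor(Class<?> javaType);          // native column types
        String createTableSql(String table, String cols); // table creation
        String serialColumnSql(String column);            // serial / generator handling
        boolean supportsBlobs();                          // the feature at issue here
    }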

    Anyway, when implementing the Postgre DBAdapter, I found Postgre to be
quite a nice database (and pretty fast too).  But there are two issues
that prevent the Postgre DBAdapter from supporting the full feature
set.

    1) No BLOB support.  As I described above, it needs to be possible to
insert an arbitrarily large (or at least up to, say, 5 megabytes) binary
object into the database and have it accessible by a particular column
name in a table.  AFAIK, this is not currently possible in Postgre.

    2) Postgre does not record rollback segments, which means transactions
get ABORTed and rolled back for some odd reasons when they normally
wouldn't need to be.  For example, if you just send the SQL server some
garbage SQL (e.g. ASDF;), your transaction gets aborted and rolled back,
even though your garbage SQL didn't touch any rows.  At the object layer
in the aforementioned database layer, if you try to insert an object into
the database and doing so would violate a unique key (such as the
primary key), a DuplicateKeyException will be thrown.  None of the other
database adapters I've implemented, such as those for MySQL, InterBase,
or Oracle, will *also* abort the transaction.
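
Here's a minimal demonstration of the difference (the connection details
are placeholders; it's the behavior that matters, not the code):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class AbortDemo {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://dbhost/mydb", "user", "secret");
            conn.setAutoCommit(false);
            Statement st = conn.createStatement();
            try {
                st.execute("ASDF;");      // garbage SQL; touches no rows
            } catch (SQLException expected) {
                // MySQL, Oracle, and InterBase leave the transaction
                // perfectly usable at this point.
            }
            // On Postgre this statement fails too: the garbage statement
            // above already aborted the whole transaction.
            st.execute("SELECT 1");
            conn.commit();
        }
    }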
    So if, at the object layer, a DuplicateKeyException is supposed to
happen in that case, I would have to, before every object is inserted
into the database, look up the database schema for the table... then
confirm, by issuing multiple SQL queries, that no unique keys would be
violated by the new record.  If one would be, throw the
DuplicateKeyException; if not, insert the record.  But even that
won't catch all cases, because a different transaction could have an
uncommitted row with which the new record conflicts... in which case all
my queries would say things are in the clear, but when I go to insert
the record, the insert would be blocked waiting on the other
transaction.  If that other transaction rolls back, we're in the clear...
If it commits, Postgre says there's a key conflict, ABORTs the current
transaction, and rolls it back.  Eek.  In that case, the database layer
still isn't throwing a DuplicateKeyException, but a TransactionAborted
exception.
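
In other words, the best workaround I can see looks like the sketch
below (the users table and the DuplicateKeyException class are invented
for illustration), and it still can't be made airtight:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class PrecheckInsert {
        static class DuplicateKeyException extends Exception {}

        // The pre-check workaround: look before inserting.  The comment
        // below marks the race window that makes it unreliable.
        static void insertUser(Connection conn, int id, String name)
                throws SQLException, DuplicateKeyException {
            try (PreparedStatement check = conn.prepareStatement(
                     "SELECT 1 FROM users WHERE id = ?")) {
                check.setInt(1, id);
                if (check.executeQuery().next()) {
                    throw new DuplicateKeyException();
                }
            }
            // Race window: another transaction may hold an uncommitted
            // conflicting row that the SELECT above cannot see; the
            // INSERT below will then block and, if that transaction
            // commits, fail and abort *our* transaction.
            try (PreparedStatement ins = conn.prepareStatement(
                     "INSERT INTO users (id, name) VALUES (?, ?)")) {
                ins.setInt(1, id);
                ins.setString(2, name);
                ins.executeUpdate();
            }
        }
    }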

    -GRANTED- a transaction can be aborted at any time, and the
application programmer should plan for that, but I think this Postgre
"feature" will cause transactions to be aborted unnecessarily,
especially if people migrate from another database to Postgre.
Of course, people really shouldn't be inserting objects which already
exist, but it would still be an inconsistency between Postgre and all
the other DBAdapters.

Thoughts?

-Joe
