Обсуждение: Blobs in Postgresql
Hi all-
I am evaluating databases for use in a large project that will hold image data as blobs. I know, everybody says to just store pointers to files on the disk...can't do it here...the images are of a confidential nature and access to the database (and resulting client app) will be highly restricted. The underlying platform will likely be Linux though Solaris x86-64 has been suggested as well.
I did some tests with MySQL and found the results very sub-par...the standard blob field only holds 64k (they have three types of blobs for whatever reason) and the real problem is that my uploads and downloads have failed because of packet size issues...this can be solved somewhat with server settings, but I get the impression that blobs are barely supported.
So turning to Postgresql, can I get any recommendations, suggestions and tips on blob handling in the database? The image sizes will be pretty variable, from a few kilobytes to several hundred megabytes, so I need something that will handle the various file sizes, hopefully transparently.
Thanks for any info,
Ron
I am evaluating databases for use in a large project that will hold image data as blobs. I know, everybody says to just store pointers to files on the disk...can't do it here...the images are of a confidential nature and access to the database (and resulting client app) will be highly restricted. The underlying platform will likely be Linux though Solaris x86-64 has been suggested as well.
I did some tests with MySQL and found the results very sub-par...the standard blob field only holds 64k (they have three types of blobs for whatever reason) and the real problem is that my uploads and downloads have failed because of packet size issues...this can be solved somewhat with server settings, but I get the impression that blobs are barely supported.
So turning to Postgresql, can I get any recommendations, suggestions and tips on blob handling in the database? The image sizes will be pretty variable, from a few kilobytes to several hundred megabytes, so I need something that will handle the various file sizes, hopefully transparently.
Thanks for any info,
Ron
> So turning to Postgresql, can I get any recommendations, suggestions and > tips on blob handling in the database? The image sizes will be pretty > variable, from a few kilobytes to several hundred megabytes, so I need > something that will handle the various file sizes, hopefully transparently. > PostgreSQL BLOB implementation is well. We used it without any problems with images from 20K-30M. Regards Pavel Stehule
"Ron Olson" <tachoknight@gmail.com> writes: > Hi all- > > I am evaluating databases for use in a large project that will hold image > data as blobs. I know, everybody says to just store pointers to files on the > disk... Well not everyone. I usually do, but if you're not handling these blobs under heavy load independent of the database (like web servers) then either approach works. > So turning to Postgresql, can I get any recommendations, suggestions and > tips on blob handling in the database? The image sizes will be pretty > variable, from a few kilobytes to several hundred megabytes, so I need > something that will handle the various file sizes, hopefully transparently. There are basically two options. If you are not handling data that are too large to copy around in memory, and you don't need to upload and download the data in chunks (usually these are the same issue) then you can just store your images in a bytea. Postgres transparently treats *all* large variable-sized data whether text, bytea, arrays, like a blob. It stores it in a separate table outside the main table. If your data can sometimes be so large that you cannot manipulate the whole thing in memory all at once (Keep in mind that Postgres expects to be able to handle a few copies of the data at the same time. Conservatively expect 5 simultaneous copies to have to fit in memory.) then you'll have to look into the large object interface which is a set of functions starting with lo_* -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On Wednesday 2007-08-15 05:52, Gregory Stark wrote: > "Ron Olson" <tachoknight@gmail.com> writes: > > Hi all- > > > > I am evaluating databases for use in a large project that will hold image > > data as blobs. I know, everybody says to just store pointers to files on > > the disk... > > Well not everyone. I usually do, but if you're not handling these blobs > under heavy load independent of the database (like web servers) then either > approach works. I've always wondered how you keep transactions working when you only store pointers to large data. Do you need an external transaction manager to insure that the file doesn't get deleted when you "delete" the data via the pointer? Do you need an external application that handles all deletes, inserts, and updates?
On 8/15/07, Ron Olson <tachoknight@gmail.com> wrote: > Hi all- > > I am evaluating databases for use in a large project that will hold image > data as blobs. I know, everybody says to just store pointers to files on the > disk...can't do it here...the images are of a confidential nature and access > to the database (and resulting client app) will be highly restricted. The > underlying platform will likely be Linux though Solaris x86-64 has been > suggested as well. > > I did some tests with MySQL and found the results very sub-par...the > standard blob field only holds 64k (they have three types of blobs for > whatever reason) and the real problem is that my uploads and downloads have > failed because of packet size issues...this can be solved somewhat with > server settings, but I get the impression that blobs are barely supported. > > So turning to Postgresql, can I get any recommendations, suggestions and > tips on blob handling in the database? The image sizes will be pretty > variable, from a few kilobytes to several hundred megabytes, so I need > something that will handle the various file sizes, hopefully transparently. for fast performance, you should make sure to use the parameterized interface and send in the results as binary (or use a language that accesses the database that way). I would be nervous about storing blobs if they were very large. regarding the security issue, have you looked at encryption? merlin
The language is Java. I've made some tests and they work very well for 25meg files....works exactly the way it should, first time. MySQL had all kinds of nasty surprises for me when I first started working with blobs, but I can say that I took my code, changed the driver, and it all works like a champ (mind you, this was a quick test app).
I haven't looked at encryption at the database level....is such a thing available? I know Oracle has some form of data encryption at the database level so the nefarious DBA with the wide mustache and black brimmed hat always going "ah ha ha ha ha" can't make off with the data, but does Postgres have something similar?
BTW, to put into context, the database will be designed to hold evidence (well, photos and videos of). Thus the compelling need for some security, as well as the variation in file sizes.
I haven't looked at encryption at the database level....is such a thing available? I know Oracle has some form of data encryption at the database level so the nefarious DBA with the wide mustache and black brimmed hat always going "ah ha ha ha ha" can't make off with the data, but does Postgres have something similar?
BTW, to put into context, the database will be designed to hold evidence (well, photos and videos of). Thus the compelling need for some security, as well as the variation in file sizes.
On 8/17/07, Merlin Moncure <mmoncure@gmail.com> wrote:
On 8/15/07, Ron Olson <tachoknight@gmail.com> wrote:
> Hi all-
>
> I am evaluating databases for use in a large project that will hold image
> data as blobs. I know, everybody says to just store pointers to files on the
> disk...can't do it here...the images are of a confidential nature and access
> to the database (and resulting client app) will be highly restricted. The
> underlying platform will likely be Linux though Solaris x86-64 has been
> suggested as well.
>
> I did some tests with MySQL and found the results very sub-par...the
> standard blob field only holds 64k (they have three types of blobs for
> whatever reason) and the real problem is that my uploads and downloads have
> failed because of packet size issues...this can be solved somewhat with
> server settings, but I get the impression that blobs are barely supported.
>
> So turning to Postgresql, can I get any recommendations, suggestions and
> tips on blob handling in the database? The image sizes will be pretty
> variable, from a few kilobytes to several hundred megabytes, so I need
> something that will handle the various file sizes, hopefully transparently.
for fast performance, you should make sure to use the parameterized
interface and send in the results as binary (or use a language that
accesses the database that way). I would be nervous about storing
blobs if they were very large.
regarding the security issue, have you looked at encryption?
merlin
On 8/18/07, Ron Olson <tachoknight@gmail.com> wrote: > The language is Java. I've made some tests and they work very well for 25meg > files....works exactly the way it should, first time. MySQL had all kinds of > nasty surprises for me when I first started working with blobs, but I can > say that I took my code, changed the driver, and it all works like a champ > (mind you, this was a quick test app). > > I haven't looked at encryption at the database level....is such a thing > available? I know Oracle has some form of data encryption at the database > level so the nefarious DBA with the wide mustache and black brimmed hat > always going "ah ha ha ha ha" can't make off with the data, but does > Postgres have something similar? > > BTW, to put into context, the database will be designed to hold evidence > (well, photos and videos of). Thus the compelling need for some security, as > well as the variation in file sizes. Well, my assumption was that you would encrypt the data on the client side and store it that way. PostgreSQL has open architecture. If you wanted to do the encryption on the server, one possible approach that jumps out at me is to write a small C function which receives the data, encrypts the image using a key sent by the client all (but not stored), and either stores the encrypted image back in the database via SPI or writes it out to a file. There are many strategies to encrypting data...first thing to think about is where the encryption happens, where the keys are stored, etc. merlin
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 08/17/07 23:16, Merlin Moncure wrote: > On 8/18/07, Ron Olson <tachoknight@gmail.com> wrote: >> The language is Java. I've made some tests and they work very well for 25meg >> files....works exactly the way it should, first time. MySQL had all kinds of >> nasty surprises for me when I first started working with blobs, but I can >> say that I took my code, changed the driver, and it all works like a champ >> (mind you, this was a quick test app). >> >> I haven't looked at encryption at the database level....is such a thing >> available? I know Oracle has some form of data encryption at the database >> level so the nefarious DBA with the wide mustache and black brimmed hat >> always going "ah ha ha ha ha" can't make off with the data, but does >> Postgres have something similar? >> >> BTW, to put into context, the database will be designed to hold evidence >> (well, photos and videos of). Thus the compelling need for some security, as >> well as the variation in file sizes. > > Well, my assumption was that you would encrypt the data on the client > side and store it that way. > > PostgreSQL has open architecture. If you wanted to do the encryption > on the server, one possible approach that jumps out at me is to write > a small C function which receives the data, encrypts the image using a > key sent by the client all (but not stored), and either stores the > encrypted image back in the database via SPI or writes it out to a > file. > > There are many strategies to encrypting data...first thing to think > about is where the encryption happens, where the keys are stored, etc. Client-side encryption is important, because with server-side encryption, you are sending the Valuable Data across the wire (or, even worse!) wireless in cleartext form. It's more likely that there's a packet sniffer on the network than an Evil DBA snooping around. - -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFGxwf7S9HxQb37XmcRAimGAJ98Kykormb63BedYknIij2xZvDgEACgw23C eWn7JJKSs1KL9dSfVx3p/BY= =OLl1 -----END PGP SIGNATURE-----
Ron Johnson wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 08/17/07 23:16, Merlin Moncure wrote: >> On 8/18/07, Ron Olson <tachoknight@gmail.com> wrote: >>> The language is Java. I've made some tests and they work very well for 25meg >>> files....works exactly the way it should, first time. MySQL had all kinds of >>> nasty surprises for me when I first started working with blobs, but I can >>> say that I took my code, changed the driver, and it all works like a champ >>> (mind you, this was a quick test app). >>> >>> I haven't looked at encryption at the database level....is such a thing >>> available? I know Oracle has some form of data encryption at the database >>> level so the nefarious DBA with the wide mustache and black brimmed hat >>> always going "ah ha ha ha ha" can't make off with the data, but does >>> Postgres have something similar? >>> >>> BTW, to put into context, the database will be designed to hold evidence >>> (well, photos and videos of). Thus the compelling need for some security, as >>> well as the variation in file sizes. >> Well, my assumption was that you would encrypt the data on the client >> side and store it that way. >> >> PostgreSQL has open architecture. If you wanted to do the encryption >> on the server, one possible approach that jumps out at me is to write >> a small C function which receives the data, encrypts the image using a >> key sent by the client all (but not stored), and either stores the >> encrypted image back in the database via SPI or writes it out to a >> file. >> >> There are many strategies to encrypting data...first thing to think >> about is where the encryption happens, where the keys are stored, etc. > > Client-side encryption is important, because with server-side > encryption, you are sending the Valuable Data across the wire (or, > even worse!) wireless in cleartext form. > > It's more likely that there's a packet sniffer on the network than > an Evil DBA snooping around. > The two options I see are - 1. the client encrypts the data and sends it to the DB 2. the client uses an SSL connection to the server to prevent snooping and lets the DB encrypt for storage. I would suggest looking at pgcrypto in contrib for server side encryption. The main benefit I would see from the first is it doesn't matter if another DB admin changes the server security settings or not. The new guy may setup a new server and not enforce SSL connections. Of course if the client refused non-SSL connections you can prevent that. Either way the app provides the key to decrypt the data for viewing, so the developers current and future must maintain the security level you choose. What sort of security measures are taken for viewing the data? Will each user have a security certificate on their own USB flash drive to allow them to view the data? which could also prevent developers from accessing the data. Or is their password enough to allow the program to decrypt it for them? It would really come down to which encryption method you find easiest to implement that provides enough security for your needs. -- Shane Ambler pgSQL@Sheeky.Biz Get Sheeky @ http://Sheeky.Biz