Re: use BLOBS or use pointers to files on the filesystem?
От | Eric Ridge |
---|---|
Тема | Re: use BLOBS or use pointers to files on the filesystem? |
Дата | |
Msg-id | D3ADE25911614840BC69C72E3171E4ED24865B@tcdiexch.tcdi.com обсуждение исходный текст |
Ответ на | use BLOBS or use pointers to files on the filesystem? ("Travis Bear" <ec9098@no.spam.go.com>) |
Список | pgsql-general |
> > I'm kind of a database newbie in general, so be nice to me if this > > is a bonehead question. I am building a java servlet based web > > application to manage photos and images. My first thougth has been > > to store the images as BLOBs in the database, but I read > one user say > > that you're better off not using BLOBs. Instead, he says, > store your > > large data objects as files on the filesystem, and just use the > > database to store the path to the file and any file-related meta > > data. I remember that thread! That was me that basically said, "store your files on the filesystem, and your data in a database." In my world, we manage in the upwards of 18.5 million images. We have a small database table with about 5 fields (about 4.5gig of data) that tell us where the images are (over 2.3 terrabytes). So you see, we can keep our database server storage "small", and our image farm storage "big". We can upgrade our database server w/o bringing our images offline. We can upgrade our image farm w/o bringing the database offline. Another advantage is that storing pointers allows your images to live on a server that isn't yours! id | path ------------ 2312 | file:///c:/images/you.gif 2313 | http://www.somesite.com/images/me.gif 2314 | ftp://www.somesite.com/images/me2.gif As long as your db application understands the protocols you can point to images anywhere in the known universe. > Compared to retrieving stuff directly from the database, this would be absolutely it's going to be faster! Postgres, and most other db's for that matter, aren't designed to store files. They're not filesystems, they're databases! > significantly faster. You could use a caching mechanism though to make > that disadvantage go away (have a look at www.phpbuilder.com for > examples on how to do that in PHP - which you might draw upon for your > Java solution). caching is good. esp. if the originals are stored in a far away place. saves lots of round-trips. > > The downside of having information in the filesystem (your > images), and meta-information about it in the database (the path to the > images etc.) is that you have to work much harder to make your application > detect and fix concurrency issues: If any other application besides your Servlet The upside is that you have to work harder to make your application detect and fix concurrency issues. > can access the filesystem (which is pretty normal), those others might > modify it without modifying the database. This is a design decision to make up front: "Can (read: should) the filesystem be modified independently of the database?". I say YES: It's 4:45pm and Travis just received 10 CD's of images from a client that need to be loaded into his database by 6:00pm. Each CD contains 10,000 images. (Fourtunality, the CD's contain an index file that closely match his table schema). So while Travis is writing a little perl script to massage the index files into a bunch of INSERT statements, his assistant can begin copying the CD's over to the image server (or directory). And because Travis' storage and data systems are decoupled, he gets the job done on time and the client is happy. > Then you have a filesystem out of sync with the database. Once you have that problem, it > would be hard to recover from. (Could anyone explain that better? I can't seem to > find the right words to make it really clear.) Or, it could be part of the plan. Sure, storing the images as blobs is going to be easier from a development standpoint. But the easy solution isn't always the best solution. eric
В списке pgsql-general по дате отправления:
Предыдущее
От: "Robert B. Easter"Дата:
Сообщение: Re: What is the practical limitation of no multi-threading?