Обсуждение: used for large media files
Greetings. I am in the process of deciding my infrastruture for a web based application dealing with audio, video and image files. In my discussions with web developers PostgreSql came up as a candidate for my database. This is my FIRST introduction to this database, although I've heard of it - I have no knowledge of using it. What I am attempting to build is a database driven web site that allows users to easily upload either audio, video or images of any type - categorize the files, then output the files via streaming. I would like the users to be able to voice annotate images and build presentations. Would PostgreSQL be a good database for this type of application? Database size could grow large very quickly. I have not decided on the rest of my platform yet, for coding the interface and presenting it. Any input most appreciated.
In response to "Mike Kay" <mikekay@channelk.ca>: > Greetings. I am in the process of deciding my infrastruture for a web > based application dealing with audio, video and image files. In my > discussions with web developers PostgreSql came up as a candidate for my > database. This is my FIRST introduction to this database, although I've > heard of it - I have no knowledge of using it. > > What I am attempting to build is a database driven web site that allows > users to easily upload either audio, video or images of any type - > categorize the files, then output the files via streaming. I would like > the users to be able to voice annotate images and build presentations. > > Would PostgreSQL be a good database for this type of application? Database > size could grow large very quickly. Yes. PostgreSQL will handle this application swimmingly. In addition to everything else you'd want in a database, PG has a large object API that allows you to store very large data streams such that you can easily access individual parts of the stream (i.e., you can start accessing a video stream at any point in the stream without having to read through everything before that) http://www.postgresql.org/docs/8.3/static/largeobjects.html -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/
Now that's an interesting way of doing this I never thought about before. Using a fileserver though, how would I categorize and index the files? I was planning on using multiple databases to hold the data - one for each client and a separate database for each file type. Yes, they would be hosted on the same server. I see the bottleneck. I suppose that instead of saving the files, indexes and categories all in the same database, I could simply reference the location and file names in the database - and index and categorize in this manner. Does this make sense? > When this question comes up every now and again (check the archives) > the consensus turns out to be that, yes, Postgres will do this for you > just fine..... just so long as you realize that storing big blobs of > unchanging data in any relational database may not be the best use of > a database. > > On the plus side, you know that all your media and all its metadata is > transactionally safe, and that it's all in the same place. On the > negative side, it's all in the same place, which means the database > can become more of a bottleneck than it needs to be. Fileservers are > *cheap* compared to database servers, and scale out much better. > Databases go faster when they don't have to keep track of as much stuff. > > But this is all true of any RDMS. If it's what you want, then Postgres > will do as good a job of it as you're going to find anywhere else. > > On Jun 17, 2009, at 5:12 AM, Mike Kay wrote: > >> Greetings. I am in the process of deciding my infrastruture for a web >> based application dealing with audio, video and image files. In my >> discussions with web developers PostgreSql came up as a candidate >> for my >> database. This is my FIRST introduction to this database, although >> I've >> heard of it - I have no knowledge of using it. >> >> What I am attempting to build is a database driven web site that >> allows >> users to easily upload either audio, video or images of any type - >> categorize the files, then output the files via streaming. I would >> like >> the users to be able to voice annotate images and build presentations. >> >> Would PostgreSQL be a good database for this type of application? >> Database >> size could grow large very quickly. >> >> I have not decided on the rest of my platform yet, for coding the >> interface and presenting it. >> >> Any input most appreciated. >> >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > The highest achievement possible is compassion.
On Jun 17, 2009, at 8:43 AM, Mike Kay wrote: > Now that's an interesting way of doing this I never thought about > before. > Using a fileserver though, how would I categorize and index the files? > > I was planning on using multiple databases to hold the data - one > for each > client and a separate database for each file type. Yes, they would be > hosted on the same server. I see the bottleneck. > > I suppose that instead of saving the files, indexes and categories > all in > the same database, I could simply reference the location and file > names in > the database - and index and categorize in this manner. Does this make > sense? Storing all the metadata in the database and the content on the filesystem of the webserver lets both do what they're good at. Serving static files from the filesystem of the webserver is ridiculously cheap compared with retrieving the data from the database, as it's something that everything from the kernel up is optimized to do. Backups are much simpler too. All categorization and suchlike is done via the database, with the only time you hit the main filesystem being when you want to serve the media file itself to the user. The only structure you really need for the static files is something that makes accessing them reasonably cheap, which is mostly driven by limiting the number of files or subdirectories in each directory to a reasonable number. The names of the files and the directory structure don't really need to be meaningful. I've done something similar and added some level of transaction safety to removing files by having any deletion handled through the database, using triggers to add the filename of any object that was removed to a queue table. Then an external process polls that queue table and removes the actual file. Cheers, Steve
Steve Atkins wrote: > On Jun 17, 2009, at 8:43 AM, Mike Kay wrote: > >> Now that's an interesting way of doing this I never thought about >> before. >> Using a fileserver though, how would I categorize and index the files? >> >> I was planning on using multiple databases to hold the data - one for >> each >> client and a separate database for each file type. Yes, they would be >> hosted on the same server. I see the bottleneck. >> >> I suppose that instead of saving the files, indexes and categories >> all in >> the same database, I could simply reference the location and file >> names in >> the database - and index and categorize in this manner. Does this make >> sense? > > Storing all the metadata in the database and the content on the > filesystem > of the webserver lets both do what they're good at. > > Serving static files from the filesystem of the webserver is ridiculously > cheap compared with retrieving the data from the database, as it's > something that everything from the kernel up is optimized to do. > Backups are much simpler too. > Using the database to store BLOBs or do it via File system is a very old debate going back and fourth with common tone the db is slower the file system is faster. Using a DB easies maintenance, simplifies indexing, security and gives transaction protection to the files. In my view the only argument still holding water storing large binary files on the Filesystem vs the DB is the overhead/access time losses connecting and read data from DB. The file system just wins out yet has several draw backs. Also consider one does not need to use the large object interface anymore, the bytea type with TOAST simplify that problem . The draw back is you can't jump around the binary stream and the size is limited to 1Gig per record. One of the big draw backs to using File system and a DB for indexing/meta data is keeping the two up to date and linked. If files get accidentally deleted or moved to different directories the database index is now useless. This by itself can cause maintenance nightmare as the number of files and directories get into 10 of thousands. This also complicates disaster recovery because the directory structure has to be recreated exactly to get it to work again.