Re: To BLOB Or Not To BLOB
От | chewie@wookimus.net |
---|---|
Тема | Re: To BLOB Or Not To BLOB |
Дата | |
Msg-id | 20000417024654.C11802@wookimus.net обсуждение исходный текст |
Ответ на | Re: To BLOB Or Not To BLOB ("Robert B. Easter" <reaster@comptechnews.com>) |
Список | pgsql-general |
How do I start this? There are so many different ways to look at this question. Do I save files inside the database, or do I simply use the database as a tool to query information about a given file, it's properties, etc. Instead of making general statements, let me pick specific examples. Let's say we want to create an MP3 repository. Our goals are to be able to query the information saved in ID3 tags, to locate, organize, manipulate, or play the MP3. To narrow the scope of discussion, let's say I will be accessing these files locally, on my filesystem or a NFS mounted directory. What would be the most efficient way for us to query the information found in the ID3 tags? We could parse each file in a number of directories and their subdirectories for the information we require each time we need it, or we could parse it once and save the information in a database structured around ID3 information. I would say the later is definitely more desireable and acceptable. Inside the database, the ID3 tags can be indexed and our searches optimized by the information we're requesting. But should we store the mp3's inside the database? I say no. A few observations: * Filesystems are generally better at allocating space for files than databases are. * Storing files inside databases renders shell access or direct access to files as impossible. If the database goes South, so does access to the files. Now, our mp3 decoders are shell programs that read mp3 input by opening a file, a URL connection, or reading it from standard input (that is if we have a good mp3 player). Yet, in order for these mp3 decoders to be able to play an mp3 found inside the database, we are forced to use a specialized database interface. It may return the mp3 file to the decoder via standard input, or it may export the file to a temp directory and feed the mp3 player the location of that temp file. Yet, why should we go through all of this trouble? And why should we tie the operation of two clearly separate applications into such inflexible relationships? We add overhead to the database to output this file, and we add another level of logic to get the file to the decoder. A level of logic that could prove the downfall of a would-be MP3 mobile disc jockey, who's worked so hard to please the Wicked Witch of the Wedding. He'd be forced to pull out the CD's and tapes he so dreads -- if he remembered to bring them along. If we keep file access decoupled from gleaning catagorizable, quantifiable data about/from the file, we allow two separate and efficient tools to do their jobs independently. If file access is your concern, then use a file system that will give you the level of control you desire to develop your security policies. If grabbing information from a data file is important, such as parsing text files for content, then build indexes to the data inside the file and store those indexes in the database. Internet directories are excellent examples of this! We don't neuter local file access by the applications that need the data files, and we still provide access to searchable information. I would argue that access to a data file and querying sets of data files for information are two separate layers of logic. Logical layers are best served when they are given tools to interact, but are not crippled by tight integration. Saving data files inside a database, IMHO, is an example of tight integration, and something to be avoided if possible -- at least in this case. -- Chad "^chewie, gunnarr" Walstrom <chewie@wookimus.net> http://wookimus.net/chewie
В списке pgsql-general по дате отправления:
Предыдущее
От: Haroldo StengerДата:
Сообщение: Re: Does error within transaction imply restarting it?