Обсуждение: used for large media files

Поиск
Список
Период
Сортировка

used for large media files

От
"Mike Kay"
Дата:
Greetings. I am in the process of deciding my infrastruture for a web
based application dealing with audio, video and image files. In my
discussions with web developers PostgreSql came up as a candidate for my
database. This is my FIRST introduction to this database, although I've
heard of it - I have no knowledge of using it.

What I am attempting to build is a database driven web site that allows
users to easily upload either audio, video or images of any type -
categorize the files, then output the files via streaming. I would like
the users to be able to voice annotate images and build presentations.

Would PostgreSQL be a good database for this type of application? Database
size could grow large very quickly.

I have not decided on the rest of my platform yet, for coding the
interface and presenting it.

Any input most appreciated.


Re: used for large media files

От
Bill Moran
Дата:
In response to "Mike Kay" <mikekay@channelk.ca>:

> Greetings. I am in the process of deciding my infrastruture for a web
> based application dealing with audio, video and image files. In my
> discussions with web developers PostgreSql came up as a candidate for my
> database. This is my FIRST introduction to this database, although I've
> heard of it - I have no knowledge of using it.
>
> What I am attempting to build is a database driven web site that allows
> users to easily upload either audio, video or images of any type -
> categorize the files, then output the files via streaming. I would like
> the users to be able to voice annotate images and build presentations.
>
> Would PostgreSQL be a good database for this type of application? Database
> size could grow large very quickly.

Yes.  PostgreSQL will handle this application swimmingly.

In addition to everything else you'd want in a database, PG has a large
object API that allows you to store very large data streams such that
you can easily access individual parts of the stream (i.e., you can
start accessing a video stream at any point in the stream without having
to read through everything before that)
http://www.postgresql.org/docs/8.3/static/largeobjects.html

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: used for large media files

От
"Mike Kay"
Дата:
Now that's an interesting way of doing this I never thought about before.
Using a fileserver though, how would I categorize and index the files?

I was planning on using multiple databases to hold the data - one for each
client and a separate database for each file type. Yes, they would be
hosted on the same server.  I see the bottleneck.

I suppose that instead of saving the files, indexes and categories all in
the same database, I could simply reference the location and file names in
the database - and index and categorize in this manner. Does this make
sense?


> When this question comes up every now and again (check the archives)
> the consensus turns out to be that, yes, Postgres will do this for you
> just fine..... just so long as you realize that storing big blobs of
> unchanging data in any relational database may not be the best use of
> a database.
>
> On the plus side, you know that all your media and all its metadata is
> transactionally safe, and that it's all in the same place. On the
> negative side, it's all in the same place, which means the database
> can become more of a bottleneck than it needs to be. Fileservers are
> *cheap* compared to database servers, and scale out much better.
> Databases go faster when they don't have to keep track of as much stuff.
>
> But this is all true of any RDMS. If it's what you want, then Postgres
> will do as good a job of it as you're going to find anywhere else.
>
> On Jun 17, 2009, at 5:12 AM, Mike Kay wrote:
>
>> Greetings. I am in the process of deciding my infrastruture for a web
>> based application dealing with audio, video and image files. In my
>> discussions with web developers PostgreSql came up as a candidate
>> for my
>> database. This is my FIRST introduction to this database, although
>> I've
>> heard of it - I have no knowledge of using it.
>>
>> What I am attempting to build is a database driven web site that
>> allows
>> users to easily upload either audio, video or images of any type -
>> categorize the files, then output the files via streaming. I would
>> like
>> the users to be able to voice annotate images and build presentations.
>>
>> Would PostgreSQL be a good database for this type of application?
>> Database
>> size could grow large very quickly.
>>
>> I have not decided on the rest of my platform yet, for coding the
>> interface and presenting it.
>>
>> Any input most appreciated.
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>
>


The highest achievement possible is compassion.


Re: used for large media files

От
Steve Atkins
Дата:
On Jun 17, 2009, at 8:43 AM, Mike Kay wrote:

> Now that's an interesting way of doing this I never thought about
> before.
> Using a fileserver though, how would I categorize and index the files?
>
> I was planning on using multiple databases to hold the data - one
> for each
> client and a separate database for each file type. Yes, they would be
> hosted on the same server.  I see the bottleneck.
>
> I suppose that instead of saving the files, indexes and categories
> all in
> the same database, I could simply reference the location and file
> names in
> the database - and index and categorize in this manner. Does this make
> sense?

Storing all the metadata in the database and the content on the
filesystem
of the webserver lets both do what they're good at.

Serving static files from the filesystem of the webserver is
ridiculously
cheap compared with retrieving the data from the database, as it's
something that everything from the kernel up is optimized to do.
Backups are much simpler too.

All categorization and suchlike is done via the database, with the
only time you hit the main filesystem being when you want to serve
the media file itself to the user.

The only structure you really need for the static files is something
that
makes accessing them reasonably cheap, which is mostly driven by
limiting the number of files or subdirectories in each directory to a
reasonable number. The names of the files and the directory structure
don't really need to be meaningful.

I've done something similar and added some level of transaction
safety to removing files by having any deletion handled through
the database, using triggers to add the filename of any object that
was removed to a queue table. Then an external process polls that
queue table and removes the actual file.

Cheers,
   Steve



Re: used for large media files

От
justin
Дата:
Steve Atkins wrote:
> On Jun 17, 2009, at 8:43 AM, Mike Kay wrote:
>
>> Now that's an interesting way of doing this I never thought about
>> before.
>> Using a fileserver though, how would I categorize and index the files?
>>
>> I was planning on using multiple databases to hold the data - one for
>> each
>> client and a separate database for each file type. Yes, they would be
>> hosted on the same server.  I see the bottleneck.
>>
>> I suppose that instead of saving the files, indexes and categories
>> all in
>> the same database, I could simply reference the location and file
>> names in
>> the database - and index and categorize in this manner. Does this make
>> sense?
>
> Storing all the metadata in the database and the content on the
> filesystem
> of the webserver lets both do what they're good at.
>
> Serving static files from the filesystem of the webserver is ridiculously
> cheap compared with retrieving the data from the database, as it's
> something that everything from the kernel up is optimized to do.
> Backups are much simpler too.
>

Using the database to store BLOBs or do it via File system  is a very
old debate going back and fourth with common tone the db is slower the
file system is faster. Using a DB easies maintenance, simplifies
indexing, security and gives transaction protection to the files.

In my view the only argument still holding water storing  large binary
files on the Filesystem vs the DB is the overhead/access time losses
connecting and read data from DB.   The file system  just wins out  yet
has several draw backs.

Also consider one does not need to use the large object interface
anymore,  the bytea type with  TOAST simplify that problem .  The draw
back is you can't jump around the binary stream and the size is limited
to 1Gig per record.

One of the big draw backs to using File system and a DB for
indexing/meta data is keeping the two up to date and linked.  If files
get accidentally deleted or moved to different directories the database
index is now useless.  This by itself can cause maintenance nightmare as
the number of files and directories  get into 10 of thousands.  This
also complicates disaster recovery because the directory structure has
to be recreated exactly to get it to work again.