Re: Filesystem vs. Postgres for images

From Christopher Petrilli
Subject Re: Filesystem vs. Postgres for images
Date
Msg-id B57D2445-8D55-11D8-96AC-003065E15634@amber.org
In response to Re: Filesystem vs. Postgres for images  (Jeremiah Jahn <jeremiah@cs.earlham.edu>)
Responses Re: Filesystem vs. Postgres for images  ("scott.marlowe" <scott.marlowe@ihs.com>)
List pgsql-general
On Apr 13, 2004, at 9:40 AM, Jeremiah Jahn wrote:

> There has got to be some sort of standard way to do this. We have the
> same problem where I work. Terabytes of images, but the question is
> still sort of around "BLOBs or Files?" Our final decision was to use the
> file system. We found that you didn't really gain anything by storing
> the images in the DB, other than having one place to get the data from.
> The file system approach is much easier to backup, because each image
> can be archived separately as well as browsed by 3rd party tools.

This is a classic performance-modeling problem.  While it wasn't
images, I worked on a system that had to store several million small
files (5-100 KB each).  Storing them in the FS ran into two bottlenecks
(PostgreSQL hits similar ones):

1. Directory name lookups do not scale well, so keep the number of
files in each directory manageable (100-500).
2. Retrieval time is limited not by disk bandwidth, but by I/O seek
performance. More spindles = more concurrent I/O in flight. This is
also where SCSI takes a massive lead, thanks to tagged command queuing.

In our case, we ended up using a three-tier directory structure to cap
the number of files per directory. Because load was relatively even
across the 20 top-level directories, we split them across 5 spindle
pairs (i.e. RAID-1 mirrors).  This is a place where RAID-5 is your
enemy. RAID-1, when implemented with read-balancing, gives a
substantial performance increase.
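
To make the layout concrete, here is a rough Python sketch of one way to
do it. It is not the code we ran: the hash choice, the fan-out of 32 for
the two lower tiers, and the names shard_path and volume_for are all
assumptions for illustration. Only the 20 top-level directories and the
5 mirrored pairs come from the description above.

```python
import hashlib
from pathlib import Path

TOP_LEVEL_DIRS = 20  # from the post: 20 top-level directories
FANOUT = 32          # assumption: fan-out of the second and third tiers
VOLUMES = 5          # from the post: 5 RAID-1 spindle pairs


def shard_path(root, name):
    """Map a file name to root/<top>/<mid>/<leaf>/<name> (hypothetical helper).

    Hashing the name spreads files roughly evenly, so no single directory
    grows without bound.
    """
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    top = int.from_bytes(digest[:4], "big") % TOP_LEVEL_DIRS
    mid = digest[4] % FANOUT
    leaf = digest[5] % FANOUT
    return Path(root) / f"{top:02d}" / f"{mid:02d}" / f"{leaf:02d}" / name


def volume_for(top):
    """Assign a top-level directory to one of the mirrored pairs (hypothetical)."""
    return top % VOLUMES


if __name__ == "__main__":
    path = shard_path("/data", "img_000123.jpg")
    top = int(path.relative_to("/data").parts[0])
    print(path, "-> volume", volume_for(top))
```

With these numbers there are 20 x 32 x 32 = 20480 leaf directories, so
five million files would average about 244 per directory, inside the
100-500 range from point 1. In practice each top-level directory would
be a mount point or symlink onto its mirror pair.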

Hope this helps.  Some of these things apply to PostgreSQL too, but
until there's better manageability of TABLESPACE, and the ability to
split tables across multiple tablespaces, it's going to be hard to hit
those numbers.  This is a place where the "big databases" are better.
But then, that's the top 5% of installs. Tradeoffs.

Chris
--
| Christopher Petrilli
| petrilli (at) amber.org

