Re: Big image tables maintenance

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: Big image tables maintenance
Дата
Msg-id 20180917135820.GU4184@tamriel.snowman.net
обсуждение исходный текст
Ответ на Big image tables maintenance  (still Learner <stilllearner23@gmail.com>)
Ответы Re: Big image tables maintenance  (still Learner <stilllearner23@gmail.com>)
Список pgsql-admin
Greetings,

(limiting this to -admin, cross-posting this to a bunch of different
lists really isn't helpful)

* still Learner (stilllearner23@gmail.com) wrote:
> I have a 10 TB size table with multiple bytea columns (image & doc)and
> makes 20TB of DB size. I have a couple of issues to maintain the DB.

*What* are those issues..?  That's really the first thing to discuss
here but you don't ask any questions about it or state what the issue is
(except possibly for backups, but we have solutions for that, as
mentioned below).

> 1. I Would like to separate the image column from the 10TB size table,
> place it in a separate schema. The change should not result in any query
> change in the application.  Is it possible? Doing this it should not affect
> the performance.

How large are these images?  PostgreSQL will already pull out large
column values and put them into a side-table for you, behind the scenes,
using a technique called TOAST.  Documentation about TOAST is available
here:

https://www.postgresql.org/docs/current/static/storage-toast.html

> 2. I can't maintain files on File system as the count is huge, so thinking
> of using any no-sql mostly mongo-DB, is it recommended? Or PostgreSQL
> itself can handle?

I suspect you'd find that your data size would end up being much, much
larger if you tried to store it as JSON or in a similar system, and
you're unlikely to get any performance improvement (much more likely the
opposite, in fact).

> 3. Taking the backup of 20TB data, is big task. Any more feasible solution
> other than online backup/pg_dump?

Absolutely, I'd recommend using pgBackRest which supports parallel
online backup and restore.  Using pg_dump for a large system like this
is really not a good idea- your restore time would likely be
particularly terrible and you have no ability to do point-in-time
recovery.  Using pgBackRest and a capable system, you'd be able to get a
complete backup of 20TB in perhaps 6-12 hours, with similar time on the
recovery side.  If you wish to be able to recover faster, running a
replica (as well as doing backups) may be a good idea, perhaps even a
time-delayed one.

> Each image retrieval is

Unfinished thought here..?

> Currently, we are on pg 9.4 and moving to 10.5 soon.

That's definitely a good plan.

Thanks!

Stephen

Вложения

В списке pgsql-admin по дате отправления:

Предыдущее
От: Ron
Дата:
Сообщение: Re: Big image tables maintenance
Следующее
От: Scott Ribe
Дата:
Сообщение: Re: Big image tables maintenance