Re: How to store "blobs" efficiently for small and large sizes, with random access

From: Andreas Joseph Krogh
Subject: Re: How to store "blobs" efficiently for small and large sizes, with random access
Date:
Msg-id: VisenaEmail.8c.b06498274ef2e53f.183f001419c@visena.app.internal.visena.net
In reply to: Re: How to store "blobs" efficiently for small and large sizes, with random access  (Dominique Devienne <ddevienne@gmail.com>)
Responses: Re: How to store "blobs" efficiently for small and large sizes, with random access  (Dominique Devienne <ddevienne@gmail.com>)
Re: How to store "blobs" efficiently for small and large sizes, with random access  (Ron <ronljohnsonjr@gmail.com>)
List: pgsql-general
On Wednesday 19 October 2022 at 13:21:38, Dominique Devienne <ddevienne@gmail.com> wrote:
On Wed, Oct 19, 2022 at 1:00 PM Andreas Joseph Krogh <andreas@visena.com> wrote:
> Ok, just something to think about;

Thank you. I do appreciate the feedback.

> Will your database grow beyond 10TB with blobs?

The largest internal store I've seen (for the subset of data that goes
in the DB) is shy of 3TB.
But we are an ISV, not one of our clients, who operate at truly massive
data scales.
And they don't share the exact scale of their proprietary data with me...

> If so try to calculate how long it takes to restore, and comply with SLA,
> and how long it would have taken to restore without the blobs.
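
(To make that calculation concrete, here is a back-of-envelope sketch in
Python; the 500 MB/s sustained restore throughput and both database sizes
are illustrative assumptions, not numbers from this thread.)

    def restore_hours(size_tb: float, throughput_mb_s: float = 500.0) -> float:
        # Hours to restore size_tb terabytes at a sustained restore throughput.
        size_mb = size_tb * 1024 * 1024
        return size_mb / throughput_mb_s / 3600

    print(restore_hours(10.0))  # 10 TB incl. blobs: ~5.8 hours
    print(restore_hours(0.3))   # 300 GB without blobs: ~0.17 hours (~10 minutes)

On those assumed numbers, leaving the blobs out of the restore shrinks it
from hours to minutes, which is the kind of SLA difference being hinted at.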

Something I don't quite get is why backup is somehow no longer needed
if the large blobs are external?
i.e. are you saying backups are so much worse in PostgreSQL than
with the FS? I'm curious now.

I'm not saying you don't need backup (or redundancy) of other systems holding blobs, but moving them out of the RDBMS lets you restore the DB to a consistent state, and be back serving clients, faster. In my experience it's quite unlikely that your (redundant) blob-store needs crash-recovery at the same time your DB does. The same goes for PITR, needed because of some logical error (like a client deleting data they shouldn't have), which is much faster without blobs in the DB and doesn't affect the blob-store at all (if you have a smart insert/update/delete policy there; a sketch of one follows below).
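
A minimal sketch of such a delete policy, assuming a filesystem-backed,
content-addressed store (the class, names, and 7-day retention window below
are all illustrative, not from this thread): blobs are write-once, and a
delete only leaves a tombstone that a garbage collector honors after the
PITR retention window has passed.

    import hashlib
    import time
    from pathlib import Path

    RETENTION_SECONDS = 7 * 24 * 3600  # assumed; must cover your PITR window

    class BlobStore:
        def __init__(self, root: str) -> None:
            self.root = Path(root)
            self.root.mkdir(parents=True, exist_ok=True)

        def put(self, data: bytes) -> str:
            # Content-addressed and write-once: store the returned key in the DB row.
            key = hashlib.sha256(data).hexdigest()
            path = self.root / key
            if not path.exists():
                path.write_bytes(data)
            return key

        def get(self, key: str) -> bytes:
            return (self.root / key).read_bytes()

        def delete(self, key: str) -> None:
            # Logical delete: leave a tombstone rather than removing the blob.
            (self.root / (key + ".tombstone")).touch()

        def gc(self) -> None:
            # Physically remove blobs only once the retention window has passed.
            now = time.time()
            for tomb in self.root.glob("*.tombstone"):
                if now - tomb.stat().st_mtime > RETENTION_SECONDS:
                    (self.root / tomb.name.removesuffix(".tombstone")).unlink(missing_ok=True)
                    tomb.unlink()

Writing the blob (put) before committing the referencing DB row, and only
physically deleting it (gc) after the retention window, means no consistent
DB state, current or point-in-time, ever points at a missing blob.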


Also, managing the PostgreSQL server will mostly be the client's own
concern. We are not into SaaS here.
As hinted above, the truly massive data is already not in the DB; it is
used by different systems, and processed down to the GB-sized inputs
from which all the data put in the DB is generated. It's a scientific,
data-heavy environment.
And one where security of the data is paramount, for contractual and
legal reasons. Files make that harder IMHO.

Anyways, this is straying from the main theme of this post, I'm afraid.
Hopefully we can come back to the main one too. --DD

There's a reason “everybody” advises moving blobs out of the DB, I've learned.


--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963