Обсуждение: [GENERAL] Can PG replace redis, amqp, s3 in the future?

Поиск
Список
Период
Сортировка

[GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Thomas Güttler
Дата:
Is is possible that PostgreSQL will replace these building blocks in the future?

 - redis (Caching)
 - rabbitmq (amqp)
 - s3 (Blob storage)

One question is "is it possible?", then next "is it feasible?"

I think it would be great if I could use PG only and if I could
avoid the other types of servers.

The benefit is not very obvious on the first sight. I think it will saves you
time, money and energy only in the long run.

What do you think?

Regards,
  Thomas Güttler


--
I am looking for feedback for my personal programming guidelines:
https://github.com/guettli/programming-guidelines


Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Thomas Delrue
Дата:
On April 30, 2017 1:37:02 PM GMT+02:00, "Thomas Güttler" <guettliml@thomas-guettler.de> wrote:
>Is is possible that PostgreSQL will replace these building blocks in
>the future?
>
> - redis (Caching)
> - rabbitmq (amqp)
> - s3 (Blob storage)

These are three very different sets of functionalities, each requiring a different approach.  I am curious as to why
youare thinking about having a single piece of software that does these three very different things.  

>One question is "is it possible?", then next "is it feasible?"

Possible? Sure: p != 0
Probable? No
Desirable? No

>I think it would be great if I could use PG only and if I could
>avoid the other types of servers.

When you're holding a hammer, everything looks like a nail. But hammering screws doesn't get you very far.  Sometimes
youneed a screwdriver and on other days a glue gun... 

>The benefit is not very obvious on the first sight. I think it will
>saves you
>time, money and energy only in the long run.
>
>What do you think?

Do one thing(*) and do that thing well. Don't try to be everything to everyone.


--
Thomas
(Sent from my mobile device,  please forgive brevity or typos.)


Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Steve Atkins
Дата:
> On Apr 30, 2017, at 4:37 AM, Thomas Güttler <guettliml@thomas-guettler.de> wrote:
>
> Is is possible that PostgreSQL will replace these building blocks in the future?
>
> - redis (Caching)
> - rabbitmq (amqp)
> - s3 (Blob storage)

No.

You can use postgresql for caching, but caches don't require the data
durability that a database offers, and can be implemented much more
efficiently.

You can use postgresql to provide message queue services and it
does so reasonably well, particularly when the messages are generated within
the database. But it's not going to do so as efficiently, or be as easy to
monitor, to make highly redundant or to scale across a whole datacenter
as a dedicated message queue service.

You could use postgresql to store binary blobs, but it'd be a horrifically
inefficient way to do it. (Using postgresql to store the metadata, while
the content is stored elsewhere, sure).

Use the right tool for the job.

Cheers,
  Steve

>
> One question is "is it possible?", then next "is it feasible?"
>
> I think it would be great if I could use PG only and if I could
> avoid the other types of servers.
>
> The benefit is not very obvious on the first sight. I think it will saves you
> time, money and energy only in the long run.
>
> What do you think?
>
> Regards,
>  Thomas Güttler
>
>
> --
> I am looking for feedback for my personal programming guidelines:
> https://github.com/guettli/programming-guidelines
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Bill Moran
Дата:
On Sun, 30 Apr 2017 13:37:02 +0200
Thomas Güttler <guettliml@thomas-guettler.de> wrote:

> Is is possible that PostgreSQL will replace these building blocks in the future?
>
>  - redis (Caching)
>  - rabbitmq (amqp)
>  - s3 (Blob storage)
>
> One question is "is it possible?", then next "is it feasible?"
>
> I think it would be great if I could use PG only and if I could
> avoid the other types of servers.
>
> The benefit is not very obvious on the first sight. I think it will saves you
> time, money and energy only in the long run.
>
> What do you think?

There's a well-written article I saw recently that directly addresses
your question ... I'm too lazy to find it, but google will probably
turn it up for you.

The upshot is that Postgres does a lot of things well, but when the need
comes up to do them _REALLY_ well, you're generally better off picking a
tool that's specialized for your needs.

Take a message bus for example. PG's notify works pretty damn well as a
centralized message bus. But if you need a distributed message bus or you
need massive throughput, you're almost certainly better of with something
specifically designed for that purpose.

Of course, if you need structured, relational data to be stored reliably,
you can't do much better than Postgres.

--
Bill Moran <wmoran@potentialtech.com>


Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Thomas Güttler
Дата:
Am 30.04.2017 um 15:39 schrieb Thomas Delrue:
> On April 30, 2017 1:37:02 PM GMT+02:00, "Thomas Güttler" <guettliml@thomas-guettler.de> wrote:
>> Is is possible that PostgreSQL will replace these building blocks in
>> the future?
>>
>> - redis (Caching)
>> - rabbitmq (amqp)
>> - s3 (Blob storage)
>
> These are three very different sets of functionalities, each requiring a different approach.  I am curious as to why
youare thinking about having a single piece of software that does these three very different things.  


I love transactions. If you store data in four systems (three from above plus PG), then you have
better performance if you have high load. But what happens if a transaction fails (rolls back). Then this
can leave the other data sinks in a broken state. Example: a blob in s3 might be updated, but the rollback
in PG does not rollback in s3 ....

And one other benefit if you have one system: Configuration management is easier.

I know that if you have very high load, then you need to optimize.

But in my context the load is far from high. Robust transactions (including
rollback in all related systems) is more important for me.

For the rabbitmq/amqp part I found that the new SKIP LOCKED feature can help:

  https://blog.2ndquadrant.com/what-is-select-skip-locked-for-in-postgresql-9-5/

Leaving two things open:

 - blob storage
 - redis/caching

>> One question is "is it possible?", then next "is it feasible?"
>
> Possible? Sure: p != 0
> Probable? No
> Desirable? No
>
>> I think it would be great if I could use PG only and if I could
>> avoid the other types of servers.
>
> When you're holding a hammer, everything looks like a nail. But hammering screws doesn't get you very far. Sometimes
youneed a screwdriver and on other days a glue gun... 

Yes, you are right.


Regards,
  Thomas Güttler

--
I am looking for feedback for my personal programming guidelines:
https://github.com/guettli/programming-guidelines


Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Thomas Güttler
Дата:
Am 30.04.2017 um 17:09 schrieb Bill Moran:
> On Sun, 30 Apr 2017 13:37:02 +0200
> Thomas Güttler <guettliml@thomas-guettler.de> wrote:
>
>> Is is possible that PostgreSQL will replace these building blocks in the future?
>>
>>  - redis (Caching)
>>  - rabbitmq (amqp)
>>  - s3 (Blob storage)
>>
>> One question is "is it possible?", then next "is it feasible?"
>>
>> I think it would be great if I could use PG only and if I could
>> avoid the other types of servers.
>>
>> The benefit is not very obvious on the first sight. I think it will saves you
>> time, money and energy only in the long run.
>>
>> What do you think?
>
> There's a well-written article I saw recently that directly addresses
> your question ... I'm too lazy to find it, but google will probably
> turn it up for you.
>

I tried to find it, but failed. Can you give me some keywords to find
this well-written article?



> Take a message bus for example. PG's notify works pretty damn well as a
> centralized message bus. But if you need a distributed message bus or you
> need massive throughput, you're almost certainly better of with something
> specifically designed for that purpose.

SELECT FOR UPDATE ... SKIP LOCKED looks nice:

 https://blog.2ndquadrant.com/what-is-select-skip-locked-for-in-postgresql-9-5/




> Of course, if you need structured, relational data to be stored reliably,
> you can't do much better than Postgres.

Yes, PG is our solid central data storage.

Regards,
 Thomas Güttler


--
I am looking for feedback for my personal programming guidelines:
https://github.com/guettli/programming-guidelines


Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
"Sven R. Kunze"
Дата:
On 30.04.2017 16:25, Steve Atkins wrote:
You can use postgresql for caching, but caches don't require the data
durability that a database offers, and can be implemented much more
efficiently.

I for one can understand Thomas' need for a single solution.
Just recently I needed a cache which was supposed to be set up in a SERIALIZABLE manner as in https://www.postgresql.org/docs/devel/static/transaction-iso.html#xact-serializable Available cache mechanisms would have produce erroneous results. So, I went for PG.

But it's still a cache, isn't it?

You can use postgresql to provide message queue services and it
does so reasonably well, particularly when the messages are generated within
the database. But it's not going to do so as efficiently, or be as easy to
monitor, to make highly redundant or to scale across a whole datacenter
as a dedicated message queue service.

You could use postgresql to store binary blobs, but it'd be a horrifically
inefficient way to do it. (Using postgresql to store the metadata, while
the content is stored elsewhere, sure).

Use the right tool for the job.

I think it's not as easy as ads and buzz words make us believe it is.

Especially when it comes to reinventing the wheel, I prefer a single solution. With the better JSON support, PG made NoSQL obsolete. I don't see why this cannot happen with blob storage and massive scale out. Just a matter of time, if you ask me.

Regards,
Sven

Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Bill Moran
Дата:
On Mon, 1 May 2017 22:02:15 +0200
Thomas Güttler <guettliml@thomas-guettler.de> wrote:
> Leaving two things open:
>
>  - blob storage
>  - redis/caching

I've used Postgres for both of these purposes, and at the load
level we were experiencing at the time, it worked fine.

We later implemented Redis when our caching requirements exceeded
what Postgres could do in that capacity. We never switched to
anything else for blob storage, as Postgres was always sufficient.

--
Bill Moran <wmoran@potentialtech.com>


Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Bill Moran
Дата:
On Mon, 1 May 2017 22:05:03 +0200
Thomas Güttler <guettliml@thomas-guettler.de> wrote:
> >
> > There's a well-written article I saw recently that directly addresses
> > your question ... I'm too lazy to find it, but google will probably
> > turn it up for you.
>
> I tried to find it, but failed. Can you give me some keywords to find
> this well-written article?

I can't seem find it again. Sorry.

--
Bill Moran <wmoran@potentialtech.com>


Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Jeff Janes
Дата:
On Sun, Apr 30, 2017 at 4:37 AM, Thomas Güttler <guettliml@thomas-guettler.de> wrote:
Is is possible that PostgreSQL will replace these building blocks in the future?

 - redis (Caching)

PostgreSQL has its own caching.  It might not be quite as effective as redis', but you can us it if you are willing to take those trade offs.

 - rabbitmq (amqp)

PostgreSQL has its own system for this, and other ones can be layered on top of fully transactional tables.  
Again, you can use one or the other, depending on your needs, if you are willing to deal with the trade offs.

 
 - s3 (Blob storage)

No.  You can certainly use PostgreSQL to store blobs.  But then, you need to store the PostgreSQL data **someplace**.  If you don't store it in S3, you have to store it somewhere else.
 
Cheers,

Jeff

[GENERAL] Caching and Blobs in PG? Was: Can PG replace redis, amqp, s3 in thefuture?

От
Thomas Güttler
Дата:
Am 02.05.2017 um 05:43 schrieb Jeff Janes:
> On Sun, Apr 30, 2017 at 4:37 AM, Thomas Güttler <guettliml@thomas-guettler.de <mailto:guettliml@thomas-guettler.de>>
wrote:
>
>     Is is possible that PostgreSQL will replace these building blocks in the future?
>
>      - redis (Caching)
>
>
> PostgreSQL has its own caching.  It might not be quite as effective as redis', but you can us it if you are willing
to
> take those trade offs.

What kind of caching does PG offer?

I would use a table with a mtime-column and delete the content after N days.



>      - rabbitmq (amqp)
>
>
> PostgreSQL has its own system for this, and other ones can be layered on top of fully transactional tables.
> Again, you can use one or the other, depending on your needs, if you are willing to deal with the trade offs.
>
>
>
>      - s3 (Blob storage)
>
>


> No.  You can certainly use PostgreSQL to store blobs.  But then, you need to store the PostgreSQL data **someplace**.
> If you don't store it in S3, you have to store it somewhere else.

I don't understand what you mean here. AFAIK storing blobs in PG is not recommended since it is not very efficient.


Regards,
   Thomas Güttler




--
Thomas Guettler http://www.thomas-guettler.de/


Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
"Sven R. Kunze"
Дата:
On 03.05.2017 12:57, Thomas Güttler wrote:
> Am 02.05.2017 um 05:43 schrieb Jeff Janes:
>> On Sun, Apr 30, 2017 at 4:37 AM, Thomas Güttler
>> <guettliml@thomas-guettler.de <mailto:guettliml@thomas-guettler.de>>
>> wrote:
>>
>>     Is is possible that PostgreSQL will replace these building blocks
>> in the future?
>>
>>      - redis (Caching)
>>
>>
>> PostgreSQL has its own caching.  It might not be quite as effective
>> as redis', but you can us it if you are willing to
>> take those trade offs.
>
> What kind of caching does PG offer?
>
> I would use a table with a mtime-column and delete the content after N
> days.

After searching the web, it seems to me that PostgreSQL doesn't offer a
cron-like background job for cleanup tasks.

http://stackoverflow.com/questions/18187490/postgresql-delete-old-rows-on-a-rolling-basis


But there's an extension - pg_cron:
https://www.citusdata.com/blog/2016/09/09/pgcron-run-periodic-jobs-in-postgres/

>
>> No.  You can certainly use PostgreSQL to store blobs.  But then, you
>> need to store the PostgreSQL data **someplace**.
>> If you don't store it in S3, you have to store it somewhere else.
>
> I don't understand what you mean here. AFAIK storing blobs in PG is
> not recommended since it is not very efficient.

Seems like several people here disagree with this conventional wisdom.

I think what he was talking about the data itself. You have to store the
bits and bytes somewhere (e.g. on S3).

Sven


Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
John R Pierce
Дата:
On 5/4/2017 2:08 PM, Sven R. Kunze wrote:
No.  You can certainly use PostgreSQL to store blobs.  But then, you need to store the PostgreSQL data **someplace**.
If you don't store it in S3, you have to store it somewhere else.

I don't understand what you mean here. AFAIK storing blobs in PG is not recommended since it is not very efficient.

Seems like several people here disagree with this conventional wisdom.

I think what he was talking about the data itself. You have to store the bits and bytes somewhere (e.g. on S3).


afaik, S3 is not suitable for the $PGDATA directory, its more of an archival block file store for sequential access.    for the actual database storage in the AWS world, you'd either use EC2 local storage, or EBS, and I've heard from more than a few people that EBS can be something of a sand trap.

re: storing blobs in postgres, I would be very hesitant to storage LARGE amounts of bulk data directly in postgres

-- 
john r pierce, recycling bits in santa cruz

Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
"David G. Johnston"
Дата:
On Thu, May 4, 2017 at 2:08 PM, Sven R. Kunze <srkunze@mail.de> wrote:
On 03.05.2017 12:57, Thomas Güttler wrote:
Am 02.05.2017 um 05:43 schrieb Jeff Janes:
​​


No.  You can certainly use PostgreSQL to store blobs.  But then, you need to store the PostgreSQL data **someplace**.
If you don't store it in S3, you have to store it somewhere else.

I don't understand what you mean here. AFAIK storing blobs in PG is not recommended since it is not very efficient.

Seems like several people here disagree with this conventional wisdom.

I think what he was talking about the data itself. You have to store the bits and bytes somewhere (e.g. on S3).

​I'm not sure I'd be comfortable placing my PostgreSQL data directory on an S3-based mount...

​I would look at any general recommendation in this area with suspicion.  Learn how different alternatives works, evaluate your needs, and make an informed decision for your specific scenario.  Reading other's experiences helps with the education aspect but unless their scenario is exactly like yours you should probably refrain from directly accepting their conclusions.

David J.

Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis, amqp, s3 in the future?

От
Alan Hodgson
Дата:
On Thursday 04 May 2017 14:21:00 John R Pierce wrote:
> or EBS, and I've heard from more than a few people that EBS can be
> something of a sand trap.
>

Sorry for following up off-topic, but EBS has actually improved considerably
in the last few years. You can get guaranteed (and very high) IOPS on SSD
storage, and many instance types come with high-speed throughput to EBS. It's
much much better for databases than it was 5 years ago.


Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
John R Pierce
Дата:
On 5/4/2017 2:28 PM, Alan Hodgson wrote:
On Thursday 04 May 2017 14:21:00 John R Pierce wrote:
or EBS, and I've heard from more than a few people that EBS can be
something of a sand trap.

Sorry for following up off-topic, but EBS has actually improved considerably 
in the last few years. You can get guaranteed (and very high) IOPS on SSD 
storage, and many instance types come with high-speed throughput to EBS. It's 
much much better for databases than it was 5 years ago.


has it become more stable when Amazon has their occasional major hiccups?


-- 
john r pierce, recycling bits in santa cruz

Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
John R Pierce
Дата:
On 5/4/2017 2:08 PM, Sven R. Kunze wrote:
> After searching the web, it seems to me that PostgreSQL doesn't offer
> a cron-like background job for cleanup tasks.
>
> http://stackoverflow.com/questions/18187490/postgresql-delete-old-rows-on-a-rolling-basis
>
>
>
> But there's an extension - pg_cron:
> https://www.citusdata.com/blog/2016/09/09/pgcron-run-periodic-jobs-in-postgres/


there's also pg_agent which is a cron-like extension, usually bundled
with pg_admin but also available standalone

https://www.pgadmin.org/docs4/dev/pgagent.html


--
john r pierce, recycling bits in santa cruz



Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
John R Pierce
Дата:
On 5/4/2017 2:50 PM, John R Pierce wrote:
But there's an extension - pg_cron:
https://www.citusdata.com/blog/2016/09/09/pgcron-run-periodic-jobs-in-postgres/


there's also pg_agent which is a cron-like extension, usually bundled with pg_admin but also available standalone

https://www.pgadmin.org/docs4/dev/pgagent.html


oh, it should be made clear... both of these extensions require an OS level cron/scheduler job to be run every minute or whatever, this job invokes some SQL stuff which checks the pg_cron or pg_agent tables and runs whatever sql tasks are due.


-- 
john r pierce, recycling bits in santa cruz

Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis, amqp, s3 in the future?

От
Alan Hodgson
Дата:
On Thursday 04 May 2017 14:47:54 John R Pierce wrote:
> On 5/4/2017 2:28 PM, Alan Hodgson wrote:
> > On Thursday 04 May 2017 14:21:00 John R Pierce wrote:
> >> or EBS, and I've heard from more than a few people that EBS can be
> >> something of a sand trap.
> >
> > Sorry for following up off-topic, but EBS has actually improved
> > considerably in the last few years. You can get guaranteed (and very
> > high) IOPS on SSD storage, and many instance types come with high-speed
> > throughput to EBS. It's much much better for databases than it was 5
> > years ago.
>
> has it become more stable when Amazon has their occasional major hiccups?

No ... when they have a major problem it generally cascades across the region,
and several outages have hit EBS in particular. In us-east-1, at least. Other
regions seem more reliable.


Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
Jeff Janes
Дата:
On Wed, May 3, 2017 at 3:57 AM, Thomas Güttler <guettliml@thomas-guettler.de> wrote:
Am 02.05.2017 um 05:43 schrieb Jeff Janes:
On Sun, Apr 30, 2017 at 4:37 AM, Thomas Güttler <guettliml@thomas-guettler.de <mailto:guettliml@thomas-guettler.de>> wrote:

    Is is possible that PostgreSQL will replace these building blocks in the future?

     - redis (Caching)


PostgreSQL has its own caching.  It might not be quite as effective as redis', but you can us it if you are willing to
take those trade offs.

What kind of caching does PG offer?

It has shared_buffers to cache the data it needs frequently (not query results, but the data needed to produce the results), and also uses the file systems cache.  This is what I am referring to.  I wouldn't recommend using PostgreSQL simply as a cache for something else, if you don't want any other features of the database.  But if you want to throw Redis up as a layer of cache in front of PostgreSQL, maybe you should first see if that RAM, and a bit of tuning, can be used to make PostgreSQL fast enough to not need the Redis cache.
 



     - s3 (Blob storage)




No.  You can certainly use PostgreSQL to store blobs.  But then, you need to store the PostgreSQL data **someplace**.
If you don't store it in S3, you have to store it somewhere else.

I don't understand what you mean here. AFAIK storing blobs in PG is not recommended since it is not very efficient.

If the metadata is stored in PG and the blobs themselves are stored individually S3, you have a transaction atomicity problem. Solving that is not likely to be very efficient, either.  You have to pick your poison.

Cheers,

Jeff

Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
"Peter J. Holzer"
Дата:
On 2017-05-04 23:08:25 +0200, Sven R. Kunze wrote:
> On 03.05.2017 12:57, Thomas Güttler wrote:
> >Am 02.05.2017 um 05:43 schrieb Jeff Janes:
> >>No.  You can certainly use PostgreSQL to store blobs.  But then, you
> >>need to store the PostgreSQL data **someplace**.
> >>If you don't store it in S3, you have to store it somewhere else.
> >
> >I don't understand what you mean here. AFAIK storing blobs in PG is not
> >recommended since it is not very efficient.
>
> Seems like several people here disagree with this conventional wisdom.

I think it depends very much on what level of "efficiency" you need. On
my home server (i5 processor, 32GB RAM, Samsung 850 SSD - not a piece of
junk, but not super powerful either) I can retrieve a small blob from a
100GB table in about 0.1 ms, and for large blobs the speed approaches
200MB/s. For just about everything I'd do on that server (or even at
work) this is easily fast enough.

Sure, just telling the kernel "send data from file descriptor A (which
happens to be a file) to file descriptor B (a socket)" is a lot more
efficient than copying data from disk into a postgresql process, then
from that process to an application server, from that to the webserver
and that finally sends it to the socket. But if that just lets my server
be 99.9% idle instead of 99.0% idle, I haven't gained much. Similarly,
if my server spends 90% of it's resources doing other stuff, I won't
gain much by optimizing this (I should better optimize that other stuff
it's spending so much time on).

I am in this regard a firm believer in not optimizing prematurely. Do
whatever makes sense from an application point of view. If the blobs are
logically part of some other data (e.g. PDFs in a literature database),
store them together (either all of them in PostgreSQL, or all in some
NoSQL database, or maybe on stone tablets, if that makes sense for some
reason). Only if you have good reason[1] to believe that physically
separating data which logically belongs together will resolve a
bottleneck, then by all means separate them.

        hp

[1] "I read somewhere on the internet" is usually not a good reason.

--
   _  | Peter J. Holzer    | A coding theorist is someone who doesn't
|_|_) |                    | think Alice is crazy.
| |   | hjp@hjp.at         | -- John Gordon
__/   | http://www.hjp.at/ |    http://downlode.org/Etext/alicebob.html

Вложения

Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
John R Pierce
Дата:
On 5/5/2017 11:28 AM, Peter J. Holzer wrote:
On 2017-05-04 23:08:25 +0200, Sven R. Kunze wrote:
On 03.05.2017 12:57, Thomas Güttler wrote:
Am 02.05.2017 um 05:43 schrieb Jeff Janes:
No.  You can certainly use PostgreSQL to store blobs.  But then, you
need to store the PostgreSQL data **someplace**.
If you don't store it in S3, you have to store it somewhere else.
I don't understand what you mean here. AFAIK storing blobs in PG is not
recommended since it is not very efficient.
Seems like several people here disagree with this conventional wisdom.
I think it depends very much on what level of "efficiency" you need. On
my home server (i5 processor, 32GB RAM, Samsung 850 SSD - not a piece of
junk, but not super powerful either) I can retrieve a small blob from a
100GB table in about 0.1 ms, and for large blobs the speed approaches
200MB/s. For just about everything I'd do on that server (or even at
work) this is easily fast enough.


S3 is often used for terabyte to petabyte file collections.   I would not want to burden my relational database with this.


-- 
john r pierce, recycling bits in santa cruz

Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Tony Finch
Дата:
Bill Moran <wmoran@potentialtech.com> wrote:
>
> There's a well-written article I saw recently that directly addresses
> your question ... I'm too lazy to find it, but google will probably
> turn it up for you.

This? http://renesd.blogspot.co.uk/2017/02/is-postgresql-good-enough.html

Tony.
--
f.anthony.n.finch  <dot@dotat.at>  http://dotat.at/  -  I xn--zr8h punycode
Irish Sea: East or northeast 5 or 6. Slight or moderate. Fair. Good.


Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
"Peter J. Holzer"
Дата:
On 2017-05-05 11:46:55 -0700, John R Pierce wrote:
> On 5/5/2017 11:28 AM, Peter J. Holzer wrote:
>
>     On 2017-05-04 23:08:25 +0200, Sven R. Kunze wrote:
>
>         On 03.05.2017 12:57, Thomas Güttler wrote:
>
>             Am 02.05.2017 um 05:43 schrieb Jeff Janes:
>
>                 No.  You can certainly use PostgreSQL to store blobs.  But then, you
>                 need to store the PostgreSQL data **someplace**.
>                 If you don't store it in S3, you have to store it somewhere else.
>
>             I don't understand what you mean here. AFAIK storing blobs in PG is not
>             recommended since it is not very efficient.
>
>         Seems like several people here disagree with this conventional wisdom.
>
>     I think it depends very much on what level of "efficiency" you need. On
>     my home server (i5 processor, 32GB RAM, Samsung 850 SSD - not a piece of
>     junk, but not super powerful either) I can retrieve a small blob from a
>     100GB table in about 0.1 ms, and for large blobs the speed approaches
>     200MB/s. For just about everything I'd do on that server (or even at
>     work) this is easily fast enough.
>
>
> S3 is often used for terabyte to petabyte file collections.   I would not want
> to burden my relational database with this.

I repeat the the first sentence I wrote: "I think it depends very much
on what level of 'efficiency' you need." Just because some people need
to store petabytes of blob data doesn't mean everybody does. If you need
to store petabytes of blobs, PostgreSQL may not be the right tool. But
it may be the right tool if you just need to store a few thousand PDFs.
To tell people to never store blobs in PostgreSQL because PostgreSQL is
"not efficient" is just bullshit. There are many factors which determine
how you should store your data, and "efficiency" (however that is
defined, if it's defined at all and not just used as a buzzword) is only
one of them - and rarely, in my experience, the most important one.

        hp

--
   _  | Peter J. Holzer    | A coding theorist is someone who doesn't
|_|_) |                    | think Alice is crazy.
| |   | hjp@hjp.at         | -- John Gordon
__/   | http://www.hjp.at/ |    http://downlode.org/Etext/alicebob.html

Вложения

Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Bill Moran
Дата:
On Fri, 5 May 2017 19:52:42 +0100
Tony Finch <dot@dotat.at> wrote:

> Bill Moran <wmoran@potentialtech.com> wrote:
> >
> > There's a well-written article I saw recently that directly addresses
> > your question ... I'm too lazy to find it, but google will probably
> > turn it up for you.
>
> This? http://renesd.blogspot.co.uk/2017/02/is-postgresql-good-enough.html

Oh, hey! You found it, thanks!

--
Bill Moran <wmoran@potentialtech.com>


Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
Adam Brusselback
Дата:
there's also pg_agent which is a cron-like extension, usually bundled with pg_admin but also available standalone

https://www.pgadmin.org/docs4/dev/pgagent.html


--
john r pierce, recycling bits in santa cruz

In addition to that, there is also  jpgAgent: https://github.com/GoSimpleLLC/jpgAgent

It uses the same schema as pgagent in the database, and just replaces the actual agent portion of it with a compatible re-write.  Has been way more stable for us since we switched to it, as well as providing features we needed like email notifications and parallel running of steps.

Disclosure: I wrote it for my company... started on it well before all the alternatives like pg_cron, pg_bucket, etc came out.

Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis,amqp, s3 in the future?

От
Steve Atkins
Дата:
> On May 7, 2017, at 9:16 AM, Adam Brusselback <adambrusselback@gmail.com> wrote:
>
> there's also pg_agent which is a cron-like extension, usually bundled with pg_admin but also available standalone
>
> https://www.pgadmin.org/docs4/dev/pgagent.html
>
>
> --
> john r pierce, recycling bits in santa cruz
>
> In addition to that, there is also  jpgAgent: https://github.com/GoSimpleLLC/jpgAgent
>
> It uses the same schema as pgagent in the database, and just replaces the actual agent portion of it with a
compatiblere-write.  Has been way more stable for us since we switched to it, as well as providing features we needed
likeemail notifications and parallel running of steps. 
>
> Disclosure: I wrote it for my company... started on it well before all the alternatives like pg_cron, pg_bucket, etc
cameout. 

There's also pglater, which is a minimal external process that'll let you implement any sort of cron-ish functionality
entirelyinside the database without needing to be woken up every minute by an external cron. 

https://github.com/wttw/pgsidekick

More proof-of-concept than anything remotely production-ready.

Cheers,
  Steve



Re: [GENERAL] Can PG replace redis, amqp, s3 in the future?

От
Scott Marlowe
Дата:
On Mon, May 1, 2017 at 2:59 PM, Sven R. Kunze <srkunze@mail.de> wrote:
> On 30.04.2017 16:25, Steve Atkins wrote:
>
> You can use postgresql for caching, but caches don't require the data
> durability that a database offers, and can be implemented much more
> efficiently.
>
>
> I for one can understand Thomas' need for a single solution.
> Just recently I needed a cache which was supposed to be set up in a
> SERIALIZABLE manner as in
> https://www.postgresql.org/docs/devel/static/transaction-iso.html#xact-serializable
> Available cache mechanisms would have produce erroneous results. So, I went
> for PG.

This brings up another subject, reliability. If PostgreSQL is fast
enough, and on stable hardware, it's often the preferred choice
because of its very good stability. Try running a big production noSQL
cluster and you'll find plenty of sharp corners in most. A lot of
times it's just easier to set up a pair of VMs (on good hardware) and
toss a pg db at the problem, esp if performance is a secondary
consideration, or not likely to tax pgsql's basic architecture.