Обсуждение: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

Поиск

Список

Период

Сортировка

Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

tutiluren@tutanota.com

Дата:

14 сентября 2020 г., 20:22:31

Even though I highly appreciate that PostgreSQL, a database software which doesn't cost money, exists *at all*, that fact is oftentimes overshadowed by a small but important number of very frustrating issues which I consider to largely ruin the overall "experience" of using PostgreSQL. I'd call them "almost show-stopping". I realize very well that not everyone "has the same priorities" (clearly not) or the expertise, will and free time to work on a certain "area" of the overall project/ecosystem, but these issues/bugs/limitations are so problematic to me that I have to express them directly to the PostgreSQL developers:

1. All non-ANSI characters are turned into "?"s for application_name. My administration panel is thus full of gibberish such as: "?a??? ???? ? ?m??? ??i???u?" and it always looks as if something is awfully broken. I would not dare to show it to a CEO or other important person, as they'd just go: "We're switching to IBM, effective immediately. Throw this open source rubbish out at once!" The explanation I've heard for this is that it's basically a security issue, as it's possible to set the application_name before the something-something (safe Unicode handler code?) has kicked in, but I have no problems with setting the application_name to Unicode characters *after* the database connection has already been fully established, in a separate query, as I already do, and I doubt that anyone else would have, either. So that explanation, while probably technically true, doesn't seem to make any sense.

2. pg_dump misinterprets non-ANSI values for the "--exclude-*" options (at least the --exclude-table-data one, which is the one I've tested) on Windows, resulting in it being impossible to make more "sophisticated" backups of PostgreSQL databases; it's either all or nothing. Other programs, including my own test scripts and commands, are perfectly able to use any Unicode character sent from/through both cmd.exe and PHP CLI, but not pg_dump, so the idea that "Windows it at fault" here just doesn't seem true. (Although I don't doubt for a second that it often *is* the case... Microsoft is not a nice entity in any way.) I spent a lot of time and efforts experimenting with and asking about this, but eventually gave up and concluded that it was yet another bug in an open source project "only" on Windows with no real/pressing interest in fixing it. For me, this means that I lose a ton of fresh data every day, or have to make *gigantic* backups. (I have several huge "temporary debug log" tables whose data have zero long-term value but tons of short-term value.) It makes me feel crippled and excluded in an uncomfortable manner.

3. The ability to embed PG to run in an automatic, quiet manner as part of something else. I know about SQLite, but it's extremely limited to the point of being virtually useless IMO, which is why I cannot use that for anything nontrivial. I want my familiar PostgreSQL, only not require it to be manually and separately installed on the machine where it is to run as part of some "application". If I could just "embed" it, this would allow me to create a single EXE which I can simply put on a different machine to run my entire "system" which otherwise takes *tons* of tedious, error-prone manual labor to install, set up and maintain. Of course, this is probably much easier said than done, but I don't understand why PG's architecture necessarily dictates that PG must be a stand-alone, separate thing. Or rather, why some "glue" cannot enable it to be used just like SQLite from a *practical* perspective, even if it still is a "server-client model" underneath the hood. (Which doesn't matter at all to me, nor should it matter to anyone else.)

4. There is no built-in means to have PG manage (or even suggest) indexes on its own. Trying to figure out what indexes to create/delete/fine-tune, and determine all the extremely complex rules for this art (yes, I just called index management an *art*, because it is!), is just utterly hopeless to me. It never gets any easier. Not even after many years. It's the by far worst part of databases to me (combined with point five). Having to use third-party solutions ensures that it isn't done in practice, at least for me. I don't trust, nor do I want to deal with, external software and extensions in my databases. I still have nightmares from PostGIS, which I only keep around, angrily, out of absolute necessity. I fundamentally don't like third-party add-ons to things, but want the core product to properly support things. Besides, this (adding/managing indexes) is not even some niche/obscure use-case, but something which is crucial for basically any nontrivial database of any kind!

5. Ever since my early days with PG in the mid-2000s, I've tried numerous times to read the manual, wikis and comments for the configuration files, specifically the performance directives, and asked many, many times for help about that, yet never been able to figure out what they want me to enter for all the numerous options. At this point, it isn't me being lazy/stupid; it's objectively very difficult to understand all of that. The practical end result of this is that I've always gone back to using the untouched default configuration file (except for the logging-related options), which, especially in the past on FreeBSD, *severely* crippled my PG database to not even come close to taking advantage of the full power of the hardware. Instead, it felt like I was using maybe 1% of the machine's power, even with a proper database design and indexes and all of that stuff, simply because the default config was so "conservative" and it couldn't be just set to "use whatever resources are available". I wish so much for PG to have a mode where it self-tunes itself as needed, over time, based on the actual workload, or at least allowed some kind of abstract "performance mode" such as: "you are allowed to use significant system resources, PG", or: "You are one of my most important applications. Just use as much power as you currently need, but at least save about 10% for the rest of the system, will you?" Maybe this is also harder than it sounds to accomplish, but for somebody like me who has zero funding, I cannot hire some professional to sit down with me and fine-tune my system for $899/hour. Also, besides the purely monetary issue, there are serious privacy implications with that scene. I wouldn't want an outsider to have intimate knowledge of my database system, which more than probably is a requirement for them to be able to do their job properly.

I'm sorry if any of the above sounds insulting/"entitled". These are the main things which truly bother me about PG and the "PG ecosystem", and I'd love to hear some first-hand comments on them. At least point 1 and 2 seem like they would be almost trivial to fix, at least compared to the rest.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Guyren Howe

Дата:

14 сентября 2020 г., 20:26:52

On Sep 14, 2020, 13:22 -0700, tutiluren@tutanota.com, wrote:

3. The ability to embed PG to run in an automatic, quiet manner as part of something else. I know about SQLite, but it's extremely limited to the point of being virtually useless IMO, which is why I cannot use that for anything nontrivial

I can’t for the life of me imagine how you arrived at this. SQLite is very capable indeed. Its dialect of SQL is (deliberately) very similar to Postgres, featuring such niceties as recursive CTEs and window functions, and it can handle heavy use and multi-terabyte databases if you need (cf bedrockdb).

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Guyren Howe

Дата:

14 сентября 2020 г., 20:30:39

Regarding indexes, I recommend the website use-the-index-luke.com. That guy’s other website about modern SQL is also great.

Regarding self-tuning, it was actually part of the original vision for relational databases that they would do that, but I’m not aware of any modern SQL database that does, although there are third-party tools that will eg offer index suggestions (I don’t know which is best, so I won’t propose any).

On Sep 14, 2020, 13:22 -0700, tutiluren@tutanota.com, wrote:

4. There is no built-in means to have PG manage (or even suggest) indexes on its own. Trying to figure out what indexes to create/delete/fine-tune, and determine all the extremely complex rules for this art (yes, I just called index management an *art*, because it is!), is just utterly hopeless to me. It never gets any easier. Not even after many years. It's the by far worst part of databases to me (combined with point five). Having to use third-party solutions ensures that it isn't done in practice, at least for me. I don't trust, nor do I want to deal with, external software and extensions in my databases. I still have nightmares from PostGIS, which I only keep around, angrily, out of absolute necessity. I fundamentally don't like third-party add-ons to things, but want the core product to properly support things. Besides, this (adding/managing indexes) is not even some niche/obscure use-case, but something which is crucial for basically any nontrivial database of any kind!

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Ravi Krishna

Дата:

14 сентября 2020 г., 20:31:11

I can’t for the life of me imagine how you arrived at this. SQLite is very capable indeed.
>Its dialect of SQL is (deliberately) very similar to Postgres, featuring such niceties as
>recursive CTEs and window functions, and it can handle heavy use and multi-terabyte
> databases if you need (cf bedrockdb).


But it is still a single user database in the sense that concurrent access to a table is not
handled well.  I use it for single user embedded applications

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Tom Lane

Дата:

14 сентября 2020 г., 21:39:57

tutiluren@tutanota.com writes:
> 1. All non-ANSI characters are turned into "?"s for application_name.

Yeah, that's hard to do much with unfortunately.  We cannot assume that
all databases in an installation share the same encoding, so for globally
visible strings like application_name, the only safe solution is to
restrict them to ASCII.

On the other hand, the very same thing could be said of database names
and role names, yet we have never worried much about whether those were
encoding-safe when viewed from databases with different encodings, nor
have there been many complaints about the theoretical unsafety.  So maybe
this is just overly anal-retentive and we should drop the restriction,
or at least pass through data that doesn't appear to be invalidly
encoded.

> 2. pg_dump misinterprets non-ANSI values for the "--exclude-*" options
> (at least the --exclude-table-data one, which is the one I've tested) on
> Windows, resulting in it being impossible to make more "sophisticated"
> backups of PostgreSQL databases; it's either all or nothing.

TBH, I'm going to throw that back on you as probably pilot error.  There's
no reason for such cases not to work if you're running the terminal window
in the same encoding that pg_dump thinks it's using.

> I spent a lot of time and efforts experimenting with and asking about
> this, but eventually gave up and concluded that it was yet another bug
> in an open source project "only" on Windows with no real/pressing
> interest in fixing it.

Yeah, I remember that discussion.  It being open source cuts two ways:
problems get fixed by people who have the motivation to find a workable
fix.  Not being a Windows user, I don't particularly care about whether
this case is broken or not, and even if I cared more, I do not have
the resources to figure it out myself.  There are other people around the
project who do Windows, but you evidently have not managed to persuade
them that this is something they should spend time on, either.

> 3. The ability to embed PG to run in an automatic, quiet manner as part
> of something else.

That would be a packaging problem for the "something else" to figure out,
no?  There are enough platform-specific issues about "automatic" that
I don't see it as very practical for the core Postgres project to take
on, in any case.  I'm aware of some packagings, such as Postgres.app for
macOS (https://postgresapp.com), that might be closer to what you are
looking for than the core project ever could be.

> 4. There is no built-in means to have PG manage (or even suggest)
> indexes on its own.

Indeed, and that does not seem to me like something that ought to be
built in.  There is at least one "index advisor" plug-in floating
around, which perhaps could help you.

> 5. Ever since my early days with PG in the mid-2000s, I've tried
> numerous times to read the manual, wikis and comments for the
> configuration files, specifically the performance directives, and asked
> many, many times for help about that, yet never been able to figure out
> what they want me to enter for all the numerous options. At this point,
> it isn't me being lazy/stupid; it's objectively very difficult to
> understand all of that.

We do the best we can as far as the documentation goes; if you have
concrete suggestions about how to improve that, we'll surely listen.

> The practical end result of this is that I've always gone back to using
> the untouched default configuration file (except for the logging-related
> options), which, especially in the past on FreeBSD, *severely* crippled
> my PG database to not even come close to taking advantage of the full
> power of the hardware.

You seem to be wishing that the default configuration was designed to
try to commandeer the whole machine.  It's deliberate project policy
that that not be so --- and given that a couple of bullet points up,
you were arguing for Postgres to be easily usable as an invisible
part of some other app, it doesn't seem like you really want it to
be so either.

There is a lot of info out there about Postgres tuning, and yes some
of it is contradictory, because one size doesn't fit all.  People
have different goals about how they want the system to act, and the
proper settings may vary across platforms or PG versions, and there's a
fair amount of just plain different opinions.  But I think you may be
overcomplicating it.  IMO there's not that much you have to adjust to
start with --- maybe just increase shared_buffers and possibly work_mem
--- and then tweak other settings only when you find yourself running into
that limit.  If you can figure out our logging settings then you can
figure out the rest of this, too.

            regards, tom lane

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

raf

Дата:

15 сентября 2020 г., 00:05:21

On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> tutiluren@tutanota.com writes:
> > 1. All non-ANSI characters are turned into "?"s for application_name.
> 
> Yeah, that's hard to do much with unfortunately.  We cannot assume that
> all databases in an installation share the same encoding, so for globally
> visible strings like application_name, the only safe solution is to
> restrict them to ASCII.
> 
> On the other hand, the very same thing could be said of database names
> and role names, yet we have never worried much about whether those were
> encoding-safe when viewed from databases with different encodings, nor
> have there been many complaints about the theoretical unsafety.  So maybe
> this is just overly anal-retentive and we should drop the restriction,
> or at least pass through data that doesn't appear to be invalidly
> encoded.

Perhaps recode database/role names from the source
database's encoding into utf8, and then recode from utf8
to the destination database's encoding?

For "globally visible strings", maybe recode to the
client_encoding setting, or a new encoding setting for
this purpose since client_encoding seems to be linked
to the database that the client is connected to. I'd've
thought that the application name would arrive encoded
as client_encoding (which defaults to the database
encoding). Maybe globally visible strings need to be
recoded on arrival from the client_encoding to utf8 (or
a "server_encoding" or "global_encoding" setting) so
they are always stored in a known encoding so they can
be recoded as necessary when viewed via connections to
other databases using a different encoding.

Just some thoughts. If they don't make any sense, feel
free to ignore them. :-)

> > 5. Ever since my early days with PG in the mid-2000s, I've tried
> > numerous times to read the manual, wikis and comments for the
> > configuration files, specifically the performance directives, and asked
> > many, many times for help about that, yet never been able to figure out
> > what they want me to enter for all the numerous options. At this point,
> > it isn't me being lazy/stupid; it's objectively very difficult to
> > understand all of that.
> 
> We do the best we can as far as the documentation goes; if you have
> concrete suggestions about how to improve that, we'll surely listen.

I thought the documentation on postgres performance tuning was fine.
Here's what I got from it (for a dedicated database server host):

  shared_buffers = 1GB # Should be 1/4 of RAM
  work_mem = 16MB # Should be bigger to do more sorts in-memory but it's per sort per user so not too big
  maintenance_work_mem = 128MB # Should be <= 256MB to make vacuum fast without taking away too much RAM from other
tasks
  min_wal_size = 480MB # Should be at least 3 * 10 * 16MB to avoid too many checkpoints slowing down writes
  checkpoint_completion_target = 0.9 # Should be 0.9 if checkpoint_segments was increased to spread out checkpoint
writes
  random_page_cost = 4.0 # Should be 4.0 for HDD, 1.5-2.5 for SSD, 1.0-1.01 if db fits entirely in RAM
  effective_cache_size = 3GB # Should be 1/2 to 3/4 of RAM

It's the "Should..." comments that matter. And it might be out of date...
Actually, it is out of date. The comment for checkpoint_completion_target
refers to checkpoint_segments which no longer exists (since 9.5) so
disregard that.

cheers,
raf

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Tom Lane

Дата:

15 сентября 2020 г., 00:26:02

raf <raf@raf.org> writes:
> On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> On the other hand, the very same thing could be said of database names
>> and role names, yet we have never worried much about whether those were
>> encoding-safe when viewed from databases with different encodings, nor
>> have there been many complaints about the theoretical unsafety.  So maybe
>> this is just overly anal-retentive and we should drop the restriction,
>> or at least pass through data that doesn't appear to be invalidly
>> encoded.

> Perhaps recode database/role names from the source
> database's encoding into utf8, and then recode from utf8
> to the destination database's encoding?

A lot of people seem to believe that transcoding through utf8
is 100% safe.  They're wrong :-( --- the Japanese, at least,
have reason not to trust it, because of the existence of multiple
incompatible conversion standards.  And you're still left with the
question of what to do when the destination encoding hasn't
got the character.

Moreover, this is all moderately expensive unless the encodings in
question are already utf8 or latin1.  So if we go this way I'd
prefer to do it as I said above -- just drop or question-mark-ize
any characters that don't pass validation in the recipient DB.
That's fairly cheap and it will work perfectly in the typical case
where the whole cluster is on one encoding anyway.

            regards, tom lane

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Bruce Momjian

Дата:

20 сентября 2020 г., 16:38:30

On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane wrote:
> tutiluren@tutanota.com writes:
> > 1. All non-ANSI characters are turned into "?"s for application_name.
> 
> Yeah, that's hard to do much with unfortunately.  We cannot assume that
> all databases in an installation share the same encoding, so for globally
> visible strings like application_name, the only safe solution is to
> restrict them to ASCII.
> 
> On the other hand, the very same thing could be said of database names
> and role names, yet we have never worried much about whether those were
> encoding-safe when viewed from databases with different encodings, nor
> have there been many complaints about the theoretical unsafety.  So maybe
> this is just overly anal-retentive and we should drop the restriction,
> or at least pass through data that doesn't appear to be invalidly
> encoded.

I think the issue is that role and database names are controlled by
privileged users, while application_name is not.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Bruce Momjian

Дата:

20 сентября 2020 г., 16:40:30

On Mon, Sep 14, 2020 at 10:22:31PM +0200, tutiluren@tutanota.com wrote:
> 4. There is no built-in means to have PG manage (or even suggest) indexes on
> its own. Trying to figure out what indexes to create/delete/fine-tune, and
> determine all the extremely complex rules for this art (yes, I just called
> index management an *art*, because it is!), is just utterly hopeless to me. It
> never gets any easier. Not even after many years. It's the by far worst part of
> databases to me (combined with point five). Having to use third-party solutions
> ensures that it isn't done in practice, at least for me. I don't trust, nor do
> I want to deal with, external software and extensions in my databases. I still
> have nightmares from PostGIS, which I only keep around, angrily, out of
> absolute necessity. I fundamentally don't like third-party add-ons to things,
> but want the core product to properly support things. Besides, this (adding/
> managing indexes) is not even some niche/obscure use-case, but something which
> is crucial for basically any nontrivial database of any kind!

I think you did a very good job of explaining your issues.  I think the
underlying problem is that Postgres is targeting a wide market, and your
use-case for a more limited or self-contained database doesn't fit many
of those markets.  Also, PostGIS is one of the most complex extensions,
so adding simpler ones should not be as hard.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Tom Lane

Дата:

20 сентября 2020 г., 17:15:26

Bruce Momjian <bruce@momjian.us> writes:
> On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane wrote:
>> On the other hand, the very same thing could be said of database names
>> and role names, yet we have never worried much about whether those were
>> encoding-safe when viewed from databases with different encodings, nor
>> have there been many complaints about the theoretical unsafety.  So maybe
>> this is just overly anal-retentive and we should drop the restriction,
>> or at least pass through data that doesn't appear to be invalidly
>> encoded.

> I think the issue is that role and database names are controlled by
> privileged users, while application_name is not.

That's certainly an argument against a completely laissez-faire approach,
but if we filtered invalidly-encoded data on the reading side, it seems
like we would be in good enough shape.

            regards, tom lane

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Joshua Drake

Дата:

21 сентября 2020 г., 17:53:08

Howdy,

First let me say thanks for the feedback! It is rare that we receive detailed "user feedback" on these lists so it is good to hear from the outside world. I am only going to address a few things as others have addressed the rest.

2. pg_dump misinterprets non-ANSI values for the "--exclude-*" options (at least the --exclude-table-data one, which is the one I've tested) on Windows, resulting in it being impossible to make more "sophisticated" backups of PostgreSQL databases; it's either all or nothing. Other programs, including my own test scripts and commands, are perfectly able to use any Unicode character sent from/through both cmd.exe and PHP CLI, but not pg_dump, so the idea that "Windows it at fault" here just doesn't seem true. (Although I don't doubt for a second that it often *is* the case... Microsoft is not a nice entity in any way.) I spent a lot of time and efforts experimenting with and asking about this, but eventually gave up and concluded that it was yet another bug in an open source project "only" on Windows with no real/pressing interest in fixing it. For me, this means that I lose a ton of fresh data every day, or have to make *gigantic* backups. (I have several huge "temporary debug log" tables whose data have zero long-term value but tons of short-term value.) It makes me feel crippled and excluded in an uncomfortable manner.

I have to agree that pg_dump is largely a step child backup program. It has consistently been found over the years to be lacking in a number of areas. Unfortunately, working on pg_dump isn't sexy and it is difficult to get volunteers or even paid resources to do such a thing. The real solution for pg_dump is a complete refactor which includes pg_dumpall and it is not a small undertaking. It should be noted that it is also a less and less used program. On our team it is normally used for only very specific needs (grabbing a schema) and we use binary backups or logical replication to receive specific data.

3. The ability to embed PG to run in an automatic, quiet manner as part of something else. I know about SQLite, but it's extremely limited to the point of being virtually useless IMO, which is why I cannot use that for anything nontrivial. I want my familiar PostgreSQL, only not require it to be manually and separately installed on the machine where it is to run as part of some "application". If I could just "embed" it, this would allow me to create a single EXE which I can simply put on a different machine to run my entire "system" which otherwise takes *tons* of tedious, error-prone manual labor to install, set up and maintain. Of course, this is probably much easier said than done, but I don't understand why PG's architecture necessarily dictates that PG must be a stand-alone, separate thing. Or rather, why some "glue" cannot enable it to be used just like SQLite from a *practical* perspective, even if it still is a "server-client model" underneath the hood. (Which doesn't matter at all to me, nor should it matter to anyone else.)

This is really using the wrong tool for the job type of issue. PG was never designed for such a scenario.

4. There is no built-in means to have PG manage (or even suggest) indexes on its own. Trying to figure out what indexes to create/delete/fine-tune, and determine all the extremely complex rules for this art (yes, I just called index management an *art*, because it is!), is just utterly hopeless to me. It never gets any easier. Not even after many years. It's the by far worst part of databases to me (combined with point five). Having to use third-party solutions ensures that it isn't done in practice, at least for me. I don't trust, nor do I want to deal with, external software and extensions in my databases. I still have nightmares from PostGIS, which I only keep around, angrily, out of absolute necessity. I fundamentally don't like third-party add-ons to things, but want the core product to properly support things. Besides, this (adding/managing indexes) is not even some niche/obscure use-case, but something which is crucial for basically any nontrivial database of any kind!

I think you are looking at this from a very windows centric way. Open Source has its origins from the Unix paradigm where each tool was designed to solve one type of problem and you used multiple tools to create a "solution". Though we have strayed from that on some items due to the evolving nature of software needs, that is still at our core and for good reason. Having tools, flags etc... to do such things (including your point #3) creates complexity best left to "vendors" not the software project.

5. Ever since my early days with PG in the mid-2000s, I've tried numerous times to read the manual, wikis and comments for the configuration files, specifically the performance directives, and asked many, many times for help about that, yet never been able to figure out what they want me to enter for all the numerous options. At this point, it isn't me being lazy/stupid; it's objectively very difficult to understand all of that.

This is absolutely true. The Postgresql documentation is FANTASTIC if you already understand what is going on or you need a reference. We have improved this a bit in 13 with the glossary but we still don't have definitive "5 steps to make your PostgreSQL server fast" and that is very much because it is a complicated question and it takes a lot of knowledge to do it correctly.

The practical end result of this is that I've always gone back to using the untouched default configuration file (except for the logging-related options), which, especially in the past on FreeBSD, *severely* crippled my PG database to not even come close to taking advantage of the full power of the hardware. Instead, it felt like I was using maybe 1% of the machine's power, even with a proper database design and indexes and all of that stuff, simply because the default config was so "conservative" and it couldn't be just set to "use whatever resources are available".

Not to be unkind but this does seem lazy. There are literally hundreds of "how to make postgres go fast", "how to optimize postgres" if you take 15 minutes to Google. It is true that the project (outside of the wiki) doesn't have much information in the official documentation but that doesn't mean that the information is not available.

I wish so much for PG to have a mode where it self-tunes itself as needed, over time, based on the actual workload, or at least allowed some kind of abstract "performance mode" such as: "you are allowed to use significant system resources, PG", or: "You are one of my most important applications. Just use as much power as you currently need, but at least save about 10% for the rest of the system, will you?" Maybe this is also harder than it sounds to accomplish, but for somebody like me who has zero funding, I cannot hire some professional to sit down with me and fine-tune my system for $899/hour.

See my comment about Google. The information is out there and easy to find. There is a lot of fresh and free content right here:

https://postgresconf.org/conferences/postgres-webinar-series/schedule/events

I hope you find the information you are looking for. There is a very helpful community on Discord and Slack as well:

Thanks,

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Fabio Ugo Venchiarutti

Дата:

22 сентября 2020 г., 14:17:54

On 21/09/2020 17:53, Joshua Drake wrote:
> 3. The ability to embed PG to run in an automatic, quiet manner as part 
> of something else. I know about SQLite, but it's extremely limited to 
> the point of being virtually useless IMO, which is why I cannot use that 
> for anything nontrivial. I want my familiar PostgreSQL, only not require 
> it to be manually and separately installed on the machine where it is to 
> run as part of some "application". If I could just "embed" it, this 
> would allow me to create a single EXE which I can simply put on a 
> different machine to run my entire "system" which otherwise takes *tons* 
> of tedious, error-prone manual labor to install, set up and maintain. Of 
> course, this is probably much easier said than done, but I don't 
> understand why PG's architecture necessarily dictates that PG must be a 
> stand-alone, separate thing. Or rather, why some "glue" cannot enable it 
> to be used just like SQLite from a *practical* perspective, even if it 
> still is a "server-client model" underneath the hood. (Which doesn't 
> matter at all to me, nor should it matter to anyone else.)

It depends what you mean by "embedded".
If you want sqlite's linked library approach, where the database calls 
literally run your process' address space, then that's a no go, as 
postgres is a multi-user database server with its own process hierarchy.

However, postgres also is a rather agnostic command that does not detach 
from the parent's terminal/stdio unless instructed to, so nothing stops 
your bespoke application from launching and managing its own postmaster 
as a directly managed child process - started as part of application 
initialisation - and having that listen on a local socket only available 
to the application itself; this is what we implemented in some of our 
installations where postgres is enslaved to the cluster control system - 
the configuration is re-generated at every restart (IIRC Patroni does 
something similar minus the unix socket part).

A plethora of systems are built around the notion of programs calling 
other programs and managing the process' life cycle. The limiting factor 
to such architecture tends to be the OS's process control semantics and 
API (notoriously weaker or more contrived on non-unix-like OSs), but 
that's not postgres' fault.

-- 
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology

-- 

Notice: 
This email is confidential and may contain copyright material of 
members of the Ocado Group. Opinions and views expressed in this message 
may not necessarily reflect the opinions and views of the members of the 
Ocado Group.

If you are not the intended recipient, please notify us 
immediately and delete all copies of this message. Please note that it is 
your responsibility to scan this message for viruses.

References to the 
"Ocado Group" are to Ocado Group plc (registered in England and Wales with 
number 7098618) and its subsidiary undertakings (as that expression is 
defined in the Companies Act 2006) from time to time. The registered office 
of Ocado Group plc is Buildings One & Two, Trident Place, Mosquito Way, 
Hatfield, Hertfordshire, AL10 9UL.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Jonathan Strong

Дата:

22 сентября 2020 г., 16:11:30

Joshua - adding to what Fabio said (with which I agree wholeheartedly!) -

It will be worthwhile for you to itemize the attributes and needed features for making a final decision about the architecture of the application you plan to deploy.

While you are familiar with PostgreSQL and like its features, it may not wind up being the best match for what you are trying to accomplish. I'm not sure your concerns about SQLite are true roadblocks. If you are looking to deploy an executable supported by one or more DLLs but no external processes, you won't find many mature and truly embeddable DBMSs out there. SQLite is widely used and well supported, and some of the functionality you'd normally consider putting in stored procedures might fit in with the idea of creating custom functions. A few more steps than creating a PostgreSQL stored procedure or function, but perhaps this approach would make SQLite a reasonable choice for you. See:

https://stackoverflow.com/questions/2108870/how-to-create-custom-functions-in-sqlite

Back to the question about attributes and features of the DBMS, and Fabio's question to you re: your definition of an embedded database. An embedded database is typically used by a single user and is only available within the context of the current application and user / memory space. It should also be self-tuning and self-managing, as a typical end user can't and shouldn't be expected to also handle any DBA tasks. Several databases are considered "embedded", but the criteria used for this label may not match your own. If you can spell out your specific expectations (e.g., the app and database can be deployed in a single exe, etc.), this will help tremendously in homing in on the right choice.

- Jon

Jonathan Strong

CIO / CTO / Consultant

P: 609-532-1715 E: jonathanrstrong@gmail.com

Quora Top Writer

On Tue, Sep 22, 2020 at 10:18 AM Fabio Ugo Venchiarutti <f.venchiarutti@ocado.com> wrote:

On 21/09/2020 17:53, Joshua Drake wrote:
> 3. The ability to embed PG to run in an automatic, quiet manner as part
> of something else. I know about SQLite, but it's extremely limited to
> the point of being virtually useless IMO, which is why I cannot use that
> for anything nontrivial. I want my familiar PostgreSQL, only not require
> it to be manually and separately installed on the machine where it is to
> run as part of some "application". If I could just "embed" it, this
> would allow me to create a single EXE which I can simply put on a
> different machine to run my entire "system" which otherwise takes *tons*
> of tedious, error-prone manual labor to install, set up and maintain. Of
> course, this is probably much easier said than done, but I don't
> understand why PG's architecture necessarily dictates that PG must be a
> stand-alone, separate thing. Or rather, why some "glue" cannot enable it
> to be used just like SQLite from a *practical* perspective, even if it
> still is a "server-client model" underneath the hood. (Which doesn't
> matter at all to me, nor should it matter to anyone else.)

It depends what you mean by "embedded".
If you want sqlite's linked library approach, where the database calls
literally run your process' address space, then that's a no go, as
postgres is a multi-user database server with its own process hierarchy.

However, postgres also is a rather agnostic command that does not detach
from the parent's terminal/stdio unless instructed to, so nothing stops
your bespoke application from launching and managing its own postmaster
as a directly managed child process - started as part of application
initialisation - and having that listen on a local socket only available
to the application itself; this is what we implemented in some of our
installations where postgres is enslaved to the cluster control system -
the configuration is re-generated at every restart (IIRC Patroni does
something similar minus the unix socket part).

A plethora of systems are built around the notion of programs calling
other programs and managing the process' life cycle. The limiting factor
to such architecture tends to be the OS's process control semantics and
API (notoriously weaker or more contrived on non-unix-like OSs), but
that's not postgres' fault.

--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology

--

Notice:
This email is confidential and may contain copyright material of
members of the Ocado Group. Opinions and views expressed in this message
may not necessarily reflect the opinions and views of the members of the
Ocado Group.

If you are not the intended recipient, please notify us
immediately and delete all copies of this message. Please note that it is
your responsibility to scan this message for viruses.

References to the
"Ocado Group" are to Ocado Group plc (registered in England and Wales with
number 7098618) and its subsidiary undertakings (as that expression is
defined in the Companies Act 2006) from time to time. The registered office
of Ocado Group plc is Buildings One & Two, Trident Place, Mosquito Way,
Hatfield, Hertfordshire, AL10 9UL.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

tutiluren@tutanota.com

Дата:

22 сентября 2020 г., 22:28:14

Sep 21, 2020, 7:53 PM by jd@commandprompt.com:

I have to agree that pg_dump is largely a step child backup program. It has consistently been found over the years to be lacking in a number of areas. Unfortunately, working on pg_dump isn't sexy and it is difficult to get volunteers or even paid resources to do such a thing. The real solution for pg_dump is a complete refactor which includes pg_dumpall and it is not a small undertaking. It should be noted that it is also a less and less used program. On our team it is normally used for only very specific needs (grabbing a schema) and we use binary backups or logical replication to receive specific data.

Huh? Are you saying that there is another, superior way to back up PostgreSQL databases other than pg_dump? I re-read the manual on it just now, but didn't see a single word about it being "legacy" or "deprecated" or even that there's any other way to do it. What do you mean?

3. The ability to embed PG to run in an automatic, quiet manner as part of something else. I know about SQLite, but it's extremely limited to the point of being virtually useless IMO, which is why I cannot use that for anything nontrivial. I want my familiar PostgreSQL, only not require it to be manually and separately installed on the machine where it is to run as part of some "application". If I could just "embed" it, this would allow me to create a single EXE which I can simply put on a different machine to run my entire "system" which otherwise takes *tons* of tedious, error-prone manual labor to install, set up and maintain. Of course, this is probably much easier said than done, but I don't understand why PG's architecture necessarily dictates that PG must be a stand-alone, separate thing. Or rather, why some "glue" cannot enable it to be used just like SQLite from a *practical* perspective, even if it still is a "server-client model" underneath the hood. (Which doesn't matter at all to me, nor should it matter to anyone else.)

This is really using the wrong tool for the job type of issue. PG was never designed for such a scenario.

I hate the "wrong tool for the job" argument. It assumes that everyone has infinite time, energy and brain capacity to learn endless redundant tools just to "use the right tool for the job" rather than "use what you actually know". I know PG. I don't know SQLite. They are very different. So obviously, I want to use PG.

What exactly makes PG unsuitable for this? I don't get it. But at the same time, I also realize that it's not going to happen at this point. The entire concept of a desktop computer appears to be phased out as we speak...

4. There is no built-in means to have PG manage (or even suggest) indexes on its own. Trying to figure out what indexes to create/delete/fine-tune, and determine all the extremely complex rules for this art (yes, I just called index management an *art*, because it is!), is just utterly hopeless to me. It never gets any easier. Not even after many years. It's the by far worst part of databases to me (combined with point five). Having to use third-party solutions ensures that it isn't done in practice, at least for me. I don't trust, nor do I want to deal with, external software and extensions in my databases. I still have nightmares from PostGIS, which I only keep around, angrily, out of absolute necessity. I fundamentally don't like third-party add-ons to things, but want the core product to properly support things. Besides, this (adding/managing indexes) is not even some niche/obscure use-case, but something which is crucial for basically any nontrivial database of any kind!

I think you are looking at this from a very windows centric way. Open Source has its origins from the Unix paradigm where each tool was designed to solve one type of problem and you used multiple tools to create a "solution". Though we have strayed from that on some items due to the evolving nature of software needs, that is still at our core and for good reason. Having tools, flags etc... to do such things (including your point #3) creates complexity best left to "vendors" not the software project.

While I understand what you mean, and even agree in theory, in practice, this always results in crappy third-party solutions which I don't want to deal with. PostGIS, for example, forces me to use "postgis" for its schema instead of "PostGIS" just because they arrogantly didn't construct their internal queries properly. "Little" things like that.

The practical end result of this is that I've always gone back to using the untouched default configuration file (except for the logging-related options), which, especially in the past on FreeBSD, *severely* crippled my PG database to not even come close to taking advantage of the full power of the hardware. Instead, it felt like I was using maybe 1% of the machine's power, even with a proper database design and indexes and all of that stuff, simply because the default config was so "conservative" and it couldn't be just set to "use whatever resources are available".

Not to be unkind but this does seem lazy. There are literally hundreds of "how to make postgres go fast", "how to optimize postgres" if you take 15 minutes to Google. It is true that the project (outside of the wiki) doesn't have much information in the official documentation but that doesn't mean that the information is not available.

Hundreds of crappy, outdated, confusing, badly written "web tutorialz" are worth nothing. A couple of clear, unambiguous documentation paragraphs are worth their (metaphorical) weight in gold.

Claiming that "the information is out there" is just hand-waving. It's shifting the burden to the user to actively hunt for information, and very likely be misled by all the garbage articles out there. I learned some horrible practices early on from "web tutz" and it took me many years to unlearn that stuff.

I know that writing documentation isn't fun, but it's necessary. Also, my overall point was to not even have to deal with the specifics, but just be able to tell PG with a single config option that it's allowed to "use most of the machine's resources".

I wish so much for PG to have a mode where it self-tunes itself as needed, over time, based on the actual workload, or at least allowed some kind of abstract "performance mode" such as: "you are allowed to use significant system resources, PG", or: "You are one of my most important applications. Just use as much power as you currently need, but at least save about 10% for the rest of the system, will you?" Maybe this is also harder than it sounds to accomplish, but for somebody like me who has zero funding, I cannot hire some professional to sit down with me and fine-tune my system for $899/hour.

See my comment about Google. The information is out there and easy to find.

I guess I'm the worst idiot in the world, then, who can't DuckDuckGo (Google is evil) it even after 15 years.

Seriously, I didn't type my feedback "for fun". It may be difficult for very intelligent people to understand (as often is the case, because you operate on a whole different level), but the performance-related PostgreSQL configuration options are a *nightmare* to me and many others. I spent *forever* reading about them and couldn't make any sense of it all. Each time I tried, I would give up, frustrated and angry, with no real clue what "magic numbers" it wanted.

It's quite baffling to me how this can be so difficult for you all to understand. Even if we disregard the sheer intelligence factor, it's clear that users of PG don't have the same intimate knowledge of PG's internals as the PG developers, nor could possibly be expected to.

As mentioned, I kept going back to the default configuration over and over again. Anyone who doesn't is either a genius or pretends/thinks that they understand it. (Or I'm extremely dumb.)

Very often, I get the feeling that things like that are the way they are on purpose. Work security and whatnot. But it's very frustrating for people like me who can't afford to buy help and don't have the enormous brain capacity necessary to comprehend the complex relations between the numerous performance-related config options. It really is that difficult.

Discord and Slack

Those modern services don't even let me load them. It's the same thing with everything these days: "verify with phone", "we've detected suspicious activity", "fake error message", etc.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Adrian Klaver

Дата:

22 сентября 2020 г., 22:47:56

On 9/22/20 3:28 PM, tutiluren@tutanota.com wrote:
> 
> Sep 21, 2020, 7:53 PM by jd@commandprompt.com:
> 
>     I have to agree that pg_dump is largely a step child backup program.
>     It has consistently been found over the years to be lacking in a
>     number of areas. Unfortunately, working on pg_dump isn't sexy and it
>     is difficult to get volunteers or even paid resources to do such a
>     thing. The real solution for pg_dump is a complete refactor which
>     includes pg_dumpall and it is not a small undertaking. It should be
>     noted that it is also a less and less used program. On our team it
>     is normally used for only very specific needs (grabbing a schema)
>     and we use binary backups or logical replication to receive specific
>     data.
> 
> Huh? Are you saying that there is another, superior way to back up 
> PostgreSQL databases other than pg_dump? I re-read the manual on it just 
> now, but didn't see a single word about it being "legacy" or 
> "deprecated" or even that there's any other way to do it. What do you mean?

It is not deprecated or legacy and it is still used by many. The issue 
is that people try to do 'partial' dumps without reading the docs on 
what those switches actually do and the potential conflicts therein. For 
large installations there it is often better to use some form of 
replication(binary and/or logical) to maintain a continuous backup. 
pg_dump can take a long time on very big databases, so it may not keep up.


> 
>     This is really using the wrong tool for the job type of issue. PG
>     was never designed for such a scenario.
> 
> I hate the "wrong tool for the job" argument. It assumes that everyone 
> has infinite time, energy and brain capacity to learn endless redundant 
> tools just to "use the right tool for the job" rather than "use what you 
> actually know". I know PG. I don't know SQLite. They are very different. 
> So obviously, I want to use PG.

This comes down to what your definition of embedded is?  A matter of 
determining whether we are talking apples or oranges.


> 
> What exactly makes PG unsuitable for this? I don't get it. But at the 
> same time, I also realize that it's not going to happen at this point. 
> The entire concept of a desktop computer appears to be phased out as we 
> speak...

> 
> While I understand what you mean, and even agree in theory, in practice, 
> this always results in crappy third-party solutions which I don't want 
> to deal with. PostGIS, for example, forces me to use "postgis" for its 
> schema instead of "PostGIS" just because they arrogantly didn't 
> construct their internal queries properly. "Little" things like that.

Huh? A schema is just a name space, why does it matter how the extension 
chooses to define it? I mean you could have number of permutations of 
postgis.

> 
>         The practical end result of this is that I've always gone back
>         to using the untouched default configuration file (except for
>         the logging-related options), which, especially in the past on
>         FreeBSD, *severely* crippled my PG database to not even come
>         close to taking advantage of the full power of the hardware.
>         Instead, it felt like I was using maybe 1% of the machine's
>         power, even with a proper database design and indexes and all of
>         that stuff, simply because the default config was so
>         "conservative" and it couldn't be just set to "use whatever
>         resources are available".
> 
> 
>     Not to be unkind but this does seem lazy. There are literally
>     hundreds of "how to make postgres go fast", "how to optimize
>     postgres" if you take 15 minutes to Google. It is true that the
>     project (outside of the wiki) doesn't have much information in the
>     official documentation but that doesn't mean that the information is
>     not available.
> 
> Hundreds of crappy, outdated, confusing, badly written "web tutorialz" 
> are worth nothing. A couple of clear, unambiguous documentation 
> paragraphs are worth their (metaphorical) weight in gold.

The problem is the 'unambiguous' part. There are so many ways Postgres 
is used, writing a configuration doc that suited everyone would have so 
many if/and/or/buts that it would define ambiguity. If you want that 
information post an email here with the information on your server specs 
and proposed usage.

> 
> Claiming that "the information is out there" is just hand-waving. It's 
> shifting the burden to the user to actively hunt for information, and 
> very likely be misled by all the garbage articles out there. I learned 
> some horrible practices early on from "web tutz" and it took me many 
> years to unlearn that stuff.
> 
> I know that writing documentation isn't fun, but it's necessary. Also, 
> my overall point was to not even have to deal with the specifics, but 
> just be able to tell PG with a single config option that it's allowed to 
> "use most of the machine's resources".

That would entail building an AI into  the code that would deal with all 
the possible OS(versions), Postgres(versions), hardware permutations.

> 
>         I wish so much for PG to have a mode where it self-tunes itself
>         as needed, over time, based on the actual workload, or at least
>         allowed some kind of abstract "performance mode" such as: "you
>         are allowed to use significant system resources, PG", or: "You
>         are one of my most important applications. Just use as much
>         power as you currently need, but at least save about 10% for the
>         rest of the system, will you?" Maybe this is also harder than it
>         sounds to accomplish, but for somebody like me who has zero
>         funding, I cannot hire some professional to sit down with me and
>         fine-tune my system for $899/hour.
> 
> 
>     See my comment about Google. The information is out there and easy
>     to find.
> 
> I guess I'm the worst idiot in the world, then, who can't DuckDuckGo 
> (Google is evil) it even after 15 years.
> 
> Seriously, I didn't type my feedback "for fun". It may be difficult for 
> very intelligent people to understand (as often is the case, because you 
> operate on a whole different level), but the performance-related 
> PostgreSQL configuration options are a *nightmare* to me and many 
> others. I spent *forever* reading about them and couldn't make any sense 
> of it all. Each time I tried, I would give up, frustrated and angry, 
> with no real clue what "magic numbers" it wanted.
> 
> It's quite baffling to me how this can be so difficult for you all to 
> understand. Even if we disregard the sheer intelligence factor, it's 
> clear that users of PG don't have the same intimate knowledge of PG's 
> internals as the PG developers, nor could possibly be expected to.
> 
> As mentioned, I kept going back to the default configuration over and 
> over again. Anyone who doesn't is either a genius or pretends/thinks 
> that they understand it. (Or I'm extremely dumb.)

Does your server runs to your satisfaction with the default settings?
If not what issues do you see?

> 
> Very often, I get the feeling that things like that are the way they are 
> on purpose. Work security and whatnot. But it's very frustrating for 
> people like me who can't afford to buy help and don't have the enormous 
> brain capacity necessary to comprehend the complex relations between the 
> numerous performance-related config options. It really is that difficult.
> 
>     Discord and Slack
> 
> Those modern services don't even let me load them. It's the same thing 
> with everything these days: "verify with phone", "we've detected 
> suspicious activity", "fake error message", etc.


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

tutiluren@tutanota.com

Дата:

23 сентября 2020 г., 18:51:39

I have to agree that pg_dump is largely a step child backup program.
It has consistently been found over the years to be lacking in a
number of areas. Unfortunately, working on pg_dump isn't sexy and it
is difficult to get volunteers or even paid resources to do such a
thing. The real solution for pg_dump is a complete refactor which
includes pg_dumpall and it is not a small undertaking. It should be
noted that it is also a less and less used program. On our team it
is normally used for only very specific needs (grabbing a schema)
and we use binary backups or logical replication to receive specific
data.

Huh? Are you saying that there is another, superior way to back up PostgreSQL databases other than pg_dump? I re-read the manual on it just now, but didn't see a single word about it being "legacy" or "deprecated" or even that there's any other way to do it. What do you mean?

It is not deprecated or legacy and it is still used by many. The issue is that people try to do 'partial' dumps without reading the docs on what those switches actually do and the potential conflicts therein. For large installations there it is often better to use some form of replication(binary and/or logical) to maintain a continuous backup. pg_dump can take a long time on very big databases, so it may not keep up.

Well, I've never done partial backups, frankly because I don't trust it to actually work. Full backups may be more wasteful with disk space, but at least "makes sense" in some abstract way. (Also, I try to keep as little data as possible, so they are never *gigantic*.)

This is really using the wrong tool for the job type of issue. PG
was never designed for such a scenario.

I hate the "wrong tool for the job" argument. It assumes that everyone has infinite time, energy and brain capacity to learn endless redundant tools just to "use the right tool for the job" rather than "use what you actually know". I know PG. I don't know SQLite. They are very different. So obviously, I want to use PG.

This comes down to what your definition of embedded is? A matter of determining whether we are talking apples or oranges.

Just what I said in my original question. Something which is bundled and invisible to the user.

While I understand what you mean, and even agree in theory, in practice, this always results in crappy third-party solutions which I don't want to deal with. PostGIS, for example, forces me to use "postgis" for its schema instead of "PostGIS" just because they arrogantly didn't construct their internal queries properly. "Little" things like that.

Huh? A schema is just a name space, why does it matter how the extension chooses to define it? I mean you could have number of permutations of postgis.

I'm saying that PostGIS has a bug due to incorrectly constructed internal queries which makes it impossible to properly name the schema where PostGIS is to reside, causing my database to look very ugly when it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And that was an example of how sloppy/bad third-party things always are, and is one reason why I don't like it when I have to rely on "extensions".

I know that writing documentation isn't fun, but it's necessary. Also, my overall point was to not even have to deal with the specifics, but just be able to tell PG with a single config option that it's allowed to "use most of the machine's resources".

That would entail building an AI into the code that would deal with all the possible OS(versions), Postgres(versions), hardware permutations.

I... guess. If "AI" means "a series of ifs". Which is what software... is? I doubt that people who can make the world's most advanced open source database cannot check the amount of RAM and see how fast the CPU/disk is.

Does your server runs to your satisfaction with the default settings?

Right now, yes, but that says very little as I'm the only user of it. I've had many nightmares in the past, however, where even determining whether the changes in the config did anything (good or bad) has been impossible. I fundamentally don't like the idea that the config is so "conservative" (crippled) with no obvious/easy way to "set a different general mode". If you honestly think that the numerous performance-related options are easy to understand, I don't know what to say.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Rob Sargent

Дата:

23 сентября 2020 г., 19:06:29

> 
> 
>     That would entail building an AI into the code that would deal with
>     all the possible OS(versions), Postgres(versions), hardware
>     permutations.
> 
> I... guess. If "AI" means "a series of ifs". Which is what software... 
> is? I doubt that people who can make the world's most advanced open 
> source database cannot check the amount of RAM and see how fast the 
> CPU/disk is.

But the very nature of "embedded" implies the db is subservient to 
something else, and that something else may need most of the resources.

But I heartily agree that tuning a server can be difficult.  It's highly 
dependent on the context and hardware resources is only a portion of 
that context.  I've seen generous help on this list for specific 
configuration help.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Adrian Klaver

Дата:

23 сентября 2020 г., 20:38:19

On 9/23/20 11:51 AM, tutiluren@tutanota.com wrote:

>     It is not deprecated or legacy and it is still used by many. The
>     issue is that people try to do 'partial' dumps without reading the
>     docs on what those switches actually do and the potential conflicts
>     therein. For large installations there it is often better to use
>     some form of replication(binary and/or logical) to maintain a
>     continuous backup. pg_dump can take a long time on very big
>     databases, so it may not keep up.
> 
> Well, I've never done partial backups, frankly because I don't trust it 
> to actually work. Full backups may be more wasteful with disk space, but 
> at least "makes sense" in some abstract way. (Also, I try to keep as 
> little data as possible, so they are never *gigantic*.)

Well not partial as in incremental. Instead dump only some portion of 
the schema with or without its associated data.

>     This comes down to what your definition of embedded is? A matter of
>     determining whether we are talking apples or oranges.
> 
> Just what I said in my original question. Something which is bundled and 
> invisible to the user.

That is going to be difficult with Postgres as it is its own process. It 
would take a good bit of tooling to hide that from the user.

>     Huh? A schema is just a name space, why does it matter how the
>     extension chooses to define it? I mean you could have number of
>     permutations of postgis.
> 
> I'm saying that PostGIS has a bug due to incorrectly constructed 
> internal queries which makes it impossible to properly name the schema 
> where PostGIS is to reside, causing my database to look very ugly when 
> it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And 
> that was an example of how sloppy/bad third-party things always are, and 
> is one reason why I don't like it when I have to rely on "extensions".

If that is the sum of your issues with PostGIS then I really don't have 
much sympathy. They are extensions so you aren't required to use them 
and rely on their way of doing things. You have the choice of writing 
your own code/extension or do without completely. I personally find 
compromise on my part wrt to extensions greatly reduces the amount of 
coding/effort I have to do to move forward. Honestly, I don't see how it 
is different from the compromises that arise out of using the core 
project itself.

>     That would entail building an AI into the code that would deal with
>     all the possible OS(versions), Postgres(versions), hardware
>     permutations.
> 
> I... guess. If "AI" means "a series of ifs". Which is what software... 
> is? I doubt that people who can make the world's most advanced open 
> source database cannot check the amount of RAM and see how fast the 
> CPU/disk is.

It is more then that. It would have to take into account the behavior 
changes that happen in Postgres between major versions. It also would 
have to account for OS specific parameters and the changes that happen 
there between OS versions. It also would need to 'know' how the database 
was going to be used; readonly, heavy writes, etc. Also how the database 
should play with other programs on the same machine. Add to the mix 
containers, cloud instances and so on and you are outrunning the ability 
of 'ifs' to handle it.

> 
>     Does your server runs to your satisfaction with the default settings?
> 
> Right now, yes, but that says very little as I'm the only user of it. 
> I've had many nightmares in the past, however, where even determining 
> whether the changes in the config did anything (good or bad) has been 
> impossible. I fundamentally don't like the idea that the config is so 
> "conservative" (crippled) with no obvious/easy way to "set a different 
> general mode". If you honestly think that the numerous 
> performance-related options are easy to understand, I don't know what to 
> say.

The thing is 'general mode' is going to mean something different to 
someone running a database in the MB-low GB range vs. high GB vs. TB vs. 
PB.

As to monitoring see:

https://www.postgresql.org/docs/12/monitoring-stats.html

https://www.postgresql.org/docs/12/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT

I know it is an extension, still it is useful:

https://www.postgresql.org/docs/12/pgstatstatements.html

-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Tony Shelver

Дата:

24 сентября 2020 г., 06:13:29

On 9/23/20 11:51 AM, tutiluren@tutanota.com wrote:

> Huh? A schema is just a name space, why does it matter how the
> extension chooses to define it? I mean you could have number of
> permutations of postgis.
>
> I'm saying that PostGIS has a bug due to incorrectly constructed
> internal queries which makes it impossible to properly name the schema
> where PostGIS is to reside, causing my database to look very ugly when
> it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And
> that was an example of how sloppy/bad third-party things always are, and

> is one reason why I don't like it when I have to rely on "extensions".

Ummmm? I have PostGIS installed in my core app schema, in part because at the time I didn't know what I was doing. Better to have been in public...

You may also want to look at comparable ANSI (standards based) database products (Oracle for example) when it comes to the use of case in naming conventions. Different products handle things in different ways.

You may want to google around the issue, for example https://postgis.net/2017/11/07/tip-move-postgis-schema/ for moving schemas.

You may want to do some research on where PostGIS comes from. It could never have been developed as a core part of Postgres, so the fact that products like PostGIS are so domain specific.

The fact that the Postgesql extension system is so flexible and robust was probably a key factor in the choice it's choice in the development of PostGIS.

Effectively, you may lose a bit, but you gain a whole lot more. My suggestion is 'live with it'.

Or, move to a product that suits your use cases / desires better, But, good luck finding another open source "free" (or any) product with the functionality, robusiness and performance of PostGIS / Postgresql.

We tried the MS SQLServer equivalent, the install and use there is way nastier than PostGIS. Both the use and the performance sucked in comparison for our use cases.

MySQL's equivalent is nowhere near as functional, robust, as well documented or as widely used.

Oracle's equivalent is probably the closest. Pity that installing Oracle and their products as a whole is a nightmare, and rather wallet-draining...

Have a look at GIS / Mapping projects around the world, a majority are implemented on PostGIS. Openstreetmap is probably the biggest (think open source version of Google Maps), and it moved to PostGIS from MySQL several years ago.

We did a lot of research into PostGIS, as GIS / tracking is a core part of our business.

We didn't find a product that could compare on

Maturity
Functionality
Performance
Cost
Documentation
Support (huge community)

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Gavin Flower

Дата:

24 сентября 2020 г., 10:11:52

On 24/09/2020 18:13, Tony Shelver wrote:
>
>
>     On 9/23/20 11:51 AM, tutiluren@tutanota.com
>     <mailto:tutiluren@tutanota.com> wrote:
>
> >     Huh? A schema is just a name space, why does it matter how the
> >     extension chooses to define it? I mean you could have number of
> >     permutations of postgis.
> >
> > I'm saying that PostGIS has a bug due to incorrectly constructed
> > internal queries which makes it impossible to properly name the schema
> > where PostGIS is to reside, causing my database to look very ugly when
> > it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And
> > that was an example of how sloppy/bad third-party things always are, and
> > is one reason why I don't like it when I have to rely on "extensions".
>
>
All lowercase is good, as you don't have to remember which bits are 
capitalized.  And besides, there are far more important issues to 
consider as Tony has covered in detail.  It is definitely not a bug!

[..]

> Oracle's equivalent is probably the closest.  Pity that installing 
> Oracle and their products as a whole is a nightmare, and rather 
> wallet-draining...
>
> Have a look at GIS / Mapping projects around the world, a majority are 
> implemented on PostGIS.  Openstreetmap is probably the biggest (think 
> open source version of Google Maps), and it moved to PostGIS from 
> MySQL several years ago.
> We did a lot of research into PostGIS, as GIS / tracking is a core 
> part of our business.
> We didn't find a product that could compare on
>
>   * Maturity
>   * Functionality
>   * Performance
>   * Cost
>   * Documentation
>   * Support (huge community)
>
>
I remember going to free seminars promoting the Oracle database over 25 
years ago, and the only thing I can remembers now is the pie charts 
saying how much revenue Oracle had versus all the others.  Never  how 
many transactions, size of databases, number of users, nor any really 
useful metric -- just how much Oracle was being paid!

To be honest, I've heard many bad things about Oracle, and rarely 
anything good.

There are many reasons for going with PostgreSQL and PostGIS, but you 
are free to use something else if you prefer.


Cheers,
Gavin

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Bruce Momjian

Дата:

24 сентября 2020 г., 21:52:46

On Sun, Sep 20, 2020 at 01:15:26PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane wrote:
> >> On the other hand, the very same thing could be said of database names
> >> and role names, yet we have never worried much about whether those were
> >> encoding-safe when viewed from databases with different encodings, nor
> >> have there been many complaints about the theoretical unsafety.  So maybe
> >> this is just overly anal-retentive and we should drop the restriction,
> >> or at least pass through data that doesn't appear to be invalidly
> >> encoded.
> 
> > I think the issue is that role and database names are controlled by
> > privileged users, while application_name is not.
> 
> That's certainly an argument against a completely laissez-faire approach,
> but if we filtered invalidly-encoded data on the reading side, it seems
> like we would be in good enough shape.

Yes, if we want to filter, sure.  I thought we were not 100% able to
filter, but I guess if it safe, we can do it.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Bruce Momjian

Дата:

24 сентября 2020 г., 23:20:01

On Wed, Sep 23, 2020 at 12:28:14AM +0200, tutiluren@tutanota.com wrote:
> Sep 21, 2020, 7:53 PM by jd@commandprompt.com:
>     See my comment about Google. The information is out there and easy to find.
> 
> I guess I'm the worst idiot in the world, then, who can't DuckDuckGo (Google is
> evil) it even after 15 years.
> 
> Seriously, I didn't type my feedback "for fun". It may be difficult for very
> intelligent people to understand (as often is the case, because you operate on
> a whole different level), but the performance-related PostgreSQL configuration
> options are a *nightmare* to me and many others. I spent *forever* reading
> about them and couldn't make any sense of it all. Each time I tried, I would
> give up, frustrated and angry, with no real clue what "magic numbers" it
> wanted.
> 
> It's quite baffling to me how this can be so difficult for you all to
> understand. Even if we disregard the sheer intelligence factor, it's clear that
> users of PG don't have the same intimate knowledge of PG's internals as the PG
> developers, nor could possibly be expected to.
> 
> As mentioned, I kept going back to the default configuration over and over
> again. Anyone who doesn't is either a genius or pretends/thinks that they
> understand it. (Or I'm extremely dumb.)

I think there is a clear dependency that people reading the docs,
particularly for performance purposes, must have an existing knowledge
of a lot of low-level things --- this could be the cause of your
frustration.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

tutiluren@tutanota.com

Дата:

25 сентября 2020 г., 04:41:45

(I can't believe I'm replying to this drivel...)

> Huh? A schema is just a name space, why does it matter how the
> extension chooses to define it? I mean you could have number of
> permutations of postgis.
>
> I'm saying that PostGIS has a bug due to incorrectly constructed
> internal queries which makes it impossible to properly name the schema
> where PostGIS is to reside, causing my database to look very ugly when
> it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And
> that was an example of how sloppy/bad third-party things always are, and
> is one reason why I don't like it when I have to rely on "extensions".
All lowercase is good

That's your personal opinion -- not some kind of fact, and it definitely goes against everything that I believe.

, as you don't have to remember which bits are capitalized.

Yes, so let's remove all upper-case letters, punctuation, grammar and everything else as well. Let's just communicate with grunts. That way, we don't have to remember all of this unnecessary stuff which provides meaningless complexity to this existence...

And besides, there are far more important issues to consider as Tony has covered in detail.

And yet you focus on it, ignoring everything of importance that I had written...

It is definitely not a bug!

It is the very *definition* of a bug. The PostgreSQL manual CLEARLY states that you are to use double quotes around all identifiers or else it will turn them into lowercase. Please at least try to have some minimum knowledge of what you are talking about before trying to correct people.

The fact that there are a lot of sloppy, US-centric people who refuse to use correctly named identifiers and cannot understand how there can be anything besides a-z in an alphabet, doesn't change reality and doesn't make a bug "right".

There are many reasons for going with PostgreSQL and PostGIS, but you are free to use something else if you prefer.

"Something else"? If I "prefer"? Why waste my and your own time to send something when you clearly didn't read what you are replying to?

The answer, of course, is that you have zero interest in helping but 100% interest in making smug insults, like so many other people out there. It's incredibly tiresome to constantly have to wade through useless noise from horribly ignorant people, while getting further and further away from actually getting answers or meaningful feedback on the actual thing you asked.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Ron

Дата:

25 сентября 2020 г., 04:47:10

On 9/24/20 6:20 PM, Bruce Momjian wrote:
> On Wed, Sep 23, 2020 at 12:28:14AM +0200, tutiluren@tutanota.com wrote:
>> Sep 21, 2020, 7:53 PM by jd@commandprompt.com:
>>      See my comment about Google. The information is out there and easy to find.
>>
>> I guess I'm the worst idiot in the world, then, who can't DuckDuckGo (Google is
>> evil) it even after 15 years.
>>
>> Seriously, I didn't type my feedback "for fun". It may be difficult for very
>> intelligent people to understand (as often is the case, because you operate on
>> a whole different level), but the performance-related PostgreSQL configuration
>> options are a *nightmare* to me and many others. I spent *forever* reading
>> about them and couldn't make any sense of it all. Each time I tried, I would
>> give up, frustrated and angry, with no real clue what "magic numbers" it
>> wanted.
>>
>> It's quite baffling to me how this can be so difficult for you all to
>> understand. Even if we disregard the sheer intelligence factor, it's clear that
>> users of PG don't have the same intimate knowledge of PG's internals as the PG
>> developers, nor could possibly be expected to.
>>
>> As mentioned, I kept going back to the default configuration over and over
>> again. Anyone who doesn't is either a genius or pretends/thinks that they
>> understand it. (Or I'm extremely dumb.)
> I think there is a clear dependency that people reading the docs,
> particularly for performance purposes, must have an existing knowledge
> of a lot of low-level things --- this could be the cause of your
> frustration.

And that's a serious problem with the documentation. (Not that I know how to 
fix it in an OSS project.)

-- 
Angular momentum makes the world go 'round.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

tutiluren@tutanota.com

Дата:

25 сентября 2020 г., 04:49:32

Sep 24, 2020, 8:13 AM by tshelver@gmail.com:

On 9/23/20 11:51 AM, tutiluren@tutanota.com wrote:
> Huh? A schema is just a name space, why does it matter how the
> extension chooses to define it? I mean you could have number of
> permutations of postgis.
>
> I'm saying that PostGIS has a bug due to incorrectly constructed
> internal queries which makes it impossible to properly name the schema
> where PostGIS is to reside, causing my database to look very ugly when
> it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And
> that was an example of how sloppy/bad third-party things always are, and
> is one reason why I don't like it when I have to rely on "extensions".

Ummmm? I have PostGIS installed in my core app schema, in part because at the time I didn't know what I was doing. Better to have been in public...

You may also want to look at comparable ANSI (standards based) database products (Oracle for example) when it comes to the use of case in naming conventions. Different products handle things in different ways.

You may want to google around the issue, for example https://postgis.net/2017/11/07/tip-move-postgis-schema/ for moving schemas.

You may want to do some research on where PostGIS comes from. It could never have been developed as a core part of Postgres, so the fact that products like PostGIS are so domain specific.
The fact that the Postgesql extension system is so flexible and robust was probably a key factor in the choice it's choice in the development of PostGIS.

It's painfully clear that people don't read anymore. I don't know what to say other than: "This has nothing to do with what I said."

My suggestion is 'live with it'.

And my point was that third-party extensions always are sloppy/bad in some way, and what I described was a perfect example of such a thing. It's insulting beyond words to be forced to make an ugly schema called "postgis" just because of a BUG in PostGIS, when all my other schemas are named properly and PostgreSQL has ZERO problems calling it "PostGIS" -- it's PostGIS that has a BUG in it that makes it impossible to use if you name it properly.

Or, move to a product that suits your use cases / desires better

Always the same thing. The slightest criticism, no matter how warranted, always results in: "Fine. Go somewhere else. Use something else." Never: "Oh, right. Sorry, but we always used lowercase ourselves and therefore didn't consider this. In retrospect, it's an embarrassing mistake! We'll fix it in the next release. Thanks for pointing that out."

, But, good luck finding another open source "free" (or any) product with the functionality, robusiness and performance of PostGIS / Postgresql.

I didn't say there is, nor do I think so. It wasn't my point at all...

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

tutiluren@tutanota.com

Дата:

25 сентября 2020 г., 05:40:11

Well not partial as in incremental. Instead dump only some portion of the schema with or without its associated data.

It's funny that you should bring that up, considering how it was one of my points... See the point about pg_dump's bug on Windows.

I'm saying that PostGIS has a bug due to incorrectly constructed internal queries which makes it impossible to properly name the schema where PostGIS is to reside, causing my database to look very ugly when it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And that was an example of how sloppy/bad third-party things always are, and is one reason why I don't like it when I have to rely on "extensions".

If that is the sum of your issues with PostGIS then I really don't have much sympathy.

Why does nobody understand that it was an *example* and not some kind of full PostGIS review?

They are extensions so you aren't required to use them and rely on their way of doing things. You have the choice of writing your own code/extension or do without completely.

It sure is great to have such choices... I can't take it seriously when people say things like this. It's similar to "it's open source so you can easily vet it yourself". It's not taking reality into consideration at all.

As for doing without it, that would make it impossible to deal with GPS coordinates/maps. So it's not really a choice at all.

That would entail building an AI into the code that would deal with
all the possible OS(versions), Postgres(versions), hardware
permutations.

I... guess. If "AI" means "a series of ifs". Which is what software... is? I doubt that people who can make the world's most advanced open source database cannot check the amount of RAM and see how fast the CPU/disk is.

It is more then that. It would have to take into account the behavior changes that happen in Postgres between major versions. It also would have to account for OS specific parameters and the changes that happen there between OS versions. It also would need to 'know' how the database was going to be used; readonly, heavy writes, etc. Also how the database should play with other programs on the same machine. Add to the mix containers, cloud instances and so on and you are outrunning the ability of 'ifs' to handle it.

If it changes that much, it's far, far worse than I even thought, and it sounds like it will be pointless to even *try* to learn it as it keeps changing between versions/OSes/other stuff.

I can't help but feel as if people just don't want to answer this and other concerns I have. As if there's some silent agreement along the lines of "securing PG DBAs' jobs".

Does your server runs to your satisfaction with the default settings?

Right now, yes, but that says very little as I'm the only user of it. I've had many nightmares in the past, however, where even determining whether the changes in the config did anything (good or bad) has been impossible. I fundamentally don't like the idea that the config is so "conservative" (crippled) with no obvious/easy way to "set a different general mode". If you honestly think that the numerous performance-related options are easy to understand, I don't know what to say.

The thing is 'general mode' is going to mean something different to someone running a database in the MB-low GB range vs. high GB vs. TB vs. PB.

I don't mean this to sound rude, but it's like talking to a wall... What I mean is that there are obviously technical means for software to know whether they are exhausting the system they are running on or not, and expecting people to understand all these intricate internal parameters is just... bizarre. There ought to be some kind of "abstract" setting for those of us who aren't able to (or even *wish* to) comprehend all the PG internals, and just want an efficient database using (roughly) as much of our machine as we want.

This is not the first time I feel like I'm repeating myself over and over in different ways but never getting through. It could be that you are so familiar with PG's internals that it all is obvious to you, but it could just as well be that you don't want to hear about this.

"There's plenty of guides" and "the information is out there" doesn't help me and all the other people who have stuck with the default config and thus a massively restricted PG database for all these years. Just because it's easy to you doesn't mean it's easy to everyone else. Just dealing with composing efficient-enough SQL queries and designing an optimized database structure is (way) more than enough work for most of us. I don't have the luxury of some hired DBA who sits all day tuning the PG server. Besides, I've already explained the privacy issues with that even if I had the money...

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Thomas Kellerer

Дата:

25 сентября 2020 г., 08:04:53

tutiluren@tutanota.com schrieb am 25.09.2020 um 06:41:
>     All lowercase is good
>
> That's your personal opinion -- not some kind of fact, and it
> definitely goes against everything that I believe.
You have to accept that every programming environment has its own best practices
(in terms of coding style AND naming conventions). The recommendation to use all
lower-case unquoted names is not Gavin's personal opinion, it's the opinion
of the Postgres community.

In Postgres and many other relational databases - with SQL Server and MySQL being the (only?) exception -
using non-quoted identifiers is the recommended way. In Oracle, DB2 or Firebird this leads to all uppercase
names, in Postgres to all lowercase names.

If you don't follow the recommended best practices, you can't really blame those that do (e.g. the PostGIS project),
for your problems.


> The fact that there are a lot of sloppy, US-centric people who refuse
> to use correctly named identifiers and cannot understand how there
> can be anything besides a-z in an alphabet, doesn't change reality
> and doesn't make a bug "right".

I am not in the USA (and far from being US-centric as well) and I have been working
with relational databases for over thirty years. I never had problems using unquoted
ASCII names (using snake_case) for my database objects.

Would it be nice if I could use special characters like öäü in the names of tables and columns (without the hassle of
quotingthem)? 
Yes, absolutely.

Does not using them, limit me in any way doing what I want to do?
No, it doesn't.

> Always the same thing. The slightest criticism, no matter how
> warranted, always results in: "Fine. Go somewhere else. Use something
> else."
> Never: "Oh, right. Sorry, but we always used lowercase ourselves and
> therefore didn't consider this. In retrospect, it's an embarrassing
> mistake! We'll fix it in the next release. Thanks for pointing that
> out."

Changing this behaviour has been discussed on this list multiple times.

The problem is, that currently no one sees a way to change this without
breaking all (or most) existing code that relies on it (at least not with
the resources the project hast).

It's my understanding (as a user, not a developer) that the impact on the code base would be huge, and
the community doesn't indeed really see a benefit in doing so.

It has especially been discussed to implement a behaviour that complies
with the SQL standard which *requires* to fold non-quoted names to uppercase!

Would you be more happy with a standard compliant behaviour?
I guess not.

So, this is indeed one of the things that you either have to accept, or move on.

*Every* piece of software has some quirks (or bugs as you see it) which are annoying
and can't or won't be changed. If the number of quirks (or bugs) exceeds the benefit
you get from the software, then you should indeed start looking for something different.

But claiming that the people on this list are not open to criticism is simply wrong and
doesn't do them justice.
You might want to search the internet on the reaction to Uber's change from Postgres
to MySQL - many of the criticism from their side have resulted in changes to
Postgres. And there are many other examples (e.g. Windows port, replication, vacuum problems).

You are unlucky to be offended by the best practices when it comes to naming
database objects. I think having a "case preserving, case insensitive"
option in Postgres would be nice indeed, but it's not a _functional_ problem,
it's just an aesthetic one. Sticking to the recommended best practices
doesn't limit you in any way in exploiting all features of the software.


> The answer, of course, is that you have zero interest in helping but
> 100% interest in making smug insults,
The only person who is insulting other people is you.

But you apparently have zero interest in accepting that different environments, need different approaches.
And what works in "System A" does not work the same in "System B".


Thomas

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Adrian Klaver

Дата:

25 сентября 2020 г., 15:02:52

On 9/25/20 1:04 AM, Thomas Kellerer wrote:

> 
>> The fact that there are a lot of sloppy, US-centric people who refuse
>> to use correctly named identifiers and cannot understand how there
>> can be anything besides a-z in an alphabet, doesn't change reality
>> and doesn't make a bug "right".
> 
> I am not in the USA (and far from being US-centric as well) and I have 
> been working
> with relational databases for over thirty years. I never had problems 
> using unquoted
> ASCII names (using snake_case) for my database objects.
> 
> Would it be nice if I could use special characters like öäü in the names 
> of tables and columns (without the hassle of quoting them)?
> Yes, absolutely.
> 

But you can use them without quoting:

select version();
                                       version 

------------------------------------------------------------------------------------
  PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 
7.5.0, 64-bit

create table öäü (id int , fld_1 varchar);

insert into öäü values (1, 'test');

select * from öäü;
  id | fld_1
----+-------
   1 | test


> 
> Thomas
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Adrian Klaver

Дата:

25 сентября 2020 г., 15:39:50

On 9/24/20 10:40 PM, tutiluren@tutanota.com wrote:
> 
>     Well not partial as in incremental. Instead dump only some portion
>     of the schema with or without its associated data.
> 
> It's funny that you should bring that up, considering how it was one of 
> my points... See the point about pg_dump's bug on Windows.

Yes, I read your bug 
report(https://www.postgresql.org/message-id/MCk38UV--3-2@tutanota.com) 
and you where just as resistant to advice there as here.

> 
>         I'm saying that PostGIS has a bug due to incorrectly constructed
>         internal queries which makes it impossible to properly name the
>         schema where PostGIS is to reside, causing my database to look
>         very ugly when it has to say "postgis" instead of "PostGIS" for
>         PostGIS's schema. And that was an example of how sloppy/bad
>         third-party things always are, and is one reason why I don't
>         like it when I have to rely on "extensions".
> 
> 
>     If that is the sum of your issues with PostGIS then I really don't
>     have much sympathy.
> 
> Why does nobody understand that it was an *example* and not some kind of 
> full PostGIS review?
> 
>     They are extensions so you aren't required to use them and rely on
>     their way of doing things. You have the choice of writing your own
>     code/extension or do without completely.
> 
> It sure is great to have such choices... I can't take it seriously when 
> people say things like this. It's similar to "it's open source so you 
> can easily vet it yourself". It's not taking reality into consideration 
> at all.
> 
> As for doing without it, that would make it impossible to deal with GPS 
> coordinates/maps. So it's not really a choice at all.

Read this issue from PostGIS:

https://trac.osgeo.org/postgis/ticket/3496

" Then we can use the variable @extschema@ in lieu of the actual schema 
name. This will still allow users to do:

CREATE EXTENSION postgis SCHEMA whereever_I_damn_want_you_to_be;
"


>     It is more then that. It would have to take into account the
>     behavior changes that happen in Postgres between major versions. It
>     also would have to account for OS specific parameters and the
>     changes that happen there between OS versions. It also would need to
>     'know' how the database was going to be used; readonly, heavy
>     writes, etc. Also how the database should play with other programs
>     on the same machine. Add to the mix containers, cloud instances and
>     so on and you are outrunning the ability of 'ifs' to handle it.
> 
> If it changes that much, it's far, far worse than I even thought, and it 
> sounds like it will be pointless to even *try* to learn it as it keeps 
> changing between versions/OSes/other stuff.

Life is not static. If you want to stay current you have to keep learning.

> 
> I can't help but feel as if people just don't want to answer this and 
> other concerns I have. As if there's some silent agreement along the 
> lines of "securing PG DBAs' jobs".

The reason people have not answered is you have not provided the 
information necessary to formulate an answer. For instance, OS & 
version, Postgres version, size of data, database use case, number of 
users.


>     The thing is 'general mode' is going to mean something different to
>     someone running a database in the MB-low GB range vs. high GB vs. TB
>     vs. PB.
> 
> I don't mean this to sound rude, but it's like talking to a wall... What 
> I mean is that there are obviously technical means for software to know 
> whether they are exhausting the system they are running on or not, and 
> expecting people to understand all these intricate internal parameters 
> is just... bizarre. There ought to be some kind of "abstract" setting 
> for those of us who aren't able to (or even *wish* to) comprehend all 
> the PG internals, and just want an efficient database using (roughly) as 
> much of our machine as we want.
> 
> This is not the first time I feel like I'm repeating myself over and 
> over in different ways but never getting through. It could be that you 
> are so familiar with PG's internals that it all is obvious to you, but 
> it could just as well be that you don't want to hear about this.

No I am not all that familiar with the Postgres internals. I'm an end 
user for what qualifies as small databases. The configuration as is 
works for me, in that any impediments it might impose are hidden by 
other parts of the stack. I just know, from years on this list, that 
there are folks who would help you with configuration given some 
starting point information.



-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Thomas Kellerer

Дата:

25 сентября 2020 г., 15:55:17

Adrian Klaver schrieb am 25.09.2020 um 17:02:
>> Would it be nice if I could use special characters like öäü in the names of tables and columns (without the hassle
ofquoting them)? 
>> Yes, absolutely.
>>
> But you can use them without quoting:
>
> select version();
>                                        version
> ------------------------------------------------------------------------------------
>   PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 7.5.0, 64-bit
>
> create table öäü (id int , fld_1 varchar);
>
> insert into öäü values (1, 'test');
>
> select * from öäü;
>   id | fld_1
> ----+-------
>    1 | test

Ah cool ;)

I didn't know that, thanks.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Johannes Graën

Дата:

25 сентября 2020 г., 19:46:30

On 9/25/20 7:40 AM, tutiluren@tutanota.com wrote:
> I don't mean this to sound rude, but it's like talking to a wall... What
> I mean is that there are obviously technical means for software to know
> whether they are exhausting the system they are running on or not, and
> expecting people to understand all these intricate internal parameters
> is just... bizarre. There ought to be some kind of "abstract" setting
> for those of us who aren't able to (or even *wish* to) comprehend all
> the PG internals, and just want an efficient database using (roughly) as
> much of our machine as we want.

I found pgtune [1] to be a very good start for tuning PG. There's also a
tool available online [2].

[1] https://github.com/gregs1104/pgtune
[2] https://pgtune.leopard.in.ua/

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Bruce Momjian

Дата:

25 сентября 2020 г., 19:49:23

On Thu, Sep 24, 2020 at 11:47:10PM -0500, Ron wrote:
> On 9/24/20 6:20 PM, Bruce Momjian wrote:
> > On Wed, Sep 23, 2020 at 12:28:14AM +0200, tutiluren@tutanota.com wrote:
> > > Sep 21, 2020, 7:53 PM by jd@commandprompt.com:
> > >      See my comment about Google. The information is out there and easy to find.
> > > 
> > > I guess I'm the worst idiot in the world, then, who can't DuckDuckGo (Google is
> > > evil) it even after 15 years.
> > > 
> > > Seriously, I didn't type my feedback "for fun". It may be difficult for very
> > > intelligent people to understand (as often is the case, because you operate on
> > > a whole different level), but the performance-related PostgreSQL configuration
> > > options are a *nightmare* to me and many others. I spent *forever* reading
> > > about them and couldn't make any sense of it all. Each time I tried, I would
> > > give up, frustrated and angry, with no real clue what "magic numbers" it
> > > wanted.
> > > 
> > > It's quite baffling to me how this can be so difficult for you all to
> > > understand. Even if we disregard the sheer intelligence factor, it's clear that
> > > users of PG don't have the same intimate knowledge of PG's internals as the PG
> > > developers, nor could possibly be expected to.
> > > 
> > > As mentioned, I kept going back to the default configuration over and over
> > > again. Anyone who doesn't is either a genius or pretends/thinks that they
> > > understand it. (Or I'm extremely dumb.)
> > I think there is a clear dependency that people reading the docs,
> > particularly for performance purposes, must have an existing knowledge
> > of a lot of low-level things --- this could be the cause of your
> > frustration.
> 
> And that's a serious problem with the documentation. (Not that I know how to
> fix it in an OSS project.)

We added a glossary in PG 13, so we could certainly have some kind of
hardware terms primer which explains various OS features that affect
Postgres.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Bruce Momjian

Дата:

25 сентября 2020 г., 19:50:13

On Fri, Sep 25, 2020 at 10:04:53AM +0200, Thomas Kellerer wrote:
> I am not in the USA (and far from being US-centric as well) and I have been working
> with relational databases for over thirty years. I never had problems using unquoted
> ASCII names (using snake_case) for my database objects.
> 
> Would it be nice if I could use special characters like öäü in the names of tables and columns (without the hassle of
quotingthem)?
 
> Yes, absolutely.
> 
> Does not using them, limit me in any way doing what I want to do?
> No, it doesn't.
> 
> > Always the same thing. The slightest criticism, no matter how
> > warranted, always results in: "Fine. Go somewhere else. Use something
> > else."
> > Never: "Oh, right. Sorry, but we always used lowercase ourselves and
> > therefore didn't consider this. In retrospect, it's an embarrassing
> > mistake! We'll fix it in the next release. Thanks for pointing that
> > out."
> 
> Changing this behaviour has been discussed on this list multiple times.
> 
> The problem is, that currently no one sees a way to change this without
> breaking all (or most) existing code that relies on it (at least not with
> the resources the project hast).
> 
> It's my understanding (as a user, not a developer) that the impact on the code base would be huge, and
> the community doesn't indeed really see a benefit in doing so.
> 
> It has especially been discussed to implement a behaviour that complies
> with the SQL standard which *requires* to fold non-quoted names to uppercase!

I did write a blog entry about case folding:

    https://momjian.us/main/blogs/pgblog/2020.html#June_26_2020

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Sam Gendler

Дата:

26 сентября 2020 г., 03:09:49

On Thu, Sep 24, 2020 at 10:40 PM <tutiluren@tutanota.com> wrote:

Well not partial as in incremental. Instead dump only some portion of the schema with or without its associated data.
It's funny that you should bring that up, considering how it was one of my points... See the point about pg_dump's bug on Windows.

And you seem to have ignored the fact that one of the core developers pointed out that it likely isn't a pg_dump bug - if your terminal is using the same locale as the database, it should have no difficulty dealing with the characters you are having trouble with. It seems likely that you simply need to learn how to get your terminal set up correctly for it to work.

I'm saying that PostGIS has a bug due to incorrectly constructed internal queries which makes it impossible to properly name the schema where PostGIS is to reside, causing my database to look very ugly when it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And that was an example of how sloppy/bad third-party things always are, and is one reason why I don't like it when I have to rely on "extensions".
They are extensions so you aren't required to use them and rely on their way of doing things. You have the choice of writing your own code/extension or do without completely.
It sure is great to have such choices... I can't take it seriously when people say things like this. It's similar to "it's open source so you can easily vet it yourself". It's not taking reality into consideration at all.

As for doing without it, that would make it impossible to deal with GPS coordinates/maps. So it's not really a choice at all.

Never mind that your tone in your emails is remarkably rude for someone who is doing nothing but complain about perceived shortfalls in a product that is entirely free and of which you appear to be fairly far from an expert user, has it occurred to you that YOUR issue is that you have absolutely no understanding of the variety of uses that people put a database like postgres to out in the world? Imagine if the core database server was subject to the development schedule of every extension that you happen to think ought to be included in the core product - or vice versa. Someone finds a bug in PostGIS and it can't be fixed until the next major or minor release of the core postgres server? That would result in a terrible user experience and force development of all extensions to move in lockstep with the core server. The vast majority of users do not ever use GIS extensions and have absolutely no use for their presence in the core product. The fact that you need it is your one individual use case. I've been using Postgres professionally since 2005 and I've only ever used postgis for a single project. Who are you to say what should and should not be included in a product to which you contribute absolutely nothing? And in the same breath, you complain about the quality of that extension. If it is so lacking in quality, why would you want it embedded in the core service where it could potentially impact the quality of every user's experience? Maybe consider that the dev team has good reasons for keeping non-essential functionality in extensions instead of the core codebase.

That would entail building an AI into the code that would deal with
all the possible OS(versions), Postgres(versions), hardware
permutations.

I... guess. If "AI" means "a series of ifs". Which is what software... is? I doubt that people who can make the world's most advanced open source database cannot check the amount of RAM and see how fast the CPU/disk is.

It is more then that. It would have to take into account the behavior changes that happen in Postgres between major versions. It also would have to account for OS specific parameters and the changes that happen there between OS versions. It also would need to 'know' how the database was going to be used; readonly, heavy writes, etc. Also how the database should play with other programs on the same machine. Add to the mix containers, cloud instances and so on and you are outrunning the ability of 'ifs' to handle it.
If it changes that much, it's far, far worse than I even thought, and it sounds like it will be pointless to even *try* to learn it as it keeps changing between versions/OSes/other stuff.

I can't help but feel as if people just don't want to answer this and other concerns I have. As if there's some silent agreement along the lines of "securing PG DBAs' jobs".

You really come off as something of an ass with this commentary. DBAs have plenty of far more interesting things to do with their time than performance tune a server config and certainly don't need that in order to be secure in their role. What is clear is that you have absolutely no understanding of the variety of ways in which a database server might be configured depending on the way the database is being used, the quantity of data, the rate of queries, the size of queries, the other applications that must share the host, the speed and quantity of storage systems, etc. If it is such an easy problem to solve, then I'm sure everyone here would encourage you to build it yourself and contribute it back to the project - all of us who have spent years learning how to fine tune the performance of postgresql would surely thank you for taking a big chunk of what is otherwise basically toil off our plates. Take a look at the performance mailing list some time and look at the effort even people with decades of postgres experience go to in order to determine optimal settings for new hardware and then tell us how easy it would be to just automatically configure a server with magic values that work for everyone. You are just flat-out wrong, here, and you lack the expertise or experience to even understand that you don't know enough to have a valid opinion on this point. Everyone else has tried to let you know politely, but your tone has ticked me off so I'm just telling it to you straight.

The thing is 'general mode' is going to mean something different to someone running a database in the MB-low GB range vs. high GB vs. TB vs. PB.
I don't mean this to sound rude, but it's like talking to a wall... What I mean is that there are obviously technical means for software to know whether they are exhausting the system they are running on or not, and expecting people to understand all these intricate internal parameters is just... bizarre. There ought to be some kind of "abstract" setting for those of us who aren't able to (or even *wish* to) comprehend all the PG internals, and just want an efficient database using (roughly) as much of our machine as we want.

Well, you most definitely DO sound rude, and you have since your first email on this thread. And there are most definitely NOT technical means for software to know whether it is exhausting a system or not. Just for starters, it has no idea if it is sharing server resources with other processes or if all resources are dedicated to itself. If disk response is suddenly slow, is that because some index or table just hit critical mass and caused a knee in a performance curve that might be tuned out or is it because someone just started bulk copying a bunch of data to the same network attached storage that the database is using? There is no way to tell if a query is slow because other queries were occupying all of the CPU, because something else sucked up all the memory and the OS is swapping, or because some other process just consumed 90% of CPU resources on the host. The database doesn't know if 99% of your queries are going to be to a bunch of 1MB tables or to the one 40TB table that is actually just being written for auditing purposes and will rarely, if ever be queried on the production host. Never mind the differences between data warehousing workloads vs OLTP workloads.

Just maybe, the folks around here, many with decades of postgresql experience, know more than you do about the variety of uses a database might be put to, not to mention the variety of settings that go into tuning for them, and have come to the conclusion that it is a problem too complicated to solve via automation. Instead of arguing with them, and then complaining about how hard it is for YOU - someone who almost certainly is smarter than any algorithm is likely to be - cannot figure it out. Maybe take some time to do some reading and learning, set up a system and do some benchmarking of various config changes, and figure it out. Honestly, it really isn't all that hard, but it IS time consuming. Skill acquisition usually is. That's why employers pay well the people who do it. Not only are there already tools available for 'automatically' tuning your database based on your projected utilization and available resources, but many of us have somehow managed to acquire the knowledge you seek via the documentation and the shared knowledgebase that is the internet. Instead of berating the developers, how about just asking for assistance and then using that assistance to further your own knowledge. No one is going to be quick to volunteer to assist someone who just angrily writes accusatory complaints that not enough was done to proactively assist them - as if the folks on this mailing list don't have anything better to do with their time than to drop everything to help you - and yet they do. Instead of being angry about it, maybe thank them instead.

The postgresql docs are reference docs. They are written in a manner which documents available APIs and capabilities in a thorough manner. As a result, they aren't that useful for learning which bits and pieces are most useful for doing the thing that you want to do. But there are countless books, blog posts, mailing lists, and other resources available to you - just as they are for almost every other open source project of significance. Can you name a database that is even remotely as capable as PostgreSQL which is significantly more easily administered than PostgreSQL? Oracle? Nope, it is MASSIVELY more difficult. SQL Server? Nope, also a pain in the neck to tune and, in my opinion as something of a novice user of that system, a whole heck of a lot more poorly documented when it comes to administrative tasks. MySQL? Not really. Maybe slightly simpler to configure if you don't need all of the functionality that postgres brings to the table, but otherwise every bit as hard to manage. If you want Microsoft Access simplicity, you should use Microsoft Access. There's a reason all of those databases have manually managed configurations that take lots of experience to become fully competent with. The only way to simplify configuration is to simplify functionality, and you would inevitably complain about that if it happened.

"There's plenty of guides" and "the information is out there" doesn't help me and all the other people who have stuck with the default config and thus a massively restricted PG database for all these years. Just because it's easy to you doesn't mean it's easy to everyone else. Just dealing with composing efficient-enough SQL queries and designing an optimized database structure is (way) more than enough work for most of us. I don't have the luxury of some hired DBA who sits all day tuning the PG server. Besides, I've already explained the privacy issues with that even if I had the money...

None of us had the luxury of having some hired DBA who sat all day tuning the PG server. That's why we sat down and acquired those skills for ourselves. And I'm reasonably certain that none of us sit around all day idly modifying database configs during our copious spare time. Complex systems are hard. Welcome to your career. The only thing that will keep you gainfully employed for the next decades is if you learn how to quickly and effectively acquire new skills (and not piss of whole communities of potential support in the process). And you know what is a damn good way to do that? Read books about the tech. Read the mailing lists for the tech. And then contribute something useful to that tech. Feel free to fix any and/or all of the perceived defects that you seem to think are so trivial to resolve. I can promise you that if your implementations are of high quality and don't break other functionality or diverge from the project's core mission, there's a high likelihood that they'll be merged into the codebase eventually.

I just typed PostgreSQL into my oreilly bookshelf and came up with over 3000 hits. The first page is entirely high quality books published by reputable publishers within the last 3 years, covering pg 11 or 12. The first one is called "Mastering PostgreSQL 12" and seems a likely candidate. A few rows farther down, I find "PostgreSQL Configuration: Best Practices for Performance and Security" which was published less than 6 months ago and is surely entirely up to date. I refuse to believe that you could possibly have so much difficulty learning how to configure a system from a book like that, which almost certainly gives you step by step guidance for determining optimal values for the most relevant settings for use cases similar to yours. It's probably worth pointing out that there's likely a 99% chance that the authors of those books are reading your emails on this list. So maybe go buy, borrow, or check out one of their books and learn how to configure your system. Unless you pay them an hourly rate, no one here is going to configure your system for you.

There's my $0.02 (and then some).

--sam

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

John DeSoi

Дата:

26 сентября 2020 г., 19:33:11

> On Sep 22, 2020, at 5:28 PM, tutiluren@tutanota.com wrote:
>
> I hate the "wrong tool for the job" argument. It assumes that everyone has infinite time, energy and brain capacity
tolearn endless redundant tools just to "use the right tool for the job" rather than "use what you actually know". I
knowPG. I don't know SQLite. They are very different. So obviously, I want to use PG. 

https://duckdb.org/docs/why_duckdb

Maybe this would provide something in between. I have not used it, but it sounds interesting. Based on SQLite engine
anduses the Postgres SQL parser. Lots of built-in functions appear to be the same or very similar to Postgres. 

John DeSoi, Ph.D.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

tutiluren@tutanota.com

Дата:

01 октября 2020 г., 23:08:13

On Thu, Sep 24, 2020 at 10:40 PM <tutiluren@tutanota.com> wrote:

Well not partial as in incremental. Instead dump only some portion of the schema with or without its associated data.
It's funny that you should bring that up, considering how it was one of my points... See the point about pg_dump's bug on Windows.

And you seem to have ignored the fact that one of the core developers pointed out that it likely isn't a pg_dump bug - if your terminal is using the same locale as the database, it should have no difficulty dealing with the characters you are having trouble with. It seems likely that you simply need to learn how to get your terminal set up correctly for it to work.

Yeah, this isn't rude or insulting at all...

Funny how my "incorrectly set up terminal" works perfectly for all other programs and my own test scripts, but not for pg_dump specifically. And only when using "special" characters. As already pointed out multiple times, in great detail, to deaf ears. Very interesting how you can manage to twist and bend that into it still somehow being "my fault". Because of course it cannot be pg_dump's fault. Absolutely not. It is unthinkable. It's the "rude user"'s fault who had the audacity to point out yet another PG bug which more than likely won't *ever* be fixed, as it's not even recognized, much less cared about. Probably because they *want* PostgreSQL to be crippled on Windows, judging by the responses in the past and how incredibly broken the joke of an installer is.

You should call it "Linux software with minimal pre-alpha Windows support" instead of pretending that it's cross-platform, and that goes for many FOSS projects as well which think exactly the same as you. The fact that I still use this garbage OS (Windows) speaks volumes of how incredibly crappy Linux is, which is utterly *unusable*.

But of course I should be grateful no matter what because it doesn't cost money. Because my time and energy is worthless. And the competition is "even worse", so that means I cannot point out any fault, ever, no matter how serious or how easily it could be fixed. I should just shut up and thank everyone for insulting me through carelessness and words. Or "fix it myself", because that's obviously an option as I haven't done it so far...

I did read the rest of your e-mail, but it would be pointless to reply to it as you clearly have the mentality that everyone should dedicate their lives to configuring a database and buying books instead of using it, because everyone should be core developers and everything must always be cryptic and difficult and blablabla. I'm sick of this attitude, and especially of being called "rude" by such rude-beyond-words people.

It would be refreshing to hear your be honest for once and just admit that you *want* it to be difficult. You *like* that there's a high threshold and it makes you feel superior to exclude "dumb" people who can't figure out all these cryptic (and downright broken) things. I truly believe that this is the reason for a lot of "weird" things which seem to make no sense on the surface.

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Christophe Pettus

Дата:

01 октября 2020 г., 23:24:40

> On Oct 1, 2020, at 16:08, tutiluren@tutanota.com wrote:
> But of course I should be grateful no matter what because it doesn't cost money.

No one is asking you to be grateful.  However, you are asking for other people to do things that important to you, but
notthem.  They are not required to do so.  If you cannot persuade them, and are not in a position to pay them, then
that'sa reality you'll just have to accept. 

--
-- Christophe Pettus
   xof@thebuild.com

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Adrian Klaver

Дата:

02 октября 2020 г., 00:57:01

On 10/1/20 4:08 PM, tutiluren@tutanota.com wrote:
> 

> Yeah, this isn't rude or insulting at all...
> 

> It would be refreshing to hear your be honest for once and just admit 
> that you *want* it to be difficult. You *like* that there's a high 
> threshold and it makes you feel superior to exclude "dumb" people who 
> can't figure out all these cryptic (and downright broken) things. I 
> truly believe that this is the reason for a lot of "weird" things which 
> seem to make no sense on the surface.

Yes, if it makes you fell better we had a secret meeting where we 
decided to make your life and only your life a living hell when it came 
to dealing with Postgres.


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

"Peter J. Holzer"

Дата:

03 октября 2020 г., 07:54:40

On 2020-09-23 13:38:19 -0700, Adrian Klaver wrote:
> On 9/23/20 11:51 AM, tutiluren@tutanota.com wrote:
> >     This comes down to what your definition of embedded is? A matter of
> >     determining whether we are talking apples or oranges.
> >
> > Just what I said in my original question. Something which is bundled and
> > invisible to the user.
>
> That is going to be difficult with Postgres as it is its own process. It
> would take a good bit of tooling to hide that from the user.

I guess that comes down to the definition of "invisible". PostgreSQL
certainly won't be invisible to a user running «ps -e» on Linux or using
the task manager on Windows. It also won't be invisible to someone who
examines the file system and discovers a directory structure with
PostgreSQL's rather distinctive file names (the latter is of course true
for SQLite, too).

But I don't see any great difficulty in bundling PostgreSQL with an
application, such that it is automatically installed, configured,
populated and managed by the application. The user might never notice
that it's there at all.

But this is the responsibility of the application's author: The author
can put work into an installer or they can decide that it's better (for
some definition of "better") to tell the user to install PostgreSQL
themselves. It has very little to do with PostgreSQL.

> It is more then that. It would have to take into account the behavior
> changes that happen in Postgres between major versions. It also would have
> to account for OS specific parameters and the changes that happen there
> between OS versions. It also would need to 'know' how the database was going
> to be used;

This is IMHO the most important point: Memory, CPUs, OS versions: There
is only so much variety. But how the database is used - that can be very
different. For example, on our largest database server I have work_mem
set to 1/8 of the total RAM. This is not something I would recommend
generally, but it works well for our workload. And most importantly it
isn't something that can be determined statically - it depends on the
behaviour of the application. So either you already know that (which I,
as the author of our application do (mostly), but a Postgres tuning tool
can't) or you observe the application for some time (weeks, probably)
and adjust parameters (this is something a tool could do, and maybe
better than a human, but this is getting into AI territory).

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Вложения

signature.asc

Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

От

Achilleas Mantzios

Дата:

04 октября 2020 г., 21:40:49

On 2/10/20 2:08 π.μ., tutiluren@tutanota.com wrote:

On Thu, Sep 24, 2020 at 10:40 PM <tutiluren@tutanota.com> wrote:

Well not partial as in incremental. Instead dump only some portion of the schema with or without its associated data.
It's funny that you should bring that up, considering how it was one of my points... See the point about pg_dump's bug on Windows.

And you seem to have ignored the fact that one of the core developers pointed out that it likely isn't a pg_dump bug - if your terminal is using the same locale as the database, it should have no difficulty dealing with the characters you are having trouble with. It seems likely that you simply need to learn how to get your terminal set up correctly for it to work.
Yeah, this isn't rude or insulting at all...

Funny how my "incorrectly set up terminal" works perfectly for all other programs and my own test scripts, but not for pg_dump specifically. And only when using "special" characters. As already pointed out multiple times, in great detail, to deaf ears. Very interesting how you can manage to twist and bend that into it still somehow being "my fault". Because of course it cannot be pg_dump's fault. Absolutely not. It is unthinkable. It's the "rude user"'s fault who had the audacity to point out yet another PG bug which more than likely won't *ever* be fixed, as it's not even recognized, much less cared about. Probably because they *want* PostgreSQL to be crippled on Windows, judging by the responses in the past and how incredibly broken the joke of an installer is.

You should call it "Linux software with minimal pre-alpha Windows support" instead of pretending that it's cross-platform, and that goes for many FOSS projects as well which think exactly the same as you. The fact that I still use this garbage OS (Windows) speaks volumes of how incredibly crappy Linux is, which is utterly *unusable*.

But of course I should be grateful no matter what because it doesn't cost money. Because my time and energy is worthless. And the competition is "even worse", so that means I cannot point out any fault, ever, no matter how serious or how easily it could be fixed. I should just shut up and thank everyone for insulting me through carelessness and words. Or "fix it myself", because that's obviously an option as I haven't done it so far...

I did read the rest of your e-mail, but it would be pointless to reply to it as you clearly have the mentality that everyone should dedicate their lives to configuring a database and buying books instead of using it, because everyone should be core developers and everything must always be cryptic and difficult and blablabla. I'm sick of this attitude, and especially of being called "rude" by such rude-beyond-words people.

It would be refreshing to hear your be honest for once and just admit that you *want* it to be difficult. You *like* that there's a high threshold and it makes you feel superior to exclude "dumb" people who can't figure out all these cryptic (and downright broken) things. I truly believe that this is the reason for a lot of "weird" things which seem to make no sense on the surface.

I'd say take your time, take some deep breaths and decide that's good for you. Back in 2004 and after 3 yrs of full production software with postgresql someone from the mailing list (he's also in this thread!!) called me a "newbie", and I immediately started looking for alternatives, only to find out simply that there was no better DB software in the market/world back then (and still as we speak). So I stayed with PGSQL and wrote what I believe the best non-tcp-ip DB replication solution for marine and shipping business (over satellite), which still thrives today as far as easiness, automation, completeness, correctness and cost are concerned.

+ I discover every day that I am still a newbie, after 20 yrs with postgresql. This is not personal, this is about being successful in the long run, one should weigh his options and act accordingly. It took me/us a long time before we spent a single penny on someone to write or fix code that would work for us, but this moment eventually came, there is a roof when going with community software. That roof came for us much much later since we begun using PostgreSQL. Value for money is so hard to beat. I have seen the code by our MS SQL partners , some interesting and serious things happening there but when they listen what stock free pgsql can do they just freak out (the ones who understand).

So my advice, tell your CEO's the true potential of this technology and maybe show them some stats, some results of others, some numbers. Put them side by side with the rest of serious solutions and then decide.

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

Вложения