Re: A Modest Upgrade Proposal

From: Craig Ringer
Subject: Re: A Modest Upgrade Proposal
Date:
Msg-id: CAMsr+YGXJJhwyrK=NZyT8aoTdQzQfvUQyFfZdobQyLt6zJVAPQ@mail.gmail.com
In reply to: Re: A Modest Upgrade Proposal (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On 14 July 2016 at 03:06, Robert Haas <robertmhaas@gmail.com> wrote:
 
Physical replication has
the same issue.  Users don't want to configure archive_command and
wal_keep_segments and max_wal_senders and wal_level and set up an
archive and create recovery.conf on the standby.  They want to spin up
a new standby - and we don't provide any way to just do that. [...snip...] 
Similarly, when the master fails, users want to promote a
standby (either one they choose or the one that is determined to be
furthest ahead) and remaster the others and that's not something you
can "just do".
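
To make the quoted list of knobs concrete, here is an illustrative sketch (not from the thread) of what a 9.x-era physical standby setup involves; the hostname, user, and archive path are placeholders:

```ini
# postgresql.conf on the master
wal_level = hot_standby              # 'replica' on later releases
max_wal_senders = 5
wal_keep_segments = 64
archive_mode = on
archive_command = 'cp %p /path/to/archive/%f'

# recovery.conf on the standby
standby_mode = 'on'
primary_conninfo = 'host=master.example.com user=replication'
restore_command = 'cp /path/to/archive/%f %p'
```

None of this is assembled for the user automatically; each piece must be configured by hand, which is exactly the complaint being made.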

Oh, I absolutely agree. But that's some pretty epic scope creep, and weren't you just saying we should cut logical replication to the bone to get the bare minimum in, letting users deal with keeping table definitions in sync, etc? We've got a development process where it takes a year to get even small changes in - mostly for good reasons, but it means there's little sense in tying one feature to much bigger ones.

I often feel like with PostgreSQL we give users a box and some assembly instructions, rather than a complete system. But rather than getting a bike in a box with a manual, you get the frame in the box, and a really good manual on how to use the frame, plus some notes to take a look elsewhere to find wheels, brakes, and a seat, plus an incomplete list of eight different wheel, brake and seat types. Many of which won't work well together or only work for some conditions.

But damn, we make a good bike frame, and we document the exact stress tolerances of the forks!

Similarly, for logical replication, users will want to do things like
(1) spin up a new logical replication slave out of thin air,
replicating an entire database or several databases or selected
replication sets within selected databases; or (2) subscribe an
existing database to another server, replicating an entire database or
several databases; or (3) repoint an existing subscription at a new
server after a master change or dump/reload, resynchronizing table
contents if necessary; or (4) stop replication, either with or without
dropping the local copies of the replicated tables.  (This is not an
exhaustive list, I'm sure.)

Yep, and all of that's currently either fiddly or impossible.

To do some of those things even remotely well takes a massive amount more infrastructure, though. Lots of people here will dismiss that, like they always do for things like connection pooling, by saying "an external tool can do that". Yeah, it can, but it sucks: you get eight different tools that each solve 80% of the problem (a different subset each), with erratic docs, patchy maintenance, and plenty of bugs. But OTOH, even if we all agreed Pg should have magic self-healing auto-provisioning auto-discovering auto-scaling logical replication magic, there's a long path from that to delivering even the basics. Look at how long 2ndQ people have been working on just getting the basic low-level mechanisms in place. There have been process issues there too, but most of it comes down to sheer scale and the difficulty of doing it in a co-ordinated, community-friendly way that produces a robust result.

In addition to host management, you've also got little things like a way to dump schemas from multiple DBs and unify them in a predictable, consistent way, then keep them up to date as the schemas on each upstream change, while blocking changes that can't be merged into the downstream (or allowing the downstream to fail). Since right now our schema copy mechanism is "run this binary and feed the SQL it produces into the other DB", we're rather a long way from there!
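
That "run this binary and feed the SQL into the other DB" mechanism is, concretely, something like the following; the host and database names are placeholders, and this is a one-shot copy with no ongoing tracking of upstream schema changes:

```
# Dump only the schema from the upstream and replay the generated SQL
# on the downstream. One-shot, no conflict detection, no incremental sync.
pg_dump --schema-only --no-owner -h upstream.example.com mydb \
  | psql -h downstream.example.com mydb
```

Anything smarter - merging schemas from several upstreams, or rejecting unmergeable DDL - has to be built on top of that text stream, which is why it's so far from the high-level tooling being discussed.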


I don't mean to imply that the existing designs are bad as far as they
go.  In each case, the functionality that has been built is good.  But
it's all focused, as it seems to me, on providing capabilities rather
than on providing a way for users to manage a group of database
servers using high-level primitives.

100% agree.

BDR tried to get part-way there, but has as many problems as solutions, and to get that far it imposes a lot of restrictions. It's great for one set of use cases but has to be used carefully and with a solid understanding.

Many of the limitations and restrictions imposed by BDR are because of limitations in the core server that make a smoother, more transparent solution unfeasible. Like with DDL management, our issues with full table rewrites, cluster-wide vs database-specific DDL, etc etc etc.

That higher-level stuff largely
gets left to add-on tools, which I don't think is serving us
particularly well.

+1
 
Those add-on tools often find that the core
support doesn't quite do everything they'd like it to do: that's why
WAL-E and repmgr, for example, end up having to do some creative
things to deliver certain features.  We need to start thinking of
groups of servers rather than individual servers as the unit of
deployment.

Yes... but it's a long path there, and we'll need to progressively build server infrastructure to make that possible.

There's also the issue that most companies who work in the PostgreSQL space have their own tools and have their own interests to protect. We could pretend that wasn't the case, but we'd still trip over the elephant we're refusing to see. 

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
