Re: Integrating Replication into Core

From	Markus Schiltknecht
Subject	Re: Integrating Replication into Core
Date	28 November 2006, 09:21:20
Msg-id	456C3777.8030902@bluegap.ch
In reply to	Re: Integrating Replication into Core (Andrew Sullivan <ajs@crankycanuck.ca>)
Responses	Re: Integrating Replication into Core
List	pgsql-hackers
Hello Andrew,

Andrew Sullivan wrote:
> On Sat, Nov 25, 2006 at 11:05:34AM -0800, Joshua D. Drake wrote:
>> Actually I don't buy this argument.

Neither do I. I can only reiterate that interfacing with the database
backend is *not* the problem. I've been porting Postgres-R forward since
7.4, and only a few changes have been necessary since then. And using a
decent version control system simplifies the task of propagating changes
from CVS HEAD to my branch. The few conflicts that arose were mostly
trivial to resolve (renaming or slight calling convention changes).

Andrew Sullivan wrote:
> Ok, good.  So why isn't Postgres-R something we have _now_?

(I note you don't count my version of Postgres-R (8); that might be
reasonable depending on your definition of 'having Postgres-R'.)

I can't speak for others, but I just don't have much spare time left.
And it's a complex matter involving lots of corner cases like network
outages, crashes of the replication manager or GCS daemon, etc. Testing
and making it production grade software really takes a lot of time. IMO
this is where replication solutions could work together, because all of
them need to simulate a cluster somehow, to test their project. But this
certainly has nothing to do with PostgreSQL Core.

Another point for me is that the feedback I got on Postgres-R since
Toronto is very close to zero. Some people haven't even noticed that
there is Postgres-R code for 8.2. Or they don't count my variant for
some reason. For example, Tom Lane recently pointed to Postgres-R as an
example of code drift in [1]. No offense, it's just very contradictory
to the hype around replication.

> The work that I've seen on it, so far (and I speak as someone who
> invested a significant amount of staff time, cash money, and --
> frankly -- "political" credibility in software based on that idea) is
> that there isn't a way to make it production-grade without pretty
> severe constraints on what it can do.

Right, the Postgres-R algorithm has limitations. And it certainly does
not fit all use cases. The Toronto meeting opened my eyes in that
respect, and I'm thankful for that.

> It was that unhappy discovery that led me to say, "Can we please
> _write down_ what we think 'replication' might require, and what the
> trade-offs can be?"  I'm trying to write requirements in public here;
> but all I get is silence.  This frustrates me partly because, as
> someone who stuck his neck out to make sure Slony was released as
> free software, I hear a lot of demands for features people apparently
> want without much in the way of design proposals -- never mind code -- 
> to achieve those features.  When Jan delivered the initial release of
> Slony, it was preceded by a design doc.  I note on -hackers long
> emails from (for example) Tom doing something very similar when
> proposing a major feature.  What I'm trying to do is to get the
> replication-interested community of PostgreSQL users to say "here's
> what we mean by 'replication'" before we all go off inventing the
> grammar.  We need to have a clue about the domain of discourse before
> we start settling the variable assignments.

As you have surely noticed, I've been going back and forth with Bruce
about replication for the documentation. I've been doing that because I
wanted to clarify what 'replication' is and what we are talking about
when we say 'multi-master replication' or 'data partitioning', etc.

Sadly, only very few people from the 'replication interested community'
took part in that discussion. I've even been trying to get more of them
involved.

> It seems to me that every single replication discussion on -hackers
> amounts to a bunch of futile attempts by colour blind people (of
> which I am one) to describe the colour 'high note', while their
> interlocutors describe the sound 'red'.  I'm trying to get us to say
> what it would mean even to do the describing.
>
> Specifying requirements for what software is supposed to do is one of
> those thankless tasks that everyone complains is never done in the
> free software community.  I am offering, earnestly, to do that.  I
> just need a few people to tell me what _they think_ the software in
> question ought to do.  I set up a mailing list.  I have solicited
> comments.  I'm not sure what else to do, but so far, I have the
> positive remarks of Jose (GORDA), the remarks of Markus (which amount
> to "this is a waste of time", unless I misread him), and nothing
> else.

I'm sorry if this sounded that negative. Defining what software is
supposed to do is certainly necessary, especially as long as replication
discussions on -hackers look like what you described above. So we had
better first define what we mean, to make sure we are talking about the
same thing when speaking of 'multi-master replication', for example.

Please note that I've never raised my voice against that. I'm just
saying: it's not time for hooks or any other framework yet. We don't
even agree that we need hooks to interface with the database. Even
having to define points in code where I could hook would limit me in an 
unacceptable way, if I couldn't redefine them whenever I wanted.

> Surely, in a community that spends time on the topic of whether
> replication "should be in the back end", we oughta be able to come up
> with 10 or so people who are willing to say what "being in the back
> end" would mean.  At the moment, this trivial goal is all I'm aiming
> for.

For me, being in the back end means I can code in C, use shared memory
and system catalogs, add another sub-process to PostgreSQL, introduce
another operation mode for (remote) backends, mess with the postmaster
and communicate with the backends via shared memory and signals (IPC).

IPC is even a good example of something that could be of use to me.
Back in April, I sent a patch implementing internal message passing
(see [2]). It's a very general feature I need and, as pointed out in
the mail, it could even be of use to others. But I have no hope of it
making it into core, because I've never seen anything accepted on the
grounds that it could perhaps be of use in the future.

I have certainly noticed that you and others offered to help in various
ways. Thank you for that. But I also got the impression that there's an
urge towards hooks or a framework or something, so that PostgreSQL can
provide that and refer to it as "having everything needed" for
replication. That sounds marketing-driven, IMO.

I can assure you that I will continue to work on Postgres-R. I think its
design has been described well enough already. I will post more
design ideas for extensions and additions on the Postgres-R or on the 
replica-hooks mailing list as soon as I have them completely thought 
through and written down. And for sure I'll let you know if and how you 
or others can help me.

Regards

Markus

[1]: Tom Lane: Re: Getting a move on for 8.2 beta:
http://archives.postgresql.org/pgsql-hackers/2006-09/msg00139.php

[2]: My Patch for IMessages:
http://archives.postgresql.org/pgsql-patches/2006-04/msg00047.php