Re: Replication Docs

From Bruce Momjian
Subject Re: Replication Docs
Date
Msg-id 200611221736.kAMHasi00788@momjian.us
In response to Replication Docs  (Markus Schiltknecht <markus@bluegap.ch>)
Responses Re: Replication Docs  (Markus Schiltknecht <markus@bluegap.ch>)
List pgsql-docs
Markus Schiltknecht wrote:
> Hello Bruce,
>
> I was trying to put together all comments to specific sections, thus the
> new thread. Hope that helps.
>
> *** Synchronous Multi-Master Replication ***
>
> Bruce Momjian wrote:
>  > OK, new title is "Synchronous Multi-Master Replication", and the next
>  > heading is "Asynchronous Multi-Master Replication".
>
> Good, I really like that one. :-)

Great (until we change it again)  ;-)

>  >> Why not simply call it "Multi Master Replication"? That implies
>  >> clustering, doesn't it?
>  >
>  > Well, not really because of the async multi-master that is the next
>  > item.
>
> Yes, it's fine that way. I was just unsure if you want to have sync and
> async in one paragraph or not. The proposal "Multi Master Replication"
> would only fit if we'd describe both in one paragraph. I like to
> describe both in more detail, as you did now.

OK, it is two separate entries now:

    http://momjian.us/main/writings/pgsql/sgml/high-availability.html

>  >> BTW, I'm slowly beginning to accept that you don't want to mix
>  >> "Statement-Based Replication Middleware" with "Multi Master
>  >> Replication". ;-)
>  >
>  > OK, are they mixed now?
>
> No, they're not. They're split, which I think is what you want. What I've
> been uncomfortable with was that split into "Statement-Based Replication
> Middleware" and "Synchronous Multi-Master Replication". I've been
> arguing that the first describes one possible implementation of the
> second, while other implementations are not described (2PC, SHMEM,
> Postgres-R, etc...)
>
> I was trying to say that I'm beginning to accept that split, because
> especially pgpool really seems to put a lot of those burdens to the
> user. I've been trying to use some humor, but that mainly seems to
> confuse people. My English might not be good enough for humor, yet.
>
> However, where do you now fit Sequoia in? It uses "statement-based
> replication", but AFAIK it is much more clever than pgpool and handles
> non-deterministic functions. And the Sequoia people probably won't get
> excited about not calling them "Multi-Master Replication".

Uh, good point.  The title is now "Statement-Based Replication
Middleware".  That doesn't say multi-master, but it doesn't say
master/slave either.  The Sequoia PDF you sent me is very detailed:

  http://www.continuent.org/uploads/sequoia/Resources/2006-08-15Cecchet_ApacheConAsia2006.pdf

I think we are back to the issue of classification.  We have traditional
master/slave as Slony, and multi-master as perhaps PGCluster, and lots
in between.  I am thinking pgpool and Sequoia fit in there.  I have
added Sequoia to the Statement-Based Replication Middleware section.

> Bruce Momjian wrote:
>  > I just saw it [the slides about PGCluster-II].  It does seem more like
>  > Oracle RAC than any other method.
>
> Yes. I think it's not production ready, yet, so there's no point in
> mentioning it in the documentation.

OK.

> Bruce Momjian wrote:
>  > I figured that shared-disk/memory only really makes sense for
>  > multi-master clustering, so I mentioned it in that paragraph:
>  >
>  > ...<snipped the new paragraph>
>  >
>  > Is that enough?
>
> I'd say so, yes. We are not going into more details for other aspects so
> that's fine.

OK.

> You might not even mention shared-memory. I don't know of any
> implementation in the database world. Except perhaps using OpenMosix and
> running PostgreSQL on top of it. Maybe just leave it in there, it won't
> hurt.

OK, I will only mention shared disk now.

> Bruce Momjian wrote:
>  > One problem I have is that we have shared disk failover, but no
>  > other shared case with a PostgreSQL implementation, and people don't
>  > want to mention Oracle RAC, so why do we mention it if we have no
>  > implementations even in the works.
>
> Most probably you're already aware that with PGCluster-II we have such
> an implementation in the works.

I do now.  :-)  I think we are OK with the additional sentence about
shared disk in the Synchronous Multi-Master Replication section, right?

> *** Asynchronous Multi-Master Replication ***
>
>  >> Again, IMHO, "Parallel Query Execution" says everything. The word
>  >> 'Clustering' does not help, because it's not defined nor commonly
>  >> used in any helpful way (probably besides marketing).
>  >
>  > OK, new title is Multi-Server Parallel Query Execution.  If I have
>  > just "Parallel Query Execution", it could be multi-process parallel
>  > query execution.
>
> Yes, the new title is good.
>
> In the text below, you are mainly describing what I call 'disconnected
> operation' (does somebody have a better, more common term for that?). But the
> main advantage of async replication is having no delay before commit.
> Thus giving better performance for writing transactions.
>
> In case of async, multi master replication, conflicts can arise, which
> have to be resolved. I think your example does not make it clear that
> this applies to async, multi master replication in general. And that
> those can sometimes be resolved automatically.

OK, good point, section updated:

      <term>Asynchronous Multi-Master Replication</term>
      <listitem>

       <para>
        For servers that are not regularly connected, like laptops or
        remote servers, keeping data consistent among servers is a
        challenge.  Using asynchronous multi-master replication, each
        server works independently, and periodically communicates with
        the other servers to identify conflicting transactions.  The
        conflicts can be resolved by users or conflict resolution rules.
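
To make the paragraph above concrete, here is a rough Python sketch of the
periodic reconciliation step it describes. It is only an illustration under
stated assumptions, not how any PostgreSQL replication tool works: the
`Write` record, the comparable `timestamp`, and the `reconcile` and
`last_writer_wins` helpers are all hypothetical names made up for this
example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    key: str
    value: str
    timestamp: int  # hypothetical logical clock, assumed comparable across servers


def latest_writes(log):
    """Keep only the most recent write per key from one server's log."""
    latest = {}
    for w in sorted(log, key=lambda w: w.timestamp):
        latest[w.key] = w  # later timestamps overwrite earlier ones
    return latest


def reconcile(log_a, log_b, rule=None):
    """Merge two servers' writes made since their last sync.

    Returns (merged, unresolved). A key is in conflict when both servers
    wrote it with different values. Conflicts go through `rule` if one is
    given; otherwise the key is listed in `unresolved` for a user to decide.
    """
    a, b = latest_writes(log_a), latest_writes(log_b)
    merged, unresolved = {}, []
    for key in sorted(a.keys() | b.keys()):
        wa, wb = a.get(key), b.get(key)
        if wa is None or wb is None or wa.value == wb.value:
            merged[key] = (wa or wb).value  # no conflict for this key
        elif rule is not None:
            merged[key] = rule(wa, wb).value  # automatic resolution
        else:
            unresolved.append(key)  # left for manual resolution
    return merged, unresolved


def last_writer_wins(wa, wb):
    """One possible resolution rule (an assumption): newest timestamp wins."""
    return wa if wa.timestamp >= wb.timestamp else wb


laptop = [Write("x", "1", 1), Write("y", "a", 2)]
office = [Write("x", "2", 3), Write("z", "q", 1)]
print(reconcile(laptop, office, rule=last_writer_wins))
print(reconcile(laptop, office))
```

With a rule supplied, the conflict on `x` is settled automatically; without
one, `x` is left out of the merged result and reported as unresolved, which
corresponds to the "resolved by users" case.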

>
>
> *** Multi-Master Parallel Query Execution ***
>
> Bruce Momjian wrote:
>  > Uh, multi-master replication allows for load balancing, but it doesn't
>  > help a single query to run any faster.  Think of having only one query
>  > running on the cluster.  Parallel execution allows a single query to
>  > use more than one computer, right?
>
> Right.
>
>  > Uh, this confuses me.  What is missing?  You split tables across
>  > multiple servers.
>
> In "Multi-Master Parallel Query Execution" you write: "One possible way
> this could work is for the data to be split among servers". So the
> example you give involves Data Partitioning.

OK.

> I wanted to point out that another way to do Parallel Query Execution is
> using Multi-Master Replication to have equal replicas and then query
> them in parallel. I don't think there is any solution for that, yet.
> Except, perhaps PGPool-II can do it?

Uh, if the data isn't partitioned, what value is there to hitting
multiple servers for a single query?  I am confused.

> *** Introduction Text on the top ***
>
> Bruce Momjian wrote:
>  > OK, updated to add "little" delay, and removed "small" from async
>  > case:
>  >
>  >   load-balanced servers will return consistent results with little
>  >   propagation delay. Asynchronous updating has a delay between the
>
> Hm, that does not address my concerns. But after thinking about it, I
> can accept the term 'consistent results' - it's clear enough what it
> means. I'm probably thinking into too many details...

OK.

> But now, the "little delays" certainly is in the wrong place. Such
> delays occur before commit, not before returning results.

Uh, I don't think the "little" appears to talk about the results but only
the propagation.

> Maybe revert it back to "..no propagation delay". Or completely leave
> away the "no propagation delay".

OK, how is this new text?

  This guarantees that a failover will not lose any data and that
  all load-balanced servers will return consistent results no matter
  which server is queried.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
