Re: Sync Rep: First Thoughts on Code

From: Aidan Van Dyk
Subject: Re: Sync Rep: First Thoughts on Code
Msg-id: 20081213184542.GA12094@yugib.highrise.ca
In reply to: Re: Sync Rep: First Thoughts on Code  (Markus Wanner <markus@bluegap.ch>)
List: pgsql-hackers

Synchronous replication ("sync rep") is *not* interested in the "slave's
visibility of the commit", because PostgreSQL doesn't "serve" requests
when in recovery (WAL-receiving) mode *now*.

This sync rep patch/proposal/discussion is *strictly* (at this point;
hot standby may eventually, or hopefully soon, change that) the means to
get the data "safely in 2 separate places" before the COMMIT returns,
by means of WAL streaming.  That "safely in 2 places" can have various
implementation options (like received, on disk, or applied), and
Fujii-san explained some of the options as to what to consider "safe"
and their trade-offs at his presentation last year.
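
To make the three "safe" options concrete, here is a minimal sketch in
Python. It is only an illustration of the idea, not code from the patch:
the names SafeLevel, Standby and commit_may_return are hypothetical, and
WAL positions are modelled as plain integers.

```python
from dataclasses import dataclass
from enum import Enum


class SafeLevel(Enum):
    """Hypothetical names for the points at which the data counts as 'safe'."""
    RECEIVED = "received"   # standby holds the WAL in memory
    ON_DISK = "on disk"     # standby has flushed the WAL to disk
    APPLIED = "applied"     # standby has replayed the WAL


@dataclass
class Standby:
    """Toy standby that tracks how far each stage has progressed."""
    received_lsn: int = 0
    flushed_lsn: int = 0
    applied_lsn: int = 0

    def receive(self, lsn: int) -> None:
        # WAL up to `lsn` has arrived over the stream.
        self.received_lsn = max(self.received_lsn, lsn)

    def flush(self) -> None:
        # Everything received so far is now on disk.
        self.flushed_lsn = self.received_lsn

    def apply(self) -> None:
        # Everything flushed so far has been replayed.
        self.applied_lsn = self.flushed_lsn

    def position(self, level: SafeLevel) -> int:
        return {
            SafeLevel.RECEIVED: self.received_lsn,
            SafeLevel.ON_DISK: self.flushed_lsn,
            SafeLevel.APPLIED: self.applied_lsn,
        }[level]


def commit_may_return(standby: Standby, commit_lsn: int, level: SafeLevel) -> bool:
    """COMMIT returns only once the standby has reached the chosen level."""
    return standby.position(level) >= commit_lsn
```

With a commit at position 100, a standby that has only called
receive(100) satisfies RECEIVED but not ON_DISK or APPLIED; the stricter
the level, the longer the COMMIT waits.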

Once both sync-rep (WAL streaming to get changes in two places) and
hot-standby (run queries while WAL is being applied) are available in
PostgreSQL, at that point we might need to start considering "other client
visibility", but even then, we still don't need to worry about
multi-master options...
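
A rough sketch of that "other client visibility" gap, again in Python and
purely hypothetical (HotStandby, commit and the key/value "WAL records"
are made up for illustration): the COMMIT is acknowledged as soon as the
standby has received the change, but a read-only query on the standby
only sees it after replay.

```python
from collections import deque


class HotStandby:
    """Toy standby that answers read-only queries while replaying WAL."""

    def __init__(self):
        self.pending_wal = deque()  # received but not yet replayed
        self.visible = {}           # what a query on the standby can see

    def receive(self, key, value) -> bool:
        # Data is now "in 2 places"; acknowledge without replaying it.
        self.pending_wal.append((key, value))
        return True

    def replay_one(self) -> None:
        # Apply the oldest pending record, making it visible to queries.
        if self.pending_wal:
            key, value = self.pending_wal.popleft()
            self.visible[key] = value

    def query(self, key):
        return self.visible.get(key)


def commit(primary_data: dict, standby: HotStandby, key, value) -> bool:
    """Commit on the primary, returning once the standby has acknowledged."""
    primary_data[key] = value
    return standby.receive(key, value)
```

Here a client that gets its commit acknowledged and immediately queries
the standby would not find the row until replay_one() has run, which is
the case Markus is worried about.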

a.


* Markus Wanner <markus@bluegap.ch> [081213 12:17]:
> Hi,
> 
> Simon Riggs wrote:
> > On Sat, 2008-12-13 at 14:07 +0100, Markus Wanner wrote:
> >> Speaking of a "synchronous commit"
> >> is utterly misleading, because the commit itself is exactly the thing
> >> that's *not* synchronous.
> > 
> > Not really sure where you're going here.
> 
> I'm pointing to a potential misunderstanding, trying to help to prevent
> you from running into the same issues and discussions as I did.
> 
> I've learned the hard way, that the Postgres-R algorithm is not fully
> synchronous (in the strict sense). This caused confusion for people who
> take the word "synchronous" by its original meaning. The algorithm
> proposed here seems similar enough to potentially cause the same confusion.
> 
> As I see it now, I think it's well worth pointing out the difference,
> from both the technical and the marketing perspective. The
> former for better understanding, the latter to prevent users from
> thinking it must be slow by definition. Arguing that your approach is
> not fully synchronous definitely helps in defending against that concern.
> 
> However, I'm just now realizing, that the difference is only relevant as
> soon as you begin to allow read-only access on the slave. AFAIK that's
> among the goals of this effort, no?
> 
> > "synchronous replication" is
> > used exactly as described in the Wikipedia entry here:
> > http://en.wikipedia.org/wiki/Database_replication
> 
> That article describes pretty much all variants of replication, what
> exactly are you referring to?
> 
> Under "Database Replication > Multi-Master replication" it describes
> eager vs lazy variants, which is IMO a more appropriate and useful
> distinction than sync vs async. (But that's admittedly a sentence I've
> contributed myself, IIRC).
> 
> Under "Storage Replication > Synchronous Replication" one can read:
> "Write is not considered complete until acknowledgement by both local
> and remote storage." For the proposed approach this might hold true for
> WAL writing. However, the user certainly doesn't care how synchronously
> the log is shipped or written, as long as she doesn't see the
> changes on the slave.
> 
> That's the difference between fully synchronous and eager (or virtually
> or approximately synchronous) algorithms. You seem to refer to both as
> "synchronous". Phrases like "synchronous commit" or "synchronous data
> transfer" do not help me to understand what exactly you are talking about.
> 
> Explaining that the slave commits (and therefore makes the transactions
> visible) asynchronously would help. And it would prevent disappointment
> for users who expect changes to be immediately visible on the slave.
> 
> > No two word phrase is going to accurately sum up the complexity and
> > potential for data loss in these situations. DRBD saw that too and just
> > called them A, B and C and then describe them more accurately.
> 
> Agreed. I've chosen lazy, eager and sync, so far. I'm open for better
> terms, and I leave it up to you to call your variants whatever you like.
> But to understand what you are talking about, I'd prefer to get to know
> these distinctions crisp and clear.
> 
> > But I don't think we should say "PostgreSQL just implemented algorithm
> > B", which is just unhelpful. I don't think it's "marketing" to refer to it
> > by the phrase most commonly used for the technology we are building.
> 
> I certainly agree to using such terms. Unfortunately, in my experience,
> synchronous replication is commonly used to mean that transactions are
> guaranteed to be immediately visible on remote nodes after the client
> got commit acknowledgment. That's the cause for confusion I'm envisioning.
> 
> 
> I'm hoping to be somewhat helpful to this effort of getting a log
> shipping replication variant into Postgres. It can only be beneficial
> for Postgres-R in that we gain field experience with ..uhm.. this
> special kind of replication, however we name it.
> 
> I'm already on xmas vacation, so I won't bother you any further on this
> issue. Have fun coding and make sure to enjoy this time of the year.
> 
> All the best.
> 
> Markus Wanner
> 
> 

-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
