Re: Replication documentation addition
From | Bruce Momjian |
---|---|
Subject | Re: Replication documentation addition |
Date | |
Msg-id | 200610261553.k9QFr9V23851@momjian.us |
In response to | Replication documentation addition (Bruce Momjian <bruce@momjian.us>) |
List | pgsql-hackers |
With no new additions submitted today, I have moved my text into our
SGML documentation:

	http://momjian.us/main/writings/pgsql/sgml/failover.html

Please let me know what additional changes are needed.

---------------------------------------------------------------------------

bruce wrote:
> Richard Troy wrote:
> >
> > > Here is a new replication documentation section I want to add for 8.2:
> > >
> > > ftp://momjian.us/pub/postgresql/mypatches/replication
> > >
> >
> > ...Read the document, as promised...
> >
> > First paragraph, "(fail over)" is inconsistent with title, "failover", as
> > are other spots throughout the document. The whole document should be
> > consistent and I vote for "failover" and not "fail over."
>
> OK. Fixed to "failover"
>
> > Fourth paragraph, "This "sync problem" is the fundamental difficulty for
> > servers working together"; "Sync problem" hasn't been defined. Actually,
> > you're talking about the consistent attribute of the "acid" properties of
> > all competent databases: Atomic, Consistency, Isolation, and Durability.
> > At least define the term you are using - probably most easily done in the
> > preceding paragraph.
>
> OK, "sync problem" term removed, and spelled out fully.
>
> > The fifth paragraph needs a lot more help, I think. How about this
> > alternative:
> >
> > So-called "two-phase commit" was developed as a strategy in which two or
> > more databases are updated simultaneously and none of the data is
> > committed until all are committed. This guarantees consistency between the
> > databases with all propagation delay being absorbed by the writer at write
> > time. There are times when this propagation delay is large, so sometimes
> > alternatives are worked out which we'll call here "asynchronous updates,"
> > however, in these cases, there is always a window of time in which some
> > transaction can be lost should a failure occur.
> > For this reason, asynchronous updates are only used when the possibility
> > of such losses is acceptable.
>
> I have modified the paragraph to use some of your terms.
>
> > Paragraphs six through to "shared disk failover" seem very awkward to me.
> > I don't like them at all.
> >
> > "Shared disk failover" has nothing to do with "the sync problem" as it's
> > not a multiple-database solution. It's an uptime, "24 X 7 X 365" issue.
> > Further, it also has nothing to do with disk arrays, though it is often
> > used with RAID to help avoid disk based corruption problems.
>
> Yes, please see updated version. I removed the sync problem term from
> there.
>
> > The point about Warm Standby needs to include a warning about WAL that it
> > MUST be sensitive to the semantics of the database design or else it's
> > fatally flawed. I'm talking about "referential integrity". That is to say,
> > it's inappropriate to capture updates on a table by table basis, as some
> > such systems do, (I have no idea what's done by anyone in the PG world on
> > this right now) because an update to one table (esp. inserts) very often
> > goes hand in glove with updates in other tables and to get one without the
> > other can corrupt a database.
>
> We don't have that problem. We recover only full transactions.
>
> > The description of "Continuously running replication server" should
> > include the critical caveat - repeated if you think it's already said
> > elsewhere - that it is ONLY suitable for applications in which a loss of
> > (missing) update data doesn't matter. For example, an airline reservation
> > system would be an inappropriate application for such a "solution" because
> > what seats are available cannot be guaranteed to be correct.
>
> I have added a note about data loss for the Slony item.
>
> > Regarding data partitioning, I strongly disagree with the opening sentence
> > in that it doesn't split a database into sets, it splits tables into sets.
>
> OK, changed.
>
> > Data partitioning is often done within a single database on a single
> > server and therefore, as a concept, has nothing whatsoever to do with
> > different servers. Similarly, the second paragraph of this section is
>
> Uh, why would someone split things up like that on a single server?
>
> > problematic. Please define your term first, then talk about some
> > implementations - this is muddying the water. Further, there are both
> > vertical and horizontal partitioning - you mention neither - and each has
> > its own distinct uses. If partitioning is mentioned, it should be more
> > complete.
>
> Uh, what exactly needs to be defined?
>
> > Next, Query Broadcast Load Balancing... also needs a lot of work. First,
> > it's foremost in my memory that sending read queries everywhere and
> > returning the first result set back is a key way to improve application
> > performance at the cost of additional load on other systems - I guess
> > that's not at all what the document is after here, but it's a worthy part
> > of a dialogue on broadcasting queries. In other words, this has more parts
> > to it than just what the document now entertains. Secondly, the document
>
> Uh, do we want to go into that here? I guess I could.
>
> > doesn't address _at_all_ whether this is a two-phase-commit environment
> > or not. If not, how are updates managed? If each server operates
> > independently and one of them fails, what do you do then? How do you know
> > _any_ server got an insert/update? ... Each server _can't_ operate
> > independently unless the application does its own insert/update commits to
> > every one of them - and that can't be fast, nor does it load balance,
> > though it may contribute to superior uptime performance by the
> > application.
>
> I think having the application middle layer do the commits is how it
> works now. Can someone explain how pgpool works, or should we mention
> how two-phase commit has to be done here? pgpool2 has additional
> features.
>
> > Next up; I'm not aware of any current products or projects that provide
> > parallel query execution, though Informix might - I can ask a colleague or
> > two. Either way, it's probably best to simply define the term (perhaps in
> > a little more detail), and not mention solutions - they change with time
> > anyway.
>
> Actually, Bizgres MPP, based on PostgreSQL, does this, but mostly for
> read-only queries.
>
> > While I've never used Oracle's clustering tools, I've read up on them and
> > have customers who use them, and I think this description of Oracle
> > clustering is a mis-read on what the Oracle system actually does. A check
> > with a true Oracle clustering expert is in order here.
>
> OK, would someone please comment?
>
> > Hope this helps. If asked, I'm willing to (re)write some of the bits
> > discussed above.
>
> Yes, please review the URL and let me know what else to change. Thanks.
>
> --
>   Bruce Momjian   bruce@momjian.us
>   EnterpriseDB    http://www.enterprisedb.com
>
>   + If your life is a hard drive, Christ can be your backup. +

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +