Discussion: database replication


database replication

From:
DWalker@black-oak.com
Date:
I've been toying with the idea of implementing database replication for
the last few days.  The system I'm proposing will be a separate program
which can be run on any machine and will most likely be implemented in
Python.  What I'm looking for at this point are gaping holes in my
thinking/logic/etc.  Here's what I'm thinking...

1) I want to make this program an additional layer over PostgreSQL.  I
really don't want to hack server code if I can get away with it.  At this
point I don't feel I need to.

2) The replication system will need to add at least one field to each
table in each database that needs to be replicated.  This field will be a
date/time stamp which identifies the "last update" of the record.  This
field will be called PGR_TIME for lack of a better name.  Because this
field will be used from within programs and triggers it can be longer so
as to not mistake it for a user field.

3) For each table to be replicated the replication system will
programmatically add one plpgsql function and trigger to modify the
PGR_TIME field on both UPDATEs and INSERTs.  The name of this function and
trigger will be along the lines of <table_name>_replication_update_trigger
and <table_name>_replication_update_function.  The function is a simple
two-line chunk of code to set the field PGR_TIME equal to NOW.  The
trigger is called before each insert/update.  When looking at the Docs I
see that times are stored in Zulu (GMT) time.  Because of this I don't
have to worry about time zones and the like.  I need direction on this
part (such as "hey dummy, look at page N of file X.").  (A sketch of what
such a generated function and trigger might look like appears after this
list.)

4) At this point we have tables which can, at a basic level, tell the
replication system when they were last updated.

5) The replication system will have a database of its own to record the
last replication event, hold configuration, logs, etc.  I'd prefer to
store the configuration in a PostgreSQL table but it could just as easily
be stored in a text file on the filesystem somewhere.

6) To handle replication I basically check the local "last replication
time" and compare it against the remote PGR_TIME fields.  If the remote
PGR_TIME is greater than the last replication time then change the local
copy of the database, otherwise, change the remote end of the database.
At this point I don't have a way to know WHICH field changed between the
two replicas so either I do ROW level replication or I check each field.
I check PGR_TIME to determine which field is the most current.  Some fine
tuning of this process will have to occur no doubt.

7) The commandline utility, fired off by something like cron, could run
several times during the day -- command line parameters can be implemented
to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES FROM SERVER B.
(A rough sketch of such a front end appears at the end of this message.)
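
Roughly what the generated function and trigger from point 3 could look
like, sketched here as a small Python helper that emits the SQL.  The
column name pgr_time, the helper itself, and the exact plpgsql spelling
(RETURNS trigger; older releases used RETURNS opaque) are illustrative
assumptions, not something settled in the thread:

    def replication_trigger_sql(table_name):
        # Build the CREATE FUNCTION / CREATE TRIGGER statements that stamp the
        # (assumed) pgr_time column with now() before every INSERT or UPDATE.
        func = "%s_replication_update_function" % table_name
        trig = "%s_replication_update_trigger" % table_name
        create_function = """
            CREATE FUNCTION %s() RETURNS trigger AS '
                BEGIN
                    NEW.pgr_time := now();
                    RETURN NEW;
                END;
            ' LANGUAGE 'plpgsql';
        """ % func
        create_trigger = """
            CREATE TRIGGER %s
                BEFORE INSERT OR UPDATE ON %s
                FOR EACH ROW EXECUTE PROCEDURE %s();
        """ % (trig, table_name, func)
        return create_function, create_trigger

The replication program would then execute both statements once for each
table it is told to manage.
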
Questions/Concerns:

1) How far do I go with this?  Do I start manhandling the system catalogs
(pg_* tables)?

2) As to #2 and #3 above, I really don't like tools automagically changing
my tables but at this point I don't see a way around it.  I guess this is
where the testing comes into play.

3) Security: the replication app will have to have pretty good rights to
the database so it can add the necessary functions and triggers, modify
table schema, etc.

So, any "you're insane and should run home to momma" comments?

              Damond
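
For illustration only, a bare-bones front end along the lines of point 7;
the flag names and the sync_changes stub are invented for this sketch:

    import argparse

    def sync_changes(source, target):
        # Placeholder for the comparison/transfer logic described in point 6.
        print("would sync changes from %s to %s" % (source, target))

    def main():
        parser = argparse.ArgumentParser(description="PostgreSQL sync sketch")
        group = parser.add_mutually_exclusive_group(required=True)
        group.add_argument("--push-to", metavar="SERVER",
                           help="push all local changes to SERVER")
        group.add_argument("--pull-from", metavar="SERVER",
                           help="pull all changes from SERVER")
        args = parser.parse_args()
        if args.push_to:
            sync_changes("local", args.push_to)
        else:
            sync_changes(args.pull_from, "local")

    if __name__ == "__main__":
        main()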

Re: [HACKERS] database replication

From:
Ed Loehr
Date:
DWalker@black-oak.com wrote:

> 6) To handle replication I basically check the local "last
> replication time" and compare it against the remote PGR_TIME
> fields.  If the remote PGR_TIME is greater than the last replication
> time then change the local copy of the database, otherwise, change
> the remote end of the database.  At this point I don't have a way to
> know WHICH field changed between the two replicas so either I do ROW
> level replication or I check each field.  I check PGR_TIME to
> determine which field is the most current.  Some fine tuning of this
> process will have to occur no doubt.

Interesting idea.  I can see how this might sync up two databases
somehow.  For true replication, however, I would always want every
replicated database to be, at the very least, internally consistent
(i.e., referential integrity), even if it was a little behind on
processing transactions.  In this method, it's not clear how
consistency is ever achieved/guaranteed at any point in time if the
input stream of changes is continuous.  If the input stream ceased,
then I can see how this approach might eventually catch up and totally
resync everything, but it looks *very* computationally expensive.

But I might have missed something.  How would internal consistency be
maintained?


> 7) The commandline utility, fired off by something like cron, could
> run several times during the day -- command line parameters can be
> implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES
> FROM SERVER B.

My two cents is that, while I can see this kind of database syncing as
valuable, this is not the kind of "replication" I had in mind.  This
may already be possible by simply copying the database.  What replication
means to me is a live, continuously streaming sequence of updates from
one database to another where the replicated database is always
internally consistent, available for read-only queries, and never "too
far" out of sync with the source/primary database.

What does replication mean to others?

Cheers,
Ed Loehr




Re: [HACKERS] database replication

From:
"Damond Walker"
Date:
>
> Interesting idea.  I can see how this might sync up two databases
> somehow.  For true replication, however, I would always want every
> replicated database to be, at the very least, internally consistent
> (i.e., referential integrity), even if it was a little behind on
> processing transactions.  In this method, it's not clear how
> consistency is ever achieved/guaranteed at any point in time if the
> input stream of changes is continuous.  If the input stream ceased,
> then I can see how this approach might eventually catch up and totally
> resync everything, but it looks *very* computationally  expensive.
>
   What's the typical unit of work for the database?  Are we talking about
update transactions which span the entire DB?  Or are we talking about
updating maybe 1% or less of the database every day?  I'd think it would be
more towards the latter than the former.  So, yes, this process would be
computationally expensive but how many records would actually have to be
sent back and forth?

> But I might have missed something.  How would internal consistency be
> maintained?
>
   Updates that occur at site A will be moved to site B and vice versa.
Consistency would be maintained.  The only problem that I can see right off
the bat would be what if site A and site B made changes to a row and then
site C was brought into the picture?  Which one wins?
   Someone *has* to win when it comes to this type of thing.  You really
DON'T want to start merging row changes...
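
As a purely illustrative aside, the simplest "someone wins" rule would be
last-writer-wins keyed off PGR_TIME; the sketch below assumes each row is
available as a dict carrying its pgr_time value, and nothing in the thread
commits to this policy:

    def pick_winner(row_a, row_b):
        # Last-writer-wins: keep whichever copy carries the newer PGR_TIME.
        # Simple, but the older change is silently discarded.
        return row_a if row_a["pgr_time"] >= row_b["pgr_time"] else row_b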

>
> My two cents is that, while I can see this kind of database syncing as
> valuable, this is not the kind of "replication" I had in mind.  This
> may already be possible by simply copying the database.  What replication
> means to me is a live, continuously streaming sequence of updates from
> one database to another where the replicated database is always
> internally consistent, available for read-only queries, and never "too
> far" out of sync with the source/primary database.
>
   Sounds like you're talking about distributed transactions to me.  That's
an entirely different subject altogether.  What you describe can be done
by copying a database...but as you say, this would only work in a read-only
situation.

               Damond



Re: [HACKERS] database replication

From:
Ryan Kirkpatrick
Date:
On Fri, 24 Dec 1999 DWalker@black-oak.com wrote:

> I've been toying with the idea of implementing database replication
> for the last few days.
I too have been thinking about this some over the last year or
two, just trying to find a quick and easy way to do it. I am not so much
interested in replication as in synchronization, as between a desktop
machine and a laptop, so I can keep the databases on each in sync with
each other. For this sort of purpose, both the local and remote databases
would be "idle" at the time of syncing.

> 2) The replication system will need to add at least one field to each
> table in each database that needs to be replicated. This field will be
> a date/time stamp which identifies the "last update" of the record.  
> This field will be called PGR_TIME for lack of a better name.  
> Because this field will be used from within programs and triggers it
> can be longer so as to not mistake it for a user field.
How about a single, separate table with the fields 'database',
'tablename', 'oid', and 'last_changed', that would store the same data as
your PGR_TIME field? It would be separated from the actual data tables,
and therefore would be totally transparent to any database interface
applications. The 'oid' field would hold each row's OID, a nice, unique
identification number for the row, while the other fields would tell which
table and database the OID is in. Then this table can be compared with the
same table on a remote machine to quickly find updates and changes, and
each difference can be dealt with in turn.
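
A rough sketch of what that single tracking table could look like (column
names and types are guesses; the 'oid' field is called row_oid here only
because a user column literally named oid would collide with the system
column):

    # Hypothetical DDL for the change-tracking table described above.
    TRACKING_TABLE_DDL = """
        CREATE TABLE pgr_changes (
            dbname       text,       -- database the changed row lives in
            tablename    text,       -- table the changed row lives in
            row_oid      oid,        -- OID of the changed row
            last_changed timestamp   -- when that row was last modified
        );
    """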

> 3) For each table to be replicated the replication system will
> programmatically add one plpgsql function and trigger to modify the
> PGR_TIME field on both UPDATEs and INSERTs.  The name of this function
> and trigger will be along the lines of
> <table_name>_replication_update_trigger and
> <table_name>_replication_update_function.  The function is a simple
> two-line chunk of code to set the field PGR_TIME equal to NOW.  The
> trigger is called before each insert/update.  When looking at the Docs
> I see that times are stored in Zulu (GMT) time.  Because of this I
> don't have to worry about time zones and the like.  I need direction
> on this part (such as "hey dummy, look at page N of file X.").
I like this idea, better than any I have come up with yet. Though,
how are you going to handle DELETEs? 

> 6) To handle replication I basically check the local "last replication
> time" and compare it against the remote PGR_TIME fields.  If the
> remote PGR_TIME is greater than the last replication time then change
> the local copy of the database, otherwise, change the remote end of
> the database.  At this point I don't have a way to know WHICH field
> changed between the two replicas so either I do ROW level replication
> or I check each field.  I check PGR_TIME to determine which field is
> the most current.  Some fine tuning of this process will have to occur
> no doubt.
Yea, this is indeed the sticky part, and would indeed require some
fine-tuning. Basically, the way I see it, if the two timestamps for a
single row do not match (or even if the row, and therefore its timestamp,
is missing on one side or the other altogether):

    local ts > remote ts                       => local row is exported to remote
    remote ts > local ts                       => remote row is exported to local
    local ts > last sync time && no remote ts  => local row is inserted on remote
    local ts < last sync time && no remote ts  => local row is deleted
    remote ts > last sync time && no local ts  => remote row is inserted on local
    remote ts < last sync time && no local ts  => remote row is deleted

where the synchronization process is running on the local machine. By
exported, I mean the local values are sent to the remote machine, and the
row on that remote machine is updated to the local values. How does this
sound?
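
A small Python sketch of those six rules, assuming each side is summarized
as a dict mapping a row identifier (say, a (table, oid) pair) to its
timestamp; the function and variable names are invented for the sketch:

    def plan_sync(local_rows, remote_rows, last_sync):
        # local_rows / remote_rows: {row_id: timestamp}; last_sync: time of
        # the previous synchronization.  Returns (action, row_id) pairs.
        actions = []
        for row_id in set(local_rows) | set(remote_rows):
            lts = local_rows.get(row_id)
            rts = remote_rows.get(row_id)
            if lts is not None and rts is not None:
                if lts > rts:
                    actions.append(("export local row to remote", row_id))
                elif rts > lts:
                    actions.append(("export remote row to local", row_id))
                # equal timestamps: the row is already in sync
            elif lts is not None:                  # row exists only locally
                if lts > last_sync:
                    actions.append(("insert local row on remote", row_id))
                else:
                    actions.append(("delete local row", row_id))
            else:                                  # row exists only remotely
                if rts > last_sync:
                    actions.append(("insert remote row on local", row_id))
                else:
                    actions.append(("delete remote row", row_id))
        return actions

Running this against the change-tracking information from both sides would
produce the work list for a single sync pass.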

> 7) The commandline utility, fired off by something like cron, could
> run several times during the day -- command line parameters can be
> implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES
> FROM SERVER B.
Or run manually for my purposes. Also, maybe follow it
with a vacuum run on both sides for all databases, as this is going to
potentially cause lots of table changes that could stand a cleanup.

> 1) How far do I go with this?  Do I start manhandling the system catalogs (pg_* tables)?
Initially, I would just stick to user table data... If you have
changes in triggers and other meta-data/executable code, you are going to
want to make syncs of that stuff manually anyway. At least I would want
to.

> 2) As to #2 and #3 above, I really don't like tools automagically
> changing my tables but at this point I don't see a way around it.  I
> guess this is where the testing comes into play.
Hence the reason for the separate table with just a row's
identification and last update time. The only modification to the synced
database is the update trigger, which should be pretty harmless.

> 3) Security: the replication app will have to have pretty good rights
> to the database so it can add the necessary functions and triggers,
> modify table schema, etc.
Just run the sync program as the postgres super user, and there
are no problems. :)

>   So, any "you're insane and should run home to momma" comments?
No, not at all. Though it probably should be renamed from
replication to synchronization. The former is usually associated with a
continuous stream of updates between the local and remote databases, so
they are almost always in sync, and has a queuing ability if their
connection is lost for a span of time as well. Very complex and difficult
to implement, and it would require hacking server code. :( Something only
Sybase and Oracle have (as far as I know), and from what I have seen of
Sybase's replication server support (dated by 5 yrs) it was a pain to set
up and get running correctly.

The latter, synchronization, is much more manageable, and can still
be useful, especially when you have a large database you want in two
places, mainly for read-only purposes at one end or the other, but don't
want to waste the time/bandwidth to move and load the entire database each
time it changes on one end or the other. Same idea as mirroring software
for FTP sites: just transfer the changes, and nothing more.

I also like the idea of using Python. I have been using it
recently for some database interfaces (to PostgreSQL of course :), and it
is a very nice language to work with. I have some worries about the
performance of the program though, as Python is only an interpreted
language, and I have yet to be really impressed with the speed of
execution of my database interfaces.

Anyway, it sounds like a good project, and finally one where I
actually have a clue of what is going on, and the skills to help. So, if
you are interested in pursuing this project, I would be more than glad to
help. TTYL.

---------------------------------------------------------------------------
|   "For to me to live is Christ, and to die is gain."                    |
|                                            --- Philippians 1:21 (KJV)   |
---------------------------------------------------------------------------
|   Ryan Kirkpatrick  |  Boulder, Colorado  |  http://www.rkirkpat.net/   |
---------------------------------------------------------------------------




Re: [HACKERS] database replication

From:
Duane Currie
Date:
Hi Guys,

Now for one of my REALLY rare posts.
Having done a little bit of distributed data systems, I figured I'd
pitch in a couple cents worth.

> 2) The replication system will need to add at least one field to each
>    table in each database that needs to be replicated.  This
>    field will be a date/time stamp which identifies the "last
>    update" of the record.  This field will be called PGR_TIME
>    for lack of a better name.  Because this field will be used
>    from within programs and triggers it can be longer so as to not
>    mistake it for a user field.

I just started reading this thread, but I figured I'd throw in a couple
suggestions for distributed data control (a few idioms I've had to
deal with before):

- Never use time (not reliable from system to system).  Use a version
  number of some sort that can stay consistent across all replicas.

  This way, if a system's time is or goes out of whack, it doesn't
  cause your database to disintegrate, and it's easier to track
  conflicts (see below; if using time, the algorithm gets nightmarish).

- On an insert, set to version 1.
- On an update, version++.
- On a delete, mark deleted, and add a delete stub somewhere for the
  replicator process to deal with in sync'ing the databases.
- If two records have the same version but different data, there's
  a conflict.  A few choices:
    1.  Pick one as the correct one (yuck!! invisible data loss).
    2.  Store both copies, pick one as current, and alert the database
        owner of the conflict, so they can deal with it "manually."
    3.  If possible, some conflicts can be merged.  If a disjoint
        set of fields were changed in each instance, these changes
        may both be applied and the record merged.  (Problem:
        takes a lot more space.  Requires a version number for
        every field, or persistent storage of some old records.
        However, this might help the "which fields changed" issue
        you were talking about in #6.)

- A unique id across all systems should exist (or something that
  effectively simulates a unique id).  Maybe a composition of the
  originating oid (from the insert) and the originating database
  (oid of the database's record?) might do it.  Store this as
  an extra field in every record.  (Two extra fields so far:
  'unique id' and 'version'.)
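
Sketched in Python, the version bookkeeping above might look something
like this (the record layout and helper names are invented for
illustration):

    def apply_local_update(row, new_data):
        # On an update, bump the version; an insert would start at version 1.
        row["data"] = new_data
        row["version"] += 1
        return row

    def mark_deleted(row, delete_stubs):
        # On a delete, leave a stub so the replicator can propagate the removal.
        delete_stubs.append({"unique_id": row["unique_id"],
                             "version": row["version"] + 1})

    def is_conflict(local_row, remote_row):
        # Same version number but different data means the row was changed
        # independently on both sides and needs conflict handling.
        return (local_row["version"] == remote_row["version"]
                and local_row["data"] != remote_row["data"])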
 

I do like your approach:  triggers and a separate process. (Maintainable!! :)

Anyway, just figured I'd throw in a few suggestions,
Duane