database replication
От | DWalker@black-oak.com |
---|---|
Тема | database replication |
Дата | |
Msg-id | OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM обсуждение исходный текст |
Ответы |
Re: [HACKERS] database replication
(Ryan Kirkpatrick <pgsql@rkirkpat.net>)
Re: [HACKERS] database replication (Duane Currie <dcurrie@sandman.acadiau.ca>) |
Список | pgsql-hackers |
<p>I've been toying with the idea of implementing database replication for the last few days. The system I'm proposing willbe a seperate program which can be run on any machine and will most likely be implemented in Python. What I'm lookingfor at this point are gaping holes in my thinking/logic/etc. Here's what I'm thinking...<p> <p>1) I want to makethis program an additional layer over PostgreSQL. I really don't want to hack server code if I can get away with it. At this point I don't feel I need to.<p>2) The replication system will need to add at least one field to each table ineach database that needs to be replicated. This field will be a date/time stamp which identifies the "last update" ofthe record. This field will be called PGR_TIME for lack of a better name. Because this field will be used from withinprograms and triggers it can be longer so as to not mistake it for a user field.<p>3) For each table to be replicatedthe replication system will programatically add one plpgsql function and trigger to modify the PGR_TIME field onboth UPDATEs and INSERTs. The name of this function and trigger will be along the lines of <table_name>_replication_update_triggerand <table_name>_replication_update_function. The function is a simpletwo-line chunk of code to set the field PGR_TIME equal to NOW. The trigger is called before each insert/update. Whenlooking at the Docs I see that times are stored in Zulu (GT) time. Because of this I don't have to worry about timezones and the like. I need direction on this part (such as "hey dummy, look at page N of file X.").<p>4) At this pointwe have tables which can, at a basic level, tell the replication system when they were last updated.<p>5) The replicationsystem will have a database of its own to record the last replication event, hold configuration, logs, etc. I'dprefer to store the configuration in a PostgreSQL table but it could just as easily be stored in a text file on the filesystemsomewhere.<p>6) To handle replication I basically check the local "last replication time" and compare it againstthe remote PGR_TIME fields. If the remote PGR_TIME is greater than the last replication time then change the localcopy of the database, otherwise, change the remote end of the database. At this point I don't have a way to know WHICHfield changed between the two replicas so either I do ROW level replication or I check each field. I check PGR_TIMEto determine which field is the most current. Some fine tuning of this process will have to occur no doubt.<p>7)The commandline utility, fired off by something like cron, could run several times during the day -- command lineparameters can be implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES FROM SERVER B.<p> <p>Questions/Concerns:<p>1)How far do I go with this? Do I start manhandling the system catalogs (pg_* tables)?<p>2)As to #2 and #3 above, I really don't like tools automagically changing my tables but at this point I don'tsee a way around it. I guess this is where the testing comes into play.<p>3) Security: the replication app will haveto have pretty good rights to the database so it can add the nessecary functions and triggers, modify table schema, etc. <p> <p> So, any "you're insane and should run home to momma" comments?<p> <p> Damond<p>
В списке pgsql-hackers по дате отправления: