Обсуждение: Re: [GENERAL] Re: replication

Поиск
Список
Период
Сортировка

Re: [GENERAL] Re: replication

От
Richard Welsh
Дата:
Is it possible to log the transactions ( as complete statements ) on one
database then process the logged transactions on the second database at
intervals to keep it up to date?

RDW.

-----Original Message-----
From: Jeff Hoffmann <jeff@propertykey.com>
To: Aaron J. Seigo <aaron@gtv.ca>; pgsql-general@hub.org
<pgsql-general@hub.org>
Date: November 16, 1999 2:28 PM
Subject: Re: [GENERAL] Re: replication


>"Aaron J. Seigo" wrote:
>
>> this needs to be in the back end... otherwise, if you have multiple
people
>> performing updates on different replicated servers, how can you guarentee
>> concurancy? how do you manage the differences between read-only and
updateable
>> replicants? (this can be done using your wrapper method, but quickly gets
>> clumsy)
>
>yeah it can get clumsy, but fortunately updates are pretty rare in my
>case (and usually only performed by one person) so i don't really worry
>about those issues, but i should.  the only safety issue i use is
>starting a transaction for each connection, perform the action on all of
>the databases, and then commit the transaction on each.  it's not bad
>unless one of the commits fails and the others don't, then i have to
>retry the failed ones & hope it works.  i don't know how to fix it if i
>had to give up on one of the transactions for a mirror.  again, not
>perfect, but it works pretty well for me & my simpler cases.
>
>************
>
>


Re: [GENERAL] Re: replication

От
"Aaron J. Seigo"
Дата:
hi...

> Is it possible to log the transactions ( as complete statements ) on one
> database then process the logged transactions on the second database at
> intervals to keep it up to date?
>

transaction logging is already done on disk in pg_log, however you can't  count
on it being correct (esp as pgsql moves towards things such as WAL) as there
will be things in memory that aren't necessarily on disk... this leaves us with
no better a solution than doing it some other way...

the best place for this would be in the transaction handler of pgsql itself.
where a transaction completed on a Master server would cause the same updates
on all Replicants... this would do it for read-only database replication...
under this sort of system, you could only update one database, but all
databases would show the updates (MS Jet works this way? *tries to remember*)...

past that, you could then modify transactions to check not just for
transactions locally, but for transactions on any other given database
installation(s)... global locking, in other words...

this works as long as you can reach all your hosts all the time... if you
can't... then you either get you entire replication setup hung (which would be
pointless) ... or ... you keep revision #s on each row/field that are kept
locally should other Masters not be reachable by design or accident (i believe
this is how Oracle does it.. revision #s)... and once communication is
restored, then the systems sort out the revisions amongst themselves, asking
for user intervention on deadlocks. this would allow multiple Masters, multiple
Replicants and downtimes between them all. this is very tricky and probably the
last piece of the puzzle that would/should be tackled...

looking through the source there are a few promising places to start... (this
should probaly be on the hackers list now.. heh.. *shrug*)

acess/transam - the transaction manager... the commit functionality would need
to be altered to do updates across replicated systems...

storage/lmgr - lock manager... would need to be made aware of other lock tables
that may not be local.. possibly an interface to allow remote checking of lock
tables... a read-only node would always see a write lock on its own tables...
perhaps it would be rerouted to the Master for writing.

utils/time - checks the timliness of data due to transactions... could be
extended eventually to provide, along with the lock manager, concurency between
multiple masters? additional logging information would also be required to
pull this one off (assuming a network connection that isn't up 100% of the
time)...

then there is the point of where you would define the Master/Replicant
relationships within the database(s) itself.. e.g. would it be another system
table to keep track of Masters/Replicants and rules for their operations?

anyways, this is just the thoughts that come to mind... no real detail here,
nor any sense if this would be The Right Way to do it... however, my guess
would be to say that these places in the source would be where to start
looking...

word from those in the know?

--
Aaron J. Seigo
Sys Admin