Обсуждение: High Availability / Replication with Sequoia
Hi Guys, I have been testing / working with Postgres for a work project, and so far I am really impressed with this DB system. Takes a little getting used to, but I am really beginning to love it. I am looking now at a scenario that does not seem to be a native ability of Postgres, but might possibly be overcome with Sequoia. I am hoping that there exists the possibility of using Sequoia to replicate a DB between / among a number of machines in the office, some of which are not always connected to the lan. The scenario is like this..... On each of the machines I would want to have Postgres installed and only to accepting connections from the local machine. Also on each of these machines would be running Tomcat or similar hosting the required application (app to connect to local Postgres installation). Sequoia would then be used as a form of replication from machine to machine to ensure that the database is kept up to date. The application does not allow writeback to the db, so for all intents and purposes you can consider it read only. To keep the applications database up to date with new information I would be using ETL applications like Spoon / PDI. This will be done to an as yet undecided 'point of origin', but it is probably safe to say that it will be a commercial db server somewhere on our network. The latency from our network to the 'Data Warehouse' (read as badly managed dogs breakfast) is huge. Suffice to say the desire for local db's is high, as is the desire to make the application portable for our sometimes connected laptop users. Does anyone have any experience or comments that they would like to share about this sort of scenario? Its a fairly big jump from just having Postgres running on my laptop for dev purposes to pushing this to multiple machines and I would really appreciate any feedback you guys might have. Thanks in advance The Frog
Mr.Frog.to.you@googlemail.com wrote: > Hi Guys, > > I have been testing / working with Postgres for a work project, and so > far I am really impressed with this DB system. Takes a little getting > used to, but I am really beginning to love it. Good to hear it Mr ...Umm... Frog. > I am looking now at a scenario that does not seem to be a native > ability of Postgres, but might possibly be overcome with Sequoia. I am > hoping that there exists the possibility of using Sequoia to replicate > a DB between / among a number of machines in the office, some of which > are not always connected to the lan. You might want to look at slony with log-shipping http://www.slony.info/documentation/logshipping.html Have a master server on the lan, others grab files as and when they need. Do read the "limitations" though. > The application does not allow writeback to the db, so for all intents > and purposes you can consider it read only. Fine with slony. > To keep the applications database up to date with new information I > would be using ETL applications like Spoon / PDI. This will be done to > an as yet undecided 'point of origin', but it is probably safe to say > that it will be a commercial db server somewhere on our network. The > latency from our network to the 'Data Warehouse' (read as badly > managed dogs breakfast) is huge. Suffice to say the desire for local > db's is high, as is the desire to make the application portable for > our sometimes connected laptop users. The syncing with your commercial DB is probably the most fiddly bit. That's not so bad, since it's one-way. > Does anyone have any experience or comments that they would like to > share about this sort of scenario? Its a fairly big jump from just > having Postgres running on my laptop for dev purposes to pushing this > to multiple machines and I would really appreciate any feedback you > guys might have. Not used the log-shipping variant of slony, but I'm happy enough with the regular connected-version. -- Richard Huxton Archonet Ltd
Richard Huxton wrote: > Mr.Frog.to.you@googlemail.com wrote: >> I am looking now at a scenario that does not seem to be a native >> ability of Postgres, but might possibly be overcome with Sequoia. I am >> hoping that there exists the possibility of using Sequoia to replicate >> a DB between / among a number of machines in the office, some of which >> are not always connected to the lan. > > You might want to look at slony with log-shipping > http://www.slony.info/documentation/logshipping.html > Have a master server on the lan, others grab files as and when they > need. Do read the "limitations" though. I haven't used this or Sequoia before but slony log-shipping sounds closer to what you want for the mobile users. Sequoia appears to cater for always available servers to replicate to so may not suite your mobile users. >> The application does not allow writeback to the db, so for all intents >> and purposes you can consider it read only. > > Fine with slony. > >> To keep the applications database up to date with new information I >> would be using ETL applications like Spoon / PDI. This will be done to >> an as yet undecided 'point of origin', but it is probably safe to say >> that it will be a commercial db server somewhere on our network. The >> latency from our network to the 'Data Warehouse' (read as badly >> managed dogs breakfast) is huge. Suffice to say the desire for local >> db's is high, as is the desire to make the application portable for >> our sometimes connected laptop users. > > The syncing with your commercial DB is probably the most fiddly bit. > That's not so bad, since it's one-way. > Not sure I get your whole plan here or not.... The main thing I am thinking here is whether you plan to have each user sync from the commercial db or another postgresql db? Sequoia appears to support different db sources but I would check whether it supports replicating the same data between different 'brands' of db or whether the master and slave must be the same brand if this is where you plan to use it. You could look at having postgresql draw data from your commercial db on a set schedule something like odbclink at pgfoundry may help there. How many of your users are mobile? Could most users be accommodated by one central server with some mobile users getting the local copy to take with them? Maybe syncing data when they request it? Or are you set on automating the syncing? -- Shane Ambler pgSQL (at) Sheeky (dot) Biz Get Sheeky @ http://Sheeky.Biz
Thanks for the help guys, I should clear up a little what I am trying to achieve I think. The primary users of this db and application will be located in an office, each user with a desktop machine, all networked. They need to work with this DB in a fairly heavy kind of way, in so far as to say that 80% of their day will be working with the application and the db. The primary source of data will / must be located on a database server that is actually in a different facility. It is possible to reach this server from the office, and is done so daily, however the speed of connection is very slow and is frequently disconnected - in short unrelaible. To implement an extension of this 'primary' db with the associated hardware and licensing costs at the local site is beyond what the business is willing to pay. It also goes directly against the 'structure' that has been laid out by the IT group in that they want all the db servers in a single location - regardless of business impact they want to make their budget savings. So, what I want to do is to satisfy the IT group by keeping a 'master' copy of the db on their off-site facility, which in fact will be populated from a source system sitting on my desk. The ETL tools will be used for creating a completely (or as near as possible) automated system for populating the 'master' that is offsite. What I wanted to do next was to have Postgres installed on each of the local users machines, along with the application they require, and run them as a cluster - if one db goes down or one machine dies the client software / app can still connect to the cluster and keep functioning from another machine. I could then have the defective machine attended to and if necessary re-built... In short the ability to work would not be interrupted. Or at least thats the hope. These desktops shut down each night too, as the staff leave to go home. There is no possibility to install a server locally (unfortunately). So with this in mind I was hoping that the 'automatic' nature of Sequoia would allow for recovery / updating from the master or others in the cluster and keep all the local db's up to date without the users having to do anything. There is also a desire to have a mobile copy of this db / app for some of the mobile users that come in to the office. They wont be able to update while external due to the way the network is designed, but once back in the office they could do this. I was hoping once again to keep this as effortless as possible for the users. I am still hoping that this may be achieveable. In summary, what we are looking at is an install of Postgres on each machine, a copy of Tomcat running the application, and maybe Sequoia or Slony or some combination of both. ETL is handled separately (by me) and the users are supposed to just be able to get on with their work. Do you think this is achieveable or am I up the creek and reaching too far here? Cheers The Frog (you caught me out - its not my real name!)