Обсуждение: PostgreSQL Replication with read-only access to standby DB

Поиск
Список
Период
Сортировка

PostgreSQL Replication with read-only access to standby DB

От
Keaton Adams
Дата:
<font face="Verdana, Helvetica, Arial"><span style="font-size:12.0px">This is probably better answered by the
PostgreSQLdeveloper team, so I am posting/moving my discussion to this thread.  My apologies if there was a better
avenueto pursue this requested feature.<br /><br /> This is exactly what we are after.  Log based replication built
intothe core database that would also allow for read-only queries on the slave server.  Trigger based / application
layerbased replication is not a good option for our environment.  Neither is a solution that cannot easily handle DDL
replicationor multiple DBs per PostgreSQL instance.  We are using WAL log shipping in production through a series of
scriptsI wrote in order to have a hot-standby server, which has been working quite well since last September.  The
requestis to now allow that standby server to be more useful than just a hot spare by way of read-only queries for
reportingpurposes.<br /><br /> From a Google Code posting:<br /><br /> Title      Implementing support for read-only
querieson PITR slaves<br /> Student     Florian G. Pflug<br /> Mentor     Simon Riggs<br /><br /> "The support for PITR
(Point-In-Time-Recovery)in postgres can be used to build a simple form a master-slave replication. Currently, no
queriescan be executed on the slave, though - it only replays WAL (Write-Ahead-Log) segments it receives from the
master.I want to implement support for running read-only queries on such a PITR slave, making PITR useful not only for
disasterrecovery, but also for load-balancing."<br /><br /><br /> So, what would it take to get this read-only server
featureimplemented in PostgreSQL?  I have been working with PG / Open Source projects for only a year and need some
directionon how to propose having this development effort undertaken by the PG development group.  My prior experience
hasbeen all closed source from Oracle, Informix, IBM, etc.  If funding for this specific development effort would help
thisis an option that we could explore as well.<br /><br /> Thanks,<br /><br /> Keaton Adams<br /> MX Logic, Inc.<br
/><br/><br /><br /> On 3/26/08 11:48 AM, "Alvaro Herrera" <alvherre@commandprompt.com> wrote:<br /><br /><font
color="#0000FF">>Chris Browne wrote:<br /> > <br /></font><font color="#008000">>> I seem to recall there
beinga relevant Google Summer of Code project<br /> >> about this, last year.<br /></font><font
color="#0000FF">><br /></font><font color="#008000">>> I do not recall how far it got.  It obviously didn't
makeit into 8.3<br /> >> ;-)!<br /></font><font color="#0000FF">> <br /> > Some parts of it did -- for
examplewe got "read-only transactions"<br /> > which were a step towards that goal.  (The point here is that a
hot<br/> > standby needs to be able to execute readonly transactions.)<br /><br /></font></span></font><font
size="2"><fontface="Monaco, Courier New"><span style="font-size:10.0px">> against.  People who are using the current
warm-standbycode are already <br /> > grappling with issues like how to coordinate master/slave failover <br /> >
(includingmy second favorite acronym, STONITH for "shoot the other node <br /> > in the head").  I don't expect
handlingthat sort of thing will ever be <br /> > integrated into the PostgreSQL database core.  <br /><br /> Note
thatmost other database products don't integrate it in their core<br /> either.  They package separate tools for it,
andsell it as a single system,<br /> but you can often buy the separate tools independently.  Oracle's RAC is an<br />
exception,but it also works completely differently than any of this.<br /><br /> A<br /><br /><br /> > I know very
littleabout postgreSQL internals but it would be great if:<br /> > - WAL files could be applied while the standby
serveris operational / allow<br /> > read-only queries<br /><br /> This is the part that requires modifying
PostgreSQL,and that progress was <br /> made toward by Florian's GSoC project last summer.<br /><br /> > - Allow
masterserver to send WAL files to standby servers / * WAL traffic<br /> > to be streamed to another server<br />
>- Allow master server to send list of all known standby servers<br /> > - Allow standby server to check if
masterserver is alive and promote itself<br /> > as master (would need to ask / make sure other standby servers do
nottry<br /> > promote themselves at the same time)<br /><br /> These parts you could build right now, except that
there'snot too much <br /> value to more than one standby if you're not using them to execute queries <br /> against.
 Peoplewho are using the current warm-standby code are already <br /> grappling with issues like how to coordinate
master/slavefailover <br /> (including my second favorite acronym, STONITH for "shoot the other node <br /> in the
head"). I don't expect handling that sort of thing will ever be <br /> integrated into the PostgreSQL database core.
 Whatis happening instead <br /> is that the appropriate interfaces to allow building higher-level tools <br /> are
beingdesigned and made available.<br /><br /><br /> I'm in the same boat, looking for master-slave replication for 1
master& 2<br /> 'standby' read-only servers (one would get promoted to master in case of<br /> failure).<br /><br
/>I recently read about WAL here:<br /><font color="#0000FF"><u><a
href="http://developer.postgresql.org/pgdocs/postgres/warm-standby.html">http://developer.postgresql.org/pgdocs/postgres/warm-standby.html</a><br
/></u></font><br/> The standby server is not available for access, since it is continually<br /> performing recovery
processing.<br/><br /> PostgreSQL does not provide the system software required to identify a<br /> failure on the
primaryand notify the standby system and then the standby<br /> database server. Many such tools exist and are well
integratedwith other<br /> aspects required for successful failover, such as IP address migration.<br /><br /> In short
there'snot much automation magic at the moment and doesn't seem<br /> like what you're looking for. <br /> Pgpool-II
mightbe the best alternative. <br /><br /> I know very little about postgreSQL internals but it would be great if:<br
/>- WAL files could be applied while the standby server is operational / allow<br /> read-only queries<br /> - Allow
masterserver to send WAL files to standby servers / * WAL traffic<br /> to be streamed to another server<br /> - Allow
masterserver to send list of all known standby servers<br /> - Allow standby server to check if master server is alive
andpromote itself<br /> as master (would need to ask / make sure other standby servers do not try<br /> promote
themselvesat the same time)<br /><br /> Then in my ways, you can use a pool to query the read-only standby servers. <br
/><br/> As I was writing this out, I thought this would make a great SOC project,<br /> but then found it already
exists!<br/><font color="#0000FF"><u><a
href="http://code.google.com/soc/2007/postgres/appinfo.html?csaid=6545828A8197EBC6">http://code.google.com/soc/2007/postgres/appinfo.html?csaid=6545828A8197EBC6</a><br
/></u></font><br/> Great news, I'd be happy to pitch in any time to help design a solution like<br /> this :)<br /><br
/><br/></span></font></font><font face="Verdana, Helvetica, Arial"><span style="font-size:12.0px">Our organization is
lookingfor a hot-standby option for PostgreSQL that uses the WAL (transaction) data to keep the standby current and
alsoallows the standby to be read-only accessible for reporting.  We have implemented WAL shipping through a set of
scriptswe developed and that works well to have a standby DB on the ready in case we need to fail over, but we are
lookingto increase the value of the standby server by making it available for queries.  Because of the complexities of
ourenvironment using a table/trigger based replication method such as Slony won’t work well.<br /><br /> It would be
greatif there was a solution (Open Source or Commercial) that worked in a similar manner as Oracle Active Data
Guard:<br/><br /> “Oracle Active Data Guard enables a physical standby database to be open for read-only access – for
reporting,simple or complex queries – while changes from the production database are being applied to it. This means
anyoperation that requires up-to-date read-only access can be offloaded to the replica, enhancing and protecting the
performanceof the production database.”  <br /><br /> “All queries reading from the physical replica execute in
real-time,and return current results.  A Data Guard configuration consists of one production (or primary) database and
upto nine standby databases.  A standby database is initially created from a backup copy of the primary database.  Once
created,Data Guard automatically maintains the standby database as a synchronized copy of the primary database by
transmittingprimary database redo data to the standby system and then applying the redo data to the standby
database.”<br/><br /> Does anyone know of such a solution for PostgreSQL?<br /></span></font><font size="2"><font
face="Monaco,Courier New"><span style="font-size:10.0px"><br /></span></font></font><font color="#0000FF"><font
face="Verdana,Helvetica, Arial"><span style="font-size:12.0px"><br /><br /></span></font></font> 

Re: PostgreSQL Replication with read-only access to standby DB

От
Greg Smith
Дата:
On Thu, 27 Mar 2008, Keaton Adams wrote:

> I have been working with PG / Open Source projects for only a year and 
> need some direction on how to propose having this development effort 
> undertaken by the PG development group.

Don't everybody answer at once.

The "PostgreSQL Global Development Group" (or PGDG for short) is a legal 
entity with group of organizers.  They tell you outright at 
http://www.postgresql.org/developer/ : "We don't hire programmers, we 
reach across the Internet, drawing the best database developers in the 
world to PostgreSQL".  While it's great to donate money to them via 
http://www.postgresql.org/about/donate that fund is earmarked for 
"advocacy materials, conference expenses, legal expenses, and travel 
costs"; note the lack of the work "development" on that list.

> This is probably better answered by the PostgreSQL developer team

The people actually developing new features in PostgreSQL aren't all on a 
single "team" as you're used to in traditional software companies, they're 
a community:  lots of people with similar goals who happen to be working 
on the same project, each with their own agenda and source(s) of funding 
and motivation.  There is a "core team" of 7 people: 
http://www.postgresql.org/community/contributors/ but since they're all 
too busy to write your feature their existence doesn't really help you.

Bruce addresses part of what you're asking about at 
http://momjian.us/main/writings/pgsql/company_contributions/ which has 
some more good comments on the whole community aspect to things, I'd 
recommend that since you say you're still new to how open source projects 
work.  But that article is more aimed at companies offering bodies to work 
on the code rather than ones with dollars to spend.

The exact mechanics of how to effectively sponsor work on a feature you'd 
like to have is somewhat off-topic for this list.  Discussion here is 
aimed at hashing out technical issues, not business ones.  And it's kind 
of a touchy subject to bring up as well, since it's hard to make 
recommendations without looking unprofessional--which partly explains the 
dead silence you've gotten as a response here so far.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD