Discussion: Some newbie questions
Hello,

Could you please answer the following questions from a newbie:

What is a good way to start understanding the backend (postgres) code? Is there any documentation available especially for developers?

What is the commit log and why is it needed?

Why does a replication solution need log shipping, and why can't we just ship the transaction statements to a standby node?

To be continued ... ;)

Thanks,
Srinivas
M2Y wrote:
> Hello,
>
> Could you please answer the following questions from a newbie:
>
> What is a good way to start understanding the backend (postgres) code? Is
> there any documentation available especially for developers?

Most of the developer info is within comments in the code itself. Another place to start is http://www.postgresql.org/developer/coding

> What is the commit log and why is it needed?

To achieve ACID (Atomic, Consistent, Isolated, Durable).
The changes needed to complete a transaction are saved to the commit log and flushed to disk, then the data files are changed. If the power goes out during the data file modifications, the commit log can be used to complete the changes without losing any data.

> Why does a replication solution need log shipping, and why can't we
> just ship the transaction statements to a standby node?

Depends on what you wish to achieve. They are two ways to a similar solution.
Log shipping is part of the core code, with plans to make the duplicate server able to satisfy select queries.
Statement-based replication is offered by other options such as Slony.

Each has advantages and disadvantages. Transaction logs are part of normal operation and can be copied to another server in the background without adding load or delays to the master server.

Statement-based replication has the added complexity of waiting for the slaves to duplicate the transaction and handling errors from a slave applying the transaction. These solutions also tend to have restrictions when it comes to replicating DDL changes: they are implemented as triggers that run on INSERT/UPDATE, not on CREATE/ALTER TABLE.

--
Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz
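The log-first ordering described above (save the change to a log, flush it to disk, and only then change the data) can be illustrated with a toy sketch. This is not PostgreSQL code: the `TinyStore` class, the JSON record format, and the file name `changes.log` are all invented for illustration, and an in-memory dict stands in for the data files.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "changes.log")
        self.data = {}  # stands in for the data files
        self.recover()

    def recover(self):
        """Replay every logged change; this is what runs after a crash."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def log_change(self, key, value):
        """Step 1: append the change to the log and force it to disk."""
        with open(self.log_path, "a") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())

    def apply_change(self, key, value):
        """Step 2: modify the data itself (this part can be lost in a crash)."""
        self.data[key] = value

    def put(self, key, value):
        self.log_change(key, value)
        self.apply_change(key, value)


def demo():
    with tempfile.TemporaryDirectory() as directory:
        store = TinyStore(directory)
        store.put("a", 1)
        # Simulate power loss after the log flush but before the data change.
        store.log_change("b", 2)
        # "Restart": a fresh instance rebuilds its state from the log.
        recovered = TinyStore(directory)
        return recovered.data
```

In this sketch `demo()` returns `{'a': 1, 'b': 2}`: the change to `b` was never applied before the simulated crash, yet it survives because it reached the log first.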
Thanks Shane for your response...

On Sep 7, 11:52 pm, pgsql@Sheeky.Biz (Shane Ambler) wrote:
> > What is a good way to start understanding the backend (postgres) code? Is
> > there any documentation available especially for developers?
>
> Most of the developer info is within comments in the code itself.
> Another place to start is http://www.postgresql.org/developer/coding

I have seen this link. But I am looking (or hoping) for a design doc or technical doc that details what is happening under the hood, as it would save a lot of time in catching up with the mainstream.

> > What is the commit log and why is it needed?
>
> To achieve ACID (Atomic, Consistent, Isolated, Durable).
> The changes needed to complete a transaction are saved to the commit log
> and flushed to disk, then the data files are changed. If the power goes
> out during the data file modifications, the commit log can be used to
> complete the changes without losing any data.

This, I think, is the transaction log, or XLog. My question is about CLog, which holds two bits per transaction that denote the status of that transaction. Since there is the XLog, from which we can determine what changes we have to redo and undo, what is the need for this CLog?

> > Why does a replication solution need log shipping, and why can't we
> > just ship the transaction statements to a standby node?
>
> Depends on what you wish to achieve. They are two ways to a similar
> solution.
> Log shipping is part of the core code, with plans to make the duplicate
> server able to satisfy select queries.
> Statement-based replication is offered by other options such as Slony.
>
> Each has advantages and disadvantages. Transaction logs are part of
> normal operation and can be copied to another server in the background
> without adding load or delays to the master server.
>
> Statement-based replication has the added complexity of waiting for the
> slaves to duplicate the transaction and handling errors from a slave
> applying the transaction. These solutions also tend to have restrictions
> when it comes to replicating DDL changes: they are implemented as triggers
> that run on INSERT/UPDATE, not on CREATE/ALTER TABLE.

I agree. Assuming that both master and backup are running the same version of the server and both are in sync, why can't we just send the command statements to the standby in the main backend loop (before parsing) and let the standby ignore SELECT-type statements?

I am a beginner ... please forgive my ignorance and please provide some clarity so that I can understand the system better.

Thanks,
Srinivas
M2Y <mailtoyahoo@gmail.com> writes:
> On Sep 7, 11:52 pm, pgsql@Sheeky.Biz (Shane Ambler) wrote:
>> Most of the developer info is within comments in the code itself.
>> Another place to start is http://www.postgresql.org/developer/coding

> I have seen this link. But I am looking (or hoping) for a design doc
> or technical doc that details what is happening under the hood, as it
> would save a lot of time in catching up with the mainstream.

Well, you should certainly not neglect
http://developer.postgresql.org/pgdocs/postgres/internals.html
Also note that many subtrees of the source code contain README files with assorted overview material.

regards, tom lane
M2Y wrote:
> On Sep 7, 11:52 pm, pgsql@Sheeky.Biz (Shane Ambler) wrote:
> > > What is a good way to start understanding the backend (postgres) code? Is
> > > there any documentation available especially for developers?
>
> > > What is the commit log and why is it needed?
> >
> > To achieve ACID (Atomic, Consistent, Isolated, Durable).
> > The changes needed to complete a transaction are saved to the commit log
> > and flushed to disk, then the data files are changed. If the power goes
> > out during the data file modifications, the commit log can be used to
> > complete the changes without losing any data.
>
> This, I think, is the transaction log, or XLog. My question is about CLog,
> which holds two bits per transaction that denote the status of that
> transaction. Since there is the XLog, from which we can determine
> what changes we have to redo and undo, what is the need for this CLog?

That's correct -- what Shane is describing is the transaction log (usually known here as WAL). However, this xlog is write-only (except in the case of a crash); clog is read-write, and must be fast to query since it's used very frequently to determine the visibility of each tuple.

Perhaps what you need to read is the chapter on our MVCC implementation, which relies heavily on clog.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
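To make the "two bits per transaction, fast to query" idea concrete, here is a rough sketch of a packed status array and a lookup that reads it. It is only an illustration under stated assumptions: the `StatusLog` class, the numeric status codes, and the one-line visibility rule are invented for this example. The real visibility checks also take snapshots and deleting transactions into account, which this sketch ignores.

```python
# Status codes, two bits each (values chosen for this sketch).
IN_PROGRESS, COMMITTED, ABORTED = 0, 1, 2

XACTS_PER_BYTE = 4  # 8 bits per byte / 2 bits per transaction


class StatusLog:
    """Packed array holding a 2-bit status for each transaction id."""

    def __init__(self, max_xid):
        # One byte covers four transaction ids; ids run from 0 to max_xid.
        self.bits = bytearray((max_xid + XACTS_PER_BYTE) // XACTS_PER_BYTE)

    def set_status(self, xid, status):
        byte, slot = divmod(xid, XACTS_PER_BYTE)
        shift = slot * 2
        # Clear the two bits for this transaction, then write the new status.
        cleared = self.bits[byte] & ~(0b11 << shift) & 0xFF
        self.bits[byte] = cleared | (status << shift)

    def get_status(self, xid):
        byte, slot = divmod(xid, XACTS_PER_BYTE)
        return (self.bits[byte] >> (slot * 2)) & 0b11


def tuple_is_visible(log, inserting_xid):
    """Simplified rule: a row version counts only if the transaction
    that created it has committed."""
    return log.get_status(inserting_xid) == COMMITTED
```

A lookup is one index computation plus a shift and a mask, which is why a structure like this is cheap to consult for every row examined, whereas scanning a sequential log of changes for the same answer would not be.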
On Sun, 7 Sep 2008, M2Y wrote:

> Why does a replication solution need log shipping, and why can't we just
> ship the transaction statements to a standby node?

Here's one of the classic examples of why that doesn't work:

    create table x (d decimal);
    insert into x values (random());

If you execute those same statements on two different nodes, they will end up with different values for the random number, and therefore the nodes won't match anymore. A similar issue shows up if you use functions that check the current system time, which will be slightly different between the two: even if the clocks are perfectly synced, by the time the standby receives the transaction it will be later than on the original.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
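The same divergence can be reproduced outside the database with a small simulation. Everything here is hypothetical: the `Node` class and its method names are invented for illustration, and a seeded `random.Random` stands in for each server's independent random state.

```python
import random


class Node:
    """Toy server holding one table of numeric rows."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # each server has its own random state
        self.rows = []

    def execute_insert_random(self):
        """Run the statement itself, evaluating random() locally."""
        value = self.rng.random()
        self.rows.append(value)
        return value  # the concrete value a change log would record

    def apply_logged_row(self, value):
        """Apply an already-computed row, as log replay would."""
        self.rows.append(value)


def statement_shipping():
    primary, standby = Node(seed=1), Node(seed=2)
    primary.execute_insert_random()
    standby.execute_insert_random()  # same statement, re-evaluated
    return primary.rows == standby.rows


def log_shipping():
    primary, standby = Node(seed=1), Node(seed=2)
    value = primary.execute_insert_random()
    standby.apply_logged_row(value)  # result shipped, nothing re-evaluated
    return primary.rows == standby.rows
```

Here `statement_shipping()` returns `False` and `log_shipping()` returns `True`: re-running the statement re-evaluates the non-deterministic function, while shipping the recorded result copies the value the primary actually stored.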