Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.130
diff -c -r1.130 config.sgml
*** doc/src/sgml/config.sgml	30 Jun 2007 19:12:01 -0000	1.130
--- doc/src/sgml/config.sgml	13 Jul 2007 11:17:36 -0000
***************
*** 1412,1417 ****
--- 1412,1457 ----
         </para>
        </listitem>
       </varlistentry>
+ 
+      <varlistentry id="guc-wal-writer-delay" xreflabel="wal_writer_delay">
+       <term><varname>wal_writer_delay</varname> (<type>integer</type>)</term>
+       <indexterm>
+        <primary><varname>wal_writer_delay</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+ 		Specifies the delay between activity rounds for the WAL Writer. In each
+ 		round the writer will flush WAL to disk. It then sleeps for 
+ 		wal_writer_delay milliseconds, and repeats. The default value is 200 
+ 		milliseconds (200ms). Note that on many systems, the effective 
+ 		resolution of sleep delays is 10 milliseconds; setting wal_writer_delay 
+ 		to a value that is not a multiple of 10 might have the same results as 
+ 		setting it to the next higher multiple of 10. This parameter can only 
+ 		be set in the postgresql.conf file or on the server command line.
+        </para>
+       </listitem>
+      </varlistentry>
+      
+      <varlistentry id="guc-synchronous-commit" xreflabel="synchronous_commit">
+       <term><varname>synchronous_commit</varname> (<type>boolean</type>)</term>
+       <indexterm>
+        <primary><varname>synchronous_commit</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+ 		Specifies whether explicit or implicit commit commands will wait 
+ 		for WAL to be written to disk before the command returns success.
+ 		Turning this parameter off will give you asynchronous commits, 
+ 		which will increase performance for some workloads, though 
+ 		introduces risk of data loss, see <xref linkend="wal-asynch-commit">.
+        </para>
+        <para>
+ 		This parameter can be set for individual transactions or sessions,
+ 		so it is advisable to use asynchronous commits only for those
+ 		transactions for which risk of data loss is acceptable.
+        </para>
+       </listitem>
+      </varlistentry>
       
       <varlistentry id="guc-wal-buffers" xreflabel="wal_buffers">
        <term><varname>wal_buffers</varname> (<type>integer</type>)</term>
Index: doc/src/sgml/wal.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/wal.sgml,v
retrieving revision 1.44
diff -c -r1.44 wal.sgml
*** doc/src/sgml/wal.sgml	28 Jun 2007 00:02:37 -0000	1.44
--- doc/src/sgml/wal.sgml	13 Jul 2007 11:17:38 -0000
***************
*** 23,29 ****
     ordinarily meets this requirement.  In fact, even if a computer is
     fatally damaged, if the disk drives survive they can be moved to
     another computer with similar hardware and all committed
!    transactions will remain intact.
    </para>
  
    <para>
--- 23,33 ----
     ordinarily meets this requirement.  In fact, even if a computer is
     fatally damaged, if the disk drives survive they can be moved to
     another computer with similar hardware and all committed
!    transactions will remain intact. We refer to this mode of operation 
!    as synchronous commit, since the user issuing the commit waits for 
!    the writing of the <acronym>WAL</acronym> to permanent storage. 
!    Asynchronous commit is also possible as a performance option, 
!    described in <xref linkend="wal-asynch-commit">
    </para>
  
    <para>
***************
*** 394,397 ****
--- 398,586 ----
     seem to be a problem in practice.
    </para>
   </sect1>
+ 
+  <sect1 id="wal-asynch-commit">
+   <title>Asynchronous Commit</title>
+ 
+    <indexterm>
+     <primary>synchronous commit</primary>
+    </indexterm>
+ 
+    <indexterm>
+     <primary>asynchronous commit</primary>
+    </indexterm>
+ 
+   <para>
+    Asynchronous Commit allows a commit command to complete faster, at the 
+    cost that the most recent transactions will be lost if the database 
+    should crash. This feature is particularly useful for real-time
+    sensor data collection applications, such as RFID tags or other
+    monitoring applications.
+   </para>
+ 
+   <para>
+    Normal commits, or as we now refer to them, synchronous commits, wait
+    for the writing of the <acronym>WAL</acronym> to permanent storage
+    before returning control to the user.  An asynchronous commit will
+    return control to the user <emphasis>before</> <acronym>WAL</acronym> 
+    data has been written and flushed to disk, which gives a significant 
+    performance boost if there has been few other disk accesses over the course 
+    of this transaction.  From an SQL perspective, this makes asynchronous 
+    commits particularly useful for shorter, mainly INSERT transactions or 
+    writes on tables small enough to reside mainly in memory.
+   </para>
+ 
+   <para>
+    Asynchronous commits introduce the risk of data loss. There
+    is a time window between the time that a commit has returned
+    successfully to the user and the time of the WAL write during which
+    data will certainly be lost <emphasis>if</> the server crashes.  
+    The data loss is deterministic: if the server stays up then the 
+    commits will be durable, while if the server crashes there will be 
+    certain data loss of the transactions that have most recently
+    committed.
+   </para>
+ 
+   <para>
+    The user can select the commit mode of their transactions, so that 
+    it is possible to have both synchronous and asynchronous commit 
+    transactions concurrently.  The proof that this is safe is presented 
+    later in this section.  The commit mode is 
+    controlled by the user settable parameter 
+    <xref linkend="guc-synchronous-commit">. 
+    <varname>synchronous_commit</varname> may be set in 
+    <filename>postgresql.conf</filename>, or for a specific session using 
+ <programlisting>
+ SET synchronous_commit = off
+ </programlisting>
+    which will provide faster, asynchronous commits. 
+    <varname>synchronous_commit</varname> can be set at any point right 
+    up to the final COMMIT statement.  This parameter may also be set for
+    just one individual transaction using 
+ <programlisting>
+ SET LOCAL synchronous_commit = off
+ </programlisting>
+   </para>
+ 
+   <para>
+    Whatever the setting of <varname>synchronous_commit</varname>, 
+    commits will always be synchronous for utility commands, such 
+    as VACUUM, as well as for any transaction that created or removed 
+    files.
+   </para>
+ 
+   <sect2 id="wal-asynch-commit-risk">
+    <title>Risk of Data Loss</title>
+ 
+    <para>
+ 	If the database crashes during the risk window between the 
+ 	asynchronous commit and the writing of the <acronym>WAL</acronym>
+     then data written during that transaction <emphasis>will</> be lost.
+ 	Only data written by asynchronous transactions will be lost, just as 
+ 	if the transaction had never actually completed.  Transactions are
+ 	atomic, so this data loss does not propagate and any already 
+ 	written data is completely safe.
+    </para>
+ 
+    <para>
+ 	The window of data loss is limited because the WAL Writer process 
+ 	regularly writes <acronym>WAL</acronym> to disk every 
+ 	<xref linkend="guc-wal-writer-delay"> milliseconds.
+ 	The actual maximum duration of the risk window is twice the 
+ 	<varname>wal_writer_delay</varname> because the writes are optimised
+ 	to favor writing whole pages at a time during busy periods.
+    </para>
+ 
+    <para>
+ 	The extent of data loss for asynchronous commit transactions will be
+ 	limited to the number of transactions that normally complete in the
+ 	duration of the maximum time window.  You can calculate this for your 
+ 	specific application/system, though typically this would be in the 
+ 	ballpark of 1000 transactions.  This aspect means that asynch commits
+ 	are a useful technique for applications such as sensor measurements or
+ 	web statistics where a steady stream of data must be processed, yet
+ 	individual measurements are not particularly important.  Data loss 
+ 	would not be acceptable when the data processed is financial
+ 	transactions or other valuable customer commitments.
+    </para>
+ 
+    <para>
+     Asynchronous commit provides different behaviour to setting 
+     <varname>fsync</varname> = off, since that is a server-wide
+     setting that will alter the behaviour of all transactions,
+     overriding the setting of <varname>synchronous_commit</varname>,
+     as well as risking much wider data loss.  With <varname>fsync</varname> 
+ 	= off the WAL written but not fsynced, so data is lost only in case
+ 	of a system crash. With asynchronous commit the WAL is not written
+ 	at all by the user, so data is lost if there is a database server crash.
+    </para>
+ 
+    <para>
+     <varname>commit_delay</varname> also sounds very similar to 
+ 	asynchronous commit, but it is actually a synchronous commit 
+ 	with an additional wait that allows a technique known as group 
+ 	commit. <varname>commit_delay</varname> is also a server-wide setting.
+    </para>
+ 
+   </sect2>
+ 
+   <sect2 id="wal-asynch-commit-proof">
+    <title>Proof of safety for concurrent use</title>
+ 
+    <para>
+ 	It is useful that we can run both synchronous and asynchronous 
+ 	commit transactions concurrently, though there must be a clear proof
+ 	to ensure we minimise the possibility of technical error.
+    </para>
+ 
+    <para>
+ 	We have two transactions, T1 and T2.  The Log Sequence Number (LSN)
+ 	is the point in the WAL sequence where a transaction commit is  
+ 	recorded, so LSN1 and LSN2 are the commit records of those transactions. 
+ 	If T2 can see changes made by T1 then when T2 commits it
+ 	must be true that LSN2 > LSN1.  Thus when T2 commits it is certain
+ 	that all of the changes made by T1 are also now recorded in the WAL.
+ 	This situation is true whether or not T1 was asynchronous or
+ 	synchronous.  As a result, it is safe for asynchronous commits and 
+ 	synchronous commits to work concurrently without endangering data 
+ 	written by synchronous commits.  Sub-transactions are not important 
+ 	here since the final write to disk only occurs at the commit of 
+ 	the top level transaction.
+    </para>
+ 
+    <para>
+ 	Changes to data blocks cannot reach disk unless WAL is flushed up 
+ 	to the point of the LSN of the data blocks.  Any attempt to write
+ 	unsafe data to disk will trigger a write which ensures the safety
+ 	of all data written by that and prior transactions. Data blocks 
+ 	and clog pages are both protected by LSNs.
+    </para>
+ 
+    <para>
+ 	Non-WAL-logged changes to a temp table are also safe.  Those changes
+ 	could reach disk in advance of T2's commit, but we don't care since 
+ 	temp table contents don't survive crashes anyway.
+    </para>
+ 
+    <para>
+ 	Non-WAL-logged change made via one of the paths we have introduced
+ 	to avoid WAL overhead for bulk updates.  In these cases it's entirely
+ 	possible for the data to reach disk before T1's commit, because T2 will
+ 	fsync it down to disk without any sort of interlock, as soon as it
+ 	finishes the bulk update.  However, all these paths are designed to 
+ 	write data that no other transaction can see until after T2 commits. 
+ 	That commit must follow T1's in the WAL log, so until it has reached 
+ 	disk, the contents of the bulk-updated file are unimportant after a
+ 	crash.
+    </para>
+ 
+    <para>
+ 	Transaction status hint bits are normally set for data rows changed 
+ 	by a transaction. When we attempt to set a hint bit, we check the
+ 	LSN for a transaction and if this has already been written then we
+ 	will set the status hint bit.  If we haven't yet flushed WAL up to
+ 	that LSN then we will defer writing the status hint bit for that row.
+    </para>
+   </sect2>
+  </sect1>
  </chapter>