Обсуждение: Please advise features in 7.1
Hello, I've looked at the resources available through the web page to CVS and other stuff, however I cant find a statement of whats likely to be in 7.1 and what is planned for later. Reason: I want to know if any of these features are scheduled. 1. Calculated fields in table definitions . eg. Create table test ( A Integer, B integer, the_sum As (A+B), ); This is like MSSQL 2. Any parameterised triggers 3. Any parameterised stored procedures that return a result set. These are _extraordinarily_ useful for application development. If anyone has a way of bolting on any of these to 7.0, I'd be keen to hear from you. Regards John
"John Huttley" <John@mwk.co.nz> writes:
> Reason: I want to know if any of these features are scheduled.
> 1. Calculated fields in table definitions . eg.
>    Create table test (
>             A Integer,
>             B integer,
>            the_sum   As  (A+B),
> );
You can do that now (and for many versions past) with a trigger.
It's not quite as convenient as it ought to be, but it's possible.
AFAIK there's no change in that situation for 7.1.
> 2. Any parameterised triggers
We've had parameterized triggers for years.  Maybe you attach some
meaning to that term beyond what I do?
> 3. Any parameterised stored procedures that return a result set.
There is some support (dating back to Berkeley Postquel) for functions
returning sets, but it's pretty ugly and limited.  Proper support might
happen in 7.2 ...
        regards, tom lane
			
		----- Original Message ----- From: "Tom Lane" <tgl@sss.pgh.pa.us> To: "John Huttley" <John@mwk.co.nz> Cc: <pgsql-hackers@postgresql.org> Sent: Thursday, 23 November 2000 19:05 Subject: Re: [HACKERS] Please advise features in 7.1 > "John Huttley" <John@mwk.co.nz> writes: > > Reason: I want to know if any of these features are scheduled. > > > 1. Calculated fields in table definitions . eg. > > > Create table test ( > > A Integer, > > B integer, > > the_sum As (A+B), > > ); > > You can do that now (and for many versions past) with a trigger. > It's not quite as convenient as it ought to be, but it's possible. > AFAIK there's no change in that situation for 7.1. > Yes, Perhaps defining the table with a dummy field and setting up a 'before' trigger which replaced that field with a calculated value? Messy but feasible. > > 2. Any parameterised triggers > > We've had parameterized triggers for years. Maybe you attach some > meaning to that term beyond what I do? I'm referring to the manual that says functions used for triggers must have no parameters and return a type Opaque. And indeed it is impossible to create a trigger from a plSQL function that takes any parameters. Thus if we have a lot of triggers which are very similar, we cannot just use one function and pass an identifying parameter or two to it. We must create an individual function for each trigger. Its irritating more than fatal. > > 3. Any parameterised stored procedures that return a result set. > > There is some support (dating back to Berkeley Postquel) for functions > returning sets, but it's pretty ugly and limited. Proper support might > happen in 7.2 ... Something to look forward to! Meanwhile I'll have a play and see if its possible to use a read trigger to populate a temporary table. hmm, that might require a statement level trigger. Another thing for 7.2, i guess. The application programming we are doing now utilises stored procedures returning record sets (MSSQL) and the lack is showstopper in our migration plans. Sigh. Thanks Tom Regards John
----- Original Message ----- From: "Tom Lane" <tgl@sss.pgh.pa.us> To: "John Huttley" <John@mwk.co.nz> Cc: <pgsql-hackers@postgresql.org> Sent: Thursday, 23 November 2000 19:05 Subject: Re: [HACKERS] Please advise features in 7.1 > "John Huttley" <John@mwk.co.nz> writes: > > Reason: I want to know if any of these features are scheduled. > > > 1. Calculated fields in table definitions . eg. > > > Create table test ( > > A Integer, > > B integer, > > the_sum As (A+B), > > ); > > You can do that now (and for many versions past) with a trigger. > It's not quite as convenient as it ought to be, but it's possible. > AFAIK there's no change in that situation for 7.1. > Yes, Perhaps defining the table with a dummy field and setting up a 'before' trigger which replaced that field with a calculated value? Messy but feasible. > > 2. Any parameterised triggers > > We've had parameterized triggers for years. Maybe you attach some > meaning to that term beyond what I do? I'm referring to the manual that says functions used for triggers must have no parameters and return a type Opaque. And indeed it is impossible to create a trigger from a plSQL function that takes any parameters. Thus if we have a lot of triggers which are very similar, we cannot just use one function and pass an identifying parameter or two to it. We must create an individual function for each trigger. Its irritating more than fatal. > > 3. Any parameterised stored procedures that return a result set. > > There is some support (dating back to Berkeley Postquel) for functions > returning sets, but it's pretty ugly and limited. Proper support might > happen in 7.2 ... Something to look forward to! Meanwhile I'll have a play and see if its possible to use a read trigger to populate a temporary table. hmm, that might require a statement level trigger. Another thing for 7.2, i guess. The application programming we are doing now utilises stored procedures returning record sets (MSSQL) and the lack is showstopper in our migration plans. Sigh. Thanks Tom Regards John
At 18:00 23/11/00 +1300, John Huttley wrote: > >1. Calculated fields in table definitions . eg. > Can't really do this - you might want to consider a view with an insert & update rule. I'm not sure how flexible rules are and you may not be able to write rules to make views functions like tables, but that is at least part of their purpose I think. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
At 06:00 PM 11/23/00 +1300, John Huttley wrote: >1. Calculated fields in table definitions . eg. > > Create table test ( > A Integer, > B integer, > the_sum As (A+B), >); ... >These are _extraordinarily_ useful for application development. > >If anyone has a way of bolting on any of these to 7.0, I'd be keen to hear from >you. Create a trigger on insert/update for this case... - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
"john huttley" <john@mwk.co.nz> writes:
>> We've had parameterized triggers for years.  Maybe you attach some
>> meaning to that term beyond what I do?
> I'm referring to the manual that says functions used for triggers must
> have no parameters and return a type Opaque.
The function has to be declared that way, but you can actually pass a
set of string parameters to it from the CREATE TRIGGER command.  The
strings show up in some special variable or other inside the function.
(No, I don't know why it was done in that ugly way...)  See the manual's
discussion of trigger programming.
        regards, tom lane
			
		Thanks for your help, everyone. This is a summary of replies. 1. Calculated fields in table definitions . eg. Create table test ( A Integer, B integer, the_sum As (A+B), ); This functionality can be achieved through the use of views. Implementing the create table syntax may not be too hard, but not in 7.1... 2 Parameterised Triggers Functionality is there, just that the documentation gave the wrong implication. An user manual example of using parameterised triggers to implement referential integrity would be welcome. 3. Stored Procedures returning a record set. Dream on! Regards John
On Tue, 28 Nov 2000, John Huttley wrote: > 3. Stored Procedures returning a record set. > > Dream on! This is something I would be really interested to see working. What are the issues? my understanding is that it is technically feasible but too complicated to add to PL/PGsql? it seems to me a basic service that needs to be implemented soon, even if its just returning multiple rows of one column... - Andrew
Hi, how long is PG7.1 already in beta testing? can it be released before Christmas day? can PG7.1 will recover database fromsystem crash? Thanks, XuYifeng
At 04:17 PM 11/28/00 +0800, xuyifeng wrote: >Hi, > > how long is PG7.1 already in beta testing? can it be released before Christmas day? > can PG7.1 will recover database from system crash? This guy's a troll from the PHP Builder's site (at least, Tim Perdue and I suspect this due to some posts he made in regard to Tim's SourceForge/Postgres article). Since he's read Tim's article, and at least some of the follow-up posts (given that he's posted responses himself), he should know by now that PG 7.1 is still in a pre-beta state and won't be released before Christmas day. I also posted a fairly long answer to a question Tim's posted at phpbuilder.com regarding recoverability and this guy's undoubtably read it, too. Have I forgotten anything, xuyifeng? - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
no doubt, I have touched some problems PG has, right? if PG is so good, is there any necessary for the team to improve PG again? Regards, XuYifeng ----- Original Message ----- From: Don Baccus <dhogaza@pacifier.com> To: xuyifeng <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org> Sent: Tuesday, November 28, 2000 10:37 PM Subject: Re: [HACKERS] beta testing version > At 04:17 PM 11/28/00 +0800, xuyifeng wrote: > >Hi, > > > > how long is PG7.1 already in beta testing? can it be released before Christmas day? > > can PG7.1 will recover database from system crash? > > This guy's a troll from the PHP Builder's site (at least, Tim Perdue and I suspect this > due to some posts he made in regard to Tim's SourceForge/Postgres article). > > Since he's read Tim's article, and at least some of the follow-up posts (given that > he's posted responses himself), he should know by now that PG 7.1 is still in a pre-beta > state and won't be released before Christmas day. I also posted a fairly long answer > to a question Tim's posted at phpbuilder.com regarding recoverability and this guy's > undoubtably read it, too. > > Have I forgotten anything, xuyifeng? > > > > - Don Baccus, Portland OR <dhogaza@pacifier.com> > Nature photos, on-line guides, Pacific Northwest > Rare Bird Alert Service and other goodies at > http://donb.photo.net. >
At 11:15 PM 11/28/00 +0800, xuyifeng wrote: >no doubt, I have touched some problems PG has, right? if PG is so good, >is there any necessary for the team to improve PG again? See? Troll... The guy worships MySQL, just in case folks haven't made the connection. I'm going to ignore him from now on, suggest others do the same, I'm sure he'll go away eventually. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
you are complete wrong, if I don't like PG, I'll never go here or talk anything about PG, I don't care it. I just want PG can be improved quickly, for me crash recover is very urgent problem, otherewise PG is forced to stay on my desktop machine, We'll dare not move it to our Server, I always see myself as a customer, customer is always right. Regards, XuYifeng ----- Original Message ----- From: Don Baccus <dhogaza@pacifier.com> To: xuyifeng <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org> Sent: Tuesday, November 28, 2000 11:16 PM Subject: Re: [HACKERS] beta testing version > At 11:15 PM 11/28/00 +0800, xuyifeng wrote: > >no doubt, I have touched some problems PG has, right? if PG is so good, > >is there any necessary for the team to improve PG again? > > See? Troll... > > The guy worships MySQL, just in case folks haven't made the connection. > > I'm going to ignore him from now on, suggest others do the same, I'm sure > he'll go away eventually. > > > > - Don Baccus, Portland OR <dhogaza@pacifier.com> > Nature photos, on-line guides, Pacific Northwest > Rare Bird Alert Service and other goodies at > http://donb.photo.net. >
> no doubt,  I have touched some problems PG has, right?  if PG is so good,
> is there any necessary for the team  to improve PG again?
*rofl*
Good call Don :)
                    - Thomas
			
		On Tue, Nov 28, 2000 at 02:04:01PM +1300, John Huttley wrote: > Thanks for your help, everyone. > > This is a summary of replies. > > 1. Calculated fields in table definitions . eg. > > Create table test ( > A Integer, > B integer, > the_sum As (A+B), > ); > > This functionality can be achieved through the use of views. Using a view for this isn't quite the same functionality as a computed field, from what I understand, since the calculation will be done at SELECT time, rather than INSERT/UPDATE. This can also be done with a trigger, which, while more cumbersome to write, would be capable of doing the math at modification time. Ross -- Open source code is like a natural resource, it's the result of providing food and sunshine to programmers, and then staying out of their way. [...] [It] is not going away because it has utility for both the developers and users independent of economic motivations. Jim Flynn, Sunnyvale, Calif.
xuyifeng wrote: > I just noticed this conversation so I have not followed all of it, but you seem to have strange priorities > I just want PG can be improved quickly, for me crash recover is very urgent problem, Crash avoidance is usually much more urgent, at least on production servers. > otherewise PG is forced to stay on my desktop machine, We'll dare not move it to our Server, Why do you keep crashing your server ? If your desktop crashes less often than your server you might exchange them, no? > I always see myself as a customer, customer is always right. I'd like to see myself as being always right too ;) ------------------- Hannu
On Tue, 28 Nov 2000, xuyifeng wrote: > no doubt, I have touched some problems PG has, right? if PG is so good, > is there any necessary for the team to improve PG again? There is always room for improvements for any software package ... whether it be PgSQL, Linux, FreeBSD or PHPBuilder ... as ppl learn more, understand more and come up with new techniques, things tend to get better ... > > Regards, > XuYifeng > > ----- Original Message ----- > From: Don Baccus <dhogaza@pacifier.com> > To: xuyifeng <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org> > Sent: Tuesday, November 28, 2000 10:37 PM > Subject: Re: [HACKERS] beta testing version > > > > At 04:17 PM 11/28/00 +0800, xuyifeng wrote: > > >Hi, > > > > > > how long is PG7.1 already in beta testing? can it be released before Christmas day? > > > can PG7.1 will recover database from system crash? > > > > This guy's a troll from the PHP Builder's site (at least, Tim Perdue and I suspect this > > due to some posts he made in regard to Tim's SourceForge/Postgres article). > > > > Since he's read Tim's article, and at least some of the follow-up posts (given that > > he's posted responses himself), he should know by now that PG 7.1 is still in a pre-beta > > state and won't be released before Christmas day. I also posted a fairly long answer > > to a question Tim's posted at phpbuilder.com regarding recoverability and this guy's > > undoubtably read it, too. > > > > Have I forgotten anything, xuyifeng? > > > > > > > > - Don Baccus, Portland OR <dhogaza@pacifier.com> > > Nature photos, on-line guides, Pacific Northwest > > Rare Bird Alert Service and other goodies at > > http://donb.photo.net. > > > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
On Tue, 28 Nov 2000, xuyifeng wrote: > you are complete wrong, if I don't like PG, I'll never go here or talk > anything about PG, I don't care it. I just want PG can be improved > quickly, for me crash recover is very urgent problem, otherewise PG is > forced to stay on my desktop machine, We'll dare not move it to our > Server, I always see myself as a customer, customer is always right. except when they are wrong ... ... but, as for crash recover, the plan right now is that on Thursday, Dec 1st, 7.1 goes beta ... if you are so keen on the crash recovery stuff, what I'd recommend is grab the snapshot, and work with that on your machine, get used to the features that it presents and report any bugs you find. Between beta and release, there will be bug fixes, but no features added, so it makes for a relatively safe starting point. I wouldn't use it in production (or, rather, I personally would, but it isn't something I'd recommend for the faint of heart), but it will give you a base to start from ... release will be shortly into the new year, depending on what sorts of bugs ppl report and how quickly they can be fixed ... if all goes well, Jan 1st will be release date, but, from experience, we're looking at closer to jan 15th :) > > Regards, > XuYifeng > > > ----- Original Message ----- > From: Don Baccus <dhogaza@pacifier.com> > To: xuyifeng <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org> > Sent: Tuesday, November 28, 2000 11:16 PM > Subject: Re: [HACKERS] beta testing version > > > > At 11:15 PM 11/28/00 +0800, xuyifeng wrote: > > >no doubt, I have touched some problems PG has, right? if PG is so good, > > >is there any necessary for the team to improve PG again? > > > > See? Troll... > > > > The guy worships MySQL, just in case folks haven't made the connection. > > > > I'm going to ignore him from now on, suggest others do the same, I'm sure > > he'll go away eventually. > > > > > > > > - Don Baccus, Portland OR <dhogaza@pacifier.com> > > Nature photos, on-line guides, Pacific Northwest > > Rare Bird Alert Service and other goodies at > > http://donb.photo.net. > > > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
On Tue, 28 Nov 2000, Hannu Krosing wrote: > xuyifeng wrote: > > > > I just noticed this conversation so I have not followed all of it, > but you seem to have strange priorities > > > I just want PG can be improved quickly, for me crash recover is very urgent problem, > > Crash avoidance is usually much more urgent, at least on production > servers. Good call, but I kinda jumped to the conclusion that since PgSQL itself isn't that crash prone, its his OS or his hardware that was the problem :0
This is one of the not-so-stomped boxes running PostgreSQL -- I've never restarted PostgreSQL on it since it was installed. 12:03pm up 122 days, 7:54, 1 user, load average: 0.08, 0.11, 0.09 I had some index corruption problems in 6.5.3 but since 7.0.X I haven't heard so much as a peep from any PostgreSQL backend. It's superbly stable on all my machines.. Damn good work guys. -Mitch ----- Original Message ----- From: "The Hermit Hacker" <scrappy@hub.org> To: "Hannu Krosing" <hannu@tm.ee> Cc: "xuyifeng" <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org>; "Don Baccus" <dhogaza@pacifier.com> Sent: Tuesday, November 28, 2000 8:53 AM Subject: Re: [HACKERS] beta testing version > On Tue, 28 Nov 2000, Hannu Krosing wrote: > > > xuyifeng wrote: > > > > > > > I just noticed this conversation so I have not followed all of it, > > but you seem to have strange priorities > > > > > I just want PG can be improved quickly, for me crash recover is very urgent problem, > > > > Crash avoidance is usually much more urgent, at least on production > > servers. > > Good call, but I kinda jumped to the conclusion that since PgSQL itself > isn't that crash prone, its his OS or his hardware that was the problem :0 > > >
Mitch Vincent wrote: > > This is one of the not-so-stomped boxes running PostgreSQL -- I've never > restarted PostgreSQL on it since it was installed. > 12:03pm up 122 days, 7:54, 1 user, load average: 0.08, 0.11, 0.09 > I had some index corruption problems in 6.5.3 but since 7.0.X I haven't > heard so much as a peep from any PostgreSQL backend. It's superbly stable on > all my machines.. I have a 6.5.x box at 328 days of active use. Crash "recovery" seems silly to me. :-) -Bop -- Brought to you from boop!, the dual boot Linux/Win95 Compaq Presario 1625 laptop, currently running RedHat 6.1. Your bopping may vary.
At 03:25 PM 11/28/00 -0700, Ron Chmara wrote: >Mitch Vincent wrote: >> >> This is one of the not-so-stomped boxes running PostgreSQL -- I've never >> restarted PostgreSQL on it since it was installed. >> 12:03pm up 122 days, 7:54, 1 user, load average: 0.08, 0.11, 0.09 >> I had some index corruption problems in 6.5.3 but since 7.0.X I haven't >> heard so much as a peep from any PostgreSQL backend. It's superbly stable on >> all my machines.. > >I have a 6.5.x box at 328 days of active use. > >Crash "recovery" seems silly to me. :-) Well, not really ... but since our troll is a devoted MySQL user, it's a bit of a red-herring anyway, at least as regards his own server. You know, the one he's afraid to put Postgres on, but sleeps soundly at night knowing the mighty bullet-proof MySQL with its full transaction semantics, archive logging and recovery from REDO logs and all that will save him? :) Again ... he's a troll, not even a very entertaining one. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
On Tue, 28 Nov 2000, Don Baccus wrote: > At 03:25 PM 11/28/00 -0700, Ron Chmara wrote: > >Mitch Vincent wrote: > >> > >> This is one of the not-so-stomped boxes running PostgreSQL -- I've never > >> restarted PostgreSQL on it since it was installed. > >> 12:03pm up 122 days, 7:54, 1 user, load average: 0.08, 0.11, 0.09 > >> I had some index corruption problems in 6.5.3 but since 7.0.X I haven't > >> heard so much as a peep from any PostgreSQL backend. It's superbly stable on > >> all my machines.. > > > >I have a 6.5.x box at 328 days of active use. > > > >Crash "recovery" seems silly to me. :-) > > Well, not really ... but since our troll is a devoted MySQL user, it's a bit > of a red-herring anyway, at least as regards his own server. > > You know, the one he's afraid to put Postgres on, but sleeps soundly at > night knowing the mighty bullet-proof MySQL with its full transaction > semantics, archive logging and recovery from REDO logs and all that > will save him? :) > > Again ... he's a troll, not even a very entertaining one. Or informed?
NO, I just tested how solid PgSQL is, I run a program busy inserting record into PG table, when I suddenly pulled out power from my machine and restarted PG, I can not insert any record into database table, all backends are dead without any respone (not core dump), note that I am using FreeBSD 4.2, it's rock solid, it's not OS crash, it just losted power. We use WindowsNT and MSSQL on our production server, before we accept MSSQL, we use this method to test if MSSQL can endure this kind of strik, it's OK, all databases are safely recovered, we can continue our work. we are a stock exchange company, our server are storing millilion $ finance number, we don't hope there are any problems in this case, we are using UPS, but UPS is not everything, it you bet everything on UPS, you must be idiot. I know you must be an avocation of PG, but we are professional customer, corporation user, we store critical data into database, not your garbage data. Regards, XuYifeng ----- Original Message ----- From: Don Baccus <dhogaza@pacifier.com> To: Ron Chmara <ron@opus1.com>; Mitch Vincent <mitch@venux.net>; <pgsql-hackers@postgresql.org> Sent: Wednesday, November 29, 2000 6:58 AM Subject: Re: [HACKERS] beta testing version > At 03:25 PM 11/28/00 -0700, Ron Chmara wrote: > >Mitch Vincent wrote: > >> > >> This is one of the not-so-stomped boxes running PostgreSQL -- I've never > >> restarted PostgreSQL on it since it was installed. > >> 12:03pm up 122 days, 7:54, 1 user, load average: 0.08, 0.11, 0.09 > >> I had some index corruption problems in 6.5.3 but since 7.0.X I haven't > >> heard so much as a peep from any PostgreSQL backend. It's superbly stable on > >> all my machines.. > > > >I have a 6.5.x box at 328 days of active use. > > > >Crash "recovery" seems silly to me. :-) > > Well, not really ... but since our troll is a devoted MySQL user, it's a bit > of a red-herring anyway, at least as regards his own server. > > You know, the one he's afraid to put Postgres on, but sleeps soundly at > night knowing the mighty bullet-proof MySQL with its full transaction > semantics, archive logging and recovery from REDO logs and all that > will save him? :) > > Again ... he's a troll, not even a very entertaining one. > > > > > - Don Baccus, Portland OR <dhogaza@pacifier.com> > Nature photos, on-line guides, Pacific Northwest > Rare Bird Alert Service and other goodies at > http://donb.photo.net. >
On Wed, Nov 29, 2000 at 09:59:34AM +0800, xuyifeng wrote: > NO, I just tested how solid PgSQL is, I run a program busy inserting > record into PG table, when I suddenly pulled out power from my machine ... Nobody claims PostgreSQL is proof against power failures. > ... We use WindowsNT and MSSQL on our production server, > before we accept MSSQL, we use this method to test if MSSQL can endure > this kind of strike, it's OK, all databases are safely recovered, we > can continue our work. You got lucky. Period. MSSQL is not proof against power failures, and neither is NTFS. In particular, that the database accepted transactions afterward is far from proof that its files were not corrupted. Incompetent testers produce invalid tests. Invalid tests lead to meaningless conclusions. Incompetent testers' employers suffer from false confidence, and poor decision-making. Nathan Myers ncm@zembu.com
xuyifeng wrote: > > NO, I just tested how solid PgSQL is, I run a program busy inserting record into PG table, when I > suddenly pulled out power from my machine and restarted PG, I can not insert any record into database > table, all backends are dead without any respone (not core dump), note that I am using FreeBSD 4.2, > it's rock solid, it's not OS crash, it just losted power. We use WindowsNT and MSSQL on our production > server, before we accept MSSQL, we use this method to test if MSSQL can endure this kind of strik, > it's OK, all databases are safely recovered, we can continue our work. The only way to safely recover them after a major crash would be manual/supervised recovery from backups + logs As not even NTFS is safe from power failures (I have lost an NTFS file system a few times due to not having an UPS) it is irrelevant if MSSQL is. Even if MSSQL is "crash proof" (tm), how can you _prove_ your customers/superiors that the last N minutes of transactions were not lost ? If the DB is able to "continue your work" after the crash, you can of course cover up the fact that the crash even happened and blame the lost transactions on someone else when they surface at the next audit ;) Or just claim thet computer technology is so complicated that losing a few transactions is normal - but you could go on working ;) :~) ;-p What you want for mission-critical data is replicated databases or at least off-site logging, not "crash recovery" at some arbitrarily chosen layer. You will need to recover from the crash even if it destroys the whole computer. May I suggest another test for your NT/MSSQL setup - dont pull the plug but change the input voltage to 10 000 VAC, if this goes well, test vith 100 000 VAC ;) This is also a scenario much less likely to be protected by an UPS than power loss. > we are a stock exchange company, > our server are storing millilion $ finance number, we don't hope there are any problems in this case, > we are using UPS, but UPS is not everything, it you bet everything on UPS, you must be idiot. So are you, if you bet everything on hoping that DB will do crash recovery from any type of crash. A common case of "crash" that may need to be recovered from is also a human error , like typing drop database at the wrong console; > I know you must be an avocation of PG, but we are professional customer, corporation user, we store critical > data into database, not your garbage data. Then you'd better have a crash recovery infrastructure/procedures in place and not hope that DB server will do that automatically for you -------------------- Hannu
I don't have the same luck, sorry to say! I am running Mandrake linux with OpenWall patched 2.2.17 kernel, dual p3 550Mhz, 1gb memory. It's a really busy webserver that constantly is running with 10 in load. Sometime it spikes to ~40-50 in load (the most we had was 114(!)). I am running postgresql 7.0.2 (from the Mandrake rpm's). One problem i have is that in one database we rapidly insert/delete in some tables, and to maintain a good performance on that db, i have to run a vacuum every hour(!). I think that db has excessive indexes all over the place (if that could have anything to do with it?). Another other problem that is more severe is that the database "crashes" (read: stops working), if i run psql and do a select it says "001129.07:04:15.688 [25474] FATAL 1: Memory exhausted in AllocSetAlloc()" and fails. I have a cron script that watches postgres, and restarts it if it cant get a select right. It fails this way maybe once a day or two days. I've searched the mailinglist archives for this problem, but it allways seems that my problem doesn't fit the descriptions of the other ppl's problem generating this error message. I have not found the right time to upgrade to 7.0.3 yet, and i don't know if that would solve anything. Another problem i have is that i get "001128.12:58:01.248 [23444] FATAL 1: Socket command type unknown" in my logs. I don't know if i get that from the unix odbc driver, the remote windows odbc driver, or in unix standard db connections. I get "pq_recvbuf: unexpected EOF on client connection" alot too, but that i think only indicates that the socket was closed in a not-so-nice way, and that it is no "real" error. It seems that the psql windows odbc driver is generating this. The postmaster is running with these parameters: "-N 512 -B 1024 -i -o -S 4096" But as a happy note i can tell you that we have a Linux box here (pentium 100, kernel 2.0.3x) that has near 1000 days uptime, and runs postgres 6.5.x. It has never failed, not even a single time :) Magnus Naeslund -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-Programmer/Networker [|] Magnus NaeslundPGP Key: http://www.genline.nu/mag_pgp.txt -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ----- Original Message ----- From: "Mitch Vincent" <mitch@venux.net> To: <pgsql-hackers@postgresql.org> Sent: Tuesday, November 28, 2000 19:12 Subject: Re: [HACKERS] beta testing version > This is one of the not-so-stomped boxes running PostgreSQL -- I've never > restarted PostgreSQL on it since it was installed. > > 12:03pm up 122 days, 7:54, 1 user, load average: 0.08, 0.11, 0.09 > > I had some index corruption problems in 6.5.3 but since 7.0.X I haven't > heard so much as a peep from any PostgreSQL backend. It's superbly stable on > all my machines.. > > Damn good work guys. > > -Mitch > > ----- Original Message ----- > From: "The Hermit Hacker" <scrappy@hub.org> > To: "Hannu Krosing" <hannu@tm.ee> > Cc: "xuyifeng" <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org>; > "Don Baccus" <dhogaza@pacifier.com> > Sent: Tuesday, November 28, 2000 8:53 AM > Subject: Re: [HACKERS] beta testing version > > > > On Tue, 28 Nov 2000, Hannu Krosing wrote: > > > > > xuyifeng wrote: > > > > > > > > > > I just noticed this conversation so I have not followed all of it, > > > but you seem to have strange priorities > > > > > > > I just want PG can be improved quickly, for me crash recover is very > urgent problem, > > > > > > Crash avoidance is usually much more urgent, at least on production > > > servers. > > > > Good call, but I kinda jumped to the conclusion that since PgSQL itself > > isn't that crash prone, its his OS or his hardware that was the problem :0 > > > > > > > >
> server, before we accept MSSQL, we use this method to test if MSSQL can endure this kind of strik, > it's OK, all databases are safely recovered, we can continue our work. we are a stock exchange company, And how exactly did you test the integrity of your data? Unless every single record has got at least a CRC stored somewhere, you won't be able AT ALL to check for database integrity. The reports from NTFS and MSSQL internal checking are meaningless for your data integrity. We are doing this checksumming in our project, and already got a few nasty surprises when the "CRC daemon" stumbled over a few corrupted records we never would have discovered otherwise. Exactly this checksumming weeded out our server alternatives; at present only PostgreSQL is left, was the most reliable of all. Horst
"Magnus Naeslund\(f\)" <mag@fbab.net> writes:
> Another other problem that is more severe is that the database "crashes"
> (read: stops working), if i run psql and do a select it says
> "001129.07:04:15.688 [25474] FATAL 1:  Memory exhausted in AllocSetAlloc()"
> and fails.
That's odd.  Does any select at all --- even, say, "SELECT 2+2" --- fail
like that, or just ones referencing a particular table, or maybe you
meant just one specific query?
> Another problem i have is that i get "001128.12:58:01.248 [23444] FATAL 1:
> Socket command type  unknown" in my logs. I don't know if i get that from
> the unix odbc driver, the remote windows odbc driver, or in unix standard db
> connections.
Do any of your client applications complain that they're being
disconnected on?  This might come from something not doing disconnection
cleanly, in which case the client probably wouldn't notice anything wrong.
> I get "pq_recvbuf: unexpected EOF on client connection" alot too, but that i
> think only indicates that the socket was closed in a not-so-nice way, and
> that it is no "real" error.
> It seems that the psql windows odbc driver is generating this.
That message is quite harmless AFAIK, although it'd be nice to clean up
the ODBC driver so that it disconnects in the approved fashion.
        regards, tom lane
			
		Is "if" clause support in PG? for example: "drop table aa if exist" "insert into aa values(1) if not exists select * from aa where i=1" I would like PG support it. --- XuYifeng ----- Original Message ----- From: John Huttley <John@mwk.co.nz> To: <pgsql-hackers@postgresql.org> Sent: Tuesday, November 28, 2000 9:04 AM Subject: [HACKERS] Please advise features in 7.1 (SUMMARY) > Thanks for your help, everyone. > > This is a summary of replies. > > 1. Calculated fields in table definitions . eg. > > Create table test ( > A Integer, > B integer, > the_sum As (A+B), > ); > > This functionality can be achieved through the use of views. > Implementing the create table syntax may not be too hard, > but not in 7.1... > > 2 Parameterised Triggers > > Functionality is there, just that the documentation gave the wrong implication. > An user manual example of using parameterised triggers to implement referential > integrity > would be welcome. > > 3. Stored Procedures returning a record set. > > Dream on! > > > Regards > > John > > >
> our server alternatives; at present only PostgreSQL is left, was the most > reliable of all. mind i ask on which platform (Operating system) did you do your test,i'm mostly used to linux but after i paid my computer (still 5 month remaining),i want to get a used SGI box from reputable system and put NetBSD as well as PostgreSQL on it (and maybe AolServer too,depending on the threading model of NetBSD). Alain Toussaint
> Is "if" clause support in PG?
> for example:
> "drop table aa if exist"
> "insert into aa values(1) if not exists select * from aa where i=1"
No. afaict it is not in any SQL standard, so is unlikely to get much
attention from developers.
                    - Thomas
			
		On Thu, 30 Nov 2000, Thomas Lockhart wrote: > > Is "if" clause support in PG? > > for example: > > "drop table aa if exist" > > "insert into aa values(1) if not exists select * from aa where i=1" > > No. afaict it is not in any SQL standard, so is unlikely to get much > attention from developers. Plus, for that second one can't you just do: INSERT INTO aa SELECT 1 WHERE NOT EXISTS (SELECT * FROM aa WHERE i=1); - Andrew
At 05:24 AM 11/30/00 +0000, Thomas Lockhart wrote: >> Is "if" clause support in PG? >> for example: >> "drop table aa if exist" >> "insert into aa values(1) if not exists select * from aa where i=1" > >No. afaict it is not in any SQL standard, so is unlikely to get much >attention from developers. The insert, at least, can be written in standard SQL anyway... - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
v7.1 should improve crash recovery for situations like this ... you'll still have to do a recovery of the data on corruption of this magnitude, but at least with the WAL stuff that Vadim is producing, you'll be able to recover up until the point that the power cable was pulled out of the wall ... On Wed, 29 Nov 2000, xuyifeng wrote: > NO, I just tested how solid PgSQL is, I run a program busy inserting record into PG table, when I > suddenly pulled out power from my machine and restarted PG, I can not insert any record into database > table, all backends are dead without any respone (not core dump), note that I am using FreeBSD 4.2, > it's rock solid, it's not OS crash, it just losted power. We use WindowsNT and MSSQL on our production > server, before we accept MSSQL, we use this method to test if MSSQL can endure this kind of strik, > it's OK, all databases are safely recovered, we can continue our work. we are a stock exchange company, > our server are storing millilion $ finance number, we don't hope there are any problems in this case, > we are using UPS, but UPS is not everything, it you bet everything on UPS, you must be idiot. > I know you must be an avocation of PG, but we are professional customer, corporation user, we store critical > data into database, not your garbage data. > > Regards, > XuYifeng > > ----- Original Message ----- > From: Don Baccus <dhogaza@pacifier.com> > To: Ron Chmara <ron@opus1.com>; Mitch Vincent <mitch@venux.net>; <pgsql-hackers@postgresql.org> > Sent: Wednesday, November 29, 2000 6:58 AM > Subject: Re: [HACKERS] beta testing version > > > > At 03:25 PM 11/28/00 -0700, Ron Chmara wrote: > > >Mitch Vincent wrote: > > >> > > >> This is one of the not-so-stomped boxes running PostgreSQL -- I've never > > >> restarted PostgreSQL on it since it was installed. > > >> 12:03pm up 122 days, 7:54, 1 user, load average: 0.08, 0.11, 0.09 > > >> I had some index corruption problems in 6.5.3 but since 7.0.X I haven't > > >> heard so much as a peep from any PostgreSQL backend. It's superbly stable on > > >> all my machines.. > > > > > >I have a 6.5.x box at 328 days of active use. > > > > > >Crash "recovery" seems silly to me. :-) > > > > Well, not really ... but since our troll is a devoted MySQL user, it's a bit > > of a red-herring anyway, at least as regards his own server. > > > > You know, the one he's afraid to put Postgres on, but sleeps soundly at > > night knowing the mighty bullet-proof MySQL with its full transaction > > semantics, archive logging and recovery from REDO logs and all that > > will save him? :) > > > > Again ... he's a troll, not even a very entertaining one. > > > > > > > > > > - Don Baccus, Portland OR <dhogaza@pacifier.com> > > Nature photos, on-line guides, Pacific Northwest > > Rare Bird Alert Service and other goodies at > > http://donb.photo.net. > > > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
At 07:02 PM 11/30/00 -0400, The Hermit Hacker wrote: > >v7.1 should improve crash recovery for situations like this ... you'll >still have to do a recovery of the data on corruption of this magnitude, >but at least with the WAL stuff that Vadim is producing, you'll be able to >recover up until the point that the power cable was pulled out of the wall No, WAL won't help if an actual database file is corrupted, say by a disk drive hosing a block or portion thereof with zeros. WAL-based recovery at startup works on an intact database. Still, in the general case you need real backup and recovery tools. Then you can apply archives of REDOs to a backup made of a snapshot and rebuild up to the last transaction. As opposed to your last pg_dump. So what about mirroring (RAID 1)? As the docs tell ya, that protects you against one drive failing but not against power failure, which can cause bad data to be written to both mirrors if both are actively writing when the plug is pulled. Power failures are evil, face it! :) - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
On Thu, 30 Nov 2000, Don Baccus wrote: > At 07:02 PM 11/30/00 -0400, The Hermit Hacker wrote: > > > >v7.1 should improve crash recovery for situations like this ... you'll > >still have to do a recovery of the data on corruption of this magnitude, > >but at least with the WAL stuff that Vadim is producing, you'll be able to > >recover up until the point that the power cable was pulled out of the wall > > No, WAL won't help if an actual database file is corrupted, say by a > disk drive hosing a block or portion thereof with zeros. WAL-based > recovery at startup works on an intact database. No, WAL does help, cause you can then pull in your last dump and recover up to the moment that power cable was pulled out of the wall ...
On Thu, Nov 30, 2000 at 07:47:08PM -0400, The Hermit Hacker wrote: > On Thu, 30 Nov 2000, Don Baccus wrote: > > At 07:02 PM 11/30/00 -0400, The Hermit Hacker wrote: > > > > > >v7.1 should improve crash recovery for situations like this ... you'll > > >still have to do a recovery of the data on corruption of this magnitude, > > >but at least with the WAL stuff that Vadim is producing, you'll be able to > > >recover up until the point that the power cable was pulled out of the wall > > > > No, WAL won't help if an actual database file is corrupted, say by a > > disk drive hosing a block or portion thereof with zeros. WAL-based > > recovery at startup works on an intact database. > > No, WAL does help, cause you can then pull in your last dump and recover > up to the moment that power cable was pulled out of the wall ... False, on so many counts I can't list them all. Nathan Myers ncm
> > No, WAL does help, cause you can then pull in your last dump and recover > > up to the moment that power cable was pulled out of the wall ... > > False, on so many counts I can't list them all. Why? If we're not talking hardware damage and you have a dump made sometime previous to the crash, why wouldn't that work to restore the database? I've had to restore a corrupted database from a dump before, there wasn't any hardware damage, the database (more specifically the indexes) were corrupted. Of course WAL wasn't around but I don't see why this wouldn't work... Note I'm not saying you're wrong, just asking that you explain your comment a little more. If WAL can't be used to help recover from crashes where database corruption occurs, what good is it? -Mitch
On Thu, 30 Nov 2000, Nathan Myers wrote: > On Thu, Nov 30, 2000 at 07:47:08PM -0400, The Hermit Hacker wrote: > > On Thu, 30 Nov 2000, Don Baccus wrote: > > > At 07:02 PM 11/30/00 -0400, The Hermit Hacker wrote: > > > > > > > >v7.1 should improve crash recovery for situations like this ... you'll > > > >still have to do a recovery of the data on corruption of this magnitude, > > > >but at least with the WAL stuff that Vadim is producing, you'll be able to > > > >recover up until the point that the power cable was pulled out of the wall > > > > > > No, WAL won't help if an actual database file is corrupted, say by a > > > disk drive hosing a block or portion thereof with zeros. WAL-based > > > recovery at startup works on an intact database. > > > > No, WAL does help, cause you can then pull in your last dump and recover > > up to the moment that power cable was pulled out of the wall ... > > False, on so many counts I can't list them all. would love to hear them ... I'm always opening to having my misunderstandings corrected ...
On Thu, 30 Nov 2000, Nathan Myers wrote: > On Thu, Nov 30, 2000 at 07:47:08PM -0400, The Hermit Hacker wrote: > > On Thu, 30 Nov 2000, Don Baccus wrote: > > > At 07:02 PM 11/30/00 -0400, The Hermit Hacker wrote: > > > > > > > >v7.1 should improve crash recovery for situations like this ... you'll > > > >still have to do a recovery of the data on corruption of this magnitude, > > > >but at least with the WAL stuff that Vadim is producing, you'll be able to > > > >recover up until the point that the power cable was pulled out of the wall > > > > > > No, WAL won't help if an actual database file is corrupted, say by a > > > disk drive hosing a block or portion thereof with zeros. WAL-based > > > recovery at startup works on an intact database. > > > > No, WAL does help, cause you can then pull in your last dump and recover > > up to the moment that power cable was pulled out of the wall ... > > False, on so many counts I can't list them all. *YAWN*
On Thu, Nov 30, 2000 at 05:37:58PM -0800, Mitch Vincent wrote: > > > No, WAL does help, cause you can then pull in your last dump and recover > > > up to the moment that power cable was pulled out of the wall ... > > > > False, on so many counts I can't list them all. > > Why? If we're not talking hardware damage and you have a dump made > sometime previous to the crash, why wouldn't that work to restore the > database? I've had to restore a corrupted database from a dump before, > there wasn't any hardware damage, the database (more specifically the > indexes) were corrupted. Of course WAL wasn't around but I don't see > why this wouldn't work... I posted a more detailed explanation a few minutes ago, but it appears to have been eaten by the mailing list server. I won't re-post the explanations that you all have seen over the last two days, about disk behavior during a power outage; they're in the archives (I assume -- when last I checked, web access to it didn't work). Suffice to say that if you pull the plug, there is just too much about the state of the disks that is unknown. As for replaying logs against a restored snapshot dump... AIUI, a dump records tuples by OID, but the WAL refers to TIDs. Therefore, the WAL won't work as a re-do log to recover your transactions because the TIDs of the restored tables are all different. To get replaying we need an "update log", something that might be in 7.2 if somebody does a lot of work. > Note I'm not saying you're wrong, just asking that you explain your > comment a little more. If WAL can't be used to help recover from > crashes where database corruption occurs, what good is it? The WAL is a performance optimization for the current recovery capabilities, which assume uncorrupted table files. It protects against those database server crashes that happen not to corrupt the table files (i.e. most). It doesn't protect against corruption of the tables, by bugs in PG or in the OS or from "hardware events". It also doesn't protect against OS crashes that result in write-buffered sectors not having been written before the crash. Practically, this means that WAL file entries older than a few seconds are not useful for much. In general, it's foolish to expect a single system to store very valuable data with much confidence. To get full recoverability, you need a "hot failover" system duplicating your transactions in real time. (Even then, you're vulnerable to application-level mistakes.) Nathan Myers ncm@zembu.com
At 05:15 PM 11/30/00 -0800, Nathan Myers wrote: >As for replaying logs against a restored snapshot dump... AIUI, a >dump records tuples by OID, but the WAL refers to TIDs. Therefore, >the WAL won't work as a re-do log to recover your transactions >because the TIDs of the restored tables are all different. Actually, the dump doesn't record tuple OIDs (unless you specifically ask for them), it just dumps source sql. When this gets reloaded you get an equivalent database, but not the same database, that you started out with. That's why I've presumed you can't run the WAL against it. If you and I are wrong I'd love to be surprised! - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
On Thu, Nov 30, 2000 at 07:02:01PM -0400, The Hermit Hacker wrote: > > v7.1 should improve crash recovery ... > ... with the WAL stuff that Vadim is producing, you'll be able to > recover up until the point that the power cable was pulled out of > the wall. Please do not propagate falsehoods like the above. It creates unsatisfiable expectations, and leads people to fail to take proper precautions and recovery procedures. After a power outage on an active database, you may have corruption at low levels of the system, and unless you have enormous redundancy (and actually use it to verify everything) the corruption may go undetected and result in (subtly) wrong answers at any future time. The logging in 7.1 protects transactions against many sources of database crash, but not necessarily against OS crash, and certainly not against power failure. (You might get lucky, or you might just think you were lucky.) This is the same as for most databases; an embedded database that talks directly to the hardware might be able to do better. Nathan Myers ncm@zembu.com
On Thu, 30 Nov 2000, Nathan Myers wrote: > On Thu, Nov 30, 2000 at 07:02:01PM -0400, The Hermit Hacker wrote: > > > > v7.1 should improve crash recovery ... > > ... with the WAL stuff that Vadim is producing, you'll be able to > > recover up until the point that the power cable was pulled out of > > the wall. > > Please do not propagate falsehoods like the above. It creates > unsatisfiable expectations, and leads people to fail to take > proper precautions and recovery procedures. > > After a power outage on an active database, you may have corruption > at low levels of the system, and unless you have enormous redundancy > (and actually use it to verify everything) the corruption may go > undetected and result in (subtly) wrong answers at any future time. > > The logging in 7.1 protects transactions against many sources of > database crash, but not necessarily against OS crash, and certainly > not against power failure. (You might get lucky, or you might just > think you were lucky.) This is the same as for most databases; an > embedded database that talks directly to the hardware might be able > to do better. We're talking about transaction logging here ... nothing gets written to it until completed ... if I take a "known to be clean" backup from the night before, restore that and then run through the transaction logs, my data should be clean, unless my tape itself is corrupt. If the power goes off half way through a write to the log, then that transaction wouldn't be marked as completed and won't roll into the restore ... if a disk goes corrupt, I'd expect that the redo log would possibly have a problem with corruption .. but if I pull the plug, unless I've somehow damaged the disk, I would expect my redo log to be clean *and*, unless Vadim totally messed something up, if there is any corruption in the redo log, I'd expect that restoring from it would generate from red flags ...
On Thu, 30 Nov 2000, Nathan Myers wrote: > After a power outage on an active database, you may have corruption > at low levels of the system, and unless you have enormous redundancy > (and actually use it to verify everything) the corruption may go > undetected and result in (subtly) wrong answers at any future time. Nathan, why are you so hostile against postgres? Is there an ax to grind? The conditions under which WAL will completely recover your database: 1) OS guarantees complete ordering of fsync()'d writes. (i.e. having two blocks A and B, A is fsync'd before B, it could NOT happen that B is on disk but A is not). 2) on boot recovery, OS must not corrupt anything that was fsync'd. Rule 1) is met by all unixish OSes in existance. Rule 2 is met by some filesystems, such as reiserfs, tux2, and softupdates. > The logging in 7.1 protects transactions against many sources of > database crash, but not necessarily against OS crash, and certainly > not against power failure. (You might get lucky, or you might just > think you were lucky.) This is the same as for most databases; an > embedded database that talks directly to the hardware might be able > to do better.
> As for replaying logs against a restored snapshot dump... AIUI, a > dump records tuples by OID, but the WAL refers to TIDs. Therefore, > the WAL won't work as a re-do log to recover your transactions > because the TIDs of the restored tables are all different. True for current way of backing up - ie saving data in "external" (sql) format. But there is another way - saving data files in their natural (binary) format. WAL records may be applyed to such dump, right? > To get replaying we need an "update log", something that might be What did you mean by "update log"? Are you sure that WAL is not "update log" ? -:) > in 7.2 if somebody does a lot of work. > > > Note I'm not saying you're wrong, just asking that you explain your > > comment a little more. If WAL can't be used to help recover from > > crashes where database corruption occurs, what good is it? > > The WAL is a performance optimization for the current recovery > capabilities, which assume uncorrupted table files. It protects > against those database server crashes that happen not to corrupt > the table files (i.e. most). It doesn't protect against corruption > of the tables, by bugs in PG or in the OS or from "hardware events". > It also doesn't protect against OS crashes that result in > write-buffered sectors not having been written before the crash. > Practically, this means that WAL file entries older than a few > seconds are not useful for much. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Even now, without BAR, WAL entries become unuseful only after checkpoints (and I wouldn't recomend to create them each few seconds -:)). WAL based BAR will require archiving of log records. Vadim
On Fri, Dec 01, 2000 at 01:54:23AM -0500, Alex Pilosov wrote: > On Thu, 30 Nov 2000, Nathan Myers wrote: > > After a power outage on an active database, you may have corruption > > at low levels of the system, and unless you have enormous redundancy > > (and actually use it to verify everything) the corruption may go > > undetected and result in (subtly) wrong answers at any future time. > > Nathan, why are you so hostile against postgres? Is there an ax to grind? Alex, please don't invent enemies. It's clear what important features PostgreSQL still lacks; over the next several releases these features will be implemented, at great expense. PostgreSQL is useful and usable now, given reasonable precautions and expectations. In the future it will satisfy greater (albeit still reasonable) expectations. > The conditions under which WAL will completely recover your database: > > 1) OS guarantees complete ordering of fsync()'d writes. (i.e. having two > blocks A and B, A is fsync'd before B, it could NOT happen that B is on > disk but A is not). > 2) on boot recovery, OS must not corrupt anything that was fsync'd. > > Rule 1) is met by all unixish OSes in existance. Rule 2 is met by some > filesystems, such as reiserfs, tux2, and softupdates. No. The OS asks the disk to write blocks in a certain order, but disks normally reorder writes. Not only that; as noted earlier, typical disks report the write completed long before the blocks actually hit the disk. A logging file system protects against the simpler forms of OS crash, where the OS data-structure corruption is noticed before any more disk writes are scheduled. It can't (by itself) protect against disk errors. For critical applications, you must supply that protection yourself, with (e.g.) battery-backed mirroring. > > The logging in 7.1 protects transactions against many sources of > > database crash, but not necessarily against OS crash, and certainly > > not against power failure. (You might get lucky, or you might just > > think you were lucky.) This is the same as for most databases; an > > embedded database that talks directly to the hardware might be able > > to do better. The best possible database code can't overcome a broken OS or a broken disk. It would be unreasonable to expect otherwise. Nathan Myers ncm@zembu.com
Date: Fri, 1 Dec 2000 01:54:23 -0500 (EST) From: Alex Pilosov <alex@pilosoft.com> On Thu, 30 Nov 2000, Nathan Myers wrote: > After a power outage on an active database, you may have corruption > at low levels of the system, and unless you haveenormous redundancy > (and actually use it to verify everything) the corruption may go > undetected and result in(subtly) wrong answers at any future time. Nathan, why are you so hostile against postgres? Is there an ax to grind? I don't think he is being hostile (I work with him, so I know that he is generally pro-postgres). The conditions under which WAL will completely recover your database: 1) OS guarantees complete ordering of fsync()'dwrites. (i.e. having two blocks A and B, A is fsync'd before B, it could NOT happen that B is on disk but A isnot). 2) on boot recovery, OS must not corrupt anything that was fsync'd. Rule 1) is met by all unixish OSes in existance. Rule 2 is met by some filesystems, such as reiserfs, tux2, and softupdates. I think you are missing his main point, which he stated before, which is that modern disk hardware is both smarter and stupider than most people realize. Some disks cleverly accept writes into a RAM cache, and return a completion signal as soon as they have done that. They then feel free to reorder the writes to magnetic media as they see fit. This significantly helps performance. However, it means that all bets off on a sudden power loss. Your rule 1 is met at the OS level, but it is not met at the physical drive level. The fact that the OS guarantees ordering of fsync()'d writes means little since the drive is capable of reordering writes behind the back of the OS. At least with IDE, it is possible to tell the drive to disable this sort of caching and reordering. However, GNU/Linux, at least, does not do this. After all, doing it would hurt performance, and would move us back to the old days when operating systems had to care a great deal about disk geometry. I expect that careful attention to the physical disks you purchase can help you avoid these problems. For example, I would hope that EMC disk systems handle power loss gracefully. But if you buy ordinary off the shelf PC hardware, you really do need to arrange for a UPS, and some sort of automatic shutdown if the UPS is running low. Otherwise, although the odds are certainly with you, there is no 100% guarantee that a busy database will survive a sudden power outage. Ian
On Thu, Nov 30, 2000 at 11:06:31PM -0800, Vadim Mikheev wrote: > > As for replaying logs against a restored snapshot dump... AIUI, a > > dump records tuples by OID, but the WAL refers to TIDs. Therefore, > > the WAL won't work as a re-do log to recover your transactions > > because the TIDs of the restored tables are all different. > > True for current way of backing up - ie saving data in "external" > (sql) format. But there is another way - saving data files in their > natural (binary) format. WAL records may be applyed to > such dump, right? But (AIUI) you can only safely/usefully copy those files when the database is shut down. Many people hope to run PostgreSQL 24x7x365. With vacuuming, you might just as well shut down afterward; but when that goes away (in 7.2?), when will you get the chance to take your backups? Clearly we need either another form of snapshot backup that can be taken with the database running, and compatible with the current WAL (or some variation on it); or, we need another kind of log, in addition to the WAL. > > To get replaying we need an "update log", something that might be > > in 7.2 if somebody does a lot of work. > > What did you mean by "update log"? > Are you sure that WAL is not "update log" ? -:) No, I'm not sure. I think it's possible that a new backup utility could be written to make a hot backup which could be restored and then replayed using the current WAL format. It might be easier to add another log which could be replayed against the existing form of backups. That last is what I called the "update log". The point is, WAL now does one job superbly: maintain a consistent on-disk database image. Asking it to do something else, such as supporting hot BAR, could interfere with it doing its main job. Of course, only the person who implements hot BAR can say. Nathan Myers ncm@zembu.com
At 00:55 1/12/00 -0800, Nathan Myers wrote: >On Thu, Nov 30, 2000 at 11:06:31PM -0800, Vadim Mikheev wrote: >> > As for replaying logs against a restored snapshot dump... AIUI, a >> > dump records tuples by OID, but the WAL refers to TIDs. Therefore, >> > the WAL won't work as a re-do log to recover your transactions >> > because the TIDs of the restored tables are all different. >> >> True for current way of backing up - ie saving data in "external" >> (sql) format. But there is another way - saving data files in their >> natural (binary) format. WAL records may be applyed to >> such dump, right? > >But (AIUI) you can only safely/usefully copy those files when the >database is shut down. > This is not true; the way Vadim has implemeted WAL is to write a series of files of fixed size. When all transactions that have records in one file have completed, that file is (currently) deleted. When BAR is going, the files will be archived. The only circumstance in which this strategy will fail is if there are a large number of intensive long-standing single transactions - which is unlikely (not to mention bad practice). As a result of this, BAR will just need to take a snapshot of the database and apply the logs (basically like a very extended recovery process). You have raised some interesting issues regrading write-order etc. Can we assume that when fsync *returns*, all records are written - though not necessarily in the order that the IO's were executed? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
At 12:55 AM 12/1/00 -0800, Nathan Myers wrote: >Many people hope to run PostgreSQL 24x7x365. With vacuuming, you >might just as well shut down afterward; but when that goes away >(in 7.2?), when will you get the chance to take your backups? >Clearly we need either another form of snapshot backup that can >be taken with the database running, and compatible with the >current WAL (or some variation on it); or, we need another kind >of log, in addition to the WAL. Vadim's not ignorant of such matters, when he says "make a copy of the files" he's not talking about using tar on a running database. BAR tools are needed, as Vadim has pointed out here in the past. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
At 11:06 PM 11/30/00 -0800, Vadim Mikheev wrote: >> As for replaying logs against a restored snapshot dump... AIUI, a >> dump records tuples by OID, but the WAL refers to TIDs. Therefore, >> the WAL won't work as a re-do log to recover your transactions >> because the TIDs of the restored tables are all different. > >True for current way of backing up - ie saving data in "external" >(sql) format. But there is another way - saving data files in their >natural (binary) format. WAL records may be applyed to >such dump, right? Right. That's what's missing in PG 7.1, the existence of tools to make such backups. Probably the best answer to the "what does WAL get us, if it doesn't get us full recoverability" questions is to simply say "it's a prerequisite to getting full recoverability, PG 7.1 sets the foundation and later work will get us there". - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
At 12:30 AM 12/1/00 -0800, Ian Lance Taylor wrote: >For example, I would hope that EMC >disk systems handle power loss gracefully. They must, their marketing literature says so :) > But if you buy ordinary >off the shelf PC hardware, you really do need to arrange for a UPS, >and some sort of automatic shutdown if the UPS is running low. Which is what disk subsystems like those from EMC do for you. They've got build-in battery backup that lets them guarantee (assuming the hardware's working right) that in the case of a power outage, all blocks the operating system thinks have been written will in actuality be written before the disk subsystem powers itself down. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
> > > As for replaying logs against a restored snapshot dump... AIUI, a > > > dump records tuples by OID, but the WAL refers to TIDs. Therefore, > > > the WAL won't work as a re-do log to recover your transactions > > > because the TIDs of the restored tables are all different. > > > > True for current way of backing up - ie saving data in "external" > > (sql) format. But there is another way - saving data files in their > > natural (binary) format. WAL records may be applyed to > > such dump, right? > > But (AIUI) you can only safely/usefully copy those files when the > database is shut down. No. You can read/save datafiles at any time. But block reads must be "atomic" - no one should be able to change any part of a block while we read it. Cp & tar are probably not suitable for this, but internal BACKUP command could do this. Restoring from such backup will like recovering after pg_ctl -m i stop: all data blocks are consistent and WAL records may be applyed to them. > Many people hope to run PostgreSQL 24x7x365. With vacuuming, you > might just as well shut down afterward; but when that goes away > (in 7.2?), when will you get the chance to take your backups? Ability to shutdown 7.2 will be preserved -:)) But it's not required for backup. > > > To get replaying we need an "update log", something that might be > > > in 7.2 if somebody does a lot of work. > > > > What did you mean by "update log"? > > Are you sure that WAL is not "update log" ? -:) > > No, I'm not sure. I think it's possible that a new backup utility > could be written to make a hot backup which could be restored and > then replayed using the current WAL format. It might be easier to > add another log which could be replayed against the existing form > of backups. That last is what I called the "update log". Consistent read of data blocks is easier to implement, sure. > The point is, WAL now does one job superbly: maintain a consistent > on-disk database image. Asking it to do something else, such as > supporting hot BAR, could interfere with it doing its main job. > Of course, only the person who implements hot BAR can say. There will be no interference because of BAR will not ask WAL to do anything else it does right now - redo-ing changes. Vadim
On Fri, Dec 01, 2000 at 06:39:57AM -0800, Don Baccus wrote: > > Probably the best answer to the "what does WAL get us, if it doesn't > get us full recoverability" questions is to simply say "it's a > prerequisite to getting full recoverability, PG 7.1 sets the foundation > and later work will get us there". Not to quibble, but for most of us, the answer to Don's question is: "It gives a ~20x speedup over 7.0." That's pretty valuable to some of us. If it turns out to be useful for other stuff, that's gravy. Nathan Myers ncm@zembu.com
At 11:02 AM 12/1/00 -0800, Nathan Myers wrote: >On Fri, Dec 01, 2000 at 06:39:57AM -0800, Don Baccus wrote: >> >> Probably the best answer to the "what does WAL get us, if it doesn't >> get us full recoverability" questions is to simply say "it's a >> prerequisite to getting full recoverability, PG 7.1 sets the foundation >> and later work will get us there". > >Not to quibble, but for most of us, the answer to Don's question is: >"It gives a ~20x speedup over 7.0." That's pretty valuable to some of us. >If it turns out to be useful for other stuff, that's gravy. Oh, but given that power failures eat disks anyway, you can just run PG 7.0 with -F and be just as fast as PG 7.1, eh? With no theoretical loss in safety? Where's your faith in all that doom and gloom you've been spreading? :) :) You're right, of course, we'll get roughly -F performance while maintaining a much more comfortable level of risk than you get with -F. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
Ok, this has peaked my interest in learning exactly what WAL is and what it does... I don't see any in-depth explanation of WAL on the postgresql.org site, can someone point me to some documentation? (if any exists, that is). Thanks! -Mitch ----- Original Message ----- From: "Nathan Myers" <ncm@zembu.com> To: <pgsql-hackers@postgresql.org> Sent: Friday, December 01, 2000 11:02 AM Subject: Re: [HACKERS] beta testing version > On Fri, Dec 01, 2000 at 06:39:57AM -0800, Don Baccus wrote: > > > > Probably the best answer to the "what does WAL get us, if it doesn't > > get us full recoverability" questions is to simply say "it's a > > prerequisite to getting full recoverability, PG 7.1 sets the foundation > > and later work will get us there". > > Not to quibble, but for most of us, the answer to Don's question is: > "It gives a ~20x speedup over 7.0." That's pretty valuable to some of us. > If it turns out to be useful for other stuff, that's gravy. > > Nathan Myers > ncm@zembu.com >
On Fri, Dec 01, 2000 at 09:13:28PM +1100, Philip Warner wrote: > > You have raised some interesting issues regrading write-order etc. Can we > assume that when fsync *returns*, all records are written - though not > necessarily in the order that the IO's were executed? Not with ordinary disks. With a battery-backed disk server, yes. Nathan Myers ncm@zembu.com
On Fri, Dec 01, 2000 at 08:10:40AM -0800, Vadim Mikheev wrote: > > > ... a new backup utility > > could be written to make a hot backup which could be restored and > > then replayed using the current WAL format. It might be easier to > > add another log which could be replayed against the existing form > > of backups. That last is what I called the "update log". > > Consistent read of data blocks is easier to implement, sure. > > > The point is, WAL now does one job superbly: maintain a consistent > > on-disk database image. Asking it to do something else, such as > > supporting hot BAR, could interfere with it doing its main job. > > Of course, only the person who implements hot BAR can say. > > There will be no interference because of BAR will not ask WAL to do > anything else it does right now - redo-ing changes. The interference I meant is that the current WAL file format is designed for its current job. For BAR, you would be better-served by a more compact format, so you need not archive your logs so frequently. (The size of the WAL doesn't matter much because you can rotate them very quickly.) A more compact format is also better as a basis for replication, to minimize network traffic. To compress the WAL would hurt performance -- but adding performance was the point of the WAL. A log encoded at a much higher semantic level could be much more compact, but wouldn't be useful as a WAL because it describes differences from a snapshot backup, not from the current table file contents. Thus, I'm not saying that you can't implement both WAL and hot BAR using the same log; rather, it's just not _obviously_ the best way to do it. Nathan Myers ncm@zembu.com
From: "Nathan Myers" <ncm@zembu.com> > On Thu, Nov 30, 2000 at 07:02:01PM -0400, The Hermit Hacker wrote: > > [snip] > The logging in 7.1 protects transactions against many sources of > database crash, but not necessarily against OS crash, and certainly > not against power failure. (You might get lucky, or you might just > think you were lucky.) This is the same as for most databases; an > embedded database that talks directly to the hardware might be able > to do better. > If PG had a type of tree based logging filesystem, that it self handles, wouldn't that be almost perfectly safe? I mean that you might lose some data in an transaction, but the client never gets an OK anyways... Like a combination of raw block io and tux2 like fs. Doesn't Oracle do it's own block io, no? Magnus -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-Programmer/Networker [|] Magnus NaeslundPGP Key: http://www.genline.nu/mag_pgp.txt -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
On Fri, Dec 01, 2000 at 12:00:12AM -0400, The Hermit Hacker wrote: > On Thu, 30 Nov 2000, Nathan Myers wrote: > > On Thu, Nov 30, 2000 at 07:02:01PM -0400, The Hermit Hacker wrote: > > > v7.1 should improve crash recovery ... > > > ... with the WAL stuff that Vadim is producing, you'll be able to > > > recover up until the point that the power cable was pulled out of > > > the wall. > > > > Please do not propagate falsehoods like the above. It creates > > unsatisfiable expectations, and leads people to fail to take > > proper precautions and recovery procedures. > > > > After a power outage on an active database, you may have corruption > > at low levels of the system, and unless you have enormous redundancy > > (and actually use it to verify everything) the corruption may go > > undetected and result in (subtly) wrong answers at any future time. > > > > The logging in 7.1 protects transactions against many sources of > > database crash, but not necessarily against OS crash, and certainly > > not against power failure. (You might get lucky, or you might just > > think you were lucky.) This is the same as for most databases; an > > embedded database that talks directly to the hardware might be able > > to do better. > > We're talking about transaction logging here ... nothing gets written > to it until completed ... if I take a "known to be clean" backup from > the night before, restore that and then run through the transaction > logs, my data should be clean, unless my tape itself is corrupt. If > the power goes off half way through a write to the log, then that > transaction wouldn't be marked as completed and won't roll into the > restore ... Sorry, wrong. First, the only way that your backups could have any relationship with the transaction logs is if they are copies of the raw table files with the database shut down, rather than the normal "snapshot" backup. Second, the transaction log is not, as has been noted far too frequently for Vince's comfort, really written atomically. The OS has promised to write it atomically, and given the opportunity, it will. If you pull the plug, all promises are broken. > if a disk goes corrupt, I'd expect that the redo log would possibly > have a problem with corruption .. but if I pull the plug, unless I've > somehow damaged the disk, I would expect my redo log to be clean > *and*, unless Vadim totally messed something up, if there is any > corruption in the redo log, I'd expect that restoring from it would > generate from red flags ... You have great expectations, but nobody has done the work to satisfy them, so when you pull the plug, I'd expect that you will be left in the dark, alone and helpless. Vadim has done an excellent job on what he set out to do: optimize transaction processing. Designing and implementing a factor-of-twenty speed improvement on a professional-quality database engine demanded great effort and expertise. To complain that he hasn't also done a lot of other stuff would be petty. Nathan Myers ncm@zembu.com
ncm@zembu.com (Nathan Myers) writes:
> On Fri, Dec 01, 2000 at 09:13:28PM +1100, Philip Warner wrote:
>> You have raised some interesting issues regrading write-order etc. Can we
>> assume that when fsync *returns*, all records are written - though not
>> necessarily in the order that the IO's were executed?
> Not with ordinary disks.  With a battery-backed disk server, yes.
I think the real point of this discussion is that there's no such thing
as an ironclad guarantee.  That's why people make backups.
All we can do is the best we can ;-).  In that light, I think it's
reasonable for Postgres to proceed on the assumption that fsync does
what it claims to do, ie, all blocks are written when it returns.
We can't realistically expect to persuade a disk controller that
reorders writes to stop doing so.  We can, however, expect that we've
minimized the probability of failures induced by anything other than
disk hardware failure or power failure.
        regards, tom lane
			
		At 03:35 PM 11/30/00 -0800, Nathan Myers wrote: >On Thu, Nov 30, 2000 at 07:02:01PM -0400, The Hermit Hacker wrote: >> >> v7.1 should improve crash recovery ... >> ... with the WAL stuff that Vadim is producing, you'll be able to >> recover up until the point that the power cable was pulled out of >> the wall. > >Please do not propagate falsehoods like the above. It creates >unsatisfiable expectations, and leads people to fail to take >proper precautions and recovery procedures. Yeah, I posted similar stuff to the PHPbuilder forum in regard to PG. >The logging in 7.1 protects transactions against many sources of >database crash, but not necessarily against OS crash, and certainly >not against power failure. (You might get lucky, or you might just >think you were lucky.) This is the same as for most databases; an >embedded database that talks directly to the hardware might be able >to do better. Let's put it this way ... Oracle, a transaction-safe DB with REDO logging, has for a very long time implemented disk mirroring. Now, why would they do that if you could pull the plug on the processor and depend on REDO logging to save you? And even then you're expected to provide adequate power backup to enable clean shutdown. The real safety you get is that your battery sez "we need to shut down!" but has enough power to let you. Transactions in progress aren't logged, but everything else can tank cleanly, and your DB is in a consistent state. Mirroring protects you against (some) disk drive failures (but not those that are transparent to the RAID controller/driver - if your drive writes crap to the primary side of the mirror and no errors are returned to the hardware/driver, the other side of the mirror can faithfully reproduce them on the mirror!) But since drives contain bearings and such that are much more likely to fail than electronics (good electronics and good designs, at least), mechanical failure's more likely and will be known to whatever is driving the drive. And you're OK then... - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
On Thu, 30 Nov 2000, Nathan Myers wrote: > Second, the transaction log is not, as has been noted far too frequently > for Vince's comfort, really written atomically. The OS has promised > to write it atomically, and given the opportunity, it will. If you pull > the plug, all promises are broken. Say what? Vince. -- ========================================================================== Vince Vielhaber -- KA8CSH email: vev@michvhf.com http://www.pop4.net128K ISDN from $22.00/mo - 56K Dialup from $16.00/moat Pop4 Networking Online Campground Directory http://www.camping-usa.com Online Giftshop Superstore http://www.cloudninegifts.com ==========================================================================
At 02:47 PM 12/1/00 -0500, Tom Lane wrote: >All we can do is the best we can ;-). In that light, I think it's >reasonable for Postgres to proceed on the assumption that fsync does >what it claims to do, ie, all blocks are written when it returns. >We can't realistically expect to persuade a disk controller that >reorders writes to stop doing so. We can, however, expect that we've >minimized the probability of failures induced by anything other than >disk hardware failure or power failure. Right. This is very much the guarantee that RAID (non-zero) makes, except "other than disk hardware failure" is replaced by "other than the failure of two drives". RAID gives you that (very, very substantial boost which is why it is so popular for DB servers). It doesn't give you power failure assurance for much the same reason that PG (or Oracle, etc) can. If transaction processing alone could give you protection against a single disk hardware failure, Oracle wouldn't've bothered implementing mirroring in the past before software (and even reasonable hardware) RAID was available. Likewise, if mirroring + transaction processing could protect against disks hosing themselves in power failure situations Oracle wouldn't suggest that enterprise level customers invest in external disk subsystems with battery backup sufficient to guarantee everything the db server believes has been written really is written. Of course, Oracle license fees are high enough that proper hardware support tends to look cheap in comparison... Vadim's WAL code is excellent, and the fact that we run in essence with -F performance and also less write activity to the disk both increases performance, and tends to lessen the probability that the disk will actually be writing a block when the power goes off. The dice aren't quite so loaded against the server with this lowered disk activity... - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.