Discussion: data loss due to improper handling of postmaster ....
Hi folks, this is a long email. I too experienced a data loss of 11 hrs recently. I have the most recent PostgreSQL 7.2.1 on RedHat 6.2, but my case was a bit different, and I feel my wrong handling of the situation was also responsible for it. I would be grateful if someone could tell me what should have been done *instead* to prevent the data loss.

As far as I remember, the following is the post mortem: the load average of my database server had reached 5.15 and my website had become sluggish, so I decided to stop the postmaster and start it again (I don't know if it was the right thing, but it was intuitive to me). So I did:

# su - postgres
# pg_ctl stop

This did not work out; it said the postmaster could not be stopped.

# pg_ctl stop -m immediate

It said the postmaster was stopped, but that was wrong: ps auxwww still showed some processes running.

# pg_ctl -l /var/log/pgsql start

It said it started successfully (but in reality it had not). At this point the postmaster was neither dead nor running; essentially my live website was down, so under pressure I decided to reboot the system and told my ISP to do so. But even the reboot was not smooth. The unix admin at my ISP said some process would not let the system reboot (and it was the postmaster), so he had to power-cycle the machine, and the machine fscked on startup. As a result I got messages similar to the ones Bojan has given below, and my website was not connecting to the database; it kept saying "database in recovery mode....". Then I did "pg_ctl stop" and then start, but nothing worked out. Since it was my production database I had to restore it in minimum time, so I used my old backup that was 11 hrs old, and hence a major data loss.

I strongly believe PostgreSQL is the best open source database around and is *safe* unless fiddled with in a wrong manner. But there are problems in using it.
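The shutdown sequence above can be written down as a checklist-style script. This is only a sketch: the PGDATA and log paths are assumptions for a typical RedHat install, and with DRY_RUN=1 (the default here) it only prints the commands instead of executing them. The two key differences from the sequence above are using -m fast rather than -m immediate, and verifying with ps that every backend is really gone before starting again or rebooting.

```shell
# Sketch of a safer shutdown/restart sequence than "pg_ctl stop -m immediate".
# PGDATA and the log path are assumed values; adjust for your install.
# With DRY_RUN=1 (the default) the commands are only printed, not executed.
PGDATA=${PGDATA:-/var/lib/pgsql/data}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"        # dry run: show the command that would be executed
    else
        "$@"
    fi
}

# "fast" aborts open client sessions but still writes a clean shutdown
# checkpoint, unlike "immediate", which forces WAL recovery on the next start.
run pg_ctl stop -D "$PGDATA" -m fast

# Verify that every postgres backend is really gone before going further;
# starting a new postmaster while old backends are still alive is exactly
# the "neither dead nor running" state described above.
run ps auxwww

# Only once no postgres processes remain is it safe to start again (or reboot).
run pg_ctl start -D "$PGDATA" -l /var/log/pgsql
```

If even -m fast hangs, waiting it out is still safer than a power cycle: the postmaster holds up shutdown precisely while it is flushing data to disk.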
Due to the current lack of built-in failover and replication solutions in PostgreSQL, people like me tend to become desperate, because one cannot keep the webserver down for long, and as a result we take wrong steps. For mere mortals like me there should be a set of guidelines for safe handling of the server (DOs and DON'Ts type) to prevent DATA LOSS.

I would also like suggestions on how to live with PostgreSQL, with its current limitations in replication (or failover solutions), and without data loss. What I currently do is back up my database with pg_dump, but there are problems with it. Because of the large size of my database, pg_dump takes 20-30 mins and the server load increases, which means I cannot do it very frequently on my production server. So in the worst case I still lose data of a duration ranging from 1-24 hrs, depending on the frequency of pg_dump. And for many of us even 1 hour of data is *quite* a loss.

I would also like comments on the usability of the USOGRES / RSERV replication systems with PostgreSQL 7.2.1. Hoping to get some tips from the intellectuals out here.

regds
mallah.

On Tuesday 07 May 2002 07:52 pm, Bojan Belovic wrote:
> My database apparently crashed - don't know how or why. It happened in the
> middle of the night so I wasn't around to troubleshoot it at the time. It
> looks like it died during the scheduled vacuum.
>
> Here's the log that gets generated when I attempt to bring it back up:
>
> postmaster successfully started
> DEBUG:  database system shutdown was interrupted at 2002-05-07 09:35:35 EDT
> DEBUG:  CheckPoint record at (10, 1531023244)
> DEBUG:  Redo record at (10, 1531023244); Undo record at (10, 1531022908); Shutdown FALSE
> DEBUG:  NextTransactionId: 29939385; NextOid: 9729307
> DEBUG:  database system was not properly shut down; automatic recovery in progress...
> DEBUG:  redo starts at (10, 1531023308)
> DEBUG:  ReadRecord: record with zero len at (10, 1575756128)
> DEBUG:  redo done at (10, 1575756064)
> FATAL 2:  write(logfile 10 seg 93 off 15474688) failed: Success
> /usr/bin/postmaster: Startup proc 1339 exited with status 512 - abort
>
> Any suggestions? What are my options, other than doing a complete restore
> of the DB from a dump (which is not really an option as the backup is not
> as recent as it should be).
>
> Thanks!
>
> Bojan
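On the pg_dump frequency problem raised earlier in the thread, one common mitigation is to take compressed custom-format dumps at low CPU priority, which makes more frequent dumps tolerable on a loaded production box. This is only a sketch: the database name, backup directory, and retention period are placeholder assumptions, and with DRY_RUN=1 (the default) it only prints the commands.

```shell
# Sketch of a lower-impact pg_dump cycle. DBNAME, BACKUP_DIR and the 2-day
# retention are placeholder assumptions; DRY_RUN=1 (the default) only
# prints the commands instead of executing them.
DBNAME=${DBNAME:-mydb}
BACKUP_DIR=${BACKUP_DIR:-/var/backups/pgsql}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"        # dry run: show the command only
    else
        "$@"
    fi
}

STAMP=$(date +%Y%m%d%H%M)

# -Fc produces the compressed "custom" archive format: smaller on disk than
# a plain SQL dump, and selectively restorable with pg_restore. nice -n 19
# keeps the dump from starving the live backends of CPU.
run nice -n 19 pg_dump -Fc -f "$BACKUP_DIR/$DBNAME.$STAMP.dump" "$DBNAME"

# Prune dumps older than the retention window so the disk does not fill up.
run find "$BACKUP_DIR" -name "$DBNAME.*.dump" -mtime +2 -exec rm {} \;
```

Run from cron every hour or two; each dump is still only a point-in-time snapshot, so the worst-case loss window equals the cron interval, but that shrinks the 1-24 hr window described above.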
Hi all, Hi Tom,

I was exploring ways to improve the time required to update a large table from the join of two others, as in: UPDATE a FROM b, c;

I found that whatever index I create, compound or not, PG insists on performing the cartesian product first. Now, if "b" and "c" are relatively small, that makes sense, but when the cartesian product of "b" and "c" is an order of magnitude larger than "a" it makes no sense at all. Shouldn't the number of rows in "b" and "c" be reduced first to those matching the criteria on "a"? If "b" is meant to be a many-to-one relation to "c", shouldn't "a" be joined to "b" first? Is there a way I can force the "a" join "b" to happen first?

I also tried to do it in a PL/pgSQL FOR LOOP, but because the script is processed as a single transaction, I do not get much better performance. It would sure be nice to be able to commit the changes on a row-by-row basis.

THX

jll
Jean-Luc Lachance <jllachan@nsd.ca> writes:
> I was exploring ways to improve the time required to update a large
> table from the join of two others as in:
> UPDATE a FROM b, c;
> I found that whatever index I create, compound or not, PG insist on
> performing the cartesian product first.

Surely not.

test=# create table a (f1 int primary key, f2 int);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'a_pkey' for table 'a'
CREATE
test=# create table b (f1 int primary key, f2 int);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'b_pkey' for table 'b'
CREATE
test=# create table c (f1 int primary key, f2 int);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'c_pkey' for table 'c'
CREATE

test=# explain update a set f2 = a.f2 + 1 from b,c where a.f1 = b.f1;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..30121.50 rows=1000000 width=18)
   ->  Merge Join  (cost=0.00..121.50 rows=1000 width=18)
         Merge Cond: ("outer".f1 = "inner".f1)
         ->  Index Scan using a_pkey on a  (cost=0.00..52.00 rows=1000 width=14)
         ->  Index Scan using b_pkey on b  (cost=0.00..52.00 rows=1000 width=4)
   ->  Seq Scan on c  (cost=0.00..20.00 rows=1000 width=0)
(6 rows)

The target table doesn't have any special status in the planner's consideration of possible join paths.  So whatever problem you are having, it's not that.  How about providing a more complete description of your tables and query?

			regards, tom lane
Tom,

The table "c" is not implicated in your test query. Try:

create table a (a1 int primary key, a2 int, a3 int, a4 int);
create table b (b1 int, b2 int, b3 int, b4 int, primary key (b1, b2));
create table d (d1 int, d2 int, d3 int, d4 int, primary key (d1, d2));

explain update a set a4 = d.d2 from b,d where a.a2 = b.b1 and a.a3 = b.b2 and
b.b3 = d.d1 and b.b4 = d.d2 and a.a4 >= d.d3 and a.a4 <= d.d4;

Which is closer to what I have.

            +---------------------------
            |                          /\
  A1 A2 A3 A4   B1 B2 B3 B4   D1 D2 D3 D4
      |  |      |  |  |  |     |  |
      +---------+  |  +--------+  |
         |         |     |        |
         +---------+     +--------+

Tom Lane wrote:
>
> Jean-Luc Lachance <jllachan@nsd.ca> writes:
> > I was exploring ways to improve the time required to update a large
> > table from the join of two others as in:
> > UPDATE a FROM b, c;
> > I found that whatever index I create, compound or not, PG insist on
> > performing the cartesian product first.
>
> Surely not.
>
> test=# create table a (f1 int primary key, f2 int);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'a_pkey' for table 'a'
> CREATE
> test=# create table b (f1 int primary key, f2 int);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'b_pkey' for table 'b'
> CREATE
> test=# create table c (f1 int primary key, f2 int);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'c_pkey' for table 'c'
> CREATE
>
> test=# explain update a set f2 = a.f2 + 1 from b,c where a.f1 = b.f1;
>                                    QUERY PLAN
> ---------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..30121.50 rows=1000000 width=18)
>    ->  Merge Join  (cost=0.00..121.50 rows=1000 width=18)
>          Merge Cond: ("outer".f1 = "inner".f1)
>          ->  Index Scan using a_pkey on a  (cost=0.00..52.00 rows=1000 width=14)
>          ->  Index Scan using b_pkey on b  (cost=0.00..52.00 rows=1000 width=4)
>    ->  Seq Scan on c  (cost=0.00..20.00 rows=1000 width=0)
> (6 rows)
>
> The target table doesn't have any special status in the planner's
> consideration of possible join paths.  So whatever problem you are
> having, it's not that.  How about providing a more complete description
> of your tables and query?
>
> 			regards, tom lane
Jean-Luc Lachance <jllachan@nsd.ca> writes:
> Try:
> create table a (a1 int primary key, a2 int, a3 int, a4 int);
> create table b (b1 int, b2 int, b3 int, b4 int, primary key (b1, b2));
> create table d (d1 int, d2 int, d3 int, d4 int, primary key (d1, d2));
> explain update a set a4 = d.d2 from b,d where a.a2 = b.b1 and a.a3 = b.b2 and
> b.b3 = d.d1 and b.b4 = d.d2 and a.a4 >= d.d3 and a.a4 <= d.d4;
> Which is closer to what I have.

Well, I still don't see any tendency to want to form a "cartesian product", if by that you mean an unconstrained join.  I tried these cases:

After creating tables as above:

 Hash Join  (cost=129.77..162.27 rows=1 width=54)
   Hash Cond: ("outer".a2 = "inner".b1)
   Join Filter: (("outer".a3 = "inner".b2) AND ("outer".a4 >= "inner".d3) AND ("outer".a4 <= "inner".d4))
   ->  Seq Scan on a  (cost=0.00..20.00 rows=1000 width=22)
   ->  Hash  (cost=129.70..129.70 rows=25 width=32)
         ->  Merge Join  (cost=69.83..129.70 rows=25 width=32)
               Merge Cond: (("outer".d1 = "inner".b3) AND ("outer".d2 = "inner".b4))
               ->  Index Scan using d_pkey on d  (cost=0.00..52.00 rows=1000 width=16)
               ->  Sort  (cost=69.83..72.33 rows=1000 width=16)
                     ->  Seq Scan on b  (cost=0.00..20.00 rows=1000 width=16)

(I'm using current development sources here because of the nicer EXPLAIN display, but the planner behavior hasn't changed from 7.2 AFAIR.)

This doesn't tell us very much because the default assumptions for the table sizes are all the same, 1000 rows/10 pages.  I tried goosing it up:

test=# update pg_class set reltuples=1000000,relpages=10000 where relname = 'a';
UPDATE 1
test=# update pg_class set reltuples=1000000,relpages=10000 where relname = 'b';
UPDATE 1
test=# update pg_class set reltuples=1000000,relpages=10000 where relname = 'd';
UPDATE 1
test=# explain ...
 Merge Join  (cost=8294766.62..20924766.62 rows=69444444 width=54)
   Merge Cond: (("outer".b2 = "inner".a3) AND ("outer".b1 = "inner".a2))
   Join Filter: (("inner".a4 >= "outer".d3) AND ("inner".a4 <= "outer".d4))
   ->  Sort  (cost=8093208.78..8155708.78 rows=25000000 width=32)
         ->  Merge Join  (cost=373805.69..758805.69 rows=25000000 width=32)
               Merge Cond: (("outer".b4 = "inner".d2) AND ("outer".b3 = "inner".d1))
               ->  Sort  (cost=186902.84..189402.84 rows=1000000 width=16)
                     ->  Seq Scan on b  (cost=0.00..20000.00 rows=1000000 width=16)
               ->  Sort  (cost=186902.84..189402.84 rows=1000000 width=16)
                     ->  Seq Scan on d  (cost=0.00..20000.00 rows=1000000 width=16)
   ->  Sort  (cost=201557.84..204057.84 rows=1000000 width=22)
         ->  Seq Scan on a  (cost=0.00..20000.00 rows=1000000 width=22)

Or if we make table a smaller than the other two:

test=# update pg_class set reltuples=100000,relpages=1000 where relname = 'a';
UPDATE 1
test=# explain ...

 Merge Join  (cost=981132.44..2248632.44 rows=6944444 width=54)
   Merge Cond: (("outer".d2 = "inner".b4) AND ("outer".d1 = "inner".b3))
   Join Filter: (("inner".a4 >= "outer".d3) AND ("inner".a4 <= "outer".d4))
   ->  Sort  (cost=186902.84..189402.84 rows=1000000 width=16)
         ->  Seq Scan on d  (cost=0.00..20000.00 rows=1000000 width=16)
   ->  Sort  (cost=794229.60..800479.60 rows=2500000 width=38)
         ->  Merge Join  (cost=198580.89..241580.89 rows=2500000 width=38)
               Merge Cond: (("outer".a3 = "inner".b2) AND ("outer".a2 = "inner".b1))
               ->  Sort  (cost=11678.05..11928.05 rows=100000 width=22)
                     ->  Seq Scan on a  (cost=0.00..2000.00 rows=100000 width=22)
               ->  Sort  (cost=186902.84..189402.84 rows=1000000 width=16)
                     ->  Seq Scan on b  (cost=0.00..20000.00 rows=1000000 width=16)

The only thing I see here that is unhappy-making is that the planner does not recognize that joining both columns of a 2-column primary key guarantees a unique result (for instance, the join of a and b here should be estimated at 100000 rows, not 2500000, since there can't be more than one b row joined to any a row).
It would do better if it had any ANALYZE statistics to work with, but ideally it ought to be able to deduce that even without stats, just from the presence of a unique index.  It will do so for single-column unique indexes, but I haven't yet figured out a reasonable (read: inexpensive) way to make the corresponding deduction for the multi-column case.

In any case, toy examples like this aren't going to prove much about the behavior with real tables and real statistics ... if you want constructive answers you're going to need to show us your actual EXPLAIN ANALYZE results and the pg_stats entries for the tables.

			regards, tom lane
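Gathering what is asked for above might look like the following sketch. The database name is a placeholder assumption, and with DRY_RUN=1 (the default) the psql invocations are only printed, not executed.

```shell
# Sketch of collecting EXPLAIN ANALYZE output and pg_stats entries for the
# three tables in this thread. DBNAME is a placeholder; DRY_RUN=1 (the
# default) only prints the psql commands instead of executing them.
DBNAME=${DBNAME:-mydb}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"        # dry run: show the command only
    else
        "$@"
    fi
}

# Refresh the planner statistics first, so the plan reflects real data.
run psql "$DBNAME" -c "ANALYZE a; ANALYZE b; ANALYZE d;"

# EXPLAIN ANALYZE (new in 7.2) actually executes the statement, so wrap the
# UPDATE in a transaction and roll it back to leave the data untouched.
run psql "$DBNAME" -c "BEGIN; EXPLAIN ANALYZE UPDATE a SET a4 = d.d2 FROM b, d WHERE a.a2 = b.b1 AND a.a3 = b.b2 AND b.b3 = d.d1 AND b.b4 = d.d2 AND a.a4 >= d.d3 AND a.a4 <= d.d4; ROLLBACK;"

# The per-column statistics the planner is working from.
run psql "$DBNAME" -c "SELECT * FROM pg_stats WHERE tablename IN ('a', 'b', 'd');"
```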