WAL replay bugs
From | Heikki Linnakangas |
---|---|
Subject | WAL replay bugs |
Date | |
Msg-id | 5342EB88.2050506@vmware.com |
Responses | Re: WAL replay bugs (Michael Paquier <michael.paquier@gmail.com>); Re: WAL replay bugs (sachin kotwal <kotsachin@gmail.com>) |
List | pgsql-hackers |
I've been playing with a little hack that records a before and after image of every page modification that is WAL-logged, and writes the images to a file along with the LSN of the corresponding WAL record. I set up a master-standby replication with that hack in place in both servers, and ran the regression suite. Then I compared the after images after every WAL record, as written on master, and as replayed by the standby.

The idea is that the page content in the standby after replaying a WAL record should be identical to the page in the master, when the WAL record was generated. There are some known cases where that doesn't hold, but it's a useful sanity check.

To reduce noise, I've been focusing on one access method at a time, filtering out others. I did that for GIN first, and indeed found a bug in my new incomplete-split code, see commit 594bac42. After fixing that, and zeroing some padding bytes (38a2b95c), I'm now getting a clean run with that.

Next, I took on GiST, and lo and behold, found a bug there pretty quickly as well. This one has been there ever since we got Hot Standby: the redo of a page update (e.g. an insertion) resets the right-link of the page. If there is a concurrent scan, in a hot standby server, that scan might still need the right-link, and will hence miss some tuples. This can be reproduced like this:

1. in master, create test table.

-- (Assumes btree_gist is installed, which provides the GiST operator class for int4.)
CREATE TABLE gisttest (id int4);
CREATE INDEX gisttest_idx ON gisttest USING gist (id);
INSERT INTO gisttest SELECT g * 1000 from generate_series(1, 100000) g;

-- Test function. Starts a scan, fetches one row from it, then waits
-- 10 seconds until fetching the rest of the rows.
-- Returns the number of rows scanned. Should be 100000 if you follow
-- these test instructions.
CREATE OR REPLACE FUNCTION gisttestfunc() RETURNS int AS
$$
declare
  i int4;
  t text;
  cur CURSOR FOR SELECT 'foo' FROM gisttest WHERE id >= 0;
begin
  set enable_seqscan=off; set enable_bitmapscan=off;

  i = 0;
  OPEN cur;
  FETCH cur INTO t;

  perform pg_sleep(10);

  LOOP
    EXIT WHEN NOT FOUND; -- this is bogus on first iteration
    i = i + 1;
    FETCH cur INTO t;
  END LOOP;
  CLOSE cur;
  RETURN i;
END;
$$ LANGUAGE plpgsql;

2. in standby

SELECT gisttestfunc();
<blocks>

3. Quickly, before the scan in standby continues, cause some page splits in master:

INSERT INTO gisttest SELECT g * 1000+1 from generate_series(1, 100000) g;

4. The scan in standby finishes. It should return 100000, but will return a lower number if you hit the bug.

At a quick glance, I think fixing that is just a matter of not resetting the right-link. I'll take a closer look tomorrow, but for now I just wanted to report what I've been doing. I'll post the scripts I've been using later too - nag me if I don't.

- Heikki
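
As an illustration of the after-image comparison described at the start of this message, here is a minimal Python sketch. It is not the script used for the testing above: the message does not describe the format of the image files, so the in-memory mapping from an LSN string to page bytes, and the names `first_difference` and `compare_after_images`, are assumptions made for the example. It also does no masking for the known cases where master and standby pages legitimately differ.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: LSN (as text) -> after-image of the modified page.
PageImages = Dict[str, bytes]


def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    if a == b:
        return None
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One image is a prefix of the other, so they differ where the shorter ends.
    return min(len(a), len(b))


def compare_after_images(master: PageImages,
                         standby: PageImages) -> List[Tuple[str, int]]:
    """List (lsn, offset) for every WAL record whose replayed page differs."""
    mismatches = []
    for lsn, master_page in master.items():
        standby_page = standby.get(lsn)
        if standby_page is None:
            continue  # record not replayed by the standby; nothing to compare
        offset = first_difference(master_page, standby_page)
        if offset is not None:
            mismatches.append((lsn, offset))
    return mismatches


if __name__ == "__main__":
    # Made-up data: one 8192-byte page whose last byte differs after replay.
    master = {"0/16B3D50": bytes(8192)}
    standby = {"0/16B3D50": bytes(8191) + b"\x01"}
    print(compare_after_images(master, standby))  # [('0/16B3D50', 8191)]
```

Reporting the first differing offset, rather than only a yes/no answer, makes it easier to tell a real redo bug from uninitialized padding bytes of the kind mentioned for GIN.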