Обсуждение: Increasing test coverage of WAL redo functions
To test WAL replay, I often set up a master-standby system with
streaming replication and run "make installcheck" on the master.
However, the regression suite doesn't generate all WAL record types. I
spent some time looking at the lcov report (make coverage-html), and
crafted new tests to test those redo functions that were not otherwise
covered.
All the new test cases are related to indexes, mostly vacuuming them.
See attached. With this patch, all WAL record types are tested, although
there are still a few codepaths within the redo functions (aside from
"can't happen" error checks) that are not exercised.
There are a couple of problems with these new tests:
1. Whether the vacuum tests test what they're supposed to, depends on
what else is going on in the system. If there's another backend present
that holds onto an snapshot, vacuum won't be able to remove any rows, so
that code will go untested. Running those tests in parallel with other
tests makes it quite likely that nothing can be vacuumed.
2. These make the regression database larger. The following tables and
indexes are added:
postgres=# \d+
                          List of relations
  Schema |       Name       | Type  | Owner  |  Size   | Description
--------+------------------+-------+--------+---------+-------------
  public | btree_tall_tbl   | table | heikki | 24 kB   |
  public | btree_test_tbl   | table | heikki | 392 kB  |
  public | gin_test_tbl     | table | heikki | 588 MB  |
  public | gist_point_tbl   | table | heikki | 1056 kB |
  public | spgist_point_tbl | table | heikki | 1056 kB |
  public | spgist_text_tbl  | table | heikki | 1472 kB |
(6 rows)
postgres=# \di+
                                    List of relations
  Schema |       Name       | Type  | Owner  |      Table       |  Size
   | Descri
ption
--------+------------------+-------+--------+------------------+---------+-------
------
  public | btree_tall_idx   | index | heikki | btree_tall_tbl   | 1176 kB |
  public | btree_test_idx   | index | heikki | btree_test_tbl   | 472 kB  |
  public | gin_test_idx     | index | heikki | gin_test_tbl     | 220 MB  |
  public | gist_pointidx    | index | heikki | gist_point_tbl   | 1744 kB |
  public | spgist_point_idx | index | heikki | spgist_point_tbl | 1120 kB |
  public | spgist_text_idx  | index | heikki | spgist_text_tbl  | 440 kB  |
(6 rows)
The GIN test needs to create a huge table, to cover the case of
splitting an internal posting tree page. That 588MB table plus index is
obviously too much to include in the regular regression suite. I'm not
sure how much smaller it could be made, but it's going to be in that
ballpark anyway. It also takes a long time to run.
I think the rest are tolerable, they make the regression database about
9 MB larger, from 45 MB to 53 MB, and only take a few seconds to run, on
my laptop.
My plan is to leave out that large GIN test for now, and commit the
rest. I'll add comments to the vacuum tests explaining that it's a hit
and miss whether they manage to vacuum anything. It's still better to
have the tests even if they sometimes fail to test vacuum as intended,
than not have the tests at all. In either case, they won't fail unless
there's a bug somewhere, and they will still exercise some code that's
not otherwise tested at all.
Thoughts?
PS. The brin test case is currently in a funny position in
serial_schedule, compared to parallel_schedule. This patch moves it to
where I think it belongs.
- Heikki
			
		Вложения
Hi, On 2014-11-19 16:27:56 +0200, Heikki Linnakangas wrote: > To test WAL replay, I often set up a master-standby system with streaming > replication and run "make installcheck" on the master. However, the > regression suite doesn't generate all WAL record types. I spent some time > looking at the lcov report (make coverage-html), and crafted new tests to > test those redo functions that were not otherwise covered. That sounds good. A next step would be to have them run in some automated fashion... > All the new test cases are related to indexes, mostly vacuuming them. See > attached. With this patch, all WAL record types are tested, although there > are still a few codepaths within the redo functions (aside from "can't > happen" error checks) that are not exercised. > > There are a couple of problems with these new tests: > > 1. Whether the vacuum tests test what they're supposed to, depends on what > else is going on in the system. If there's another backend present that > holds onto an snapshot, vacuum won't be able to remove any rows, so that > code will go untested. Running those tests in parallel with other tests > makes it quite likely that nothing can be vacuumed. Yes, that's "annoying", but not prohibitive imo. Even having the routines only triggered every couple runs is better than them not being triggered at all. > 2. These make the regression database larger. The following tables and > indexes are added: > > postgres=# \d+ > List of relations > Schema | Name | Type | Owner | Size | Description > --------+------------------+-------+--------+---------+------------- > public | btree_tall_tbl | table | heikki | 24 kB | > public | btree_test_tbl | table | heikki | 392 kB | > public | gin_test_tbl | table | heikki | 588 MB | > public | gist_point_tbl | table | heikki | 1056 kB | > public | spgist_point_tbl | table | heikki | 1056 kB | > public | spgist_text_tbl | table | heikki | 1472 kB | > (6 rows) > > postgres=# \di+ > List of relations > Schema | Name | Type | Owner | Table | Size | > Descri > ption > --------+------------------+-------+--------+------------------+---------+------- > ------ > public | btree_tall_idx | index | heikki | btree_tall_tbl | 1176 kB | > public | btree_test_idx | index | heikki | btree_test_tbl | 472 kB | > public | gin_test_idx | index | heikki | gin_test_tbl | 220 MB | > public | gist_pointidx | index | heikki | gist_point_tbl | 1744 kB | > public | spgist_point_idx | index | heikki | spgist_point_tbl | 1120 kB | > public | spgist_text_idx | index | heikki | spgist_text_tbl | 440 kB | > (6 rows) > The GIN test needs to create a huge table, to cover the case of splitting an > internal posting tree page. That 588MB table plus index is obviously too > much to include in the regular regression suite. I'm not sure how much > smaller it could be made, but it's going to be in that ballpark anyway. It > also takes a long time to run. How long? > I think the rest are tolerable, they make the regression database about 9 MB > larger, from 45 MB to 53 MB, and only take a few seconds to run, on my > laptop. > My plan is to leave out that large GIN test for now, and commit the > rest. How about putting the gin test in a separate schedule? expensive_parallel_schedule or something? I'd be more comfortable if the tests were actually there, even if not run by default. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Heikki Linnakangas wrote: > 2. These make the regression database larger. The following tables and > indexes are added: > > postgres=# \d+ > List of relations > Schema | Name | Type | Owner | Size | Description > --------+------------------+-------+--------+---------+------------- > public | btree_tall_tbl | table | heikki | 24 kB | > public | btree_test_tbl | table | heikki | 392 kB | > public | gin_test_tbl | table | heikki | 588 MB | > public | gist_point_tbl | table | heikki | 1056 kB | > public | spgist_point_tbl | table | heikki | 1056 kB | > public | spgist_text_tbl | table | heikki | 1472 kB | > (6 rows) I think it's good to have these tests, though Tom was complaining earlier about the size of the regression test database. Would it work to have this in a separate test suite, like the numeric_big stuff? We can have it run optionally, and perhaps set up a few buildfarm members to exercise them on a regular basis. Also I'm surprised that BRIN did not turn up here. At least the "page evacuation protocol" to obtain a new revmap page is not exercised by the current tests. I suppose it's because all WAL records are covered by other activity, and page evacuation does not emit a specialized WAL record. If we have the "big" test for this, maybe we can enlarge the table for the brin index too to ensure we cover this. BTW looking at the lcov reports the other day I noticed that the lines PG_FUNCTION_INFO_V1 do not get marked as "ran", which decreases the coverage percentages ... in one of the BRIN files this was quite noticeable, bringing the function coverage count down to about 50-60% when it should have been 100%. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2014-11-19 11:54:47 -0300, Alvaro Herrera wrote: > Heikki Linnakangas wrote: > > Schema | Name | Type | Owner | Size | Description > > --------+------------------+-------+--------+---------+------------- > > public | btree_tall_tbl | table | heikki | 24 kB | > > public | btree_test_tbl | table | heikki | 392 kB | > > public | gin_test_tbl | table | heikki | 588 MB | > > public | gist_point_tbl | table | heikki | 1056 kB | > > public | spgist_point_tbl | table | heikki | 1056 kB | > > public | spgist_text_tbl | table | heikki | 1472 kB | > > (6 rows) > > I think it's good to have these tests, though Tom was complaining > earlier about the size of the regression test database. Would it work > to have this in a separate test suite, like the numeric_big stuff? > We can have it run optionally, and perhaps set up a few buildfarm > members to exercise them on a regular basis. I think the tests except the gin one are resonably sized - I'd much rather run them all the time. We shouldn't make the buildfarm configuration unnecessarily complex. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 11/19/2014 04:54 PM, Alvaro Herrera wrote: > Also I'm surprised that BRIN did not turn up here. At least the "page > evacuation protocol" to obtain a new revmap page is not exercised by the > current tests. I suppose it's because all WAL records are covered by > other activity, and page evacuation does not emit a specialized WAL > record. If we have the "big" test for this, maybe we can enlarge the > table for the brin index too to ensure we cover this. Yeah, all of BRIN's redo functions are fully covered by current regression tests. Well done :-). The evacuation code indeed isn't covered. That would be pretty easy to fix; just make a table that has a brin index on it large enough. - Heikki
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Heikki Linnakangas wrote:
>> 2. These make the regression database larger. The following tables and
>> indexes are added:
> I think it's good to have these tests, though Tom was complaining
> earlier about the size of the regression test database.  Would it work
> to have this in a separate test suite, like the numeric_big stuff?
I was going to suggest the same.
> BTW looking at the lcov reports the other day I noticed that the lines
> PG_FUNCTION_INFO_V1 do not get marked as "ran", which decreases the
> coverage percentages ... in one of the BRIN files this was quite
> noticeable, bringing the function coverage count down to about 50-60%
> when it should have been 100%.
Kind of off topic for this thread, but why are those there at all?
They are unnecessary for internal C functions.
        regards, tom lane
			
		On 11/19/2014 05:01 PM, Andres Freund wrote: > On 2014-11-19 11:54:47 -0300, Alvaro Herrera wrote: >> Heikki Linnakangas wrote: >>> Schema | Name | Type | Owner | Size | Description >>> --------+------------------+-------+--------+---------+------------- >>> public | btree_tall_tbl | table | heikki | 24 kB | >>> public | btree_test_tbl | table | heikki | 392 kB | >>> public | gin_test_tbl | table | heikki | 588 MB | >>> public | gist_point_tbl | table | heikki | 1056 kB | >>> public | spgist_point_tbl | table | heikki | 1056 kB | >>> public | spgist_text_tbl | table | heikki | 1472 kB | >>> (6 rows) >> >> I think it's good to have these tests, though Tom was complaining >> earlier about the size of the regression test database. Would it work >> to have this in a separate test suite, like the numeric_big stuff? >> We can have it run optionally, and perhaps set up a few buildfarm >> members to exercise them on a regular basis. > > I think the tests except the gin one are resonably sized - I'd much > rather run them all the time. We shouldn't make the buildfarm > configuration unnecessarily complex. Agreed. Committed, except for the large GIN test. I kept a smaller version of the GIN test, which exercises vacuum and page deletion, but not the internal page split. I also slimmed down the other tests down a little bit. This grew "pg_dumpall | wc -c" from 5505689 to 6926066 bytes. The size of the regression database grew, according to psql's "\l+" command grew from 45 MB to 57 MB. The amount of WAL generated by "make installcheck" grew from 75 MB to 104 MB. That's a quite big increase in all those measures, but I think it's still acceptable, and I would really like these to run as part of the regular regression suite. - Heikki
On 2014-11-19 19:59:33 +0200, Heikki Linnakangas wrote: > On 11/19/2014 05:01 PM, Andres Freund wrote: > >On 2014-11-19 11:54:47 -0300, Alvaro Herrera wrote: > >>Heikki Linnakangas wrote: > >>> Schema | Name | Type | Owner | Size | Description > >>>--------+------------------+-------+--------+---------+------------- > >>> public | btree_tall_tbl | table | heikki | 24 kB | > >>> public | btree_test_tbl | table | heikki | 392 kB | > >>> public | gin_test_tbl | table | heikki | 588 MB | > >>> public | gist_point_tbl | table | heikki | 1056 kB | > >>> public | spgist_point_tbl | table | heikki | 1056 kB | > >>> public | spgist_text_tbl | table | heikki | 1472 kB | > >>>(6 rows) > >> > >>I think it's good to have these tests, though Tom was complaining > >>earlier about the size of the regression test database. Would it work > >>to have this in a separate test suite, like the numeric_big stuff? > >>We can have it run optionally, and perhaps set up a few buildfarm > >>members to exercise them on a regular basis. > > > >I think the tests except the gin one are resonably sized - I'd much > >rather run them all the time. We shouldn't make the buildfarm > >configuration unnecessarily complex. > > Agreed. > > Committed, except for the large GIN test. I kept a smaller version of the > GIN test, which exercises vacuum and page deletion, but not the internal > page split. I also slimmed down the other tests down a little bit. > > This grew "pg_dumpall | wc -c" from 5505689 to 6926066 bytes. The size of > the regression database grew, according to psql's "\l+" command grew from 45 > MB to 57 MB. The amount of WAL generated by "make installcheck" grew from 75 > MB to 104 MB. Why not just drop some of these slightly larger tables after the test? Then the maximum size of the database/the size of the dump doesn't increase as much? I don't think there's these are that interesting to look into after the test has finished. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-11-19 19:59:33 +0200, Heikki Linnakangas wrote:
>> This grew "pg_dumpall | wc -c" from 5505689 to 6926066 bytes. The size of
>> the regression database grew, according to psql's "\l+" command grew from 45
>> MB to 57 MB. The amount of WAL generated by "make installcheck" grew from 75
>> MB to 104 MB.
> Why not just drop some of these slightly larger tables after the test?
> Then the maximum size of the database/the size of the dump doesn't
> increase as much? I don't think there's these are that interesting to
> look into after the test has finished.
While the post-run database size is interesting, I think the peak in-run
size is probably the more critical number, since that affects whether
buildfarm critters with limited disk space are going to run out.
Still, I agree that we could drop any such tables that don't add new
cases for pg_dump to think about.
Another thought, if we drop large tables at completion of their tests,
is that these particular tests ought not be run in parallel with each
other.  That's just uselessly concentrating the peak space demand.
        regards, tom lane