Thread: Reduce maximum error in tuples estimation after vacuum.
Hello,

Postgresql estimates the number of live tuples after the vacuum has left some buffers unscanned. This estimation does well for most cases, but gives a completely different result under a strong imbalance of tuple density.

For example,

create table t (a int, b int);
insert into t (select a, (random() * 100000)::int from
  generate_series((select count(*) from t) + 1, 1000000) a);
update t set b = b + 1 where a < (select count(*) from t) * 0.7;
vacuum t;
delete from t where a < (select count(*) from t) * 0.99;

After this, pg_stat_user_tables.n_live_tup shows 417670, which is 41 times larger than the real number of rows 100001. And what makes it worse, neither autovacuum nor autoanalyze will run until n_dead_tup goes above about 8 times the real number of tuples in the table under the default settings.

| postgres=# select n_live_tup, n_dead_tup
|              from pg_stat_user_tables where relname='t';
|  n_live_tup | n_dead_tup
| ------------+------------
|      417670 |          0
|
| postgres=# select reltuples from pg_class where relname='t';
|  reltuples
| -----------
|     417670
|
| postgres=# select count(*) from t;
|  count
| -------
|  10001

Using n_dead_tup before vacuuming seems to make this better, but I heard that that plan was abandoned for some reason I don't know. So I've come up with another plan: using the FSM to estimate the tuple density in unscanned pages. The point is to make the estimation rely on the uniformity of tuple LENGTH instead of tuple density. This change seems to keep the error within a few times the real number of tuples. The additional page reads for the FSM are about 1/4000th (SlotsPerFSMPage) of the skipped pages, and I suppose this is tolerable during vacuum.

The overall algorithm can be illustrated as below:

- Sum up the used bytes, max offnum (PageGetMaxOffsetNumber), maximum free bytes for tuple data, and free bytes after page vacuum through all scanned pages.
- Sum up the free bytes informed by the FSM through all skipped pages.
- Calculate the mean tuple length from the overall used bytes, the sum of max offnums, and the number of scanned pages.
- Guess the tuple density in skipped pages using the overall free bytes from the FSM and the mean tuple length calculated above.
- Finally, feed the estimated number of live tuples BEFORE vacuum into vac_estimate_reltuples.

Of course this method is affected by imbalance of tuple LENGTH, but the error also seems to stay within a few times the number of tuples.

For rows with invariable length, the test for head shows the following, where "tups est" is pg_class.reltuples and "tups real" is count(*).

del% | pages | n_live_tup | tups est | tups real | est/real | bufs
-----+-------+------------+----------+-----------+----------+------
0.9  |  4425 |     100001 |   470626 |    100001 |    4.706 | 3985
0.95 |  4425 |      50001 |   441196 |     50001 |    8.824 | 4206
0.99 |  4425 |     417670 |   417670 |     10001 |   41.763 | 4383

and with the patch:

0.9  |  4425 |     106169 |   106169 |    100001 |    1.062 | 3985
0.95 |  4425 |      56373 |    56373 |     50001 |    1.127 | 4206
0.99 |  4425 |      10001 |    16535 |     10001 |    1.653 | 4383

What do you think about this?

=====

The attached files are:

- vacuum_est_improve_20130614.patch: the patch for this proposal
- vactest.sql: SQL script to cause the situation
- vactest.sh: test script to find the errors related to this patch
- test_result.txt: all of the test results for the various deletion ratios which the test script above yields
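To make the arithmetic of the steps above concrete, here is a minimal sketch of the proposed estimation in Python (for illustration only; the variable names follow the patch, and the figures in the usage note below are made up):

```python
def estimate_prev_reltuples(nblocks, scanned_pages,
                            scanned_used_bytes, scanned_freespace,
                            scanned_possible_max_tups, total_freespace):
    """Guess the pre-vacuum tuple count from FSM free space, as the
    proposed patch does when the relation was only partially scanned."""
    skipped_pages = nblocks - scanned_pages
    # Average usable bytes per scanned page: tuple data plus free space.
    buffer_max_bytes = (scanned_used_bytes + scanned_freespace) / scanned_pages
    # Mean tuple length, assuming tuple length is roughly uniform.
    mean_tup_len = buffer_max_bytes / (scanned_possible_max_tups / scanned_pages)
    # Free bytes on the skipped pages, as recorded in the FSM.
    skipped_freespace = total_freespace - scanned_freespace
    # Tuple density of a skipped page: used bytes divided by tuple length.
    skipped_tups_per_blk = (buffer_max_bytes
                            - skipped_freespace / skipped_pages) / mean_tup_len
    # The patch extrapolates this density over the whole relation.
    return skipped_tups_per_blk * nblocks
```

For example, with 60 of 100 pages scanned, each scanned page fully packed with 200 forty-byte tuples and each skipped page half empty, this yields a density of 100 tuples per block, i.e. an estimate of 10000 tuples before vacuum.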
regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index d6d20fd..1e581c1 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -1280,7 +1280,8 @@ acquire_sample_rows(Relation onerel, int elevel,
 	*totalrows = vac_estimate_reltuples(onerel, true,
 										totalblocks,
 										bs.m,
-										liverows);
+										liverows,
+										onerel->rd_rel->reltuples);
 	if (bs.m > 0)
 		*totaldeadrows = floor((deadrows / bs.m) * totalblocks + 0.5);
 	else
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 641c740..4bdf0c1 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -501,10 +501,10 @@ double
 vac_estimate_reltuples(Relation relation, bool is_analyze,
 					   BlockNumber total_pages,
 					   BlockNumber scanned_pages,
-					   double scanned_tuples)
+					   double scanned_tuples,
+					   double old_rel_tuples)
 {
 	BlockNumber old_rel_pages = relation->rd_rel->relpages;
-	double		old_rel_tuples = relation->rd_rel->reltuples;
 	double		old_density;
 	double		new_density;
 	double		multiplier;
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 7e46f9e..80304a6 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -396,7 +396,11 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 	double		num_tuples,
 				tups_vacuumed,
 				nkeep,
-				nunused;
+				nunused,
+				scanned_used_bytes,
+				scanned_possible_max_tups,
+				scanned_freespace,
+				total_freespace;
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
@@ -414,6 +418,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 	empty_pages = vacuumed_pages = 0;
 	num_tuples = tups_vacuumed = nkeep = nunused = 0;
+	scanned_used_bytes = scanned_possible_max_tups = 0;
+	scanned_freespace = total_freespace = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -487,6 +493,10 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 		bool		has_dead_tuples;
 		TransactionId visibility_cutoff_xid = InvalidTransactionId;
 
+		/* FSM slot for the previous block should be well maintained */
+		if (blkno > 0)
+			total_freespace += GetRecordedFreeSpace(onerel, blkno - 1);
+
 		if (blkno == next_not_all_visible_block)
 		{
 			/* Time to advance next_not_all_visible_block */
@@ -692,6 +702,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 		hastup = false;
 		prev_dead_count = vacrelstats->num_dead_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
+		scanned_possible_max_tups += maxoff;
 
 		/*
 		 * Note: If you change anything in the loop below, also look at
@@ -892,6 +903,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 		}
 
 		freespace = PageGetHeapFreeSpace(page);
+		scanned_freespace += freespace;
+		scanned_used_bytes += PageGetItemBytes(page);
 
 		/* mark page all-visible, if appropriate */
 		if (all_visible && !all_visible_according_to_vm)
@@ -969,15 +982,45 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
+	total_freespace += GetRecordedFreeSpace(onerel, nblocks - 1);
+
 	/* save stats for use later */
 	vacrelstats->scanned_tuples = num_tuples;
 	vacrelstats->tuples_deleted = tups_vacuumed;
 
-	/* now we can compute the new value for pg_class.reltuples */
-	vacrelstats->new_rel_tuples = vac_estimate_reltuples(onerel, false,
-														 nblocks,
-														 vacrelstats->scanned_pages,
-														 num_tuples);
+	{
+		double		prev_ntuples = onerel->rd_rel->reltuples;
+
+		if (tups_vacuumed > num_tuples &&
+			(double) vacrelstats->scanned_pages / nblocks > 0.1)
+		{
+			/*
+			 * When a certain amount of tuples has been vacuumed, there might
+			 * be a non-ignorable imbalance in the number of tuples between
+			 * scanned pages and skipped ones. So we guess the number of
+			 * tuples on the skipped pages assuming the uniformity of tuple
+			 * size among the whole relation, and the correctness of the FSM
+			 * at this point.
+			 */
+			BlockNumber scanned_pages = vacrelstats->scanned_pages;
+			BlockNumber skipped_pages = nblocks - scanned_pages;
+			double		buffer_max_bytes =
+				(scanned_used_bytes + scanned_freespace) / scanned_pages;
+			double		skipped_freespace = total_freespace - scanned_freespace;
+			double		mean_tup_len =
+				buffer_max_bytes / (scanned_possible_max_tups / scanned_pages);
+			double		skipped_tups_per_blk =
+				(buffer_max_bytes - skipped_freespace / skipped_pages) /
+				mean_tup_len;
+
+			prev_ntuples = skipped_tups_per_blk * nblocks;
+		}
+
+		/* now we can compute the new value for pg_class.reltuples */
+		vacrelstats->new_rel_tuples = vac_estimate_reltuples(onerel, false,
+															 nblocks,
+															 vacrelstats->scanned_pages,
+															 num_tuples,
+															 prev_ntuples);
+	}
 
 	/*
 	 * Release any remaining pin on visibility map page.
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index d8dd8b0..aedc48e 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -147,7 +147,8 @@ extern void vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode);
 extern double vac_estimate_reltuples(Relation relation, bool is_analyze,
 									 BlockNumber total_pages,
 									 BlockNumber scanned_pages,
-									 double scanned_tuples);
+									 double scanned_tuples,
+									 double old_rel_tuples);
 extern void vac_update_relstats(Relation relation,
 								BlockNumber num_pages,
 								double num_tuples,
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index abcf8a0..aa067d9 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -233,6 +233,13 @@ typedef PageHeaderData *PageHeader;
 	((ItemId) (&((PageHeader) (page))->pd_linp[(offsetNumber) - 1]))
 
 /*
+ * PageGetItemBytes
+ *		Returns the size in bytes occupied by item data on the page
+ */
+#define PageGetItemBytes(page) \
+	(((PageHeader) (page))->pd_special - ((PageHeader) (page))->pd_upper)
+
+/*
  * PageGetContents
  *		To be used in case the page does not contain item pointers.
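The new PageGetItemBytes macro is simple pointer arithmetic on the page header: tuple data occupies the region between pd_upper and pd_special. A sketch in Python with hypothetical header values:

```python
def page_get_item_bytes(pd_upper, pd_special):
    """Bytes occupied by item (tuple) data on a heap page: the data area
    runs from pd_upper up to pd_special (mirrors the new macro)."""
    return pd_special - pd_upper

def sum_used_bytes(pages):
    """Accumulate item bytes over (pd_upper, pd_special) pairs, the way
    lazy_scan_heap() now sums scanned_used_bytes over scanned pages."""
    return sum(page_get_item_bytes(u, s) for (u, s) in pages)
```

For an 8192-byte page with no special area and tuple data starting at offset 5120, this gives 3072 bytes of item data.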
==== vactest.sql

drop table if exists t;
create table t (a int, b int);
insert into t (select a, (random() * 100000)::int from
  generate_series((select count(*) from t) + 1, 1000000) a);
update t set b = b + 1 where a < (select count(*) from t) * 0.7;
vacuum t;
delete from t where a < (select count(*) from t) * 0.99;
vacuum t;
select c.relpages, s.n_live_tup, c.reltuples,
  (select count(*) from t) as tuples,
  reltuples::float / (select count(*) from t) as ratio
from pg_stat_user_tables s, pg_class c
where s.relname = 't' and c.relname = 't';

==== vactest.sh

#! /bin/sh
dbname="postgres"

function insert_result() {
  psql ${dbname} -f - > /dev/null <<EOF
insert into result
  select $1, $2, $3, c.relpages, s.n_live_tup, c.reltuples,
         (select count(*) from t),
         reltuples::float / (select count(*) from t)
  from pg_stat_user_tables s, pg_class c
  where s.relname = 't' and c.relname = 't';
EOF
}

function vac_with_bufs() {
  psql ${dbname} -c "vacuum verbose t" |& egrep "INFO: *\"t\": found " | sed -e 's/^.* versions in \([0-9]*\) .*$/\1/'
}

function update_result_bufs() {
  local test_no=$1
  local delratio=$2
  local vac_no=$3
  local bufs=$4
  psql ${dbname} -c "update result set bufs=${bufs} where \"#\"=$test_no and \"del%\"=$delratio and \"##\"=$vac_no" > /dev/null
}

function store_result() {
  local test_no=$1
  local delratio=$2
  scanned_bufs=`vac_with_bufs`
  insert_result $test_no $delratio 1
  update_result_bufs $test_no $delratio 1 $scanned_bufs
# scanned_bufs=`vac_with_bufs`
# insert_result $test_no $delratio 2
# update_result_bufs $test_no $delratio 2 $scanned_bufs
}

function test1() {
  local delratio=$1
  echo "test1 ratio = $delratio"
  psql ${dbname} -f - > /dev/null <<EOF
drop table if exists t;
create table t (a int, b int);
insert into t (select a, (random() * 100000)::int from
  generate_series((select count(*) from t) + 1, $nrows) a);
update t set b = b + 1 where a < (select count(*) from t) * 0.7;
vacuum t;
delete from t where a < (select count(*) from t) * $delratio;
EOF
  store_result 1 $delratio
}

function test2() {
  local delratio=$1
  echo "test2 ratio = $delratio"
  psql ${dbname} -f - > /dev/null <<EOF
drop table if exists t;
create table t (a int, b text);
insert into t (select a, 'abcdefg' from
  generate_series((select count(*) from t) + 1, $nrows) a);
update t set b = repeat('abcdefghij', 250) where a < (select count(*) from t) * 0.7;
vacuum t;
delete from t where a < (select count(*) from t) * $delratio;
EOF
  store_result 2 $delratio
}

psql ${dbname} -f - > /dev/null <<EOF
drop table if exists result;
create table result ("#" int, "del%" float, "##" int, pages int,
  n_live_tup int, "tups est" int, "tups real" int,
  "est/real" numeric(10, 3), bufs int default 0);
EOF

nrows=1000000
test1 0.1
test1 0.2
test1 0.3
test1 0.4
test1 0.5
test1 0.6
test1 0.7
test1 0.8
test1 0.9
test1 0.95
test1 0.99

nrows=100000
test2 0.1
test2 0.2
test2 0.3
test2 0.4
test2 0.5
test2 0.6
test2 0.7
test2 0.8
test2 0.9
test2 0.95
test2 0.99

psql ${dbname} -c 'select * from result order by "#", "del%", "##"'

==== test_result.txt

==== BEFORE patch. # = 1 is fixed sized rows, # = 2 is variable sized rows

 # | del% | pages | n_live_tup | tups est | tups real | est/real | bufs
---+------+-------+------------+----------+-----------+----------+------
 1 | 0.1  |  7523 |     900001 |   941326 |    900001 |    1.046 |  444
 1 | 0.2  |  7523 |     800001 |   882465 |    800001 |    1.103 |  886
 1 | 0.3  |  7523 |     700001 |   823697 |    700001 |    1.177 | 1329
 1 | 0.4  |  7523 |     600001 |   764836 |    600001 |    1.275 | 1771
 1 | 0.5  |  7523 |     500001 |   706068 |    500001 |    1.412 | 2214
 1 | 0.6  |  7523 |     400001 |   647206 |    400001 |    1.618 | 2656
 1 | 0.7  |  4425 |     300001 |   588239 |    300001 |    1.961 | 3099
 1 | 0.8  |  4425 |     200001 |   529394 |    200001 |    2.647 | 3542
 1 | 0.9  |  4425 |     100001 |   470626 |    100001 |    4.706 | 3985
 1 | 0.95 |  4425 |      50001 |   441196 |     50001 |    8.824 | 4206
 1 | 0.99 |  4425 |     417670 |   417670 |     10001 |   41.763 | 4383
 2 | 0.1  |  1263 |     100000 |    91902 |     90001 |    1.021 |  104
 2 | 0.2  |  1263 |     100000 |    83737 |     80001 |    1.047 |  207
 2 | 0.3  |  1263 |     100000 |    75573 |     70001 |    1.080 |  310
 2 | 0.4  |  1263 |     100000 |    67409 |     60001 |    1.123 |  413
 2 | 0.5  |  1263 |     100000 |    59245 |     50001 |    1.185 |  516
 2 | 0.6  |  1263 |     100000 |    51099 |     40001 |    1.277 |  620
 2 | 0.7  |   541 |     100000 |    42855 |     30001 |    1.428 |  723
 2 | 0.8  |   541 |      38607 |    38607 |     20001 |    1.930 |  778
 2 | 0.9  |   541 |     100000 |    34321 |     10001 |    3.432 |  832
 2 | 0.95 |   541 |     100000 |    34930 |      5001 |    6.985 |  885
 2 | 0.99 |   541 |      30930 |    30930 |      1001 |   30.899 |  885

==== AFTER the patch

 # | del% | pages | n_live_tup | tups est | tups real | est/real | bufs
---+------+-------+------------+----------+-----------+----------+------
 1 | 0.1  |  7523 |     900001 |   941326 |    900001 |    1.046 |  444
 1 | 0.2  |  7523 |     800001 |   803449 |    800001 |    1.004 |  886
 1 | 0.3  |  7523 |     700001 |   703835 |    700001 |    1.005 | 1329
 1 | 0.4  |  7523 |     600001 |   604220 |    600001 |    1.007 | 1771
 1 | 0.5  |  7523 |     500001 |   504606 |    500001 |    1.009 | 2214
 1 | 0.6  |  7523 |     400001 |   404992 |    400001 |    1.012 | 2656
 1 | 0.7  |  4425 |     300001 |   305340 |    300001 |    1.018 | 3099
 1 | 0.8  |  4425 |     205758 |   205758 |    200001 |    1.029 | 3542
 1 | 0.9  |  4425 |     106169 |   106169 |    100001 |    1.062 | 3985
 1 | 0.95 |  4425 |      56373 |    56373 |     50001 |    1.127 | 4206
 1 | 0.99 |  4425 |      10001 |    16535 |     10001 |    1.653 | 4383
 2 | 0.1  |  1263 |     100000 |    91902 |     90001 |    1.021 |  104
 2 | 0.2  |  1263 |     100000 |    67589 |     80001 |    0.845 |  207
 2 | 0.3  |  1263 |     100000 |    57522 |     70001 |    0.822 |  310
 2 | 0.4  |  1263 |      60001 |    47489 |     60001 |    0.791 |  413
 2 | 0.5  |  1263 |     100000 |    37469 |     50001 |    0.749 |  516
 2 | 0.6  |  1263 |      40001 |    27455 |     40001 |    0.686 |  620
 2 | 0.7  |   541 |     100000 |    17432 |     30001 |    0.581 |  723
 2 | 0.8  |   541 |      20001 |    12894 |     20001 |    0.645 |  778
 2 | 0.9  |   541 |       7570 |     7570 |     10001 |    0.757 |  832
 2 | 0.95 |   541 |     100000 |     6605 |      5001 |    1.321 |  885
 2 | 0.99 |   541 |     100000 |     2598 |      1001 |    2.595 |  885
(44 rows)
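For context, the interpolation done by the existing vac_estimate_reltuples() can be sketched roughly as follows (a simplified Python model of the current behavior, not the exact function — it omits some guards). It shows why the estimate stays close to the stale reltuples when the scanned pages hold almost none of the live tuples:

```python
def vac_estimate_reltuples(old_rel_pages, old_rel_tuples,
                           total_pages, scanned_pages, scanned_tuples):
    """Rough model of the current estimator: linear interpolation between
    the old overall density and the density observed on scanned pages."""
    if scanned_pages >= total_pages:
        # Whole table scanned: the observed count is exact.
        return float(scanned_tuples)
    if old_rel_pages == 0:
        # No previous data: extrapolate the scanned density.
        return scanned_tuples / scanned_pages * total_pages
    old_density = old_rel_tuples / old_rel_pages
    new_density = scanned_tuples / scanned_pages
    multiplier = scanned_pages / total_pages
    density = old_density * (1.0 - multiplier) + new_density * multiplier
    return density * total_pages
```

For example, if the table previously held 10000 tuples in 100 pages and vacuum scans 50 of those 100 pages finding zero live tuples, the estimate only drops to 5000 — the unscanned half is still assumed to have the old density, which is exactly the failure mode under a strong density imbalance.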
Sorry, I made a mistake.

"Kyotaro HORIGUCHI" <horiguchi.kyotaro@lab.ntt.co.jp>:

> Overall algorithm could be illustrated as below,
>
> - summing up used bytes, max offnum (PageGetMaxOffsetNumber),

Not "max" offnum, but the number of line pointers (linp's) used after page vacuum.

> maximum free bytes for tuple data, and free bytes after page
> vacuum through all scanned pages.
>
> - summing up free bytes informed by FSM through all skipped
> pages.
>
> - Calculate mean tuple length from the overall used bytes and
> sum of max offnums, and scanned pages.

Here also the same: not the sum of max offnums but the total of used entries (linp's).

> - Guess tuple density in skipped pages using overall free bytes
> from FSM and the mean tuple length calculated above.
>
> - Finally, feed estimated number of the live tuples BEFORE
> vacuum into vac_estimate_reltuples.

regards,

-- 
Kyotaro Horiguchi
On Friday, June 14, 2013 2:05 PM Kyotaro HORIGUCHI wrote:
> Hello,
>
> Postgresql estimates the number of live tuples after the vacuum has
> left some buffers unscanned. This estimation does well for most cases,
> but makes completely different result with a strong imbalance of tuple
> density.
>
> For example,
>
> create table t (a int, b int);
> insert into t (select a, (random() * 100000)::int from
> generate_series((select count(*) from t) + 1, 1000000) a);
> update t set b = b + 1 where a < (select count(*) from t) * 0.7;
> vacuum t;
> delete from t where a < (select count(*) from t) * 0.99;
>
> After this, pg_stat_user_tables.n_live_tup shows 417670 which is
> 41 times larger than the real number of rows 100001.

The number should be 10001, not 100001.

> And what makes it
> worse, autovacuum nor autoanalyze won't run until n_dead_tup goes above
> 8 times larger than the real number of tuples in the table for the
> default settings..
>
> | postgres=# select n_live_tup, n_dead_tup
> |              from pg_stat_user_tables where relname='t';
> |  n_live_tup | n_dead_tup
> | ------------+------------
> |      417670 |          0
> |
> | postgres=# select reltuples from pg_class where relname='t';
> |  reltuples
> | -----------
> |     417670
> |
> | postgres=# select count(*) from t;
> |  count
> | -------
> |  10001

I have tried to reproduce the problem on different machines, but couldn't
reproduce it. I have run the tests with the default configuration.
Output on Windows:
-------------------
postgres=# create table t (a int, b int);
CREATE TABLE
postgres=# insert into t (select a, (random() * 100000)::int from generate_series((select count(*) from t) + 1, 1000000) a);
INSERT 0 1000000
postgres=# update t set b = b + 1 where a < (select count(*) from t) * 0.7;
UPDATE 699999
postgres=# vacuum t;
VACUUM
postgres=# delete from t where a < (select count(*) from t) * 0.99;
DELETE 989999
postgres=#
postgres=# select n_live_tup, n_dead_tup from pg_stat_user_tables where relname='t';
 n_live_tup | n_dead_tup
------------+------------
      10001 |     989999
(1 row)

Output on Suse
----------------
postgres=# drop table if exists t;
create table t (a int, b int);
insert into t (select a, (random() * 100000)::int from generate_series((select count(*) from t) + 1, 1000000) a);
update t set b = b + 1 where a < (select count(*) from t) * 0.7;
vacuum t;
delete from t where a < (select count(*) from t) * 0.99;
vacuum t;
select c.relpages, s.n_live_tup, c.reltuples,
  (select count(*) from t) as tuples,
  reltuples::float / (select count(*) from t) as ratio
from pg_stat_user_tables s, pg_class c
where s.relname = 't' and c.relname = 't';
DROP TABLE
CREATE TABLE
INSERT 0 1000000
UPDATE 699999
VACUUM
DELETE 989999
VACUUM
 relpages | n_live_tup | reltuples | tuples | ratio
----------+------------+-----------+--------+-------
     4425 |      10001 |     10001 |  10001 |     1
(1 row)

When I tried to run vactest.sh, it gave the following error:

linux:~/akapila/vacuum_nlivetup> ./vactest.sh
./vactest.sh: line 11: syntax error near unexpected token `&'
./vactest.sh: line 11: ` psql ${dbname} -c "vacuum verbose t" |& egrep "INFO: *\"t\": found " | sed -e 's/^.* versions in \([0-9]*\) .*$/\1/''

Can you help me reproduce the problem by letting me know whether I am doing
something wrong, or whether the results of the test are not predictable?

With Regards,
Amit Kapila.
Hello,

> I have tried to reproduce the problem in different m/c's, but couldn't
> reproduce it.
> I have ran tests with default configuration.

I think you did reproduce it.

> Output on Windows:
> -------------------
> postgres=# create table t (a int, b int);
(snip)
> postgres=# select n_live_tup, n_dead_tup from pg_stat_user_tables where relname='t';
>  n_live_tup | n_dead_tup
> ------------+------------
>       10001 |     989999
> (1 row)

Yes, this is the same for me. You should have done this instead:

postgres=# select reltuples from pg_class where relname = 't';
 reltuples
-----------
     1e+06
(1 row)

This is 100 times larger than n_live_tup, and it is this value which is
used to judge the necessity of autovacuum.

autovacuum.c:2695
| reltuples = classForm->reltuples;
| vactuples = tabentry->n_dead_tuples;
....
| vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
| anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;

Although..

> Output on Suse
> ----------------
> postgres=# drop table if exists t;
> create table t (a int, b int);
> insert into t (select a, (random() * 100000)::int from
> generate_series((select count(*) from t) + 1, 1000000) a);
> update t set b = b + 1 where a < (select count(*) from t) * 0.7;
> vacuum t;
> delete from t where a < (select count(*) from t) * 0.99;
> vacuum t;
> select c.relpages, s.n_live_tup, c.reltuples, (select count(*) from t) as
> tuples, reltuples::float / (select count(*) from t) as ratio from
> pg_stat_user_tables s, pg_class c where s.relname = 't' and c.relname = 't';
> DROP TABLE
> CREATE TABLE
> INSERT 0 1000000
> UPDATE 699999
> VACUUM
> DELETE 989999
> VACUUM
>  relpages | n_live_tup | reltuples | tuples | ratio
> ----------+------------+-----------+--------+-------
>      4425 |      10001 |     10001 |  10001 |     1
> (1 row)

... Mmm. I get the following figures for the same operation.
 relpages | n_live_tup | reltuples | tuples |      ratio
----------+------------+-----------+--------+------------------
     4425 |     417670 |    417670 |  10001 | 41.7628237176282

Let me consider this for a while..

> When I tried to run vactest.sh, it gives below error:
> linux:~/akapila/vacuum_nlivetup> ./vactest.sh
> ./vactest.sh: line 11: syntax error near unexpected token `&'
> ./vactest.sh: line 11: ` psql ${dbname} -c "vacuum verbose t" |&
> egrep "INFO: *\"t\": found " | sed -e 's/^.* versions in \([0-9]*\)
> .*$/\1/''
>
> Can you help me in reproducing the problem by letting me know if I am doing
> something wrong or results of test are not predictable?

Could you let me know which version of PostgreSQL you are running? And I
would appreciate it if you could kindly show me the vacuum logs from your
tests.

# I found a silly bug in the patch, but I'll put it off for now.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
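The threshold arithmetic quoted above also explains the "8 times" figure from the first mail. A small sketch in Python, assuming the documented default settings (base threshold 50, scale factor 0.2 for vacuum and 0.1 for analyze):

```python
def autovacuum_thresholds(reltuples,
                          vac_base_thresh=50, vac_scale_factor=0.2,
                          anl_base_thresh=50, anl_scale_factor=0.1):
    """Mirror of the threshold formulas quoted from autovacuum.c,
    with the default GUC values assumed."""
    vacthresh = vac_base_thresh + vac_scale_factor * reltuples
    anlthresh = anl_base_thresh + anl_scale_factor * reltuples
    return vacthresh, anlthresh

vacthresh, anlthresh = autovacuum_thresholds(417670)
# With the stale reltuples of 417670, n_dead_tup must exceed vacthresh
# (83584) before autovacuum fires -- roughly 8.4 times the table's
# actual 10001 live rows.
```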
Sorry for the many mistakes I've made. I noticed that I was accidentally
working on 9.3dev at some point, so I reran the tests on up-to-date
9.4devel, getting the same result from the viewpoint of the inaccuracy of
pg_class.reltuples after vacuuming. Although, unlike my old test set,
vacuum reported that it scanned all of the table's pages; it should not
give such a reltuples value in that case. Please give me a bit of time to
diagnose this.

=====

> When I tried to run vactest.sh, it gives below error:
> linux:~/akapila/vacuum_nlivetup> ./vactest.sh
> ./vactest.sh: line 11: syntax error near unexpected token `&'
> ./vactest.sh: line 11: ` psql ${dbname} -c "vacuum verbose t" |&
> egrep "INFO: *\"t\": found " | sed -e 's/^.* versions in \([0-9]*\)
> .*$/\1/''

The '|&' I carelessly used is not understood by a real /bin/sh. Since I
might have used other notations outside sh syntax, it seems more helpful
to replace the shebang with '/bin/bash' instead of '/bin/sh'.

In addition, the test script I brought up here discards all output of the
SQL commands. The attached script shows the verbose response from vacuum.

- replaced the shebang with /bin/bash
- replaced the '|&' notation with '2>&1 |'
- removed '> /dev/null' from the psql command line

> Can you help me in reproducing the problem by letting me know
> if I am doing something wrong or results of test are not
> predictable?

Thank you for getting involved, and sorry for the insufficient
preparation. But please wait a while before going on. I ran the attached
revised script against the distclean'd current head of the master branch
(9.4devel) on CentOS 6.4 and got the same result as before, shown below.
But I find the result hard to understand. I'd like to have a bit of time
to diagnose this.
regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

===========

$ ./vactest.sh
test1 ratio = 0.4
DROP TABLE
CREATE TABLE
INSERT 0 1000000
UPDATE 699999
psql:<stdin>:5: INFO:  vacuuming "public.t"
psql:<stdin>:5: INFO:  "t": removed 699999 row versions in 4459 pages
psql:<stdin>:5: INFO:  "t": found 699999 removable, 1000000 nonremovable row versions in 10829 out of 10829 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.13u sec elapsed 0.13 sec.
VACUUM
DELETE 399999
test1 ratio = 0.99
DROP TABLE
CREATE TABLE
INSERT 0 1000000
UPDATE 699999
psql:<stdin>:5: INFO:  vacuuming "public.t"
psql:<stdin>:5: INFO:  "t": removed 699999 row versions in 4459 pages
psql:<stdin>:5: INFO:  "t": found 699999 removable, 1000000 nonremovable row versions in 10829 out of 10829 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.15u sec elapsed 0.25 sec.
VACUUM
DELETE 989999
test1 ratio = 1.00
DROP TABLE
CREATE TABLE
INSERT 0 1000000
UPDATE 699999
psql:<stdin>:5: INFO:  vacuuming "public.t"
psql:<stdin>:5: INFO:  "t": removed 699999 row versions in 4459 pages
psql:<stdin>:5: INFO:  "t": found 699999 removable, 1000000 nonremovable row versions in 10829 out of 10829 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.14u sec elapsed 0.19 sec.
VACUUM
DELETE 999999

 # | del% | ## | pages | n_live_tup | tups est | tups real |  est/real  | bufs
---+------+----+-------+------------+----------+-----------+------------+------
 1 | 0.4  |  1 | 10829 |     600001 |   764808 |    600001 |      1.275 | 2549
 1 | 0.99 |  1 |  6370 |      10001 |   417600 |     10001 |     41.756 | 6308
 1 | 1    |  1 |  6370 |          1 |   411673 |         1 | 411673.000 | 6371
(3 rows)

========

#! /bin/bash
dbname="postgres"

function insert_result() {
  psql ${dbname} -f - > /dev/null <<EOF
insert into result
  select $1, $2, $3, c.relpages, s.n_live_tup, c.reltuples,
         (select count(*) from t),
         reltuples::float / (select count(*) from t)
  from pg_stat_user_tables s, pg_class c
  where s.relname = 't' and c.relname = 't';
EOF
}

function vac_with_bufs() {
  psql ${dbname} -c "vacuum verbose t" 2>&1 | egrep "INFO: *\"t\": found " | sed -e 's/^.* versions in \([0-9]*\) .*$/\1/'
}

function update_result_bufs() {
  local test_no=$1
  local delratio=$2
  local vac_no=$3
  local bufs=$4
  psql ${dbname} -c "update result set bufs=${bufs} where \"#\"=$test_no and \"del%\"=$delratio and \"##\"=$vac_no" > /dev/null
}

function store_result() {
  local test_no=$1
  local delratio=$2
  scanned_bufs=`vac_with_bufs`
  insert_result $test_no $delratio 1
  update_result_bufs $test_no $delratio 1 $scanned_bufs
# scanned_bufs=`vac_with_bufs`
# insert_result $test_no $delratio 2
# update_result_bufs $test_no $delratio 2 $scanned_bufs
}

function test1() {
  local delratio=$1
  echo "test1 ratio = $delratio"
  psql ${dbname} -f - <<EOF
drop table if exists t;
create table t (a int, b int, c int, d int default 0, e int default 0, f int default 0);
insert into t (select a, (random() * 100000)::int from
  generate_series((select count(*) from t) + 1, $nrows) a);
update t set b = b + 1 where a < (select count(*) from t) * 0.7;
vacuum verbose t;
delete from t where a < (select count(*) from t) * $delratio;
EOF
  store_result 1 $delratio
}

function test2() {
  local delratio=$1
  echo "test2 ratio = $delratio"
  psql ${dbname} -f - > /dev/null <<EOF
drop table if exists t;
create table t (a int, b text);
insert into t (select a, 'abcdefg' from
  generate_series((select count(*) from t) + 1, $nrows) a);
update t set b = repeat('abcdefghij', 250) where a < (select count(*) from t) * 0.7;
vacuum verbose t;
delete from t where a < (select count(*) from t) * $delratio;
EOF
  store_result 2 $delratio
}

psql ${dbname} -f - > /dev/null <<EOF
drop table if exists result;
create table result ("#" int, "del%" float, "##" int, pages int,
  n_live_tup int, "tups est" int, "tups real" int,
  "est/real" numeric(10, 3), bufs int default 0);
EOF

nrows=1000000
# test1 0.1
# test1 0.2
# test1 0.3
test1 0.4
# test1 0.5
# test1 0.6
# test1 0.7
# test1 0.8
# test1 0.9
# test1 0.95
test1 0.99
test1 1.00

# nrows=100000
# test2 0.1
# test2 0.2
# test2 0.3
# test2 0.4
# test2 0.5
# test2 0.6
# test2 0.7
# test2 0.8
# test2 0.9
# test2 0.95
# test2 0.99
# test2 1.00

psql ${dbname} -c 'select * from result order by "#", "del%", "##"'
I've recovered from the mess.

<snip>
> Please let me have a bit of time to diagnose this.

I was completely confused and walking down the wrong path. I looked into
the vacuum for UPDATEs, not DELETEs, so it was quite reasonable to get
such results. The renewed test script attached shows the verbose output
of vacuum after the deletes. I got the following output from it.

# I believe this runs for you..

| INFO:  "t": found 989999 removable, 110 nonremovable row
| versions in 6308 out of 10829 pages

In such a partially-scanned case, lazy_scan_heap() tries to estimate the
resulting num_tuples in vac_estimate_reltuples() assuming uniformity of
tuple density, which fails for such a strong imbalance made by bulk
updates.

Do you find any differences between what you get and the following, which
I had?

| $ ./vactest.sh
| ### test1 ratio = 0.4
| INFO:  vacuuming "public.t"
| INFO:  "t": removed 399999 row versions in 2549 pages
| INFO:  "t": found 399999 removable, 194 nonremovable row versions in 2549 out of 10829 pages
| DETAIL:  0 dead row versions cannot be removed yet.
| There were 0 unused item pointers.
| 0 pages are entirely empty.
| CPU 0.00s/0.04u sec elapsed 0.04 sec.
| ### test1 ratio = 0.99
| INFO:  vacuuming "public.t"
| INFO:  "t": removed 989999 row versions in 6308 pages
| INFO:  "t": found 989999 removable, 110 nonremovable row versions in 6308 out of 10829 pages
| DETAIL:  0 dead row versions cannot be removed yet.
| There were 93 unused item pointers.
| 0 pages are entirely empty.
| CPU 0.00s/0.11u sec elapsed 0.24 sec.
| INFO:  "t": truncated 10829 to 6370 pages
| DETAIL:  CPU 0.00s/0.00u sec elapsed 0.01 sec.
| ### test1 ratio = 1.00
| INFO:  vacuuming "public.t"
| INFO:  "t": removed 999999 row versions in 6371 pages
| INFO:  "t": found 999999 removable, 1 nonremovable row versions in 6371 out of 10829 pages
| DETAIL:  0 dead row versions cannot be removed yet.
| There were 93 unused item pointers.
| 0 pages are entirely empty.
| CPU 0.00s/0.11u sec elapsed 0.20 sec.
| INFO: "t": truncated 10829 to 6370 pages | DETAIL: CPU 0.01s/0.00u sec elapsed 0.27 sec. | # | del% | ## | pages | n_live_tup | tups est | tups real | est/real | bufs | ---+------+----+-------+------------+----------+-----------+------------+------ | 1 | 0.4 | 1 | 10829 | 600001 | 764808 | 600001 | 1.275 | 2549 | 1 | 0.99 | 1 | 6370 | 10001 | 417600 | 10001 | 41.756 | 6308 | 1 | 1 | 1 | 6370 | 411673 | 411673 | 1 | 411673.000 | 6371 | (3 rows) regards, -- Kyotaro Horiguchi NTT Open Source Software Center #! /bin/bash dbname="postgres" function insert_result() {psql ${dbname} -f - > /dev/null <<EOF insert into result select $1, $2, $3, c.relpages, s.n_live_tup, c.reltuples, (select count(*) from t), reltuples::float /(select count(*) from t) from pg_stat_user_tables s, pg_class c where s.relname = 't' and c.relname = 't'; EOF } function vac_with_bufs() {local f=`mktemp vactest_XXXX`psql ${dbname} -c "vacuum verbose t" 2>>$f 1>/dev/nullcat $f 1>&2cat$f | egrep "INFO: *\"t\": found " | sed -e 's/^.* versions in \([0-9]*\) .*$/\1/'rm $f } function update_result_bufs() {local test_no=$1local delratio=$2local vac_no=$3local bufs=$4 psql ${dbname} -c "update result set bufs=${bufs} where \"#\"=$test_no and \"del%\"=$delratio and \"##\"=$vac_no" >/dev/null } function store_result() {local test_no=$1local delratio=$2scanned_bufs=`vac_with_bufs`insert_result $test_no $delratio 1update_result_bufs$test_no $delratio 1 $scanned_bufs # scanned_bufs=`vac_with_bufs` # insert_result $test_no $delratio 2 # update_result_bufs $test_no $delratio 2 $scanned_bufs } function test1() {local delratio=$1 echo "### test1 ratio = $delratio"psql ${dbname} -f - > /dev/null <<EOF drop table if exists t; create table t (a int, b int, c int, d int default 0, e int default 0, f int default 0); insert into t (select a, (random() * 100000)::int from generate_series((select count(*) from t) + 1, $nrows) a); update t set b = b + 1 where a < (select count(*) from t) * 0.7; vacuum t; delete from t 
 where a < (select count(*) from t) * $delratio;
EOF
    store_result 1 $delratio
}

function test2() {
    local delratio=$1

    echo "### test2 ratio = $delratio"
    psql ${dbname} -f - > /dev/null <<EOF
drop table if exists t;
create table t (a int, b text);
insert into t (select a, 'abcdefg'
                 from generate_series((select count(*) from t) + 1, $nrows) a);
update t set b = repeat('abcdefghij', 250) where a < (select count(*) from t) * 0.7;
vacuum t;
delete from t where a < (select count(*) from t) * $delratio;
EOF
    store_result 2 $delratio
}

psql ${dbname} -f - > /dev/null <<EOF
drop table if exists result;
create table result ("#" int, "del%" float, "##" int, pages int, n_live_tup int,
                     "tups est" int, "tups real" int, "est/real" numeric(10, 3),
                     bufs int default 0);
EOF

nrows=1000000
# test1 0.1
# test1 0.2
# test1 0.3
test1 0.4
# test1 0.5
# test1 0.6
# test1 0.7
# test1 0.8
# test1 0.9
# test1 0.95
test1 0.99
test1 1.00

# nrows=100000
# test2 0.1
# test2 0.2
# test2 0.3
# test2 0.4
# test2 0.5
# test2 0.6
# test2 0.7
# test2 0.8
# test2 0.9
# test2 0.95
# test2 0.99
# test2 1.00

psql ${dbname} -c 'select * from result order by "#", "del%", "##"'
On Tuesday, June 25, 2013 2:06 PM Kyotaro HORIGUCHI wrote:
> Hello,
>
> > I have tried to reproduce the problem on different machines, but
> > couldn't reproduce it.
> > I have run the tests with the default configuration.
>
> I think you had reproduced it.
>
> > Output on Windows:
> > -------------------
> > postgres=# create table t (a int, b int);
> (snip)
> > postgres=# select n_live_tup, n_dead_tup from pg_stat_user_tables
> > where relname = 't';
> >  n_live_tup | n_dead_tup
> > ------------+------------
> >       10001 |     989999
> > (1 row)
>
> Yes, this is the same for me. You should've done this instead,
>
> postgres=# select reltuples from pg_class where relname = 't';
>  reltuples
> -----------
>      1e+06
> (1 row)
>
> This is 100 times larger than n_live_tup, and it is this value which is
> used to judge the necessity of autovacuum.
>
> autovacuum.c:2695
> | reltuples = classForm->reltuples;
> | vactuples = tabentry->n_dead_tuples;
> ....
> | vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
> | anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
>
> Although..
> > Output on Suse
> > ----------------
> > postgres=# drop table if exists t;
> > create table t (a int, b int);
> > insert into t (select a, (random() * 100000)::int from
> > generate_series((select count(*) from t) + 1, 1000000) a);
> > update t set b = b + 1 where a < (select count(*) from t) * 0.7;
> > vacuum t;
> > delete from t where a < (select count(*) from t) * 0.99;
> > vacuum t;
> > select c.relpages, s.n_live_tup, c.reltuples, (select count(*) from t)
> > as tuples, reltuples::float / (select count(*) from t) as ratio from
> > pg_stat_user_tables s, pg_class c where s.relname = 't' and
> > c.relname = 't';
> > DROP TABLE
> > postgres=# CREATE TABLE
> > postgres=# INSERT 0 1000000
> > postgres=# UPDATE 699999
> > postgres=# VACUUM
> > postgres=# DELETE 989999
> > postgres=# VACUUM
> > postgres=#
> >  relpages | n_live_tup | reltuples | tuples | ratio
> > ----------+------------+-----------+--------+-------
> >      4425 |      10001 |     10001 |  10001 |     1
> > (1 row)
>
> ... Mmm.. I have the following figures for the same operation.
>
>  relpages | n_live_tup | reltuples | tuples |      ratio
> ----------+------------+-----------+--------+------------------
>      4425 |     417670 |    417670 |  10001 | 41.7628237176282
>
> I'll consider this for a while..
>
> > When I tried to run vactest.sh, it gives the below error:
> > linux:~/akapila/vacuum_nlivetup> ./vactest.sh
> > ./vactest.sh: line 11: syntax error near unexpected token `&'
> > ./vactest.sh: line 11: `    psql ${dbname} -c "vacuum verbose t" |&
> > egrep "INFO: *\"t\": found " | sed -e 's/^.* versions in \([0-9]*\) .*$/\1/''
> >
> > Can you help me in reproducing the problem by letting me know if I am
> > doing something wrong, or are the results of the test not predictable?
>
> Could you let me know the pg version you're running?

I had used 9.4 HEAD to run the above tests.
Sorry, yesterday I was busy with some other work, so I could not get time
to check and work on this issue further. I shall try to work on it today.
> And it is appreciated if you could kindly show me the vacuum logs while
> testing.
>
> # I found a silly bug in the patch, but I put it off.
>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
On Wednesday, June 26, 2013 7:40 AM Kyotaro HORIGUCHI wrote:
> I've recovered from messing up.
>
> <snip>
> > Please let me have a bit of time to diagnose this.
>
> I was completely messed up and walking down the wrong path. I looked
> into the vacuum for UPDATEs, not DELETEs, so it's quite reasonable to
> have such results.
>
> The renewed test script attached shows the verbose output of vacuum
> after the deletes. I had the following output from it.
>
> # I believe this runs for you..
>
> | INFO:  "t": found 989999 removable, 110 nonremovable row
> | versions in 6308 out of 10829 pages
>
> In such a case of a partial scan, lazy_scan_heap() tries to estimate
> the resulting num_tuples in vac_estimate_reltuples() assuming
> uniformity of tuple density, which fails for such a strong imbalance
> made by bulk updates.
>
> Do you find any differences between what you will have and the
> following I had?

I could see the same output with your latest script, and I could also
reproduce the test if I run it with individual SQL statements.
One of the main points for reproducing the individual test was to keep
autovacuum = off.

Now I can look into it further. I have still not gone through in detail
your new approach to calculating reltuples, but I am wondering whether
there is any way the estimates could be improved with a different
calculation in vac_estimate_reltuples().

One thing I have observed is that the 2nd parameter is_analyze of
vac_estimate_reltuples() is currently not used.

I cannot work on it till early next week, so others are welcome to join
the review.

With Regards,
Amit Kapila.
On Thu, Jun 27, 2013 at 7:27 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
> Now I can look into it further. I have still not gone through in detail
> your new approach to calculating reltuples, but I am wondering whether
> there is any way the estimates could be improved with a different
> calculation in vac_estimate_reltuples().

I think this is getting at the threshold question for this patch, which
is whether it's really making things better or just moving the problems
around. I mean, I have no problem accepting that the new algorithm is
(1) reasonably cheap and (2) better in some cases. But if it's worse in
other cases, which AFAICS hasn't been discussed, then it's no good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thursday, June 27, 2013 4:58 PM Amit Kapila wrote:
> On Wednesday, June 26, 2013 7:40 AM Kyotaro HORIGUCHI wrote:
> > I've recovered from messing up.
> >
> > <snip>
> > > Please let me have a bit of time to diagnose this.
> >
> > I was completely messed up and walking down the wrong path. I looked
> > into the vacuum for UPDATEs, not DELETEs, so it's quite reasonable
> > to have such results.
> >
> > The renewed test script attached shows the verbose output of vacuum
> > after the deletes. I had the following output from it.
> >
> > # I believe this runs for you..
> >
> > | INFO:  "t": found 989999 removable, 110 nonremovable row
> > | versions in 6308 out of 10829 pages
> >
> > In such a case of a partial scan, lazy_scan_heap() tries to estimate
> > the resulting num_tuples in vac_estimate_reltuples() assuming
> > uniformity of tuple density, which fails for such a strong imbalance
> > made by bulk updates.
> >
> > Do you find any differences between what you will have and the
> > following I had?
>
> I could see the same output with your latest script, and I could also
> reproduce the test if I run it with individual SQL statements.
> One of the main points for reproducing the individual test was to keep
> autovacuum = off.

I checked further why I could not reproduce the issue with
autovacuum = on. The reason is that it starts the analyzer, which changes
the value of reltuples in pg_class, and after that the estimated and real
values become the same. Kindly refer to the below code:

relation_needs_vacanalyze()
{
..
    anltuples = tabentry->changes_since_analyze;
..
    anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
..
    *doanalyze = (anltuples > anlthresh);
}

Test Results
--------------
postgres=# drop table if exists t;
DROP TABLE
postgres=# create table t (a int, b int, c int, d int default 0,
                           e int default 0, f int default 0);
CREATE TABLE
postgres=# insert into t (select a, (random() * 100000)::int from
           generate_series((select count(*) from t) + 1, 1000000) a);
INSERT 0 1000000
postgres=# update t set b = b + 1 where a < (select count(*) from t) * 0.7;
UPDATE 699999
postgres=# vacuum t;
VACUUM
postgres=# delete from t where a < (select count(*) from t) * 0.99;
DELETE 989999
postgres=# vacuum t;
VACUUM
postgres=# select c.relpages, s.n_live_tup, c.reltuples,
           (select count(*) from t) as tuples,
           reltuples::float / (select count(*) from t) as ratio
           from pg_stat_user_tables s, pg_class c
           where s.relname = 't' and c.relname = 't';
 relpages | n_live_tup | reltuples | tuples |      ratio
----------+------------+-----------+--------+------------------
     6370 |     417600 |    417600 |  10001 | 41.7558244175582
(1 row)

Here I waited for 1 minute (sufficient time for the analyzer to be
triggered if required). In fact, running "analyze t" would also have
served the purpose.

postgres=# select c.relpages, s.n_live_tup, c.reltuples,
           (select count(*) from t) as tuples,
           reltuples::float / (select count(*) from t) as ratio
           from pg_stat_user_tables s, pg_class c
           where s.relname = 't' and c.relname = 't';
 relpages | n_live_tup | reltuples | tuples | ratio
----------+------------+-----------+--------+-------
     6370 |      10001 |     10001 |  10001 |     1
(1 row)

Now if the subsequent analyze run corrects the estimate, don't you think
that is sufficient for the problem reported?

With Regards,
Amit Kapila.
Hello,

> I could see the same output with your latest script, and I could also
> reproduce the test if I run it with individual SQL statements.
> One of the main points for reproducing the individual test was to keep
> autovacuum = off.

I see. Autovacuum's nap time is 60 seconds with the default settings.
Your operation might help it to sneak into the window between the last
massive delete and the next explicit vacuum in store_result().. Anyway,
setting autovacuum to off should help to make a clean environment for
this issue.

> Now I can look into it further. I have still not gone through in detail
> your new approach to calculating reltuples, but I am wondering whether
> there is any way the estimates could be improved with a different
> calculation in vac_estimate_reltuples().

I'll explain this in other words, although it might be repetitious.

It is tough to decide how to modify things there. Currently I decided to
preserve vac_estimate_reltuples as much as possible as it is. For that
objective, I picked up old_rel_tuples as an intermediate variable to help
'deceive' the function. This is different from deciding to separate this
estimation function from the one for analyze.

As I described before, vac_estimate_reltuples has a presumption that the
tuple density in skipped pages is not so different from that in the whole
table before vacuuming. Since the density is calculated without using any
hint about the skipped pages, and it cannot tell how many tuples out of
pg_class.reltuples are already dead, the value can be far different from
the true one and cannot be verified. Given that we cannot use
pg_stat_user_tables.n_dead_tup, reading all pages could fix it, but the
penalty would be intolerable.

Using the FSM to know the used bytes in skipped pages (which are all
visible by definition) seems to give good estimations of the tuples in
the skipped pages to some extent, assuming the uniformity of tuple
length. Of course, a strong deviation in length can deceive the
algorithm.
Does it make sense to you? I might be able to show a numerical
explanation, but I'm afraid I can't do it for now. I'll be able to take
time soon... (also for reviewing..)

> One thing I have observed is that the 2nd parameter is_analyze of
> vac_estimate_reltuples() is currently not used.

Mmm, it seems to have been unused from the beginning of the function...

> I cannot work on it till early next week, so others are welcome to join

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wednesday, July 03, 2013 1:21 PM Kyotaro HORIGUCHI wrote:
> Hello,
>
> > I could see the same output with your latest script, and I could also
> > reproduce the test if I run it with individual SQL statements.
> > One of the main points for reproducing the individual test was to
> > keep autovacuum = off.
>
> I see. Autovacuum's nap time is 60 seconds with the default settings.
> Your operation might help it to sneak into the window between the last
> massive delete and the next explicit vacuum in store_result()..
> Anyway, setting autovacuum to off should help to make a clean
> environment for this issue.
>
> > Now I can look into it further. I have still not gone through in
> > detail your new approach to calculating reltuples, but I am wondering
> > whether there is any way the estimates could be improved with a
> > different calculation in vac_estimate_reltuples().
>
> I'll explain this in other words, although it might be repetitious.
>
> It is tough to decide how to modify things there. Currently I decided
> to preserve vac_estimate_reltuples as much as possible as it is. For
> that objective, I picked up old_rel_tuples as an intermediate variable
> to help 'deceive' the function. This is different from deciding to
> separate this estimation function from the one for analyze.
>
> As I described before, vac_estimate_reltuples has a presumption that
> the tuple density in skipped pages is not so different from that in the
> whole table before vacuuming. Since the density is calculated without
> using any hint about the skipped pages, and it cannot tell how many
> tuples out of pg_class.reltuples are already dead, the value can be far
> different from the true one and cannot be verified. Given that we
> cannot use pg_stat_user_tables.n_dead_tup, reading all pages could fix
> it, but the penalty would be intolerable.
> Using the FSM to know the used bytes in skipped pages (which are all
> visible by definition) seems to give good estimations of the tuples in
> the skipped pages to some extent, assuming the uniformity of tuple
> length. Of course, a strong deviation in length can deceive the
> algorithm.
>
> Does it make sense to you?

I understood your patch's algorithm, but I still have a doubt in my
mind: if the next analyze can correct the estimates, why would that not
be sufficient? Please refer to my last mail for an analysis of the same:
http://www.postgresql.org/message-id/000601ce77ad$7d3388e0$779a9aa0$@kapila@huawei.com

Performance Data
------------------
I have checked a few cases where the FSM is not updated accurately; this
patch seems to give much worse results than the current code.

Test with Patch
--------------------
1. Test given by you where tuple density is non-uniform

postgres=# drop table if exists t;
DROP TABLE
postgres=# create table t (a int, b int, c int, d int default 0,
                           e int default 0, f int default 0);
CREATE TABLE
postgres=# insert into t (select a, (random() * 100000)::int from
           generate_series((select count(*) from t) + 1, 1000000) a);
INSERT 0 1000000
postgres=# update t set b = b + 1 where a < (select count(*) from t) * 0.7;
UPDATE 699999
postgres=# vacuum t;
VACUUM
postgres=# delete from t where a < (select count(*) from t) * 0.99;
DELETE 989999
postgres=# vacuum t;
VACUUM
postgres=# select c.relpages, s.n_live_tup, c.reltuples,
           (select count(*) from t) as tuples,
           reltuples::float / (select count(*) from t) as ratio
           from pg_stat_user_tables s, pg_class c
           where s.relname = 't' and c.relname = 't';
 relpages | n_live_tup | reltuples | tuples |      ratio
----------+------------+-----------+--------+------------------
     6370 |      13596 |     13596 |  10001 | 1.35946405359464
(1 row)

2. Test where tuple density is non-uniform and FSM updates before the
calculation in vacuum are not accurate.
I have created an index on the table to simulate this test.

postgres=# drop table if exists t;
DROP TABLE
postgres=# create table t (a int, b int, c int, d int default 0,
                           e int default 0, f int default 0);
CREATE TABLE
postgres=# create index on t(a);
CREATE INDEX
postgres=# insert into t (select a, (random() * 100000)::int from
           generate_series((select count(*) from t) + 1, 1000000) a);
INSERT 0 1000000
postgres=# update t set b = b + 1 where a < (select count(*) from t) * 0.7;
UPDATE 699999
postgres=# vacuum t;
VACUUM
postgres=# delete from t where a < (select count(*) from t) * 0.99;
DELETE 989999
postgres=# vacuum t;
VACUUM
postgres=# select c.relpages, s.n_live_tup, c.reltuples,
           (select count(*) from t) as tuples,
           reltuples::float / (select count(*) from t) as ratio
           from pg_stat_user_tables s, pg_class c
           where s.relname = 't' and c.relname = 't';
 relpages | n_live_tup |  reltuples   | tuples |      ratio
----------+------------+--------------+--------+------------------
     6370 |    1001327 | 1.00133e+006 |  10001 | 100.122687731227
(1 row)

Now this results in a tuple estimation worse than the current code.
I think we need more tests to show that the new calculation is better in
all cases than the current calculation.

With Regards,
Amit Kapila.
On 07/03/2013 12:51 AM, Kyotaro HORIGUCHI wrote:
> It is tough to decide how to modify things there. Currently I decided
> to preserve vac_estimate_reltuples as much as possible as it is. For
> that objective, I picked up old_rel_tuples as an intermediate variable
> to help 'deceive' the function. This is different from deciding to
> separate this estimation function from the one for analyze.

Kyotaro, do you think you'll be revising this patch in the next 2 days,
or should I bounce it, and you can resubmit it for the next commitfest?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Hello, I'm very sorry to have taken your time on this mess.

On Jul 2013 16:06:11 +0530, Amit Kapila <amit.kapila@huawei.com> wrote in
<014201ce7bc6$f71eb950$e55c2bf0$@kapila@huawei.com>
> I understood your patch's algorithm, but I still have a doubt in my
> mind: if the next analyze can correct the estimates, why would that not
> be sufficient? Please refer to my last mail for an analysis of the same:
> http://www.postgresql.org/message-id/000601ce77ad$7d3388e0$779a9aa0$@kapila@huawei.com

Hmm. I've reconfirmed what happened on my test set. As a result, the
misestimation with dead_tup = 0 which I thought I had observed has turned
out to be an illusion.. Tuple number estimation is working as expected.

I withdraw this patch.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center