Discussion: [9.2devel] why it doesn't do index scan only?
hi
did:

create table test as
    select i as id, i || ' ' || repeat('depesz', 100) as z
    from generate_series(1,30000000) i;
create index q on test (id);
vacuum verbose analyze test;
vacuum verbose analyze test;

then I checked that index only scans work:

$ explain analyze select id from test order by id desc limit 5;
                                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.50 rows=5 width=4) (actual time=0.036..0.040 rows=5 loops=1)
   ->  Index Only Scan Backward using q on test  (cost=0.00..3029050.54 rows=30000000 width=4) (actual time=0.033..0.034 rows=5 loops=1)
 Total runtime: 0.061 ms
(3 rows)

but when I tried it with a bit more complex query, it failed?

$ explain analyze select id from test where id = any('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::int4[]);
                                                                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=88.47..169.00 rows=20 width=4) (actual time=0.126..0.149 rows=20 loops=1)
   Recheck Cond: (id = ANY ('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::integer[]))
   ->  Bitmap Index Scan on q  (cost=0.00..88.46 rows=20 width=0) (actual time=0.117..0.117 rows=20 loops=1)
         Index Cond: (id = ANY ('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::integer[]))
 Total runtime: 0.177 ms
(5 rows)

it is selecting 20 rows out of 30 million. why is it:
1. not using index only scan
2. not using even normal index scan?

Best regards,

depesz

-- 
The best thing about modern society is how easy it is to avoid contact with it.

http://depesz.com/
hubert depesz lubaczewski <depesz@depesz.com> writes:
> it is selecting 20 rows out of 30 million. why is it:
> 1. not using index only scan
> 2. not using even normal index scan?

It thinks the bitmap scan is cheaper.  Whether that's true or not is not
very clear, but nobody is claiming that the costing of index-only scans
is accurate yet.

			regards, tom lane
I wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
>> it is selecting 20 rows out of 30 million. why is it:
>> 1. not using index only scan
>> 2. not using even normal index scan?

> It thinks the bitmap scan is cheaper.

No, wait, I take that back --- it can't do a plain indexscan because
ScalarArrayOp (= ANY(ARRAY)) isn't supported as a plain indexscan qual,
only as a bitmap qual.  This is because we rely on the bitmap to
eliminate duplicates.  It was never worth improving on that before; but
now that plain indexscans have a potential performance advantage, we
ought to think about ways to use ScalarArrayOp in plain indexscans.

			regards, tom lane
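Until that happens, a possible workaround (an editorial sketch, not something proposed in the thread) is to rewrite the = ANY(array) qual as a join against the value list. Each value then becomes a separate inner index probe, which the planner can satisfy with a plain - and potentially index-only - scan instead of a ScalarArrayOp bitmap scan:

```sql
-- Sketch: join against a VALUES list instead of using = ANY(array).
-- Shown with three of depesz's ids; extend the list as needed.
SELECT t.id
FROM (VALUES (29803895), (11161654), (17844200)) AS v(id)
JOIN test t ON t.id = v.id;
```

Whether the planner actually picks an index-only scan for the inner side still depends on its (admittedly rough) costing, so this is worth checking with EXPLAIN on the real data.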
This behaviour is normal. A bitmap index scan is faster than an index scan.

2011/10/8, hubert depesz lubaczewski <depesz@depesz.com>:
> hi
> did:
> create table test as select i as id, i || ' ' || repeat('depesz', 100) as z
> from generate_series(1,30000000) i;
> create index q on test (id);
> vacuum verbose analyze test;
> vacuum verbose analyze test;
>
> then I checked that index only scans work:
>
> $ explain analyze select id from test order by id desc limit 5;
>                                 QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..0.50 rows=5 width=4) (actual time=0.036..0.040 rows=5
> loops=1)
>    ->  Index Only Scan Backward using q on test  (cost=0.00..3029050.54
> rows=30000000 width=4) (actual time=0.033..0.034 rows=5 loops=1)
>  Total runtime: 0.061 ms
> (3 rows)
>
> but when I tried it with a bit more complex query, it failed?
>
> $ explain analyze select id from test where id =
> any('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::int4[]);
>
>                                 QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on test  (cost=88.47..169.00 rows=20 width=4) (actual
> time=0.126..0.149 rows=20 loops=1)
>    Recheck Cond: (id = ANY
> ('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::integer[]))
>    ->  Bitmap Index Scan on q  (cost=0.00..88.46 rows=20 width=0) (actual
> time=0.117..0.117 rows=20 loops=1)
>          Index Cond: (id = ANY
> ('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::integer[]))
>  Total runtime: 0.177 ms
> (5 rows)
>
> it is selecting 20 rows out of 30 million. why is it:
> 1. not using index only scan
> 2. not using even normal index scan?
>
> Best regards,
>
> depesz
>
> --
> The best thing about modern society is how easy it is to avoid contact with
> it.
>
> http://depesz.com/
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

-- 
------------
pasman
Hello

2011/10/8 Tom Lane <tgl@sss.pgh.pa.us>:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
>> it is selecting 20 rows out of 30 million. why is it:
>> 1. not using index only scan
>> 2. not using even normal index scan?
>
> It thinks the bitmap scan is cheaper.  Whether that's true or not is not
> very clear, but nobody is claiming that the costing of index-only scans
> is accurate yet.
>

I did a few tests and the bitmap scan is faster. Maybe there is some
issue. In a very simple (and very synthetic) test:

create table omega(a int);
insert into omega select (random()*10000)::int from generate_series(1,400000);

select count(*) from omega where a = 100;

the index scan is faster than the index-only scan. There are a lot of
duplicates. When I used a bigger range, the speed of the bitmap index
scan, index-only scan and index scan was similar - but the index scan
was faster everywhere.

Regards

Pavel Stehule

> 			regards, tom lane
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
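For reproducibility, the individual plan types in a test like this can be timed by switching planner options off one at a time (a sketch; it assumes an index on omega(a), which the message above never shows being created):

```sql
-- Assumed setup, not shown in the original message:
-- create index omega_a on omega (a);

EXPLAIN ANALYZE SELECT count(*) FROM omega WHERE a = 100;  -- planner's own choice

SET enable_indexonlyscan = off;   -- fall back to a plain index scan
EXPLAIN ANALYZE SELECT count(*) FROM omega WHERE a = 100;

SET enable_indexscan = off;       -- fall back to a bitmap scan
EXPLAIN ANALYZE SELECT count(*) FROM omega WHERE a = 100;

SET enable_bitmapscan = off;      -- fall back to a sequential scan
EXPLAIN ANALYZE SELECT count(*) FROM omega WHERE a = 100;

RESET enable_indexonlyscan;
RESET enable_indexscan;
RESET enable_bitmapscan;
```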
On 8 October 2011 18:53, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Hello
>
> 2011/10/8 Tom Lane <tgl@sss.pgh.pa.us>:
>> hubert depesz lubaczewski <depesz@depesz.com> writes:
>>> it is selecting 20 rows out of 30 million. why is it:
>>> 1. not using index only scan
>>> 2. not using even normal index scan?
>>
>> It thinks the bitmap scan is cheaper.  Whether that's true or not is not
>> very clear, but nobody is claiming that the costing of index-only scans
>> is accurate yet.
>>
>
> I did a few tests and bitmap scan is faster. Maybe there is a some
> issue. In very simple test (and very syntetic test)
>
> create table omega(a int);
> insert into omega select (random()*10000)::int from generate_series(1,400000);
>
> select count(*) from omega where a = 100;
>
> and index scan is faster than index only scan. There is lot of
> duplicates. When I used a bigger range, a speed of bitmap index, index
> only scan and index scan is similar - but index scan was faster
> everywhere.

Here, index-only scan is massively faster than any other scan:

test=# explain analyse select count(thing) from stuff where thing = 14;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual time=337.506..337.506 rows=1 loops=1)
   ->  Index Only Scan using idx_stuff_thing on stuff  (cost=0.00..99336.88 rows=191000 width=4) (actual time=155.955..315.106 rows=196828 loops=1)
         Index Cond: (thing = 14)
 Total runtime: 337.639 ms
(4 rows)

test=# set enable_indexonlyscan to false;
SET
test=# explain analyse select count(thing) from stuff where thing = 14;
                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual time=164882.528..164882.528 rows=1 loops=1)
   ->  Index Scan using idx_stuff_thing on stuff  (cost=0.00..99336.88 rows=191000 width=4) (actual time=0.184..164494.806 rows=196828 loops=1)
         Index Cond: (thing = 14)
 Total runtime: 164882.666 ms
(4 rows)

test=# set enable_indexonlyscan to false;
SET
test=# set enable_indexscan to false;
SET
test=# explain analyse select count(thing) from stuff where thing = 14;
                                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=170553.91..170553.92 rows=1 width=4) (actual time=154102.221..154102.222 rows=1 loops=1)
   ->  Bitmap Heap Scan on stuff  (cost=2004.91..170076.41 rows=191000 width=4) (actual time=482.974..153730.892 rows=196828 loops=1)
         Recheck Cond: (thing = 14)
         ->  Bitmap Index Scan on idx_stuff_thing  (cost=0.00..1957.16 rows=191000 width=0) (actual time=421.854..421.854 rows=196828 loops=1)
               Index Cond: (thing = 14)
 Total runtime: 154107.415 ms
(6 rows)

test=# set enable_indexonlyscan to false;
SET
test=# set enable_indexscan to false;
SET
test=# set enable_bitmapscan to false;
SET
test=# explain analyse select count(thing) from stuff where thing = 14;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=812977.50..812977.51 rows=1 width=4) (actual time=121296.897..121296.897 rows=1 loops=1)
   ->  Seq Scan on stuff  (cost=0.00..812500.00 rows=191000 width=4) (actual time=67.105..121215.296 rows=196828 loops=1)
         Filter: (thing = 14)
         Rows Removed by Filter: 14803172
 Total runtime: 121296.999 ms
(5 rows)

Note: buffer cache cleared between queries.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2011/10/8 Thom Brown <thom@linux.com>:
> On 8 October 2011 18:53, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> Hello
>>
>> 2011/10/8 Tom Lane <tgl@sss.pgh.pa.us>:
>>> hubert depesz lubaczewski <depesz@depesz.com> writes:
>>>> it is selecting 20 rows out of 30 million. why is it:
>>>> 1. not using index only scan
>>>> 2. not using even normal index scan?
>>>
>>> It thinks the bitmap scan is cheaper.  Whether that's true or not is not
>>> very clear, but nobody is claiming that the costing of index-only scans
>>> is accurate yet.
>>>
>>
>> I did a few tests and bitmap scan is faster. Maybe there is a some
>> issue. In very simple test (and very syntetic test)
>>
>> create table omega(a int);
>> insert into omega select (random()*10000)::int from generate_series(1,400000);
>>
>> select count(*) from omega where a = 100;
>>
>> and index scan is faster than index only scan. There is lot of
>> duplicates. When I used a bigger range, a speed of bitmap index, index
>> only scan and index scan is similar - but index scan was faster
>> everywhere.
>
> Here, index-only scan is massively faster than any other scan:
>
> test=# explain analyse select count(thing) from stuff where thing = 14;
>                                   QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
> time=337.506..337.506 rows=1 loops=1)
>    ->  Index Only Scan using idx_stuff_thing on stuff
> (cost=0.00..99336.88 rows=191000 width=4) (actual
> time=155.955..315.106 rows=196828 loops=1)
>          Index Cond: (thing = 14)
>  Total runtime: 337.639 ms
> (4 rows)
>
> test=# set enable_indexonlyscan to false;
> SET
> test=# explain analyse select count(thing) from stuff where thing = 14;
>                                   QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
> time=164882.528..164882.528 rows=1 loops=1)
>    ->  Index Scan using idx_stuff_thing on stuff  (cost=0.00..99336.88
> rows=191000 width=4) (actual time=0.184..164494.806 rows=196828
> loops=1)
>          Index Cond: (thing = 14)
>  Total runtime: 164882.666 ms
> (4 rows)
>
> test=# set enable_indexonlyscan to false;
> SET
> test=# set enable_indexscan to false;
> SET
> test=# explain analyse select count(thing) from stuff where thing = 14;
>                                   QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=170553.91..170553.92 rows=1 width=4) (actual
> time=154102.221..154102.222 rows=1 loops=1)
>    ->  Bitmap Heap Scan on stuff  (cost=2004.91..170076.41 rows=191000
> width=4) (actual time=482.974..153730.892 rows=196828 loops=1)
>          Recheck Cond: (thing = 14)
>          ->  Bitmap Index Scan on idx_stuff_thing  (cost=0.00..1957.16
> rows=191000 width=0) (actual time=421.854..421.854 rows=196828
> loops=1)
>                Index Cond: (thing = 14)
>  Total runtime: 154107.415 ms
> (6 rows)
>
> test=# set enable_indexonlyscan to false;
> SET
> test=# set enable_indexscan to false;
> SET
> test=# set enable_bitmapscan to false;
> SET
> test=# explain analyse select count(thing) from stuff where thing = 14;
>                                   QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=812977.50..812977.51 rows=1 width=4) (actual
> time=121296.897..121296.897 rows=1 loops=1)
>    ->  Seq Scan on stuff  (cost=0.00..812500.00 rows=191000 width=4)
> (actual time=67.105..121215.296 rows=196828 loops=1)
>          Filter: (thing = 14)
>          Rows Removed by Filter: 14803172
>  Total runtime: 121296.999 ms
> (5 rows)
>
> Note: buffer cache cleared between queries.

I did the same test. It is strange that your times are significantly
slower than mine. Do you have asserts enabled?

Pavel

>
> --
> Thom Brown
> Twitter: @darkixion
> IRC (freenode): dark_ixion
> Registered Linux user: #516935
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
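Whether a given server build has assertions enabled can be checked from any session (a small sketch, not part of the thread):

```sql
-- Reports 'on' for a build configured with --enable-cassert, 'off' otherwise.
SHOW debug_assertions;
```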
On 8 October 2011 19:30, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2011/10/8 Thom Brown <thom@linux.com>:
>> On 8 October 2011 18:53, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> Hello
>>>
>>> 2011/10/8 Tom Lane <tgl@sss.pgh.pa.us>:
>>>> hubert depesz lubaczewski <depesz@depesz.com> writes:
>>>>> it is selecting 20 rows out of 30 million. why is it:
>>>>> 1. not using index only scan
>>>>> 2. not using even normal index scan?
>>>>
>>>> It thinks the bitmap scan is cheaper.  Whether that's true or not is not
>>>> very clear, but nobody is claiming that the costing of index-only scans
>>>> is accurate yet.
>>>>
>>>
>>> I did a few tests and bitmap scan is faster. Maybe there is a some
>>> issue. In very simple test (and very syntetic test)
>>>
>>> create table omega(a int);
>>> insert into omega select (random()*10000)::int from generate_series(1,400000);
>>>
>>> select count(*) from omega where a = 100;
>>>
>>> and index scan is faster than index only scan. There is lot of
>>> duplicates. When I used a bigger range, a speed of bitmap index, index
>>> only scan and index scan is similar - but index scan was faster
>>> everywhere.
>>
>> Here, index-only scan is massively faster than any other scan:
>>
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>                                   QUERY PLAN
>> ----------------------------------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
>> time=337.506..337.506 rows=1 loops=1)
>>    ->  Index Only Scan using idx_stuff_thing on stuff
>> (cost=0.00..99336.88 rows=191000 width=4) (actual
>> time=155.955..315.106 rows=196828 loops=1)
>>          Index Cond: (thing = 14)
>>  Total runtime: 337.639 ms
>> (4 rows)
>>
>> test=# set enable_indexonlyscan to false;
>> SET
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>                                   QUERY PLAN
>> ------------------------------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
>> time=164882.528..164882.528 rows=1 loops=1)
>>    ->  Index Scan using idx_stuff_thing on stuff  (cost=0.00..99336.88
>> rows=191000 width=4) (actual time=0.184..164494.806 rows=196828
>> loops=1)
>>          Index Cond: (thing = 14)
>>  Total runtime: 164882.666 ms
>> (4 rows)
>>
>> test=# set enable_indexonlyscan to false;
>> SET
>> test=# set enable_indexscan to false;
>> SET
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>                                   QUERY PLAN
>> -----------------------------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=170553.91..170553.92 rows=1 width=4) (actual
>> time=154102.221..154102.222 rows=1 loops=1)
>>    ->  Bitmap Heap Scan on stuff  (cost=2004.91..170076.41 rows=191000
>> width=4) (actual time=482.974..153730.892 rows=196828 loops=1)
>>          Recheck Cond: (thing = 14)
>>          ->  Bitmap Index Scan on idx_stuff_thing  (cost=0.00..1957.16
>> rows=191000 width=0) (actual time=421.854..421.854 rows=196828
>> loops=1)
>>                Index Cond: (thing = 14)
>>  Total runtime: 154107.415 ms
>> (6 rows)
>>
>> test=# set enable_indexonlyscan to false;
>> SET
>> test=# set enable_indexscan to false;
>> SET
>> test=# set enable_bitmapscan to false;
>> SET
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>                                   QUERY PLAN
>> --------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=812977.50..812977.51 rows=1 width=4) (actual
>> time=121296.897..121296.897 rows=1 loops=1)
>>    ->  Seq Scan on stuff  (cost=0.00..812500.00 rows=191000 width=4)
>> (actual time=67.105..121215.296 rows=196828 loops=1)
>>          Filter: (thing = 14)
>>          Rows Removed by Filter: 14803172
>>  Total runtime: 121296.999 ms
>> (5 rows)
>>
>> Note: buffer cache cleared between queries.
>
> I did it. It is strange, so your times are significantly slower than I
> have. Have you enabled asserts?

The table contains 15 million rows with column values randomly
selected from the 1-350 range, with 60% within the 1-50 range, and
asserts are enabled.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
>> I did it. It is strange, so your times are significantly slower than I
>> have. Have you enabled asserts?
>
> The table contains 15 million rows with column values randomly
> selected from the 1-350 range, with 60% within the 1-50 range, and
> asserts are enabled.
>

Now I repeated the tests on a slightly wider table with 9 million rows,
but without success.

Try disabling asserts. I am not sure, but maybe it changes the speed
significantly.

Pavel
On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> I did it. It is strange, so your times are significantly slower than I
>>> have. Have you enabled asserts?
>>
>> The table contains 15 million rows with column values randomly
>> selected from the 1-350 range, with 60% within the 1-50 range, and
>> asserts are enabled.
>>
>
> Now I repeated tests on litlle bit wide table with 9 milion rows, but
> without success.
>
> Try to disable asserts. I am not sure, but maybe there significantlly
> change a speed.

Okay, here you go.  Results with debug_assertions = false:

Index-only scan: 173.389 ms (78.442 ms)
Index scan: 184239.399 ms (previously 164882.666 ms)
Bitmap scan: 159354.261 ms (previously 154107.415 ms)
Sequential scan: 134552.263 ms (previously 121296.999 ms)

So no particularly significant difference, except with the index-only
scan (which I repeated 3 times and it's about the same each time).

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 8 October 2011 21:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2011/10/8 Thom Brown <thom@linux.com>:
>> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>> I did it. It is strange, so your times are significantly slower than I
>>>>> have. Have you enabled asserts?
>>>>
>>>> The table contains 15 million rows with column values randomly
>>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>>> asserts are enabled.
>>>>
>>>
>>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>>> without success.
>>>
>>> Try to disable asserts. I am not sure, but maybe there significantlly
>>> change a speed.
>>
>> Okay, here you go.  Results with debug_assertions = false:
>>
>> Index-only scan: 173.389 ms (78.442 ms)
>> Index scan: 184239.399 ms (previously 164882.666 ms)
>> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
>> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>>
>> So no particularly significant difference, except with the index-only
>> scan (which I repeated 3 times and it's about the same each time).
>
> what is size of table?

4884MB

Here's how I set it up:

create table stuff (id serial, a text, b int, c int, d text, e text, f
int, g int, h text, thing int);

insert into stuff (a,b,c,d,e,f,g,h,thing) select 'nfewiufn weoifn
weiou fnwe fnoi weuoifn woeuifniunfeiwodnjnfdoewnfjienfjioe
wneinrienrf',53424524,26575757,'fhdsouhfuisdhfiudshif sdhiuf iu hfius
dhfiu sdiuf dsihfiusdhiufhdsiufhdsiu fhiuds fhiusd fiuds fidsf iuds
fiu dsiuf sdhiu','dfoisdhjfojdsoifjdiosjfoidsjiofjds',52944234,5246463,'fdfidsjoifjdsoifjoidsjfodsjfd',ceil(random()*350)
from generate_series(1,6000000);

insert into stuff (a,b,c,d,e,f,g,h,thing) select 'nfewiufn weoifn
weiou fnwe fnoi weuoifn woeuifniunfeiwodnjnfdoewnfjienfjioe
wneinrienrf',53424524,26575757,'fhdsouhfuisdhfiudshif sdhiuf iu hfius
dhfiu sdiuf dsihfiusdhiufhdsiufhdsiu fhiuds fhiusd fiuds fidsf iuds
fiu dsiuf sdhiu','dfoisdhjfojdsoifjdiosjfoidsjiofjds',52944234,5246463,'fdfidsjoifjdsoifjoidsjfodsjfd',ceil(random()*50)
from generate_series(1,9000000);

create index idx_stuff_thing on stuff (thing);

vacuum analyse stuff;

Testing without all the extra columns in the table yields a similar
result pattern.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
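One more variable worth checking in a setup like this (an editorial sketch, not part of the thread): an index-only scan is only cheap when most heap pages are marked all-visible, and the pg_class.relallvisible counter added in the 9.2 cycle shows how well the vacuum achieved that:

```sql
-- Fraction of heap pages marked all-visible after the vacuum; if this is
-- low, an "index only" scan still has to visit the heap for most tuples.
SELECT relpages, relallvisible,
       round(100.0 * relallvisible / greatest(relpages, 1), 1) AS pct_all_visible
FROM pg_class
WHERE relname = 'stuff';
```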
2011/10/8 Thom Brown <thom@linux.com>:
> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>> I did it. It is strange, so your times are significantly slower than I
>>>> have. Have you enabled asserts?
>>>
>>> The table contains 15 million rows with column values randomly
>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>> asserts are enabled.
>>>
>>
>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>> without success.
>>
>> Try to disable asserts. I am not sure, but maybe there significantlly
>> change a speed.
>
> Okay, here you go.  Results with debug_assertions = false:
>
> Index-only scan: 173.389 ms (78.442 ms)
> Index scan: 184239.399 ms (previously 164882.666 ms)
> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>
> So no particularly significant difference, except with the index-only
> scan (which I repeated 3 times and it's about the same each time).

what is size of table?

It is mysterious - I created a 1.5GB table, restarted postgres every
time, reset the system cache, and still the index-only scan is a
little slower than the index scan.

I use

[root@nemesis pavel]# uname -a
Linux nemesis 2.6.35.14-97.fc14.i686 #1 SMP Sat Sep 17 00:34:02 UTC 2011
i686 i686 i386 GNU/Linux

Notebook Dell D830 - 2GB RAM, PostgreSQL used with the default
configuration - 24MB shared buffers

[pavel@nemesis ~]$ bonnie++
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
nemesis          4G   170  99 38853  10 17292   6  1283  94 41249   7 140.2   6
Latency             56039us    1374ms    1647ms   93214us     365ms     429ms
Version  1.96       ------Sequential Create------ --------Random Create--------
nemesis             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  9777  47 +++++ +++ 20363  39 11331  57 +++++ +++ 23478  45
Latency             91217us    1079us    1147us     200us    1148us     199us
1.96,1.96,nemesis,1,1318123502,4G,,170,99,38853,10,17292,6,1283,94,41249,7,140.2,6,16,,,,,9777,47,+++++,+++,20363,39,11331,57,+++++,+++,23478,45,56039us,1374ms,1647ms,93214us,365ms,429ms,91217us,1079us,1147us,200us,1148us,199us

>
> --
> Thom Brown
> Twitter: @darkixion
> IRC (freenode): dark_ixion
> Registered Linux user: #516935
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
2011/10/8 Thom Brown <thom@linux.com>:
> On 8 October 2011 21:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> 2011/10/8 Thom Brown <thom@linux.com>:
>>> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>>> I did it. It is strange, so your times are significantly slower than I
>>>>>> have. Have you enabled asserts?
>>>>>
>>>>> The table contains 15 million rows with column values randomly
>>>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>>>> asserts are enabled.
>>>>>
>>>>
>>>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>>>> without success.
>>>>
>>>> Try to disable asserts. I am not sure, but maybe there significantlly
>>>> change a speed.
>>>
>>> Okay, here you go.  Results with debug_assertions = false:
>>>
>>> Index-only scan: 173.389 ms (78.442 ms)
>>> Index scan: 184239.399 ms (previously 164882.666 ms)
>>> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
>>> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>>>
>>> So no particularly significant difference, except with the index-only
>>> scan (which I repeated 3 times and it's about the same each time).
>>
>> what is size of table?
>
> 4884MB

That makes sense - the index-only scan is faster (and significantly
faster) on wider tables, or on tables with strings where TOAST is not
active. Maybe there is some issue, because on thin tables it is slower
(and I would expect it to be faster everywhere).

Regards

Pavel Stehule

>
> Here's how I set it up:
>
> create table stuff (id serial, a text, b int, c int, d text, e text, f
> int, g int, h text, thing int);
>
> insert into stuff (a,b,c,d,e,f,g,h,thing) select 'nfewiufn weoifn
> weiou fnwe fnoi weuoifn woeuifniunfeiwodnjnfdoewnfjienfjioe
> wneinrienrf',53424524,26575757,'fhdsouhfuisdhfiudshif sdhiuf iu hfius
> dhfiu sdiuf dsihfiusdhiufhdsiufhdsiu fhiuds fhiusd fiuds fidsf iuds
> fiu dsiuf sdhiu','dfoisdhjfojdsoifjdiosjfoidsjiofjds',52944234,5246463,'fdfidsjoifjdsoifjoidsjfodsjfd',ceil(random()*350)
> from generate_series(1,6000000);
>
> insert into stuff (a,b,c,d,e,f,g,h,thing) select 'nfewiufn weoifn
> weiou fnwe fnoi weuoifn woeuifniunfeiwodnjnfdoewnfjienfjioe
> wneinrienrf',53424524,26575757,'fhdsouhfuisdhfiudshif sdhiuf iu hfius
> dhfiu sdiuf dsihfiusdhiufhdsiufhdsiu fhiuds fhiusd fiuds fidsf iuds
> fiu dsiuf sdhiu','dfoisdhjfojdsoifjdiosjfoidsjiofjds',52944234,5246463,'fdfidsjoifjdsoifjoidsjfodsjfd',ceil(random()*50)
> from generate_series(1,9000000);
>
> create index idx_stuff_thing on stuff (thing);
>
> vacuum analyse stuff;
>
> Testing without all the extra columns in the table yields a similar
> result pattern.
>
> --
> Thom Brown
> Twitter: @darkixion
> IRC (freenode): dark_ixion
> Registered Linux user: #516935
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
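The wide-versus-thin hypothesis can be quantified by comparing heap and index sizes (an editorial sketch; the index name omega_a for Pavel's table is an assumption, since his messages never name one):

```sql
-- On a wide table the heap dwarfs the index, so skipping heap fetches pays
-- off; on a thin table such as omega the heap and index are close in size.
SELECT pg_size_pretty(pg_relation_size('stuff'))           AS wide_heap,
       pg_size_pretty(pg_relation_size('idx_stuff_thing')) AS wide_index,
       pg_size_pretty(pg_relation_size('omega'))           AS thin_heap,
       pg_size_pretty(pg_relation_size('omega_a'))         AS thin_index;  -- assumed index name
```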
On 9 October 2011 04:35, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2011/10/8 Thom Brown <thom@linux.com>:
>> On 8 October 2011 21:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> 2011/10/8 Thom Brown <thom@linux.com>:
>>>> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>>>> I did it. It is strange, so your times are significantly slower than I
>>>>>>> have. Have you enabled asserts?
>>>>>>
>>>>>> The table contains 15 million rows with column values randomly
>>>>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>>>>> asserts are enabled.
>>>>>>
>>>>>
>>>>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>>>>> without success.
>>>>>
>>>>> Try to disable asserts. I am not sure, but maybe there significantlly
>>>>> change a speed.
>>>>
>>>> Okay, here you go.  Results with debug_assertions = false:
>>>>
>>>> Index-only scan: 173.389 ms (78.442 ms)
>>>> Index scan: 184239.399 ms (previously 164882.666 ms)
>>>> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
>>>> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>>>>
>>>> So no particularly significant difference, except with the index-only
>>>> scan (which I repeated 3 times and it's about the same each time).
>>>
>>> what is size of table?
>>
>> 4884MB
>
> It has a sense - index only scan it is faster (and significantly
> faster) on wider tables - or tables with strings where TOAST is not
> active. Maybe there is a some issue because on thin tables is slower
> (and I expect a should be faster everywhere).

No, that's my point, I re-tested it on a table with just 2 int
columns, and the results are roughly the same.  I added all the
columns to make it expensive to fetch the column being queried.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2011/10/9 Thom Brown <thom@linux.com>:
> On 9 October 2011 04:35, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> 2011/10/8 Thom Brown <thom@linux.com>:
>>> On 8 October 2011 21:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>> 2011/10/8 Thom Brown <thom@linux.com>:
>>>>> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>>>>> I did it. It is strange, so your times are significantly slower than I
>>>>>>>> have. Have you enabled asserts?
>>>>>>>
>>>>>>> The table contains 15 million rows with column values randomly
>>>>>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>>>>>> asserts are enabled.
>>>>>>>
>>>>>>
>>>>>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>>>>>> without success.
>>>>>>
>>>>>> Try to disable asserts. I am not sure, but maybe there significantlly
>>>>>> change a speed.
>>>>>
>>>>> Okay, here you go.  Results with debug_assertions = false:
>>>>>
>>>>> Index-only scan: 173.389 ms (78.442 ms)
>>>>> Index scan: 184239.399 ms (previously 164882.666 ms)
>>>>> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
>>>>> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>>>>>
>>>>> So no particularly significant difference, except with the index-only
>>>>> scan (which I repeated 3 times and it's about the same each time).
>>>>
>>>> what is size of table?
>>>
>>> 4884MB
>>
>> It has a sense - index only scan it is faster (and significantly
>> faster) on wider tables - or tables with strings where TOAST is not
>> active. Maybe there is a some issue because on thin tables is slower
>> (and I expect a should be faster everywhere).
>
> No, that's my point, I re-tested it on a table with just 2 int
> columns, and the results are roughly the same.  I added all the
> columns to make it expensive to fetch the column being queried.

then I don't understand

Regards

Pavel

>
> --
> Thom Brown
> Twitter: @darkixion
> IRC (freenode): dark_ixion
> Registered Linux user: #516935
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
On 9 October 2011 11:51, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Then I don't understand.

Well, here are some more reliable results, since each test has been
run 3 times.  The row count in this sample is a 10th of the previous
ones (i.e. 1.5 million rows):

-- Narrow table (table with 2 int columns) --

Assertions = on
Index-only scan = 6.088, 5.885, 6.361
Index scan = 65.661, 63.441, 64.105
Bitmap scan = 68.448, 66.476, 64.626
Sequential scan = 244.129, 239.584, 242.680

Assertions = off
Index-only scan = 6.710, 6.709, 6.192
Index scan = 66.838, 67.534, 64.348
Bitmap scan = 75.662, 64.500, 69.080
Sequential scan = 232.065, 231.366, 231.547

-- Wide table (table as described in earlier post) --

Assertions = on
Index-only scan = 7.313, 7.299, 6.401
Index scan = 193.555, 186.564, 198.150
Bitmap scan = 199.082, 204.664, 207.902
Sequential scan = 643.765, 645.426, 621.150

Assertions = off
Index-only scan = 7.569, 6.477, 6.113
Index scan = 197.332, 197.286, 204.257
Bitmap scan = 207.838, 202.235, 203.322
Sequential scan = 653.739, 633.309, 619.081

Bitmap scans tend not to be faster than index scans in this scenario.
These numbers are produced by HEAD as of this morning.
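For anyone reproducing the four timings above, the plan shapes can be forced with the planner's enable_* settings. A sketch, with hypothetical table and column names (the exact test table isn't shown in the thread); note that in this 9.2devel build a separate enable_indexonlyscan setting may not exist yet, in which case index-only scans are governed by enable_indexscan:

```sql
-- Hypothetical narrow table: 2 int columns, btree index on "a".

-- Force an index (or index-only) scan:
SET enable_seqscan = off;
SET enable_bitmapscan = off;
EXPLAIN ANALYZE SELECT a FROM narrow_test WHERE a BETWEEN 1 AND 50;

-- Force a bitmap scan instead:
SET enable_bitmapscan = on;
SET enable_indexscan = off;
EXPLAIN ANALYZE SELECT a FROM narrow_test WHERE a BETWEEN 1 AND 50;

-- Restore default planner settings before timing the sequential scan:
RESET ALL;
```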
Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2011/10/9 Thom Brown <thom@linux.com>:
>> No, that's my point, I re-tested it on a table with just 2 int
>> columns, and the results are roughly the same.

> then I don't understand

Are you sure you've remembered to vacuum the test table?  I get results
like yours (ie, no speed benefit for index-only scan) if the table
doesn't have its visibility-map bits set.

			regards, tom lane
2011/10/9 Tom Lane <tgl@sss.pgh.pa.us>:
> Are you sure you've remembered to vacuum the test table?  I get results
> like yours (ie, no speed benefit for index-only scan) if the table
> doesn't have its visibility-map bits set.

That should be it - I didn't do a VACUUM.

Regards

Pavel
2011/10/9 Pavel Stehule <pavel.stehule@gmail.com>:
>> Are you sure you've remembered to vacuum the test table?  I get results
>> like yours (ie, no speed benefit for index-only scan) if the table
>> doesn't have its visibility-map bits set.
>
> it should be - I didn't do VACUUM

Yes, after a VACUUM I got significantly better times - an index-only
scan is about 5-6x faster.

Regards

Pavel Stehule
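Tom's visibility-map explanation can be checked from SQL: after a VACUUM, pg_class.relallvisible (new in 9.2) should approach relpages, which is the condition under which an index-only scan can skip heap fetches. A sketch, assuming the "test" table from earlier in the thread:

```sql
VACUUM ANALYZE test;

-- relallvisible counts the pages marked all-visible in the
-- visibility map; when it is close to relpages, an index-only
-- scan rarely needs to visit the heap.
SELECT relpages, relallvisible
FROM pg_class
WHERE relname = 'test';
```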
On 9 October 2011 18:38, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Yes, after a VACUUM I got significantly better times - an index-only
> scan is about 5-6x faster.

Something I was expecting the planner to do with this patch, which it
doesn't, is to pull in the index for queries like:

SELECT count(*) FROM my_table;

or

SELECT sum(indexed_column) FROM my_table;

I don't see why a non-partial index can't fulfill these queries.  I can
only get index-only scans with WHERE conditions.
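The behaviour described above can be seen by comparing plans (table and column names are the hypothetical ones from the examples; the Index Only Scan for the aggregates is the expectation, not the current output):

```sql
-- With only an aggregate and no WHERE clause, the planner
-- currently does not choose an Index Only Scan:
EXPLAIN SELECT count(*) FROM my_table;
EXPLAIN SELECT sum(indexed_column) FROM my_table;

-- A predicate on the indexed column does produce an
-- Index Only Scan (given a vacuumed table):
EXPLAIN SELECT indexed_column FROM my_table
 WHERE indexed_column > 0;
```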