Discussion: 10.1: hash index size exploding on vacuum full analyze
I've some tables that'll never grow, so I decided to replace a big index with one with a fillfactor of 100. That went well. The index shrunk to 280GB. I then did a vacuum full analyze on the table to get rid of any cruft (as the table will be static for a long time and then only deletes will happen) and the index exploded to 701GB. When it was created with fillfactor 90 (organically, by filling the table) the index was 309GB. I would've expected the index size to, at worst, remain constant rather than explode. Am I wrong, or is this a bug? PostgreSQL is v10.1. The original index was created on v10.0.
On Thu, Nov 16, 2017 at 4:59 AM, AP <pgsql@inml.weebeastie.net> wrote:
> I've some tables that'll never grow so I decided to replace a big index
> with one with a fillfactor of 100. That went well. The index shrunk to
> 280GB. I then did a vacuum full analyze on the table to get rid of any
> cruft (as the table will be static for a long time and then only deletes
> will happen) and the index exploded to 701GB. When it was created with
> fillfactor 90 (organically by filling the table) the index was 309GB.

Sounds quite strange. I suspect that the rebuild during vacuum full leads to more bucket splits than occurred when the original data was loaded. By any chance do you have a copy of both indexes (before vacuum full and after vacuum full)? Can you check and share the output of pgstattuple's pgstathashindex() and pageinspect's hash_metapage_info()? I want to confirm whether the bloat is due to additional splits.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Thu, Nov 16, 2017 at 09:48:13AM +0530, Amit Kapila wrote:
> Sounds quite strange. I think during vacuum it leads to more number
> of splits than when the original data was loaded. By any chance do
> you have a copy of both the indexes (before vacuum full and after
> vacuum full)? Can you once check and share the output of
> pgstattuple-->pgstathashindex() and pageinspect->hash_metapage_info()?
> I wanted to confirm if the bloat is due to additional splits.

I'll see what I can do. Currently I'm vacuuming the table without the index so that I can then do a create index concurrently and get back my 280GB index (it's how I got it in the first place). Namely:

create index concurrently on ... using hash (datum) with ( fillfactor = 100 );

I've got more similar tables, though.

AP
On Thu, Nov 16, 2017 at 09:48:13AM +0530, Amit Kapila wrote:
> Can you once check and share the output of
> pgstattuple-->pgstathashindex() and pageinspect->hash_metapage_info()?
> I wanted to confirm if the bloat is due to additional splits.

For the latter, is this correct?

select * from hash_metapage_info(get_raw_page('exploding_index', 0))\gx

AP
On Thu, Nov 16, 2017 at 10:32 AM, AP <pgsql@inml.weebeastie.net> wrote:
> For the latter is this correct?
>
> select * from hash_metapage_info(get_raw_page('exploding_index', 0))\gx

I think it should work, but please refer to the documentation [1] for exact usage.

[1] - https://www.postgresql.org/docs/devel/static/pageinspect.html#idm191242

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Thu, Nov 16, 2017 at 10:36:39AM +0530, Amit Kapila wrote:
> I think it should work, but please refer documentation [1] for exact usage.
>
> [1] - https://www.postgresql.org/docs/devel/static/pageinspect.html#idm191242

Cool. That's where I got the usage from. The "0" argument of get_raw_page seemed somewhat arbitrary, so I wasn't sure if that was correct.

If all's well, though, then I'll have some values Tuesday/Wednesday. The VACUUM FULL alone takes ~1.5 days at least, pgstathashindex() is not the fastest duck in the west, and I can't start VACUUMing until it's done working out the "before" stats.

AP
On Thu, Nov 16, 2017 at 10:00 AM, AP <pgsql@inml.weebeastie.net> wrote:
> I'll see what I can do. Currently vacuuming the table without the index
> so that I can then do a create index concurrently and get back my 280GB
> index (it's how I got it in the first place).

One possible theory could be that the calculation of the initial buckets required for the index has overestimated the number of buckets. I think this is possible because we choose the initial number of buckets based on the number of tuples, but while actually inserting the values we might have created overflow pages rather than filling the newly created primary buckets. The chances of such a misestimation are higher when there are duplicate values. Now, if that is true, then one should actually see the same index size as you saw after the vacuum full when creating an index on a table with the same values in the index columns.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
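[Editorial note: the theory above, duplicate-heavy keys filling overflow chains while pre-allocated primary buckets sit empty, can be sketched with a toy model. This is an illustration only, with made-up BUCKETS/CAPACITY numbers, not PostgreSQL's actual bucket or overflow-page logic:]

```python
from collections import Counter

BUCKETS = 1024   # primary buckets pre-allocated from the estimated tuple count
CAPACITY = 4     # toy number of tuples that fit on one bucket page

def index_shape(keys):
    """Return (primary buckets actually used, overflow pages needed)."""
    fill = Counter(hash(k) % BUCKETS for k in keys)
    # tuples that don't fit on a bucket's primary page spill into overflow pages
    overflow = sum(-(-(n - CAPACITY) // CAPACITY) for n in fill.values() if n > CAPACITY)
    return len(fill), overflow

unique = range(4096)   # distinct keys spread evenly over the buckets
dupes = [7] * 4096     # one heavily duplicated key

print(index_shape(unique))  # (1024, 0): every primary bucket used, no overflow
print(index_shape(dupes))   # (1, 1023): one bucket used, a long overflow chain
```

With duplicates, the up-front bucket allocation (sized from the tuple count) goes to waste: the tuples land in overflow pages instead of the primary buckets that were allocated for them.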
On Thu, Nov 16, 2017 at 9:48 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Sounds quite strange. I think during vacuum it leads to more number
> of splits than when the original data was loaded. By any chance do
> you have a copy of both the indexes (before vacuum full and after
> vacuum full)? Can you once check and share the output of
> pgstattuple-->pgstathashindex() and pageinspect->hash_metapage_info()?
> I wanted to confirm if the bloat is due to additional splits.

This looks surprising to me too. AP, is there anything else happening in parallel with the VACUUM that could lead to an increase in the index size?

Anyway, before I put forward my thoughts, I would like to summarise what you have done here:

1) Created a hash index on your base table with ff=90.
2) You then realised that your base table is static and therefore thought of changing the index fillfactor to 100. For that, you altered the index to set FF=100.
3) REINDEXed your hash index.
4) Checked the index size. It was reduced from 309GB to 280GB.
5) Ran VACUUM FULL ANALYZE and checked the index size again. You now saw an index size of 701GB, which was not expected.

I think the index size should have remained the same, i.e. 280GB in your case. As Amit suggested, the first thing you can do is share the index statistics before and after the VACUUM. Also, as I mentioned above, it would be worth checking whether something could have been running in parallel with the VACUUM.

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Fri, Nov 17, 2017 at 7:58 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> One possible theory could be that the calculation for initial buckets
> required for the index has overestimated the number of buckets. I
> think this is possible because we choose the initial number of buckets
> based on the number of tuples, but actually while inserting the values
> we might have created more of overflow buckets rather than using the
> newly created primary buckets. The chances of such a misestimation
> are more when there are duplicate values. Now, if that is true, then
> actually one should see the same size of the index (as you have seen
> after vacuum full ..) when you create an index on the table with the
> same values in index columns.

Amit, I think what you are saying is that the number of hash buckets required is estimated from the number of tuples in the base table, but that estimate has no idea whether or not the table contains many duplicate values. If it does, then during index insertion we would start adding overflow pages, and many of the hash buckets allocated at the start (during the initial hash index size estimation) would remain unused. If that is true, then I think a hash index would not be the right choice. However, this might not be exactly related to what AP has reported here.

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Fri, Nov 17, 2017 at 11:58 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> Amit, I think what you are trying to put here is that the estimation
> on number of hash buckets required is calculated based on the number
> of tuples in the base table but during this calculation we are not
> aware of the fact that the table contains more of the duplicate values
> or not. If it contains more of a duplicate values then during index
> insertion it would start adding overflow page and many of the hash
> index buckets added at start i.e. during hash index size estimation
> would remain unused. If this is true then i think hash index would not
> be the right choice.

Hmm, I am not sure that is the conclusion we can draw from this behaviour, as we can change it if required. However, before drawing any conclusions based on this theory, we should first try to find the actual problem.

> However, this might not be exactly related to
> what AP has reported here.

Yeah, quite possible.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Fri, Nov 17, 2017 at 11:50:57AM +0530, Ashutosh Sharma wrote:
> AP, is there anything else happening in parallel with VACUUM that
> could lead to an increase in the index table size?

Nope. The system was quiet. It was, in fact, the only thing happening.

> 1) Created hash index table on your base table with ff=90.

Yup.

> 2) You then realised that your base table is static and therefore
> thought of changing the index table fillfactor to 100. For that you
> altered the index table to set FF=100.

Almost. :)

> 3) REINDEX your hash index table.

Nope. REINDEX does not do CONCURRENTLY, so I created a minty fresh index. The index was created like so:

create index concurrently on schema.table using hash (datum_id) with ( fillfactor = 100 );

> 4) Checked for the index table size. It got reduced from 309GB to 280GB.

Yup.

> 5) Ran VACUUM FULL ANALYZE and checked for the index table size. Now
> you saw the index table size as 701GB which was not expected.

Yes.

> I think, in hash index the table size should remain the same i.e 280GB
> in your case.

This was my thought also.

> I think, as Amit suggested, the first thing you can do is, share the
> index table statistics before and after VACUUM. Also, as I mentioned
> above, it would be worth checking if there is something that could be
> running in parallel with VACUUM.

Hopefully I have that now.

AP
On Thu, Nov 16, 2017 at 09:48:13AM +0530, Amit Kapila wrote:
> On Thu, Nov 16, 2017 at 4:59 AM, AP <pgsql@inml.weebeastie.net> wrote:
> > I've some tables that'll never grow so I decided to replace a big index
> > with one with a fillfactor of 100. That went well. The index shrunk to
> > 280GB. I then did a vacuum full analyze on the table to get rid of any
> > cruft (as the table will be static for a long time and then only deletes
> > will happen) and the index exploded to 701GB. When it was created with
> > fillfactor 90 (organically by filling the table) the index was 309GB.

FYI: Nuking the above and doing a create index run gave me a 280GB index again.

> Sounds quite strange. I think during vacuum it leads to more number
> of splits than when the original data was loaded. By any chance do
> you have a copy of both the indexes (before vacuum full and after
> vacuum full)? Can you once check and share the output of
> pgstattuple-->pgstathashindex() and pageinspect->hash_metapage_info()?
> I wanted to confirm if the bloat is due to additional splits.
Before VACUUM FULL: Schema | Name | Type | Owner | Table | Size | Description ---------------+----------------------------------------------------+-------+-----------+---------------------------------------+--------+-------------bundle_link | be5be0ac8_3ba5_407d_8183_2d05b9387e81_datum_id_idx | index | mdkingpin | be5be0ac8_3ba5_407d_8183_2d05b9387e81 | 273 GB| mdstash=# select * from hash_metapage_info(get_raw_page('bundle_link.be5be0ac8_3ba5_407d_8183_2d05b9387e81_datum_id_idx',0))\gx -[ RECORD 1 ]---...--- magic | 105121344 version | 4 ntuples | 9123458028 ffactor | 409 bsize | 8152 bmsize | 4096 bmshift | 15 maxbucket | 25165823 highmask | 33554431 lowmask | 16777215 ovflpoint | 71 firstfree | 10623106 nmaps | 325 procid | 456 spares | {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10623106,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0} mapp | {25165825,25198593,25231361,25264129,25296897,25329665,25362433,25395201,25427969,25460737,25493505,25526273,25559041,25591809,25624577,25657345,25690113,25722881,25755649,25788417,25821185,25853953,25886721,25919489,25952257,25985025,26017793,26050561,26083329,26116097,26148865,26181633,26214401,26247169,26279937,26312705,26345473,26378241,26411009,26443777,26476545,26509313,26542081,26574849,26607617,26640385,26673153,26705921,26738689,26771457,26804225,26836993,26869761,26902529,26935297,26968065,27000833,27033601,27066369,27099137,27131905,27164673,27197441,27230209,27262977,27295745,27328513,27361281,27394049,27426817,27459585,27492353,27525121,27557889,27590657,27623425,27656193,27688961,27721729,27754497,27787265,27820033,27852801,27885569,27918337,27951105,27983873,28016641,28049409,28082177,28114945,28147713,28180481,28213249,28246017,28278785,28311553,28344321,28377089,28409857,28442625,28475393,28508161,28540929,28573697,28606465,28639233,28672001,28704769,28737537,28770305,28803073,28835841,28868609,28901377,28
934145,28966913,28999681,29032449,29065217,29097985,29130753,29163521,29196289,29229057,29261825,29294593,29327361,29360129,29392897,29425665,29458433,29491201,29523969,29556737,29589505,29622273,29655041,29687809,29720577,29753345,29786113,29818881,29851649,29884417,29917185,29949953,29982721,30015489,30048257,30081025,30113793,30146561,30179329,30212097,30244865,30277633,30310401,30343169,30375937,30408705,30441473,30474241,30507009,30539777,30572545,30605313,30638081,30670849,30703617,30736385,30769153,30801921,30834689,30867457,30900225,30932993,30965761,30998529,31031297,31064065,31096833,31129601,31162369,31195137,31227905,31260673,31293441,31326209,31358977,31391745,31424513,31457281,31490049,31522817,31555585,31588353,31621121,31653889,31686657,31719425,31752193,31784961,31817729,31850497,31883265,31916033,31948801,31981569,32014337,32047105,32079873,32112641,32145409,32178177,32210945,32243713,32276481,32309249,32342017,32374785,32407553,32440321,32473089,32505857,32538625,32571393,32604161,32636929,32669697,32702465,32735233,32768001,32800769,32833537,32866305,32899073,32931841,32964609,32997377,33030145,33062913,33095681,33128449,33161217,33193985,33226753,33259521,33292289,33325057,33357825,33390593,33423361,33456129,33488897,33521665,33554433,33587201,33619969,33652737,33685505,33718273,33751041,33783809,33816577,33849345,33882113,33914881,33947649,33980417,34013185,34045953,34078721,34111489,34144257,34177025,34209793,34242561,34275329,34308097,34340865,34373633,34406401,34439169,34471937,34504705,34537473,34570241,34603009,34635777,34668545,34701313,34734081,34766849,34799617,34832385,34865153,34897921,34930689,34963457,34996225,35028993,35061761,35094529,35127297,35160065,35192833,35225601,35258369,35291137,35323905,35356673,35389441,35422209,35454977,35487745,35520513,35553281,35586049,35618817,35651585,35684353,35717121,35749889,35782657,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0} Time: 0.613 ms mdstash=# select * from pgstathashindex('bundle_link.be5be0ac8_3ba5_407d_8183_2d05b9387e81_datum_id_idx');version | bucket_pages| overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent ---------+--------------+----------------+--------------+--------------+------------+------------+------------------ 4 | 25165824 | 10622781 | 325 | 0 | 9123458028 | 0 | 37.4567373970968 (1 row) Time: 2002419.406 ms (33:22.419) After VACUUM FULL: Schema | Name | Type | Owner | Table | Size | Description ---------------+----------------------------------------------------+-------+-----------+---------------------------------------+--------+-------------bundle_link | 
be5be0ac8_3ba5_407d_8183_2d05b9387e81_datum_id_idx | index | mdkingpin | be5be0ac8_3ba5_407d_8183_2d05b9387e81 | 701 GB| -[ RECORD 1 ]---...--- magic | 105121344 version | 4 ntuples | 9123458028 ffactor | 409 bsize | 8152 bmsize | 4096 bmshift | 15 maxbucket | 83886079 highmask | 134217727 lowmask | 67108863 ovflpoint | 78 firstfree | 7996259 nmaps | 245 procid | 456 spares | {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7996259,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0} mapp | {83886081,83918849,83951617,83984385,84017153,84049921,84082689,84115457,84148225,84180993,84213761,84246529,84279297,84312065,84344833,84377601,84410369,84443137,84475905,84508673,84541441,84574209,84606977,84639745,84672513,84705281,84738049,84770817,84803585,84836353,84869121,84901889,84934657,84967425,85000193,85032961,85065729,85098497,85131265,85164033,85196801,85229569,85262337,85295105,85327873,85360641,85393409,85426177,85458945,85491713,85524481,85557249,85590017,85622785,85655553,85688321,85721089,85753857,85786625,85819393,85852161,85884929,85917697,85950465,85983233,86016001,86048769,86081537,86114305,86147073,86179841,86212609,86245377,86278145,86310913,86343681,86376449,86409217,86441985,86474753,86507521,86540289,86573057,86605825,86638593,86671361,86704129,86736897,86769665,86802433,86835201,86867969,86900737,86933505,86966273,86999041,87031809,87064577,87097345,87130113,87162881,87195649,87228417,87261185,87293953,87326721,87359489,87392257,87425025,87457793,87490561,87523329,87556097,87588865,87621633,87654401,87687169,87719937,87752705,87785473,87818241,87851009,87883777,87916545,87949313,87982081,88014849,88047617,88080385,88113153,88145921,88178689,88211457,88244225,88276993,88309761,88342529,88375297,88408065,88440833,88473601,88506369,88539137,88571905,88604673,88637441,88670209,88702977,88735745,88768513,88801281,88834049,88866817,88899585,88932353,88965121,88997
889,89030657,89063425,89096193,89128961,89161729,89194497,89227265,89260033,89292801,89325569,89358337,89391105,89423873,89456641,89489409,89522177,89554945,89587713,89620481,89653249,89686017,89718785,89751553,89784321,89817089,89849857,89882625,89915393,89948161,89980929,90013697,90046465,90079233,90112001,90144769,90177537,90210305,90243073,90275841,90308609,90341377,90374145,90406913,90439681,90472449,90505217,90537985,90570753,90603521,90636289,90669057,90701825,90734593,90767361,90800129,90832897,90865665,90898433,90931201,90963969,90996737,91029505,91062273,91095041,91127809,91160577,91193345,91226113,91258881,91291649,91324417,91357185,91389953,91422721,91455489,91488257,91521025,91553793,91586561,91619329,91652097,91684865,91717633,91750401,91783169,91815937,91848705,91881473,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0} Time: 69.237 ms mdstash=# select * from pgstathashindex('bundle_link.be5be0ac8_3ba5_407d_8183_2d05b9387e81_datum_id_idx');version | bucket_pages| overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent ---------+--------------+----------------+--------------+--------------+------------+------------+------------------ 4 | 83886080 | 7996014 | 245 | 0 | 9123458028 | 0 | 75.6390388675015 (1 row) Time: 2474290.172 ms (41:14.290) Tell me if you need me to keep the index around. AP
On Mon, Nov 20, 2017 at 5:01 AM, AP <pgsql@inml.weebeastie.net> wrote:
> FYI: Nuking the above and doing a create index run gave me a 280GB index again.
>
> Before VACUUM FULL:
>
>  Schema      | Name                                               | Type  | Owner     | Table                                 | Size   | Description
> -------------+----------------------------------------------------+-------+-----------+---------------------------------------+--------+-------------
>  bundle_link | be5be0ac8_3ba5_407d_8183_2d05b9387e81_datum_id_idx | index | mdkingpin | be5be0ac8_3ba5_407d_8183_2d05b9387e81 | 273 GB |
>
> mdstash=# select * from hash_metapage_info(get_raw_page('bundle_link.be5be0ac8_3ba5_407d_8183_2d05b9387e81_datum_id_idx',0))\gx
> -[ RECORD 1 ]---...---
> magic     | 105121344
> version   | 4
> ntuples   | 9123458028
> ffactor   | 409
> bsize     | 8152
> bmsize    | 4096
> bmshift   | 15
> maxbucket | 25165823
> highmask  | 33554431
> lowmask   | 16777215
> ovflpoint | 71
> firstfree | 10623106
> nmaps     | 325
> procid    | 456
> ...
>
> Time: 0.613 ms
>
> mdstash=# select * from pgstathashindex('bundle_link.be5be0ac8_3ba5_407d_8183_2d05b9387e81_datum_id_idx');
>  version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent
> ---------+--------------+----------------+--------------+--------------+------------+------------+------------------
>        4 |     25165824 |       10622781 |          325 |            0 | 9123458028 |          0 | 37.4567373970968
> (1 row)
>
> Time: 2002419.406 ms (33:22.419)
>
> After VACUUM FULL:
>
>  Schema      | Name                                               | Type  | Owner     | Table                                 | Size   | Description
> -------------+----------------------------------------------------+-------+-----------+---------------------------------------+--------+-------------
>  bundle_link | be5be0ac8_3ba5_407d_8183_2d05b9387e81_datum_id_idx | index | mdkingpin | be5be0ac8_3ba5_407d_8183_2d05b9387e81 | 701 GB |
>
> -[ RECORD 1 ]---...---
> magic     | 105121344
> version   | 4
> ntuples   | 9123458028
> ffactor   | 409
> bsize     | 8152
> bmsize    | 4096
> bmshift   | 15
> maxbucket | 83886079
> highmask  | 134217727
> lowmask   | 67108863
> ovflpoint | 78
> firstfree | 7996259
> nmaps     | 245
> procid    | 456
>
> Time: 69.237 ms
>
> mdstash=# select * from pgstathashindex('bundle_link.be5be0ac8_3ba5_407d_8183_2d05b9387e81_datum_id_idx');
>  version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent
> ---------+--------------+----------------+--------------+--------------+------------+------------+------------------
>        4 |     83886080 |        7996014 |          245 |            0 | 9123458028 |          0 | 75.6390388675015
> (1 row)
>
> Time: 2474290.172 ms (41:14.290)

Based on the above data, we can easily see that after vacuum full there is a huge increase in free_percent, which is mostly due to the additional bucket_pages after vacuum full.
See the below calculation:

Index size difference = 701 - 273 = 428GB

Bucket page size difference = 83886080 - 25165824 = 58720256 = 448GB

Overflow page size difference = 7996014 - 10622781 = -2626767 = -20GB

So, if we just add the difference of bucket pages and overflow pages size, it will give us the difference of size you are seeing after vacuum full. So, this clearly indicates the theory I was speculating above that somehow the estimated number of tuples (based on which the number of buckets is computed) is different when we do a vacuum full.

On further looking into it, I found that the relcache entry for a relation doesn't have the correct value for relpages and reltuples during vacuum full, due to which estimate_rel_size can give some size which might be quite different, and then hashbuild can create buckets which it might not even need to populate the tuples. I am not sure if it is expected to have uninitialized (0) values for these attributes during vacuum full, but I see that in swap_relation_files, when we swap the statistics, we are assuming that the new rel has freshly-updated stats, which I think is not true. This needs some further investigation.

Another angle to look at it is that even if the values of relpages and reltuples are not updated, why do we get such a wrong estimation from estimate_rel_size? I think to some extent it depends on the schema of the table, so is it possible for you to share the schema of the table?

> Tell me if you need me to keep the index around.
>

I don't think so, but till we solve the problem there is no harm in keeping it if possible, because one might want some information at a later stage to debug this problem. OTOH, if you have a space crunch then feel free to delete it.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
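A quick standalone sketch to sanity-check the numbers above: with the stock 8 KB block size (BLCKSZ = 8192; usable bsize 8152 comes from the metapage output) and an assumed ~20 bytes per live index tuple (my assumption, not stated in the thread, though the reported free_percent values bear it out), both the size difference and the free_percent figures fall out of the pgstathashindex columns:

```python
BLCKSZ = 8192    # default PostgreSQL block size, bytes
BSIZE = 8152     # usable bytes per hash page, from the metapage above
ITEM_BYTES = 20  # assumed on-page size of one live hash index tuple

def pages_to_gib(pages: int) -> float:
    """Convert a page count to GiB."""
    return pages * BLCKSZ / 2**30

def free_percent(bucket_pages: int, overflow_pages: int, live_items: int) -> float:
    """Approximate pgstathashindex's free_percent from page and item counts."""
    usable = (bucket_pages + overflow_pages) * BSIZE
    return (1 - live_items * ITEM_BYTES / usable) * 100

# Size difference: many more bucket pages, slightly fewer overflow pages.
print(pages_to_gib(83886080 - 25165824))  # 448 GiB of extra bucket pages
print(pages_to_gib(7996014 - 10622781))   # about -20 GiB of overflow pages
# Net change matches the observed 701 GB - 273 GB = 428 GB.

# free_percent before and after VACUUM FULL (reported: 37.46 and 75.64).
print(free_percent(25165824, 10622781, 9123458028))
print(free_percent(83886080, 7996014, 9123458028))
```

Under these assumptions the model reproduces both reported free_percent values to several decimal places, supporting the reading that the bloat is almost entirely empty bucket pages.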
On Mon, Nov 20, 2017 at 01:26:50PM +0530, Amit Kapila wrote:
> Another angle to look at it is that even if the values of relpages and
> reltuples is not updated why we get such a wrong estimation by
> estimate_rel_size. I think to some extent it depends on the schema of
> the table, so is it possible for you to share schema of the table.

Hi,

Schema's simple:

CREATE TABLE link (
    datum_id BYTEA NOT NULL,
    ids BYTEA NOT NULL
);
ALTER TABLE link ALTER COLUMN datum_id SET STATISTICS 10000;
ALTER TABLE link ALTER COLUMN ids SET STATISTICS 0;
ALTER TABLE link SET (
    AUTOVACUUM_ANALYZE_SCALE_FACTOR = 0.001,
    AUTOVACUUM_VACUUM_SCALE_FACTOR = 0.001
);
CREATE INDEX ON link USING hash (datum_id) WITH ( FILLFACTOR = 90 );

That's for the live table. Then I move it aside and recreate the index with FILLFACTOR = 100.

> > Tell me if you need me to keep the index around.
>
> I don't think so, but till we solve the problem there is no harm in
> keeping it if possible because one might want some information at a
> later stage to debug this problem. OTOH, if you have space crunch
> then feel free to delete it.

No worries. I'll keep it around for as long as I can.

AP
On Mon, Nov 20, 2017 at 1:26 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: > On Mon, Nov 20, 2017 at 5:01 AM, AP <pgsql@inml.weebeastie.net> wrote: >> On Thu, Nov 16, 2017 at 09:48:13AM +0530, Amit Kapila wrote: >>> On Thu, Nov 16, 2017 at 4:59 AM, AP <pgsql@inml.weebeastie.net> wrote: >>> > I've some tables that'll never grow so I decided to replace a big index >>> > with one with a fillfactor of 100. That went well. The index shrunk to >>> > 280GB. I then did a vacuum full analyze on the table to get rid of any >>> > cruft (as the table will be static for a long time and then only deletes >>> > will happen) and the index exploded to 701GB. When it was created with >>> > fillfactor 90 (organically by filling the table) the index was 309GB. >> >> FYI: Nuking the above and doing a create index run gave me a 280GB index again. >> >>> Sounds quite strange. I think during vacuum it leads to more number >>> of splits than when the original data was loaded. By any chance do >>> you have a copy of both the indexes (before vacuum full and after >>> vacuum full)? Can you once check and share the output of >>> pgstattuple-->pgstathashindex() and pageinspect->hash_metapage_info()? >>> I wanted to confirm if the bloat is due to additional splits. >> >> > > Based on above data, we can easily see that after vacuum full, there > is a huge increase in free_percent which is mostly due to the > additional bucket_pages after vacuum full. See the below calculation: > > Index size difference = 701 - 273 = 428GB > > Bucket page size difference = 83886080 - 25165824 = 58720256 = 448GB > > Overflow page size difference = 7996014 - 10622781 = -2626767 = -20GB > > So, if we just add the difference of bucket pages and overflow pages > size, it will give us the difference of size you are seeing after > vacuum full. 
So, this clearly indicates the theory I was speculating > above that somehow the estimated number of tuples (based on which > number of buckets are computed) is different when we do a vacuum full. > On further looking into it, I found that the relcache entry for a > relation doesn't have the correct value for relpages and reltuples > during vacuum full due to which estimate_rel_size can give some size > which might be quite different and then hashbuild can create buckets > which it might not even need to populate the tuples. I am not sure if > it is expected to have uninitialized (0) values for these attributes > during vacuum full, but I see that in swap_relation_files when we swap > the statistics, we are assuming that new rel has freshly-updated stats > which I think is not true. This needs some further investigation. > I think if we update the stats in copy_heap_data after copying the data, then we don't see such problem. Attached patch should fix the issue. You can try this patch to see if it fixes the issue for you. You might want to wait for a day or so to see if anyone else has any opinion on the patch or my analysis. >> Another angle to look at it is that even if the values of relpages and >> reltuples is not updated why we get such a wrong estimation by >> estimate_rel_size. I think to some extent it depends on the schema of >> the table, so is it possible for you to share schema of the table. >> > Schema's simple: > > CREATE TABLE link ( > datum_id BYTEA NOT NULL, > ids BYTEA NOT NULL > ); > ALTER TABLE link ALTER COLUMN datum_id SET STATISTICS 10000; > ALTER TABLE link ALTER COLUMN ids SET STATISTICS 0; > I think the reason for getting totally off stats during estimate_rel_size is that for the second column you have set statistics to 0. I think if you keep it to default or some reasonable number, then you won't get such a behavior. Anyhow, I think irrespective of the value of stats, the relcache entry should also be updated as explained above. 
-- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachments
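For intuition on how far off the tuple estimate must have been: hashbuild sizes the new index from estimate_rel_size's tuple count and the metapage ffactor, so a bad estimate during the rebuild directly inflates the bucket array. The following is a rough standalone model of that sizing (my own simplification of v10's _hash_init/_hash_spareindex behavior; the quarter-step rounding rule and the 1024 threshold are assumptions, not code quoted from PostgreSQL), which nonetheless reproduces the bucket counts seen in both metapages:

```python
import math

def initial_buckets(ntuples_estimate: int, ffactor: int) -> int:
    """Smallest bucket count this simplified v10 allocator would pick.

    Model: the needed bucket count is ceil(tuples / ffactor). Small
    indexes round up to a power of two; past a threshold (1024 here,
    an assumption), v10's phased "split point" allocation also allows
    stopping at 1.25x, 1.5x, or 1.75x a power of two.
    """
    needed = math.ceil(ntuples_estimate / ffactor)
    if needed <= 1024:
        return 1 << max(needed - 1, 1).bit_length()
    k = 10
    while True:
        base = 1 << k
        for quarter in range(4):
            candidate = base + quarter * (base >> 2)  # base * (1 + quarter/4)
            if candidate >= needed:
                return candidate
        k += 1

# Metapage before VACUUM FULL: ntuples = 9123458028, ffactor = 409,
# maxbucket = 25165823, i.e. 25165824 buckets.
print(initial_buckets(9123458028, 409))  # matches the real 25165824

# The post-vacuum index has 83886080 buckets (1.25 * 2^26): under this
# model that requires a tuple estimate of roughly 28 billion or more.
print(initial_buckets(28 * 10**9, 409))
```

Under this model the 273 GB index corresponds to the true ~9.1 billion tuples, while the 701 GB rebuild behaves as if roughly three times as many tuples had been estimated, consistent with estimate_rel_size working from stale or missing relpages/reltuples during VACUUM FULL.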
On Tue, Nov 21, 2017 at 05:22:18PM +0530, Amit Kapila wrote:
> I think if we update the stats in copy_heap_data after copying the
> data, then we don't see such problem. Attached patch should fix the
> issue. You can try this patch to see if it fixes the issue for you.
> You might want to wait for a day or so to see if anyone else has any
> opinion on the patch or my analysis.

I'd love to but I won't be able to now for a week or two. The DB in question is moving towards liveness but, once it's live, I can work on a copy to see if things become good. If I can get that happening sooner I'll grab that chance.

> > Schema's simple:
> >
> > CREATE TABLE link (
> >     datum_id BYTEA NOT NULL,
> >     ids BYTEA NOT NULL
> > );
> > ALTER TABLE link ALTER COLUMN datum_id SET STATISTICS 10000;
> > ALTER TABLE link ALTER COLUMN ids SET STATISTICS 0;
>
> I think the reason for getting totally off stats during
> estimate_rel_size is that for the second column you have set
> statistics to 0. I think if you keep it to default or some reasonable
> number, then you won't get such a behavior. Anyhow, I think

Hmm. I wanted Postgres to ignore that column as it'll never be searched on or sorted by or anything else. It's just there to provide a result. Unless I missed the boat on this I'd like to keep that.

> irrespective of the value of stats, the relcache entry should also be
> updated as explained above.

Should the STATISTICS setting change index layout so drastically at any rate?

AP
On Thu, Nov 23, 2017 at 11:01 AM, AP <pgsql@inml.weebeastie.net> wrote: > On Tue, Nov 21, 2017 at 05:22:18PM +0530, Amit Kapila wrote: >> I think if we update the stats in copy_heap_data after copying the >> data, then we don't see such problem. Attached patch should fix the >> issue. You can try this patch to see if it fixes the issue for you. >> You might want to wait for a day or so to see if anyone else has any >> opinion on the patch or my analysis. > > I'd love to but I wont be able to now for a week or two. The DB in question > is moving towards liveness but, once it's live I can work on a copy to see > if things become good. If I can get that happening sooner I'll grab that > chance. > Okay. >> > Schema's simple: >> > >> > CREATE TABLE link ( >> > datum_id BYTEA NOT NULL, >> > ids BYTEA NOT NULL >> > ); >> > ALTER TABLE link ALTER COLUMN datum_id SET STATISTICS 10000; >> > ALTER TABLE link ALTER COLUMN ids SET STATISTICS 0; >> >> I think the reason for getting totally off stats during >> estimate_rel_size is that for the second column you have set >> statistics to 0. I think if you keep it to default or some reasonable >> number, then you won't get such a behavior. Anyhow, I think > > Hmm. I wanted Postgres to ignore that column as it'll never be searched > on or sorted by or anything else. It's just there to provide a result. > > Unless I missed the boat on this I'd like to keep that. > >> irrespective of the value of stats, the relcache entry should also be >> updated as explained above. > > Should the STATISTICS setting change index layout so drastically at > any rate? > Ideally not, that's why I proposed a patch to fix the actual cause of the problem. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Hi!

> I think if we update the stats in copy_heap_data after copying the
> data, then we don't see such problem. Attached patch should fix the
> issue. You can try this patch to see if it fixes the issue for you.

I'm afraid I'm not able to reproduce the problem which the patch should fix.

What I did (today's master, without patch):
autovacuum off
pgbench -i -s 100

select relname, relpages, reltuples from pg_class where relname = 'pgbench_accounts';
 relname | relpages | reltuples
------------------+----------+-----------
 pgbench_accounts | 163935 | 1e+07

vacuum full pgbench_accounts;

# select relname, relpages, reltuples from pg_class where relname = 'pgbench_accounts';
 relname | relpages | reltuples
------------------+----------+-----------
 pgbench_accounts | 163935 | 1e+07

I've tried to add a hash index to that table and print a notice about the number of pages and tuples immediately after estimate_rel_size() in hashbuild(). The hash index got the right estimation even when I deleted all rows before vacuum full. What am I doing wrong?

Patch looks good except, seems, updating stats is better to move to swap_relation_files(), then it will work even for toast tables.

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Hi, On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor@sigaev.ru> wrote: > Hi! > >> I think if we update the stats in copy_heap_data after copying the >> data, then we don't see such problem. Attached patch should fix the >> issue. You can try this patch to see if it fixes the issue for you. > > I'm afraid I'm not able to reproduce the problem which patch should fix. > > What I did (today's master, without patch): > autovacuum off > pgbench -i -s 100 > > select relname, relpages, reltuples from pg_class where relname = > 'pgbench_accounts'; > relname | relpages | reltuples > ------------------+----------+----------- > pgbench_accounts | 163935 | 1e+07 > > vacuum full pgbench_accounts; > > # select relname, relpages, reltuples from pg_class where relname = > 'pgbench_accounts'; > relname | relpages | reltuples > ------------------+----------+----------- > pgbench_accounts | 163935 | 1e+07 > > > I've tried to add hash index to that table and print notice about number of > pages and tuples immediately after estimate_rel_size() in hashbuild(). hash > index got right estimation even I deleted all rows before vacuum full. What > am I doing wrong? > > Patch looks good except, seems, updating stats is better to move to > swap_relation_files(), then it will work even for toast tables. > > I haven't looked into the patch properly, but, i could reproduce the issue. 
Here are the steps that I am following:

CREATE TABLE hash_index_table (keycol INT);
INSERT INTO hash_index_table (keycol) SELECT (a - 1) % 1000 + 1 FROM GENERATE_SERIES(1, 1000000) a;

CREATE INDEX CONCURRENTLY hash_index on hash_index_table USING HASH (keycol) with (fillfactor = 80);

CREATE EXTENSION pgstattuple;

select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
select relname, relpages, reltuples from pg_class where relname = 'hash_index_table';
select * from pgstathashindex('hash_index');

DROP INDEX hash_index;

CREATE INDEX CONCURRENTLY hash_index on hash_index_table USING HASH (keycol) with (fillfactor = 100);

select * from pgstathashindex('hash_index');
select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
select relname, relpages, reltuples from pg_class where relname = 'hash_index_table';

VACUUM FULL;

select * from pgstathashindex('hash_index');
select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
select relname, relpages, reltuples from pg_class where relname = 'hash_index_table';

I think the issue is only visible when VACUUM FULL is executed after altering the index table fill-factor. Could you please try with the above steps and let us know your observations. Thanks.

With patch, I could see that the index table stats before and after VACUUM FULL are the same.

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Sat, Dec 16, 2017 at 8:03 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > Hi, > > On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor@sigaev.ru> wrote: >> Hi! >> >>> I think if we update the stats in copy_heap_data after copying the >>> data, then we don't see such problem. Attached patch should fix the >>> issue. You can try this patch to see if it fixes the issue for you. >> >> I'm afraid I'm not able to reproduce the problem which patch should fix. >> >> What I did (today's master, without patch): >> autovacuum off >> pgbench -i -s 100 >> >> select relname, relpages, reltuples from pg_class where relname = >> 'pgbench_accounts'; >> relname | relpages | reltuples >> ------------------+----------+----------- >> pgbench_accounts | 163935 | 1e+07 >> >> vacuum full pgbench_accounts; >> >> # select relname, relpages, reltuples from pg_class where relname = >> 'pgbench_accounts'; >> relname | relpages | reltuples >> ------------------+----------+----------- >> pgbench_accounts | 163935 | 1e+07 >> >> >> I've tried to add hash index to that table and print notice about number of >> pages and tuples immediately after estimate_rel_size() in hashbuild(). hash >> index got right estimation even I deleted all rows before vacuum full. What >> am I doing wrong? >> >> Patch looks good except, seems, updating stats is better to move to >> swap_relation_files(), then it will work even for toast tables. >> >> > > I haven't looked into the patch properly, but, i could reproduce the > issue. 
Here are the steps that i am following, > > CREATE TABLE hash_index_table (keycol INT); > INSERT INTO hash_index_table (keycol) SELECT (a - 1) % 1000 + 1 FROM > GENERATE_SERIES(1, 1000000) a; > > CREATE INDEX CONCURRENTLY hash_index on hash_index_table USING HASH > (keycol) with (fillfactor = 80); > > CREATE EXTENSION pgstattuple; > > select oid, relname, relpages, reltuples from pg_class where relname = > 'hash_index'; > > select relname, relpages, reltuples from pg_class where relname = > 'hash_index_table'; > > select * from pgstathashindex('hash_index'); > > DROP INDEX hash_index; > > CREATE INDEX CONCURRENTLY hash_index on hash_index_table USING HASH > (keycol) with (fillfactor = 100); > > select * from pgstathashindex('hash_index'); > > select oid, relname, relpages, reltuples from pg_class where relname = > 'hash_index'; > > select relname, relpages, reltuples from pg_class where relname = > 'hash_index_table'; > > VACUUM FULL; > > select * from pgstathashindex('hash_index'); > > select oid, relname, relpages, reltuples from pg_class where relname = > 'hash_index'; > > select relname, relpages, reltuples from pg_class where relname = > 'hash_index_table'; > > I think the issue is only visible when VACUUM FULL is executed after > altering the index table fill-factor. Could you please try with above > steps and let us know your observations. Thanks. > > With patch, I could see that the index table stats before and after > VACUUM FULL are same. > I think you should have shared the value of stats before and after patch so that we can see if the above is a right way to validate. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Nov 23, 2017 at 11:01 AM, AP <pgsql@inml.weebeastie.net> wrote: > On Tue, Nov 21, 2017 at 05:22:18PM +0530, Amit Kapila wrote: >> I think if we update the stats in copy_heap_data after copying the >> data, then we don't see such problem. Attached patch should fix the >> issue. You can try this patch to see if it fixes the issue for you. >> You might want to wait for a day or so to see if anyone else has any >> opinion on the patch or my analysis. > > I'd love to but I wont be able to now for a week or two. > Can you try to verify the patch, if you have some bandwidth now? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor@sigaev.ru> wrote: > Hi! > >> I think if we update the stats in copy_heap_data after copying the >> data, then we don't see such problem. Attached patch should fix the >> issue. You can try this patch to see if it fixes the issue for you. > > I'm afraid I'm not able to reproduce the problem which patch should fix. > > What I did (today's master, without patch): > autovacuum off > pgbench -i -s 100 > > select relname, relpages, reltuples from pg_class where relname = > 'pgbench_accounts'; > relname | relpages | reltuples > ------------------+----------+----------- > pgbench_accounts | 163935 | 1e+07 > > vacuum full pgbench_accounts; > > # select relname, relpages, reltuples from pg_class where relname = > 'pgbench_accounts'; > relname | relpages | reltuples > ------------------+----------+----------- > pgbench_accounts | 163935 | 1e+07 > > > I've tried to add hash index to that table and print notice about number of > pages and tuples immediately after estimate_rel_size() in hashbuild(). hash > index got right estimation even I deleted all rows before vacuum full. What > am I doing wrong? > The estimation depends on the type of columns and stats. I think we need to use schema and stats in the way AP is using to see the effect AP is seeing. I was under impression that AP will help us in verifying the problem as he can reproduce it, but it seems he is busy. It seems Ashutosh is trying to reproduce the problem in a slightly different way, let us see if with his test we can see the similar effect. > Patch looks good except, seems, updating stats is better to move to > swap_relation_files(), then it will work even for toast tables. > Initially, I have also thought of doing it in swap_relation_files, but we don't have stats values there. We might be able to pass it, but not sure if there is any need for same. 
As far as Toast table's case is concerned, I don't see the problem because we are copying the data row-by-row only for heap where the value of num_tuples and num_pages could be different. See copy_heap_data. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Hi,
On Sat, Dec 16, 2017 at 8:34 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Dec 16, 2017 at 8:03 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> Hi,
>>
>> On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
>>> Hi!
>>>
>>>> I think if we update the stats in copy_heap_data after copying the
>>>> data, then we don't see such problem. Attached patch should fix the
>>>> issue. You can try this patch to see if it fixes the issue for you.
>>>
Here are the stats I saw before and after VACUUM FULL, with and without the patch. Please note I have followed the steps shared in [1].
A) Without patch - Stats before and after VACUUM FULL:
postgres[43768]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 2560 | 2024 | 1 | 0 | 1000000 | 0 | 46.4793701521013 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)
Time: 6.531 ms
postgres[43768]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 4586 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)
Time: 0.369 ms
postgres[43768]=# VACUUM FULL;
VACUUM
Time: 4145.813 ms (00:04.146)
postgres[43768]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 3072 | 2019 | 1 | 0 | 1000000 | 0 | 51.8093562713087 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)
Time: 9.194 ms
postgres[43768]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 5093 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)
Time: 1.289 ms
B) With patch - Stats before and after VACUUM FULL:
postgres[31111]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 2560 | 2024 | 1 | 0 | 1000000 | 0 | 46.4793701521013 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)
Time: 6.539 ms
postgres[31111]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 4586 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)
Time: 0.379 ms
postgres[31111]=# VACUUM FULL;
VACUUM
Time: 4265.662 ms (00:04.266)
postgres[31111]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 2560 | 2024 | 1 | 0 | 1000000 | 0 | 46.4793701521013 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)
Time: 6.699 ms
postgres[31111]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 4586 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)
Time: 0.893 ms
[1] - https://www.postgresql.org/message-id/CAE9k0P%3DihFyPAKfrMX9GaDo5RaeGSJ4i4nb28fGev15wKOPYog%40mail.gmail.com
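As a side note, the bucket_pages values in the tables above line up with the same quarter-step bucket allocation seen in the big index (my reading of v10's phased splitting, offered as a consistency check rather than a fact from the thread):

```python
base = 2 ** 11  # 2048

before_vacuum = 2560  # bucket_pages before VACUUM FULL, from the tables above
after_vacuum = 3072   # bucket_pages after VACUUM FULL, unpatched

# Both counts sit on quarter multiples of a power of two: 1.25 * 2^11
# and 1.5 * 2^11, so the unpatched rebuild jumped exactly one
# allocation step, mirroring the behavior of the 701 GB index.
print(before_vacuum / base)  # 1.25
print(after_vacuum / base)   # 1.5
```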
>>> I'm afraid I'm not able to reproduce the problem which patch should fix.
>>>
>>> What I did (today's master, without patch):
>>> autovacuum off
>>> pgbench -i -s 100
>>>
>>> select relname, relpages, reltuples from pg_class where relname =
>>> 'pgbench_accounts';
>>> relname | relpages | reltuples
>>> ------------------+----------+-----------
>>> pgbench_accounts | 163935 | 1e+07
>>>
>>> vacuum full pgbench_accounts;
>>>
>>> # select relname, relpages, reltuples from pg_class where relname =
>>> 'pgbench_accounts';
>>> relname | relpages | reltuples
>>> ------------------+----------+-----------
>>> pgbench_accounts | 163935 | 1e+07
>>>
>>>
>>> I've tried to add hash index to that table and print notice about number of
>>> pages and tuples immediately after estimate_rel_size() in hashbuild(). hash
>>> index got right estimation even I deleted all rows before vacuum full. What
>>> am I doing wrong?
>>>
>>> Patch looks good except, seems, updating stats is better to move to
>>> swap_relation_files(), then it will work even for toast tables.
>>>
>>>
>>
>> I haven't looked into the patch properly, but, i could reproduce the
>> issue. Here are the steps that i am following,
>>
>> CREATE TABLE hash_index_table (keycol INT);
>> INSERT INTO hash_index_table (keycol) SELECT (a - 1) % 1000 + 1 FROM
>> GENERATE_SERIES(1, 1000000) a;
>>
>> CREATE INDEX CONCURRENTLY hash_index on hash_index_table USING HASH
>> (keycol) with (fillfactor = 80);
>>
>> CREATE EXTENSION pgstattuple;
>>
>> select oid, relname, relpages, reltuples from pg_class where relname =
>> 'hash_index';
>>
>> select relname, relpages, reltuples from pg_class where relname =
>> 'hash_index_table';
>>
>> select * from pgstathashindex('hash_index');
>>
>> DROP INDEX hash_index;
>>
>> CREATE INDEX CONCURRENTLY hash_index on hash_index_table USING HASH
>> (keycol) with (fillfactor = 100);
>>
>> select * from pgstathashindex('hash_index');
>>
>> select oid, relname, relpages, reltuples from pg_class where relname =
>> 'hash_index';
>>
>> select relname, relpages, reltuples from pg_class where relname =
>> 'hash_index_table';
>>
>> VACUUM FULL;
>>
>> select * from pgstathashindex('hash_index');
>>
>> select oid, relname, relpages, reltuples from pg_class where relname =
>> 'hash_index';
>>
>> select relname, relpages, reltuples from pg_class where relname =
>> 'hash_index_table';
>>
>> I think the issue is only visible when VACUUM FULL is executed after
>> altering the index table fill-factor. Could you please try with above
>> steps and let us know your observations. Thanks.
>>
>> With patch, I could see that the index table stats before and after
>> VACUUM FULL are same.
>>
>
> I think you should have shared the value of stats before and after
> patch so that we can see if the above is a right way to validate.
>
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
On Sat, Dec 16, 2017 at 8:34 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Dec 16, 2017 at 8:03 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> Hi,
>>
>> On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
>>> Hi!
>>>
>>>> I think if we update the stats in copy_heap_data after copying the
>>>> data, then we don't see such problem. Attached patch should fix the
>>>> issue. You can try this patch to see if it fixes the issue for you.
>>>
Here are the stats i saw before and after VACUUM FULL - with and without the patch. Please note i have followed the steps shared in - [1].
A) Without patch - Stats before and after VACUUM FULL:
postgres[43768]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 2560 | 2024 | 1 | 0 | 1000000 | 0 | 46.4793701521013 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)
Time: 6.531 ms
postgres[43768]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 4586 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)
Time: 0.369 ms
postgres[43768]=# VACUUM FULL;
VACUUM
Time: 4145.813 ms (00:04.146)
postgres[43768]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 3072 | 2019 | 1 | 0 | 1000000 | 0 | 51.8093562713087 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)
Time: 9.194 ms
postgres[43768]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 5093 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)
Time: 1.289 ms
B) With Patch- Stats before and after VACUUM FULL:
postgres[31111]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 2560 | 2024 | 1 | 0 | 1000000 | 0 | 46.4793701521013 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)
Time: 6.539 ms
postgres[31111]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 4586 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)
Time: 0.379 ms
postgres[31111]=# VACUUM FULL;
VACUUM
Time: 4265.662 ms (00:04.266)
postgres[31111]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 2560 | 2024 | 1 | 0 | 1000000 | 0 | 46.4793701521013 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)
Time: 6.699 ms
postgres[31111]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 4586 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)
Time: 0.893 ms
[1] - https://www.postgresql.org/message-id/CAE9k0P%3DihFyPAKfrMX9GaDo5RaeGSJ4i4nb28fGev15wKOPYog%40mail.gmail.com
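As a sanity check on the figures above, the pgstathashindex page classes should add up to pg_class.relpages once the metapage is counted (unused_pages is zero in both runs). A quick standalone sketch using the numbers quoted above (the helper function is illustrative, not part of any PostgreSQL API):

```python
# Reconcile the pgstathashindex output quoted above with pg_class.relpages.
# A hash index consists of one metapage plus the bucket, overflow, bitmap
# and unused pages that pgstathashindex reports.

def total_pages(bucket_pages, overflow_pages, bitmap_pages, unused_pages=0):
    """Expected relpages: metapage + all page classes from pgstathashindex."""
    return 1 + bucket_pages + overflow_pages + bitmap_pages + unused_pages

# A) Without patch: 3072 bucket + 2019 overflow + 1 bitmap (+ metapage)
assert total_pages(3072, 2019, 1) == 5093  # matches relpages in pg_class
# B) With patch: 2560 bucket + 2024 overflow + 1 bitmap (+ metapage)
assert total_pages(2560, 2024, 1) == 4586  # matches relpages in pg_class
print("page counts reconcile with pg_class.relpages")
```

Note that the extra space in the unpatched run is almost entirely bucket pages (3072 vs 2560); the overflow and bitmap page counts barely move.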
>>> I'm afraid I'm not able to reproduce the problem which patch should fix.
>>>
>>> What I did (today's master, without patch):
>>> autovacuum off
>>> pgbench -i -s 100
>>>
>>> select relname, relpages, reltuples from pg_class where relname =
>>> 'pgbench_accounts';
>>> relname | relpages | reltuples
>>> ------------------+----------+-----------
>>> pgbench_accounts | 163935 | 1e+07
>>>
>>> vacuum full pgbench_accounts;
>>>
>>> # select relname, relpages, reltuples from pg_class where relname =
>>> 'pgbench_accounts';
>>> relname | relpages | reltuples
>>> ------------------+----------+-----------
>>> pgbench_accounts | 163935 | 1e+07
>>>
>>>
>>> I've tried to add hash index to that table and print notice about number of
>>> pages and tuples immediately after estimate_rel_size() in hashbuild(). hash
>>> index got right estimation even I deleted all rows before vacuum full. What
>>> am I doing wrong?
>>>
>>> Patch looks good except, seems, updating stats is better to move to
>>> swap_relation_files(), then it will work even for toast tables.
>>>
>>>
>>
>> I haven't looked into the patch properly, but, i could reproduce the
>> issue. Here are the steps that i am following,
>>
>> CREATE TABLE hash_index_table (keycol INT);
>> INSERT INTO hash_index_table (keycol) SELECT (a - 1) % 1000 + 1 FROM
>> GENERATE_SERIES(1, 1000000) a;
>>
>> CREATE INDEX CONCURRENTLY hash_index on hash_index_table USING HASH
>> (keycol) with (fillfactor = 80);
>>
>> CREATE EXTENSION pgstattuple;
>>
>> select oid, relname, relpages, reltuples from pg_class where relname =
>> 'hash_index';
>>
>> select relname, relpages, reltuples from pg_class where relname =
>> 'hash_index_table';
>>
>> select * from pgstathashindex('hash_index');
>>
>> DROP INDEX hash_index;
>>
>> CREATE INDEX CONCURRENTLY hash_index on hash_index_table USING HASH
>> (keycol) with (fillfactor = 100);
>>
>> select * from pgstathashindex('hash_index');
>>
>> select oid, relname, relpages, reltuples from pg_class where relname =
>> 'hash_index';
>>
>> select relname, relpages, reltuples from pg_class where relname =
>> 'hash_index_table';
>>
>> VACUUM FULL;
>>
>> select * from pgstathashindex('hash_index');
>>
>> select oid, relname, relpages, reltuples from pg_class where relname =
>> 'hash_index';
>>
>> select relname, relpages, reltuples from pg_class where relname =
>> 'hash_index_table';
>>
>> I think the issue is only visible when VACUUM FULL is executed after
>> altering the index table fill-factor. Could you please try with above
>> steps and let us know your observations. Thanks.
>>
>> With patch, I could see that the index table stats before and after
>> VACUUM FULL are same.
>>
>
> I think you should have shared the value of stats before and after
> patch so that we can see if the above is a right way to validate.
>
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
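Amit's theory upthread is that the rebuild performs more splits than the original load. A toy model can illustrate why a skewed tuple estimate at build time matters; the function and all constants below are illustrative assumptions, not the actual _hash_init/_hash_spareindex code:

```python
# Toy model of hash-index bucket pre-allocation: the build sizes the
# index for the *estimated* tuple count, and the bucket count grows by
# splitting (modelled here as simple doubling). All constants are
# illustrative, not PostgreSQL's real page layout arithmetic.

def estimated_bucket_pages(est_tuples, fillfactor=75,
                           item_bytes=20, page_bytes=8192):
    """Smallest power-of-two bucket count whose fillfactor-limited
    capacity covers the estimated tuple count."""
    items_per_bucket = (page_bytes * fillfactor // 100) // item_bytes
    buckets = 1
    while buckets * items_per_bucket < est_tuples:
        buckets *= 2
    return buckets

good = estimated_bucket_pages(1_000_000)  # accurate estimate
bad = estimated_bucket_pages(3_300_000)   # inflated estimate after a rewrite
print(good, bad)
```

In this model an estimate just past a power-of-two boundary doubles the bucket pages even though the live data is unchanged, which would line up with an index roughly doubling in size when VACUUM FULL rebuilds it from a bad size estimate.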
On Sat, Dec 16, 2017 at 09:08:23AM +0530, Amit Kapila wrote:
> On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> The estimation depends on the type of columns and stats. I think we
> need to use schema and stats in the way AP is using to see the effect
> AP is seeing. I was under impression that AP will help us in
> verifying the problem as he can reproduce it, but it seems he is busy.

Different fires keep springing up and they are causing delay. This
is still on my mind and I'll get back to it as soon as I can.

My apologies. :(

AP
On Sat, Dec 16, 2017 at 12:27 PM, AP <pgsql@inml.weebeastie.net> wrote:
> On Sat, Dec 16, 2017 at 09:08:23AM +0530, Amit Kapila wrote:
>> On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
>> The estimation depends on the type of columns and stats. I think we
>> need to use schema and stats in the way AP is using to see the effect
>> AP is seeing. I was under impression that AP will help us in
>> verifying the problem as he can reproduce it, but it seems he is busy.
>
> Different fires keep springing up and they are causing delay. This
> is still on my mind and I'll get back to it as soon as I can.

Okay. I think Ashutosh has reproduced it with a standalone test, let
us see if that suffice the need. In any case, feel free to verify in
the meantime.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Hi,

On Sat, Dec 16, 2017 at 12:56 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Dec 16, 2017 at 12:27 PM, AP <pgsql@inml.weebeastie.net> wrote:
> > On Sat, Dec 16, 2017 at 09:08:23AM +0530, Amit Kapila wrote:
> >> On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> >> The estimation depends on the type of columns and stats. I think we
> >> need to use schema and stats in the way AP is using to see the effect
> >> AP is seeing. I was under impression that AP will help us in
> >> verifying the problem as he can reproduce it, but it seems he is busy.
> >
> > Different fires keep springing up and they are causing delay. This
> > is still on my mind and I'll get back to it as soon as I can.
>
> Okay. I think Ashutosh has reproduced it with a standalone test, let
> us see if that suffice the need. In any case, feel free to verify in
> the meantime.

I am able to reproduce the problem reported by AP with the following
test-case. My earlier test-case [1] was also helpful in diagnosing the
problem and understanding Amit's patch [2], but this test-case exposes
the problem more clearly than my earlier one.

test-case
=======
1) CREATE TABLE hash_index_table(datum_id BYTEA NOT NULL);

2) INSERT INTO hash_index_table SELECT repeat(stringu1, 30)::bytea FROM tenk1;

3) ALTER TABLE hash_index_table ALTER COLUMN datum_id SET STATISTICS 0;

4) ANALYZE hash_index_table;

5) CREATE INDEX hash_index ON hash_index_table USING hash (datum_id);

6) select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';

7) DROP INDEX hash_index;

8) CREATE INDEX hash_index ON hash_index_table USING hash (datum_id);

9) select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';

10) VACUUM FULL hash_index_table;

11) select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';

The output of the above test-case, without and with the patch, is as follows.

Output (without patch):
================
postgres[72965]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+--------+------------+----------+-----------+
|  oid   |  relname   | relpages | reltuples |
+--------+------------+----------+-----------+
| 287169 | hash_index |       69 |     10000 |
+--------+------------+----------+-----------+
(1 row)

Time: 0.381 ms
postgres[72965]=# VACUUM FULL hash_index_table;
VACUUM
Time: 55.703 ms
postgres[72965]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+--------+------------+----------+-----------+
|  oid   |  relname   | relpages | reltuples |
+--------+------------+----------+-----------+
| 287169 | hash_index |      130 |     10000 |
+--------+------------+----------+-----------+
(1 row)

Time: 0.904 ms

Output (with patch):
==============
postgres[85460]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
|  oid  |  relname   | relpages | reltuples |
+-------+------------+----------+-----------+
| 26394 | hash_index |       69 |     10000 |
+-------+------------+----------+-----------+
(1 row)

Time: 0.370 ms
postgres[85460]=# VACUUM FULL hash_index_table;
VACUUM
Time: 68.351 ms
postgres[85460]=# select oid, relname, relpages, reltuples from pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
|  oid  |  relname   | relpages | reltuples |
+-------+------------+----------+-----------+
| 26394 | hash_index |       69 |     10000 |
+-------+------------+----------+-----------+
(1 row)

Time: 0.838 ms

Please note that I have tried running the above test-case both with and
without ANALYZE hash_index_table (step #4) and the problem is observed
in both cases.

[1] - https://www.postgresql.org/message-id/CAE9k0P%3DihFyPAKfrMX9GaDo5RaeGSJ4i4nb28fGev15wKOPYog%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAA4eK1%2B6BSGrm%2BNtUDhx59CNR51Ehbnmch9LpswMEyrLBBjKLg%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Tue, Dec 19, 2017 at 5:30 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> Hi,
>
> On Sat, Dec 16, 2017 at 12:56 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Okay. I think Ashutosh has reproduced it with a standalone test, let
>> us see if that suffice the need. In any case, feel free to verify in
>> the meantime.
>
> I am able to reproduce the problem reported by AP with the following
> test-case. My earlier test-case [1] was also helpful in diagnosing the
> problem and understanding Amit's patch [2], but this test-case exposes
> the problem more clearly than my earlier one.
>
> [test-case and output snipped; see the previous message]
>
> Please note that I have tried running the above test-case both with and
> without ANALYZE hash_index_table (step #4) and the problem is observed
> in both cases.

I'm not sure when AP is planning to share his test results. But I've
shared my test results [1] and also reviewed the patch. As mentioned in
my earlier update [1], the patch looks good to me and it fixes the
issue. I am therefore moving the patch to 'Ready for Committer'. Thanks.

[1] - https://www.postgresql.org/message-id/CAE9k0PmTZKLA2hKAPT6OCinH%2BXX%2BXVSej3jx17j9SMBJr%3DFvkA%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Wed, Dec 27, 2017 at 8:59 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> thank you, pushed

Thanks a lot.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com