Обсуждение: GiST splitting on empty pages

Поиск
Список
Период
Сортировка

GiST splitting on empty pages

От
Andrew Gierth
Дата:
This is from Bug #11555, which is still in moderation as I type this
(analysis was done via IRC).

The GiST insertion code appears to have no length checks at all on the
inserted entry. index_form_tuple checks for length <= 8191, with the
default blocksize, but obviously a tuple less than 8191 bytes may
still not fit on the page due to page header info etc.

However, the gist insertion code assumes that if the tuple doesn't fit,
then it must have to split the page, with no check whether the page is
already empty.

So this crashes with infinite recursion in gistSplit (which also lacks
a check_stack_depth() call):

create extension pg_trgm;

create table t1 (a text);
create index on t1 using gist (a gist_trgm_ops);

insert into t1 values
('vvsehbxxlezgwtyvbgdyburmhxmuzbwvwoaepxbbbaiyuguwnxvprmbzkotkqqfnegruds''wftedolykzvfonsqndehouxuibazwdrtjlynzjlkihqxvjnimrpbmnvupvtlzlejxdwwmh''hvpxtkggstyivlvqgkmawmlbvjerfrzmnokgyrnrllagwwxdgjddwofrxjidbiqowbvusi''mdumkrihuprxsnmyekhnojvexsftmybzcwlmntuijlfcyracciqqrmuoeairzkqbgcouvi''cfthhszhvbplshxmwcnetnokovmdpimrnfzuxzcsaseszcfvetaxgoivjuzzclqprqkopn''hqgmjgoocsicqpqylatkzvvqlmhwbwjjmpvvwkkyctatirstsldsgzismqonmxxzntvkdf''ifzizharbsdfkjetcrjqfwocvcvqmywuevddddvevyjgiozkfialfpnarjqdinymibqlem''qakzgtofeuoeftutulclpkynxgoostaitkizewfirunxnhqhttsiervbxkpqdqyxbhxfdc''nvwbskiirbckkgbfizqypuoorpvovzqiunjnxswpuyaefbkobbmrvbgmrbbmbsvwffjcxf''ssesxjtiyvjkmemsrdusqvklspqbsohkhlcevwmtrveeaqwrurjknuwfkngcbnnjzpnvma''odvsiwjfnewxpjslocyveajsjjhxeuxsxtlgqvldzhbortagvybazlsjuagyueqsycyoxj''swrtljnlpikrjjccswczuxelpdnorlyjhpdszqdozngjxilqfoqalumaxapplnzscclctp''rtdxdagorlchmocypayepjrpcusowldnfgkihrxzcagoojndjmizwzoyugmqsqeyxpgege''ejelytulxeyfdufsszzfqrvupskwxrbbafnyzhkwlicpchivhhaxywsopdlnumpusctrje''ovmqlpytlfamdziwnxzyltlaodciummihzzsoxmadmmgluczscxdwiekmgsgsfpaeostme''tprfwcazbtbzwyibiuhbbahqalftfryyhpeeseduxftudcvmwdoxdvodgtxllvktkoxdta''xrgqmjtiwqlknpfctmwqyhliawxyzrywienvogdkwovkeanxmjnkrztsvrqviprquflimp''tjeouiphfcrtnisgaoxrjfgbwahijuxddbsxkhfqjwjwfcdgrbxagdcdekmoekshmkfwsl''mbivynyctpdrqkutnzdaohkgpwqvsihfkpajczlwoonfziynibnwjxczttumcbrnrswtri''qgxelwmjjvlwruuutnoozqpregjbaajrhhvsdicndnkvhepbseprvfjzmsamtkearzsuiu''ilhsgpwwqoafgvkpuwhujbenbwnuqvoygwrnlnjccjhiesyyogtyhymiuzclvrkbobpapy''crhjalcykreepmdbvyaxkvpuxdwmdllfcspdesbpjguysyaowbmhwbcufyhiksonleqpws''ffyzerxefufrcctexydegnxvajzrywjebiegzckfxqxzsqdpohuvusrvbrmanwepeivelg''jiwhhoxlemszimraisrvterytwhpasvkarrhgptlclklyblhuccnhumbqtqrllcldutkkn''vmyfyxhkecmhqubcvsvmkgxmsbgllqyhdxmbuulzwygmtipoakakqywjadvltusxrfymzk''mwjsjcayqbirlzpiipmebfyucqabcampwvigxieoknfwnvfvlranxyiaoibringfjolgxq''uhdaeqwjmhamvxldxzlzqunxawmmdjcyrgzvxvfjcfwydhbbhmbxhovhlhtoqwnicmeahj''jkpgitojuvwvtekomvwfkncxvfkzfrjpcyvlskvmfrizwsoiokoyxqwsvhsazbpbalmsvh''fbznavgoeuystwjpoexhfwjvxkgkdcridrwdncsrxrkqntgbydjdzszwcfgghyolqlodnh''ukyfblyhnwkwajpzgsfnynlnybynfmuzxseyddfrapnaycafugsstdfsfefkqaknsplwsq''ntgbufdukybcrugxnmbsxrsielxqiqhwjnxdtbydzzgqunnhzoawgsflecbmtjjcxggqhe''tgeaxynkgmzgjgzordrtqkdznaftqnyktdkrcxcikbouiniarathkxgyxmsnzrytuikwfm''eqotkxgtxxtrfeomclyvzymxrggcdmpicebmbifyfzpldexgqvbptnnlutnxfdfihhuipa''hvaxgdbdkszliszvetpsrvvxddeymuytpyrvctzmlyytrxovreojzjhcnlazgzsvykrbdq''nopmhgjwcbaqlaasdneemkdfgcpxdtoqhddoknlmomdzmdrprvtegxkmzajctytacxpmka''zzncyzgqpxmjcsgmfgmojfndgpawckwbjjeijlzzjmilfpxkwkzfqmjxbjteuqfeaknjvm''iezrqegnodynjpasmbbffwvlavwnfraowzfmdmaspygyograisgopcaqxwednerkexwijw''azvhyjnpkwiqkxsloqhsuvwlfbjbjtykturrefhpcpfnnyybpftjaqvfsfhbygmraejekq''umfzztyxuocoydftixzqzxwlpxpyczowmuwlnuiiilxgocaxaaozxklnialkaagmucyixh''qgsnnhqnfqntpleaymbkxckdpfgnnduejnrwuikayytokyoilqtisdmhisvwwpafcscxan''xylrnvpcebsxjlbvtkoogkegqhzsfzgdyrulnknslgqusrqmbebhpfofnnysnewlvqxjal''cmrshkjxwkcxsrdhwquujhzftvwqexbgjtyqdioatqxliatfrnabvzhoueeybgflzecdmq''dghbsqclvuyvvtudiohm'
);

Suggested fixes (probably all of these are appropriate):

1. gistSplit should have check_stack_depth()

2. gistSplit should probably refuse to split anything if called with
only one item (which is the value being inserted).

3. somewhere before reaching gistSplit it might make sense to check
explicitly (e.g. in gistFormTuple) whether the tuple will actually
fit on a page.

4. pg_trgm probably should do something more sensible with large leaf
items, but this is a peripheral issue since ultimately the gist core
must enforce these limits rather than rely on the opclass.

-- 
Andrew (irc:RhodiumToad)



Re: GiST splitting on empty pages

От
Heikki Linnakangas
Дата:
On 10/03/2014 05:03 AM, Andrew Gierth wrote:
> This is from Bug #11555, which is still in moderation as I type this
> (analysis was done via IRC).
>
> The GiST insertion code appears to have no length checks at all on the
> inserted entry. index_form_tuple checks for length <= 8191, with the
> default blocksize, but obviously a tuple less than 8191 bytes may
> still not fit on the page due to page header info etc.
>
> However, the gist insertion code assumes that if the tuple doesn't fit,
> then it must have to split the page, with no check whether the page is
> already empty.
>
> [script to reproduce]

Thanks for the analysis!

> Suggested fixes (probably all of these are appropriate):
>
> 1. gistSplit should have check_stack_depth()
>
> 2. gistSplit should probably refuse to split anything if called with
> only one item (which is the value being inserted).
>
> 3. somewhere before reaching gistSplit it might make sense to check
> explicitly (e.g. in gistFormTuple) whether the tuple will actually
> fit on a page.
>
> 4. pg_trgm probably should do something more sensible with large leaf
> items, but this is a peripheral issue since ultimately the gist core
> must enforce these limits rather than rely on the opclass.

Fixed, I did 1. and 2. Number 3. would make a lot of sense, but I 
couldn't totally convince myself that we can safely put the check in 
gistFormTuple() without causing some cases to fail that currently work. 
I think the code sometimes forms tuples that are never added to the 
insert as such, used only to "union" them to existing tuples on internal 
pages. Or maybe not, but in any case, the check in gistSplit() is enough 
to stop the crash.

- Heikki