Re: pgsql: Optimize btree insertions for common case of increasing values
От | Peter Geoghegan |
---|---|
Тема | Re: pgsql: Optimize btree insertions for common case of increasing values |
Дата | |
Msg-id | CAH2-WznpGDHev3QTeFA2XO-51czgLt=61C+fWvHAe+V7K97NGA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: pgsql: Optimize btree insertions for common case of increasing values (Simon Riggs <simon@2ndquadrant.com>) |
Список | pgsql-committers |
On Thu, Mar 29, 2018 at 7:59 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Another issue that I have with the main test of the >> RelationSetTargetBlock() page is that nobody explains *why* we check >> that we have enough space on the page to proceed. Why would it not be >> okay if there was a page split? > > ...because we need the Stack to bubble up the parent insert. As I said, strictly speaking we do not. >> Although it's subtle, I'm pretty confident that it actually would be >> okay from a correctness point of view to allow an insertion to go >> ahead that would result in a page split -- it's possible to pass a >> NULL stack to _bt_findinsertloc() and _bt_insertonpg() while still >> getting correct behavior. > It's a linear scan, so quite clearly not going to perform well on big indexes. > > Why would that be worth pursuing? I must have been unclear. I agree that it isn't worth pursuing. My point is that the fact that that actually works (e.g. it won't segfault) could mask bugs where the "we must avoid a pagesplit" logic breaks in the future. Relying on that working from a distance is something that we can and should verify works out, with an assertion. >> Suggested next steps to deal with this: >> >> * A minor point: I don't think you should call >> RelationSetTargetBlock() when the page P_ISROOT(), which, as I >> mentioned, is a condition that can coexist with P_ISLEAF() with very >> small B-Trees. There can be no point in doing so. No need to add >> P_ISROOT() to the main "Is cached page stale?" test within >> _bt_doinsert(), though. > > True. A better test would be to not use the optimization at all on > smaller btrees by checking the level is >= 3. That might be a better idea, though I would probably go with >= 2. You can have a reasonably big B-Tree with only one layer of internal pages between the root and the leaf level. Though that's not worth quibbling about. >> * An assertion would make me feel a lot better about depending on not >> having a page split from a significant distance. >> >> Your optimization should continue to not be used when it would result >> in a page split, but only because that would be slow. The comments >> should say so, IMV. Also, _bt_insertonpg() should have an assertion >> against a page split actually occurring when the optimization was >> used, just in case. When !BufferIsValid(cbuf), we know that we're >> being called from _bt_doinsert() (see existing assertion at the top of >> _bt_insertonpg() as an example of this), so at the point where it's >> clear a page split is needed, we should assert that there is no target >> block that we must have been passed as the target page. > > This is the same issues as the NULL stack. The page split would need > to insert into parent. But, to repeat myself, it actually can do that. Try removing the space check in the main _bt_doinsert() test for yourself -- the regression tests continue to pass. I would like there to be an assertion failure instead. > I don't see any need to change any of these things. This is a tuning > patch and none of the above affects correctness of what has been > committed. I think that there has been a misunderstanding. I'm only asking for a new assertion when I talk about the stack. That's all. -- Peter Geoghegan
В списке pgsql-committers по дате отправления: