Re: pgsql: Optimize btree insertions for common case of increasing values
От | Simon Riggs |
---|---|
Тема | Re: pgsql: Optimize btree insertions for common case of increasing values |
Дата | |
Msg-id | CANP8+jLv-yNmnPeMQ-zuvQzbMcR3sLVyDQWvbzjjZM4nZ3YH+w@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: pgsql: Optimize btree insertions for common case of increasing values (Peter Geoghegan <pg@bowt.ie>) |
Ответы |
Re: pgsql: Optimize btree insertions for common case of increasing values
(Peter Geoghegan <pg@bowt.ie>)
|
Список | pgsql-committers |
On 29 March 2018 at 00:09, Peter Geoghegan <pg@bowt.ie> wrote: > On Wed, Mar 28, 2018 at 11:56 AM, Peter Geoghegan <pg@bowt.ie> wrote: >>> Previously, we agreed that P_IGNORE() is required. So I assume no issues >>> there. The other tests seem required too for us to conclusively decide to >>> use the cached block. >> >> Actually, you're right that P_INCOMPLETE_SPLIT() is the only redundant >> one. > > Another issue that I have with the main test of the > RelationSetTargetBlock() page is that nobody explains *why* we check > that we have enough space on the page to proceed. Why would it not be > okay if there was a page split? ...because we need the Stack to bubble up the parent insert. > Although it's subtle, I'm pretty confident that it actually would be > okay from a correctness point of view to allow an insertion to go > ahead that would result in a page split -- it's possible to pass a > NULL stack to _bt_findinsertloc() and _bt_insertonpg() while still > getting correct behavior. > > * In the case of _bt_findinsertloc(), it's okay to have a NULL stack > because you'll never take the P_INCOMPLETE_SPLIT() path for a > rightmost page, as already explained, so the stack cannot possibly be > used (besides, even _bt_finish_split() can deal with a NULL stack). It > is not just an accident that it works, because _bt_search() will > sometimes return a NULL stack when writing -- it does so when there is > only one non-meta page, which is both a root and a leaf page. Your > optimization is a bit similar to that, in the sense that the > cached/target block contains a leaf page that is rightmost (root pages > are always rightmost, and are leaf pages in the case described). > > * In the case of _bt_insertonpg(), it's okay to have a NULL stack > because there is a codepath that allows _bt_insert_parent() (which is > the only thing in _bt_insertonpg() that touches the stack) to work > without a stack during crash recovery. That path is quite a bit > slower, though. It's a linear scan, so quite clearly not going to perform well on big indexes. Why would that be worth pursuing? > Suggested next steps to deal with this: > > * A minor point: I don't think you should call > RelationSetTargetBlock() when the page P_ISROOT(), which, as I > mentioned, is a condition that can coexist with P_ISLEAF() with very > small B-Trees. There can be no point in doing so. No need to add > P_ISROOT() to the main "Is cached page stale?" test within > _bt_doinsert(), though. True. A better test would be to not use the optimization at all on smaller btrees by checking the level is >= 3. > * An assertion would make me feel a lot better about depending on not > having a page split from a significant distance. > > Your optimization should continue to not be used when it would result > in a page split, but only because that would be slow. The comments > should say so, IMV. Also, _bt_insertonpg() should have an assertion > against a page split actually occurring when the optimization was > used, just in case. When !BufferIsValid(cbuf), we know that we're > being called from _bt_doinsert() (see existing assertion at the top of > _bt_insertonpg() as an example of this), so at the point where it's > clear a page split is needed, we should assert that there is no target > block that we must have been passed as the target page. This is the same issues as the NULL stack. The page split would need to insert into parent. > * The indentation of the main _bt_doinsert() test does not follow > project guidelines. Please tweak that, too. > > That's all I have for now. I might think of something else tomorrow, > since I haven't spent that much time on this. It would be nice to get > a second opinion on some of this, if at all possible. I don't see any need to change any of these things. This is a tuning patch and none of the above affects correctness of what has been committed. If you can find non-performant cases that require further tuning, we can submit another patch at a later time. And then we can cross the bridge of checking whether what is said above is correct or not. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
В списке pgsql-committers по дате отправления:
Следующее
От: Tom LaneДата:
Сообщение: pgsql: While vacuuming a large table,update upper-level FSM data every