Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
| От | Peter Geoghegan | 
|---|---|
| Тема | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. | 
| Дата | |
| Msg-id | CAH2-Wzm=9TnAFGCDfvsBVC5zYonQqeLMmYpnx=xZ3nyXOeHjNA@mail.gmail.com обсуждение исходный текст | 
| Ответ на | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Peter Geoghegan <pg@bowt.ie>) | 
| Ответы | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. | 
| Список | pgsql-hackers | 
On Mon, Sep 2, 2019 at 6:53 PM Peter Geoghegan <pg@bowt.ie> wrote: > Attach is v10, which fixes the Valgrind issue. Attached is v11, which makes the kill_prior_tuple optimization work with posting list tuples. The only catch is that it can only work when all "logical tuples" within a posting list are known-dead, since of course there is only one LP_DEAD bit available for each posting list. The hardest part of this kill_prior_tuple work was writing the new _bt_killitems() code, which I'm still not 100% happy with. Still, it seems to work well -- new pageinspect LP_DEAD status info was added to the second patch to verify that we're setting LP_DEAD bits as needed for posting list tuples. I also had to add a new nbtree-specific, posting-list-aware version of index_compute_xid_horizon_for_tuples() -- _bt_compute_xid_horizon_for_tuples(). Finally, it was necessary to avoid splitting a posting list with the LP_DEAD bit set. I took a naive approach to avoiding that problem, adding code to _bt_findinsertloc() to prevent it. Posting list splits are generally assumed to be rare, so the fact that this is slightly inefficient should be fine IMV. I also refactored deduplication itself in anticipation of making the WAL logging more efficient, and incremental. So, the structure of the code within _bt_dedup_one_page() was simplified, without really changing it very much (I think). I also fixed a bug in _bt_dedup_one_page(). The check for dead items was broken in previous versions, because the loop examined the high key tuple in every iteration. Making _bt_dedup_one_page() more efficient and incremental is still the most important open item for the patch. -- Peter Geoghegan
Вложения
В списке pgsql-hackers по дате отправления: