On Wed, Aug 28, 2019 at 5:30 AM Alexandra Wang <lewang@pivotal.io> wrote:
You are correct that we currently go through each item in the leaf page that contains the given tid. Specifically, the logic to retrieve all the attribute items inside a ZSAttStream has now moved to decode_attstream() in the latest code, and then in zsbt_attr_fetch() we loop again through each item previously retrieved by decode_attstream() and look for the given tid.
Okay. Any idea why this new way of storing attribute data as streams (lowerstream and upperstream) has been chosen just for the attributes but not for tids? Are only the attribute blocks compressed, and not the tid blocks?
One optimization we can do is to tell decode_attstream() to stop decoding at the tid we are interested in. We can also apply other tricks to speed up lookups in the page: for fixed-length attributes, it is easy to do a binary search instead of a linear search, and for variable-length attributes, we can probably try something that we haven't thought of yet.
I think we can probably ask decode_attstream() to stop once it has found the tid that we are searching for, but then we only need to do that for Index Scans.
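
To make the two ideas concrete, here is a rough sketch of an early-stopping linear lookup next to a binary search over fixed-length items sorted by tid. This is not zedstore code: DemoItem, demo_linear_lookup() and demo_binary_lookup() are hypothetical names made up for illustration, and the sketch assumes the items are already decoded into a plain array, which is not how decode_attstream() actually lays things out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a decoded fixed-length attribute item.
 * This is NOT the real zedstore struct. */
typedef struct
{
    uint64_t tid;
    int32_t  value;
} DemoItem;

/*
 * Linear lookup over items sorted by tid. Stops as soon as the target is
 * found or the current tid exceeds it. *visited reports how many items
 * were examined. Returns the item index, or -1 if the tid is absent.
 */
long
demo_linear_lookup(const DemoItem *items, size_t n, uint64_t target,
                   size_t *visited)
{
    *visited = 0;
    for (size_t i = 0; i < n; i++)
    {
        (*visited)++;
        if (items[i].tid == target)
            return (long) i;
        if (items[i].tid > target)
            break;              /* sorted input: target cannot appear later */
    }
    return -1;
}

/*
 * Binary search over the same sorted array. Only workable when items can
 * be addressed by position, as with fixed-length attributes.
 */
long
demo_binary_lookup(const DemoItem *items, size_t n, uint64_t target)
{
    size_t lo = 0;
    size_t hi = n;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (items[mid].tid < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < n && items[lo].tid == target)
        return (long) lo;
    return -1;
}
```

The early stop only saves work for point lookups such as Index Scans; a scan that needs every item in the page has to decode all of it anyway.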
Zedstore currently implements update as delete+insert, hence the old tid is not reused. We don't store the tuple in our UNDO log; we only store the transaction information there. Reusing the tid of the old tuple means putting the old tuple in the UNDO log, which we have not implemented yet.
Okay, so that means performing an update on a non-key attribute would also require changes in the index table. In short, HOT update is currently not possible with a zedstore table. Am I right?
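
To illustrate what I mean, here is a toy model of update as delete+insert. It is only a sketch under my own assumptions: DemoTable, demo_insert() and demo_update_payload() are hypothetical and do not exist in zedstore, and the index is reduced to a counter of entries created. The point is only that the updated row gets a new tid, so an index entry has to be added even though the indexed column did not change.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ROWS 16

/* Hypothetical row: one indexed column (key), one non-indexed (payload). */
typedef struct
{
    uint64_t tid;
    int      key;
    int      payload;
    int      deleted;
} DemoRow;

/* Hypothetical table; index_inserts counts index entries ever created. */
typedef struct
{
    DemoRow  rows[DEMO_MAX_ROWS];
    size_t   nrows;
    uint64_t next_tid;
    size_t   index_inserts;
} DemoTable;

/* Insert a row under a fresh tid and add one index entry for it.
 * Returns the new tid, or 0 if the table is full. */
uint64_t
demo_insert(DemoTable *t, int key, int payload)
{
    DemoRow *r;

    if (t->nrows >= DEMO_MAX_ROWS)
        return 0;
    r = &t->rows[t->nrows++];
    r->tid = t->next_tid++;
    r->key = key;
    r->payload = payload;
    r->deleted = 0;
    t->index_inserts++;         /* new tid means a new index entry */
    return r->tid;
}

/* Update modeled as delete+insert: the old tid is marked deleted and the
 * new version gets a new tid. Returns the new tid, or 0 on failure. */
uint64_t
demo_update_payload(DemoTable *t, uint64_t old_tid, int new_payload)
{
    if (t->nrows >= DEMO_MAX_ROWS)
        return 0;
    for (size_t i = 0; i < t->nrows; i++)
    {
        if (t->rows[i].tid == old_tid && !t->rows[i].deleted)
        {
            int key = t->rows[i].key;

            t->rows[i].deleted = 1;
            return demo_insert(t, key, new_payload);
        }
    }
    return 0;
}
```

In this model every update of the payload column bumps index_inserts, which is the extra index work that a HOT-style update would avoid.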