Обсуждение: [HACKERS] GSoC 2017 : Proposal for predicate locking in gist index
Hello everyone,
I have been accepted as GSoC student for the project "Explicitly support predicate locks in index access methods besides b-tree". I want to share my approach for implementation of page level predicate locking in gist index. Any suggestions will be appreciated.
Proposal
The main difference between b-tree and gist index while searching for a target tuple is that in gist index, we can determine if there is a match or not at any level of the index. In gist, index entry of internal nodes contains a predicate which is used as a search key to search all reachable tuples from that node. To insert a tuple in the index, we first check the key representing a target subtree. If it doesn't already cover the key we are inserting, we have to replace it with the union of old key and the key we are inserting. After considering all these points, it seems logical to acquire a predicate lock at each level of the index.
The simplest way to do that will be by inserting a call for prdicatelockpage() in gistscanpage().
Insertion algorithm also needs to check for conflicting predicate locks at each level of the index.
We can insert a call for CheckForSerializableConflictIn() at two places in gistdoinsert().
1. after acquiring an exclusive lock on internal page (in case we are trying to replace an old search key)
2. after acquiring an exclusive lock on leaf page
If there is not enough space for insertion, we have to copy predicate lock from an old page to all new pages generated after a successful split operation. For that, we can insert a call for PredicateLockPageSplit() in gistplacetopage().
Regards,
Shubham
On Wed, May 31, 2017 at 3:02 PM, Shubham Barai <shubhambaraiss@gmail.com> wrote: > I have been accepted as GSoC student for the project "Explicitly support > predicate locks in index access methods besides b-tree". I want to share my > approach for implementation of page level predicate locking in gist index. For the benefit of all following the discussion, implementing support in an index AM conceptually consists of two things: (1) Any scan with user-visible results must create SIRead predicate locks on "gaps" scanned. (For example, a scan just to find an insertion spot for an index entry does not generally count, whereas a scan to satisfy a user "EXISTS" test does.) (2) Any insert into the index must CheckForSerializableConflictIn() at enough points that at least one of them will detect an overlap with a predicate lock from a preceding scan if the inserted index entry might have changed the results of that preceding scan. Detecting such a conflict does not automatically result in a serialization failure, but is only part of tracking the information necessary to make such a determination. All that is handled by the existing machinery if the AM does the above. With a btree index, locking gaps at the leaf level is normally sufficient, because both scan and insertion of an index entry must descend that far. > The main difference between b-tree and gist index while searching for a > target tuple is that in gist index, we can determine if there is a match or > not at any level of the index. Agreed. A gist scan can fail at any level, but for that scan to have produced a different result due to insertion of an index entry, *some* page that the scan looked at must be modified. > The simplest way to do that will be by inserting a call for > prdicatelockpage() in gistscanpage(). Yup. > Insertion algorithm also needs to check for conflicting predicate locks at > each level of the index. Yup. > We can insert a call for CheckForSerializableConflictIn() at two places in > gistdoinsert(). > > 1. after acquiring an exclusive lock on internal page (in case we are trying > to replace an old search key) > > 2. after acquiring an exclusive lock on leaf page > > If there is not enough space for insertion, we have to copy predicate lock > from an old page to all new pages generated after a successful split > operation. For that, we can insert a call for PredicateLockPageSplit() in > gistplacetopage(). That all sounds good. When you code a patch, don't forget to add tests to src/test/isolation. Do you need any help getting a development/build/test environment set up? -- Kevin Grittner VMware vCenter Server https://www.vmware.com/
Hi, hackers! 2017-06-01 1:50 GMT+05:00 Kevin Grittner <kgrittn@gmail.com>: > > The main difference between b-tree and gist index while searching for a > > target tuple is that in gist index, we can determine if there is a match or > > not at any level of the index. > > Agreed. A gist scan can fail at any level, but for that scan to > have produced a different result due to insertion of an index entry, > *some* page that the scan looked at must be modified. Two things seem non-obvious to me. First, I just do not know, can VACUUM erase page with predicate lock? Right now, GiST never deletes pages, as far as I know, so this question is only matter of future compatibility. Second, when we are doing GiST inserts, we can encounter unfinished split. That's not a frequent situation, but still. Should we skip finishing split or should we add it to collision check too? Best regards, Andrey Borodin, Octonica.
On Wed, May 31, 2017 at 3:02 PM, Shubham Barai <shubhambaraiss@gmail.com> wrote:
> I have been accepted as GSoC student for the project "Explicitly support
> predicate locks in index access methods besides b-tree". I want to share my
> approach for implementation of page level predicate locking in gist index.
For the benefit of all following the discussion, implementing
support in an index AM conceptually consists of two things:
(1) Any scan with user-visible results must create SIRead predicate
locks on "gaps" scanned. (For example, a scan just to find an
insertion spot for an index entry does not generally count, whereas
a scan to satisfy a user "EXISTS" test does.)
(2) Any insert into the index must CheckForSerializableConflictIn()
at enough points that at least one of them will detect an overlap
with a predicate lock from a preceding scan if the inserted index
entry might have changed the results of that preceding scan.
Detecting such a conflict does not automatically result in a
serialization failure, but is only part of tracking the information
necessary to make such a determination. All that is handled by the
existing machinery if the AM does the above.
With a btree index, locking gaps at the leaf level is normally
sufficient, because both scan and insertion of an index entry must
descend that far.
> The main difference between b-tree and gist index while searching for a
> target tuple is that in gist index, we can determine if there is a match or
> not at any level of the index.
Agreed. A gist scan can fail at any level, but for that scan to
have produced a different result due to insertion of an index entry,
*some* page that the scan looked at must be modified.
> The simplest way to do that will be by inserting a call for
> prdicatelockpage() in gistscanpage().
Yup.
> Insertion algorithm also needs to check for conflicting predicate locks at
> each level of the index.
Yup.
> We can insert a call for CheckForSerializableConflictIn() at two places in
> gistdoinsert().
>
> 1. after acquiring an exclusive lock on internal page (in case we are trying
> to replace an old search key)
>
> 2. after acquiring an exclusive lock on leaf page
>
> If there is not enough space for insertion, we have to copy predicate lock
> from an old page to all new pages generated after a successful split
> operation. For that, we can insert a call for PredicateLockPageSplit() in
> gistplacetopage().
That all sounds good. When you code a patch, don't forget to add
tests to src/test/isolation.
Do you need any help getting a development/build/test environment
set up?
--
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/
On Thu, Jun 1, 2017 at 12:49 AM, Andrew Borodin <borodin@octonica.com> wrote: > First, I just do not know, can VACUUM erase page with predicate lock? For handling in btree, see: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/nbtree/nbtpage.c;h=f815fd40b20e98a88cb2fb5c71005ea125a459c9;hb=refs/heads/master#l1406 Note also this discussion: https://www.postgresql.org/message-id/4D669122.80904@enterprisedb.com It doesn't look like we ever got to the optimization Heikki suggested in that post, so on rare occasions we could see a false positive from this. Perhaps we should give this another look while we're in the AMs. > Right now, GiST never deletes pages, as far as I know, so this > question is only matter of future compatibility. ok > Second, when we are doing GiST inserts, we can encounter unfinished > split. That's not a frequent situation, but still. Should we skip > finishing split or should we add it to collision check too? When a page is split, I think you need to call PredicateLockPageSplit() before it gets to the point that an insert to get to it. -- Kevin Grittner VMware vCenter Server https://www.vmware.com/