Обсуждение: An out-of-date comment in nodeIndexonlyscan.c

Поиск
Список
Период
Сортировка

An out-of-date comment in nodeIndexonlyscan.c

От
Thomas Munro
Дата:
Hello,

Since commit cdf91edb (2012), nodeIndexonlyscan.c says:

                /*
                 * Predicate locks for index-only scans must be
acquired at the page
                 * level when the heap is not accessed, since
tuple-level predicate
                 * locks need the tuple's xmin value.  If we had to
visit the tuple
                 * anyway, then we already have the tuple-level lock
and can skip the
                 * page lock.
                 */
                if (tuple == NULL)
                        PredicateLockPage(scandesc->heapRelation,

ItemPointerGetBlockNumber(tid),
                                                          estate->es_snapshot);

The first sentence of that comment is no longer true as of commit
c01262a8 (2013).  As for whether it's necessary to predicate-lock the
whole eheap page (rather than the heap tuple) anyway because of HOT
update chains, I don't know, so I'm not sure what wording to propose
instead.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: An out-of-date comment in nodeIndexonlyscan.c

От
Heikki Linnakangas
Дата:
On 14/05/18 02:15, Thomas Munro wrote:
> Hello,
> 
> Since commit cdf91edb (2012), nodeIndexonlyscan.c says:
> 
>                  /*
>                   * Predicate locks for index-only scans must be
> acquired at the page
>                   * level when the heap is not accessed, since
> tuple-level predicate
>                   * locks need the tuple's xmin value.  If we had to
> visit the tuple
>                   * anyway, then we already have the tuple-level lock
> and can skip the
>                   * page lock.
>                   */
>                  if (tuple == NULL)
>                          PredicateLockPage(scandesc->heapRelation,
> 
> ItemPointerGetBlockNumber(tid),
>                                                            estate->es_snapshot);
> 
> The first sentence of that comment is no longer true as of commit
> c01262a8 (2013).  As for whether it's necessary to predicate-lock the
> whole eheap page (rather than the heap tuple) anyway because of HOT
> update chains, I don't know, so I'm not sure what wording to propose
> instead.

Hmm. If there are any updated tuples, HOT or not, the visibility map bit 
will not be set, and we won't reach this code. So I think we could 
acquire the tuple lock here.

- Heikki


Re: An out-of-date comment in nodeIndexonlyscan.c

От
Thomas Munro
Дата:
On Thu, May 17, 2018 at 3:49 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 14/05/18 02:15, Thomas Munro wrote:
> > Since commit cdf91edb (2012), nodeIndexonlyscan.c says:
> >
> >                  /*
> >                   * Predicate locks for index-only scans must be
> > acquired at the page
> >                   * level when the heap is not accessed, since
> > tuple-level predicate
> >                   * locks need the tuple's xmin value.  If we had to
> > visit the tuple
> >                   * anyway, then we already have the tuple-level lock
> > and can skip the
> >                   * page lock.
> >                   */
> >                  if (tuple == NULL)
> >                          PredicateLockPage(scandesc->heapRelation,
> >
> > ItemPointerGetBlockNumber(tid),
> >                                                            estate->es_snapshot);
> >
> > The first sentence of that comment is no longer true as of commit
> > c01262a8 (2013).  As for whether it's necessary to predicate-lock the
> > whole eheap page (rather than the heap tuple) anyway because of HOT
> > update chains, I don't know, so I'm not sure what wording to propose
> > instead.
>
> Hmm. If there are any updated tuples, HOT or not, the visibility map bit
> will not be set, and we won't reach this code. So I think we could
> acquire the tuple lock here.

Right.  CC Kevin.  I think we should at least fix the comment.  As for
changing the lock level, PredicateLockTuple() wants a heap tuple that
we don't have, so we'd probably need to add a PredicateLockTid()
function.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: An out-of-date comment in nodeIndexonlyscan.c

От
Thomas Munro
Дата:
On Fri, Feb 8, 2019 at 4:58 AM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Thu, May 17, 2018 at 3:49 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> > On 14/05/18 02:15, Thomas Munro wrote:
> > > The first sentence of that comment is no longer true as of commit
> > > c01262a8 (2013).  As for whether it's necessary to predicate-lock the
> > > whole eheap page (rather than the heap tuple) anyway because of HOT
> > > update chains, I don't know, so I'm not sure what wording to propose
> > > instead.
> >
> > Hmm. If there are any updated tuples, HOT or not, the visibility map bit
> > will not be set, and we won't reach this code. So I think we could
> > acquire the tuple lock here.
>
> Right.  CC Kevin.  I think we should at least fix the comment.  As for
> changing the lock level, PredicateLockTuple() wants a heap tuple that
> we don't have, so we'd probably need to add a PredicateLockTid()
> function.

Here's a patch I'd like to commit to fix the comment.  We could look
at improving the actual code after
https://commitfest.postgresql.org/23/2169/ is done.

I wonder if it might be possible to avoid page locks on both the heap
*and* index in some limited cases, as I mentioned over here (just an
idea, could be way off base):

https://www.postgresql.org/message-id/CA%2BhUKGJGDVfhHmoUGzi-_J%2BN8FmRjmWTY0MCd1ZV5Fj9qdyb1w%40mail.gmail.com

-- 
Thomas Munro
https://enterprisedb.com

Вложения

Re: An out-of-date comment in nodeIndexonlyscan.c

От
Ashwin Agrawal
Дата:
On Thu, Jun 27, 2019 at 4:33 PM Thomas Munro <thomas.munro@gmail.com> wrote:
Here's a patch I'd like to commit to fix the comment.  We could look
at improving the actual code after
https://commitfest.postgresql.org/23/2169/ is done.

Change looks good to me.
 
I wonder if it might be possible to avoid page locks on both the heap
*and* index in some limited cases, as I mentioned over here (just an
idea, could be way off base):

https://www.postgresql.org/message-id/CA%2BhUKGJGDVfhHmoUGzi-_J%2BN8FmRjmWTY0MCd1ZV5Fj9qdyb1w%40mail.gmail.com

I am in same boat as you. One can get serializable conflict error today based on tuple gets HOT updated or not. HOT is logically internal code optimization and not so much user functionality, so ideally feels shouldn't affect serializable behavior. But it does currently, again, due to index locking. Disable HOT update and 4 isolation tests fail due to "could not serialize access due to read/write dependencies among transactions" otherwise not. If the tuple happens to fit on same page user will not get the error, if the tuple gets inserted to different page the error can happen, due to index page locking. I had discussed this with Heikki and thinking is we shouldn't need to take the lock on the index page, if the index key was not changed, even if it was a non-HOT update. Not sure of complications and implications, but just a thought.

Re: An out-of-date comment in nodeIndexonlyscan.c

От
Thomas Munro
Дата:
On Fri, Jun 28, 2019 at 12:55 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
> Change looks good to me.

Pushed, thanks.

-- 
Thomas Munro
https://enterprisedb.com



Re: An out-of-date comment in nodeIndexonlyscan.c

От
Thomas Munro
Дата:
On Fri, Jun 28, 2019 at 12:55 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
> On Thu, Jun 27, 2019 at 4:33 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>> I wonder if it might be possible to avoid page locks on both the heap
>> *and* index in some limited cases, as I mentioned over here (just an
>> idea, could be way off base):
>>
>> https://www.postgresql.org/message-id/CA%2BhUKGJGDVfhHmoUGzi-_J%2BN8FmRjmWTY0MCd1ZV5Fj9qdyb1w%40mail.gmail.com
>
> I am in same boat as you. One can get serializable conflict error today based on tuple gets HOT updated or not. HOT
islogically internal code optimization and not so much user functionality, so ideally feels shouldn't affect
serializablebehavior. But it does currently, again, due to index locking. Disable HOT update and 4 isolation tests fail
dueto "could not serialize access due to read/write dependencies among transactions" otherwise not. If the tuple
happensto fit on same page user will not get the error, if the tuple gets inserted to different page the error can
happen,due to index page locking. I had discussed this with Heikki and thinking is we shouldn't need to take the lock
onthe index page, if the index key was not changed, even if it was a non-HOT update. Not sure of complications and
implications,but just a thought. 

Oh, I think the idea I was suggesting might be the same as this item
from README-SSI (unrelated to HOT, but related to index uniqueness,
particularly in index-only-scan where you might be able to skip the
btree page lock):

"Several optimizations are possible, though not all are implemented yet:
...
    * An index scan which is comparing for equality on the entire key
for a unique index need not acquire a predicate lock as long as a key
is found corresponding to a visible tuple which has not been modified
by another transaction -- there are no "between or around" gaps to
cover."

--
Thomas Munro
https://enterprisedb.com



Re: An out-of-date comment in nodeIndexonlyscan.c

От
Thomas Munro
Дата:
On Fri, Jun 28, 2019 at 12:55 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
> On Thu, Jun 27, 2019 at 4:33 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>>
>> Here's a patch I'd like to commit to fix the comment.  We could look
>> at improving the actual code after
>> https://commitfest.postgresql.org/23/2169/ is done.
>
> Change looks good to me.

... and here is the corresponding code change, with a test to
demonstrate the change.

I'm working on a proof-of-concept to skip the btree page lock
sometimes too, which seems promising, but it requires some planner
work which has temporarily pretzeled my brain.

Вложения

Re: An out-of-date comment in nodeIndexonlyscan.c

От
Thomas Munro
Дата:
On Sat, Jun 12, 2021 at 2:35 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> ... and here is the corresponding code change, with a test to
> demonstrate the change.
>
> I'm working on a proof-of-concept to skip the btree page lock
> sometimes too, which seems promising, but it requires some planner
> work which has temporarily pretzeled my brain.

Here's a highly experimental patch I came up with that seems to
produce the right results in simple cases, without (yet) involving the
planner.  The regression tests show single table queries, but it works
also for nest loop joins, which is where this optimisation should be
most interesting, I think.  There are a few weird things about this
patch though, and there could well be much better ways to do it, as
noted in the commit message and comments.  It's a start on the
problem...

Вложения

Re: An out-of-date comment in nodeIndexonlyscan.c

От
David Rowley
Дата:
On Mon, 14 Jun 2021 at 00:02, Thomas Munro <thomas.munro@gmail.com> wrote:
> Here's a highly experimental patch I came up with that seems to
> produce the right results in simple cases, without (yet) involving the
> planner.

+ /* Find all equality quals. */
+ for (int i = 0; i < n_scan_keys; ++i)
+ {
+ if (scan_keys[i].sk_strategy == BTEqualStrategyNumber)
+ attnos[nattnos++] = scan_keys[i].sk_attno;
+ }
+
+ /* Are all attributes covered? */
+ /* XXX is this check enough or do we need to work harder? */
+ qsort(attnos, nattnos, sizeof(AttrNumber), compare_int16);
+ nattnos = qunique(attnos, nattnos, sizeof(AttrNumber), compare_int16);
+ if (nattnos == index->rd_index->indnkeyatts)

I think a more optimal and nicer way of doing that would be setting
bits in a Bitmapset then checking bms_num_members is equal to
n_scan_keys.

David



Re: An out-of-date comment in nodeIndexonlyscan.c

От
Thomas Munro
Дата:
On Mon, Jun 14, 2021 at 12:54 AM David Rowley <dgrowleyml@gmail.com> wrote:
> I think a more optimal and nicer way of doing that would be setting
> bits in a Bitmapset then checking bms_num_members is equal to
> n_scan_keys.

Shouldn't it be compared with indnkeyatts?  Yes, much nicer, thanks!

Вложения

Re: An out-of-date comment in nodeIndexonlyscan.c

От
David Rowley
Дата:
On Mon, 14 Jun 2021 at 10:03, Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Mon, Jun 14, 2021 at 12:54 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > I think a more optimal and nicer way of doing that would be setting
> > bits in a Bitmapset then checking bms_num_members is equal to
> > n_scan_keys.
>
> Shouldn't it be compared with indnkeyatts?  Yes, much nicer, thanks!

Oh yeah, I did mean that. Thanks for the correction.

Have you also thought about deferrable unique / primary key constraints?

It's possible to the uniqueness temporarily violated during a
transaction when the unique constraint is deferred,

For example:
create table t (id int primary key deferrable initially deferred);
begin;
insert into t values(1),(1);
select * from t;
 id
----
  1
  1
(2 rows)

I think you'd just need to add a check to ensure that indimmediate is
true around where you're checking the indisunique flag.

David



Re: An out-of-date comment in nodeIndexonlyscan.c

От
Thomas Munro
Дата:
On Mon, Jun 14, 2021 at 2:29 PM David Rowley <dgrowleyml@gmail.com> wrote:
> Have you also thought about deferrable unique / primary key constraints?
>
> It's possible to the uniqueness temporarily violated during a
> transaction when the unique constraint is deferred,

Oh, yeah, very good question.  I think many scenarios involving
duplicates work correctly, but I think there is a danger like this:

create table t (i int primary key deferrable initially deferred, j int);
create unique index on t(j);
insert into t values (999, 999); -- avoid empty index
set enable_seqscan = off;

begin transaction isolation level serializable;
insert into t values (1, 2); -- create phantom row
select * from t where i = 1;
delete from t where j = 2; -- remove phantom row
SELECT locktype, relation::regclass, page, tuple FROM pg_locks WHERE
mode = 'SIReadLock';
commit;

master:

 locktype | relation | page | tuple
----------+----------+------+-------
 page     | t_pkey   |    1 |
 page     | t_j_idx  |    1 |
(2 rows)

v3 patch:

 locktype | relation | page | tuple
----------+----------+------+-------
(0 rows)

In fact the lock on t_pkey's page 1 was important: it represents the
search for a tuple with i = 1, other than our temporary phantom (only
allowed because constraint deferred).  If someone else inserts a row
matching i = 1, the SSI system will not know that we tried to look for
it, because our own temporary phantom confused us.

> I think you'd just need to add a check to ensure that indimmediate is
> true around where you're checking the indisunique flag.

Yeah, that fixes the problem in this case at least.  With v4:

 locktype | relation | page | tuple
----------+----------+------+-------
 page     | t_pkey   |    1 |
(1 row)

(It's expected that t_j_idx is not locked: the delete query benefits
from the optimisation when accessing the index on t(j)).

That test case is a little confusing, because at no point does it ever
actually create a duplicate, but I suspect the problem is avoided
without the deferred constraint because you'd either get a UCV on
insertion, or block anyone else from inserting until after you commit
(even after you delete the phantom), and I think that may avoid the
hazard.  I could be confused about that.  If I am wrong, then a
possible general solution may be to apply the optimisation only if we
find a match that wasn't written by this transaction, though I'm not
sure how given the current layering.

Вложения

Re: An out-of-date comment in nodeIndexonlyscan.c

От
Daniel Gustafsson
Дата:
This patch now fails to apply, probably due to a mostly trivial collision with
fdd88571454e2b00dbe446e8609c6e4294ca89ae in the test files.  Can you submit a
rebased version?

--
Daniel Gustafsson        https://vmware.com/




Re: An out-of-date comment in nodeIndexonlyscan.c

От
Daniel Gustafsson
Дата:
> On 9 Nov 2021, at 15:28, Daniel Gustafsson <daniel@yesql.se> wrote:
>
> This patch now fails to apply, probably due to a mostly trivial collision with
> fdd88571454e2b00dbe446e8609c6e4294ca89ae in the test files.  Can you submit a
> rebased version?

I've marked this Returned with Feedback, please feel free to resubmit when a
new version of the patch is available.

--
Daniel Gustafsson        https://vmware.com/