Обсуждение: sample scans and predicate locking

Поиск
Список
Период
Сортировка

sample scans and predicate locking

От
Andres Freund
Дата:
Hi,

While looking at fixing [1] on master, I noticed the following
codeblock:

static HeapScanDesc
heap_beginscan_internal(Relation relation, Snapshot snapshot,
                        int nkeys, ScanKey key,
                        ParallelHeapScanDesc parallel_scan,
                        bool allow_strat,
                        bool allow_sync,
                        bool allow_pagemode,
                        bool is_bitmapscan,
                        bool is_samplescan,
                        bool temp_snap)
...
    /*
     * For a seqscan in a serializable transaction, acquire a predicate lock
     * on the entire relation. This is required not only to lock all the
     * matching tuples, but also to conflict with new insertions into the
     * table. In an indexscan, we take page locks on the index pages covering
     * the range specified in the scan qual, but in a heap scan there is
     * nothing more fine-grained to lock. A bitmap scan is a different story,
     * there we have already scanned the index and locked the index pages
     * covering the predicate. But in that case we still have to lock any
     * matching heap tuples.
     */
    if (!is_bitmapscan)
        PredicateLockRelation(relation, snapshot);

As you can see this only tests for is_bitmapscan, *not* for
is_samplescan. Which means we afaict currently every sample scan
predicate locks the entire relation.

I think there's two possibilities here:

1) It's just the comment that's inaccurate, and it should really talk
   about both seqscans and sample scans. It should not be necessary to
   lock the whole relation, but I'm not sure the code otherwise takes
   enough care.

2) We should really not predicate lock the entire relation. In which
   case I think there might be missing PredicateLockTuple/Page calls.

Greetings,

Andres Freund

[1] https://postgr.es/m/4EA80A20-E9BF-49F1-9F01-5B66CAB21453%40elusive.cx



Re: sample scans and predicate locking

От
Thomas Munro
Дата:
On Sun, May 19, 2019 at 8:31 AM Andres Freund <andres@anarazel.de> wrote:
> While looking at fixing [1] on master, I noticed the following
> codeblock:
>
> static HeapScanDesc
> heap_beginscan_internal(Relation relation, Snapshot snapshot,
>                                                 int nkeys, ScanKey key,
>                                                 ParallelHeapScanDesc parallel_scan,
>                                                 bool allow_strat,
>                                                 bool allow_sync,
>                                                 bool allow_pagemode,
>                                                 bool is_bitmapscan,
>                                                 bool is_samplescan,
>                                                 bool temp_snap)
> ...
>         /*
>          * For a seqscan in a serializable transaction, acquire a predicate lock
>          * on the entire relation. This is required not only to lock all the
>          * matching tuples, but also to conflict with new insertions into the
>          * table. In an indexscan, we take page locks on the index pages covering
>          * the range specified in the scan qual, but in a heap scan there is
>          * nothing more fine-grained to lock. A bitmap scan is a different story,
>          * there we have already scanned the index and locked the index pages
>          * covering the predicate. But in that case we still have to lock any
>          * matching heap tuples.
>          */
>         if (!is_bitmapscan)
>                 PredicateLockRelation(relation, snapshot);
>
> As you can see this only tests for is_bitmapscan, *not* for
> is_samplescan. Which means we afaict currently every sample scan
> predicate locks the entire relation.

Right, I just tested that.  That's not wrong though, is it?  It's just
overly pessimistic.

> I think there's two possibilities here:
>
> 1) It's just the comment that's inaccurate, and it should really talk
>    about both seqscans and sample scans. It should not be necessary to
>    lock the whole relation, but I'm not sure the code otherwise takes
>    enough care.
>
> 2) We should really not predicate lock the entire relation. In which
>    case I think there might be missing PredicateLockTuple/Page calls.

Yeah, we could probably predicate-lock pages in
heapam_scan_sample_next_block() and tuples in
heapam_scan_sample_next_tuple(), instead of doing this.  Seems like a
reasonable improvement for 13.  But... hmm....  There *might* be a
theoretical argument about TABLESAMPLE(100) behaving differently if
done per page rather than if done at relation level, wrt new pages
added to the end later and therefore missed.  And then by logical
extension, TABLESAMPLE of any percentage.  I'm not sure.

I made a little list of small SERIALIZABLE projects for v13 and added
this, over here:

https://wiki.postgresql.org/wiki/SerializableToDo

-- 
Thomas Munro
https://enterprisedb.com



Re: sample scans and predicate locking

От
Andres Freund
Дата:
Hi,

On 2019-05-19 13:57:42 +1200, Thomas Munro wrote:
> On Sun, May 19, 2019 at 8:31 AM Andres Freund <andres@anarazel.de> wrote:
> > While looking at fixing [1] on master, I noticed the following
> > codeblock:
> >
> > static HeapScanDesc
> > heap_beginscan_internal(Relation relation, Snapshot snapshot,
> >                                                 int nkeys, ScanKey key,
> >                                                 ParallelHeapScanDesc parallel_scan,
> >                                                 bool allow_strat,
> >                                                 bool allow_sync,
> >                                                 bool allow_pagemode,
> >                                                 bool is_bitmapscan,
> >                                                 bool is_samplescan,
> >                                                 bool temp_snap)
> > ...
> >         /*
> >          * For a seqscan in a serializable transaction, acquire a predicate lock
> >          * on the entire relation. This is required not only to lock all the
> >          * matching tuples, but also to conflict with new insertions into the
> >          * table. In an indexscan, we take page locks on the index pages covering
> >          * the range specified in the scan qual, but in a heap scan there is
> >          * nothing more fine-grained to lock. A bitmap scan is a different story,
> >          * there we have already scanned the index and locked the index pages
> >          * covering the predicate. But in that case we still have to lock any
> >          * matching heap tuples.
> >          */
> >         if (!is_bitmapscan)
> >                 PredicateLockRelation(relation, snapshot);
> >
> > As you can see this only tests for is_bitmapscan, *not* for
> > is_samplescan. Which means we afaict currently every sample scan
> > predicate locks the entire relation.
> 
> Right, I just tested that.  That's not wrong though, is it?  It's just
> overly pessimistic.

Yea, I was mostly commenting on the fact that the comment doesn't
mention sample scans, so it looks a bit accidental.

I added a comment to master (as part of a fix, where this codepath was
entered inadvertently)


> > I think there's two possibilities here:
> >
> > 1) It's just the comment that's inaccurate, and it should really talk
> >    about both seqscans and sample scans. It should not be necessary to
> >    lock the whole relation, but I'm not sure the code otherwise takes
> >    enough care.
> >
> > 2) We should really not predicate lock the entire relation. In which
> >    case I think there might be missing PredicateLockTuple/Page calls.
> 
> Yeah, we could probably predicate-lock pages in
> heapam_scan_sample_next_block() and tuples in
> heapam_scan_sample_next_tuple(), instead of doing this.  Seems like a
> reasonable improvement for 13.  But... hmm....  There *might* be a
> theoretical argument about TABLESAMPLE(100) behaving differently if
> done per page rather than if done at relation level, wrt new pages
> added to the end later and therefore missed.  And then by logical
> extension, TABLESAMPLE of any percentage.  I'm not sure.

I don't think that's actually a problem, tablesample doesn't return
invisible rows. And the equivalent issue is true just as well for index
and bitmap heap scans?

Greetings,

Andres Freund



Re: sample scans and predicate locking

От
Thomas Munro
Дата:
On Mon, May 20, 2019 at 10:23 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-05-19 13:57:42 +1200, Thomas Munro wrote:
> > Yeah, we could probably predicate-lock pages in
> > heapam_scan_sample_next_block() and tuples in
> > heapam_scan_sample_next_tuple(), instead of doing this.  Seems like a
> > reasonable improvement for 13.  But... hmm....  There *might* be a
> > theoretical argument about TABLESAMPLE(100) behaving differently if
> > done per page rather than if done at relation level, wrt new pages
> > added to the end later and therefore missed.  And then by logical
> > extension, TABLESAMPLE of any percentage.  I'm not sure.
>
> I don't think that's actually a problem, tablesample doesn't return
> invisible rows. And the equivalent issue is true just as well for index
> and bitmap heap scans?

It affects whether this transaction could be considered to have run
after the other transaction though.  The equivalent issue is handled
for index scans, because we arrange to predicate lock pages that
anyone else will have to touch if they insert index tuples that would
match your WHERE clause (ie your predicate), as the comment says.  (I
wondered if there'd be a finer grained way to do it by
predicate-locking the page-after-last to detect extension, but I
suspect you might really need to lock all-the-pages-after-last... I
don't know.)

-- 
Thomas Munro
https://enterprisedb.com