Thread: [DOC] Document concurrent index builds waiting on each other


[DOC] Document concurrent index builds waiting on each other

From: James Coleman

In my experience it's not immediately obvious (even after reading the
documentation) what the implications are of how concurrent index builds
manage transactions with respect to multiple concurrent index builds in
flight at the same time.

Specifically, as I understand it, multiple concurrent index builds
running at the same time will all return at the same time as the
longest-running one.

I've attached a small patch to call this caveat out specifically in
the documentation. I think the description in the patch is accurate,
but please let me know if there are some intricacies around how the
various stages might change the results.

James Coleman

Attachments

Re: [DOC] Document concurrent index builds waiting on each other

From: Bruce Momjian

On Wed, Sep 18, 2019 at 01:51:00PM -0400, James Coleman wrote:
> In my experience it's not immediately obvious (even after reading the
> documentation) what the implications are of how concurrent index builds
> manage transactions with respect to multiple concurrent index builds in
> flight at the same time.
> 
> Specifically, as I understand it, multiple concurrent index builds
> running at the same time will all return at the same time as the
> longest-running one.
> 
> I've attached a small patch to call this caveat out specifically in
> the documentation. I think the description in the patch is accurate,
> but please let me know if there are some intricacies around how the
> various stages might change the results.

The CREATE INDEX docs already say:

    In a concurrent index build, the index is actually entered into
    the system catalogs in one transaction, then two table scans occur in
    two more transactions.  Before each table scan, the index build must
    wait for existing transactions that have modified the table to terminate.
    After the second scan, the index build must wait for any transactions
--> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
--> scan to terminate.  Then finally the index can be marked ready for use,

So, having multiple concurrent index scans is just a special case of
having to "wait for any transactions that have a snapshot", no?  I am
not sure adding a doc mention of other index builds really is helpful.

---------------------------------------------------------------------------

> commit 9e28e704820eebb81ff94c1c3cbfb7db087b2c45
> Author: James Coleman <jtc331@gmail.com>
> Date:   Wed Sep 18 13:36:22 2019 -0400
> 
>     Document concurrent indexes waiting on each other
>     
>     It's not immediately obvious that because concurrent index building
>     waits on previously running transactions to complete, running multiple
>     concurrent index builds at the same time will result in each of them
>     taking as long to return as the longest takes, so, document this caveat.
> 
> diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
> index 629a31ef79..35f15abb0e 100644
> --- a/doc/src/sgml/ref/create_index.sgml
> +++ b/doc/src/sgml/ref/create_index.sgml
> @@ -616,6 +616,13 @@ Indexes:
>      cannot.
>     </para>
>  
> +   <para>
> +    Because the second table scan must wait for any transactions having a
> +    snapshot preceding the start of that scan to finish before completing the
> +    scan, concurrent index builds on multiple tables at the same time will
> +    not return on any one table until all have completed.
> +   </para>
> +
>     <para>
>      Concurrent builds for indexes on partitioned tables are currently not
>      supported.  However, you may concurrently build the index on each


-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



Re: [DOC] Document concurrent index builds waiting on each other

From: Alvaro Herrera

On 2019-Sep-28, Bruce Momjian wrote:

> The CREATE INDEX docs already say:
> 
>     In a concurrent index build, the index is actually entered into
>     the system catalogs in one transaction, then two table scans occur in
>     two more transactions.  Before each table scan, the index build must
>     wait for existing transactions that have modified the table to terminate.
>     After the second scan, the index build must wait for any transactions
> --> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
> --> scan to terminate.  Then finally the index can be marked ready for use,
> 
> So, having multiple concurrent index scans is just a special case of
> having to "wait for any transactions that have a snapshot", no?  I am
> not sure adding a doc mention of other index builds really is helpful.

I always thought that create index concurrently was prevented from
running concurrently in a table by the ShareUpdateExclusive lock that's
held during the operation.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [DOC] Document concurrent index builds waiting on each other

From: James Coleman

On Sat, Sep 28, 2019 at 9:22 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2019-Sep-28, Bruce Momjian wrote:
>
> > The CREATE INDEX docs already say:
> >
> >     In a concurrent index build, the index is actually entered into
> >     the system catalogs in one transaction, then two table scans occur in
> >     two more transactions.  Before each table scan, the index build must
> >     wait for existing transactions that have modified the table to terminate.
> >     After the second scan, the index build must wait for any transactions
> > --> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
> > --> scan to terminate.  Then finally the index can be marked ready for use,
> >
> > So, having multiple concurrent index scans is just a special case of
> > having to "wait for any transactions that have a snapshot", no?  I am
> > not sure adding a doc mention of other index builds really is helpful.
>
> I always thought that create index concurrently was prevented from
> running concurrently in a table by the ShareUpdateExclusive lock that's
> held during the operation.

You mean multiple CICs on a single table at the same time? Yes, that
(unfortunately) isn't possible, but I'm concerned in the patch with
the fact that CIC on table X blocks CIC on table Y.

James



Re: [DOC] Document concurrent index builds waiting on each other

From: Bruce Momjian

On Sat, Sep 28, 2019 at 09:54:48PM -0400, James Coleman wrote:
> On Sat, Sep 28, 2019 at 9:22 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >
> > On 2019-Sep-28, Bruce Momjian wrote:
> >
> > > The CREATE INDEX docs already say:
> > >
> > >     In a concurrent index build, the index is actually entered into
> > >     the system catalogs in one transaction, then two table scans occur in
> > >     two more transactions.  Before each table scan, the index build must
> > >     wait for existing transactions that have modified the table to terminate.
> > >     After the second scan, the index build must wait for any transactions
> > > --> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
> > > --> scan to terminate.  Then finally the index can be marked ready for use,
> > >
> > > So, having multiple concurrent index scans is just a special case of
> > > having to "wait for any transactions that have a snapshot", no?  I am
> > > not sure adding a doc mention of other index builds really is helpful.
> >
> > I always thought that create index concurrently was prevented from
> > running concurrently in a table by the ShareUpdateExclusive lock that's
> > held during the operation.
> 
> You mean multiple CICs on a single table at the same time? Yes, that
> (unfortunately) isn't possible, but I'm concerned in the patch with
> the fact that CIC on table X blocks CIC on table Y.

I think any open transaction will block CIC, which is my point.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



Re: [DOC] Document concurrent index builds waiting on each other

From: James Coleman

On Sat, Sep 28, 2019 at 9:56 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Sat, Sep 28, 2019 at 09:54:48PM -0400, James Coleman wrote:
> > On Sat, Sep 28, 2019 at 9:22 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > >
> > > On 2019-Sep-28, Bruce Momjian wrote:
> > >
> > > > The CREATE INDEX docs already say:
> > > >
> > > >     In a concurrent index build, the index is actually entered into
> > > >     the system catalogs in one transaction, then two table scans occur in
> > > >     two more transactions.  Before each table scan, the index build must
> > > >     wait for existing transactions that have modified the table to terminate.
> > > >     After the second scan, the index build must wait for any transactions
> > > > --> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
> > > > --> scan to terminate.  Then finally the index can be marked ready for use,
> > > >
> > > > So, having multiple concurrent index scans is just a special case of
> > > > having to "wait for any transactions that have a snapshot", no?  I am
> > > > not sure adding a doc mention of other index builds really is helpful.

While that may be technically true, as a co-worker of mine likes to
point out, being "technically correct" is the worst kind of correct.

Here's what I mean:

First, I believe the docs should aim to be as useful as possible even
to those with a more entry-level understanding of PostgreSQL. The fact
that the paragraph you cite actually links to the entire chapter on
concurrency control in Postgres demonstrates that there's some
not-so-immediate stuff here to consider. For one: is it obvious to all
users that the transaction held by CIC (or indeed any transaction)
holds an open snapshot?

Second, this is a difference from regular CREATE INDEX, and we
already call out differences between CREATE INDEX CONCURRENTLY and
regular CREATE INDEX as caveats, as I point out below re:
Alvaro's comment.

Third, related to the above point, many DDL commands only block DDL
against the table being operated on. The fact that CIC here is
different is, in my opinion, a fairly surprising break from that
pattern, and as such likely to catch users off guard. I can attest
that this surprised at least one entire database team a while back :)
including many people who've been operating Postgres at a large scale
for a long time.

I believe caveats like this are worth calling out rather than
expecting users to have to understand the implementation details and
work out the implications on their own.

> > > I always thought that create index concurrently was prevented from
> > > running concurrently in a table by the ShareUpdateExclusive lock that's
> > > held during the operation.
> >
> > You mean multiple CICs on a single table at the same time? Yes, that
> > (unfortunately) isn't possible, but I'm concerned in the patch with
> > the fact that CIC on table X blocks CIC on table Y.
>
> I think any open transaction will block CIC, which is my point.

I read Alvaro as referring to the fact that the docs already call out
the following:

> Regular index builds permit other regular index builds on the same
> table to occur simultaneously, but only one concurrent index build
> can occur on a table at a time.

James



Re: [DOC] Document concurrent index builds waiting on each other

From: Alvaro Herrera

On 2019-Sep-28, James Coleman wrote:

> I believe caveats like this are worth calling out rather than
> expecting users to have to understand the implementation details and
> work out the implications on their own.

I agree.

> I read Alvaro as referring to the fact that the docs already call out
> the following:
> 
> > Regular index builds permit other regular index builds on the same
> > table to occur simultaneously, but only one concurrent index build
> > can occur on a table at a time.

Yeah, that's what I was understanding.

BTW I think there's an approach that could alleviate part of this
problem, at least some of the time: whenever CIC runs for an index
that's not on an expression and not partial, we could set the
PROC_IN_VACUUM flag.  That would cause it to get ignored by other
processes for snapshot purposes (including CIC itself), as well as by
vacuum.  I need to take some time to research the safety of this, but
intuitively it seems safe.

Even further, I think we could also do it for regular CREATE INDEX
(under the same conditions) provided that it's not run in a transaction
block.  But that requires even more research/proof.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [DOC] Document concurrent index builds waiting on each other

From: Michael Paquier

On Sat, Sep 28, 2019 at 10:22:28PM -0300, Alvaro Herrera wrote:
> I always thought that create index concurrently was prevented from
> running concurrently in a table by the ShareUpdateExclusive lock that's
> held during the operation.

REINDEX CONCURRENTLY and CIC can deadlock while waiting for each other
to finish after their validation phase, see:
https://www.postgresql.org/message-id/20190507030756.GD1499@paquier.xyz
https://www.postgresql.org/message-id/20190507032543.GH1499@paquier.xyz
--
Michael

Attachments

Re: [DOC] Document concurrent index builds waiting on each other

From: James Coleman

I went ahead and registered this in the currently open CF as
https://commitfest.postgresql.org/27/2454/

Alvaro: Would you like to be added as a reviewer?

James



Re: [DOC] Document concurrent index builds waiting on each other

From: James Coleman

On Sun, Sep 29, 2019 at 9:24 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Sep 28, 2019 at 10:22:28PM -0300, Alvaro Herrera wrote:
> > I always thought that create index concurrently was prevented from
> > running concurrently in a table by the ShareUpdateExclusive lock that's
> > held during the operation.
>
> REINDEX CONCURRENTLY and CIC can deadlock while waiting for each other
> to finish after their validation phase, see:
> https://www.postgresql.org/message-id/20190507030756.GD1499@paquier.xyz
> https://www.postgresql.org/message-id/20190507032543.GH1499@paquier.xyz

Michael,

Thanks for the cross-link. Do you think this would be valuable to
document at the same time? Or did you just want to ensure we were also
aware of this particular pitfall? If the latter, I appreciate it,
it's helpful info. If the former, let me know, and I'll try to update
the patch.

Thanks,
James



Re: [DOC] Document concurrent index builds waiting on each other

From: Andres Freund

Hi,

On 2019-09-18 13:51:00 -0400, James Coleman wrote:
> In my experience it's not immediately obvious (even after reading the
> documentation) the implications of how concurrent index builds manage
> transactions with respect to multiple concurrent index builds in
> flight at the same time.
> 
> Specifically, as I understand multiple concurrent index builds running
> at the same time will all return at the same time as the longest
> running one.
> 
> I've attached a small patch to call this caveat out specifically in
> the documentation. I think the description in the patch is accurate,
> but please let me know if there's some intricacies around how the
> various stages might change the results.
> 
> James Coleman

I'd much rather see effort spent fixing this issue as far as it relates
to concurrent CICs. For the snapshot waits we can add a procarray flag
(alongside PROCARRAY_VACUUM_FLAG) indicating that the backend is doing
CIC, which WaitForOlderSnapshots() can then use to ignore those CICs.
That's safe because those transactions definitely don't insert into
relations targeted by CIC. The change to WaitForOlderSnapshots() would
just be to pass the new flag to GetCurrentVirtualXIDs, I think.
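
A sketch of the shape of that change (the flag name and value here are
placeholders, and the exclude mask assumes the existing PROC_IN_VACUUM
exclusion stays; this is not a tested patch):

    /* src/include/storage/proc.h: hypothetical new vacuumFlags bit */
    #define PROC_IN_CIC        0x40    /* running CREATE INDEX CONCURRENTLY */

    /*
     * src/backend/catalog/index.c, WaitForOlderSnapshots(): also exclude
     * backends advertising the new bit -- they hold a snapshot, but they
     * cannot have written to the table we are indexing.
     */
    old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
                                          PROC_IN_VACUUM | PROC_IN_CIC,
                                          &n_old_snapshots);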

Greetings,

Andres Freund



Re: [DOC] Document concurrent index builds waiting on each other

From: James Coleman

On Wed, Mar 25, 2020 at 3:19 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2019-09-18 13:51:00 -0400, James Coleman wrote:
> > In my experience it's not immediately obvious (even after reading the
> > documentation) what the implications are of how concurrent index builds
> > manage transactions with respect to multiple concurrent index builds in
> > flight at the same time.
> >
> > Specifically, as I understand it, multiple concurrent index builds
> > running at the same time will all return at the same time as the
> > longest-running one.
> >
> > I've attached a small patch to call this caveat out specifically in
> > the documentation. I think the description in the patch is accurate,
> > but please let me know if there are some intricacies around how the
> > various stages might change the results.
> >
> > James Coleman
>
> I'd much rather see effort spent fixing this issue as far as it relates
> to concurrent CICs. For the snapshot waits we can add a procarray flag
> (alongside PROCARRAY_VACUUM_FLAG) indicating that the backend is doing
> CIC, which WaitForOlderSnapshots() can then use to ignore those CICs.
> That's safe because those transactions definitely don't insert into
> relations targeted by CIC. The change to WaitForOlderSnapshots() would
> just be to pass the new flag to GetCurrentVirtualXIDs, I think.

Alvaro: I think you had some ideas on this too; any chance you know
of a patch that anyone's got cooking?

Andres: If we got this fixed in current PG would you be opposed to
documenting the caveat in previous versions?

Thanks,
James



Re: [DOC] Document concurrent index builds waiting on each other

From: Alvaro Herrera

On 2020-Mar-25, James Coleman wrote:

> Alvaro: I think you had some ideas on this too; any chance you know
> of a patch that anyone's got cooking?

I posted this in November
https://postgr.es/m/20191101203310.GA12239@alvherre.pgsql but I didn't
put time to go through the issues there.  I don't know if my approach is
exactly what Andres has in mind, but I was discouraged by the number of
gotchas for which the optimization I propose has to be turned off.

Maybe that preliminary patch can serve as a discussion starter, if
nothing else.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [DOC] Document concurrent index builds waiting on each other

From: Andres Freund

Hi,

On 2020-03-25 15:24:44 -0400, James Coleman wrote:
> Andres: If we got this fixed in current PG would you be opposed to
> documenting the caveat in previous versions?

Not really. I'm just not confident it's going to be useful, given the
amount of detail that would need to be provided to really make sense
of the issue (the earlier CIC phases don't wait for snapshots, but
just relation locks etc).

Greetings,

Andres Freund



Re: [DOC] Document concurrent index builds waiting on each other

From: Andres Freund

Hi,

On 2020-03-25 16:30:10 -0300, Alvaro Herrera wrote:
> I posted this in November
> https://postgr.es/m/20191101203310.GA12239@alvherre.pgsql but I didn't
> put time to go through the issues there.

Oh, missed that.


> I don't know if my approach is exactly what Andres has in mind

Not quite. I don't think it's generally correct for CIC to set
PROC_IN_VACUUM. I'm doubtful it's the case even just for plain indexes -
we don't want rows to be pruned away from under us. I also think we'd
want to set such a flag during all of the CIC phases?

What I was thinking of was a new flag, with a distinct value from
PROC_IN_VACUUM. It'd currently just be specified in the
GetCurrentVirtualXIDs() calls in WaitForOlderSnapshots(). That'd avoid
needing to wait for other CICs on different relations. Since CIC is not
permitted on system tables, and CIC doesn't do DML on normal tables, it
seems fairly obviously correct to exclude other CICs.
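
The advertising side would presumably mirror what lazy vacuum already
does, something like the sketch below (same hypothetical PROC_IN_CIC bit
as sketched earlier in the thread; note it would have to be re-set in
each of CIC's transactions, since these bits are cleared at transaction
end):

    /* In DefineIndex(), on entering the concurrent build path: */
    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
    MyPgXact->vacuumFlags |= PROC_IN_CIC;   /* hypothetical flag */
    LWLockRelease(ProcArrayLock);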

Greetings,

Andres Freund



Re: [DOC] Document concurrent index builds waiting on each other

From: Alvaro Herrera

On 2020-Mar-25, Andres Freund wrote:

> > I don't know if my approach is exactly what Andres has in mind
> 
> Not quite. I don't think it's generally correct for CIC to set
> PROC_IN_VACUUM. I'm doubtful it's the case even just for plain indexes -
> we don't want rows to be pruned away from under us. I also think we'd
> want to set such a flag during all of the CIC phases?
> 
> What I was thinking of was a new flag, with a distinct value from
> PROC_IN_VACUUM. It'd currently just be specified in the
> GetCurrentVirtualXIDs() calls in WaitForOlderSnapshots(). That'd avoid
> needing to wait for other CICs on different relations. Since CIC is not
> permitted on system tables, and CIC doesn't do DML on normal tables, it
> seems fairly obviously correct to exclude other CICs.

Hmm, that sounds more promising.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [DOC] Document concurrent index builds waiting on each other

From: Michael Paquier

On Wed, Mar 25, 2020 at 05:12:48PM -0300, Alvaro Herrera wrote:
> Hmm, that sounds more promising.

Haven't looked at that myself in detail.  But as I doubt that this
would be backpatched, wouldn't it be better to document the issue for
now?
--
Michael

Attachments

Re: [DOC] Document concurrent index builds waiting on each other

From: James Coleman

On Wed, Mar 25, 2020 at 3:58 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-03-25 16:30:10 -0300, Alvaro Herrera wrote:
> > I posted this in November
> > https://postgr.es/m/20191101203310.GA12239@alvherre.pgsql but I didn't
> > put time to go through the issues there.
>
> Oh, missed that.
>
>
> > I don't know if my approach is exactly what Andres has in mind
>
> Not quite. I don't think it's generally correct for CIC to set
> PROC_IN_VACUUM. I'm doubtful it's the case even just for plain indexes -
> we don't want rows to be pruned away from under us. I also think we'd
> want to set such a flag during all of the CIC phases?
>
> What I was thinking of was a new flag, with a distinct value from
> PROC_IN_VACUUM. It'd currently just be specified in the
> GetCurrentVirtualXIDs() calls in WaitForOlderSnapshots(). That'd avoid
> needing to wait for other CICs on different relations. Since CIC is not
> permitted on system tables, and CIC doesn't do DML on normal tables, it
> seems fairly obviously correct to exclude other CICs.

That would keep CIC from blocking other CICs, but it wouldn't solve
the problem of CIC blocking vacuum on unrelated tables, right? Perhaps
that's orthogonal though.

James



Re: [DOC] Document concurrent index builds waiting on each other

From: Andres Freund

Hi,

On 2020-04-15 09:31:58 -0400, James Coleman wrote:
> On Wed, Mar 25, 2020 at 3:58 PM Andres Freund <andres@anarazel.de> wrote:
> > On 2020-03-25 16:30:10 -0300, Alvaro Herrera wrote:
> > > I posted this in November
> > > https://postgr.es/m/20191101203310.GA12239@alvherre.pgsql but I didn't
> > > put time to go through the issues there.
> >
> > Oh, missed that.
> >
> >
> > > I don't know if my approach is exactly what Andres has in mind
> >
> > Not quite. I don't think it's generally correct for CIC to set
> > PROC_IN_VACUUM. I'm doubtful it's the case even just for plain indexes -
> > we don't want rows to be pruned away from under us. I also think we'd
> > want to set such a flag during all of the CIC phases?
> >
> > What I was thinking of was a new flag, with a distinct value from
> > PROC_IN_VACUUM. It'd currently just be specified in the
> > GetCurrentVirtualXIDs() calls in WaitForOlderSnapshots(). That'd avoid
> > needing to wait for other CICs on different relations. Since CIC is not
> > permitted on system tables, and CIC doesn't do DML on normal tables, it
> > seems fairly obviously correct to exclude other CICs.
> 
> That would keep CIC from blocking other CICs, but it wouldn't solve
> the problem of CIC blocking vacuum on unrelated tables, right? Perhaps
> that's orthogonal though.

I am not sure what blocking you are referring to here? CIC shouldn't
block vacuum on other tables from running? Or do you just mean that
vacuum will not be able to remove some rows due to the snapshot from the
CIC? That'd be an orthogonal problem, yes.

If it's about the xmin horizon for vacuum: I think we could probably
avoid that using the same flag. As vacuum cannot be run against a table
that has a CIC running (although it'd theoretically be possible to allow
that), it should be safe to ignore PROC_IN_CIC backends in vacuum's
GetOldestXmin() call. That might not be true for system relations, but
we don't allow CIC on those.
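
As a sketch, with an invented PROCARRAY_CIC_FLAG mirroring how
PROCARRAY_VACUUM_FLAG pairs with PROC_IN_VACUUM (illustrative only, not
a tested patch):

    /* src/include/storage/procarray.h: hypothetical companion flag */
    #define PROCARRAY_CIC_FLAG    0x40    /* same value as PROC_IN_CIC */

    /*
     * In vacuum's horizon computation, ignore CIC backends too.  Safe
     * for user tables because VACUUM and CIC cannot run on the same
     * table at the same time; deliberately not done for catalog tables.
     */
    oldestXmin = GetOldestXmin(rel, PROCARRAY_FLAGS_VACUUM |
                                    PROCARRAY_CIC_FLAG);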

Greetings,

Andres Freund



Re: [DOC] Document concurrent index builds waiting on each other

From: James Coleman

On Wed, Apr 15, 2020 at 6:31 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-04-15 09:31:58 -0400, James Coleman wrote:
> > On Wed, Mar 25, 2020 at 3:58 PM Andres Freund <andres@anarazel.de> wrote:
> > > On 2020-03-25 16:30:10 -0300, Alvaro Herrera wrote:
> > > > I posted this in November
> > > > https://postgr.es/m/20191101203310.GA12239@alvherre.pgsql but I didn't
> > > > put time to go through the issues there.
> > >
> > > Oh, missed that.
> > >
> > >
> > > > I don't know if my approach is exactly what Andres has in mind
> > >
> > > Not quite. I don't think it's generally correct for CIC to set
> > > PROC_IN_VACUUM. I'm doubtful it's the case even just for plain indexes -
> > > we don't want rows to be pruned away from under us. I also think we'd
> > > want to set such a flag during all of the CIC phases?
> > >
> > > What I was thinking of was a new flag, with a distinct value from
> > > PROC_IN_VACUUM. It'd currently just be specified in the
> > > GetCurrentVirtualXIDs() calls in WaitForOlderSnapshots(). That'd avoid
> > > needing to wait for other CICs on different relations. Since CIC is not
> > > permitted on system tables, and CIC doesn't do DML on normal tables, it
> > > seems fairly obviously correct to exclude other CICs.
> >
> > That would keep CIC from blocking other CICs, but it wouldn't solve
> > the problem of CIC blocking vacuum on unrelated tables, right? Perhaps
> > that's orthogonal though.
>
> I am not sure what blocking you are referring to here? CIC shouldn't
> block vacuum on other tables from running? Or do you just mean that
> vacuum will not be able to remove some rows due to the snapshot from the
> CIC? That'd be an orthogonal problem, yes.
>
> If it's about the xmin horizon for vacuum: I think we could probably
> avoid that using the same flag. As vacuum cannot be run against a table
> that has a CIC running (although it'd theoretically be possible to allow
> that), it should be safe to ignore PROC_IN_CIC backends in vacuum's
> GetOldestXmin() call. That might not be true for system relations, but
> we don't allow CIC on those.

Yeah, I mean that if I have a CIC running on table X then vacuum can't
remove dead tuples (from after the CIC's snapshot) on table Y.

That's a pretty significant danger, given the combination of:
1. Index builds on very large tables can take many days, and
2. The well understood problems of high update tables with dead tuples
and poor plans.

I've previously discussed this with other hackers and the reasoning
they'd understood was that we couldn't always safely ignore
PROC_IN_CIC backends in the vacuum's oldest xmin call because of
function indexes, and the fact that (despite clear recommendations to
the contrary), there's nothing actually preventing someone from adding
a function index on table X that queries table Y.

I'm not sure I buy that we should care about people doing something
clearly so dangerous, but...I grant that it'd be nice not to cause new
crashes.

James



Re: [DOC] Document concurrent index builds waiting on each other

From: Andres Freund

Hi,

On 2020-04-15 21:44:48 -0400, James Coleman wrote:
> On Wed, Apr 15, 2020 at 6:31 PM Andres Freund <andres@anarazel.de> wrote:
> > If it's about the xmin horizon for vacuum: I think we could probably
> > avoid that using the same flag. As vacuum cannot be run against a table
> > that has a CIC running (although it'd theoretically be possible to allow
> > that), it should be safe to ignore PROC_IN_CIC backends in vacuum's
> > GetOldestXmin() call. That might not be true for system relations, but
> > we don't allow CIC on those.
>
> Yeah, I mean that if I have a CIC running on table X then vacuum can't
> remove dead tuples (from after the CIC's snapshot) on table Y.

For me "blocking" evokes waiting for a lock, which is why I thought
you'd not mean that issue.


> That's a pretty significant danger, given the combination of:
> 1. Index builds on very large tables can take many days, and

We at least don't hold a single snapshot over the multiple phases...


> 2. The well understood problems of high update tables with dead tuples
> and poor plans.

Which specific problem are you referring to? The planner probing the end
of the index for values outside of the histogram? I'd hope
3ca930fc39ccf987c1c22fd04a1e7463b5dd0dfd improved the situation there a
bit?


> > [description why we could ignore CIC for vacuum horizon on other tables ]

> I've previously discussed this with other hackers and the reasoning
> they'd understood was that we couldn't always safely ignore
> PROC_IN_CIC backends in the vacuum's oldest xmin call because of
> function indexes, and the fact that (despite clear recommendations to
> the contrary), there's nothing actually preventing someone from adding
> a function index on table X that queries table Y.

Well, even if we consider this an actual problem, we could still use
PROC_IN_CIC for non-expression non-partial indexes (index operators
themselves had better ensure this isn't a problem, or they're ridiculously
broken already - they can get called during vacuum).

Even when expressions are involved, I don't think that necessarily would
have to mean that we need to use the same snapshot to run expressions in
for the whole scan. So we could occasionally take a new snapshot for the
purpose of computing expressions.

The hard part presumably would be that we'd need to advertise one xmin
for the expression snapshot to protect tuples potentially accessed from
being removed, but at the same time we also need to advertise the xmin
of the snapshot used by CIC, to avoid HOT pruning in other sessions from
removing tuple versions from the table the index is being created
on.

There's not really infrastructure for doing so. I think we'd basically
have to start publicizing multiple xmin values (as long as PGXACT->xmin
is <= new xmin for expressions, only GetOldestXmin() would need to care,
and it's not that performance critical). Not pretty.


> I'm not sure I buy that we should care about people doing something
> clearly so dangerous, but...I grant that it'd be nice not to cause new
> crashes.

I don't think it's just dangerous expressions that would be
affected. Normal expression indexes need to be able to do syscache
lookups etc, and they can't safely do so if tuple versions can be
removed in the middle of a scan. We could avoid that by not ignoring
PROC_IN_CIC backends in GetOldestXmin() calls for catalog tables (yuck).

Regards,

Andres



Re: [DOC] Document concurrent index builds waiting on each other

From: James Coleman

On Thu, Apr 16, 2020 at 6:12 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-04-15 21:44:48 -0400, James Coleman wrote:
> > On Wed, Apr 15, 2020 at 6:31 PM Andres Freund <andres@anarazel.de> wrote:
> > > If it's about the xmin horizon for vacuum: I think we could probably
> > > avoid that using the same flag. As vacuum cannot be run against a table
> > > that has a CIC running (although it'd theoretically be possible to allow
> > > that), it should be safe to ignore PROC_IN_CIC backends in vacuum's
> > > GetOldestXmin() call. That might not be true for system relations, but
> > > we don't allow CIC on those.
> >
> > Yeah, I mean that if I have a CIC running on table X then vacuum can't
> > remove dead tuples (from after the CIC's snapshot) on table Y.
>
> For me "blocking" evokes waiting for a lock, which is why I thought
> you'd not mean that issue.

It was a sloppy choice of language on my part; for better or worse at
work we've taken to talking about "blocking vacuum" when that's really
shorthand for "blocking [or, as you'd prefer, preventing] vacuuming
dead tuples".

> > That's a pretty significant danger, given the combination of:
> > 1. Index builds on very large tables can take many days, and
>
> We at least don't hold a single snapshot over the multiple phases...

For sure. And text sorting improvements have made this better also;
still, as you often point out re: xid size, databases are only getting
larger (and handling more TPS).

> > 2. The well understood problems of high update tables with dead tuples
> > and poor plans.
>
> Which specific problem are you referring to? The planner probing the end
> of the index for values outside of the histogram? I'd hope
> 3ca930fc39ccf987c1c22fd04a1e7463b5dd0dfd improved the situation there a
> bit?

Yes, and other commits too, IIRC from the time we spent debugging
exactly the scenario mentioned in that commit.

But by "poor plans" I don't mean specifically "poor planning time" but
that we can still end up choosing the "wrong" plan, right? And dead
tuples can make an index scan be significantly worse than it would
otherwise be. Same for a seq scan: you can end up looking at millions
of dead tuples in a nominally 500 row table.

> > > [description why we could ignore CIC for vacuum horizon on other tables ]
>
> > I've previously discussed this with other hackers and the reasoning
> > they'd understood was that we couldn't always safely ignore
> > PROC_IN_CIC backends in the vacuum's oldest xmin call because of
> > function indexes, and the fact that (despite clear recommendations to
> > the contrary), there's nothing actually preventing someone from adding
> > a function index on table X that queries table Y.
>
> Well, even if we consider this an actual problem, we could still use
> PROC_IN_CIC for non-expression non-partial indexes (index operators
> themselves had better ensure this isn't a problem, or they're ridiculously
> broken already - they can get called during vacuum).

Agreed. It'd be unfortunate to have to limit it though.

> Even when expressions are involved, I don't think that necessarily would
> have to mean that we need to use the same snapshot to run expressions in
> for the whole scan. So we could occasionally take a new snapshot for the
> purpose of computing expressions.
>
> The hard part presumably would be that we'd need to advertise one xmin
> for the expression snapshot to protect tuples potentially accessed from
> being removed, but at the same time we also need to advertise the xmin
> of the snapshot used by CIC, to avoid HOT pruning in other sessions from
> removing tuple versions from the table the index is being created
> on.
>
> There's not really infrastructure for doing so. I think we'd basically
> have to start publicizing multiple xmin values (as long as PGXACT->xmin
> is <= new xmin for expressions, only GetOldestXmin() would need to care,
> and it's not that performance critical). Not pretty.

In other words, pretty invasive.

> > I'm not sure I buy that we should care about people doing something
> > clearly so dangerous, but...I grant that it'd be nice not to cause new
> > crashes.
>
> I don't think it's just dangerous expressions that would be
> affected. Normal expression indexes need to be able to do syscache
> lookups etc, and they can't safely do so if tuple versions can be
> removed in the middle of a scan. We could avoid that by not ignoring
> PROC_IN_CIC backends in GetOldestXmin() calls for catalog tables (yuck).

At first glance this sounds a lot less invasive, but I also agree it's gross.

James



Re: [DOC] Document concurrent index builds waiting on each other

From: David Johnston

The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            tested, passed

James,

I'm on board with the point of pointing out explicitly the "concurrent
index builds on multiple tables at the same time will not return on any
one table until all have completed", with back-patching.  I do not
believe the new paragraph is necessary though.  I'd suggest trying to
weave it into the existing paragraph ending "Even then, however, the
index may not be immediately usable for queries: in the worst case, it
cannot be used as long as transactions exist that predate the start of
the index build."  Adding "Notably, " in front of the existing sentence
fragment above and tacking it onto the end probably suffices.

I don't actually know whether this is true behavior though.  Is it something our tests do, or could, demonstrate?

It is sorta weird to say "one will not return until all have
completed", though, since usually people think return means completed.
That whole paragraph is a bit unclear for the inexperienced DBA, in
particular "marked ready to use" but isn't usable.

That isn't really on this patch to fix though, and the clarity around
concurrent CIC seems worthwhile to add, even if imprecise - IMO it
doesn't make that whole section any less clear and points out what
seems to be a unique dynamic.  IOW I would send the simple fix (inline,
not a new paragraph) to a committer.  The bigger doc reworking or
actual behavioral improvements shouldn't hold up such a simple
improvement.

David J.

The new status of this patch is: Waiting on Author

Re: [DOC] Document concurrent index builds waiting on each other

From: James Coleman

On Thu, Jul 16, 2020 at 7:34 PM David Johnston
<david.g.johnston@gmail.com> wrote:
>
> The following review has been posted through the commitfest application:
> make installcheck-world:  not tested
> Implements feature:       not tested
> Spec compliant:           not tested
> Documentation:            tested, passed
>
> James,
>
> I'm on board with the point of pointing out explicitly the "concurrent
> index builds on multiple tables at the same time will not return on
> any one table until all have completed", with back-patching.  I do not
> believe the new paragraph is necessary though.  I'd suggest trying to
> weave it into the existing paragraph ending "Even then, however, the
> index may not be immediately usable for queries: in the worst case, it
> cannot be used as long as transactions exist that predate the start of
> the index build."  Adding "Notably, " in front of the existing
> sentence fragment above and tacking it onto the end probably suffices.

I'm not sure "the index may not be immediately usable for queries" is
really accurate/sufficient: it seems to imply the CREATE INDEX has
returned but for some reason the index isn't yet valid. The issue I'm
trying to describe here is that the CREATE INDEX query itself will not
return until all preceding queries have completed *including*
concurrent index creations on unrelated tables.

> I don't actually know whether this is true behavior though.  Is it something our tests do, or could, demonstrate?

It'd take tests that exercise parallelism, but it's pretty simple to
demonstrate (but you do have to catch the first index build in a scan
phase, so you either need lots of data or a hack). Here's an example
that uses a bit of a hack to simulate a slow scan phase:

Setup:
create table items(i int);
create table others(i int);
create function slow_expr() returns text as $$ select pg_sleep(15);
select '5'; $$ language sql immutable;
insert into items(i) values (1), (2);
insert into others(i) values (1), (2);

Then the following in order:
1. In session A: create index concurrently on items((i::text || slow_expr()));
2. In session B (at the same time): create index concurrently on others(i);

You'll notice that the 2nd command, which should be practically
instantaneous, waits on the first ~30s scan phase of (1) before it
returns. The same is true if after (2) completes you immediately run
it again -- it waits on the second ~30s scan phase of (1).

That does reveal a bit of complexity though that the current
patch doesn't address, which is that this can be phase dependent (and
that complexity gets a lot more non-obvious when there's real live
activity, particularly long-running transactions, in the system as
well).

I've attached a new patch series with two items:
1. A simpler (and I believe more correct) doc change for "cic blocks
cic on other tables".
2. A patch to document that all index builds can prevent tuples from
being vacuumed away on other tables.

If it's preferable we could commit the first and discuss the second
separately, but since that limitation was also discussed up-thread, I
decided to include them both here for now.

James

Attachments

Re: [DOC] Document concurrent index builds waiting on each other

From: Alvaro Herrera

On 2020-Mar-25, Andres Freund wrote:

> What I was thinking of was a new flag, with a distinct value from
> PROC_IN_VACUUM. It'd currently just be specified in the
> GetCurrentVirtualXIDs() calls in WaitForOlderSnapshots(). That'd avoid
> needing to wait for other CICs on different relations. Since CIC is not
> permitted on system tables, and CIC doesn't do DML on normal tables, it
> seems fairly obviously correct to exclude other CICs.

Hmm, that does work, and seems a pretty small patch -- attached.  Of
course, some more commentary is necessary, but the theory of operation
is as Andres says.  (It does not solve the vacuuming problem I was
describing in the other thread, only the spurious waiting that James is
complaining about in this thread.)

I'm going to try and poke holes in this now ... (Expression indexes
with falsely immutable functions?)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments

Re: [DOC] Document concurrent index builds waiting on each other

From: Alvaro Herrera

On 2020-Aug-04, Alvaro Herrera wrote:

> diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
> index b20e2ad4f6..43c8ea3e31 100644
> --- a/src/include/storage/proc.h
> +++ b/src/include/storage/proc.h
> @@ -53,6 +53,8 @@ struct XidCache
>  #define        PROC_IS_AUTOVACUUM    0x01    /* is it an autovac worker? */
>  #define        PROC_IN_VACUUM        0x02    /* currently running lazy vacuum */
>  #define        PROC_IN_ANALYZE        0x04    /* currently running analyze */
> +#define        PROC_IN_CIC            0x40    /* currently running CREATE INDEX
> +                                           CONCURRENTLY */
>  #define        PROC_VACUUM_FOR_WRAPAROUND    0x08    /* set by autovac only */
>  #define        PROC_IN_LOGICAL_DECODING    0x10    /* currently doing logical
>                                                   * decoding outside xact */

Hah, I missed adding the new bit to PROC_VACUUM_STATE_MASK here.
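
For the archives, the corrected mask would presumably read:

    /* flags reset at end of transaction */
    #define PROC_VACUUM_STATE_MASK \
        (PROC_IN_VACUUM | PROC_IN_ANALYZE | PROC_IN_CIC | \
         PROC_VACUUM_FOR_WRAPAROUND)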

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



PROC_IN_ANALYZE stillborn 13 years ago

From: Alvaro Herrera

Back in the 8.3 cycle (2007) when the autovacuum launcher/worker split
was done, we annoyed people because it blocked DDL.  That led to an
effort to cancel autovac automatically when that was detected, by Simon
Riggs.
https://postgr.es/m/1191526327.4223.204.camel@ebony.site
https://postgr.es/m/1192129897.4233.433.camel@ebony.site

I was fixated on only cancelling when it was ANALYZE, to avoid losing
any VACUUM work.
https://postgr.es/m/20071025164150.GF23566@alvh.no-ip.org
That turned into some flags for PGPROC to detect whether a process was
ANALYZE, and cancel only those.
https://postgr.es/m/20071024151328.GG6559@alvh.no-ip.org
Commit:
https://postgr.es/m/20071024205536.CB425754229@cvs.postgresql.org
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=745c1b2c2ab

However, I was outvoted, so we do not limit cancellation to analyze.
Patch and discussion: https://postgr.es/m/20071025164150.GF23566@alvh.no-ip.org
Commit:
https://postgr.es/m/20071026204510.AA02E754229@cvs.postgresql.org
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=acac68b2bca

... which means the flag I had added two days earlier has never been
used for anything.  We've carried the flag forward to this day for
almost 13 years, dutifully turning it on and off ... but never checking
it anywhere.

I propose to remove it, as in the attached patch.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments

Re: PROC_IN_ANALYZE stillborn 13 years ago

From: Andres Freund

Hi,

On 2020-08-05 19:55:49 -0400, Alvaro Herrera wrote:
> Back in the 8.3 cycle (2007) when the autovacuum launcher/worker split
> was done, we annoyed people because it blocked DDL.  That led to an
> effort to cancel autovac automatically when that was detected, by Simon
> Riggs.
> https://postgr.es/m/1191526327.4223.204.camel@ebony.site
> https://postgr.es/m/1192129897.4233.433.camel@ebony.site
> 
> I was fixated on only cancelling when it was ANALYZE, to avoid losing
> any VACUUM work.
> https://postgr.es/m/20071025164150.GF23566@alvh.no-ip.org
> That turned into some flags for PGPROC to detect whether a process was
> ANALYZE, and cancel only those.
> https://postgr.es/m/20071024151328.GG6559@alvh.no-ip.org
> Commit:
> https://postgr.es/m/20071024205536.CB425754229@cvs.postgresql.org
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=745c1b2c2ab
> 
> However, I was outvoted, so we do not limit cancellation to analyze.
> Patch and discussion: https://postgr.es/m/20071025164150.GF23566@alvh.no-ip.org
> Commit:
> https://postgr.es/m/20071026204510.AA02E754229@cvs.postgresql.org
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=acac68b2bca
> 
> ... which means the flag I had added two days earlier has never been
> used for anything.  We've carried the flag forward to this day for
> almost 13 years, dutifully turning it on and off ... but never checking
> it anywhere.
> 
> I propose to remove it, as in the attached patch.

I'm mildly against that, because I'd really like to start making use of
the flag. Not so much for cancellations, but to avoid the drastic impact
analyze has on bloat.  In OLTP workloads with big tables, and unless
cost limiting is disabled for analyze (or with slow IO), the snapshot
that analyze holds is often by far the transaction with the oldest xmin.

It's not entirely trivial to fix (just ignoring it could lead to
detoasting issues), but also not that hard.

Only mildly against because it'd not be hard to reintroduce once we need
it.

Greetings,

Andres Freund



Re: PROC_IN_ANALYZE stillborn 13 years ago

From: Simon Riggs

On Thu, 6 Aug 2020 at 02:07, Andres Freund <andres@anarazel.de> wrote:

> On 2020-08-05 19:55:49 -0400, Alvaro Herrera wrote:
> > ... which means the flag I had added two days earlier has never been
> > used for anything.  We've carried the flag forward to this day for
> > almost 13 years, dutifully turning it on and off ... but never checking
> > it anywhere.
> >
> > I propose to remove it, as in the attached patch.
>
> I'm mildly against that, because I'd really like to start making use of
> the flag. Not so much for cancellations, but to avoid the drastic impact
> analyze has on bloat.  In OLTP workloads with big tables, and unless
> cost limiting is disabled for analyze (or with slow IO), the snapshot
> that analyze holds is often by far the transaction with the oldest xmin.
>
> It's not entirely trivial to fix (just ignoring it could lead to
> detoasting issues), but also not that hard.
>
> Only mildly against because it'd not be hard to reintroduce once we need
> it.

Good points, both.

The most obvious way to avoid long analyze snapshots is to make the analysis take multiple snapshots as it runs, rather than try to invent some clever way of ignoring the analyze snapshots (which as Alvaro points out, we never did). All we need to do is to have an analyze snapshot last for at most N rows, but keep scanning until we have the desired sample size. Doing that would mean the analyze sample wouldn't come from a single snapshot, but then who cares? There is no requirement for consistency - the sample would be arguably *more* stable because it comes from multiple points in time, not just one.
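
As a sketch of the shape of that (the interval, variable name, and
placement are illustrative only; a real patch would need care around
tuples fetched under the previous snapshot):

    /* src/backend/commands/analyze.c, inside the acquire_sample_rows()
     * block-sampling loop, roughly: */
    #define ANALYZE_SNAPSHOT_EVERY_N_BLOCKS 1024    /* illustrative knob */

    if (++blocks_since_snapshot >= ANALYZE_SNAPSHOT_EVERY_N_BLOCKS)
    {
        /* Swap in a fresh snapshot so our advertised xmin keeps advancing. */
        PopActiveSnapshot();
        PushActiveSnapshot(GetTransactionSnapshot());
        blocks_since_snapshot = 0;
    }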

--
Simon Riggs                http://www.2ndQuadrant.com/
Mission Critical Databases

Re: PROC_IN_ANALYZE stillborn 13 years ago

From: Robert Haas

On Wed, Aug 5, 2020 at 9:07 PM Andres Freund <andres@anarazel.de> wrote:
> I'm mildly against that, because I'd really like to start making use of
> the flag. Not so much for cancellations, but to avoid the drastic impact
> analyze has on bloat.  In OLTP workloads with big tables, and unless
> cost limiting is disabled for analyze (or with slow IO), the snapshot
> that analyze holds is often by far the transaction with the oldest xmin.
>
> It's not entirely trivial to fix (just ignoring it could lead to
> detoasting issues), but also not that hard.
>
> Only mildly against because it'd not be hard to reintroduce once we need
> it.

I think we should nuke it. It's trivial to reintroduce the flag if we
need it later, if and when somebody's willing to do the associated
work. In the meantime, it adds confusion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PROC_IN_ANALYZE stillborn 13 years ago

From: Tom Lane

Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Aug 5, 2020 at 9:07 PM Andres Freund <andres@anarazel.de> wrote:
>> Only mildly against because it'd not be hard to reintroduce once we need
>> it.

> I think we should nuke it. It's trivial to reintroduce the flag if we
> need it later, if and when somebody's willing to do the associated
> work. In the meantime, it adds confusion.

+1 for removal.  It's not clear to me that we'd ever put it back.
Long-running ANALYZE snapshots are indeed a problem, but Simon's proposal
upthread to just take a new one every so often seems like a much cleaner
and simpler answer than having onlookers assume that it's safe to ignore
ANALYZE processes.  (Given that ANALYZE can invoke user-defined functions,
and can be invoked from inside user transactions, any such assumption
seems horribly dangerous.)

            regards, tom lane



Re: PROC_IN_ANALYZE stillborn 13 years ago

From: Robert Haas

On Thu, Aug 6, 2020 at 2:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> +1 for removal.  It's not clear to me that we'd ever put it back.
> Long-running ANALYZE snapshots are indeed a problem, but Simon's proposal
> upthread to just take a new one every so often seems like a much cleaner
> and simpler answer than having onlookers assume that it's safe to ignore
> ANALYZE processes.  (Given that ANALYZE can invoke user-defined functions,
> and can be invoked from inside user transactions, any such assumption
> seems horribly dangerous.)

Not to get too far from the proposal on the table of just removing
something that's been unused for a really long time, which stands on
its own merits, but if a particular ANALYZE doesn't invoke any
user-defined functions and isn't run inside a transaction, could we
skip acquiring a snapshot altogether? That's an extremely common case,
though by no means universal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PROC_IN_ANALYZE stillborn 13 years ago

From: Tom Lane

Robert Haas <robertmhaas@gmail.com> writes:
> Not to get too far from the proposal on the table of just removing
> something that's been unused for a really long time, which stands on
> its own merits, but if a particular ANALYZE doesn't invoke any
> user-defined functions and isn't run inside a transaction, could we
> skip acquiring a snapshot altogether? That's an extremely common case,
> though by no means universal.

I'm inclined to think not.

(1) Without a snapshot it's hard to make any non-bogus decisions about
which tuples are live and which are dead.  Admittedly, with Simon's
proposal the final totals would be spongy anyhow, but at least the
individual decisions produce meaningful answers.

(2) I'm pretty sure there are places in the system that assume that any
reader of a table is using an MVCC snapshot.  For instance, didn't you
introduce some such assumptions along with or just after getting rid of
SnapshotNow for catalog scans?

            regards, tom lane



Re: PROC_IN_ANALYZE stillborn 13 years ago

From: Robert Haas

On Thu, Aug 6, 2020 at 3:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> (1) Without a snapshot it's hard to make any non-bogus decisions about
> which tuples are live and which are dead.  Admittedly, with Simon's
> proposal the final totals would be spongy anyhow, but at least the
> individual decisions produce meaningful answers.

I don't think I believe this. It's impossible to make *consistent*
decisions, but it's not difficult to make *non-bogus* decisions.
HeapTupleSatisfiesVacuum() and HeapTupleSatifiesUpdate() both make
such decisions, and neither takes a snapshot argument.

> (2) I'm pretty sure there are places in the system that assume that any
> reader of a table is using an MVCC snapshot.  For instance, didn't you
> introduce some such assumptions along with or just after getting rid of
> SnapshotNow for catalog scans?

SnapshotSelf still exists and is still used, and IIRC, it has very
similar semantics to the old SnapshotNow, so I don't think that we
introduced any really general assumptions of this sort. I think the
important part of those changes was that all the code that had
previously used SnapshotNow to examine system catalog tuples for DDL
purposes and catcache lookups and so forth started using an MVCC scan,
which removed one (of many) impediments to concurrent DDL. I think the
fact that we removed SnapshotNow outright rather than just ceasing to
use it for that purpose was mostly so that nobody would accidentally
reintroduce code that used it for the sorts of purposes for which it
had been used previously, and secondarily for code cleanliness.
There's nothing wrong with it fundamentally AFAIK.

It's worth mentioning, I think, that the main problem with SnapshotNow
was that it provided no particular stability. If you did an index scan
under SnapshotNow you might find two copies or no copies of a row
being concurrently updated, rather than exactly one. And that in turn
could cause problems like failure to build a relcache entry. Now, how
important is stability to ANALYZE? If you *either* retake your MVCC
snapshots periodically as you re-scan the table *or* use a non-MVCC
snapshot for the scan, you can get those same kinds of artifacts: you
might see two copies of a just-updated row, or none. Maybe this would
actually *break* something - e.g. could there be code that would get
confused if we sample multiple rows for the same value in a column
that has a UNIQUE index? But I think mostly the consequences would be
that you might get somewhat different results from the statistics.

It's not clear to me that it would even be correct to categorize those
somewhat-different results as "less accurate." Tuples that are
invisible to a query often have performance consequences very similar
to visible tuples, in terms of the query run time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PROC_IN_ANALYZE stillborn 13 years ago

From: Andres Freund

Hi,

On 2020-08-06 14:48:52 -0400, Robert Haas wrote:
> On Thu, Aug 6, 2020 at 2:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > +1 for removal.  It's not clear to me that we'd ever put it back.
> > Long-running ANALYZE snapshots are indeed a problem, but Simon's proposal
> > upthread to just take a new one every so often seems like a much cleaner
> > and simpler answer than having onlookers assume that it's safe to ignore
> > ANALYZE processes.  (Given that ANALYZE can invoke user-defined functions,
> > and can be invoked from inside user transactions, any such assumption
> > seems horribly dangerous.)
> 
> Not to get too far from the proposal on the table of just removing
> something that's been unused for a really long time, which stands on
> its own merits, but if a particular ANALYZE doesn't invoke any
> user-defined functions and isn't run inside a transaction, could we
> skip acquiring a snapshot altogether? That's an extremely common case,
> though by no means universal.

I don't think so, at least not in very common situations. E.g. as long
as there's a toast table we need to hold a snapshot to ensure that we
don't get failures looking up toasted datums. IIRC there were some other
similar issues that I can't quite recall right now.
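
(The toast dependency is easy to see; illustration with a made-up
table, with storage forced out-of-line so the wide values really do
land in the toast table:)

    CREATE TABLE toasty (payload text);
    ALTER TABLE toasty ALTER COLUMN payload SET STORAGE EXTERNAL;
    INSERT INTO toasty
      SELECT repeat(md5(i::text), 200) FROM generate_series(1, 1000) i;
    -- sampling payload means detoasting, which needs a snapshot
    ANALYZE toasty;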

Greetings,

Andres Freund



Re: PROC_IN_ANALYZE stillborn 13 years ago

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> ... how
> important is stability to ANALYZE? If you *either* retake your MVCC
> snapshots periodically as you re-scan the table *or* use a non-MVCC
> snapshot for the scan, you can get those same kinds of artifacts: you
> might see two copies of a just-updated row, or none. Maybe this would
> actually *break* something - e.g. could there be code that would get
> confused if we sample multiple rows for the same value in a column
> that has a UNIQUE index? But I think mostly the consequences would be
> that you might get somewhat different results from the statistics.

Yeah, that's an excellent point.  I can imagine somebody complaining
"this query clearly matches a unique index, why is the planner estimating
multiple rows out?".  But most of the time it wouldn't matter much.
(And I think you can get cases like that anyway today.)

> It's not clear to me that it would even be correct to categorize those
> somewhat-different results as "less accurate."

Estimating two rows where the correct answer is one row is clearly
"less accurate".  But I suspect you'd have to be quite unlucky to
get such a result in practice from Simon's proposal, as long as we
weren't super-aggressive about changing ANALYZE's snapshot a lot.

            regards, tom lane



Re: PROC_IN_ANALYZE stillborn 13 years ago

From
Andres Freund
Date:
Hi,

On 2020-08-06 16:22:23 -0400, Robert Haas wrote:
> On Thu, Aug 6, 2020 at 3:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > (1) Without a snapshot it's hard to make any non-bogus decisions about
> > which tuples are live and which are dead.  Admittedly, with Simon's
> > proposal the final totals would be spongy anyhow, but at least the
> > individual decisions produce meaningful answers.
> 
> I don't think I believe this. It's impossible to make *consistent*
> decisions, but it's not difficult to make *non-bogus* decisions.
> HeapTupleSatisfiesVacuum() and HeapTupleSatisfiesUpdate() both make
> such decisions, and neither takes a snapshot argument.

Yea, I don't think that's a big problem for the main table. As I just
mentioned in an email a few minutes ago, toast is a bit of a different
topic.

In fact, conceptually using something like a new snapshot for each
sample tuple actually seems like it'd be somewhat of an improvement
over using a single snapshot. Given that it's a sample, we don't have
very precise expectations about its exact contents, and publishing one
that solely consists of pretty old rows by the time we're done doesn't
seem like a meaningful improvement.  I guess there's some danger of
distinctness estimates getting worse, by seeing multiple versions of
the same tuple - but they're notoriously inaccurate already, and I
don't think this changes much.
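
(The estimate in question is what ends up in pg_stats.n_distinct;
illustrative query, table and column names made up:)

    SELECT attname, n_distinct
      FROM pg_stats
     WHERE tablename = 'items' AND attname = 'i';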


> > (2) I'm pretty sure there are places in the system that assume that any
> > reader of a table is using an MVCC snapshot.  For instance, didn't you
> > introduce some such assumptions along with or just after getting rid of
> > SnapshotNow for catalog scans?
> 
> SnapshotSelf still exists and is still used, and IIRC, it has very
> similar semantics to the old SnapshotNow, so I don't think that we
> introduced any really general assumptions of this sort. I think the
> important part of those changes was that all the code that had
> previously used SnapshotNow to examine system catalog tuples for DDL
> purposes and catcache lookups and so forth started using an MVCC scan,
> which removed one (of many) impediments to concurrent DDL. I think the
> fact that we removed SnapshotNow outright rather than just ceasing to
> use it for that purpose was mostly so that nobody would accidentally
> reintroduce code that used it for the sorts of purposes for which it
> had been used previously, and secondarily for code cleanliness.
> There's nothing wrong with it fundamentally AFAIK.

Some preaching to the choir:

IDK, there's not really much it (along with Self, Any, ...) can safely
be used for, unless you have pretty heavyweight additional locking, or
look explicitly at exactly one tuple version.  If it weren't probably
unnecessary, and if there weren't some disaster recovery benefits, I'd
be in favor of prohibiting most snapshot types for [sys]table scans.


I'm doubtful that using the term "snapshot" for any of these is a good
choice, and I don't think there's benefit in actually going through the
snapshot APIs.  Especially not when, like *Dirty, they abuse fields
inside SnapshotData to return data that can't be returned through the
normal API.  It'd probably be better to have more explicit APIs for
these, rather than going through the snapshot interface.


> It's not clear to me that it would even be correct to categorize those
> somewhat-different results as "less accurate." Tuples that are
> invisible to a query often have performance consequences very similar
> to visible tuples, in terms of the query run time.

+1

Greetings,

Andres Freund



Re: PROC_IN_ANALYZE stillborn 13 years ago

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> In fact, conceptually using something like a new snapshot for each
> sample tuple actually seems like it'd be somewhat of an improvement
> over using a single snapshot.

Dunno, that feels like a fairly bad idea to me.  It seems like it would
overemphasize the behavior of whatever queries happened to be running
concurrently with the ANALYZE.  I do follow the argument that using a
single snapshot for the whole ANALYZE overemphasizes a single instant
in time, but I don't think that leads to the conclusion that we shouldn't
use a snapshot at all.

Another angle that would be worth considering, aside from the issue
of whether the sample used for pg_statistic becomes more or less
representative, is what impact all this would have on the tuple count
estimates that go to the stats collector and pg_class.reltuples.
Right now, we don't have a great story at all on how the stats collector's
count is affected by combining VACUUM/ANALYZE table-wide counts with
the incremental deltas reported by transactions happening concurrently
with VACUUM/ANALYZE.  Would changing this behavior make that better,
or worse, or about the same?
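
(For concreteness, the two counts in question -- table name made up:)

    -- the planner's stored estimate, set by VACUUM/ANALYZE
    SELECT reltuples FROM pg_class WHERE relname = 'items';
    -- the stats collector's running count, nudged by concurrent activity
    SELECT n_live_tup, n_dead_tup
      FROM pg_stat_user_tables WHERE relname = 'items';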

            regards, tom lane



Re: PROC_IN_ANALYZE stillborn 13 years ago

From
Simon Riggs
Date:
On Thu, 6 Aug 2020 at 22:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> ... how
> important is stability to ANALYZE? If you *either* retake your MVCC
> snapshots periodically as you re-scan the table *or* use a non-MVCC
> snapshot for the scan, you can get those same kinds of artifacts: you
> might see two copies of a just-updated row, or none. Maybe this would
> actually *break* something - e.g. could there be code that would get
> confused if we sample multiple rows for the same value in a column
> that has a UNIQUE index? But I think mostly the consequences would be
> that you might get somewhat different results from the statistics.

Yeah, that's an excellent point.  I can imagine somebody complaining
"this query clearly matches a unique index, why is the planner estimating
multiple rows out?".  But most of the time it wouldn't matter much.
(And I think you can get cases like that anyway today.)

> It's not clear to me that it would even be correct to categorize those
> somewhat-different results as "less accurate."

Estimating two rows where the correct answer is one row is clearly
"less accurate".  But I suspect you'd have to be quite unlucky to
get such a result in practice from Simon's proposal, as long as we
weren't super-aggressive about changing ANALYZE's snapshot a lot.

Seems like we're agreed we can use more than one snapshot, the only discussion is "how many?"

The more you take, the more weirdness you will see, so adopting an approach of one-snapshot-per-row seems like the worst case for accuracy, even if it does make ANALYZE faster.

(If we do want to speed up ANALYZE, we should use the system block sampling approach, but the argument against that is the reduced independence between sampled rows.)

Keeping the discussion on reducing the impact of Bernoulli-sampled ANALYZE, I was imagining we would take one snapshot for each block of rows sized by the default statistics_target, so that default behavior would be unaffected. Larger settings would be chunked according to the default, so stats_target=10000 (the max) would take 10000/100 = 100 snapshots. That approach allows people to vary the chunking using an existing parameter if needed.
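
(For scale, the knob this would piggyback on is the existing per-column
statistics target -- illustrative, table and column made up:)

    ALTER TABLE items ALTER COLUMN i SET STATISTICS 10000;  -- current maximum
    ANALYZE items;  -- under this proposal: 10000/100 = 100 snapshots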

--
Simon Riggs                http://www.2ndQuadrant.com/
Mission Critical Databases

Re: PROC_IN_ANALYZE stillborn 13 years ago

From
Robert Haas
Date:
On Thu, Aug 6, 2020 at 5:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > It's not clear to me that it would even be correct to categorize those
> > somewhat-different results as "less accurate."
>
> Estimating two rows where the correct answer is one row is clearly
> "less accurate" [ than estimating one row ].

That's a tautology, so I can't argue with it as far as it goes.
Thinking about it more, there are really two ways to think about an
estimated row count.

On the one hand, if you think of the row count estimate as the number
of rows that are going to pop out of a node, then it's always right to
think of a unique index as limiting the number of occurrences of a
given value to 1. But, if you think of the row count estimate as a way
of estimating the amount of work that the node has to do to produce
that output, then it isn't.

If a table has a lot of inserts and deletes, or a lot of updates,
index scans might have to do a lot of extra work chasing down index
pointers to tuples that end up being invisible to our scan. The scan
may not have any filter quals at all, and even if it does, they are
likely cheap to evaluate compared to the cost of finding a locking
buffers and checking visibility, so the dominant cost of the scan is
really based on the total number of rows that are present, not the
number that are visible. Ideally, the presence of those rows would
affect the cost estimate for the node in a way very similar to
expecting to find more rows. At the same time, it doesn't work to just
bump up the row count estimate for the node, because then you'll think
more rows will be output, which might cause poor planning decisions at
higher levels.

It doesn't seem easy to get this 100% right. Tuple visibility can
change very quickly, much faster than the inter-ANALYZE interval. And
sometimes tuples can be pruned away very quickly, too, and the index
pointers may be opportunistically removed very quickly, too.
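
(A rough way to see the effect -- table made up; the dead tuples
inflate buffer traffic without changing the output row count:)

    SELECT n_live_tup, n_dead_tup
      FROM pg_stat_user_tables WHERE relname = 'orders';
    EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE id = 42;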

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PROC_IN_ANALYZE stillborn 13 years ago

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Thinking about it more, there are really two ways to think about an
> estimated row count.

> On the one hand, if you think of the row count estimate as the number
> of rows that are going to pop out of a node, then it's always right to
> think of a unique index as limiting the number of occurrences of a
> given value to 1. But, if you think of the row count estimate as a way
> of estimating the amount of work that the node has to do to produce
> that output, then it isn't.

The planner intends its row counts to be interpreted in the first way.
We do have a rather indirect way of accounting for the cost of scanning
dead tuples and such, which is that we scale scanning costs according
to the measured physical size of the relation.  That works better for
I/O costs than it does for CPU costs, but it's not completely useless
for the latter.  In any case, we'd certainly not want to increase the
scan's row count estimate for that, because that would falsely inflate
our estimate of how much work upper plan levels have to do.  Whatever
happens at the scan level, the upper levels aren't going to see those
dead tuples.
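
(That indirect accounting keys off the measured size, which you can
inspect directly -- table name made up:)

    SELECT relpages, reltuples FROM pg_class WHERE relname = 'orders';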

            regards, tom lane



Re: PROC_IN_ANALYZE stillborn 13 years ago

From
Alvaro Herrera
Date:
On 2020-Aug-05, Andres Freund wrote:

> I'm mildly against that, because I'd really like to start making use of
> the flag. Not so much for cancellations, but to avoid the drastic impact
> analyze has on bloat.  In OLTP workloads with big tables, and without
> disabled cost limiting for analyze (or slow IO), the snapshot that
> analyze holds is often by far the transaction with the oldest xmin.

I pushed despite the objection because it seemed that downstream
discussion was largely favorable to the change, and there's a different
proposal to solve the bloat problem for analyze; and also:

> Only mildly against because it'd not be hard to reintroduce once we need
> it.

Thanks for the discussion!

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PROC_IN_ANALYZE stillborn 13 years ago

From
Andres Freund
Date:
Hi,

On 2020-08-06 18:02:26 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > In fact, conceptually using something like a new snapshot for each
> > sample tuple actually seems like it'd be somewhat of an improvement
> > over using a single snapshot.
> 
> Dunno, that feels like a fairly bad idea to me.  It seems like it would
> overemphasize the behavior of whatever queries happened to be running
> concurrently with the ANALYZE.  I do follow the argument that using a
> single snapshot for the whole ANALYZE overemphasizes a single instant
> in time, but I don't think that leads to the conclusion that we shouldn't
> use a snapshot at all.

I didn't actually want to suggest that we should take a separate
snapshot for every sampled row - that'd be excessively costly. What I
wanted to say was that I don't see a clear accuracy benefit to the
single snapshot. E.g. not seeing any of the values inserted more
recently will under-emphasize those in the histogram.

What precisely do you mean by "overemphasize" above? I mean, those will
be the rows most likely to be live after the analyze is done, so
including them doesn't seem like a bad thing to me?


> Another angle that would be worth considering, aside from the issue
> of whether the sample used for pg_statistic becomes more or less
> representative, is what impact all this would have on the tuple count
> estimates that go to the stats collector and pg_class.reltuples.
> Right now, we don't have a great story at all on how the stats collector's
> count is affected by combining VACUUM/ANALYZE table-wide counts with
> the incremental deltas reported by transactions happening concurrently
> with VACUUM/ANALYZE.  Would changing this behavior make that better,
> or worse, or about the same?

Hm. Vacuum already counts rows that are inserted concurrently with the
vacuum scan, if it encounters them. Analyze doesn't. Seems like we'd at
least be wrong in a more consistent manner than before...

IIUC both analyze and vacuum will overwrite concurrent changes to
n_live_tuples. So taking concurrently committed changes into account
seems like it'd be the right thing?

We probably could make this more accurate by accounting separately for
"recently inserted and committed" rows, and taking the difference of
n_live_tuples before/after into account.  But I'm a bit doubtful that
it's worth it?

Greetings,

Andres Freund



Re: PROC_IN_ANALYZE stillborn 13 years ago

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I pushed despite the objection because it seemed that downstream
> discussion was largely favorable to the change, and there's a different
> proposal to solve the bloat problem for analyze; and also:

Note that this quasi-related patch has pretty thoroughly hijacked
the CF entry for James' original docs patch proposal.  The cfbot
thinks that that's the latest patch in the original thread, and
unsurprisingly is failing to apply it.

Since the discussion was all over the place, I'm not sure whether
there's still a live docs patch proposal or not; but if so, somebody
should repost that patch (and go back to the original thread title).

            regards, tom lane



Re: [DOC] Document concurrent index builds waiting on each other

From
James Coleman
Date:
On Fri, Jul 31, 2020 at 2:51 PM James Coleman <jtc331@gmail.com> wrote:
>
> On Thu, Jul 16, 2020 at 7:34 PM David Johnston
> <david.g.johnston@gmail.com> wrote:
> >
> > The following review has been posted through the commitfest application:
> > make installcheck-world:  not tested
> > Implements feature:       not tested
> > Spec compliant:           not tested
> > Documentation:            tested, passed
> >
> > James,
> >
> > I'm on board with the point of pointing out explicitly the "concurrent index builds on multiple tables at the same
> > time will not return on any one table until all have completed", with back-patching.  I do not believe the new
> > paragraph is necessary though.  I'd suggest trying to weave it into the existing paragraph ending "Even then,
> > however, the index may not be immediately usable for queries: in the worst case, it cannot be used as long as
> > transactions exist that predate the start of the index build."  Adding "Notably, " in front of the existing
> > sentence fragment above and tacking it onto the end probably suffices.
>
> I'm not sure "the index may not be immediately usable for queries" is
> really accurate/sufficient: it seems to imply the CREATE INDEX has
> returned but for some reason the index isn't yet valid. The issue I'm
> trying to describe here is that the CREATE INDEX query itself will not
> return until all preceding queries have completed *including*
> concurrent index creations on unrelated tables.
>
> > I don't actually know whether this is true behavior though.  Is it something our tests do, or could, demonstrate?
>
> It'd take tests that exercise parallelism, but it's pretty simple to
> demonstrate (but you do have to catch the first index build in a scan
> phase, so you either need lots of data or a hack). Here's an example
> that uses a bit of a hack to simulate a slow scan phase:
>
> Setup:
> create table items(i int);
> create table others(i int);
> create function slow_expr() returns text as $$ select pg_sleep(15);
> select '5'; $$ language sql immutable;
> insert into items(i) values (1), (2);
> insert into others(i) values (1), (2);
>
> Then the following in order:
> 1. In session A: create index concurrently on items((i::text || slow_expr()));
> 2. In session B (at the same time): create index concurrently on others(i);
>
> You'll notice that the 2nd command, which should be practically
> instantaneous, waits on the first ~30s scan phase of (1) before it
> returns. The same is true if after (2) completes you immediately run
> it again -- it waits on the second ~30s scan phase of (1).
>
> That does reveal a bit of complexity though that the current
> patch doesn't address, which is that this can be phase dependent (and
> that complexity gets a lot more non-obvious when there's real live
> activity (particularly long-running transactions) in the system as
> well).
>
> I've attached a new patch series with two items:
> 1. A simpler (and I believe more correct) doc changes for "cic blocks
> cic on other tables".
> 2. A patch to document that all index builds can prevent tuples from
> being vacuumed away on other tables.
>
> If it's preferable we could commit the first and discuss the second
> separately, but since that limitation was also discussed up-thread, I
> decided to include them both here for now.

Álvaro's patch confused the current state of this thread, so I'm
reattaching (rebased) v2 as v3.

James

Attachments

Re: PROC_IN_ANALYZE stillborn 13 years ago

From
James Coleman
Date:
On Sat, Aug 29, 2020 at 8:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > I pushed despite the objection because it seemed that downstream
> > discussion was largely favorable to the change, and there's a different
> > proposal to solve the bloat problem for analyze; and also:
>
> Note that this quasi-related patch has pretty thoroughly hijacked
> the CF entry for James' original docs patch proposal.  The cfbot
> thinks that that's the latest patch in the original thread, and
> unsurprisingly is failing to apply it.
>
> Since the discussion was all over the place, I'm not sure whether
> there's still a live docs patch proposal or not; but if so, somebody
> should repost that patch (and go back to the original thread title).

I replied to the original email thread with reposted patches.

James



Re: [DOC] Document concurrent index builds waiting on each other

From
Michael Paquier
Date:
On Tue, Sep 08, 2020 at 01:25:21PM -0400, James Coleman wrote:
> Álvaro's patch confused the current state of this thread, so I'm
> reattaching (rebased) v2 as v3.

+  <para>
+   <command>CREATE INDEX</command> (including the <literal>CONCURRENTLY</literal>
+   option) commands are included when <command>VACUUM</command> calculates what
+   dead tuples are safe to remove even on tables other than the one being indexed.
+  </para>
FWIW, this is true as well for REINDEX CONCURRENTLY because both use
the same code paths for index builds and validation, with basically
the same waiting phases.  But is CREATE INDEX the correct place for
that?  Wouldn't it be better to tell about such things on the VACUUM
doc?

0001 sounds fine to me.
--
Michael

Attachments

Re: [DOC] Document concurrent index builds waiting on each other

From
"David G. Johnston"
Date:
On Wed, Sep 30, 2020 at 2:10 AM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Sep 08, 2020 at 01:25:21PM -0400, James Coleman wrote:
> Álvaro's patch confused the current state of this thread, so I'm
> reattaching (rebased) v2 as v3.

+  <para>
+   <command>CREATE INDEX</command> (including the <literal>CONCURRENTLY</literal>
+   option) commands are included when <command>VACUUM</command> calculates what
+   dead tuples are safe to remove even on tables other than the one being indexed.
+  </para>
FWIW, this is true as well for REINDEX CONCURRENTLY because both use
the same code paths for index builds and validation, with basically
the same waiting phases.  But is CREATE INDEX the correct place for
that?  Wouldn't it be better to tell about such things on the VACUUM
doc?

0001 sounds fine to me.


v3-0002 needs a rebase over the create_index.sgml page due to the change of the nearby xref to link.  Attached as v4-0002 along with the original v3-0001.

I resisted the temptation to commit my word-smithing thoughts to the affected paragraph.  The word "phase" appearing out of nowhere struck me as a bit odd.  "Then finally the" feels like it is missing a couple of commas - or just drop the finally.  "then two table scans occur in separate transactions" reads better than "two more transactions" IMO.

For 0002 maybe focus on the fact that CREATE INDEX is a global concern even though it only names a single table in any one invocation.  As a consequence, while it is running, vacuum cannot bring the system's oldest xid more current than the oldest xid on any index-in-progress table (I don't know exactly how this works).  And, rehashing 0001, all concurrent indexing will finish at the same time.
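
(One way to watch that in a live system is the backend_xmin column --
query illustrative:)

    SELECT pid, backend_xmin, left(query, 60) AS query
      FROM pg_stat_activity
     WHERE backend_xmin IS NOT NULL
     ORDER BY age(backend_xmin) DESC;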

In short maybe focus less on procedure and specific waiting states and more on the user-visible consequences.  0001 didn't really clear things up much in that regard.  It reads like we are introducing a deadlock situation even though that evidently is not the case.

I concur that vacuum's perspective on create index's global reach needs to be addressed there if it is not already.

<starts looking at vacuum>

I'm a bit confused as to why/whether create index transactions are somehow special in this regard, compared to other transactions.  I infer from the existence of 0002 that they somehow are...

My conclusion thus far is that with respect to the original complaint:

On 2019-09-18 13:51:00 -0400, James Coleman wrote:
> In my experience it's not immediately obvious (even after reading the
> documentation) the implications of how concurrent index builds manage
> transactions with respect to multiple concurrent index builds in
> flight at the same time.

These two limited scope patches have not materially moved the needle in understanding.  They are too technical when the underlying issue is comprehension by non-technical people in terms of how they see their system behave.

David J.

Re: [DOC] Document concurrent index builds waiting on each other

From
"David G. Johnston"
Date:
On Wed, Oct 21, 2020 at 3:25 PM David G. Johnston <david.g.johnston@gmail.com> wrote:

v3-0002 needs a rebase over the create_index.sgml page due to the change of the nearby xref to link.  Attached as v4-0002 along with the original v3-0001.


attached...

Reading the commit message on 0002 - vacuum isn't a transaction-taking command, so it wouldn't interfere with itself; create index does use transactions, and thus it's not surprising that it interferes with vacuum - which looks at transactions, not commands (as most of the internals would, I'd presume).

David J.

Attachments

Re: [DOC] Document concurrent index builds waiting on each other

From
Anastasia Lubennikova
Date:
Status update for a commitfest entry.

The commitfest is nearing the end and I wonder what this discussion is waiting for.
It looks like the proposed patch received its fair share of review, so I mark it as ReadyForCommitter and lay
responsibility for the final decision on them.

The new status of this patch is: Ready for Committer

Re: [DOC] Document concurrent index builds waiting on each other

From
Alvaro Herrera
Date:
On 2020-Nov-30, Anastasia Lubennikova wrote:

> The commitfest is nearing the end and I wonder what this discussion is waiting for.
> It looks like the proposed patch received its fair share of review, so
> I mark it as ReadyForCommitter and lay responsibility for the final
> decision on them.

I'll get these pushed now, thanks for the reminder.



Re: [DOC] Document concurrent index builds waiting on each other

From
Alvaro Herrera
Date:
On 2020-Sep-30, Michael Paquier wrote:

> +  <para>
> +   <command>CREATE INDEX</command> (including the <literal>CONCURRENTLY</literal>
> +   option) commands are included when <command>VACUUM</command> calculates what
> +   dead tuples are safe to remove even on tables other than the one being indexed.
> +  </para>
> FWIW, this is true as well for REINDEX CONCURRENTLY because both use
> the same code paths for index builds and validation, with basically
> the same waiting phases.  But is CREATE INDEX the correct place for
> that?  Wouldn't it be better to tell about such things on the VACUUM
> doc?

Yeah, I think it might be more sensible to document this in
maintenance.sgml, as part of the paragraph that discusses removing
tuples "to save space".  But making it inline with the rest of the flow,
it seems to distract from higher-level considerations, so I suggest to
make it a footnote instead.

I'm not sure on the wording to use; what about this?


Attachments

Re: [DOC] Document concurrent index builds waiting on each other

From
James Coleman
Date:
On Mon, Nov 30, 2020 at 4:53 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2020-Sep-30, Michael Paquier wrote:
>
> > +  <para>
> > +   <command>CREATE INDEX</command> (including the <literal>CONCURRENTLY</literal>
> > +   option) commands are included when <command>VACUUM</command> calculates what
> > +   dead tuples are safe to remove even on tables other than the one being indexed.
> > +  </para>
> > FWIW, this is true as well for REINDEX CONCURRENTLY because both use
> > the same code paths for index builds and validation, with basically
> > the same waiting phases.  But is CREATE INDEX the correct place for
> > that?  Wouldn't it be better to tell about such things on the VACUUM
> > doc?
>
> Yeah, I think it might be more sensible to document this in
> maintenance.sgml, as part of the paragraph that discusses removing
> tuples "to save space".  But making it inline with the rest of the flow,
> it seems to distract from higher-level considerations, so I suggest to
> make it a footnote instead.

I have mixed feelings about wholesale moving it; users aren't likely
to read the vacuum doc when considering how running CIC might impact
their system, though I do understand why it otherwise fits there. Even
if the primary details are in the vacuum, I tend to think a reference
note (or link to the vacuum docs) in the create index docs would be
useful. The principle here is that 1.) vacuum is automatic/part of the
background of the system, not just something people trigger manually,
and 2.) we ought to document things where the user action triggering
the behavior is documented.

> I'm not sure on the wording to use; what about this?

The wording seems fine to me.

This is a replacement for what was 0002 earlier? And 0001 from earlier
still seems to be a useful standalone patch?

James



Re: [DOC] Document concurrent index builds waiting on each other

From
Alvaro Herrera
Date:
On 2020-Nov-30, James Coleman wrote:

> On Mon, Nov 30, 2020 at 4:53 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2020-Sep-30, Michael Paquier wrote:

> > Yeah, I think it might be more sensible to document this in
> > maintenance.sgml, as part of the paragraph that discusses removing
> > tuples "to save space".  But making it inline with the rest of the flow,
> > it seems to distract from higher-level considerations, so I suggest to
> > make it a footnote instead.
> 
> I have mixed feelings about wholesale moving it; users aren't likely
> to read the vacuum doc when considering how running CIC might impact
> their system, though I do understand why it otherwise fits there.

Makes sense.  ISTM that if we want to have a cautionary blurb in the CIC docs,
it should go in REINDEX CONCURRENTLY as well.

> > I'm not sure on the wording to use; what about this?
> 
> The wording seems fine to me.

Great, thanks.

> This is a replacement for what was 0002 earlier? And 0001 from earlier
> still seems to be a useful standalone patch?

0001 is the one that I got pushed yesterday, I think -- correct?
src/tools/git_changelog says:

Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Branch: master [58ebe967f] 2020-11-30 18:24:55 -0300
Branch: REL_13_STABLE [3fe0e7c3f] 2020-11-30 18:24:55 -0300
Branch: REL_12_STABLE [b2603f16a] 2020-11-30 18:24:55 -0300
Branch: REL_11_STABLE [ed9c9b033] 2020-11-30 18:24:55 -0300
Branch: REL_10_STABLE [d3bd36a63] 2020-11-30 18:24:55 -0300
Branch: REL9_6_STABLE [b3d33bf59] 2020-11-30 18:24:55 -0300
Branch: REL9_5_STABLE [968a537b4] 2020-11-30 18:24:55 -0300

    Document concurrent indexes waiting on each other
    
    Because regular CREATE INDEX commands are independent, and there's no
    logical data dependency, it's not immediately obvious that transactions
    held by concurrent index builds on one table will block the second phase
    of concurrent index creation on an unrelated table, so document this
    caveat.
    
    Backpatch this all the way back.  In branch master, mention that only
    some indexes are involved.
    
    Author: James Coleman <jtc331@gmail.com>
    Reviewed-by: David Johnston <david.g.johnston@gmail.com>
    Discussion: https://postgr.es/m/CAAaqYe994=PUrn8CJZ4UEo_S-FfRr_3ogERyhtdgHAb2WG_Ufg@mail.gmail.com




Re: [DOC] Document concurrent index builds waiting on each other

From
James Coleman
Date:
On Tue, Dec 1, 2020 at 6:51 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2020-Nov-30, James Coleman wrote:
>
> > On Mon, Nov 30, 2020 at 4:53 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > >
> > > On 2020-Sep-30, Michael Paquier wrote:
>
> > > Yeah, I think it might be more sensible to document this in
> > > maintenance.sgml, as part of the paragraph that discusses removing
> > > tuples "to save space".  But making it inline with the rest of the flow,
> > > it seems to distract from higher-level considerations, so I suggest to
> > > make it a footnote instead.
> >
> > I have mixed feelings about wholesale moving it; users aren't likely
> > to read the vacuum doc when considering how running CIC might impact
> > their system, though I do understand why it otherwise fits there.
>
> Makes sense.  ISTM that if we want to have a cautionary blurb in the CIC docs,
> it should go in REINDEX CONCURRENTLY as well.

Agreed. Or, alternatively, a blurb something like "Please note how CIC
interacts with VACUUM <link>...", and then the primary language in
maintenance.sgml. That would have the benefit of maintaining the core
language in only one place.

> > > I'm not sure on the wording to use; what about this?
> >
> > The wording seems fine to me.
>
> Great, thanks.
>
> > This is a replacement for what was 0002 earlier? And 0001 from earlier
> > still seems to be a useful standalone patch?
>
> 0001 is the one that I got pushed yesterday, I think -- correct?
> src/tools/git_changelog says:
>
> Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Branch: master [58ebe967f] 2020-11-30 18:24:55 -0300
> Branch: REL_13_STABLE [3fe0e7c3f] 2020-11-30 18:24:55 -0300
> Branch: REL_12_STABLE [b2603f16a] 2020-11-30 18:24:55 -0300
> Branch: REL_11_STABLE [ed9c9b033] 2020-11-30 18:24:55 -0300
> Branch: REL_10_STABLE [d3bd36a63] 2020-11-30 18:24:55 -0300
> Branch: REL9_6_STABLE [b3d33bf59] 2020-11-30 18:24:55 -0300
> Branch: REL9_5_STABLE [968a537b4] 2020-11-30 18:24:55 -0300
>
>     Document concurrent indexes waiting on each other
>
>     Because regular CREATE INDEX commands are independent, and there's no
>     logical data dependency, it's not immediately obvious that transactions
>     held by concurrent index builds on one table will block the second phase
>     of concurrent index creation on an unrelated table, so document this
>     caveat.
>
>     Backpatch this all the way back.  In branch master, mention that only
>     some indexes are involved.
>
>     Author: James Coleman <jtc331@gmail.com>
>     Reviewed-by: David Johnston <david.g.johnston@gmail.com>
>     Discussion: https://postgr.es/m/CAAaqYe994=PUrn8CJZ4UEo_S-FfRr_3ogERyhtdgHAb2WG_Ufg@mail.gmail.com

Ah, yes, somehow I'd missed that that had been pushed.

James



Re: [DOC] Document concurrent index builds waiting on each other

From
James Coleman
Date:
On Tue, Dec 1, 2020 at 8:05 PM James Coleman <jtc331@gmail.com> wrote:
>
> On Tue, Dec 1, 2020 at 6:51 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2020-Nov-30, James Coleman wrote:
> >
> > > On Mon, Nov 30, 2020 at 4:53 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > > >
> > > > On 2020-Sep-30, Michael Paquier wrote:
> >
> > > > Yeah, I think it might be more sensible to document this in
> > > > maintenance.sgml, as part of the paragraph that discusses removing
> > > > tuples "to save space".  But making it inline with the rest of the flow,
> > > > it seems to distract from higher-level considerations, so I suggest to
> > > > make it a footnote instead.
> > >
> > > I have mixed feelings about wholesale moving it; users aren't likely
> > > to read the vacuum doc when considering how running CIC might impact
> > > their system, though I do understand why it otherwise fits there.
> >
> > Makes sense.  ISTM that if we want to have a cautionary blurb in the CIC docs,
> > it should go in REINDEX CONCURRENTLY as well.
>
> Agreed. Or, alternatively, a blurb something like "Please note how CIC
> interacts with VACUUM <link>...", and then the primary language in
> maintenance.sgml. That would have the benefit of maintaining the core
> language in only one place.

Any thoughts on this?

James



Re: [DOC] Document concurrent index builds waiting on each other

From
Alvaro Herrera
Date:
On 2020-Dec-01, James Coleman wrote:

> On Tue, Dec 1, 2020 at 6:51 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> > Makes sense.  ISTM that if we want to have a cautionary blurb CIC docs,
> > it should go in REINDEX CONCURRENTLY as well.
> 
> Agreed. Or, alternatively, a blurb something like "Please note how CIC
> interacts with VACUUM <link>...", and then the primary language in
> maintenance.sgml. That would have the benefit of maintaining the core
> language in only one place.

I looked into this again, and I didn't like what I had added to
maintenance.sgml at all.  It seems out of place where I put it; and I
couldn't find any great spots.  Going back to your original proposal,
what about something like this?  It's just one more para in the "notes"
section in CREATE INDEX and REINDEX pages, without any additions to the
VACUUM pages.

-- 
Álvaro Herrera                            39°49'30"S 73°17'W

Attachments

Re: [DOC] Document concurrent index builds waiting on each other

From
Michael Paquier
Date:
On Tue, Jan 12, 2021 at 04:51:39PM -0300, Alvaro Herrera wrote:
> I looked into this again, and I didn't like what I had added to
> maintenance.sgml at all.  It seems out of place where I put it; and I
> couldn't find any great spots.  Going back to your original proposal,
> what about something like this?  It's just one more para in the "notes"
> section in CREATE INDEX and REINDEX pages, without any additions to the
> VACUUM pages.

+1.
--
Michael

Attachments

Re: [DOC] Document concurrent index builds waiting on each other

From
James Coleman
Date:
On Wed, Jan 13, 2021 at 12:58 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jan 12, 2021 at 04:51:39PM -0300, Alvaro Herrera wrote:
> > I looked into this again, and I didn't like what I had added to
> > maintenance.sgml at all.  It seems out of place where I put it; and I
> > couldn't find any great spots.  Going back to your original proposal,
> > what about something like this?  It's just one more para in the "notes"
> > section in CREATE INDEX and REINDEX pages, without any additions to the
> > VACUUM pages.
>
> +1.

I think one more para in the notes is good. But shouldn't we still
clarify the issue is specific to CONCURRENTLY?

Also that it's not just the table being indexed seems fairly significant.

How about something like:

---
Like any long-running transaction, <command>REINDEX CONCURRENTLY</command> can
affect which tuples can be removed by concurrent
<command>VACUUM</command> on any table.
---

James



Re: [DOC] Document concurrent index builds waiting on each other

From
Alvaro Herrera
Date:
On 2021-Jan-13, James Coleman wrote:

> On Wed, Jan 13, 2021 at 12:58 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Tue, Jan 12, 2021 at 04:51:39PM -0300, Alvaro Herrera wrote:
> > > I looked into this again, and I didn't like what I had added to
> > > maintenance.sgml at all.  It seems out of place where I put it; and I
> > > couldn't find any great spots.  Going back to your original proposal,
> > > what about something like this?  It's just one more para in the "notes"
> > > section in CREATE INDEX and REINDEX pages, without any additions to the
> > > VACUUM pages.
> >
> > +1.
> 
> I think one more para in the notes is good. But shouldn't we still
> clarify the issue is specific to CONCURRENTLY?

How is it specific to concurrent builds?  What we're documenting here is
the behavior of vacuum, and that one is identical in both regular builds
and concurrent builds (since vacuum has to avoid removing rows from
under either of them).  The only reason concurrent builds are
interesting is because they take longer.

What was specific to concurrent builds was the fact that you can't have
more than one at a time, and that one is what was added in 58ebe967f.
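
That cross-build wait is easy to observe while it's happening, by the
way: the later build sits waiting on the earlier one's virtual xid.
Something like this (illustrative) shows it:

    SELECT pid, wait_event_type, wait_event, left(query, 60) AS query
      FROM pg_stat_activity
     WHERE wait_event = 'virtualxid';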

> Also that it's not just the table being indexed seems fairly significant.

This is true.  So I propose

    Like any long-running transaction, <command>REINDEX</command> can
    affect which tuples can be removed by concurrent <command>VACUUM</command>
    on any table.

-- 
Álvaro Herrera                            39°49'30"S 73°17'W



Re: [DOC] Document concurrent index builds waiting on each other

From
James Coleman
Date:
On Wed, Jan 13, 2021 at 12:33 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Jan-13, James Coleman wrote:
>
> > On Wed, Jan 13, 2021 at 12:58 AM Michael Paquier <michael@paquier.xyz> wrote:
> > >
> > > On Tue, Jan 12, 2021 at 04:51:39PM -0300, Alvaro Herrera wrote:
> > > > I looked into this again, and I didn't like what I had added to
> > > > maintenance.sgml at all.  It seems out of place where I put it; and I
> > > > couldn't find any great spots.  Going back to your original proposal,
> > > > what about something like this?  It's just one more para in the "notes"
> > > > section in CREATE INDEX and REINDEX pages, without any additions to the
> > > > VACUUM pages.
> > >
> > > +1.
> >
> > I think one more para in the notes is good. But shouldn't we still
> > clarify the issue is specific to CONCURRENTLY?
>
> How is it specific to concurrent builds?  What we're documenting here is
> the behavior of vacuum, and that one is identical in both regular builds
> and concurrent builds (since vacuum has to avoid removing rows from
> under either of them).  The only reason concurrent builds are
> interesting is because they take longer.
>
> What was specific to concurrent builds was the fact that you can't have
> more than one at a time, and that one is what was added in 58ebe967f.

Ah, right. I've mixed those up at least once on this thread already.

> > Also that it's not just the table being indexed seems fairly significant.
>
> This is true.  So I propose
>
>     Like any long-running transaction, <command>REINDEX</command> can
>     affect which tuples can be removed by concurrent <command>VACUUM</command>
>     on any table.

That sounds good to me.

James



Re: [DOC] Document concurrent index builds waiting on each other

From
Alvaro Herrera
Date:
On 2021-Jan-13, James Coleman wrote:

> On Wed, Jan 13, 2021 at 12:33 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> > This is true.  So I propose
> >
> >     Like any long-running transaction, <command>REINDEX</command> can
> >     affect which tuples can be removed by concurrent <command>VACUUM</command>
> >     on any table.
> 
> That sounds good to me.

Great, pushed with one more wording tweak: "REINDEX on any table can
affect ... on any other table".  To pg12 and up.

I wondered about noting whether only processes in the current database
are affected, but then I noticed that the current code since commit
dc7420c2c927 uses a completely different algorithm than what we had with
GetOldestXmin() and does not consider database boundaries at all.
This doesn't sound great to me, since a misbehaved database can now
affect others ...  Maybe I misunderstand that code.
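
(For anyone wanting to eyeball the horizons involved, pg_stat_activity
exposes them per backend -- illustrative query:)

    SELECT datname, max(age(backend_xmin)) AS oldest_xmin_age
      FROM pg_stat_activity
     WHERE backend_xmin IS NOT NULL
     GROUP BY datname;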

-- 
Álvaro Herrera                            39°49'30"S 73°17'W
"This is what I like so much about PostgreSQL.  Most of the surprises
are of the "oh wow!  That's cool" Not the "oh shit!" kind.  :)"
Scott Marlowe, http://archives.postgresql.org/pgsql-admin/2008-10/msg00152.php



Re: [DOC] Document concurrent index builds waiting on each other

From
James Coleman
Date:
On Wed, Jan 13, 2021 at 4:05 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Jan-13, James Coleman wrote:
>
> > On Wed, Jan 13, 2021 at 12:33 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> > > This is true.  So I propose
> > >
> > >     Like any long-running transaction, <command>REINDEX</command> can
> > >     affect which tuples can be removed by concurrent <command>VACUUM</command>
> > >     on any table.
> >
> > That sounds good to me.
>
> Great, pushed with one more wording tweak: "REINDEX on any table can
> affect ... on any other table".  To pg12 and up.

Looks like what got committed is "REINDEX on a table" not "on any",
but I'm not sure that matters too much.

James



Re: [DOC] Document concurrent index builds waiting on each other

From
Alvaro Herrera
Date:
On 2021-Jan-13, Alvaro Herrera wrote:

> I wondered about noting whether only processes in the current database
> are affected, but then I noticed that the current code since commit
> dc7420c2c927 uses a completely different algorithm than what we had with
> GetOldestXmin() and does not consider database boundaries at all.
> This doesn't sound great to me, since a misbehaved database can now
> affect others ...  Maybe I misunderstand that code.

This appears to be false, per ComputeXidHorizons.

-- 
Álvaro Herrera                            39°49'30"S 73°17'W
"Ni aún el genio muy grande llegaría muy lejos
si tuviera que sacarlo todo de su propio interior" (Goethe)



Re: [DOC] Document concurrent index builds waiting on each other

From
Alvaro Herrera
Date:
On 2021-Jan-13, James Coleman wrote:

> On Wed, Jan 13, 2021 at 4:05 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2021-Jan-13, James Coleman wrote:
> >
> > > On Wed, Jan 13, 2021 at 12:33 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > > > This is true.  So I propose
> > > >
> > > >     Like any long-running transaction, <command>REINDEX</command> can
> > > >     affect which tuples can be removed by concurrent <command>VACUUM</command>
> > > >     on any table.
> > >
> > > That sounds good to me.
> >
> > Great, pushed with one more wording tweak: "REINDEX on any table can
> > affect ... on any other table".  To pg12 and up.
> 
> Looks like what got committed is "REINDEX on a table" not "on any",
> but I'm not sure that matters too much.

Ouch.  The difference seems slight enough that it doesn't matter; is it
ungrammatical?

Either way I'm gonna close this CF entry now, finally.  Thank you for
your patience!

-- 
Álvaro Herrera       Valdivia, Chile
"I can't go to a restaurant and order food because I keep looking at the
fonts on the menu.  Five minutes later I realize that it's also talking
about food" (Donald Knuth)



Re: [DOC] Document concurrent index builds waiting on each other

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2021-Jan-13, James Coleman wrote:
>>>> This is true.  So I propose
>>>> Like any long-running transaction, <command>REINDEX</command> can
>>>> affect which tuples can be removed by concurrent <command>VACUUM</command>
>>>> on any table.

>> Looks like what got committed is "REINDEX on a table" not "on any",
>> but I'm not sure that matters too much.

> Ouch.  The difference seems slight enough that it doesn't matter; is it
> ungrammatical?

I'd personally have written "on other tables" or "on another table",
or left out that clause altogether and just said "concurrent
<command>VACUUM</command>".  I'm not sure it's ungrammatical exactly,
but the antecedent of "a table" is a bit unclear; people might
wonder if it means the table being reindexed.

            regards, tom lane



Re: [DOC] Document concurrent index builds waiting on each other

From
James Coleman
Date:
On Wed, Jan 13, 2021 at 4:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > On 2021-Jan-13, James Coleman wrote:
> >>>> This is true.  So I propose
> >>>> Like any long-running transaction, <command>REINDEX</command> can
> >>>> affect which tuples can be removed by concurrent <command>VACUUM</command>
> >>>> on any table.
>
> >> Looks like what got committed is "REINDEX on a table" not "on any",
> >> but I'm not sure that matters too much.
>
> > Ouch.  The difference seems slight enough that it doesn't matter; is it
> > ungrammatical?
>
> I'd personally have written "on other tables" or "on another table",
> or left out that clause altogether and just said "concurrent
> <command>VACUUM</command>".  I'm not sure it's ungrammatical exactly,
> but the antecedent of "a table" is a bit unclear; people might
> wonder if it means the table being reindexed.

It does mean the table being reindexed; the last phrase says "any
table" meaning "any other table".

James



Re: [DOC] Document concurrent index builds waiting on each other

From
Tom Lane
Date:
James Coleman <jtc331@gmail.com> writes:
> On Wed, Jan 13, 2021 at 4:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> but the antecedent of "a table" is a bit unclear; people might
>> wonder if it means the table being reindexed.

> It does mean the table being reindexed; the last phrase says "any
> table" meaning "any other table".

[ raised eyebrow ]  Surely REINDEX and VACUUM can't run on the same
table at the same time.

            regards, tom lane



Re: [DOC] Document concurrent index builds waiting on each other

From
James Coleman
Date:
On Wed, Jan 13, 2021 at 5:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> James Coleman <jtc331@gmail.com> writes:
> > On Wed, Jan 13, 2021 at 4:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> but the antecedent of "a table" is a bit unclear; people might
> >> wonder if it means the table being reindexed.
>
> > It does mean the table being reindexed; the last phrase says "any
> > table" meaning "any other table".
>
> [ raised eyebrow ]  Surely REINDEX and VACUUM can't run on the same
> table at the same time.

+ Like any long-running transaction, <command>CREATE INDEX</command> on a
+ table can affect which tuples can be removed by concurrent
+ <command>VACUUM</command> on any other table.

The "on a table" is the table on which the REINDEX/CREATE INDEX is
occurring. The "any other table" is where VACUUM might run.

James



Re: [DOC] Document concurrent index builds waiting on each other

From
Tom Lane
Date:
James Coleman <jtc331@gmail.com> writes:
> On Wed, Jan 13, 2021 at 5:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> [ raised eyebrow ]  Surely REINDEX and VACUUM can't run on the same
>> table at the same time.

> + Like any long-running transaction, <command>CREATE INDEX</command> on a
> + table can affect which tuples can be removed by concurrent
> + <command>VACUUM</command> on any other table.

> The "on a table" is the table on which the REINDEX/CREATE INDEX is
> occurring. The "any other table" is where VACUUM might run.

I still think it'd be just as clear without the auxiliary clauses,
say

+ Like any long-running transaction, <command>CREATE INDEX</command>
+ can affect which tuples can be removed by concurrent
+ <command>VACUUM</command> operations.

            regards, tom lane