Обсуждение: pgsql: Dissociate btequalimage() from interval_ops, ending its deduplic

Поиск
Список
Период
Сортировка

pgsql: Dissociate btequalimage() from interval_ops, ending its deduplic

От
Noah Misch
Дата:
Dissociate btequalimage() from interval_ops, ending its deduplication.

Under interval_ops, some equal values are distinguishable.  One such
pair is '24:00:00' and '1 day'.  With that being so, btequalimage()
breaches the documented contract for the "equalimage" btree support
function.  This can cause incorrect results from index-only scans.
Users should REINDEX any btree indexes having interval-type columns.
After updating, pg_amcheck will report an error for almost all such
indexes.  This fix makes interval_ops simply omit the support function,
like numeric_ops does.  Back-pack to v13, where btequalimage() first
appeared.  In back branches, for the benefit of old catalog content,
btequalimage() code will return false for type "interval".  Going
forward, back-branch initdb will include the catalog change.

Reviewed by Peter Geoghegan.

Discussion: https://postgr.es/m/20231011013317.22.nmisch@google.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/5f27b5f848a433ba54c521ccb889788b8f4d6ba7

Modified Files
--------------
contrib/amcheck/verify_nbtree.c          | 13 ++++++++++++-
src/include/catalog/catversion.h         |  2 +-
src/include/catalog/pg_amproc.dat        |  2 --
src/include/catalog/pg_opfamily.dat      |  2 +-
src/test/regress/expected/opr_sanity.out |  3 ++-
5 files changed, 16 insertions(+), 6 deletions(-)


Re: pgsql: Dissociate btequalimage() from interval_ops, ending its deduplic

От
Tom Lane
Дата:
Noah Misch <noah@leadboat.com> writes:
> ... This fix makes interval_ops simply omit the support function,
> like numeric_ops does.  Back-pack to v13, where btequalimage() first
> appeared.  In back branches, for the benefit of old catalog content,
> btequalimage() code will return false for type "interval".  Going
> forward, back-branch initdb will include the catalog change.

Hmm, I'm not sure that that last is a good idea.  The upshot of this
(because of the opr_sanity.out change) is that "make installcheck"
will fail against existing back-branch installations.  That seems
more likely to cause problems/confusion than the alternative of just
depending on the code change.

            regards, tom lane



Re: pgsql: Dissociate btequalimage() from interval_ops, ending its deduplic

От
Noah Misch
Дата:
On Sat, Oct 14, 2023 at 07:45:21PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > ... This fix makes interval_ops simply omit the support function,
> > like numeric_ops does.  Back-pack to v13, where btequalimage() first
> > appeared.  In back branches, for the benefit of old catalog content,
> > btequalimage() code will return false for type "interval".  Going
> > forward, back-branch initdb will include the catalog change.
> 
> Hmm, I'm not sure that that last is a good idea.  The upshot of this
> (because of the opr_sanity.out change) is that "make installcheck"
> will fail against existing back-branch installations.  That seems
> more likely to cause problems/confusion than the alternative of just
> depending on the code change.

I'm way more worried about amcheck failing on all those indexes than I am
about someone who needs to tweak their installcheck rig.  I accepted the
former as the least-bad option, but if any area needs more thought, I feel
that's the area.



Re: pgsql: Dissociate btequalimage() from interval_ops, ending its deduplic

От
Tom Lane
Дата:
Noah Misch <noah@leadboat.com> writes:
> On Sat, Oct 14, 2023 at 07:45:21PM -0400, Tom Lane wrote:
>> Hmm, I'm not sure that that last is a good idea.  The upshot of this
>> (because of the opr_sanity.out change) is that "make installcheck"
>> will fail against existing back-branch installations.  That seems
>> more likely to cause problems/confusion than the alternative of just
>> depending on the code change.

> I'm way more worried about amcheck failing on all those indexes than I am
> about someone who needs to tweak their installcheck rig.

Well, that's indeed going to be a pain for affected people, but
it doesn't seem like a reason to also break installcheck.

            regards, tom lane



Re: pgsql: Dissociate btequalimage() from interval_ops, ending its deduplic

От
Noah Misch
Дата:
On Sat, Oct 14, 2023 at 08:27:58PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Sat, Oct 14, 2023 at 07:45:21PM -0400, Tom Lane wrote:
> >> Hmm, I'm not sure that that last is a good idea.  The upshot of this
> >> (because of the opr_sanity.out change) is that "make installcheck"
> >> will fail against existing back-branch installations.  That seems
> >> more likely to cause problems/confusion than the alternative of just
> >> depending on the code change.
> 
> > I'm way more worried about amcheck failing on all those indexes than I am
> > about someone who needs to tweak their installcheck rig.
> 
> Well, that's indeed going to be a pain for affected people, but
> it doesn't seem like a reason to also break installcheck.

That's right.  We don't have a standard that installcheck of v13.N will have
zero diffs on an initdb from v13.0.  Zero diffs would be help a tiny user
population, and corrected catalog entries will help a different population.  I
don't think either choice does enough harm to reopen the decision.  We've got
bigger fish to fry.



Re: pgsql: Dissociate btequalimage() from interval_ops, ending its deduplic

От
Tom Lane
Дата:
Noah Misch <noah@leadboat.com> writes:
> On Sat, Oct 14, 2023 at 08:27:58PM -0400, Tom Lane wrote:
>> Well, that's indeed going to be a pain for affected people, but
>> it doesn't seem like a reason to also break installcheck.

> That's right.  We don't have a standard that installcheck of v13.N will have
> zero diffs on an initdb from v13.0.

Um ... don't we?  I do not recall very many cases where we changed
initial catalog contents at all in a point release, and I don't think
any of those cases intentionally created regression diffs.

> We've got bigger fish to fry.

True, this isn't going to affect many people either way.  But
I don't think you've made a good precedent here.

            regards, tom lane



Re: pgsql: Dissociate btequalimage() from interval_ops, ending its deduplic

От
Peter Geoghegan
Дата:
On Sat, Oct 14, 2023 at 7:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Noah Misch <noah@leadboat.com> writes:
> > That's right.  We don't have a standard that installcheck of v13.N will have
> > zero diffs on an initdb from v13.0.
>
> Um ... don't we?  I do not recall very many cases where we changed
> initial catalog contents at all in a point release, and I don't think
> any of those cases intentionally created regression diffs.

This did come up in review. I deferred to Noah on the question. FWIW
if I had authored this bugfix, it wouldn't have touched catalog
contents on the backbranches.

> > We've got bigger fish to fry.
>
> True, this isn't going to affect many people either way.  But
> I don't think you've made a good precedent here.

I tend to prefer whatever approach seems least reliant on my having an
accurate understanding of what guarantees exist, and/or what people
actually rely on. It seems to me that there is an asymmetry here: it
is virtually certain that superfluous catalog entries won't bother
anybody, but it is less certain that allowing initdb to produce
different catalog contents in a stable branch is completely harmless.
Why even take a tiny risk if it can be avoided? For that matter why
even bother trying to quantify a risk like that if it can just be
avoided?

--
Peter Geoghegan



Re: pgsql: Dissociate btequalimage() from interval_ops, ending its deduplic

От
Tom Lane
Дата:
Peter Geoghegan <pg@bowt.ie> writes:
> On Sat, Oct 14, 2023 at 7:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Noah Misch <noah@leadboat.com> writes:
>>> That's right.  We don't have a standard that installcheck of v13.N will have
>>> zero diffs on an initdb from v13.0.

>> Um ... don't we?  I do not recall very many cases where we changed
>> initial catalog contents at all in a point release, and I don't think
>> any of those cases intentionally created regression diffs.

> This did come up in review. I deferred to Noah on the question. FWIW
> if I had authored this bugfix, it wouldn't have touched catalog
> contents on the backbranches.

To research this, I looked at every post-dot-zero commit that touched
src/include/catalog, going back to the 7.4 branch (~ 2003).  While
there are a lot that adjust internal C function declarations or the
like, there are darn few that touch the catalog data.  Ignoring a few
that just adjusted pg_description entries, I found

283262cd9 et al
    Fix bogus provolatile/proparallel markings on a few built-in functions

aa7e04cb5 et al
    Clean up some lack-of-STRICT issues in the core code, too

6e0a053a9
    Correct volatility markings of a few json functions

cb651b624 et al
    Fix incorrect pg_proc.proallargtypes entries for two built-in functions

0b11a1525 et al
    Mark to_number() and the numeric-type variants of to_char() as stable, not

aed597102
    anyarray really needs to be declared with typalign = 'd', so that entries
    in pg_statistic are correctly aligned if they contain values that require
    double alignment.  Too bad we cannot force initdb for this in 7.4 branch.

None of these commits had any side-effects on regression test cases,
and the log entry for cb651b624 explicitly notes that we didn't add
the same regression test case as in HEAD for fear that it'd fail with
older catalog contents.

So I beg to differ with Noah: I think we *do* have a policy against
doing this.  At least, we've not actually done it in the last twenty
years.

            regards, tom lane



Re: pgsql: Dissociate btequalimage() from interval_ops, ending its deduplic

От
Noah Misch
Дата:
On Sun, Oct 15, 2023 at 05:12:58PM -0400, Tom Lane wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
> > On Sat, Oct 14, 2023 at 7:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Noah Misch <noah@leadboat.com> writes:
> >>> That's right.  We don't have a standard that installcheck of v13.N will have
> >>> zero diffs on an initdb from v13.0.
> 
> >> Um ... don't we?  I do not recall very many cases where we changed
> >> initial catalog contents at all in a point release, and I don't think
> >> any of those cases intentionally created regression diffs.
> 
> > This did come up in review. I deferred to Noah on the question. FWIW
> > if I had authored this bugfix, it wouldn't have touched catalog
> > contents on the backbranches.
> 
> To research this, I looked at every post-dot-zero commit that touched
> src/include/catalog, going back to the 7.4 branch (~ 2003).  While
> there are a lot that adjust internal C function declarations or the
> like, there are darn few that touch the catalog data.  Ignoring a few
> that just adjusted pg_description entries, I found
> 
> 283262cd9 et al
>     Fix bogus provolatile/proparallel markings on a few built-in functions
> 
> aa7e04cb5 et al
>     Clean up some lack-of-STRICT issues in the core code, too
> 
> 6e0a053a9
>     Correct volatility markings of a few json functions
> 
> cb651b624 et al
>     Fix incorrect pg_proc.proallargtypes entries for two built-in functions
> 
> 0b11a1525 et al
>     Mark to_number() and the numeric-type variants of to_char() as stable, not
> 
> aed597102
>     anyarray really needs to be declared with typalign = 'd', so that entries
>     in pg_statistic are correctly aligned if they contain values that require
>     double alignment.  Too bad we cannot force initdb for this in 7.4 branch.
> 
> None of these commits had any side-effects on regression test cases,
> and the log entry for cb651b624 explicitly notes that we didn't add
> the same regression test case as in HEAD for fear that it'd fail with
> older catalog contents.

Commit e568e1e changed pg_rewrite and associated rules.out lines.  It could
have avoided the testing effect by modifying rules.sql to pass with or without
the catalog mutation, e.g. by eliminating that output row with an additional
qual on the query.  Nobody brought that up in the six intervening years.