Обсуждение: [RFC] Removing "magic" oids

Поиск
Список
Период
Сортировка

[RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

In my opinion the current WITH OIDs system has numerous weaknesses:

1) The fact that oids are so magic means that if we get pluggable
   storage, the design of the potential pluggable systems is constrained
   and similar magic has to be present everywhere.

2) The fact that the oids in each table have the same counter to be
   based on means that oid wraparounds have much worse consequences
   performance wise than necessary. E.g. once the global counter has
   wrapped, all toast tables start to be significantly slower.

   It would be much better if most database objects had their own
   counters.

3) For some oid using objects (toast, large objects at the very least)
   it'd be quite worthwhile to switch to 8 byte ids.  Currently that's
   hard to do, because it'd break on-disk compatibility.

4) There's a lot of special case code around for dealing with oids.

5a) The fact that system table oids don't show up in selects by default
   makes it more work than necessary to look at catalogs.

5b) Similarly, it's fairly annoying when debugging not to trivially see
   oids for catalog structs.


I think we should drop WITH OIDs support.  pg_dump should convert WITH
OIDs tables into tables that have an explicit oid column (with an
appropriate default function), pg_upgrade should refuse to upgrade them.
We've defaulted WITH OIDs to off for quite a while now, so that's
hopefully not going to be too painful.

For catalog tables, I think we should just add explicit oid columns.
That obviously requires a fair amount of change, but it's not too bad.
One issue here is that we we want to support "manual" inserts both for
initdb, and for emergency work.


The attached *PROTOTYPE* *WIP* patch removes WITH OIDs support, converts
catalog table oids to explicit oid columns, and makes enough
infrastructure changes to make plain pg_regress pass (after the
necessary changes of course).  Only superficial psql and no pg_dump
changes.

There's plenty of issues with the prototype, but overall I'm pretty
pleased.


There's three major areas I'm not so sure about:

1) There's a few places dealing with system tables that don't deal with
   a hardcoded system table. Since there's no notion of "table has oid"
   and "which column is the oid column) anymore, we need some way to
   deal with that.  So far I've just extended existing objectaddress.c
   code to deal with that, but it's not that pretty.

2) We need to be able to manually insert into catalog tables. Both
   initdb and emergency surgery.  My current hack is doing so directly
   in nodeModifyTable.c but that's beyond ugly.  I think we should add
   an explicit DEFAULT clause to those columns with something like
   nextoid('tablename', 'name_of_index') or such.

3) The quickest way to deal with the bootstrap code was to just assign
   all oids for oid carrying tables that don't yet have any assigned.
   That doesn't generally seem terrible, although it's certainly badly
   implemented right now.  That'd mean we'd have three ranges of oids
   probably, unless we somehow have the bootstrap code advance the
   in-database oid counter to a right state before we start with
   post-bootstrap work.  I like the idea of all bootstrap time oids
   being determined by genbki.pl (I think that'll allow to remove some
   code too).


I do wonder about ripping the oid counter out entirely, and replacing it
with sequences. But that seems like a separate project.


I'll polish this up further in the not too far away future.  But I'd
really like to get some feedback before I sink more time into this.


Greetings,

Andres Freund

Вложения

Re: [RFC] Removing "magic" oids

От
David Fetter
Дата:
On Sat, Sep 29, 2018 at 08:48:10PM -0700, Andres Freund wrote:
> Hi,
> 
> In my opinion the current WITH OIDs system has numerous weaknesses:
> 
> 1) The fact that oids are so magic means that if we get pluggable
>    storage, the design of the potential pluggable systems is constrained
>    and similar magic has to be present everywhere.
> 
> 2) The fact that the oids in each table have the same counter to be
>    based on means that oid wraparounds have much worse consequences
>    performance wise than necessary. E.g. once the global counter has
>    wrapped, all toast tables start to be significantly slower.
> 
>    It would be much better if most database objects had their own
>    counters.
> 
> 3) For some oid using objects (toast, large objects at the very least)
>    it'd be quite worthwhile to switch to 8 byte ids.  Currently that's
>    hard to do, because it'd break on-disk compatibility.
> 
> 4) There's a lot of special case code around for dealing with oids.
> 
> 5a) The fact that system table oids don't show up in selects by default
>    makes it more work than necessary to look at catalogs.
> 
> 5b) Similarly, it's fairly annoying when debugging not to trivially see
>    oids for catalog structs.
> 
> 
> I think we should drop WITH OIDs support.

+1

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: [RFC] Removing "magic" oids

От
Andreas Karlsson
Дата:
On 09/30/2018 05:48 AM, Andres Freund wrote:> I think we should drop 
WITH OIDs support.

+1

> 2) We need to be able to manually insert into catalog tables. Both
>     initdb and emergency surgery.  My current hack is doing so directly
>     in nodeModifyTable.c but that's beyond ugly.  I think we should add
>     an explicit DEFAULT clause to those columns with something like
>     nextoid('tablename', 'name_of_index') or such.

Adding a function to get the next OID sounds like a good solution for 
both our catalog and legacy applications. The only potential 
disadvantage that I see is that this function becomes something we need 
to maintain if we ever decide to move from OIDs to sequences.

> 3) The quickest way to deal with the bootstrap code was to just assign
>     all oids for oid carrying tables that don't yet have any assigned.
>     That doesn't generally seem terrible, although it's certainly badly
>     implemented right now.  That'd mean we'd have three ranges of oids
>     probably, unless we somehow have the bootstrap code advance the
>     in-database oid counter to a right state before we start with
>     post-bootstrap work.  I like the idea of all bootstrap time oids
>     being determined by genbki.pl (I think that'll allow to remove some
>     code too).

Agreed, having genbki.pl determine all oids in the bootstrap data sounds 
ideal.

Andreas



Re: [RFC] Removing "magic" oids

От
Vik Fearing
Дата:
On 30/09/18 05:48, Andres Freund wrote:
> 5a) The fact that system table oids don't show up in selects by default
>    makes it more work than necessary to look at catalogs.

As a user, this is a big pain point for me.  +1 for getting rid of it.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-09-29 20:48:10 -0700, Andres Freund wrote:
> In my opinion the current WITH OIDs system has numerous weaknesses:
> 
> 1) The fact that oids are so magic means that if we get pluggable
>    storage, the design of the potential pluggable systems is constrained
>    and similar magic has to be present everywhere.
> 
> 2) The fact that the oids in each table have the same counter to be
>    based on means that oid wraparounds have much worse consequences
>    performance wise than necessary. E.g. once the global counter has
>    wrapped, all toast tables start to be significantly slower.
> 
>    It would be much better if most database objects had their own
>    counters.
> 
> 3) For some oid using objects (toast, large objects at the very least)
>    it'd be quite worthwhile to switch to 8 byte ids.  Currently that's
>    hard to do, because it'd break on-disk compatibility.
> 
> 4) There's a lot of special case code around for dealing with oids.
> 
> 5a) The fact that system table oids don't show up in selects by default
>    makes it more work than necessary to look at catalogs.
> 
> 5b) Similarly, it's fairly annoying when debugging not to trivially see
>    oids for catalog structs.
> 
> 
> I think we should drop WITH OIDs support.  pg_dump should convert WITH
> OIDs tables into tables that have an explicit oid column (with an
> appropriate default function), pg_upgrade should refuse to upgrade them.
> We've defaulted WITH OIDs to off for quite a while now, so that's
> hopefully not going to be too painful.
> 
> For catalog tables, I think we should just add explicit oid columns.
> That obviously requires a fair amount of change, but it's not too bad.
> One issue here is that we we want to support "manual" inserts both for
> initdb, and for emergency work.
> 
> 
> The attached *PROTOTYPE* *WIP* patch removes WITH OIDs support, converts
> catalog table oids to explicit oid columns, and makes enough
> infrastructure changes to make plain pg_regress pass (after the
> necessary changes of course).  Only superficial psql and no pg_dump
> changes.
> 
> There's plenty of issues with the prototype, but overall I'm pretty
> pleased.
> 
> 
> There's three major areas I'm not so sure about:
> 
> 1) There's a few places dealing with system tables that don't deal with
>    a hardcoded system table. Since there's no notion of "table has oid"
>    and "which column is the oid column) anymore, we need some way to
>    deal with that.  So far I've just extended existing objectaddress.c
>    code to deal with that, but it's not that pretty.
> 
> 2) We need to be able to manually insert into catalog tables. Both
>    initdb and emergency surgery.  My current hack is doing so directly
>    in nodeModifyTable.c but that's beyond ugly.  I think we should add
>    an explicit DEFAULT clause to those columns with something like
>    nextoid('tablename', 'name_of_index') or such.
> 
> 3) The quickest way to deal with the bootstrap code was to just assign
>    all oids for oid carrying tables that don't yet have any assigned.
>    That doesn't generally seem terrible, although it's certainly badly
>    implemented right now.  That'd mean we'd have three ranges of oids
>    probably, unless we somehow have the bootstrap code advance the
>    in-database oid counter to a right state before we start with
>    post-bootstrap work.  I like the idea of all bootstrap time oids
>    being determined by genbki.pl (I think that'll allow to remove some
>    code too).
> 
> 
> I do wonder about ripping the oid counter out entirely, and replacing it
> with sequences. But that seems like a separate project.
> 
> 
> I'll polish this up further in the not too far away future.  But I'd
> really like to get some feedback before I sink more time into this.

Does anybody have engineering / architecture level comments about this
proposal?

Greetings,

Andres Freund


Re: [RFC] Removing "magic" oids

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> Does anybody have engineering / architecture level comments about this
> proposal?

FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes.  Yeah, it's
a wart we wouldn't have if we designed the system today, but the wart is
thirty years old.  I think changing that will break so many catalog
queries that we'll have the villagers on the doorstep.  Most of the other
things you're suggesting here could be done easily without making that
change.

Possibly we could make them not-magic from the storage standpoint (ie
they're regular columns) but have a pg_attribute flag that says not
to include them in "SELECT *" expansion.

BTW, the fact that we have magic OIDs in the system catalogs doesn't
mean that any other storage system has to support that.  (When I was
at Salesforce, we endured *substantial* pain from insisting that we
move the catalogs into their custom storage system.  There were good
reasons to do so, but it's not a decision I'd make again if there were
any plausible way not to.)

Personally, I'd drop WITH OIDs support for user tables altogether, rather
than having pg_dump create a "compatible" translation that won't be very
compatible if it loses the magicness aspect.

            regards, tom lane


Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-10-14 18:34:50 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Does anybody have engineering / architecture level comments about this
> > proposal?
> 
> FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes.  Yeah, it's
> a wart we wouldn't have if we designed the system today, but the wart is
> thirty years old.  I think changing that will break so many catalog
> queries that we'll have the villagers on the doorstep.  Most of the other
> things you're suggesting here could be done easily without making that
> change.

Yea, I agree that it'll cause some pain. And we could easily 'de-magic'
oids everywhere except the SELECT * processing (that'd probably leave
some behavioural difference with composite types, but there'd it'd
probably be purely be better).


I'm not sure we want that however - yes, the short time pain will be
lower, but do we really want to inflict the confusion about invisible
oids on our users for the next 20 years? I think there's a fair argument
to be made that we should cause pain once, rather continuing to inflict
lower doeses of pain.


> Possibly we could make them not-magic from the storage standpoint (ie
> they're regular columns) but have a pg_attribute flag that says not
> to include them in "SELECT *" expansion.

Right. And we could just use that for system columns too, removing a
number of special case logic. Seems not unlikely that we'll have further
magic columns that want to be hidden by default, given Kevin is
pondering re-starting his work on incremental matviews.


> BTW, the fact that we have magic OIDs in the system catalogs doesn't
> mean that any other storage system has to support that.  (When I was
> at Salesforce, we endured *substantial* pain from insisting that we
> move the catalogs into their custom storage system.  There were good
> reasons to do so, but it's not a decision I'd make again if there were
> any plausible way not to.)

Right. I suspect that we at some point want to support having catalog
tables in different storage formats, but certainly not initially. Part
of the reason why I want to remove WITH OIDs support is precisely that I
do not want to add this kind of magic to other storage formats.


> Personally, I'd drop WITH OIDs support for user tables altogether, rather
> than having pg_dump create a "compatible" translation that won't be very
> compatible if it loses the magicness aspect.

Yea, I'm on-board with that.

Greetings,

Andres Freund


Re: [RFC] Removing "magic" oids

От
Thomas Munro
Дата:
On Mon, Oct 15, 2018 at 11:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Does anybody have engineering / architecture level comments about this
> > proposal?
>
> FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes.  Yeah, it's
> a wart we wouldn't have if we designed the system today, but the wart is
> thirty years old.  I think changing that will break so many catalog
> queries that we'll have the villagers on the doorstep.  Most of the other
> things you're suggesting here could be done easily without making that
> change.
>
> Possibly we could make them not-magic from the storage standpoint (ie
> they're regular columns) but have a pg_attribute flag that says not
> to include them in "SELECT *" expansion.

FWIW there is interest in a general facility for hiding arbitrary
attributes from SELECT * for other reasons too:

https://www.postgresql.org/message-id/flat/CAEepm%3D3ZHh%3Dp0nEEnVbs1Dig_UShPzHUcMNAqvDQUgYgcDo-pA%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com


Re: [RFC] Removing "magic" oids

От
Robert Haas
Дата:
On Sun, Oct 14, 2018 at 6:43 PM Andres Freund <andres@anarazel.de> wrote:
> I'm not sure we want that however - yes, the short time pain will be
> lower, but do we really want to inflict the confusion about invisible
> oids on our users for the next 20 years? I think there's a fair argument
> to be made that we should cause pain once, rather continuing to inflict
> lower doeses of pain.

Yeah, I think that argument has quite a bit of merit.  I suspect that
there are a lot of people who expect that 'SELECT * FROM pg_whatever'
is going to show them all of the data in pg_whatever, and take a while
to figure out that it really doesn't.  I know better and still mess
this up with some regularity.  Anyone who finds it unintuitive that *
doesn't really mean everything is going to be happier with this change
in the long run.

In the short run, it is indeed possible that some catalog queries will
break.  But, really, how many?  For the most part, automated queries
written by tools probably select specific columns rather than
everything, and IIUC those won't break.  And even if they do use
'SELECT *', won't they just end up with two copies of the OID column?
That might work fine.

Of course it might not, and then you'd have to fix your code, but it's
not obvious to me that this would be a horror show.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [RFC] Removing "magic" oids

От
Stephen Frost
Дата:
Greetings,

* Thomas Munro (thomas.munro@enterprisedb.com) wrote:
> On Mon, Oct 15, 2018 at 11:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > Does anybody have engineering / architecture level comments about this
> > > proposal?
> >
> > FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes.  Yeah, it's
> > a wart we wouldn't have if we designed the system today, but the wart is
> > thirty years old.  I think changing that will break so many catalog
> > queries that we'll have the villagers on the doorstep.  Most of the other
> > things you're suggesting here could be done easily without making that
> > change.
> >
> > Possibly we could make them not-magic from the storage standpoint (ie
> > they're regular columns) but have a pg_attribute flag that says not
> > to include them in "SELECT *" expansion.
>
> FWIW there is interest in a general facility for hiding arbitrary
> attributes from SELECT * for other reasons too:
>
> https://www.postgresql.org/message-id/flat/CAEepm%3D3ZHh%3Dp0nEEnVbs1Dig_UShPzHUcMNAqvDQUgYgcDo-pA%40mail.gmail.com

Yeah, that's exactly what I was thinking to bring up also.

There's certainly also been explicit requests for the user to be able to
control what SELECT * means, beyond our own ideas of things we'd like to
be able to add and then hide.

Thanks!

Stephen

Вложения

Re: [RFC] Removing "magic" oids

От
Stephen Frost
Дата:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Sun, Oct 14, 2018 at 6:43 PM Andres Freund <andres@anarazel.de> wrote:
> > I'm not sure we want that however - yes, the short time pain will be
> > lower, but do we really want to inflict the confusion about invisible
> > oids on our users for the next 20 years? I think there's a fair argument
> > to be made that we should cause pain once, rather continuing to inflict
> > lower doeses of pain.
>
> Yeah, I think that argument has quite a bit of merit.  I suspect that
> there are a lot of people who expect that 'SELECT * FROM pg_whatever'
> is going to show them all of the data in pg_whatever, and take a while
> to figure out that it really doesn't.  I know better and still mess
> this up with some regularity.  Anyone who finds it unintuitive that *
> doesn't really mean everything is going to be happier with this change
> in the long run.
>
> In the short run, it is indeed possible that some catalog queries will
> break.  But, really, how many?  For the most part, automated queries
> written by tools probably select specific columns rather than
> everything, and IIUC those won't break.  And even if they do use
> 'SELECT *', won't they just end up with two copies of the OID column?
> That might work fine.
>
> Of course it might not, and then you'd have to fix your code, but it's
> not obvious to me that this would be a horror show.

I tend to agree that this won't be as much of a horror show as made out
to be up-thread.  Tools should certainly be using explicit column names
and if they aren't then we're breaking them regularly anyway whenever
we change what columns exist in a given catalog table.

All the column renaming we did for v10 strikes me as a much bigger deal
than this change and while we did hear some complaints about that, I
certainly feel like it was worth it and that people generally
understood.

Thanks!

Stephen

Вложения

Re: [RFC] Removing "magic" oids

От
Amit Kapila
Дата:
On Wed, Oct 17, 2018 at 9:22 PM Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Robert Haas (robertmhaas@gmail.com) wrote:
> > On Sun, Oct 14, 2018 at 6:43 PM Andres Freund <andres@anarazel.de> wrote:
> > > I'm not sure we want that however - yes, the short time pain will be
> > > lower, but do we really want to inflict the confusion about invisible
> > > oids on our users for the next 20 years? I think there's a fair argument
> > > to be made that we should cause pain once, rather continuing to inflict
> > > lower doeses of pain.
> >
> > Yeah, I think that argument has quite a bit of merit.  I suspect that
> > there are a lot of people who expect that 'SELECT * FROM pg_whatever'
> > is going to show them all of the data in pg_whatever, and take a while
> > to figure out that it really doesn't.  I know better and still mess
> > this up with some regularity.  Anyone who finds it unintuitive that *
> > doesn't really mean everything is going to be happier with this change
> > in the long run.
> >
> > In the short run, it is indeed possible that some catalog queries will
> > break.  But, really, how many?  For the most part, automated queries
> > written by tools probably select specific columns rather than
> > everything, and IIUC those won't break.  And even if they do use
> > 'SELECT *', won't they just end up with two copies of the OID column?
> > That might work fine.
> >
> > Of course it might not, and then you'd have to fix your code, but it's
> > not obvious to me that this would be a horror show.
>
> I tend to agree that this won't be as much of a horror show as made out
> to be up-thread.

+1.

>  Tools should certainly be using explicit column names
> and if they aren't then we're breaking them regularly anyway whenever
> we change what columns exist in a given catalog table.
>

I think we can't say with any certainty that how many application will
break, but if I have to guess then some will surely break, however, I
think it will be a good decision for the long term.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [RFC] Removing "magic" oids

От
Amit Kapila
Дата:
On Sun, Sep 30, 2018 at 9:18 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> In my opinion the current WITH OIDs system has numerous weaknesses:
>
> 1) The fact that oids are so magic means that if we get pluggable
>    storage, the design of the potential pluggable systems is constrained
>    and similar magic has to be present everywhere.
>
> 2) The fact that the oids in each table have the same counter to be
>    based on means that oid wraparounds have much worse consequences
>    performance wise than necessary. E.g. once the global counter has
>    wrapped, all toast tables start to be significantly slower.
>
>    It would be much better if most database objects had their own
>    counters.
>
> 3) For some oid using objects (toast, large objects at the very least)
>    it'd be quite worthwhile to switch to 8 byte ids.  Currently that's
>    hard to do, because it'd break on-disk compatibility.
>
> 4) There's a lot of special case code around for dealing with oids.
>
> 5a) The fact that system table oids don't show up in selects by default
>    makes it more work than necessary to look at catalogs.
>
> 5b) Similarly, it's fairly annoying when debugging not to trivially see
>    oids for catalog structs.
>
>
> I think we should drop WITH OIDs support.  pg_dump should convert WITH
> OIDs tables into tables that have an explicit oid column (with an
> appropriate default function), pg_upgrade should refuse to upgrade them.
>

Is there any technical reason why you think pg_upgrade should refuse
to upgrade them?  I think there is an argument to break backward
compatibility here and many people on the thread seem to be okay with
that, but refusing to upgrade sounds more restrictive.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-10-28 00:21:23 +0530, Amit Kapila wrote:
> On Sun, Sep 30, 2018 at 9:18 AM Andres Freund <andres@anarazel.de> wrote:
> > I think we should drop WITH OIDs support.  pg_dump should convert WITH
> > OIDs tables into tables that have an explicit oid column (with an
> > appropriate default function), pg_upgrade should refuse to upgrade them.
> >
> 
> Is there any technical reason why you think pg_upgrade should refuse
> to upgrade them?  I think there is an argument to break backward
> compatibility here and many people on the thread seem to be okay with
> that, but refusing to upgrade sounds more restrictive.

They'd not be on-disk compatible, because the column isn't stored as a
normal column but in the t_hoff space.

Greetings,

Andres Freund


Re: [RFC] Removing "magic" oids

От
Amit Kapila
Дата:
On Sun, Oct 28, 2018 at 12:26 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2018-10-28 00:21:23 +0530, Amit Kapila wrote:
> > On Sun, Sep 30, 2018 at 9:18 AM Andres Freund <andres@anarazel.de> wrote:
> > > I think we should drop WITH OIDs support.  pg_dump should convert WITH
> > > OIDs tables into tables that have an explicit oid column (with an
> > > appropriate default function), pg_upgrade should refuse to upgrade them.
> > >
> >
> > Is there any technical reason why you think pg_upgrade should refuse
> > to upgrade them?  I think there is an argument to break backward
> > compatibility here and many people on the thread seem to be okay with
> > that, but refusing to upgrade sounds more restrictive.
>
> They'd not be on-disk compatible, because the column isn't stored as a
> normal column but in the t_hoff space.
>

Yeah, and which means we need to re-write all such tables which is not
an attractive option and probably quite some work.  So, users have to
dump and restore their databases which can be time-consuming for large
databases, but I think we don't have any better option to offer.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi All, Noah,

Noah, see one question to you below.

On 2018-09-29 20:48:10 -0700, Andres Freund wrote:
> The attached *PROTOTYPE* *WIP* patch removes WITH OIDs support, converts
> catalog table oids to explicit oid columns, and makes enough
> infrastructure changes to make plain pg_regress pass (after the
> necessary changes of course).  Only superficial psql and no pg_dump
> changes.
> 
> There's plenty of issues with the prototype, but overall I'm pretty
> pleased.

Here's an attached version of the prototype. Main changes besides lots
of conflict resolutions are that now all the regression tests pass. I've
also excised pg_dump support, and removed docs for WITH OIDs.

Questions:
- Currently the patch allows WITHOUT OIDS and WITH (oids = false), as
  those are "harmless".  Should we keep that?
- pg_dump will atm emit a warning if oids are contained in dumped table,
  but not fail. Does that seem reasonable?
- pg_restore does the same, just when reading the archive
- pg_dump still puts the hasoids field into the archive. Thus the format
  didn't change. That seems the most reasonable to me.
- one pgbench test tested concurrent insertions into a table with
  oids, as some sort of stress test for lwlocks and spinlocks. I *think*
  this doesn't really have to be a system oid column, and this was just
  because that's how we triggered a bug on some machine. Noah, do I get
  this right?

My reading from the discussion is that:
- we'll not have some magic compat thing in pg_dump that rewrites < 12 oids
  into oid columns
- pg_upgrade will error out when it encounters an oid
- oids will remain visible

Does that seem like a fair read?


> 1) There's a few places dealing with system tables that don't deal with
>    a hardcoded system table. Since there's no notion of "table has oid"
>    and "which column is the oid column) anymore, we need some way to
>    deal with that.  So far I've just extended existing objectaddress.c
>    code to deal with that, but it's not that pretty.

I think that can reverted, if we take care of 2), now that I cleaned up
some of the uglier hacks.  Right now the consequence of this is that
it's possible to insert into system catalogs without specifying the oid
only for catalogs with objectaddress.c support.


> 2) We need to be able to manually insert into catalog tables. Both
>    initdb and emergency surgery.  My current hack is doing so directly
>    in nodeModifyTable.c but that's beyond ugly.  I think we should add
>    an explicit DEFAULT clause to those columns with something like
>    nextoid('tablename', 'name_of_index') or such.

It'd be nextoid('tablename', 'oid', 'name_of_index'), I think.  There's
a bit of a sequencing issue about when to add those column defaults, so
initdb can properly insert, but it doesn't sound too hard.


> 3) The quickest way to deal with the bootstrap code was to just assign
>    all oids for oid carrying tables that don't yet have any assigned.
>    That doesn't generally seem terrible, although it's certainly badly
>    implemented right now.  That'd mean we'd have three ranges of oids
>    probably, unless we somehow have the bootstrap code advance the
>    in-database oid counter to a right state before we start with
>    post-bootstrap work.  I like the idea of all bootstrap time oids
>    being determined by genbki.pl (I think that'll allow to remove some
>    code too).

I've not done anything about this so far.

Greetings,

Andres Freund

Вложения

Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-14 00:01:52 -0800, Andres Freund wrote:
> > 1) There's a few places dealing with system tables that don't deal with
> >    a hardcoded system table. Since there's no notion of "table has oid"
> >    and "which column is the oid column) anymore, we need some way to
> >    deal with that.  So far I've just extended existing objectaddress.c
> >    code to deal with that, but it's not that pretty.
> 
> I think that can reverted, if we take care of 2), now that I cleaned up
> some of the uglier hacks.  Right now the consequence of this is that
> it's possible to insert into system catalogs without specifying the oid
> only for catalogs with objectaddress.c support.

> > 2) We need to be able to manually insert into catalog tables. Both
> >    initdb and emergency surgery.  My current hack is doing so directly
> >    in nodeModifyTable.c but that's beyond ugly.  I think we should add
> >    an explicit DEFAULT clause to those columns with something like
> >    nextoid('tablename', 'name_of_index') or such.
> 
> It'd be nextoid('tablename', 'oid', 'name_of_index'), I think.  There's
> a bit of a sequencing issue about when to add those column defaults, so
> initdb can properly insert, but it doesn't sound too hard.

I had contemplated adding the above function as the default on system
columns - but I think that'd add significant complications to
relcache.c. As it's not actually a common occurance to manually insert
into catalog tables, my revised plan is to provide a pg_nextoid()
function, but *not* set it as the default.  The only case in postgres
where that matters is the manual insertions into pg_depend during
initdb, but that query can trivially be adapted.

Currently I'm not planning to document the use pg_nextoid(), it's not
something users really should do.  What I wonder about however is where
it should best live - I've put into catalog.c for now, but I'm not
entirely happy with that.

I've thus taken out the nodeModifyTable.c hack that assigned oids at
INSERT. Thus insertions without oid specified now raise a NOT NULL
violation (which can be fixed by using the pg_nextoid() function).

Comments?


> > 3) The quickest way to deal with the bootstrap code was to just assign
> >    all oids for oid carrying tables that don't yet have any assigned.
> >    That doesn't generally seem terrible, although it's certainly badly
> >    implemented right now.  That'd mean we'd have three ranges of oids
> >    probably, unless we somehow have the bootstrap code advance the
> >    in-database oid counter to a right state before we start with
> >    post-bootstrap work.  I like the idea of all bootstrap time oids
> >    being determined by genbki.pl (I think that'll allow to remove some
> >    code too).
> 
> I've not done anything about this so far.

I've now revised this slightly. genbki.pl now computes the maximum oid
explicitly assigned in .dat files, and assignes oids to all 'oid'
columns without a value.  It does so starting directly at the maximum
value.  I personally don't see need to have implicit .bki oids be in a
different range, and having them assigned more densely is good for some
things (e.g. the fmgr builtins table, even though we currently assign
all proc oids manually).

Differing opinions about this?


Attached is revised version of the patch.  I've tried to find all
references to oids and revise them accordingly. The docs changes
probably need a polish.

I quite like the diffstat:
b0e170f2c8c57b683c74e6cccfb7e46356dbacea (HEAD -> oidnix) Remove WITH OIDs support.
 333 files changed, 1949 insertions(+), 4030 deletions(-)


While clearly not ready yet, I don't think it's that far off.

Missing:
- docs polish
- pg_upgrade early error
- discussion of the pg_dump/restore behaviour when encountering tables
  or archives with oids. It currently warns.  If we want to keep it that
  way - which I think is reasonable - a bit more code can be excised.

Greetings,

Andres Freund

Вложения

Re: [RFC] Removing "magic" oids

От
Noah Misch
Дата:
On Wed, Nov 14, 2018 at 12:01:52AM -0800, Andres Freund wrote:
> - one pgbench test tested concurrent insertions into a table with
>   oids, as some sort of stress test for lwlocks and spinlocks. I *think*
>   this doesn't really have to be a system oid column, and this was just
>   because that's how we triggered a bug on some machine. Noah, do I get
>   this right?

The point of the test is to exercise OidGenLock by issuing many parallel
GetNewOidWithIndex() and verifying absence of duplicates.  There's nothing
special about OidGenLock, but it is important to use an operation that takes a
particular LWLock many times, quickly.  If the test query spends too much time
on things other than taking locks, it will catch locking races too rarely.

> --- a/src/bin/pgbench/t/001_pgbench_with_server.pl
> +++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
> @@ -48,28 +48,26 @@ sub pgbench
>      return;
>  }
>  
> -# Test concurrent insertion into table with UNIQUE oid column.  DDL expects
> -# GetNewOidWithIndex() to successfully avoid violating uniqueness for indexes
> -# like pg_class_oid_index and pg_proc_oid_index.  This indirectly exercises
> -# LWLock and spinlock concurrency.  This test makes a 5-MiB table.
> +# Test concurrent insertion into table with serial column.  This
> +# indirectly exercises LWLock and spinlock concurrency.  This test
> +# makes a 5-MiB table.
>  
>  $node->safe_psql('postgres',
> -        'CREATE UNLOGGED TABLE oid_tbl () WITH OIDS; '
> -      . 'ALTER TABLE oid_tbl ADD UNIQUE (oid);');
> +        'CREATE UNLOGGED TABLE insert_tbl (id serial primary key); ');
>  
>  pgbench(
>      '--no-vacuum --client=5 --protocol=prepared --transactions=25',
>      0,
>      [qr{processed: 125/125}],
>      [qr{^$}],
> -    'concurrency OID generation',
> +    'concurrent insert generation',
>      {
> -        '001_pgbench_concurrent_oid_generation' =>
> -          'INSERT INTO oid_tbl SELECT FROM generate_series(1,1000);'
> +        '001_pgbench_concurrent_insert' =>
> +          'INSERT INTO insert_tbl SELECT FROM generate_series(1,1000);'

The code for sequences is quite different, so this may or may not be an
effective replacement.  To study that, you could remove a few barriers from
lwlock.c, measure how many iterations today's test needs to catch the
mutation, and then measure the same for this proposal.


Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-15 04:57:28 +0000, Noah Misch wrote:
> On Wed, Nov 14, 2018 at 12:01:52AM -0800, Andres Freund wrote:
> > - one pgbench test tested concurrent insertions into a table with
> >   oids, as some sort of stress test for lwlocks and spinlocks. I *think*
> >   this doesn't really have to be a system oid column, and this was just
> >   because that's how we triggered a bug on some machine. Noah, do I get
> >   this right?
> 
> The point of the test is to exercise OidGenLock by issuing many parallel
> GetNewOidWithIndex() and verifying absence of duplicates.  There's nothing
> special about OidGenLock, but it is important to use an operation that takes a
> particular LWLock many times, quickly.  If the test query spends too much time
> on things other than taking locks, it will catch locking races too rarely.

Sequences ought to do that, too. And if it's borked, we'd hopefully see
unique violations. But it's definitely not a 1:1 replacement.


> >  pgbench(
> >      '--no-vacuum --client=5 --protocol=prepared --transactions=25',
> >      0,
> >      [qr{processed: 125/125}],
> >      [qr{^$}],
> > -    'concurrency OID generation',
> > +    'concurrent insert generation',
> >      {
> > -        '001_pgbench_concurrent_oid_generation' =>
> > -          'INSERT INTO oid_tbl SELECT FROM generate_series(1,1000);'
> > +        '001_pgbench_concurrent_insert' =>
> > +          'INSERT INTO insert_tbl SELECT FROM generate_series(1,1000);'
> 
> The code for sequences is quite different, so this may or may not be an
> effective replacement.  To study that, you could remove a few barriers from
> lwlock.c, measure how many iterations today's test needs to catch the
> mutation, and then measure the same for this proposal.

Unfortunately it's really hard to hit barrier issues on x86. I think
that's the only arch I currently have access to, but it's possible I
have access to some ppc too.  If you have a better idea for a
replacement test, I'd be all ears.

Thanks!

Andres


Re: [RFC] Removing "magic" oids

От
John Naylor
Дата:
On 11/15/18, Andres Freund <andres@anarazel.de> wrote:
> I've now revised this slightly. genbki.pl now computes the maximum oid
> explicitly assigned in .dat files, and assignes oids to all 'oid'
> columns without a value.  It does so starting directly at the maximum
> value.  I personally don't see need to have implicit .bki oids be in a
> different range, and having them assigned more densely is good for some
> things (e.g. the fmgr builtins table, even though we currently assign
> all proc oids manually).

I don't see an advantage to having a different range, but maybe it
should error out if $maxoid reaches FirstBootstrapObjectId.

This patch breaks reformat_dat_file.pl. I've attached a fix, which
also de-lists oid as a special key within the *.dat files. It might be
good to put off reformatting until feature freeze, so as not to break
others' patches.

-John Naylor

Вложения

Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-15 17:25:21 +0700, John Naylor wrote:
> On 11/15/18, Andres Freund <andres@anarazel.de> wrote:
> > I've now revised this slightly. genbki.pl now computes the maximum oid
> > explicitly assigned in .dat files, and assignes oids to all 'oid'
> > columns without a value.  It does so starting directly at the maximum
> > value.  I personally don't see need to have implicit .bki oids be in a
> > different range, and having them assigned more densely is good for some
> > things (e.g. the fmgr builtins table, even though we currently assign
> > all proc oids manually).
> 
> I don't see an advantage to having a different range, but maybe it
> should error out if $maxoid reaches FirstBootstrapObjectId.

Hm. Not sure I really see the point. Note we didn't have that check
before either, and it'd not catch manual assignment of too high oids.  I
wonder if we should have a check in sanity_check.sql or such, that'd
then catch both?


> This patch breaks reformat_dat_file.pl. I've attached a fix, which
> also de-lists oid as a special key within the *.dat files. It might be
> good to put off reformatting until feature freeze, so as not to break
> others' patches.

Thanks for catching that.  I wonder if we could fix that in a way that
doesn't move oid into the middle of the data - while it's less magic-y
from a storage level, it's still more important than the rest. Perhaps
we could just leave oid in metadata and skip all @metadata elements in
format_hash()?


> @@ -193,7 +192,7 @@ sub strip_default_values
>      {
>          my $attname = $column->{name};
>          die "strip_default_values: $catname.$attname undefined\n"
> -          if !defined $row->{$attname};
> +          if !defined $row->{$attname} and $attname ne 'oid';
>  
>          if (defined $column->{default}
>              and ($row->{$attname} eq $column->{default}))

Hm, why is there no column definition for oid?

Greetings,

Andres Freund


Re: [RFC] Removing "magic" oids

От
John Naylor
Дата:
On 11/16/18, Andres Freund <andres@anarazel.de> wrote:
> On 2018-11-15 17:25:21 +0700, John Naylor wrote:
>> I don't see an advantage to having a different range, but maybe it
>> should error out if $maxoid reaches FirstBootstrapObjectId.
>
> Hm. Not sure I really see the point. Note we didn't have that check
> before either, and it'd not catch manual assignment of too high oids.  I
> wonder if we should have a check in sanity_check.sql or such, that'd
> then catch both?

My concern is that a developer might assign a high number oid to avoid
conflicts during development, such that the auto-assigned oids can
blow past 10000. If it can be done simply in SQL, I think that'd be
fine, and maybe more robust than checking within the script.

>> This patch breaks reformat_dat_file.pl. I've attached a fix, which
>> also de-lists oid as a special key within the *.dat files. It might be
>> good to put off reformatting until feature freeze, so as not to break
>> others' patches.
>
> Thanks for catching that.  I wonder if we could fix that in a way that
> doesn't move oid into the middle of the data - while it's less magic-y
> from a storage level, it's still more important than the rest. Perhaps
> we could just leave oid in metadata and skip all @metadata elements in
> format_hash()?

@metadata is passed to format_hash(), so that wouldn't work without
additional bookkeeping (if I understand you correctly). I think it'd
be simpler to filter out @metadata elements while building @attnames,
which is what we pass to format_hash() for ordinary columns. Let me
know what you think (attached).

>> @@ -193,7 +192,7 @@ sub strip_default_values
>>      {
>>          my $attname = $column->{name};
>>          die "strip_default_values: $catname.$attname undefined\n"
>> -          if !defined $row->{$attname};
>> +          if !defined $row->{$attname} and $attname ne 'oid';
>>
>>          if (defined $column->{default}
>>              and ($row->{$attname} eq $column->{default}))
>
> Hm, why is there no column definition for oid?

(This is a sanity check that all keys referring to columns have
values. I'd like to keep it since it has found bugs during
development.) If oid is now a normal column, then catalogs like
pg_amop will have no explicit oid and the check will fail without this
change. I've added a comment -- hopefully it's clearer.

-John Naylor

Вложения

Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
On 2018-11-14 21:02:41 -0800, Andres Freund wrote:
> Hi,
> 
> On 2018-11-15 04:57:28 +0000, Noah Misch wrote:
> > On Wed, Nov 14, 2018 at 12:01:52AM -0800, Andres Freund wrote:
> > > - one pgbench test tested concurrent insertions into a table with
> > >   oids, as some sort of stress test for lwlocks and spinlocks. I *think*
> > >   this doesn't really have to be a system oid column, and this was just
> > >   because that's how we triggered a bug on some machine. Noah, do I get
> > >   this right?
> > 
> > The point of the test is to exercise OidGenLock by issuing many parallel
> > GetNewOidWithIndex() and verifying absence of duplicates.  There's nothing
> > special about OidGenLock, but it is important to use an operation that takes a
> > particular LWLock many times, quickly.  If the test query spends too much time
> > on things other than taking locks, it will catch locking races too rarely.
> 
> Sequences ought to do that, too. And if it's borked, we'd hopefully see
> unique violations. But it's definitely not a 1:1 replacement.
> 
> 
> > >  pgbench(
> > >      '--no-vacuum --client=5 --protocol=prepared --transactions=25',
> > >      0,
> > >      [qr{processed: 125/125}],
> > >      [qr{^$}],
> > > -    'concurrency OID generation',
> > > +    'concurrent insert generation',
> > >      {
> > > -        '001_pgbench_concurrent_oid_generation' =>
> > > -          'INSERT INTO oid_tbl SELECT FROM generate_series(1,1000);'
> > > +        '001_pgbench_concurrent_insert' =>
> > > +          'INSERT INTO insert_tbl SELECT FROM generate_series(1,1000);'
> > 
> > The code for sequences is quite different, so this may or may not be an
> > effective replacement.  To study that, you could remove a few barriers from
> > lwlock.c, measure how many iterations today's test needs to catch the
> > mutation, and then measure the same for this proposal.
> 
> Unfortunately it's really hard to hit barrier issues on x86. I think
> that's the only arch I currently have access to, but it's possible I
> have access to some ppc too.  If you have a better idea for a
> replacement test, I'd be all ears.

I've tested this on ppc.  Neither the old version nor the new version
stress test spinlocks sufficiently to error out with weakened spinlocks
(not that surprising, there are no spinlocks in any hot path of either
workload). Both versions very reliably trigger on weakened lwlocks. So I
think we're comparatively good on that front.

Greetings,

Andres Freund


Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-14 17:48:07 -0800, Andres Freund wrote:
> While clearly not ready yet, I don't think it's that far off.
>
> Missing:
> - docs polish
> - pg_upgrade early error
> - discussion of the pg_dump/restore behaviour when encountering tables
>   or archives with oids. It currently warns.  If we want to keep it that
>   way - which I think is reasonable - a bit more code can be excised.

Attached is an updated version. It fixes all the FIXMEs/XXXs that were
left over. Most of these weren't particularly interesting, except for
the fact that GetSysCacheOid[1-4]? now has a new AttrNumber parameter
indicating the oid attribute.

I also:

- re-added printing of oids via pageinspect. As we can encounter tuples
  with oids via pg_upgrade (even though the oid column must have been
  removed), that seems appropriate
- merged John Naylor's genbki/reformat fixes
- performed docs polishing
- made the objectaddress.c changes more consistent
- corrected lots of formatting issues (too long lines)
- added necessary casts (mostly using ObjectIdGetDatum for oids)
- re-added AT_DropOids, so we can have a proper WITHOUT OIDS ALTER TABLE
  option to ignore. I also considered supporting empty elements in the
  AlterTableStmt->cmds, but that seemed uglier.
- removed #ifdefed out code
- improved error messages
- other small stuff
- re-added a few tests
- lots of small comment fixes

I'm pretty happy with the new state. Unless somebody announces they want
to do a review soon-ish, I'm planning to commit this soon. It's a
painful set to keep up2date, and it's blocking a few other patches.  I'm
sure we'll find some things to adapt around the margins, but imo the
patch as a whole looks pretty reasonable.

Missing:
- nice and long commit message
- another detailed line-by-line read of the patch (last round took like
  3h :()

341 files changed, 2263 insertions(+), 4249 deletions(-)

Greetings,

Andres Freund

Вложения

Re: [RFC] Removing "magic" oids

От
John Naylor
Дата:
On 11/20/18, Andres Freund <andres@anarazel.de> wrote:
> I'm
> sure we'll find some things to adapt around the margins, but imo the
> patch as a whole looks pretty reasonable.

bki.sgml is now a bit out of date. I've attached a draft fix.

-John Naylor

Вложения

Re: [RFC] Removing "magic" oids

От
Robert Haas
Дата:
On Sun, Oct 14, 2018 at 6:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Does anybody have engineering / architecture level comments about this
> > proposal?
>
> FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes.  Yeah, it's
> a wart we wouldn't have if we designed the system today, but the wart is
> thirty years old.  I think changing that will break so many catalog
> queries that we'll have the villagers on the doorstep.  Most of the other
> things you're suggesting here could be done easily without making that
> change.
>
> Possibly we could make them not-magic from the storage standpoint (ie
> they're regular columns) but have a pg_attribute flag that says not
> to include them in "SELECT *" expansion.

I think such a flag would be a good idea; it seems to have other uses.
As Andres also noted, Kevin was quite interested in having a
hidden-by-default COUNT column to assist with materialized view
maintenance, and presumably this could also be used for that purpose.

I am less sure that it's a good idea that it's a good idea to set that
flag for the OID columns in system catalogs.  I do think that some
tool authors are likely to have to do some significant work to update
their tools, but a lot of tools are probably already querying for
specific columns, not using *, so they'll be fine.  And like Andres
says, it seems like we'll be happier in the long term if we get way
from having OID columns be invisible in system catalogs.  That's just
confusing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [RFC] Removing "magic" oids

От
Stephen Frost
Дата:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Sun, Oct 14, 2018 at 6:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > Does anybody have engineering / architecture level comments about this
> > > proposal?
> >
> > FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes.  Yeah, it's
> > a wart we wouldn't have if we designed the system today, but the wart is
> > thirty years old.  I think changing that will break so many catalog
> > queries that we'll have the villagers on the doorstep.  Most of the other
> > things you're suggesting here could be done easily without making that
> > change.
> >
> > Possibly we could make them not-magic from the storage standpoint (ie
> > they're regular columns) but have a pg_attribute flag that says not
> > to include them in "SELECT *" expansion.
>
> I think such a flag would be a good idea; it seems to have other uses.
> As Andres also noted, Kevin was quite interested in having a
> hidden-by-default COUNT column to assist with materialized view
> maintenance, and presumably this could also be used for that purpose.

Yeah, I like the idea of having this column too, it'd also be useful for
people who are trying to use column-level privileges.

> I am less sure that it's a good idea that it's a good idea to set that
> flag for the OID columns in system catalogs.  I do think that some
> tool authors are likely to have to do some significant work to update
> their tools, but a lot of tools are probably already querying for
> specific columns, not using *, so they'll be fine.  And like Andres
> says, it seems like we'll be happier in the long term if we get way
> from having OID columns be invisible in system catalogs.  That's just
> confusing.

Tools should *really* be using explicit column names and not just
'SELECT *' (and my experience is that most of them are using explicit
column names in their queries- sometimes to explicitly pull out the oid
column since it's so frequently needed...).  I suppose if we wanted to
get fancy we could allow users to adjust the setting on catalog tables
themselves...  I'd certainly be happier with the OID columns being
visible by SELECT * against the catalog tables though.

By my reckoning, we'll break a lot fewer tools/queries with this change
than the changes we made for xlog -> wal and location -> lsn in various
tables and functions with v10.

Thanks!

Stephen

Вложения

Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-20 11:25:22 -0500, Stephen Frost wrote:
> Greetings,
> 
> * Robert Haas (robertmhaas@gmail.com) wrote:
> > On Sun, Oct 14, 2018 at 6:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Andres Freund <andres@anarazel.de> writes:
> > > > Does anybody have engineering / architecture level comments about this
> > > > proposal?
> > >
> > > FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes.  Yeah, it's
> > > a wart we wouldn't have if we designed the system today, but the wart is
> > > thirty years old.  I think changing that will break so many catalog
> > > queries that we'll have the villagers on the doorstep.  Most of the other
> > > things you're suggesting here could be done easily without making that
> > > change.
> > >
> > > Possibly we could make them not-magic from the storage standpoint (ie
> > > they're regular columns) but have a pg_attribute flag that says not
> > > to include them in "SELECT *" expansion.
> > 
> > I think such a flag would be a good idea; it seems to have other uses.
> > As Andres also noted, Kevin was quite interested in having a
> > hidden-by-default COUNT column to assist with materialized view
> > maintenance, and presumably this could also be used for that purpose.
> 
> Yeah, I like the idea of having this column too, it'd also be useful for
> people who are trying to use column-level privileges.

FWIW, leaving grammar bikeshedding aside, I don't think this is
particularly hard.  There certainly are a few corner cases I've not
touched here, but the attached implements the basic features for hidden
columns.


postgres[14805][1]=# CREATE TABLE blarg(id serial, data text, annotation text);
postgres[14805][1]=# INSERT INTO blarg (data, annotation) VALUES ('42', 'this is THE question');

postgres[14805][1]=# SELECT * FROM blarg;
┌────┬──────┬──────────────────────┐
│ id │ data │      annotation      │
├────┼──────┼──────────────────────┤
│  1 │ 42   │ this is THE question │
└────┴──────┴──────────────────────┘
(1 row)

postgres[14805][1]=# SELECT s.* FROM (SELECT * FROM blarg) s;
┌────┬──────┬──────────────────────┐
│ id │ data │      annotation      │
├────┼──────┼──────────────────────┤
│  1 │ 42   │ this is THE question │
└────┴──────┴──────────────────────┘
(1 row)

postgres[14805][1]=# SELECT s.* FROM (SELECT blarg FROM blarg) s;
┌───────────────────────────────┐
│             blarg             │
├───────────────────────────────┤
│ (1,42,"this is THE question") │
└───────────────────────────────┘
(1 row)


postgres[14805][1]=# SELECT s.annotation FROM (SELECT * FROM blarg) s;
┌──────────────────────┐
│      annotation      │
├──────────────────────┤
│ this is THE question │
└──────────────────────┘
(1 row)

postgres[14805][1]=# SELECT (s.blarg).annotation FROM (SELECT blarg FROM blarg) s;
┌──────────────────────┐
│      annotation      │
├──────────────────────┤
│ this is THE question │
└──────────────────────┘
(1 row)


-- update column to be hidden
postgres[14805][1]=# UPDATE pg_attribute SET attishidden = true WHERE attrelid = 'blarg'::regclass AND attname =
'annotation';

postgres[14805][1]=# SELECT * FROM blarg;
┌────┬──────┐
│ id │ data │
├────┼──────┤
│  1 │ 42   │
└────┴──────┘
(1 row)

postgres[14805][1]=# SELECT s.* FROM (SELECT * FROM blarg) s;
┌────┬──────┐
│ id │ data │
├────┼──────┤
│  1 │ 42   │
└────┴──────┘
(1 row)

postgres[14805][1]=# SELECT s.blarg FROM (SELECT blarg FROM blarg) s;
┌───────────────────────────────┐
│             blarg             │
├───────────────────────────────┤
│ (1,42,"this is THE question") │
└───────────────────────────────┘
(1 row)

postgres[14805][1]=# SELECT s.annotation FROM (SELECT * FROM blarg) s;
ERROR:  42703: column s.annotation does not exist
LINE 1: SELECT s.annotation FROM (SELECT * FROM blarg) s;
               ^
LOCATION:  errorMissingColumn, parse_relation.c:3319

postgres[14805][1]=# SELECT (s.blarg).annotation FROM (SELECT blarg FROM blarg) s;
┌──────────────────────┐
│      annotation      │
├──────────────────────┤
│ this is THE question │
└──────────────────────┘
(1 row)

postgres[14805][1]=# SELECT (s.blarg).* FROM (SELECT blarg FROM blarg) s;
┌────┬──────┐
│ id │ data │
├────┼──────┤
│  1 │ 42   │
└────┴──────┘
(1 row)


It's debatable if a wholerow-var select (without *, i.e the s.blarg case
above)) ought to include the hidden column, but to me that intuitively
makes sense. Otherwise you couldn't select it after a cast and such.

Greetings,

Andres Freund

Вложения

Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-20 21:49:27 +0700, John Naylor wrote:
> On 11/20/18, Andres Freund <andres@anarazel.de> wrote:
> > I'm
> > sure we'll find some things to adapt around the margins, but imo the
> > patch as a whole looks pretty reasonable.
> 
> bki.sgml is now a bit out of date. I've attached a draft fix.

Thanks!

Greetings,

Andres Freund


Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-20 01:52:27 -0800, Andres Freund wrote:
> I'm pretty happy with the new state. Unless somebody announces they want
> to do a review soon-ish, I'm planning to commit this soon. It's a
> painful set to keep up2date, and it's blocking a few other patches.  I'm
> sure we'll find some things to adapt around the margins, but imo the
> patch as a whole looks pretty reasonable.
> 
> Missing:
> - nice and long commit message
> - another detailed line-by-line read of the patch (last round took like
>   3h :()
> 
> 341 files changed, 2263 insertions(+), 4249 deletions(-)

I've now pushed this. There's one bugfix (forgot to unset replaces for
the oid column in heap_modify_tuple), and one mostly cosmetic change:
Previously the .bki insert lines had the oid duplicated between OID =
and the normal column, now it's just the latter.

Let's see what I broke :/

Regards,

Andres


Re: [RFC] Removing "magic" oids

От
Thomas Munro
Дата:
On Wed, Nov 21, 2018 at 1:08 PM Andres Freund <andres@anarazel.de> wrote:
> Let's see what I broke :/

Per rhinoceros, contrib/sepgsql.  This patch fixes the build for me
(on Debian with libselinux1-dev installed), though I don't know how to
test that it works and make check does nothing useful in there.

-- 
Thomas Munro
http://www.enterprisedb.com

Вложения

Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-21 14:32:23 +1300, Thomas Munro wrote:
> On Wed, Nov 21, 2018 at 1:08 PM Andres Freund <andres@anarazel.de> wrote:
> > Let's see what I broke :/
>
> Per rhinoceros, contrib/sepgsql.  This patch fixes the build for me
> (on Debian with libselinux1-dev installed)

Thanks. I pushed your fix.


> though I don't know how to test that it works and make check does
> nothing useful in there.

There's a testsuite (contrib/sepgsql/test_sepgsql), but it requires
setup. I'll just let rhinoceros sort that out...

- Andres


Re: [RFC] Removing "magic" oids

От
Andreas Karlsson
Дата:
On 11/21/18 1:07 AM, Andres Freund wrote:
> Let's see what I broke :/

There is a small typo in the old release notes.

Andreas

Вложения

Re: [RFC] Removing "magic" oids

От
Andrew Dunstan
Дата:
On 11/20/18 7:07 PM, Andres Freund wrote:
>
>
> Let's see what I broke :/
>


pg_upgrade against old versions, by the look of it. Even after I drop 
oids from user tables, I get errors like this when running pg_dumpall 
against a pg_upgraded REL_9_4_STABLE datadir:


2018-11-21 13:01:58.582 EST [11861] ERROR:  large object 10 does not exist
2018-11-21 13:01:58.637 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:01:58.637 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:01:59.638 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:01:59.638 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:01:59.638 EST [11686] WARNING:  could not write block 0 of 
base/17486/2840_vm
2018-11-21 13:01:59.638 EST [11686] DETAIL:  Multiple failures --- write 
error might be permanent.
2018-11-21 13:02:00.493 EST [11688] ERROR:  could not open file 
"global/1262": No such file or directory
2018-11-21 13:02:00.639 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:02:00.639 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:02:00.639 EST [11686] WARNING:  could not write block 0 of 
base/17486/2840_vm
2018-11-21 13:02:00.639 EST [11686] DETAIL:  Multiple failures --- write 
error might be permanent.
2018-11-21 13:02:01.495 EST [11688] ERROR:  could not open file 
"global/1262": No such file or directory
2018-11-21 13:02:01.640 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:02:01.640 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:02:01.641 EST [11686] WARNING:  could not write block 0 of 
base/17486/2840_vm
2018-11-21 13:02:01.641 EST [11686] DETAIL:  Multiple failures --- write 
error might be permanent.
2018-11-21 13:02:02.496 EST [11688] ERROR:  could not open file 
"global/1262": No such file or directory
2018-11-21 13:02:02.642 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:02:02.642 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:02:02.642 EST [11686] WARNING:  could not write block 0 of 
base/17486/2840_vm
2018-11-21 13:02:02.642 EST [11686] DETAIL:  Multiple failures --- write 
error might be permanent.
2018-11-21 13:02:03.497 EST [11688] ERROR:  could not open file 
"global/1262": No such file or directory
2018-11-21 13:02:03.643 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:02:03.643 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:02:03.643 EST [11686] WARNING:  could not write block 0 of 
base/17486/2840_vm
2018-11-21 13:02:03.643 EST [11686] DETAIL:  Multiple failures --- write 
error might be permanent.
2018-11-21 13:02:04.498 EST [11688] ERROR:  could not open file 
"global/1262": No such file or directory
2018-11-21 13:02:04.644 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:02:04.644 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:02:04.644 EST [11686] WARNING:  could not write block 0 of 
base/17486/2840_vm
2018-11-21 13:02:04.644 EST [11686] DETAIL:  Multiple failures --- write 
error might be permanent.
2018-11-21 13:02:05.499 EST [11688] ERROR:  could not open file 
"global/1262": No such file or directory
2018-11-21 13:02:05.645 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:02:05.645 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:02:05.645 EST [11686] WARNING:  could not write block 0 of 
base/17486/2840_vm
2018-11-21 13:02:05.645 EST [11686] DETAIL:  Multiple failures --- write 
error might be permanent.
2018-11-21 13:02:06.501 EST [11688] ERROR:  could not open file 
"global/1262": No such file or directory
2018-11-21 13:02:06.646 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:02:06.646 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:02:06.646 EST [11686] WARNING:  could not write block 0 of 
base/17486/2840_vm
2018-11-21 13:02:06.646 EST [11686] DETAIL:  Multiple failures --- write 
error might be permanent.
2018-11-21 13:02:07.502 EST [11688] ERROR:  could not open file 
"global/1262": No such file or directory
2018-11-21 13:02:07.648 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:02:07.648 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:02:07.648 EST [11686] WARNING:  could not write block 0 of 
base/17486/2840_vm
2018-11-21 13:02:07.648 EST [11686] DETAIL:  Multiple failures --- write 
error might be permanent.
2018-11-21 13:02:08.503 EST [11688] ERROR:  could not open file 
"global/1262": No such file or directory
2018-11-21 13:02:08.649 EST [11686] ERROR:  could not open file 
"base/17486/2840_vm": No such file or directory
2018-11-21 13:02:08.649 EST [11686] CONTEXT:  writing block 0 of 
relation base/17486/2840_vm
2018-11-21 13:02:08.649 EST [11686] WARNING:  could not write block 0 of 
base/17486/2840_vm
2018-11-21 13:02:08.649 EST [11686] DETAIL:  Multiple failures --- write 
error might be permanent.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:

On November 21, 2018 10:17:57 AM PST, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote:
>
>On 11/20/18 7:07 PM, Andres Freund wrote:
>>
>>
>> Let's see what I broke :/
>>
>
>
>pg_upgrade against old versions, by the look of it. Even after I drop
>oids from user tables, I get errors like this when running pg_dumpall
>against a pg_upgraded REL_9_4_STABLE datadir:
>
>
>2018-11-21 13:01:58.582 EST [11861] ERROR:  large object 10 does not
>exist
>2018-11-21 13:01:58.637 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:01:58.637 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:01:59.638 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:01:59.638 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:01:59.638 EST [11686] WARNING:  could not write block 0
>of
>base/17486/2840_vm
>2018-11-21 13:01:59.638 EST [11686] DETAIL:  Multiple failures ---
>write
>error might be permanent.
>2018-11-21 13:02:00.493 EST [11688] ERROR:  could not open file
>"global/1262": No such file or directory
>2018-11-21 13:02:00.639 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:02:00.639 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:02:00.639 EST [11686] WARNING:  could not write block 0
>of
>base/17486/2840_vm
>2018-11-21 13:02:00.639 EST [11686] DETAIL:  Multiple failures ---
>write
>error might be permanent.
>2018-11-21 13:02:01.495 EST [11688] ERROR:  could not open file
>"global/1262": No such file or directory
>2018-11-21 13:02:01.640 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:02:01.640 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:02:01.641 EST [11686] WARNING:  could not write block 0
>of
>base/17486/2840_vm
>2018-11-21 13:02:01.641 EST [11686] DETAIL:  Multiple failures ---
>write
>error might be permanent.
>2018-11-21 13:02:02.496 EST [11688] ERROR:  could not open file
>"global/1262": No such file or directory
>2018-11-21 13:02:02.642 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:02:02.642 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:02:02.642 EST [11686] WARNING:  could not write block 0
>of
>base/17486/2840_vm
>2018-11-21 13:02:02.642 EST [11686] DETAIL:  Multiple failures ---
>write
>error might be permanent.
>2018-11-21 13:02:03.497 EST [11688] ERROR:  could not open file
>"global/1262": No such file or directory
>2018-11-21 13:02:03.643 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:02:03.643 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:02:03.643 EST [11686] WARNING:  could not write block 0
>of
>base/17486/2840_vm
>2018-11-21 13:02:03.643 EST [11686] DETAIL:  Multiple failures ---
>write
>error might be permanent.
>2018-11-21 13:02:04.498 EST [11688] ERROR:  could not open file
>"global/1262": No such file or directory
>2018-11-21 13:02:04.644 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:02:04.644 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:02:04.644 EST [11686] WARNING:  could not write block 0
>of
>base/17486/2840_vm
>2018-11-21 13:02:04.644 EST [11686] DETAIL:  Multiple failures ---
>write
>error might be permanent.
>2018-11-21 13:02:05.499 EST [11688] ERROR:  could not open file
>"global/1262": No such file or directory
>2018-11-21 13:02:05.645 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:02:05.645 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:02:05.645 EST [11686] WARNING:  could not write block 0
>of
>base/17486/2840_vm
>2018-11-21 13:02:05.645 EST [11686] DETAIL:  Multiple failures ---
>write
>error might be permanent.
>2018-11-21 13:02:06.501 EST [11688] ERROR:  could not open file
>"global/1262": No such file or directory
>2018-11-21 13:02:06.646 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:02:06.646 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:02:06.646 EST [11686] WARNING:  could not write block 0
>of
>base/17486/2840_vm
>2018-11-21 13:02:06.646 EST [11686] DETAIL:  Multiple failures ---
>write
>error might be permanent.
>2018-11-21 13:02:07.502 EST [11688] ERROR:  could not open file
>"global/1262": No such file or directory
>2018-11-21 13:02:07.648 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:02:07.648 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:02:07.648 EST [11686] WARNING:  could not write block 0
>of
>base/17486/2840_vm
>2018-11-21 13:02:07.648 EST [11686] DETAIL:  Multiple failures ---
>write
>error might be permanent.
>2018-11-21 13:02:08.503 EST [11688] ERROR:  could not open file
>"global/1262": No such file or directory
>2018-11-21 13:02:08.649 EST [11686] ERROR:  could not open file
>"base/17486/2840_vm": No such file or directory
>2018-11-21 13:02:08.649 EST [11686] CONTEXT:  writing block 0 of
>relation base/17486/2840_vm
>2018-11-21 13:02:08.649 EST [11686] WARNING:  could not write block 0
>of
>base/17486/2840_vm
>2018-11-21 13:02:08.649 EST [11686] DETAIL:  Multiple failures ---
>write
>error might be permanent.

I'll take a look, thanks.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-21 10:36:42 -0800, Andres Freund wrote:
> >pg_upgrade against old versions, by the look of it. Even after I drop 
> >oids from user tables, I get errors like this when running pg_dumpall 
> >against a pg_upgraded REL_9_4_STABLE datadir:

Not pg_upgrade in general, I did test that. "Just" large objects.
pg_largeobject and pg_largeobject_metadata are the only catalog tables
that we copy over from the old cluster. I'd checked that pg_largeobject
doesn't contain an oid column, but somehow missed that pg_upgrade also
copies pg_largeobject_metadata - which does contain an oid.

Part of that is me just missing that fact, but part of it is also that a
binary upgrade pg_dump actually makes it nearly entirely unnecessary to
migrate pg_largeobject_metadata. pg_dump in binary upgrade mode *does*
issue lo_create() calls, and also emits ALTER LARGE OBJECT .. OWNER TO
.. statements. Which make it largely unnecessary to copy
pg_largeobject_metadata over.  Only grants where skipped, but that's
trivial to change.

The attached fix fixes this issue "simply" by enabling the dumping of
blob ACLS in binary upgrade mode, and not copying the underlying files
over anymore. As we already emit multiple statements for each large
objects, doing so additionally for grants ought not to matter too
much. The right way to optimize that, if we wanted, would be to stop
dumping large objects metadata via pg_dump alltogether in pg_upgrade
(and make it a single COPY (SELECT) command that's restrored via COPY).

There's one comment atop of pg_upgrade that I'm not sure about anymore:
 *    We control all assignments of pg_authid.oid because these oids are stored
 *    in pg_largeobject_metadata. XXX still

That's not true anymore after this change. Turns out we currently don't
prevent regrole from being used as a column type (I'll start a thread
about insufficient reg* verification), but we could just prohibit that.
Any comments?


Note that this doesn't fix the < 9.0 path where we recreate
pg_largeobject_metadata. I basically think we should just remove support
for that version. But even if we cannot agree to that, I first want to
know if this fixes your full set of problems.


> >2018-11-21 13:01:58.582 EST [11861] ERROR:  large object 10 does not
> >exist
> >2018-11-21 13:01:58.637 EST [11686] ERROR:  could not open file 
> >"base/17486/2840_vm": No such file or directory
> >2018-11-21 13:01:58.637 EST [11686] CONTEXT:  writing block 0 of 
> >relation base/17486/2840_vm
> >2018-11-21 13:01:59.638 EST [11686] ERROR:  could not open file 
> >"base/17486/2840_vm": No such file or directory
> >2018-11-21 13:01:59.638 EST [11686] CONTEXT:  writing block 0 of 
> >relation base/17486/2840_vm
> >2018-11-21 13:01:59.638 EST [11686] WARNING:  could not write block 0
> >of 
> >base/17486/2840_vm

I was not able to reproduce this issue.  Could you check whether you
still encounter the issue after applying the attached fix? And if so,
provide me with more detailed instructions on how to reproduce?

Greetings,

Andres Freund

Вложения

Re: [RFC] Removing "magic" oids

От
Andrew Dunstan
Дата:
On 11/21/18 7:14 PM, Andres Freund wrote:
>   Could you check whether you
> still encounter the issue after applying the attached fix?
>


This has largely fixed the problem, so I think this should be applied. 
With some adjustments to the tests to remove problematic cases (e.g. 
postgres_fdw's ft_pg_type) the tests pass. The exception is HEAD->HEAD. 
The change is that the LOs are not dumped in the same order pre and post 
upgrade. I can change the tests to allow for a greater fuzz factor - 
generally when the source and target are the same we don't allow any 
fuzz. Or if we care we could do a better job of dumping LOs in a 
consistent order.


cheers


andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-21 23:32:07 -0500, Andrew Dunstan wrote:
> On 11/21/18 7:14 PM, Andres Freund wrote:
> >   Could you check whether you
> > still encounter the issue after applying the attached fix?
> > 
> 
> 
> This has largely fixed the problem, so I think this should be applied.

Cool, will do so tomorrow or such. Thanks for testing.


> With some adjustments to the tests to remove problematic cases (e.g.
> postgres_fdw's ft_pg_type) the tests pass. The exception is
> HEAD->HEAD. The change is that the LOs are not dumped in the same
> order pre and post upgrade. I can change the tests to allow for a
> greater fuzz factor - generally when the source and target are the
> same we don't allow any fuzz.  Or if we care we could do a better job
> of dumping LOs in a consistent order.

So you'd want to dump large objects in oid order or such? Probably
comparatively not a huge overhead, but also not nothing? We don't really
force ordering in other places in pg_dump afaik.

Greetings,

Andres Freund


Re: [RFC] Removing "magic" oids

От
Andrew Dunstan
Дата:
On 11/22/18 4:14 PM, Andres Freund wrote:
> Hi,
>
> On 2018-11-21 23:32:07 -0500, Andrew Dunstan wrote:
>> On 11/21/18 7:14 PM, Andres Freund wrote:
>>>    Could you check whether you
>>> still encounter the issue after applying the attached fix?
>>>
>>
>> This has largely fixed the problem, so I think this should be applied.
> Cool, will do so tomorrow or such. Thanks for testing.
>
>
>> With some adjustments to the tests to remove problematic cases (e.g.
>> postgres_fdw's ft_pg_type) the tests pass. The exception is
>> HEAD->HEAD. The change is that the LOs are not dumped in the same
>> order pre and post upgrade. I can change the tests to allow for a
>> greater fuzz factor - generally when the source and target are the
>> same we don't allow any fuzz.  Or if we care we could do a better job
>> of dumping LOs in a consistent order.
> So you'd want to dump large objects in oid order or such? Probably
> comparatively not a huge overhead, but also not nothing? We don't really
> force ordering in other places in pg_dump afaik.
>

Well, all other data is dumped in a consistent order, and the tests rely 
on this. If we don't care about that for LOs I can accommodate it. I 
don't have a terribly strong opinion about the desirability of making 
LOs keep the same behaviour.


cheers


andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-22 17:12:31 -0500, Andrew Dunstan wrote:
> 
> On 11/22/18 4:14 PM, Andres Freund wrote:
> > Hi,
> > 
> > On 2018-11-21 23:32:07 -0500, Andrew Dunstan wrote:
> > > On 11/21/18 7:14 PM, Andres Freund wrote:
> > > >    Could you check whether you
> > > > still encounter the issue after applying the attached fix?
> > > > 
> > > 
> > > This has largely fixed the problem, so I think this should be applied.
> > Cool, will do so tomorrow or such. Thanks for testing.

Sorry, the long weekend delayed this. Pushed now.


> > > With some adjustments to the tests to remove problematic cases (e.g.
> > > postgres_fdw's ft_pg_type) the tests pass. The exception is
> > > HEAD->HEAD. The change is that the LOs are not dumped in the same
> > > order pre and post upgrade. I can change the tests to allow for a
> > > greater fuzz factor - generally when the source and target are the
> > > same we don't allow any fuzz.  Or if we care we could do a better job
> > > of dumping LOs in a consistent order.
> > So you'd want to dump large objects in oid order or such? Probably
> > comparatively not a huge overhead, but also not nothing? We don't really
> > force ordering in other places in pg_dump afaik.
> > 
> 
> Well, all other data is dumped in a consistent order, and the tests rely on
> this. If we don't care about that for LOs I can accommodate it. I don't have
> a terribly strong opinion about the desirability of making LOs keep the same
> behaviour.

I don't think it's true that other comparable metadata is dumped in a
really consistent order. What changes is the order of data in a system
catalog (pg_largeobject_metadata), and we don't enforce that the order
stays the same in e.g. pg_class either.  I guess we could add a
not-actually-needed sort to getBlobs(), with a comment saying that we
only need that for better comparability and note that it's less needed
for other types of objects due to the dependency ordering that pg_dump
does for most object types.

Greetings,

Andres Freund


Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2018-11-21 16:30:07 +0100, Andreas Karlsson wrote:
> On 11/21/18 1:07 AM, Andres Freund wrote:
> > Let's see what I broke :/
> 
> There is a small typo in the old release notes.

Thanks, pushed!

Greetings,

Andres Freund


Re: [RFC] Removing "magic" oids

От
Andrew Dunstan
Дата:
On 11/26/18 5:50 PM, Andres Freund wrote:
>
>>>> With some adjustments to the tests to remove problematic cases (e.g.
>>>> postgres_fdw's ft_pg_type) the tests pass. The exception is
>>>> HEAD->HEAD. The change is that the LOs are not dumped in the same
>>>> order pre and post upgrade. I can change the tests to allow for a
>>>> greater fuzz factor - generally when the source and target are the
>>>> same we don't allow any fuzz.  Or if we care we could do a better job
>>>> of dumping LOs in a consistent order.
>>> So you'd want to dump large objects in oid order or such? Probably
>>> comparatively not a huge overhead, but also not nothing? We don't really
>>> force ordering in other places in pg_dump afaik.
>>>
>> Well, all other data is dumped in a consistent order, and the tests rely on
>> this. If we don't care about that for LOs I can accommodate it. I don't have
>> a terribly strong opinion about the desirability of making LOs keep the same
>> behaviour.
> I don't think it's true that other comparable metadata is dumped in a
> really consistent order. What changes is the order of data in a system
> catalog (pg_largeobject_metadata), and we don't enforce that the order
> stays the same in e.g. pg_class either.  I guess we could add a
> not-actually-needed sort to getBlobs(), with a comment saying that we
> only need that for better comparability and note that it's less needed
> for other types of objects due to the dependency ordering that pg_dump
> does for most object types.
>


As of now, I have simnply got around the issue in the buildfarm module 
by allowing up to 50 lines of fuzz in the pre-upgrade and post-upgrade 
diffs when the source and destination branch are the same.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [RFC] Removing "magic" oids

От
Noah Misch
Дата:
On Tue, Nov 20, 2018 at 01:20:04AM -0800, Andres Freund wrote:
> On 2018-11-14 21:02:41 -0800, Andres Freund wrote:
> > On 2018-11-15 04:57:28 +0000, Noah Misch wrote:
> > > On Wed, Nov 14, 2018 at 12:01:52AM -0800, Andres Freund wrote:
> > > > - one pgbench test tested concurrent insertions into a table with
> > > >   oids, as some sort of stress test for lwlocks and spinlocks. I *think*
> > > >   this doesn't really have to be a system oid column, and this was just
> > > >   because that's how we triggered a bug on some machine. Noah, do I get
> > > >   this right?
> > > 
> > > The point of the test is to exercise OidGenLock by issuing many parallel
> > > GetNewOidWithIndex() and verifying absence of duplicates.  There's nothing
> > > special about OidGenLock, but it is important to use an operation that takes a
> > > particular LWLock many times, quickly.  If the test query spends too much time
> > > on things other than taking locks, it will catch locking races too rarely.
> > 
> > Sequences ought to do that, too. And if it's borked, we'd hopefully see
> > unique violations. But it's definitely not a 1:1 replacement.

> I've tested this on ppc.  Neither the old version nor the new version
> stress test spinlocks sufficiently to error out with weakened spinlocks
> (not that surprising, there are no spinlocks in any hot path of either
> workload). Both versions very reliably trigger on weakened lwlocks. So I
> think we're comparatively good on that front.

I tested this on xlc, the compiler that motivated the OID test, and the v12+
version of the test didn't catch the bug[1] with xlc 13.1.3.  CREATE TYPE
... AS ENUM generates an OID for each label, so the attached patch makes the
v12+ test have locking behavior similar to its v11 ancestor.

[1] https://postgr.es/m/flat/a72cfcb0-37d0-de2f-b3ec-f38ad8d6a8cc@postgrespro.ru

Вложения

Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2019-07-07 10:00:35 -0700, Noah Misch wrote:
> On Tue, Nov 20, 2018 at 01:20:04AM -0800, Andres Freund wrote:
> > On 2018-11-14 21:02:41 -0800, Andres Freund wrote:
> > > > The point of the test is to exercise OidGenLock by issuing many parallel
> > > > GetNewOidWithIndex() and verifying absence of duplicates.  There's nothing
> > > > special about OidGenLock, but it is important to use an operation that takes a
> > > > particular LWLock many times, quickly.  If the test query spends too much time
> > > > on things other than taking locks, it will catch locking races too rarely.
> > > 
> > > Sequences ought to do that, too. And if it's borked, we'd hopefully see
> > > unique violations. But it's definitely not a 1:1 replacement.
> 
> > I've tested this on ppc.  Neither the old version nor the new version
> > stress test spinlocks sufficiently to error out with weakened spinlocks
> > (not that surprising, there are no spinlocks in any hot path of either
> > workload). Both versions very reliably trigger on weakened lwlocks. So I
> > think we're comparatively good on that front.
> 
> I tested this on xlc, the compiler that motivated the OID test, and the v12+
> version of the test didn't catch the bug[1] with xlc 13.1.3.  CREATE TYPE
> ... AS ENUM generates an OID for each label, so the attached patch makes the
> v12+ test have locking behavior similar to its v11 ancestor.

Interesting.  It's not obvious to me why that works meaningfully better.


> [1] https://postgr.es/m/flat/a72cfcb0-37d0-de2f-b3ec-f38ad8d6a8cc@postgrespro.ru

> diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
> index dc2c72f..3b097a9 100644
> --- a/src/bin/pgbench/t/001_pgbench_with_server.pl
> +++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
> @@ -58,27 +58,20 @@ sub pgbench
>      return;
>  }
>  
> -# Test concurrent insertion into table with serial column.  This
> -# indirectly exercises LWLock and spinlock concurrency.  This test
> -# makes a 5-MiB table.
> -
> -$node->safe_psql('postgres',
> -    'CREATE UNLOGGED TABLE insert_tbl (id serial primary key); ');
> -
> +# Test concurrent OID generation via pg_enum_oid_index.  This indirectly
> +# exercises LWLock and spinlock concurrency.
> +my $labels = join ',', map { "'l$_'" } 1 .. 1000;
>  pgbench(
>      '--no-vacuum --client=5 --protocol=prepared --transactions=25',
>      0,
>      [qr{processed: 125/125}],
>      [qr{^$}],
> -    'concurrent insert workload',
> +    'concurrent OID generation',
>      {
>          '001_pgbench_concurrent_insert' =>
> -          'INSERT INTO insert_tbl SELECT FROM generate_series(1,1000);'
> +          "CREATE TYPE pg_temp.e AS ENUM ($labels); DROP TYPE pg_temp.e;"
>      });

Hm, perhaps we should just do something stupid an insert into a catalog
table, determining the oid to insert with pg_nextoid? That ought to be a
lot faster and thus more "stress testing" than going through a full
blown DDL statement?  But perhaps that's just too ugly.

Greetings,

Andres Freund



Re: [RFC] Removing "magic" oids

От
Noah Misch
Дата:
On Fri, Jul 19, 2019 at 10:12:57AM -0700, Andres Freund wrote:
> On 2019-07-07 10:00:35 -0700, Noah Misch wrote:
> > +# Test concurrent OID generation via pg_enum_oid_index.  This indirectly
> > +# exercises LWLock and spinlock concurrency.
> > +my $labels = join ',', map { "'l$_'" } 1 .. 1000;
> >  pgbench(
> >      '--no-vacuum --client=5 --protocol=prepared --transactions=25',
> >      0,
> >      [qr{processed: 125/125}],
> >      [qr{^$}],
> > -    'concurrent insert workload',
> > +    'concurrent OID generation',
> >      {
> >          '001_pgbench_concurrent_insert' =>
> > -          'INSERT INTO insert_tbl SELECT FROM generate_series(1,1000);'
> > +          "CREATE TYPE pg_temp.e AS ENUM ($labels); DROP TYPE pg_temp.e;"
> >      });
> 
> Hm, perhaps we should just do something stupid an insert into a catalog
> table, determining the oid to insert with pg_nextoid?  That ought to be a
> lot faster and thus more "stress testing" than going through a full
> blown DDL statement?  But perhaps that's just too ugly.

I expect the pg_nextoid strategy could have sufficed.  The ENUM strategy
wastes some time parsing 1000 label names, discarding odd-numbered OIDs, and
dropping the type.  The pg_nextoid strategy wastes time by performing the
insertion loop in the executor instead of dedicated C code of
EnumValuesCreate().  Hard to say how to weight those factors.



Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2019-07-20 11:21:52 -0700, Noah Misch wrote:
> On Fri, Jul 19, 2019 at 10:12:57AM -0700, Andres Freund wrote:
> > On 2019-07-07 10:00:35 -0700, Noah Misch wrote:
> > > +# Test concurrent OID generation via pg_enum_oid_index.  This indirectly
> > > +# exercises LWLock and spinlock concurrency.
> > > +my $labels = join ',', map { "'l$_'" } 1 .. 1000;
> > >  pgbench(
> > >      '--no-vacuum --client=5 --protocol=prepared --transactions=25',
> > >      0,
> > >      [qr{processed: 125/125}],
> > >      [qr{^$}],
> > > -    'concurrent insert workload',
> > > +    'concurrent OID generation',
> > >      {
> > >          '001_pgbench_concurrent_insert' =>
> > > -          'INSERT INTO insert_tbl SELECT FROM generate_series(1,1000);'
> > > +          "CREATE TYPE pg_temp.e AS ENUM ($labels); DROP TYPE pg_temp.e;"
> > >      });
> > 
> > Hm, perhaps we should just do something stupid an insert into a catalog
> > table, determining the oid to insert with pg_nextoid?  That ought to be a
> > lot faster and thus more "stress testing" than going through a full
> > blown DDL statement?  But perhaps that's just too ugly.
> 
> I expect the pg_nextoid strategy could have sufficed.  The ENUM strategy
> wastes some time parsing 1000 label names, discarding odd-numbered OIDs, and
> dropping the type.  The pg_nextoid strategy wastes time by performing the
> insertion loop in the executor instead of dedicated C code of
> EnumValuesCreate().  Hard to say how to weight those factors.

Fair enough. Are you planning to commit your changes?

Greetings,

Andres Freund



Re: [RFC] Removing "magic" oids

От
Noah Misch
Дата:
On Sat, Jul 20, 2019 at 12:56:34PM -0700, Andres Freund wrote:
> On 2019-07-20 11:21:52 -0700, Noah Misch wrote:
> > On Fri, Jul 19, 2019 at 10:12:57AM -0700, Andres Freund wrote:
> > > On 2019-07-07 10:00:35 -0700, Noah Misch wrote:
> > > > +# Test concurrent OID generation via pg_enum_oid_index.  This indirectly
> > > > +# exercises LWLock and spinlock concurrency.
> > > > +my $labels = join ',', map { "'l$_'" } 1 .. 1000;
> > > >  pgbench(
> > > >      '--no-vacuum --client=5 --protocol=prepared --transactions=25',
> > > >      0,
> > > >      [qr{processed: 125/125}],
> > > >      [qr{^$}],
> > > > -    'concurrent insert workload',
> > > > +    'concurrent OID generation',
> > > >      {
> > > >          '001_pgbench_concurrent_insert' =>
> > > > -          'INSERT INTO insert_tbl SELECT FROM generate_series(1,1000);'
> > > > +          "CREATE TYPE pg_temp.e AS ENUM ($labels); DROP TYPE pg_temp.e;"
> > > >      });
> > > 
> > > Hm, perhaps we should just do something stupid an insert into a catalog
> > > table, determining the oid to insert with pg_nextoid?  That ought to be a
> > > lot faster and thus more "stress testing" than going through a full
> > > blown DDL statement?  But perhaps that's just too ugly.
> > 
> > I expect the pg_nextoid strategy could have sufficed.  The ENUM strategy
> > wastes some time parsing 1000 label names, discarding odd-numbered OIDs, and
> > dropping the type.  The pg_nextoid strategy wastes time by performing the
> > insertion loop in the executor instead of dedicated C code of
> > EnumValuesCreate().  Hard to say how to weight those factors.
> 
> Fair enough. Are you planning to commit your changes?

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8a0cbb88524e8b6121597285b811640ee793b3e8



Re: [RFC] Removing "magic" oids

От
Andres Freund
Дата:
Hi,

On 2019-07-20 12:58:55 -0700, Noah Misch wrote:
> On Sat, Jul 20, 2019 at 12:56:34PM -0700, Andres Freund wrote:
> > Fair enough. Are you planning to commit your changes?
> 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8a0cbb88524e8b6121597285b811640ee793b3e8

Oh, sorry for that.  Still haven't fully cought up with things that
happened while I was on vacation.

Thanks.