Обсуждение: Removing special case OID generation

Поиск
Список
Период
Сортировка

Removing special case OID generation

От
Simon Riggs
Дата:
Recent events have made me notice the OID handling.

AFAICS, OIDs are just a sequence with a max value that fits in a uint.

So ISTM that we should just strip out the OID handling code and just
have a system sequence defined like this

CREATE SEQUENCE _pg_oid  MINVALUE 0  MAXVALUE 4294967296  CACHE 8192  CYCLE;

Which would then make it easier to have a sequence for each toast
table and a sequence for lobs.

Not sure its important now, but maybe it will reduce the size of the
executable and avoid oid-specific bugs.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Removing special case OID generation

От
Alvaro Herrera
Дата:
Excerpts from Simon Riggs's message of mar feb 07 10:46:09 -0300 2012:
> Recent events have made me notice the OID handling.
>
> AFAICS, OIDs are just a sequence with a max value that fits in a uint.
>
> So ISTM that we should just strip out the OID handling code and just
> have a system sequence defined like this
>
> CREATE SEQUENCE _pg_oid
>    MINVALUE 0
>    MAXVALUE 4294967296
>    CACHE 8192
>    CYCLE;
>
> Which would then make it easier to have a sequence for each toast
> table and a sequence for lobs.

Yeah, I agree that having a single sequence to handle oid allocations on
all toast tables across all databases is strange.  I don't have evidence
that this is a real scalability problem though.  But theoretically at
least it seems to me that there could sporadically be a problem if a
table has a long string of allocated OIDs and the system OID generator
wraps around to somewhere within that range to allocate a new one for
that table.  That could cause a long period of spinning to get a new
value, thus high latency on that insert.

(Now admittedly if the same were to happen with a non-shared sequence,
it would have to spin just the same -- but it'd do so without having to
grab a system-level lwlock each time.)

Having one sequence for each toast table could be wasteful though.  I
mean, sequences are not the best use of shared buffer cache currently.
If we could have more than one sequence data in a shared buffer page,
things would be different.  Not sure how serious this really is.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Removing special case OID generation

От
Tom Lane
Дата:
Simon Riggs <simon@2ndQuadrant.com> writes:
> So ISTM that we should just strip out the OID handling code and just
> have a system sequence defined like this

I think this is a pretty poor idea, because the overhead of nextval()
is quite a lot larger than the overhead to get an OID.
        regards, tom lane


Re: Removing special case OID generation

От
Jim Nasby
Дата:
On 2/7/12 8:14 AM, Alvaro Herrera wrote:
> Having one sequence for each toast table could be wasteful though.  I
> mean, sequences are not the best use of shared buffer cache currently.
> If we could have more than one sequence data in a shared buffer page,
> things would be different.  Not sure how serious this really is.

This would actually be an argument for supporting multiple page sizes... too bad that's such a beast.

FWIW, from our most complex production database:

cnuapp_prod@postgres08.obr=# select relkind, count(*) from pg_class group by 1; relkind | count
---------+------- S       |   522 r       |  1058 t       |   698 i       |  2894 v       |   221 c       |    12
(6 rows)
-- 
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


Re: Removing special case OID generation

От
Simon Riggs
Дата:
On Fri, Feb 10, 2012 at 10:38 PM, Jim Nasby <jim@nasby.net> wrote:
> On 2/7/12 8:14 AM, Alvaro Herrera wrote:
>>
>> Having one sequence for each toast table could be wasteful though.  I
>> mean, sequences are not the best use of shared buffer cache currently.
>> If we could have more than one sequence data in a shared buffer page,
>> things would be different.  Not sure how serious this really is.
>
>
> This would actually be an argument for supporting multiple page sizes... too
> bad that's such a beast.
>
> FWIW, from our most complex production database:
>
> cnuapp_prod@postgres08.obr=# select relkind, count(*) from pg_class group by
> 1;
>  relkind | count
> ---------+-------
>  S       |   522
>  r       |  1058
>  t       |   698
>  i       |  2894
>  v       |   221
>  c       |    12
> (6 rows)


Yeh, I was thinking we would do well to implement cached sequences for
say first 1000 sequences.

That would mean we'd only use a few thousand bytes of memory rather
than 4 MB for your sequences.

Idea would be to make Sequences as fast as OIDs and get rid of the
weird OID code.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Removing special case OID generation

От
Robert Haas
Дата:
On Sat, Feb 11, 2012 at 4:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Yeh, I was thinking we would do well to implement cached sequences for
> say first 1000 sequences.

Another option might be to store all the sequences for a particular
database in a single underlying data file.  The current implementation
uses a whole page for a single tuple that is presumably much smaller
than that.  So when you create a sequence "foo", it's really creating
a row in some new system catalog pg_sequences, or something like that.

> Idea would be to make Sequences as fast as OIDs and get rid of the
> weird OID code.

Honestly, I think the biggest hassle of the OID code is not so much
the way they're generated as the way they're stored within heap
tuples.  I've wondered whether we should go through the system
catalogs and replace all of the hidden OID columns with a normal
column called "oid" of type OID, always placing it at attnum = 1 since
we have a lot of code that assumes the OID column always has the same
attnum.  That would be a fairly large notational change, but maybe it
wouldn't break anything /too/ badly...

Anyway, if we could get away with that, we could eventually (after N
releases) drop the special case support for system OID columns, which
would be a nice simplification.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Removing special case OID generation

От
Marti Raudsepp
Дата:
On Mon, Feb 13, 2012 at 15:08, Robert Haas <robertmhaas@gmail.com> wrote:
> Honestly, I think the biggest hassle of the OID code is not so much
> the way they're generated as the way they're stored within heap
> tuples.  I've wondered whether we should go through the system
> catalogs and replace all of the hidden OID columns with a normal
> column called "oid" of type OID

Do we have a clear idea about what to do with user tables that are
created WITH OIDS? Do we care about compatibility with that at all? Do
we generate this explicit "oid" column manually or do we just tell
users to use a serial or global sequence instead?

Personally I'd also like to see us get rid of the default_with_oids
setting -- I assume the existence of that is the reason why pgAdmin
and TOAD still generate table DDL with an explicit "WITH (OIDS=FALSE)"

Regards,
Marti


Re: Removing special case OID generation

От
Robert Haas
Дата:
On Mon, Feb 13, 2012 at 8:51 AM, Marti Raudsepp <marti@juffo.org> wrote:
> On Mon, Feb 13, 2012 at 15:08, Robert Haas <robertmhaas@gmail.com> wrote:
>> Honestly, I think the biggest hassle of the OID code is not so much
>> the way they're generated as the way they're stored within heap
>> tuples.  I've wondered whether we should go through the system
>> catalogs and replace all of the hidden OID columns with a normal
>> column called "oid" of type OID
>
> Do we have a clear idea about what to do with user tables that are
> created WITH OIDS? Do we care about compatibility with that at all?

I think it would be fine to eventually drop support for user tables
with OIDs.  That hasn't been enabled by default for a very long time:

commit 7ce9b7c0d8c8dbefc04978765422f760dcf3788c
Author: Bruce Momjian <bruce@momjian.us>
Date:   Mon Dec 1 22:08:02 2003 +0000
   This patch adds a new GUC var, "default_with_oids", which follows the   proposal for eventually deprecating OIDs on
usertables that I posted   earlier to pgsql-hackers. pg_dump now always specifies WITH OIDS or   WITHOUT OIDS when
dumpinga table. The documentation has been updated. 
   Neil Conway

I think there's not much benefit to deprecating that feature as long
as we need system OID columns in the catalogs.  But if we got rid of
them there then I think we could drop support in userland, too.

> Do
> we generate this explicit "oid" column manually or do we just tell
> users to use a serial or global sequence instead?
>
> Personally I'd also like to see us get rid of the default_with_oids
> setting -- I assume the existence of that is the reason why pgAdmin
> and TOAD still generate table DDL with an explicit "WITH (OIDS=FALSE)"

That probably has as much to do with inertia as anything else.  I
agree that it's ugly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Removing special case OID generation

От
Andres Freund
Дата:
On Monday, February 13, 2012 02:08:08 PM Robert Haas wrote:
> On Sat, Feb 11, 2012 at 4:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > Yeh, I was thinking we would do well to implement cached sequences for
> > say first 1000 sequences.
> 
> Another option might be to store all the sequences for a particular
> database in a single underlying data file.  The current implementation
> uses a whole page for a single tuple that is presumably much smaller
> than that.  So when you create a sequence "foo", it's really creating
> a row in some new system catalog pg_sequences, or something like that.
I wonder if the tigher packing would be noticeable contentionwise.... If 
several hot sequences end up in a single page that could end up being 
measurable.

Andres


Re: Removing special case OID generation

От
Robert Haas
Дата:
On Mon, Feb 13, 2012 at 9:41 AM, Andres Freund <andres@anarazel.de> wrote:
> On Monday, February 13, 2012 02:08:08 PM Robert Haas wrote:
>> On Sat, Feb 11, 2012 at 4:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > Yeh, I was thinking we would do well to implement cached sequences for
>> > say first 1000 sequences.
>>
>> Another option might be to store all the sequences for a particular
>> database in a single underlying data file.  The current implementation
>> uses a whole page for a single tuple that is presumably much smaller
>> than that.  So when you create a sequence "foo", it's really creating
>> a row in some new system catalog pg_sequences, or something like that.
> I wonder if the tigher packing would be noticeable contentionwise.... If
> several hot sequences end up in a single page that could end up being
> measurable.

For the contention to really be an issue, you'd need a very high rate
of access to that sequence - in my tests so far, the only things that
seem to get hot enough to really hurt are the roots of btrees and
visibility map pages.  And on the plus side, you'd be reducing the
number of pages fighting to stay in shared_buffers.  That having been
said, it's something to watch out for - I certainly don't know enough
to say for certain that it wouldn't be a problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Removing special case OID generation

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> Another option might be to store all the sequences for a particular
> database in a single underlying data file.

We've looked into that before, and found some roadblocks IIRC, though
it probably isn't completely infeasible.  See the archives.
        regards, tom lane