Re: Solving the OID-collision problem

From mark@mark.mielke.cc
Subject Re: Solving the OID-collision problem
Date
Msg-id 20050804175504.GA7147@mark.mielke.cc
In reply to Re: Solving the OID-collision problem  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Aug 04, 2005 at 12:20:24PM -0400, Tom Lane wrote:
> "Mark Woodward" <pgsql@mohawksoft.com> writes:
> >> I'm too lazy to run an experiment, but I believe it would.  Datum is
> >> involved in almost every function-call API in the backend. In
> >> particular this means that it would affect performance-critical code
> >> paths.
> > I hear you on the "lazy" part, but if OID becomes a structure, then you
> > are still comparing a native type until you get a match, then you make one
> > more comparison to confirm it is the right one, or move on.
> No, you're missing the point entirely: on 32-bit architectures, passing
> a 32-bit integral type to a function is an extremely well optimized
> operation, as is returning a 32-bit integral type.  Passing or
> returning a 64-bit struct is, um, not so well optimized.

I don't think this is necessarily true. For example, instead of passing
the 32-bit integer around, you would instead be passing a 32-bit pointer
to a data structure. This doesn't have to be expensive - although,
depending on the state of the API, it may require extensive changes to
make it inexpensive (or not - I don't know).

From my perspective (new to this list - could be good, or could be bad)
the concept of the OID was too generalized. It appears to have
originally been intended to uniquely identify every row in the
database (system tables and user tables). For that purpose, 32 bits
was not enough to represent every row in the database. It was a
mistake.

The work-around for this mistake was to allow user tables to be
specially defined so that they do not unnecessarily steal range from
the OID space. This work-around proved desirable enough that, as of
PostgreSQL 8, tables are no longer created with OIDs by default. It's
still a work-around. What has been purchased with this work-around is
time to properly address this problem. The problem has not been solved.

I see a few ways to solve this:
   1) Create OID domains. The system tables could have their own OID
      counter separate from the user table OID counters. Tables that
      have no relationship to each other would be put in their own
      OID domain. It isn't as if you can map from row OID to table
      anyways, so any use of OID assumes knowledge of the table
      relationships. I see this as being relatively cheap to implement,
      with no impact on backwards compatibility, except in unusual cases
      where people have seriously abused the concept of an OID. This
      is another delay tactic, in that a sufficient number of changes
      to the system tables would still cause a wrap-around, however,
      it is equivalent or better to the suggestion that all user tables
      be created without oids, as this at least allows user tables to
      use oids again.
   2) Enlarge the OID to be 64-bit or 128-bit. I don't see this as
      necessarily being a performance problem, however, it might require
      significant changes to the API, which would be expensive. It might
      be argued that enlarging the OID merely delays the problem, and
      doesn't actually address it. Perhaps delaying it by 2^32 is
      effectively indefinitely delaying it, or perhaps not. Those who
      thought 32 bits would be enough, or those who thought two-digit
      years would be enough, under-estimated the problem. Compatibility
      can be mostly maintained, although the databases would probably
      need to be upgraded, and applications that assumed that the OID
      could fit into a 32-bit integer would break.
 
   3) Leave OIDs as the general database-wide row identifier, and don't
      use OIDs to identify system metadata. Instead, use a UUID (128-bit)
      or similar. System tables are special. Why shouldn't they have a
      non-general means of identifying stored metadata? This has some
      of the benefits of 1, all of the costs of 2, and it additionally
      breaks compatibility for everything.
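
To put the "delaying it by 2^32" remark in option 2 into numbers, here
is a rough back-of-the-envelope sketch. The allocation rate of 1000
OIDs per second is purely a hypothetical figure chosen for
illustration, not a measurement of any real system, and the function
name is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def seconds_until_wrap(bits, oids_per_second):
    """Time for a counter of the given width to hand out every value once."""
    return 2 ** bits / oids_per_second


RATE = 1000  # hypothetical: OIDs consumed per second

for bits in (32, 64):
    secs = seconds_until_wrap(bits, RATE)
    print(f"{bits}-bit counter: {secs / SECONDS_PER_DAY:.1f} days "
          f"({secs / SECONDS_PER_YEAR:.1f} years) until wrap-around")
```

At that assumed rate a 32-bit counter wraps in roughly 50 days, while a
64-bit counter lasts hundreds of millions of years. Whether that counts
as "indefinitely" depends on how far the real rate is from the assumed
one.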
 

Based on my suggestions above, I see 1) as the best short- and
medium-term route. How hard would it be? Instead of a database-wide OID
counter, we have several OID counters, with each table associated with
one of them. Assuming the OID domains are properly defined, all
existing code continues to function properly, and wrap-around of the
OID in one domain doesn't break the other domains, such as the system
tables.
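
A minimal sketch of the idea, to make it concrete. This is not
PostgreSQL code: the class names, the table-to-domain mapping, and the
choice to skip 0 on wrap-around are all assumptions made for
illustration. It only shows that each domain keeps its own 32-bit
counter, so exhausting one leaves the others untouched.

```python
OID_MODULUS = 2 ** 32  # a 32-bit OID wraps after this many values
INVALID_OID = 0        # assumption for this sketch: 0 is never handed out


class OidDomain:
    """One independent OID counter (hypothetical name)."""

    def __init__(self, name):
        self.name = name
        self.next_oid = 1

    def allocate(self):
        oid = self.next_oid
        # Advance with 32-bit wrap-around, skipping the invalid value.
        self.next_oid = (self.next_oid + 1) % OID_MODULUS
        if self.next_oid == INVALID_OID:
            self.next_oid = 1
        return oid


class OidAllocator:
    """Associates each table with a domain; tables in a domain share a counter."""

    def __init__(self):
        self.domains = {}
        self.table_domain = {}

    def assign(self, table, domain):
        self.domains.setdefault(domain, OidDomain(domain))
        self.table_domain[table] = domain

    def new_oid(self, table):
        return self.domains[self.table_domain[table]].allocate()


alloc = OidAllocator()
alloc.assign("pg_class", "system")
alloc.assign("orders", "user")

# Push the user domain to the edge of the 32-bit range.
alloc.domains["user"].next_oid = OID_MODULUS - 1
last_user_oid = alloc.new_oid("orders")     # last value before the wrap
wrapped_user_oid = alloc.new_oid("orders")  # user domain has wrapped
system_oid = alloc.new_oid("pg_class")      # system domain is unaffected
```

Note that a wrapped domain can still collide with its own older rows;
the sketch only isolates domains from each other, which is the property
claimed above.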

Cheers,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...
 
                          http://mark.mielke.cc/


