Обсуждение: ID column naming convention

Поиск
Список
Период
Сортировка

ID column naming convention

От
droberts
Дата:
Hi, is there a problem calling ID's different when used as a FK vs table ID?
For example


mydimtable ()
 ID
 name
 description


myfacttable ()
  my_dim_id   # FK to ID above
  total_sales


I 'think' if I don't enforce foreign key constraints, then this practice
prevents tools from being able to generate ERD diagrams right?



--
View this message in context: http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: ID column naming convention

От
Rob Sargent
Дата:
On 10/13/2015 11:36 AM, droberts wrote:
Hi, is there a problem calling ID's different when used as a FK vs table ID? 
For example


mydimtable ()IDnamedescription


myfacttable () my_dim_id   # FK to ID above total_sales


I 'think' if I don't enforce foreign key constraints, then this practice
prevents tools from being able to generate ERD diagrams right?



--
View this message in context: http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Pretty sure _all_ ER diagramming tools rely on the DDL of column constraints not the names.  And personally I would name the column mydim_id. i.e don't inject the extra underscore which is not in your table name. (And I assume the table names are fake:  you don't really want "table" in table name)

Re: ID column naming convention

От
"David G. Johnston"
Дата:
On Tue, Oct 13, 2015 at 1:36 PM, droberts <david.roberts@riverbed.com> wrote:
I 'think' if I don't enforce foreign key constraints, then this practice
prevents tools from being able to generate ERD diagrams right?

​Yes, if you don't name them the same and don't setup an actual FK then there is no reason to assume that two random fields on separate tables are in any way related.

I suspect Rob is right that even naming them the same isn't enough - but that would depend upon the tool.  Given the, IMO misguided, prevalence of naming every PK fields "id" about the only meaningful logic such a tool could employ would be to look for fields of the form "table_id" and if "table" has an "id" field assume that they are indeed related.  Since your example use "my_dim" for a field that exists on the "mydimtable" even this logic would be insufficient to guess the link you know exists but don't encode into the schema.

Personally, I don't care whether it is a PK or FK - ID fields are database unique and when present always refer to the same entity.  In this case I would call the field "mydim_id" on both tables.  I would expect to have to define UNIQUE (PRIMARY KEY) and FOREIGN KEY constraints if I wanted tools to understand the relationship between the two tables even though I standardized the name of the ID field.​

There are implications to choosing any particular naming convention.  I'm not familiar with any that are so severe that I would call them problems.

David J.


Re: ID column naming convention

От
Gavin Flower
Дата:
On 14/10/15 06:36, droberts wrote:
> Hi, is there a problem calling ID's different when used as a FK vs table ID?
> For example
>
>
> mydimtable ()
>   ID
>   name
>   description
>
>
> myfacttable ()
>    my_dim_id   # FK to ID above
>    total_sales
>
>
> I 'think' if I don't enforce foreign key constraints, then this practice
> prevents tools from being able to generate ERD diagrams right?
>
>
>
> --
> View this message in context: http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
My practice is to name the PRIMARY KEY as id, and foreign keys with the
original table name plus the sufiix_id.

By leaving the table name off the primary key name, and just using id,
makes it more obvious that it is a primary key (plus it seems redundant
to prefix the primary key name with its own table name!).

CREATE TABLE house
(
     id      int PRIMARY KEY,
     address text
);

CREATE TABLE room
(
     id       int PRIMARY KEY,
     house_id int REFERENCES house(id),
     name     text
);


There are exceptions like:

CREATE TABLE human
(
     id        int PRIMARY KEY,
     mother_id int REFERENCES human (id),
     father_id int REFERENCES human (id),
     name      text
);

Cheers,
Gavin


Re: ID column naming convention

От
droberts
Дата:
Gavin Flower-2 wrote
> On 14/10/15 06:36, droberts wrote:
>> Hi, is there a problem calling ID's different when used as a FK vs table
>> ID?
>> For example
>>
>>
>> mydimtable ()
>>   ID
>>   name
>>   description
>>
>>
>> myfacttable ()
>>    my_dim_id   # FK to ID above
>>    total_sales
>>
>>
>> I 'think' if I don't enforce foreign key constraints, then this practice
>> prevents tools from being able to generate ERD diagrams right?
>>
>>
>>
>> --
>> View this message in context:
>> http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html
>> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>>
>>
> My practice is to name the PRIMARY KEY as id, and foreign keys with the
> original table name plus the sufiix_id.
>
> By leaving the table name off the primary key name, and just using id,
> makes it more obvious that it is a primary key (plus it seems redundant
> to prefix the primary key name with its own table name!).
>
> CREATE TABLE house
> (
>      id      int PRIMARY KEY,
>      address text
> );
>
> CREATE TABLE room
> (
>      id       int PRIMARY KEY,
>      house_id int REFERENCES house(id),
>      name     text
> );
>
>
> There are exceptions like:
>
> CREATE TABLE human
> (
>      id        int PRIMARY KEY,
>      mother_id int REFERENCES human (id),
>      father_id int REFERENCES human (id),
>      name      text
> );
>
> Cheers,
> Gavin
>
>
> --
> Sent via pgsql-general mailing list (

> pgsql-general@

> )
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

Thanks.   My only question is how do you create a schema diagram (ERD) then?
The tool won't know what the relationships are unless maybe you put foreign
key constraints on.  BTW does anyone recommend a tool to to that?  I've been
playing with DbVisualizer.



--
View this message in context: http://postgresql.nabble.com/ID-column-naming-convention-tp5869844p5869881.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: ID column naming convention

От
John R Pierce
Дата:
On 10/13/2015 3:27 PM, droberts wrote:
> Thanks.   My only question is how do you create a schema diagram (ERD) then?
> The tool won't know what the relationships are unless maybe you put foreign
> key constraints on.  BTW does anyone recommend a tool to to that?  I've been
> playing with DbVisualizer.


I don't know of any ERD tool that will recognize foreign key references
without FK constraint definitions, regardless of what things are named.



--
john r pierce, recycling bits in santa cruz



Re: ID column naming convention

От
Karsten Hilbert
Дата:
On Tue, Oct 13, 2015 at 03:44:15PM -0700, John R Pierce wrote:

> On 10/13/2015 3:27 PM, droberts wrote:
> >Thanks.   My only question is how do you create a schema diagram (ERD) then?
> >The tool won't know what the relationships are unless maybe you put foreign
> >key constraints on.  BTW does anyone recommend a tool to to that?  I've been
> >playing with DbVisualizer.
>
>
> I don't know of any ERD tool that will recognize foreign key references
> without FK constraint definitions, regardless of what things are named.

Also, there wouldn't be anything to recognize. Only to assume.

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: ID column naming convention

От
Jim Nasby
Дата:
On 10/13/15 2:34 PM, Gavin Flower wrote:
>>
> My practice is to name the PRIMARY KEY as id, and foreign keys with the
> original table name plus the sufiix_id.
>
> By leaving the table name off the primary key name, and just using id,
> makes it more obvious that it is a primary key (plus it seems redundant
> to prefix the primary key name with its own table name!).

There's two things that are ugly about that though:

Joins become MUCH easier to screw up. When you have 5 different fields
that are all called 'id' it's trivial to mix them up. It's much harder
to accidentally do something like 'blah.person_id = foo.invoice_id'.

The other issue is common to all "bare word" names (id, name,
description, etc): it becomes completely impossible to find all
occurrences of something in code. If you grep your entire codebase for
'person_id', you know you'll find exactly what you want. Grepping for
'id' OTOH would be useless.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: ID column naming convention

От
Gavin Flower
Дата:
On 16/10/15 13:09, Jim Nasby wrote:
> On 10/13/15 2:34 PM, Gavin Flower wrote:
>>>
>> My practice is to name the PRIMARY KEY as id, and foreign keys with the
>> original table name plus the sufiix_id.
>>
>> By leaving the table name off the primary key name, and just using id,
>> makes it more obvious that it is a primary key (plus it seems redundant
>> to prefix the primary key name with its own table name!).
>
> There's two things that are ugly about that though:
>
> Joins become MUCH easier to screw up. When you have 5 different fields
> that are all called 'id' it's trivial to mix them up. It's much harder
> to accidentally do something like 'blah.person_id = foo.invoice_id'.
>
> The other issue is common to all "bare word" names (id, name,
> description, etc): it becomes completely impossible to find all
> occurrences of something in code. If you grep your entire codebase for
> 'person_id', you know you'll find exactly what you want. Grepping for
> 'id' OTOH would be useless.
It would seem to be very dodgy to us a join based on apparently very
different semantic implied by 'blah.person_id = foo.invoice_id'!!! :-)

Because 2 fields in different tables have the same name, it does not
necessarily mean they have the same semantics. For example 2 tables
could have a field named 'start_date', but the one in a table called
'employment' would have different semantics to the one in 'project'.

Since 'id' is only used to indicate a PRIMARY KEY, there is less
confusion in joins, and it is clear when something is a foreign key
rather than a PRIMARY KEY.  For example, if two tables both refer to the
same human, you can join using a.human_id = b.human_id - and it is
clearer when you are joining a child to a parent table, for example
line_item.stock_id = stock.id.

Adopting you convention, it would result in not only picking up foreign
key references, but also the primary keys - which may, or may not, too
helpful!

It would be very rare to have a join such as project.id = task.id, it is
usually a mistake to join tables on their primary key - so using just
'id' as the PRIMARY KEY name is a bonus.

I once devised a stored procedure in SyBase with over 3,000 lines of SQL
(I would have broken it up in smaller units, but it was not practicable
in that development environment).  It had 7 temporary tables, 5 used
'id' as the PRIMARY KEY - and 2 used the name of the PRIMARY KEY of an
existing table ('tcs_id' & 'perorg_seq'), because that made more sense,
as they had the the same semantic meaning. I did not design the 2
databases I queried, but I suspect sometimes I might decide it best to
use something other than just 'id' - but it would be very rare (I won't
say never!) that I'd use the table name as a prefix for the primary key.

Searching on a bare word names can be useful when the fields have
similar, related semantics.  In a real database, I'd be very unlikely to
use 'name' for a field, though using 'description' might be valid.
Though in general, I would agree that using several words in a name is
normally preferable. Also it would also be better to define appropriate
DOMAINs rather than just using bare types like 'text' & 'int' - to
better document the semantics and make it easier to change things in a
more controlled way.

If one was grepping for the occurrences of the use of the PRIMARY KEY of
the table human, you would look for 'human_id' you would only grep for
'id' if one wanted to find the use of PRIMARY KEYs.

No naming convention is perfect in all situations, and I'll adapt mine
as appropriate.  In my experience, my convention (well to be honest, I
adopted it from others - so I can't claim to have originated it!) seems
to be better in general.

Essentially it is a guideline, I won't insist that you have have your
computers confiscated  if you use a different convention!






Re: ID column naming convention

От
Karsten Hilbert
Дата:
On Fri, Oct 16, 2015 at 02:28:25PM +1300, Gavin Flower wrote:

> Since 'id' is only used to indicate a PRIMARY KEY, there is less confusion
> in joins, and it is clear when something is a foreign key rather than a
> PRIMARY KEY.

Given that "id" often has meaning outside the database I much
prefer naming my primary keys "pk". And foreign keys "fk_TABLENAME":

    line_item.pk_stock = stock.pk

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: ID column naming convention

От
Jim Nasby
Дата:
On 10/15/15 8:28 PM, Gavin Flower wrote:
> It would seem to be very dodgy to us a join based on apparently very
> different semantic implied by 'blah.person_id = foo.invoice_id'!!! :-)
>
> Because 2 fields in different tables have the same name, it does not
> necessarily mean they have the same semantics. For example 2 tables
> could have a field named 'start_date', but the one in a table called
> 'employment' would have different semantics to the one in 'project'.

Right, which is why the fields should be called employment_start_date
and project_start_date, not just start_date.

> Since 'id' is only used to indicate a PRIMARY KEY, there is less
> confusion in joins, and it is clear when something is a foreign key
> rather than a PRIMARY KEY.  For example, if two tables both refer to the
> same human, you can join using a.human_id = b.human_id - and it is
> clearer when you are joining a child to a parent table, for example
> line_item.stock_id = stock.id.
>
> Adopting you convention, it would result in not only picking up foreign
> key references, but also the primary keys - which may, or may not, too
> helpful!

It generally shouldn't matter, because a person_id is *always* a
person_id. In cases where it does matter then the field name alone
probably won't help you much, if at all.

Of course, this has been a debate forever, so I know neither of us will
convince the other to change. ;) I just wanted to point out some things
that hadn't come up already.

BTW, I found Karsten's idea of using 'pk' for the surrogate key, and
fk_table_name interesting. It helps avoid ambiguity from externally
generated ID values.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: ID column naming convention

От
Karsten Hilbert
Дата:
On Fri, Oct 16, 2015 at 04:59:12PM -0500, Jim Nasby wrote:

> BTW, I found Karsten's idea of using 'pk' for the surrogate key, and
> fk_table_name interesting. It helps avoid ambiguity from externally
> generated ID values.

That's the point :-)

Here's a real live schema using (mostly) the above approach:

    http://www.gnumed.de/~ncq/gnumed/schema/gnumed_v20/

in case anyone is interested in taking a look.

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: ID column naming convention

От
Scott Mead
Дата:

> On Oct 13, 2015, at 18:27, droberts <david.roberts@riverbed.com> wrote:
>
> Gavin Flower-2 wrote
>>> On 14/10/15 06:36, droberts wrote:
>>> Hi, is there a problem calling ID's different when used as a FK vs table
>>> ID?
>>> For example
>>>
>>>
>>> mydimtable ()
>>>  ID
>>>  name
>>>  description
>>>
>>>
>>> myfacttable ()
>>>   my_dim_id   # FK to ID above
>>>   total_sales
>>>
>>>
>>> I 'think' if I don't enforce foreign key constraints, then this practice
>>> prevents tools from being able to generate ERD diagrams right?
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html
>>> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>> My practice is to name the PRIMARY KEY as id, and foreign keys with the
>> original table name plus the sufiix_id.
>>
>> By leaving the table name off the primary key name, and just using id,
>> makes it more obvious that it is a primary key (plus it seems redundant
>> to prefix the primary key name with its own table name!).
>>
>> CREATE TABLE house
>> (
>>     id      int PRIMARY KEY,
>>     address text
>> );
>>
>> CREATE TABLE room
>> (
>>     id       int PRIMARY KEY,
>>     house_id int REFERENCES house(id),
>>     name     text
>> );
>>
>>
>> There are exceptions like:
>>
>> CREATE TABLE human
>> (
>>     id        int PRIMARY KEY,
>>     mother_id int REFERENCES human (id),
>>     father_id int REFERENCES human (id),
>>     name      text
>> );
>>
>> Cheers,
>> Gavin
>>
>>
>> --
>> Sent via pgsql-general mailing list (
>
>> pgsql-general@
>
>> )
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>
> Thanks.   My only question is how do you create a schema diagram (ERD) then?
> The tool won't know what the relationships are unless maybe you put foreign
> key constraints on.
That's how most tools work, usually by calling the driver api (jdbc databasemetadata, etc....) which in turn look at
theinformation_schema. If you don't setup real referential integrity, any tool that can use names is just guessing    

   I think dbvisualizer will 'infer' based on column names.  I KNOW that schemaspy has this option, but they explicitly
noteit is a GUESS.  

   Use foreign keys.



> BTW does anyone recommend a tool to to that?  I've been
> playing with DbVisualizer.
>
>
>
> --
> View this message in context: http://postgresql.nabble.com/ID-column-naming-convention-tp5869844p5869881.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


Re: ID column naming convention

От
Gavin Flower
Дата:
On 18/10/15 00:13, Karsten Hilbert wrote:
> On Fri, Oct 16, 2015 at 04:59:12PM -0500, Jim Nasby wrote:
>
>> BTW, I found Karsten's idea of using 'pk' for the surrogate key, and
>> fk_table_name interesting. It helps avoid ambiguity from externally
>> generated ID values.
> That's the point :-)
>
> Here's a real live schema using (mostly) the above approach:
>
>     http://www.gnumed.de/~ncq/gnumed/schema/gnumed_v20/
>
> in case anyone is interested in taking a look.
>
> Karsten
Hmm...

Apparently (according to your naming convention) several tables (such as
'clin.substance_intake') have 2 PRIMARY KEYs!!!

I guess you must have a large wall to display the schema on!

Looks like a lot of fun.

Can you tell us more about the database (ignoring nit-picking!)?


Cheers,
Gavin


Re: ID column naming convention

От
"Karsten Hilbert"
Дата:
> > Here's a real live schema using (mostly) the above approach:
> >
> >     http://www.gnumed.de/~ncq/gnumed/schema/gnumed_v20/
> >
> > in case anyone is interested in taking a look.
> >
> > Karsten
>
> Apparently (according to your naming convention) several tables (such as
> 'clin.substance_intake') have 2 PRIMARY KEYs!!!

You've picked perhaps the most complex table in there :-)

> Can you tell us more about the database (ignoring nit-picking!)?

Sure. All the clinical tables inherit from clin.clin_root_item
which itself has a primary key (pk_item) and which, in turn,
inherits from audit.audit_fields, which, again, has a primary
key of its own (pk_audit). That is what makes several tables
appear to have more than one primary key. However, only the
top level serial column is declaratively set to be the actual
primary key of any one table. Certainly, hose columns all _do_
have the properties of primary keys.

Inheritance from clin.clin_root_item is leveraged for cross-table
search of clinical narrative, for one thing.

That table is also overly complex as it needs to take account
of patients being documented to be on treatment regimes with
a mixture of either branded drugs, possibly containing several
active substances, and generic substances without a documented
brand. That's why it is overly complex (I haven't found a
better way to implement this requirement).

Anything else you'd like to know ?

Karsten


Re: ID column naming convention

От
Jim Nasby
Дата:
On 10/18/15 4:32 AM, Karsten Hilbert wrote:
>> Can you tell us more about the database (ignoring nit-picking!)?
> Sure. All the clinical tables inherit from clin.clin_root_item
> which itself has a primary key (pk_item) and which, in turn,
> inherits from audit.audit_fields, which, again, has a primary
> key of its own (pk_audit). That is what makes several tables
> appear to have more than one primary key.

FWIW, I intentionally break some of my conventions when using
inheritance. For example, I normally use a single sequence to drive ALL
surrogate keys for the tables, and child tables keep the same surrogate
key as the parent (which means naming doesn't match). IMHO it makes more
sense to have inheritance do the work for you on this stuff.

BTW, if there's any interest, I have code that sets up a non-inheritance
table specifically for doing foreign keys to the inheritance parent. It
doesn't support updates right now, but should be fully safe from a FK
standpoint.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: ID column naming convention

От
Karsten Hilbert
Дата:
On Mon, Oct 19, 2015 at 04:25:39AM -0500, Jim Nasby wrote:

> BTW, if there's any interest, I have code that sets up a non-inheritance
> table specifically for doing foreign keys to the inheritance parent. It
> doesn't support updates right now, but should be fully safe from a FK
> standpoint.

Can you tell me more ?

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: ID column naming convention

От
Jim Nasby
Дата:
On 10/22/15 11:37 AM, Karsten Hilbert wrote:
> On Mon, Oct 19, 2015 at 04:25:39AM -0500, Jim Nasby wrote:
>
>> BTW, if there's any interest, I have code that sets up a non-inheritance
>> table specifically for doing foreign keys to the inheritance parent. It
>> doesn't support updates right now, but should be fully safe from a FK
>> standpoint.
>
> Can you tell me more ?

I create a '_fk' table that looks like this:

> CREATE TABLE _lead.lead_fk(
>     lead_id                        int                    NOT NULL PRIMARY KEY
>     , organic_lead_id            int                    CONSTRAINT organic_lead_id_sanity CHECK( organic_lead_id IS
NULLOR organic_lead_id = lead_id ) 
>     , some_lead_provider_lead_id        int                    CONSTRAINT some_lead_provider_lead_id_sanity CHECK(
some_lead_provider_lead_idIS NULL OR some_lead_provider_lead_id = lead_id ) 
>     -- TODO , CHECK( only one blah_lead_id field is NOT NULL using count_nulls() from PGXN )
> );

Then each table that inherits from lead.lead (as well as lead.lead
itself) has:

>     , FOREIGN KEY (lead_id) REFERENCES _lead.lead_fk DEFERRABLE INITIALLY DEFERRED

That FK is in place to ensure that when a lead record is inserted, a
corresponding row is inserted into _lead.lead_fk as well. That insert is
handled by this trigger function:

CREATE OR REPLACE FUNCTION _lead.tg_lead_fk(
) RETURNS trigger LANGUAGE plpgsql

-- !!!!!!!!!
SECURITY DEFINER SET search_path = pg_catalog
-- !!!!!!!!!
   AS $body$
BEGIN
   EXECUTE format(
       $$INSERT INTO _lead.lead_fk( lead_id, %I ) VALUES( $1, $1 )$$
       , TG_TABLE_NAME || '_id'
     )
     USING NEW.lead_id
   ;
   RETURN NEW;
END
$body$;

Finally, a table that needs to have a FK to a lead has

>     , lead_id    int    NOT NULL REFERENCES _lead.lead_fk

I also have the following in a pgTap test function to verify that the FK
exists on all children of the lead.lead table.

   FOR r IN
     SELECT * FROM cat_tools.pg_class_v WHERE reloid = 'lead.lead'::regclass
     UNION ALL
     SELECT c.*
       FROM pg_inherits i
         JOIN cat_tools.pg_class_v c ON reloid = inhrelid
       WHERE inhparent = 'lead.lead'::regclass
   LOOP
     RETURN NEXT col_is_pk(
       r.relschema
       , r.relname
       , array[ 'lead_id' ]
       , 'lead_id is PK'
     );

     RETURN NEXT fk_ok(
       r.relschema
       , r.relname
       , 'lead_id'
       , '_lead'
       , 'lead_fk'
       , 'lead_id'
     );
  END LOOP;

At some point I'll turn this into metacode so that setting all of this
up is just a function call. I just haven't gotten to it yet. (Though, if
someone wanted to pay me to do that... ;P )
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com