Обсуждение: variant column type

Поиск
Список
Период
Сортировка

variant column type

От
salah jubeh
Дата:

 Hello,

suppose the following scenario

the car speed is 240 
the car has an airbag

Here the first value is integer and the second value is boolean. Consider that I have this table structure

feature (feature id feature name)
car (car id, .... )
car_feature (car id, feature id, value). the value attribute might have different domains. How can I model this using postgres and using ANSI compliant design ?

Regards 

Re: variant column type

От
John R Pierce
Дата:
On 07/26/11 10:02 AM, salah jubeh wrote:
> and using ANSI compliant design

American National Standards Institute?   they have an ANSI standard on
database schema design or something?



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


Re: variant column type

От
salah jubeh
Дата:

 
Hello John,

I mean ANSI SQL 92 complaint, if I am not mistaken. One solution to this problem is to use something like hstore. but  it has some disadvantages in my application so I want another opinion. 

Regards
 



From: John R Pierce <pierce@hogranch.com>
To: pgsql-general@postgresql.org
Sent: Tue, July 26, 2011 7:10:47 PM
Subject: Re: [GENERAL] variant column type

On 07/26/11 10:02 AM, salah jubeh wrote:
> and using ANSI compliant design

American National Standards Institute?  they have an ANSI standard on database schema design or something?



-- john r pierce                            N 37, W 122
santa cruz ca                        mid-left coast


-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: variant column type

От
Ben Chobot
Дата:
On Jul 26, 2011, at 10:02 AM, salah jubeh wrote:


 Hello,

suppose the following scenario

the car speed is 240 
the car has an airbag

Here the first value is integer and the second value is boolean. Consider that I have this table structure

feature (feature id feature name)
car (car id, .... )
car_feature (car id, feature id, value). the value attribute might have different domains. How can I model this using postgres and using ANSI compliant design ?


You haven't been very clear but it sounds like maybe you're asking how to get both EAV "flexibility" and type safety? There isn't any good way to do that but you might consider something ugly, like:

car (id, ...)
car_boolean_features (car_id, ....)
car_int_features (car_id, ...)

A better thing to consider, of course, is how you might avoid the need for anything like EAV in the first place.

Re: variant column type

От
John R Pierce
Дата:
in general, attribute-value sorts of lists are very difficult to use for
relational operations and result in clumsy inefficient queries, as well
as poor data integrity.

whenever possible common attributes shoudl be stored properly as table
fields.   reserve EAV for highly sparse freeform information that could
not have been anticipated at design time.   for your example, all cars
have a speed, and do/don't have an airbag, so these should be normal
fields in a table.


--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


Re: variant column type

От
"David Johnston"
Дата:

 

the car speed is 240 

the car has an airbag

 

Here the first value is integer and the second value is boolean. Consider that I have this table structure

 

feature (feature id feature name)

car (car id, .... )

car_feature (car id, feature id, value). the value attribute might have different domains. How can I model this using postgres and using ANSI compliant design ?

 

Regards 

Given “feature” and “car-feature” tables the presence of absence of an entry in “car-feature” will accomplish your desire for true/false - i.e., “the car has an airbag”.  By abstracting just a little every “feature” can be boiled down to a label/ID and then the added to “feature” and associated via “car-feature”.

In your example you could create a feature called “Top Speed – 240kph”

If every car is going to have a particular “feature” and only the “value” matters you could considering adding a “car-properties” table:

car_property (car id, top_speed, etc…) and populate the top_speed column with whatever value is applicable or leave it NULL if unknown or N/A.  The relationship between “car” and “car_property” would be one-to-one (1-to-1)

Dave J.

Re: variant column type

От
Thomas Kellerer
Дата:
salah jubeh, 26.07.2011 19:02:
>
> Hello,
>
> suppose the following scenario
>
> the car speed is 240
> the car has an airbag
>
> Here the first value is integer and the second value is boolean. Consider that I have this table structure
>
> feature (feature id feature name)
> car (car id, .... )
> car_feature (car id, feature id, value). the value attribute might have different domains. How can I model this using
postgresand using ANSI compliant design ? 
>
> Regards
>
Have a look at the hstore contrib module.

It allows you to store key/value pairs (lots of them) in a single column.

create table car
(
   car_id integer,
   features hstore
);

insert into car (car_id, features)
values
(1, 'speed => 240, airbag => true');

insert into car (car_id, features)
values
(2, 'speed => 140, airbag => false');

insert into car (car_id, features)
values
(3, 'speed => 140, flux_capacitor => true');

-- show the airbag attribute for all cars
-- will return null for those that don't have that attribute
select car_id, (features -> 'airbag') as airbag_flag
from car;

-- return all rows that have an attribute named flux_capacitor with the value true
select *
from car
where features @> ('flux_capacitor => true')

Note that the only drawback of this solution is that you don't have any datatypes for the attributes and you can't
createa foreign key constraint to a "feature" table. But it's probably the most flexible way to deal with such a
requirementin Postgres. 





Re: variant column type

От
Chris Travers
Дата:
On Tue, Jul 26, 2011 at 11:06 AM, David Johnston <polobo@yahoo.com> wrote:

> Given “feature” and “car-feature” tables the presence of absence of an entry
> in “car-feature” will accomplish your desire for true/false - i.e., “the car
> has an airbag”.  By abstracting just a little every “feature” can be boiled
> down to a label/ID and then the added to “feature” and associated via
> “car-feature”.
>
> In your example you could create a feature called “Top Speed – 240kph”
>
> If every car is going to have a particular “feature” and only the “value”
> matters you could considering adding a “car-properties” table:
>
> car_property (car id, top_speed, etc…) and populate the top_speed column
> with whatever value is applicable or leave it NULL if unknown or N/A.  The
> relationship between “car” and “car_property” would be one-to-one (1-to-1)
>
I don't like this approach for a couple of reasons.

1)  Storing non-applicable and unknowns as interchangeable in a
database schema introduces semantic ambiguity issues that are best
avoided if possible.

2)  While wide tables win in terms of supporting more complex
constraints, they lose in terms of storage, etc.

I would personally create three tables:

1)  car (id, top_speed, ec)
2)  features (id, feature_name, etc)
3)  car_has_feature (car_id, feature_id)

This has the benefits of allowing you to track additional information
about features.  For example, you could track that seatbelts are
required in Washington State on all cars manufactured after a certain
date.  The array functions in PostgreSQL are powerful enough to handle
queries of features by car pretty well, or that federal law requires
that certain airbag features are required.

Now, there are a few cases however where key-value-mapping is both
necessary and works and where variant column types are needed (for
example, storing application settings, or argument lists for the
functions that menu items call).  In those cases you have to have
somewhere that knows what the type is supposed to be and checks it.
That's really not a trivial problem because keeping things to a single
point of truth approach is very difficult with such relatively
unstructured data, which is why in applications where I have to do
this, we require that the table be updated through stored procedures
which do this checking.

Best Wishes,
Chris Travers

Re: variant column type

От
"David Johnston"
Дата:
-----Original Message-----
From: Chris Travers [mailto:chris.travers@gmail.com]
Sent: Tuesday, July 26, 2011 2:32 PM
To: David Johnston
Cc: salah jubeh; pgsql
Subject: Re: [GENERAL] variant column type
> In your example you could create a feature called “Top Speed – 240kph”
>
> If every car is going to have a particular “feature” and only the “value”
> matters you could considering adding a “car-properties” table:
>
> car_property (car id, top_speed, etc…) and populate the top_speed
> column with whatever value is applicable or leave it NULL if unknown
> or N/A.  The relationship between “car” and “car_property” would be
> one-to-one (1-to-1)
>
I don't like this approach for a couple of reasons.

1)  Storing non-applicable and unknowns as interchangeable in a database
schema introduces semantic ambiguity issues that are best avoided if
possible.

2)  While wide tables win in terms of supporting more complex constraints,
they lose in terms of storage, etc.

----------------------------------------------

Agreed.  But I was suggesting using the "wide-table" in addition to the
"car;feature;car_has_feature" tables.  Basically limit the extended table to
those properties that are truly (or at least almost truly) global.  There
should only be a few fields.  The fact that the car has a top-speed can be
assumed to be global and thus calling it a "feature" is possibly abstracting
things too much.  If you need to display it in a "feature list" you can
readily write a VIEW that will pull out that integer value from the extended
table, convert it into a meaningful "name/description", and present it as a
list of "Fixed Features".

My main concern with the whole "feature" table is you end up going down the
path of everything being a "feature" - the VIN, Make, Model, Year - where in
most sane cases you'd be better off having fields for those fields since
every car has one.  And so, while I say use an "wide-table" to capture some
of these additional values you can just add the "top-speed" field to the
main car table.  In this specific example there is not semantic ambiguity
since we know that a car has a top-speed and so a NULL must represent an
UNKNOWN value.  If the NULL could represent "Not Applicable" I would
probably leave it to the "feature" table.

David J.



Re: variant column type

От
Radosław Smogura
Дата:
 On Tue, 26 Jul 2011 10:45:27 -0700, John R Pierce wrote:
> in general, attribute-value sorts of lists are very difficult to use
> for relational operations and result in clumsy inefficient queries,
> as
> well as poor data integrity.
>
> whenever possible common attributes shoudl be stored properly as
> table fields.   reserve EAV for highly sparse freeform information
> that could not have been anticipated at design time.   for your
> example, all cars have a speed, and do/don't have an airbag, so these
> should be normal fields in a table.
>
>
> --
> john r pierce                            N 37, W 122
> santa cruz ca                         mid-left coast
 Everything above is true and.

 Database table is like C struct, no inheritance. If you have common
 attributes per some class, but no all cars have same class, you may
 create "extending" table with those attributes as columns, and then join
 it with car.

 Currently I work on project with design car 1..* features. It's
 painful. Many features id's hard-coded, no contract programming (no
 support from compiler, etc. I use O-R libraries, and I can't even write
 car.speed!

 Regards,
 Radek

Re: variant column type

От
salah jubeh
Дата:
Thanks for the help, and for different design options it really helped me. I had a look on vertical design and horizontal design and this is some cons and pros in general for vertical design

Advantages:
•    Avoid null values and utilize storage
•    Avoid constant schema changes due to adding columns
•    Avoid performance issues which may  encountered when a the table is very wide for specific queries
Disadvantages
•    Queries written against the vertical design became cumbersome.
•    Multiple joins to pull back each attribute.
•    Data needs to be converted to the horizontal design in many cases so you need the contribution modules such as table funcs
•    Data domains are problematic and hacks here can be used such as fix your attributes, or using many tables.... 
 
 
------------------------

the car speed is 240 

the car has an airbag

 

Here the first value is integer and the second value is boolean. Consider that I have this table structure

 

feature (feature id feature name)

car (car id, .... )

car_feature (car id, feature id, value). the value attribute might have different domains. How can I model this using postgres and using ANSI compliant design ?

 

Regards