Обсуждение: New DB-design - help and documentation pointers appreciated

Поиск
Список
Период
Сортировка

New DB-design - help and documentation pointers appreciated

От
Rikard Bosnjakovic
Дата:
Greetings.

I'm on the verge designing a new database for electrical components,
and I am in need of help. I have designed 3-4 small databases earlier,
but these suffer from bad design and I cannot afford bad design for
this database (which I surmise will be the biggest I will ever
create). My older designs suffer from the "stiff,
Excel-look-a-like"-format. That is; fixed column tables that relate to
each other, and when a new column is needed, simply an ALTER TABLE
<foo> ADD COLUMN was issued. I was told, and experienced it myself,
that this is not a durable design, so I need to learn other approaches
how to design a new database - i.e. I want to do The Right Thing from
the beginning.

This database is supposed to be used by many users as a "library" of
their own components.

What I am thinking of like in this database for electrical components
is, first, a table with names: 1N4148, SB640, BY164, and so forth.
These are the components' unique name. These components can be of a
specific type: Diode, Switching Diode and Bridge Rectifier (for the
above three mentioned components). Further on, these can have
different manufacturers, footprints, datasheets, etc. Tables for these
are doable by me.

However, the table that really makes me cringe in fear of un-knowledge
is this one: parameters. A parameter for a component can be something
that tells the designer that it must exceed a specific amount of
voltage. It's also got a unit. For example, the diode 1N4148 has a
parameter Vr (reverse voltage) that equals around 40-50 V (volts)
depending on the manufacturer. It also has another parameter, If
(forward current), that's around 100-120 mA (milliamperes) depending
on manufacturer. There may be parameters describing working
temperature, maximum rated frequency, and so forth. Different kind of
components has different kind of parameters.

Parameters for the components are in no way to be fixed. Users are
supposed to be able to enter new parameters on their own, re-use
already entered components, and so forth. If User #1 adds a 1N4148
with 10 parameters in his "library", another User #2 with 1N4148 may
want to add it to his library, but the components may be from
different manufactures and may deviate in its parameter values. (This
deviation is often very small, but it exists).

The problem in this using my old database design-brain (which I must
reprogram) is that my first approach would have been something like
this:

TABLE parameters (id integer, name text, unit text, value ???)

"???" denotes where I would fail, hard. If I hardcode 'value' to an
integer, that would be fine for values like "1V", "45mA" and so forth,
but in the future I might want to add a boolean value as a parameter,
or a string, or anything else than integer.  Like a float / numeric
denoting maximum voltage: "3.3V".

So, my question (I think) is: I have really no clue except that to use
a separate table for each separate kind of parameter, but since my
users are supposed to be able to add new parameter types, how am I
supposed to keep track of everything? Can anyone shed some light of
how I should setup / link these parameter tables all together, without
locking myself into a static and locked design?


--
- Rikard

Re: New DB-design - help and documentation pointers appreciated

От
Craig Ringer
Дата:
On 03/07/10 02:43, Rikard Bosnjakovic wrote:

> However, the table that really makes me cringe in fear of un-knowledge
> is this one: parameters. A parameter for a component can be something
> that tells the designer that it must exceed a specific amount of
> voltage. It's also got a unit. For example, the diode 1N4148 has a
> parameter Vr (reverse voltage) that equals around 40-50 V (volts)
> depending on the manufacturer. It also has another parameter, If
> (forward current), that's around 100-120 mA (milliamperes) depending
> on manufacturer. There may be parameters describing working
> temperature, maximum rated frequency, and so forth. Different kind of
> components has different kind of parameters.
>
> Parameters for the components are in no way to be fixed. Users are
> supposed to be able to enter new parameters on their own, re-use
> already entered components, and so forth. If User #1 adds a 1N4148
> with 10 parameters in his "library", another User #2 with 1N4148 may
> want to add it to his library, but the components may be from
> different manufactures and may deviate in its parameter values. (This
> deviation is often very small, but it exists).

Much as I generally loathe the approach, this is probably a case for EAV
 (Entity-Attribute-Value) design. Google will tell you more.

It's really an area where the structured typed sets approach of
relational SQL databases starts to fall down a bit. You might even find
yourself better off with another database system that permits
multi-valued fields, variable row lengths, keyword/value rows, or the like.

> So, my question (I think) is: I have really no clue except that to use
> a separate table for each separate kind of parameter, but since my
> users are supposed to be able to add new parameter types, how am I
> supposed to keep track of everything?

You've hit one of those areas where SQL databases kind of suck. You'll
have to use one of the well understood workarounds like EAV and live
with their limitations, or find a database better suited to the data.

It's easy to build a typed EAV table - just add a column for each type
you might want to use, and a CHECK constraint that requires that all but
one of those columns be NULL for any given row. It's ugly, but so is EAV.

--
Craig Ringer

Re: New DB-design - help and documentation pointers appreciated

От
Rikard Bosnjakovic
Дата:
On Sat, Jul 3, 2010 at 03:51, Craig Ringer <craig@postnewspapers.com.au> wrote:

[...]
> You've hit one of those areas where SQL databases kind of suck. You'll
> have to use one of the well understood workarounds like EAV and live
> with their limitations, or find a database better suited to the data.

Thanks for the feedback Craig.

After careful considerations I have decided to ditch the idea of
heavilly using parameters and go back to my initial idea of having a
simple components index instead.

What I've come up to is this structure:

-- Logos, etc, users should be able to upload photos of the components
CREATE TABLE images (
    image_filename varchar(32) NOT NULL,
    image_filesize integer DEFAULT 0,
    image_uploaded timestamp with time zone DEFAULT (localtimestamp),
    image_id serial PRIMARY KEY
);

CREATE TABLE manufacturers (
    manufacturer_name varchar(32) NOT NULL,
    manufacturer_url varchar(32),
    manufacturer_logo integer REFERENCES images(image_id),
    manufacturer_id serial PRIMARY KEY
);

-- E.g. "transistors", "diodes", "amplifiers"
CREATE TABLE categories (
    category_name varchar(32) NOT NULL,
    category_id serial PRIMARY KEY
);

-- E.g. "PNP", "NPN", "JFET"
CREATE TABLE subcategories (
    subcategory_name varchar(32) NOT NULL,
    subcategory_category integer REFERENCES categories(category_id),
    subcategory_id serial PRIMARY KEY
);

-- PDIP, SO, QFN, etc.
CREATE TABLE packages (
    package_name varchar(32) NOT NULL,
    package_image integer REFERENCES images(image_id),
    package_id serial PRIMARY KEY
);

CREATE TABLE users (
    user_name varchar(32) NOT NULL,
    user_password varchar(32),
    user_id serial PRIMARY KEY
);

CREATE TABLE datasheets (
    datasheet_filename varchar(32) NOT NULL,
    datasheet_filesize integer DEFAULT 0,
    datasheet_uploaded timestamp with time zone DEFAULT (localtimestamp),
    datasheet_id serial PRIMARY KEY
);

CREATE TABLE components (
    component_name varchar(32) NOT NULL,
    component_manufacturer integer REFERENCES manufacturers(manufacturer_id),
    component_category integer REFERENCES categories(category_id),
    component_subcategory integer REFERENCES subcategories(subcategory_id),
    component_package integer REFERENCES packages(package_id),
    component_pincount smallint,
    component_owner integer REFERENCES users(user_id),
    component_image integer REFERENCES images(image_id),
    component_datasheet integer REFERENCES datasheets(datasheet_id),
    component_comment text,
    component_scrap boolean DEFAULT FALSE,
    component_id serial PRIMARY KEY
);

Same kind of components can have different manufacturers, that's why
I'm not having any hard constraints anywhere (like UNIQUE in
component_name). Some examples that I want to be able to store:

Different manufacturers and packaging:

Fairchild 74LS14 DIP14 (hole mounted IC with 14 pins)
National  74LS14 SO14  (surface mounted IC with 14 pins)

Different subcategories / packages:

BC547, category transistors, subcategory NPN, package TO-92
BC547, category transistors, subcategory NPN, package TO-220
BC557, category transistors, subcategory PNP, package TO-92

and so forth. The point is that one component name can exist in many
different flavors. I have still not yet come up to a definite solution
how the subcategories will be implemented, so this is still just a
draft.

However, I feel that this design is the same design I seem to use for
all my databases, and in the end I always find that I designed them
wrong from the beginning. The table "components" feels like that one
is going to be locked into a corner; it seems to "fixed" and not as
flexible as I want this database to be. In the future I will probably
want to add more relations without having to make zillions of ugly
patches to the design.

I would therefore appreciate any feedback on this table structure and
how I can improve it even further.

Thanks.


--
- Rikard

Re: New DB-design - help and documentation pointers appreciated

От
Craig Ringer
Дата:
On 20/07/10 18:14, Rikard Bosnjakovic wrote:

> However, I feel that this design is the same design I seem to use for
> all my databases, and in the end I always find that I designed them
> wrong from the beginning. The table "components" feels like that one
> is going to be locked into a corner; it seems to "fixed" and not as
> flexible as I want this database to be.

Flexibility in storing and structuring data isn't a strong point of SQL
databases. That said, there *are* some workarounds. I've mentioned EAV
and why it's ... ugly.

A potentially superior option I didn't think to mention before is
hstore. You can use a hstore field to store key/value "extension" data
that isn't easy to model in a generic relational way without landing up
with hundreds of tiny tables. See:

  http://www.postgresql.org/docs/8.4/static/hstore.html

That way you can store the common stuff in a typical relational form for
easy querying and manipulation, but can fall back to key/value for
hard-to-model attributes that might be quite specific to particular
classes of component.

I guess you could even have your subcategories carry a column that
listedrequired hstore keys as an array, so that you could require that
all components of a particular subtype have a certain list of hstore keys.

(I often wish SQL would see some extensions to support a more ...
flexible ... representation of data. SQL is great for accessing purely
relational data, but it'd be so nice to be able to break the relational
mould where required without having to replace the entire database
system just because some of your data doesn't quite fit. There are
workarounds of a sort (EAV - ugh!, hstore, etc) but the lack of
convenient language support limits them somewhat.)

--
Craig Ringer