Обсуждение: New DB-design - help and documentation pointers appreciated
Greetings. I'm on the verge designing a new database for electrical components, and I am in need of help. I have designed 3-4 small databases earlier, but these suffer from bad design and I cannot afford bad design for this database (which I surmise will be the biggest I will ever create). My older designs suffer from the "stiff, Excel-look-a-like"-format. That is; fixed column tables that relate to each other, and when a new column is needed, simply an ALTER TABLE <foo> ADD COLUMN was issued. I was told, and experienced it myself, that this is not a durable design, so I need to learn other approaches how to design a new database - i.e. I want to do The Right Thing from the beginning. This database is supposed to be used by many users as a "library" of their own components. What I am thinking of like in this database for electrical components is, first, a table with names: 1N4148, SB640, BY164, and so forth. These are the components' unique name. These components can be of a specific type: Diode, Switching Diode and Bridge Rectifier (for the above three mentioned components). Further on, these can have different manufacturers, footprints, datasheets, etc. Tables for these are doable by me. However, the table that really makes me cringe in fear of un-knowledge is this one: parameters. A parameter for a component can be something that tells the designer that it must exceed a specific amount of voltage. It's also got a unit. For example, the diode 1N4148 has a parameter Vr (reverse voltage) that equals around 40-50 V (volts) depending on the manufacturer. It also has another parameter, If (forward current), that's around 100-120 mA (milliamperes) depending on manufacturer. There may be parameters describing working temperature, maximum rated frequency, and so forth. Different kind of components has different kind of parameters. Parameters for the components are in no way to be fixed. Users are supposed to be able to enter new parameters on their own, re-use already entered components, and so forth. If User #1 adds a 1N4148 with 10 parameters in his "library", another User #2 with 1N4148 may want to add it to his library, but the components may be from different manufactures and may deviate in its parameter values. (This deviation is often very small, but it exists). The problem in this using my old database design-brain (which I must reprogram) is that my first approach would have been something like this: TABLE parameters (id integer, name text, unit text, value ???) "???" denotes where I would fail, hard. If I hardcode 'value' to an integer, that would be fine for values like "1V", "45mA" and so forth, but in the future I might want to add a boolean value as a parameter, or a string, or anything else than integer. Like a float / numeric denoting maximum voltage: "3.3V". So, my question (I think) is: I have really no clue except that to use a separate table for each separate kind of parameter, but since my users are supposed to be able to add new parameter types, how am I supposed to keep track of everything? Can anyone shed some light of how I should setup / link these parameter tables all together, without locking myself into a static and locked design? -- - Rikard
On 03/07/10 02:43, Rikard Bosnjakovic wrote: > However, the table that really makes me cringe in fear of un-knowledge > is this one: parameters. A parameter for a component can be something > that tells the designer that it must exceed a specific amount of > voltage. It's also got a unit. For example, the diode 1N4148 has a > parameter Vr (reverse voltage) that equals around 40-50 V (volts) > depending on the manufacturer. It also has another parameter, If > (forward current), that's around 100-120 mA (milliamperes) depending > on manufacturer. There may be parameters describing working > temperature, maximum rated frequency, and so forth. Different kind of > components has different kind of parameters. > > Parameters for the components are in no way to be fixed. Users are > supposed to be able to enter new parameters on their own, re-use > already entered components, and so forth. If User #1 adds a 1N4148 > with 10 parameters in his "library", another User #2 with 1N4148 may > want to add it to his library, but the components may be from > different manufactures and may deviate in its parameter values. (This > deviation is often very small, but it exists). Much as I generally loathe the approach, this is probably a case for EAV (Entity-Attribute-Value) design. Google will tell you more. It's really an area where the structured typed sets approach of relational SQL databases starts to fall down a bit. You might even find yourself better off with another database system that permits multi-valued fields, variable row lengths, keyword/value rows, or the like. > So, my question (I think) is: I have really no clue except that to use > a separate table for each separate kind of parameter, but since my > users are supposed to be able to add new parameter types, how am I > supposed to keep track of everything? You've hit one of those areas where SQL databases kind of suck. You'll have to use one of the well understood workarounds like EAV and live with their limitations, or find a database better suited to the data. It's easy to build a typed EAV table - just add a column for each type you might want to use, and a CHECK constraint that requires that all but one of those columns be NULL for any given row. It's ugly, but so is EAV. -- Craig Ringer
On Sat, Jul 3, 2010 at 03:51, Craig Ringer <craig@postnewspapers.com.au> wrote: [...] > You've hit one of those areas where SQL databases kind of suck. You'll > have to use one of the well understood workarounds like EAV and live > with their limitations, or find a database better suited to the data. Thanks for the feedback Craig. After careful considerations I have decided to ditch the idea of heavilly using parameters and go back to my initial idea of having a simple components index instead. What I've come up to is this structure: -- Logos, etc, users should be able to upload photos of the components CREATE TABLE images ( image_filename varchar(32) NOT NULL, image_filesize integer DEFAULT 0, image_uploaded timestamp with time zone DEFAULT (localtimestamp), image_id serial PRIMARY KEY ); CREATE TABLE manufacturers ( manufacturer_name varchar(32) NOT NULL, manufacturer_url varchar(32), manufacturer_logo integer REFERENCES images(image_id), manufacturer_id serial PRIMARY KEY ); -- E.g. "transistors", "diodes", "amplifiers" CREATE TABLE categories ( category_name varchar(32) NOT NULL, category_id serial PRIMARY KEY ); -- E.g. "PNP", "NPN", "JFET" CREATE TABLE subcategories ( subcategory_name varchar(32) NOT NULL, subcategory_category integer REFERENCES categories(category_id), subcategory_id serial PRIMARY KEY ); -- PDIP, SO, QFN, etc. CREATE TABLE packages ( package_name varchar(32) NOT NULL, package_image integer REFERENCES images(image_id), package_id serial PRIMARY KEY ); CREATE TABLE users ( user_name varchar(32) NOT NULL, user_password varchar(32), user_id serial PRIMARY KEY ); CREATE TABLE datasheets ( datasheet_filename varchar(32) NOT NULL, datasheet_filesize integer DEFAULT 0, datasheet_uploaded timestamp with time zone DEFAULT (localtimestamp), datasheet_id serial PRIMARY KEY ); CREATE TABLE components ( component_name varchar(32) NOT NULL, component_manufacturer integer REFERENCES manufacturers(manufacturer_id), component_category integer REFERENCES categories(category_id), component_subcategory integer REFERENCES subcategories(subcategory_id), component_package integer REFERENCES packages(package_id), component_pincount smallint, component_owner integer REFERENCES users(user_id), component_image integer REFERENCES images(image_id), component_datasheet integer REFERENCES datasheets(datasheet_id), component_comment text, component_scrap boolean DEFAULT FALSE, component_id serial PRIMARY KEY ); Same kind of components can have different manufacturers, that's why I'm not having any hard constraints anywhere (like UNIQUE in component_name). Some examples that I want to be able to store: Different manufacturers and packaging: Fairchild 74LS14 DIP14 (hole mounted IC with 14 pins) National 74LS14 SO14 (surface mounted IC with 14 pins) Different subcategories / packages: BC547, category transistors, subcategory NPN, package TO-92 BC547, category transistors, subcategory NPN, package TO-220 BC557, category transistors, subcategory PNP, package TO-92 and so forth. The point is that one component name can exist in many different flavors. I have still not yet come up to a definite solution how the subcategories will be implemented, so this is still just a draft. However, I feel that this design is the same design I seem to use for all my databases, and in the end I always find that I designed them wrong from the beginning. The table "components" feels like that one is going to be locked into a corner; it seems to "fixed" and not as flexible as I want this database to be. In the future I will probably want to add more relations without having to make zillions of ugly patches to the design. I would therefore appreciate any feedback on this table structure and how I can improve it even further. Thanks. -- - Rikard
On 20/07/10 18:14, Rikard Bosnjakovic wrote: > However, I feel that this design is the same design I seem to use for > all my databases, and in the end I always find that I designed them > wrong from the beginning. The table "components" feels like that one > is going to be locked into a corner; it seems to "fixed" and not as > flexible as I want this database to be. Flexibility in storing and structuring data isn't a strong point of SQL databases. That said, there *are* some workarounds. I've mentioned EAV and why it's ... ugly. A potentially superior option I didn't think to mention before is hstore. You can use a hstore field to store key/value "extension" data that isn't easy to model in a generic relational way without landing up with hundreds of tiny tables. See: http://www.postgresql.org/docs/8.4/static/hstore.html That way you can store the common stuff in a typical relational form for easy querying and manipulation, but can fall back to key/value for hard-to-model attributes that might be quite specific to particular classes of component. I guess you could even have your subcategories carry a column that listedrequired hstore keys as an array, so that you could require that all components of a particular subtype have a certain list of hstore keys. (I often wish SQL would see some extensions to support a more ... flexible ... representation of data. SQL is great for accessing purely relational data, but it'd be so nice to be able to break the relational mould where required without having to replace the entire database system just because some of your data doesn't quite fit. There are workarounds of a sort (EAV - ugh!, hstore, etc) but the lack of convenient language support limits them somewhat.) -- Craig Ringer