cross-table constraints?
От | Kevin Hunter Kesling |
---|---|
Тема | cross-table constraints? |
Дата | |
Msg-id | 521CF2C0.6060100@ncsu.edu обсуждение исходный текст |
Ответы |
Re: cross-table constraints?
Re: cross-table constraints? |
Список | pgsql-novice |
Hello Postgres Novice Helpers, I don't know if this is the correct place for this question, but please redirect as appropriate. I am trying to create a datastore for a mathematical optimization model I've built. The math model data structure has a lot of interdependencies and data integrity has been a constant battle over the past year. I've been made aware that data integrity is the I in ACID, and is exactly the reason RDBMSs were created. Great! I've spent some time putting together a schema that appears to make sense, taking care of such concepts as "single point of authority" and "normalization". I've tried to normalize it as best I could, but I don't exactly have a lot of practice designing schemas. I'm now running into a situation where I can't create a constraint that I think I need. My research suggests that the type of constraint I want is colloquially known as a "cross-table constraint", and implies that I have not properly normalized or otherwise structured my schema. So, I would highly appreciate any pointers this list could give me. For the sake of the schema below, let me briefly describe what the model needs and does. The model is an optimization model, so at it's core finds a maximum or minimum value of an "objective function", subject to any number of constraints. For example, a typical objective function would be "Find the minimum cost to supply energy to this system." The model is organized into analyses, and each analysis can have a set of processes. I define 'process' as a technology-vintage tuple <t, v>. For example, a technology 'e_coal' might describe an electric coal power plant, and <e_coal, 1980> would describe a coal power plant built in 1980. We might define the amount of capacity installed as: CAP[e_coal, 1980] = 100 # some number of units The analysis also has the notion of a first period. For example, if the analysis in question had these vintages: 1980, 1990, 2000, 2010, 2020, 2030, 2040 And period_0 (the first period), was 2010, then the model only makes decisions about the periods 2010, 2020, and 2030. This means that to /utilize/ the <e_coal, 1980> process, we might define some "activity" variables for it: ACT[2010, e_coal, 1980] ACT[2020, e_coal, 1980] ACT[2030, e_coal, 1980] ACT[2040, e_coal, 1980] The ACTivity variable is indexed by period, technology, and vintage <p,t,v>. *Note that the set of periods (e.g., 2010, 2020, 2030, 2040) is a subset of the set of vintages.* Here's where I get to how this relates to my question: as the model can only make decisions with variables that describe future time periods (e.g., ACT[2010,...], ACT[2020,...], ACT[2030,...], ...), it makes no sense to define any characteristics for periods prior to 2010. For example, if the use of the <e_coal, 1980> cost different amounts in different periods, I might have data like this: Cost[2010, e_coal, 1980] = 0.10 # $/kWh Cost[2020, e_coal, 1980] = 0.30 # $/kWh Cost[2030, e_coal, 1980] = 0.60 # $/kWh Cost[2040, e_coal, 1980] = 0.33 # $/kWh However, what's currently to prevent a user from giving data for 2000, or 1990? At the moment, nothing other than my application logic. For example, consider this (minimized) schema: CREATE TABLE analysis ( id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(32767) NOT NULL UNIQUE, period_0 INTEGER NOT NULL ); CREATE TABLE vintage ( id INTEGER NOT NULL PRIMARY KEY, analysis_id INTEGER NOT NULL REFERENCES analysis (id), vintage INTEGER NOT NULL ); CREATE TABLE technology ( id integer NOT NULL PRIMARY KEY, name varchar(32767) NOT NULL UNIQUE ); CREATE TABLE process ( id INTEGER NOT NULL PRIMARY KEY, analysis_id INTEGER NOT NULL REFERENCES analysis (id), technology_id INTEGER NOT NULL REFERENCES technology (id), vintage_id INTEGER NOT NULL REFERENCES vintage (id), UNIQUE (analysis_id, technology_id, vintage_id) ); CREATE TABLE cost ( id integer NOT NULL PRIMARY KEY, period_id integer NOT NULL REFERENCES vintage (id), process_id integer NOT NULL REFERENCES process (id), value real NOT NULL, UNIQUE (period_id, process_id) ); Ideally, I'd have a constraint on the 'cost' table that required that the period referenced by period_id was not less than the associated analysis period_0. I would like any INSERT statement that mimicks this printf format string would fail: INSERT INTO cost (period_id, process_id, value) VALUES (%d, %d, %f); (where, period_id maps to a vintage.vintage that is less than analysis.period_0) What would be a good way to reorganize my schema to enable a constraint that effectively limits the allowable rows of cost, subject to the value? Many thanks (and apologies for the long post!), Kevin
В списке pgsql-novice по дате отправления: