Discussion: Find "smallest common year"
Hi there,
my aim is to plot a line graph for a single country but for two or
three variables. Now, one could use two different y-axes (if there
are two data sets with different units), but that's not really the
right way to analyse the data.
One should rather normalize them to a base year to make them
comparable. To achieve this, one would set the values for a specific
year to 1 (or 100) and put the other values of all variables in
relation to it.
Thus, taking two variables, say,
Fish Catch
- in Thousand Metric Tons
- data available for 1975 to 2005
and
Total Population
- in Million People
- data available for 1960 to 2005
Country is Kenya.
The "smallest common year" would be 1975. One thus needs to set the
values for that year to 1 (or 100), and scale the other values of
the variables by the corresponding factor.
Now the question is how to find that "smallest common year"
automatically, via SQL or PHP?
Can anyone give me a hint? Thanks for any help!
Stef
____________________________________________________________________
Stefan Schwarzer
Lean Back and Relax - Enjoy some Nature Photography:
http://photoblog.la-famille-schwarzer.de
Appetite for Global Data? UNEP GEO Data Portal:
http://geodata.grid.unep.ch
____________________________________________________________________
If I understood your question, maybe this is what you want:
SELECT min(<desired data>) FROM table1 JOIN table2 ON (table1.data_field = table2.data_field);
--
William Leite Araújo
Fatherhood student - 24th week
Sorry,
I forgot to mention my table design, which is like this:
name 2001 2002 2003 2004 2005
-----------------------------------------------------------------
Afghanistan ....
Albania ....
(Yep, I know, bad table design... :-) I tried to change it to the more common "id | year | value" format, but there were too many SQL problems afterwards for the different queries/calculations we need.)
So, the mentioned SQL unfortunately doesn't work in that case....
Stef
On Sep 27, 2007, at 1:52 PM, William Leite Araújo wrote:
> If I understood your question, maybe this is what you want:
>
> SELECT min(<desired data>) FROM table1 JOIN table2 ON (table1.data_field = table2.data_field);
Stefan Schwarzer wrote:
> Sorry,
>
> I forgot to mention my table design, which is like this:
>
> name          2001   2002   2003   2004   2005
> -----------------------------------------------------------------
> Afghanistan   ....
> Albania       ....
>
> (Yep, I know, bad table design... :-) I tried to change it to the
> more common "id | year | value" format, but there were too many SQL
> problems afterwards for the different queries/calculations we need.)

May I suggest that you concentrate on solving *those* problems instead of the programmatically trivial computation of the lowest common value? Notice that a *really trivial* programming exercise becomes highly involved in your case. If I'm understanding correctly what you have, I assume you'd have to check the fields one by one for NULL or non-NULL values. That's intolerably ugly, IMHO, and it is a very simple and direct consequence of an as-unsound-as-it-gets db/table design.

Carlos
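With the normalized "id | year | value" layout the computation really is trivial: the "smallest common year" is simply the greatest of the per-variable minimum years. A sketch in Python with SQLite, using made-up table and column names (not the poster's actual schema); the subquery works verbatim in PostgreSQL too:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (variable TEXT, country TEXT, year INTEGER, value REAL);
-- Fish Catch for Kenya starts in 1975, Total Population in 1960
INSERT INTO data VALUES
  ('fish_catch', 'Kenya', 1975, 28.0),
  ('fish_catch', 'Kenya', 1976, 31.0),
  ('population', 'Kenya', 1960, 8.1),
  ('population', 'Kenya', 1975, 13.5);
""")

# Smallest common year = the latest of each variable's first year.
row = conn.execute("""
    SELECT max(first_year)
    FROM (SELECT variable, min(year) AS first_year
          FROM data
          WHERE country = 'Kenya'
          GROUP BY variable) AS per_variable
""").fetchone()
print(row[0])  # 1975
```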
I really tried it out. I changed my whole database to the "id_country | year | value" format, and then tried to build my SQL queries. But it was really, really difficult, and sometimes impossible (for me) to get what I wanted.

Actually, it remains quite difficult for me to remember the actual problems I had with it.

But, for example, I need to aggregate the national data on-the-fly to their regions. I need to calculate per-capita data on-the-fly for each variable. Although one would say this should be simple to accomplish, a semi-professional database expert and I could hardly solve these things.

In one case we came up with as many sub-selects as years were available (or selected by the user), which can be up to 60 years. Is this "efficient" SQL programming?

What would you recommend for, say, 500 global national statistical variables, 500 regional, 500 subregional and 500 global aggregations? The years covered range between 10 and 60 years for each of these variables, all available for 240 countries/territories.
Thanks for any recommendations!
Stef
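For what it's worth, both on-the-fly aggregations mentioned here (regional totals and per-capita values) collapse to a single GROUP BY over the normalized layout; no per-year sub-selects are needed. A sketch in Python with SQLite, with invented table/column names and toy numbers:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (name TEXT PRIMARY KEY, region TEXT);
CREATE TABLE data (variable TEXT, country TEXT, year INTEGER, value REAL);
INSERT INTO countries VALUES ('Kenya', 'Africa'), ('Uganda', 'Africa');
INSERT INTO data VALUES
  ('gdp',        'Kenya',  2000, 12.7),
  ('gdp',        'Uganda', 2000,  6.2),
  ('population', 'Kenya',  2000, 31.3),
  ('population', 'Uganda', 2000, 24.2);
""")

# Pivot the two variables with conditional aggregates, then divide:
# regional GDP and regional GDP per capita in one query.
row = conn.execute("""
    SELECT c.region,
           sum(CASE WHEN d.variable = 'gdp' THEN d.value END) AS gdp,
           sum(CASE WHEN d.variable = 'gdp' THEN d.value END)
             / sum(CASE WHEN d.variable = 'population' THEN d.value END)
             AS gdp_per_capita
    FROM data d
    JOIN countries c ON c.name = d.country
    WHERE d.year = 2000
    GROUP BY c.region
""").fetchone()
print(row)  # region, summed GDP, GDP per capita
```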
On 9/27/07, Stefan Schwarzer <stefan.schwarzer@grid.unep.ch> wrote:
> I really tried it out. I changed my whole database to the "id_country
> | year | value" format. And then tried to build my SQL queries. But it
> was really, really difficult, and sometimes impossible (for me) to get
> what I wanted.
>
> But, for example, I need to aggregate the national data on-the-fly to
> their regions. I need to calculate per-capita data on-the-fly for each
> variable. Although one would say this should be simple to accomplish,
> a semi-professional database expert and I could hardly solve these
> things.

You should have brought your problem here. You'd be surprised what a fresh set of eyes can see.

> In one case we came up with as many sub-selects as years were
> available (or selected by the user), which can be up to 60 years. Is
> this "efficient" SQL programming?

Probably not. But that doesn't mean it was the right approach either. There may well have been a more efficient approach you didn't think of.

> What would you recommend for, say, 500 global national statistical
> variables, 500 regional, 500 subregional and 500 global aggregations?
> The years covered range between 10 and 60 years for each of these
> variables, all available for 240 countries/territories.

I generally approach such problems by putting the data right (normalized) at the start, then munging the data into summary tables to handle the problems you're seeing now.

I find it far easier to maintain normalized tables that produce non-normalized ones (for things like data warehousing) than it is to maintain non-normalized tables and try to produce normalized data from that.
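The "summary tables" idea above can be as simple as materializing the convenient shape from the normalized source. A minimal sketch in Python with SQLite (names and numbers invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (variable TEXT, country TEXT, year INTEGER, value REAL);
INSERT INTO data VALUES
  ('fish_catch', 'Kenya', 1975, 28.0),
  ('fish_catch', 'Kenya', 1976, 31.0);

-- Maintain the normalized table; derive the summary table from it
-- (re-run after each data load, or use a view instead).
CREATE TABLE summary AS
  SELECT variable, country,
         min(year)  AS first_year,
         max(year)  AS last_year,
         avg(value) AS mean_value
  FROM data
  GROUP BY variable, country;
""")

rows = conn.execute("SELECT * FROM summary").fetchall()
print(rows)  # [('fish_catch', 'Kenya', 1975, 1976, 29.5)]
```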
>> What would you recommend for, say, 500 global national statistical
>> variables, 500 regional, 500 subregional and 500 global aggregations?
>> The years covered range between 10 and 60 years for each of these
>> variables, all available for 240 countries/territories.
>
> I generally approach such problems by putting the data right
> (normalized) at the start, then munging the data into summary tables
> to handle the problems you're seeing now.
>
> I find it far easier to maintain normalized tables that produce
> non-normalized ones (for things like data warehousing) than it is to
> maintain non-normalized tables and try to produce normalized data
> from that.

Ok, I do understand that.

So, instead of the earlier mentioned database design, I would have something like this:

- one table for the country names/ids/etc. (Afghanistan, 1; Albania, 2; ...)
- one table for the variable names/ids/etc. (GDP, 1; Population, 2; Fish Catch, 3; ...)
- one table for the year names/ids/etc. (1970, 1; 1971, 2; 1972, 3; ...)
and
- one table for all "statistical data" with four fields: id_variable, id_country, id_year, and the actual value

It seems to me more difficult now to produce a non-normalized output based on the normalized tables. How would a query look if I now need to SELECT, say, 100 countries and 20 years? Something like this (simplified and without joins):

SELECT
    value,
    id_year
FROM
    main_table
WHERE
    (year = '1970' OR year = '1971' OR ...)
    AND (country_name = 'Afghanistan' OR country_name = 'Albania' OR ...)

Actually, last time we came up with sub-selects for each year. So, does this make sense?

Thanks a lot for your help!

Stef

____________________________________________________________________
Stefan Schwarzer
Lean Back and Relax - Enjoy some Nature Photography:
http://photoblog.la-famille-schwarzer.de
Appetite for Global Data? UNEP GEO Data Portal:
http://geodata.grid.unep.ch
____________________________________________________________________
Stefan Schwarzer wrote:
> So, instead of the earlier mentioned database design, I would have
> something like this:
>
> - one table for the country names/ids/etc. (Afghanistan, 1; Albania, 2; ...)
> - one table for the variable names/ids/etc. (GDP, 1; Population, 2; Fish Catch, 3; ...)
> - one table for the year names/ids/etc. (1970, 1; 1971, 2; 1972, 3; ...)
> and
> - one table for all "statistical data" with four fields: id_variable,
>   id_country, id_year, and the actual value

This is one possibility. Another is to have one table for each variable. This has the benefit of not mixing different units/data types in the same field. It does mean you cannot use the same (parameterized) query for getting different measures. Since it is easy to create views converting from one of these presentations to the other, which one you choose is not that important.

Also, there is no obvious need to have a lookup table for years - just store the year as an integer in your data table(s). If necessary, add a constraint indicating which years are valid. You can produce rows for missing years by left joining with generate_series(start_year, end_year).

Even if you choose to store the valid years in a table, the id_year is unnecessary - just use the year itself as the primary key.

More in another reply.

Nis
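The generate_series() trick, sketched end-to-end. SQLite is used here only so the example is self-contained; it lacks generate_series() by default, so a recursive CTE stands in for it, while in PostgreSQL the LEFT JOIN against generate_series(start_year, end_year) is direct:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (year INTEGER, value REAL);
INSERT INTO data VALUES (1970, 1.0), (1972, 3.0);  -- 1971 is missing
""")

# PostgreSQL equivalent of the CTE below:
#   SELECT y.year, d.value
#   FROM generate_series(1970, 1972) AS y(year)
#   LEFT JOIN data d ON d.year = y.year;
rows = conn.execute("""
    WITH RECURSIVE years(year) AS (
        SELECT 1970
        UNION ALL
        SELECT year + 1 FROM years WHERE year < 1972
    )
    SELECT y.year, d.value
    FROM years y
    LEFT JOIN data d ON d.year = y.year
    ORDER BY y.year
""").fetchall()
print(rows)  # [(1970, 1.0), (1971, None), (1972, 3.0)]
```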
Stefan Schwarzer wrote:
> So, instead of the earlier mentioned database design, I would have
> something like this:
>
> - one table for the country names/ids/etc. (Afghanistan, 1; Albania, 2; ...)
> - one table for the variable names/ids/etc. (GDP, 1; Population, 2; Fish Catch, 3; ...)
> - one table for the year names/ids/etc. (1970, 1; 1971, 2; 1972, 3; ...)

If you _do_ need this table (because you want to constrain your statistical data to only contain a specific set of years, or because you need a quick list of available years to select from): make the year the primary key and drop the artificial index.

Years are perfectly fine data to constrain on, and it saves you the joins with that table (the foreign key constraint constrains your data sufficiently).

You may want to constrain your years further by adding a check constraint, e.g. CHECK (year > 1900).

The same argument goes for the country names, but I generally don't like to have text data for a primary key. Contrary to years, their values might change (for which an ON UPDATE CASCADE FK constraint trigger would have to go through your data to update all the references). Plus they take up a bit more space than integers, although with country names that won't matter much. OTOH, if your country names contain non-ASCII characters and the database is configured to collate on those properly, it will be slower. That aside (these are kind of minor points), the country name is also a good candidate to become a key (primary and foreign respectively).

> and
> - one table for all "statistical data" with four fields: id_variable,
>   id_country, id_year, and the actual value

Some things may have changed here due to my above suggestions.

> It seems to me more difficult now to produce a non-normalized output
> based on the normalized tables. How would a query look if I now need
> to SELECT, say, 100 countries and 20 years? Something like this
> (simplified and without joins):
>
> SELECT value, id_year
> FROM main_table
> WHERE (year = '1970' OR year = '1971' OR ...)
>   AND (country_name = 'Afghanistan' OR country_name = 'Albania' OR ...)

I don't really see what the problem is; with the years as columns in your table they're just in different places in this query. Without knowing what problem you're trying to solve with this query it's a bit of a guess. I assume those years and countries are user input? I usually use BETWEEN or IN (...) for such cases, but that boils down to the same query. It's just shorter to write.

BTW, you didn't actually use type text for your year column, did you? No quotes needed then. Otherwise you'd have to make sure your year values are all the same length or sorting gets... interesting.

SELECT value, year
FROM main_table
WHERE year BETWEEN 1970 AND 1975
  AND country_name IN ('Afghanistan', 'Albania', ...)

But I think you'll usually be looking for aggregates, so GROUP BY and HAVING are probably more appropriate for your queries.

SELECT country_name, year, SUM(value)
FROM main_table
WHERE country_name IN ('Afghanistan', 'Albania', ...)
  AND year BETWEEN 1970 AND 1975
GROUP BY country_name, year
HAVING SUM(value) > 0;

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.
T: ++31(0)534346874
F: ++31(0)534346876
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //
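The GROUP BY query above, runnable end-to-end (Python with SQLite purely so the snippet is self-contained; the SQL itself is the same in PostgreSQL). Table contents are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_table (country_name TEXT, year INTEGER, value REAL);
INSERT INTO main_table VALUES
  ('Afghanistan', 1970, 2.0),
  ('Afghanistan', 1971, 3.0),
  ('Albania',     1970, 5.0),
  ('Kenya',       1970, 7.0);   -- filtered out by the IN list below
""")

rows = conn.execute("""
    SELECT country_name, year, sum(value)
    FROM main_table
    WHERE country_name IN ('Afghanistan', 'Albania')
      AND year BETWEEN 1970 AND 1975
    GROUP BY country_name, year
    HAVING sum(value) > 0
    ORDER BY country_name, year
""").fetchall()
print(rows)
# [('Afghanistan', 1970, 2.0), ('Afghanistan', 1971, 3.0), ('Albania', 1970, 5.0)]
```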
A few more comments on your table design.

Stefan Schwarzer wrote:
> So, instead of the earlier mentioned database design, I would have
> something like this:
>
> - one table for the country names/ids/etc. (Afghanistan, 1; Albania, 2; ...)

There is a well-established natural key for countries - ISO 3166. Use that instead of "id".

> - one table for the variable names/ids/etc. (GDP, 1; Population, 2;
>   Fish Catch, 3; ...)
> - one table for the year names/ids/etc. (1970, 1; 1971, 2; 1972, 3; ...)

No need for an ID column on this one. Just use the year as the primary key.

Nis
Alban Hertroys wrote:
> If you _do_ need this table (because you want to constrain your
> statistical data to only contain a specific set of years, or because
> you need a quick list of available years to select from): make the
> year the primary key and drop the artificial index.
>
> [...]
>
> The same argument goes for the country names, but I generally don't
> like to have text data for a primary key.

Ah, but there is a standardised list of country codes ideal for the DBA, since the code is (usually) easily understandable.
http://en.wikipedia.org/wiki/ISO_3166

The only problem might be if historical data uses old boundaries (e.g. Soviet Union, Yugoslavia).

--
Richard Huxton
Archonet Ltd
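Putting the two suggestions together (ISO 3166 code as the natural country key, the year stored directly and constrained by a CHECK), a schema sketch in Python with SQLite; the table names, codes and constraints are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
PRAGMA foreign_keys = ON;
-- ISO 3166-1 alpha-2 code as the natural primary key; no surrogate id.
CREATE TABLE countries (
    code TEXT PRIMARY KEY CHECK (length(code) = 2),
    name TEXT NOT NULL
);
-- Year stored directly, constrained by a CHECK instead of a lookup table.
CREATE TABLE data (
    country TEXT REFERENCES countries(code),
    year    INTEGER CHECK (year > 1900),
    value   REAL
);
INSERT INTO countries VALUES ('KE', 'Kenya'), ('AL', 'Albania');
INSERT INTO data VALUES ('KE', 1975, 28.0);
""")

rows = conn.execute("""
    SELECT c.name, d.year, d.value
    FROM data d JOIN countries c ON c.code = d.country
""").fetchall()
print(rows)  # [('Kenya', 1975, 28.0)]
```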
> Ah, but there is a standardised list of country codes ideal for the
> DBA, since the code is (usually) easily understandable.
> http://en.wikipedia.org/wiki/ISO_3166
> The only problem might be if historical data uses old boundaries
> (e.g. Soviet Union, Yugoslavia).

Yep, I have all of them (ISO-2, ISO-3, UN), but they don't cover all countries + territories. Nevertheless, in most cases the list is sufficient... Thanks for the link!

Stef
>> - one table for the year names/ids/etc. (1970, 1; 1971, 2; 1972, 3; ...)
>
> If you _do_ need this table (because you want to constrain your
> statistical data to only contain a specific set of years, or because
> you need a quick list of available years to select from): make the
> year the primary key and drop the artificial index.
>
> Years are perfectly fine data to constrain on, and it saves you the
> joins with that table (the foreign key constraint constrains your data
> sufficiently).

If my years are not only single years such as 1970, 1971... but also time spans, such as 1970-75... is your proposal still valid? It would no longer be an integer field, but a text field instead...

Stef
> BTW, you didn't actually use type text for your year column, did you?
> No quotes needed then. Otherwise you'd have to make sure your year
> values are all the same length or sorting gets... interesting.

Yep, my comment just before concerns especially this paragraph, I guess. With not only years but also time spans for my data - so, not only 1970, 1971... but also 1970-75, 1975-80 etc. - I would indeed need to use text for my year column. Why does sorting then get "interesting"?

Stef
>> I find it far easier to maintain normalized tables that produce
>> non-normalized ones (for things like data warehousing) than it is to
>> maintain non-normalized tables and try to produce normalized data
>> from that.

What is your view on (having 500 different variables/data sets) using a single table for all data versus one table for each variable? In terms of "readability" I guess the second solution would be better. But then... I don't know...

Thanks for any views...

Stef
Stefan Schwarzer wrote:
>> BTW, you didn't actually use type text for your year column, did you?
>> No quotes needed then. Otherwise you'd have to make sure your year
>> values are all the same length or sorting gets... interesting.
>
> Yep, my comment just before concerns especially this paragraph, I
> guess. With not only years but also time spans for my data - so, not
> only 1970, 1971... but also 1970-75, 1975-80 etc. - I would indeed
> need to use text for my year column. Why does sorting then get
> "interesting"?

Text sorts alphabetically, not numerically:

integer   text
-------   -----
1         '1'
2         '10'
3         '2'
4         '3'
5         '4'
6         '5'
7         '6'
8         '7'
9         '8'
10        '9'

I guess you won't have any problems with sorting, as you use full century prefixes and the differences in syntax ('1970' vs '1970-75') are at the rear of your values.

An entirely different question is whether it is a good idea to write a range as a value that the database cannot interpret correctly (referring to the '1970-75' notation). You cannot group records by value this way if you need to (for example) combine data from '1970' with data from '1970-75'.

But you seem to use these values just as labels, which I assume are unique across years (e.g. if you have a value '1970-75' you don't have values '1970', '1971'...'1974'), in which case this is safe to use. As pointed out by several people earlier, they make an excellent foreign key too (provided they're unique).

Other options to handle these years involve having a start/end year or date (to_date('01/01/1970', 'MM/DD/YYYY'), to_date('12/31/1975', 'MM/DD/YYYY')) or a start date and an interval (to_date('01/01/1970', 'MM/DD/YYYY'), INTERVAL '1 year'). That makes operators like OVERLAPS and BETWEEN available to your queries, which may be convenient ;)

--
Alban Hertroys
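The alphabetical-vs-numerical point can be seen in a few lines of Python, including the observation that full-century labels (and even the '1970-75' span labels) still sort in the expected order:

```python
# Integers sort numerically; the same values as text sort alphabetically.
nums = list(range(1, 11))
as_text = sorted(str(n) for n in nums)
print(as_text)  # ['1', '10', '2', '3', '4', '5', '6', '7', '8', '9']

# With full-century labels of the kind discussed above, plain
# alphabetical order still comes out right, span labels included.
labels = sorted(['1971', '1970-75', '1970'])
print(labels)   # ['1970', '1970-75', '1971']
```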
> An entirely different question is whether it is a good idea to write a
> range as a value that the database cannot interpret correctly
> (referring to the '1970-75' notation). You cannot group records by
> value this way if you need to (for example) combine data from '1970'
> with data from '1970-75'.
>
> But you seem to use these values just as labels, which I assume are
> unique across years (e.g. if you have a value '1970-75' you don't have
> values '1970', '1971'...'1974'), in which case this is safe to use. As
> pointed out by several people earlier, they make an excellent foreign
> key too (provided they're unique).

Yep, this is a question I posed myself too. At the moment, when doing for example "per capita" calculations on the fly for a variable which has something like 1970-75, I would sum up the Total Population over the given period, divide it by the number of years, and then use it with the selected variable to get the "per capita" data.

But if I would instead insert yearly data, it would mean that it had five lines with the same values. No problem with that?

Stef
Stefan Schwarzer wrote:
> Yep, this is a question I posed myself too. At the moment, when doing
> for example "per capita" calculations on the fly for a variable which
> has something like 1970-75, I would sum up the Total Population over
> the given period, divide it by the number of years, and then use it
> with the selected variable to get the "per capita" data.
>
> But if I would instead insert yearly data, it would mean that it had
> five lines with the same values. No problem with that?

Not entirely what I suggested, but also a viable solution, sure. I was suggesting adding a column to your yearly data marking the end of the range. Given your above examples, you could then do queries like:

SELECT population / num_years FROM my_data;

(Assuming you add the length of the interval as a number of years, which seems plausible because you don't seem to calculate with any intervals not divisible by a year.)

Adding this additional column may justify putting the years (and their durations) into their own table.

--
Alban Hertroys
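The num_years idea in the reply above, as a self-contained sketch (Python with SQLite; my_data and its contents are invented, and the span length follows Stef's convention of counting 1970-75 as five years):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per period; num_years marks the length of the range.
CREATE TABLE my_data (
    period     TEXT PRIMARY KEY,  -- '1970-75', '1976', ...
    num_years  INTEGER NOT NULL,
    population REAL               -- total over the whole period
);
INSERT INTO my_data VALUES ('1970-75', 5, 65.0), ('1976', 1, 14.2);
""")

# Per-year (and hence per-capita-ready) figures fall out of a division.
rows = conn.execute("""
    SELECT period, population / num_years
    FROM my_data
    ORDER BY period
""").fetchall()
print(rows)  # [('1970-75', 13.0), ('1976', 14.2)]
```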