Обсуждение: Duplicate records returned

Поиск
Список
Период
Сортировка

Duplicate records returned

От
daniel65456@gmail.com
Дата:
I'm getting duplicate rows returned.  I don't know know and can't find
out how to construct the SQL to return what I want.  I have an old
version of postgres which I cannot upgrade.  I'm not even sure how to
get the version.  Does this help?

$ grep -i version ..../pi/bin/setup
wtversion='2.0'
export wtversion

My SQL knowledge is quite limited so I've searched all over the net
and read doco but I can't figure this out.

I want to find the total number of hours worked on all projects with
projects.parent="Projects", projects.pct<100, restrictions.hidden=5
(not hidden)

I keep getting duplicate records and I think it's it's becuase I'm
joining a table on 2 other tables.  I've tried INTERSECT but I keep
getting parse errors and can't work out why that is.  I've tried
creating a temp table but still have to do the twin joins!!  I've also
tried creating a view but without success

Here's an example of correct data, a single user and a single project

select username, sum(hours)
from timerecs
where project like 'Testing'
and username = 'long'
group by username
;

username |sum
---------+---
long     |127

but there will be many projects to process and to select the
applicable projects requires test on 2 other tables, projects and
restrictions

$the_sql  = " SELECT projectname, username, sum(hours)";
$the_sql .= " FROM timerecs";
$the_sql .= " WHERE projectname = projects.projectname ";
$the_sql .= " AND projectname = restrictions.projectname";
$the_sql .= " AND projects.parent = 'Projects'";
$the_sql .= " AND projects.pct < 100";
$the_sql .= " AND restrictions.hidden = 5";
$the_sql .= " AND projectname = 'Testing'";    # just for tsting
$the_sql .= " AND username = 'long'";        # just for testing
$the_sql .= " AND projectname = projects.projectname ";
$the_sql .= " GROUP BY projectname, username";
$the_sql .= " ORDER BY projectname, username";
$the_sql .= " ;";

produces

1 Testing|long|254

How do I get the right list of projectname from timerecs by joining
with the projects and restrictions tables?
I've tried SELECT DISTINCT projectname but make no difference.  If I
take the 'restrictions' join out it's fine.  I've also tried prefacing
all column names with table names without any change

Thanks


Re: Duplicate records returned

От
Richard Huxton
Дата:
daniel65456@gmail.com wrote:
> I'm getting duplicate rows returned.  I don't know know and can't find
> out how to construct the SQL to return what I want.  I have an old
> version of postgres which I cannot upgrade.  I'm not even sure how to
> get the version.  Does this help?
>
> $ grep -i version ..../pi/bin/setup
> wtversion='2.0'
> export wtversion

Nope - nothing to do with PostgreSQL I'm afraid.

Try issuing "SELECT version()" as an SQL statement.

>
> My SQL knowledge is quite limited so I've searched all over the net
> and read doco but I can't figure this out.
>
> I want to find the total number of hours worked on all projects with
> projects.parent="Projects", projects.pct<100, restrictions.hidden=5
> (not hidden)
>
> I keep getting duplicate records and I think it's it's becuase I'm
> joining a table on 2 other tables.  I've tried INTERSECT but I keep
> getting parse errors and can't work out why that is.  I've tried
> creating a temp table but still have to do the twin joins!!  I've also
> tried creating a view but without success

> but there will be many projects to process and to select the
> applicable projects requires test on 2 other tables, projects and
> restrictions
>
> $the_sql  = " SELECT projectname, username, sum(hours)";
> $the_sql .= " FROM timerecs";
> $the_sql .= " WHERE projectname = projects.projectname ";
> $the_sql .= " AND projectname = restrictions.projectname";
> $the_sql .= " AND projects.parent = 'Projects'";
> $the_sql .= " AND projects.pct < 100";
> $the_sql .= " AND restrictions.hidden = 5";
> $the_sql .= " AND projectname = 'Testing'";    # just for tsting
> $the_sql .= " AND username = 'long'";        # just for testing
> $the_sql .= " AND projectname = projects.projectname ";
> $the_sql .= " GROUP BY projectname, username";
> $the_sql .= " ORDER BY projectname, username";
> $the_sql .= " ;";

You might want to read up on "HERE documents" for multi-line blocks of text.

1. You've also not put all your tables into the FROM clause:
    FROM timerecs, projects, restrictions
    This *should* be generating a warning of some kind
2. You're not qualifying which column comes from which table, which
makes it harder to see what's happening. Try:
    FROM timerecs t, projects p, restrictions r
    WHERE t.projectname = p.projectname
    AND ...
    That's called table aliasing, where you give a short name to tables.
3. This query *can't* give duplicates for (projectname,username) pairs
unless you're activating a bug. The GROUP BY eliminates duplicates.

> produces
>
> 1 Testing|long|254
>
> How do I get the right list of projectname from timerecs by joining
> with the projects and restrictions tables?

You've not said what "right" means to you.

> I've tried SELECT DISTINCT projectname but make no difference.  If I
> take the 'restrictions' join out it's fine.  I've also tried prefacing
> all column names with table names without any change

Try the table aliasing, then post the query again (oh, trim the perl/php
if you could) along with some sample data, the results and what the
results should be. Otherwise there's not much anyone can say.

--
   Richard Huxton
   Archonet Ltd

Re: Duplicate records returned

От
Harald Fuchs
Дата:
In article <4680D3E9.7020706@archonet.com>,
Richard Huxton <dev@archonet.com> writes:

>> $the_sql  = " SELECT projectname, username, sum(hours)";
>> $the_sql .= " FROM timerecs";
>> $the_sql .= " WHERE projectname = projects.projectname ";
>> $the_sql .= " AND projectname = restrictions.projectname";
>> $the_sql .= " AND projects.parent = 'Projects'";
>> $the_sql .= " AND projects.pct < 100";
>> $the_sql .= " AND restrictions.hidden = 5";
>> $the_sql .= " AND projectname = 'Testing'";    # just for tsting
>> $the_sql .= " AND username = 'long'";        # just for testing
>> $the_sql .= " AND projectname = projects.projectname ";
>> $the_sql .= " GROUP BY projectname, username";
>> $the_sql .= " ORDER BY projectname, username";
>> $the_sql .= " ;";

> You might want to read up on "HERE documents" for multi-line blocks of text.

In case the above code is Perl, I think

  my $sql = q{
    SELECT ...
    FROM ...
    WHERE ...
    GROUP ...
  };

looks nicer than a here-document.

> 1. You've also not put all your tables into the FROM clause:
>    FROM timerecs, projects, restrictions
>    This *should* be generating a warning of some kind
> 2. You're not qualifying which column comes from which table, which
> makes it harder to see what's happening. Try:
>    FROM timerecs t, projects p, restrictions r
>    WHERE t.projectname = p.projectname
>    AND ...
>    That's called table aliasing, where you give a short name to tables.

This still mixes JOIN conditions with other result restrictions.

  SELECT ...
  FROM timerecs t
  JOIN projects p ON p.projectname = t.projectname
  ...

makes it more explicit.