Re: Need help building this query

Поиск
Список
Период
Сортировка
От David Johnston
Тема Re: Need help building this query
Дата
Msg-id 00e401cd4fdd$589b1e60$09d15b20$@yahoo.com
обсуждение исходный текст
Ответ на Need help building this query  (Rihad <rihad@stream.az>)
Ответы Re: Need help building this query  (rihad@stream.az)
Re: Need help building this query  (rihad@stream.az)
Список pgsql-sql
> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-
> owner@postgresql.org] On Behalf Of Rihad
> Sent: Thursday, June 21, 2012 1:49 PM
> To: pgsql-sql@postgresql.org
> Subject: [SQL] Need help building this query
> 
> Hi, folks. I currently need to join two tables that lack primary keys, and
> columns used to distinguish each record can be duplicated. I need to build
> statistics over the data in those tables. Consider this:
> 
> 
> TableA:
> row 1: foo: 123, bar: 456, baz: 789, amount: 10.99, date_of_op: date row
2:
> foo: 123, bar: 456, baz: 789, amount: 10.99, date_of_op: date row 3: foo:
123,
> bar: 456, baz: 789, amount: 10.99, date_of_op: date
> 
> TableB:
> row 1: foo: 123, bar: 456, baz: 789, amount: 10.99, date_of_op: date
> 
> 
> Columns foo + bar + baz are used to distinguish a performed "operation":
> TableA.date_of_op isn't, because it can lag behind TableB.
> 
> Not all different "operations" are in table B.
> Table B is just there so we know which "operations" are complete, so to
> speak (happening under external means and not under any of my control).
> 
> Now, for each operation (foo+bar+baz) in table A, only *one* row should be
> matched in table B, because it only has one matching row there.
> The other two in TableA should be considered unmatched.
> 
> Now the query should be able to get count(*) and sum(amount) every day
> for that day, considering that matched and unmatched operations should be
> counted separately. The report would look something like this:
> 
> TableA.date_of_op  TableB.date_of_op
> 2012-06-21            [empty]                  [count(*) and sum(amount)
> of all data in TableA for this day unmatched in TableB]
> 2012-06-21            2012-06-20            [count(*) and sum(amount) of
> all data in TableA matched in TableB for the 20-th]
> 2012-06-21            2012-06-19            [count(*) and sum(amount) of
> all data in TableA matched in TableB for the 19-th]
> 
> 
> Can this awkward thing be done in pure SQL, or I'd be better off using
> programming for this?
> 
> Thanks, I hope I could explain this.
> 


You seem to be describing a straight reconciliation between two tables.  My
current means of doing this are programmatically but for the simple case
pure SQL should be doable.  The main thing is that you have to distinguish
between "duplicate" records first and then match them up:

TableA Keys:

AA
AA
AA
AB
AB
AC

TableB Keys:
AA
AA
AB

First you use "ROW_NUMBER() OVER (PARTITION BY key)" to assign an integer
"sub-id" to every set of possible keys in both tables:

TableA-Sub:
AA-1
AA-2
AA-3
AB-1
AB-2
AC-1

TableB-Sub:
AA-1
AA-2
AB-1

Now, with these newly constructed key+sub-key values in place, you can
perform a simple LEFT (or possibly FULL) JOIN between tables A & B.

This makes no allowances for any of kind of desired date restriction on the
matching nor does it consider the eventual report that you wish to generate.
What this gives you is a listing of ALL rows in both tables with matched
records joined together into a single (NULL-less) row while unmatched
records will have one of the two resultant columns NULLed

SELECT tableA.subid_a, tableB.subid_b
FROM tableA FULL OUTER JOIN tableB ON (tableA.subid_a = tableB.subid_b)

Requires at least version 8.4

David J.




В списке pgsql-sql по дате отправления:

Предыдущее
От: Rihad
Дата:
Сообщение: Need help building this query
Следующее
От: "Oliver d'Azevedo Christina"
Дата:
Сообщение: Re: Need help building this query