"Voting" question?

Поиск
Список
Период
Сортировка
От Gary Warner
Тема "Voting" question?
Дата
Msg-id 5082A2C0.6000107@askgar.com
обсуждение исходный текст
Ответы Re: "Voting" question?  (Josh Kupershmidt <schmiddy@gmail.com>)
Список pgsql-novice

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
 
My real problem has to do with recording which of a very large number of anti-virus products agree that a given sample is a member of a given anti-virus family. The database has millions of samples, with tens of anti-virus products voting on each sample. I want to ask a query like "For the malware containing the name 'BadBadVirus' which sample had the most votes, and which vendors voted for it?" and get results like:

|"BadBadVirus"  
                     V1  V2  V3  V4  V5  V6  V7  
Sample 1 - 4 votes    1   0   1   0   0   1   1      
Sample 2 - 5 votes    1   0   1   0   1   1   1   
Sample 3 - 5 votes    1   0   1   0   1   1   1  

 total     14         3       3       2   3   3  
|

Which might be used to tell me that Vendor 2 and Vendor 4 either don't know how to detect this malware, or that they name it something different.

++++++++++++++++

I'm going to try to generalize my question slightly while hopefully not breaking your ability to help me. Assume that I have five voters (Alex, Bob, Carol, Dave, Ed) who have been asked to look at five photographs (P1, P2, P3, P4, P5) and decide what the "main subject" of the photograph is. For our example, we'll just assume they were limited to "Cat", "Dog", or "Horse".

In my "real" data, the fields are:
MD5, Vendor, Malware_name

but for now, consider . . . the data is in the Table "PhotoVotes" in this form  Photo, Voter, Decision

(1, 'Alex', 'Cat')
(1, 'Bob', 'Dog')
(1, 'Carol', 'Cat')
(1, 'Dave', 'Cat')
(1, 'Ed', 'Cat')
(2, 'Alex', 'Cat')
(2, 'Bob', 'Dog')
(2, 'Carol', 'Cat')
(2, 'Dave', 'Cat')
(2, 'Ed', 'Dog')
(3, 'Alex', 'Horse')
(3, 'Bob', 'Horse')
(3, 'Carol', 'Dog')
(3, 'Dave', 'Horse')
(3, 'Ed', 'Horse')
(4, 'Alex', 'Horse')
(4, 'Bob', 'Horse')
(4, 'Carol', 'Cat')
(4, 'Dave', 'Horse')
(4, 'Ed', 'Horse')
(5, 'Alex', 'Dog')
(5, 'Bob', 'Cat')
(5, 'Carol', 'Cat')
(5, 'Dave', 'Cat')
(5, 'Ed', 'Cat')

The objective is that given a photo topic we are looking for, we'd like to know how many voters thought that WAS the main point of that photo, but also list WHICH VOTERS thought that.  In the real malware-related version we're matching with an "ilike" on the string we are looking for.   I was hoping to just "count" (1 or 0) for whether that matched.

Query for: "Cat"
      Total  Alex  Bob Carol Dave Ed
1 -     4      1    0    1     1   1
2 -     3      1    0    1     1   0
3 -     0      0    0    0     0   0
4 -     1      0    0    1     0   0
5 -     4      0    1    1     1   1
- ------------------------------------
total  12      2    1    4     3   2

Query for: "Dog"
      Total  Alex  Bob Carol Dave Ed
1 -     1     0      1   0    0    0
2 -     2     0      1   0    0    1
3 -     1     0      0   1    0    0
4 -     0     0      0   0    0    0
5 -     1     1      0   0    0    0
- ------------------------------------
total   5     1      2   1    0    1

Is that something I can do IN THE DATABASE with the data in the format that I have it stored?

I'm having difficulty getting a query that does that - although it's simple enough to dump the data out and then write a program to do that, I'd really like to be able to do it IN THE DATABASE if I can.   If it's not "query-able" we'll probably end up writing a stored procedure...

Thanks for any suggestions.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/
 
iEYEARECAAYFAlCCor8ACgkQg79eYCOO6PvAYgCfdZE+MHHdq+JFsM0TGsHoxqh0
sbkAnjb2V1thpALv/IX621bekMEdhL9A
=/wUW
-----END PGP SIGNATURE-----

В списке pgsql-novice по дате отправления:

Предыдущее
От: Peter Wintoft
Дата:
Сообщение: Re: Latex formatting
Следующее
От: Jeff Davis
Дата:
Сообщение: Re: Guidance needed to be a postgreSQL DBA