Re: Netflix Prize data

From: Mark Woodward
Subject: Re: Netflix Prize data
Date:
Message-ID: 21733.24.91.171.78.1160002282.squirrel@mail.mohawksoft.com
In reply to: Re: Netflix Prize data  ("Greg Sabino Mullane" <greg@turnstep.com>)
List: pgsql-hackers
>> I signed up for the Netflix Prize. (www.netflixprize.com)
>> and downloaded their data and have imported it into PostgreSQL.
>> Here is how I created the table:
>
> I signed up as well, but have the table as follows:
>
> CREATE TABLE rating (
>   movie  SMALLINT NOT NULL,
>   person INTEGER  NOT NULL,
>   rating SMALLINT NOT NULL,
>   viewed DATE     NOT NULL
> );
>
> I also recommend not loading the entire file until you get further
> along in the algorithm solution. :)
>
> Not that I have time to really play with this....

As luck would have it, I wrote a recommendations system based on music
ratings a few years ago.

After reading the NYT article, it seems as though one or more of the guys
behind "Net Perceptions" either is helping them or built their system; I'm
not sure which. I wrote my system because Net Perceptions was too slow and
did a lousy job.

I think the notion of "communities" in general is an interesting study in
statistics, but everything I've seen in the form of bad recommendations
shows that while [N] people may share certain tastes, that doesn't
necessarily mean that what one likes, the others do. This is especially
flawed with movie rentals because there is seldom a 1:1 ratio of movies to
people. There are often multiple people in a household. Also, movies are
almost always rented for multiple people to watch.

Anyway, good luck! (Not better than me, of course :-)

