Re: Netflix Prize data

From: Mark Woodward
Subject: Re: Netflix Prize data
Date:
Msg-id: 21735.24.91.171.78.1160002678.squirrel@mail.mohawksoft.com
In reply to: Re: Netflix Prize data  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
> "Mark Woodward" <pgsql@mohawksoft.com> writes:
>> The one thing I notice is that it is REAL slow.
>
> How fast is your disk?  Counting on my fingers, I estimate you are
> scanning the table at about 47MB/sec, which might or might not be
> disk-limited...
>
>> I'm using 8.1.4. The "rdate" field looks something like: "2005-09-06"
>
> So why aren't you storing it as type "date"?
>

You are assuming I gave it any thought at all. :-)

I converted it to a date type (create table ratings2 as ....)
markw@snoopy:~/netflix/download$ time psql -c "select count(*) from ratings" netflix
   count
-----------
 100480507
(1 row)


real    1m29.852s
user    0m0.002s
sys     0m0.005s

That's about the right increase based on the reduction in data size.

OK, I guess I am crying wolf; 47MB/sec isn't all that bad for the system.
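
For anyone who wants to redo the finger-counting, here is a minimal Python sketch of the arithmetic. It assumes "MB" means 10^6 bytes and that the scan of the converted table still runs at roughly the 47MB/sec estimated above; the table size itself is not given in this thread, so the size printed is only what those two numbers would imply, not a measured value. The function names are made up for illustration.

```python
import re


def parse_time_real(text):
    """Convert a shell `time` value such as '1m29.852s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if m is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def scan_rate_mb_per_s(table_bytes, elapsed_s):
    """Sequential scan throughput in MB/sec (assumption: 1 MB = 10**6 bytes)."""
    return table_bytes / elapsed_s / 1e6


def implied_size_mb(rate_mb_per_s, elapsed_s):
    """Table size in MB implied by a given scan rate and elapsed time."""
    return rate_mb_per_s * elapsed_s


elapsed = parse_time_real("1m29.852s")       # the "real" time reported above
print(round(elapsed, 3))                     # elapsed seconds
# Assumption: the rate is still about 47MB/sec after the conversion.
print(round(implied_size_mb(47, elapsed)))   # implied size in MB, not measured
```

With a measured on-disk size for the table, scan_rate_mb_per_s(size_in_bytes, elapsed) gives the throughput directly, which can then be compared against what the disk is known to sustain.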

