Re: Window functions, partitioning, and sorting performance

Поиск
Список
Период
Сортировка
От Eli Naeher
Тема Re: Window functions, partitioning, and sorting performance
Дата
Msg-id CAJVWyAwMk3CzSOundbDKZtkJ0W2tZQ+XfcGqNeD4N0azSJdHLw@mail.gmail.com
обсуждение исходный текст
Ответ на Window functions, partitioning, and sorting performance  (Eli Naeher <enaeher@gmail.com>)
Ответы Re: Window functions, partitioning, and sorting performance  (Eli Naeher <enaeher@gmail.com>)
Re: Window functions, partitioning, and sorting performance  (Marti Raudsepp <marti@juffo.org>)
Список pgsql-performance
Upping work_mem did roughly halve the time, but after thinking about Shaun's suggestion, I figured it's better to calculate this stuff once and then store it. So here is how the table looks now:

                                          Table "public.stop_event"
       Column        |            Type             |                        Modifiers                        
---------------------+-----------------------------+---------------------------------------------------------
 stop_time           | timestamp without time zone | not null
 stop                | integer                     | not null
 bus                 | integer                     | not null
 direction           | integer                     | not null
 route               | integer                     | not null
 id                  | bigint                      | not null default nextval('stop_event_id_seq'::regclass)
 previous_stop_event | bigint                      | 
Indexes:
    "stop_event_pkey" PRIMARY KEY, btree (id)
    "stop_event_previous_stop_event_idx" btree (previous_stop_event)
Foreign-key constraints:
    "stop_event_direction_id_fkey" FOREIGN KEY (direction) REFERENCES direction(id)
    "stop_event_previous_stop_event_fkey" FOREIGN KEY (previous_stop_event) REFERENCES stop_event(id)
    "stop_event_route_fkey" FOREIGN KEY (route) REFERENCES route(id)
    "stop_event_stop" FOREIGN KEY (stop) REFERENCES stop(id)
Referenced by:
    TABLE "stop_event" CONSTRAINT "stop_event_previous_stop_event_fkey" FOREIGN KEY (previous_stop_event) REFERENCES stop_event(id)

previous_stop_event simply references the previous (by stop_time) stop event for the combination of stop, route, and direction. I have successfully populated this column for my existing test data. However, when I try to do a test self-join using it, Postgres does two seq scans across the whole table, even though I have indexes on both id and previous_stop_event: http://explain.depesz.com/s/ctck. Any idea why those indexes are not being used?

Thank you again,
-Eli

On Thu, Aug 21, 2014 at 9:05 AM, Shaun Thomas <sthomas@optionshouse.com> wrote:
On 08/21/2014 08:29 AM, Eli Naeher wrote:

With around 1.2 million rows, this takes 20 seconds to run. 1.2 million
rows is only about a week's worth of data, so I'd like to figure out a
way to make this faster.

Well, you'll probably be able to reduce the run time a bit, but even with really good hardware and all in-memory processing, you're not going to see significant run-time improvements with that many rows. This is one of the reasons reporting-specific structures, such as fact tables, were designed to address.

Repeatedly processing the same week/month/year aggregate worth of several million rows will just increase linearly with each iteration as data size increases. You need to maintain up-to-date aggregates on the metrics you actually want to measure, so you're only reading the few hundred rows you introduce every update period. You can retrieve those kind of results in a few milliseconds.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


В списке pgsql-performance по дате отправления:

Предыдущее
От: Shaun Thomas
Дата:
Сообщение: Re: Turn off Hyperthreading! WAS: 60 core performance with 9.3
Следующее
От: Eli Naeher
Дата:
Сообщение: Re: Window functions, partitioning, and sorting performance