Window functions, partitioning, and sorting performance

Поиск

Список

Период

Сортировка

От	Eli Naeher
Тема	Window functions, partitioning, and sorting performance
Дата	21 августа 2014 г. 16:30:01
Msg-id	CAJVWyAwqnyAw0XvM5tbGFMVuP51YYJXJJoGaRF2gHxhoQn_BFg@mail.gmail.com обсуждение исходный текст
Ответы	Re: Window functions, partitioning, and sorting performance Re: Window functions, partitioning, and sorting performance
Список	pgsql-performance

Дерево обсуждения

I have a table called stop_event (a stop event is one bus passing one bus stop at a given time for a given route and direction), and I'd like to get the average interval for each stop/route/direction combination.

A few hundred new events are written to the table once every minute. No rows are ever updated (or deleted, except in development).

stop_event looks like this:

Table "public.stop_event"

Column | Type | Modifiers

-----------+-----------------------------+-----------

stop_time | timestamp without time zone | not null

stop | integer | not null

bus | integer | not null

direction | integer | not null

route | integer | not null

Foreign-key constraints:

"stop_event_direction_id_fkey" FOREIGN KEY (direction) REFERENCES direction(id)

"stop_event_route_fkey" FOREIGN KEY (route) REFERENCES route(id)

"stop_event_stop" FOREIGN KEY (stop) REFERENCES stop(id)

And my query looks like this:

SELECT (floor(date_part(E'epoch', avg(interval))) / 60)::INTEGER,
route,
direction,
name,
st_asgeojson(stop_location)::JSON
FROM
(SELECT (stop_time - (lag(stop_time) OVER w)) AS interval,
route,
direction,
name,
stop_location
FROM stop_event
INNER JOIN stop ON (stop_event.stop = stop.id)

WINDOW w AS (PARTITION BY route, direction, stop ORDER BY stop_time))

AS all_intervals
WHERE (interval IS NOT NULL)
GROUP BY route,
direction,
name,
stop_location;

With around 1.2 million rows, this takes 20 seconds to run. 1.2 million rows is only about a week's worth of data, so I'd like to figure out a way to make this faster. The EXPLAIN ANALYZE is at http://explain.depesz.com/s/ntC.

Clearly the bulk of the time is spent sorting the rows in the original table, and then again sorting the results of the subselect. But I'm afraid I don't really know what to do with this information. Is there any way I can speed this up? Is my use of an aggregate key for stop_event causing problems? Would using a synthetic key help?

Thank you for any help you can provide,
-Eli

В списке pgsql-performance по дате отправления:

Предыдущее

От: Mark Kirkwood
Дата: 21 августа 2014 г., 13:02:22
Сообщение: Re: Turn off Hyperthreading! WAS: 60 core performance with 9.3

Следующее

От: Marti Raudsepp
Дата: 21 августа 2014 г., 17:02:58
Сообщение: Re: Window functions, partitioning, and sorting performance

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Window functions, partitioning, and sorting performance

Предыдущее

Следующее