Re: Processor usage/tuning question

From  Israel Brewster
Subject  Re: Processor usage/tuning question
Msg-id  886EAC21-B2DA-4E8F-9717-C73DAA80933A@ravnalaska.net
In response to  Re: Processor usage/tuning question  (Jeff Janes <jeff.janes@gmail.com>)
Responses  Re: Processor usage/tuning question  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List  pgsql-general

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------



On Oct 8, 2014, at 11:34 AM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Tue, Oct 7, 2014 at 12:06 PM, israel <israel@eraalaska.net> wrote:

Thank you all for the advice. It looks like the load is due to a query that is taking around 1300ms to complete - a query that is run by every client connected (probably half a dozen or so, although I don't have specific numbers), every fifteen seconds or so. As you can imagine, that keeps the server rather busy :-) Specifically, it looks like the time is due to a sort (PARTITION BY tail ORDER BY pointtime DESC) that operates on around 100,000 rows.

The lovely details:

The query in question is the following:

SELECT *
FROM (SELECT tail, to_char(pointtime,'MM/DD/YYYY HH24:MI:SS'), lat, lng, altitude, heading, speed, source, pointtime,
             ROW_NUMBER() OVER (PARTITION BY tail ORDER BY pointtime DESC) AS row
      FROM data
      WHERE tail IN (<list of about 55 values or so>)
        AND pointtime >= '<timestamp of 24 hours prior to current UTC time>'::timestamp) s1
WHERE s1.row <= 5
ORDER BY tail, pointtime DESC

In English, it boils down to getting the five most recent data points for each listed tail number. I look at the last 24 hours of data because it is quite possible that a tail number may have no recent data points.

How many different tail numbers do you have in the last 24 hours?  Based on the numbers you provide, it sounds like the list of 55 tail numbers is pretty much all of them that it could expect to find anyway.

In this case, yes - that 55 or so is everything (worst case, but a fairly common scenario). The system is set up so the user can select which tail number(s) they want to see, hence the tail IN (<list>) construct. It's just that they often select all tail numbers.

 

One obvious optimization is to look at a smaller time range. This will definitely speed up the query, but at the risk of not getting any data points for one or more of the requested tail numbers (that risk already exists, but looking back 24 hours keeps it fairly small for us).

But then, do you care?  If you haven't heard from an airplane in 24 hours, it seems like either you don't care, or you care very very much and don't need the database to remind you.

Except that the last data point received is still valid as the aircraft's current location, even if it came in several hours ago - which we may well care about. That is, some users may need to see where a given aircraft (or the entire fleet) is, even if an aircraft hasn't updated in a while. That said, I did discuss this with my higher-ups, and got the OK to take it down to four hours.
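
For what it's worth, the narrowed filter would look roughly like this if computed on the server side rather than in the client (just a sketch - it assumes pointtime is stored as UTC, which the current 24-hour version implies):

WHERE tail IN (<list of tail numbers>)
  AND pointtime >= (now() AT TIME ZONE 'UTC') - interval '4 hours'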


 

The table description:
tracking=# \d data
                                     Table "public.data"
  Column   |            Type             |                     Modifiers
-----------+-----------------------------+---------------------------------------------------
 id        | bigint                      | not null default nextval('data_id_seq'::regclass)
 tail      | character varying(16)       | not null
 timerecp  | timestamp without time zone | not null default now()
 altitude  | integer                     |
 pointtime | timestamp without time zone |
 lat       | numeric(7,5)                | not null
 lng       | numeric(8,5)                | not null
 speed     | integer                     |
 heading   | integer                     |
 source    | character varying(64)       |
 syncd     | boolean                     | default false
Indexes:
    "data_pkey" PRIMARY KEY, btree (id)
    "pointtime_idx" btree (pointtime)
    "syncd_idx" btree (syncd)
    "tail_idx" btree (tail)
    "tailtime_idx" btree (tail, pointtime DESC)
    "timerecp_idx" btree (timerecp)

tracking=#

Adding the two-column sorted index didn't seem to affect the query time much.

I don't think PostgreSQL is going to be able to reason very effectively about a ROW_NUMBER() in an inner subquery and a row <= 5 in the outer one being equivalent to a LIMIT query, for which it could walk an index and then stop once it finds 5 of them.

Does this need to issued as a single query?  Why not issue 55 different queries?  It seems like the client is likely going to need to pick the returned list back out by tail number anyway, so both the client and the server might be happier with separate queries.

Good point. Doing that brought the entire execution time down to around 60ms. Just ever so slightly better than the ~1200ms I was getting before. :-) I just have an unreasonable aversion to making multiple queries. I guess this is a prime example of why I need to overcome that :-)
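
For reference, each per-tail query is essentially the original minus the window function - roughly along these lines (same column list and time bound as above; exact form may differ):

SELECT tail, to_char(pointtime,'MM/DD/YYYY HH24:MI:SS'), lat, lng, altitude, heading, speed, source, pointtime
FROM data
WHERE tail = '<one tail number>'
  AND pointtime >= '<timestamp of 4 hours prior to current UTC time>'::timestamp
ORDER BY pointtime DESC
LIMIT 5

Each of those can walk the (tail, pointtime DESC) index and stop after five rows. If I ever do decide a single statement is worth it, a LATERAL join (PostgreSQL 9.3 or later) should get the same per-tail index walk in one query - untested sketch:

SELECT d.*
FROM unnest(ARRAY['<tail 1>', '<tail 2>', ...]) AS t(tail)
CROSS JOIN LATERAL (
    SELECT tail, to_char(pointtime,'MM/DD/YYYY HH24:MI:SS'), lat, lng, altitude, heading, speed, source, pointtime
    FROM data
    WHERE data.tail = t.tail
      AND pointtime >= '<timestamp of 4 hours prior to current UTC time>'::timestamp
    ORDER BY pointtime DESC
    LIMIT 5
) d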

 

The table currently contains 1,303,951 rows, and any given 24-hour period has around 110,000 rows.

The results of the explain analyze command can be seen here: http://explain.depesz.com/s/H5w (nice site, btw. I'll have to be sure to bookmark it), where it clearly shows that the sort on data.tail, data.pointtime is the largest time sink (if I am reading it right).


The sort does seem pretty slow.  What is your encoding and collation?  Could you use the "C" collation if you are not already?

Encoding is UTF8, and SHOW lc_collate shows en_US.UTF-8. I don't know enough about collations to know whether I can use the "C" collation or not :-) I'll have to look into that more.
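
From a quick look, it does seem like the collation can be overridden per expression or per index column without changing the database - something like this (untested sketch; the index name is made up):

CREATE INDEX tailtime_c_idx ON data (tail COLLATE "C", pointtime DESC);

SELECT ... ORDER BY tail COLLATE "C", pointtime DESC

The query (including the window function's PARTITION BY tail) would have to ask for the same collation for the sort or index to benefit, and it would be worth checking that byte-wise ordering of the tail numbers is acceptable for display.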

Of course, that said, given the speedup accomplished by your suggestion of running the queries separately, perhaps the rest of this is a moot point.

Cheers,

Jeff
