Re: window function induces full table scan
От | Thomas Mayer |
---|---|
Тема | Re: window function induces full table scan |
Дата | |
Msg-id | 52C6008C.7090201@student.kit.edu обсуждение исходный текст |
Ответ на | Re: window function induces full table scan (Thomas Mayer <thomas.mayer@student.kit.edu>) |
Список | pgsql-performance |
Just to track it down: The limitation can also be reproduced without using views. Using views is just a use case where the suggested optimization is actually needed. Plus, when I remove the condition "WHERE datepos = 1", the same behaviour still occurs. Here, I wanted to see if postgresql is preferring the condition "WHERE datepos = 1" (datepos is the result of the window function) over the condition "user_id = 43" for optimization. But this is not the case. -- workaround example: "WHERE user_id = 43" condition in subselect SELECT user_id, latitude, longitude FROM ( SELECT user_id, latitude, longitude, rank() OVER (PARTITION BY user_id ORDER BY date DESC, id DESC) AS datepos FROM checkin_node WHERE user_id = 43 ) AS tmp_last_position WHERE datepos = 1; -- takes 2 ms -- track it down: reproduce limitation without a view: SELECT user_id, latitude, longitude FROM ( SELECT user_id, latitude, longitude, rank() OVER (PARTITION BY user_id ORDER BY date DESC, id DESC) AS datepos FROM checkin_node ) AS tmp_last_position WHERE datepos = 1 AND user_id = 43; -- takes 6621 ms -- without datepos condition SELECT user_id, latitude, longitude FROM ( SELECT user_id, latitude, longitude, rank() OVER (PARTITION BY user_id ORDER BY date DESC, id DESC) AS datepos FROM checkin_node ) AS tmp_last_position WHERE user_id = 43; -- takes 6574 ms Best regards, Thomas Am 03.01.2014 00:12, schrieb Thomas Mayer: > > Am 02.01.2014 23:43, schrieb Tom Lane: >> Jeff Janes <jeff.janes@gmail.com> writes: >>> On Thu, Jan 2, 2014 at 1:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> It's possible that in the specific case you exhibit here, pushing down >>>> the clause wouldn't result in changes in the window function's output for >>>> the selected rows, but the optimizer doesn't have enough knowledge about >>>> window functions to determine that. >> >>> A restriction in the WHERE clause which corresponds to the PARTITION BY >>> should be pushable, no? I think it doesn't need to understand the internal >>> semantics of the window function itself, just of the PARTITION BY, which >>> should be doable, at least in principle. >> >> If the restriction clause must give the same answer for any two rows of >> the same partition, then yeah, we could in principle push it down without >> knowing anything about the specific window function. It'd be a less than >> trivial test to make, I think. In any case, it's not a "bug" that the >> optimizer doesn't do this currently. > > I agree, this is not a "bug" in v9.3.2 in terms of correctness. > > But it's a limitation, because the query plan is by far not optimal. You > may consider this report as a feature request then. > > The optimization I suggested is normally performed, when no window > function occurs in the statement. > > It seems like the optimizer is already capable of doing a check if the > WHERE can be done first. > > However, this check seems to be done too conservative: I guess, the > check is ignoring the PARTITION-BY-sets of attributes completely. > >> >> regards, tom lane >> . >> > > Best regards > Thomas > > -- ====================================== Thomas Mayer Durlacher Allee 61 D-76131 Karlsruhe Telefon: +49-721-2081661 Fax: +49-721-72380001 Mobil: +49-174-2152332 E-Mail: thomas.mayer@student.kit.edu =======================================
В списке pgsql-performance по дате отправления: