Re: Windowing Function Patch Review -> Standard Conformance
От | Hitoshi Harada |
---|---|
Тема | Re: Windowing Function Patch Review -> Standard Conformance |
Дата | |
Msg-id | e08cc0400811240541p296f051v9f3298b821e23e0@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Windowing Function Patch Review -> Standard Conformance (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Список | pgsql-hackers |
2008/11/24 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>: > Hitoshi Harada wrote: >> >> 2008/11/20 David Rowley <dgrowley@gmail.com>: >>> >>> I won't be around until Monday evening (Tuesday morning JST). I'll pickup >>> the review again there. It's really time for me to push on with this >>> review. >>> The patch is getting closer to getting the thumbs up from me. I'm really >>> hoping that can happen on Monday. Then it's over to Heikki for more code >>> feedback. >> >> This time I only fixed some bugs and rebased against the HEAD, but did >> not refactored. I can figure out some part of what Heikki claimed but >> not all, so I'd like to see what he will send and post another patch >> if needed. > > Thanks! Here's what I've got this far I merged your latest patch into this, > as well as latest changes from CVS HEAD. > > It's a bit of a mess, but here's the big changes I've done: > - Removed all the iterator stuff. You can just call WinGetPart/FrameGetArg > repeatedly. It ought to be just as fast if done right in nodeWindow.c. > - Removed all the Shrinked/Extended stuff, as it's not needed until we have > full support for window frames. > - Removed all the WinRow* functions, you can just call WinFrame/PartGet* > functions, using WINDOW_SEEK_CURRENT > - Removed WinFrame/PartGetTuple functions. They were only used for > determining if two tuples are peer with each other, so I added a > WinRowIsPeer function instead. > - Removed all the buffering strategy stuff. Currently, the whole partition > is always materialized. That's of course not optimal; I'm still thinking we > should just read the tuples from the outer node lazily, on-demand, instead. > To avoid keeping all tuples in the partition in tuplestore, when not needed, > should add an extra function to trim the tuplestore. > > There's now a lot less code in nodeWindow.c, and I'm starting to understand > most of what-s left :-). > > TODO > - clean up the comments and other mess. > - Modify WinPart/FrameGetArg so that it's passed the parameter number > instead of an Expr node. > > I'll continue working on the executor, but please let me know what you > think. > It is amazing changes and I like it. Actually I wanted to get it slimer but hadn't found the way. two points, frame concept and window_gettupleslot If you keep this in your mind, please don't be annoyed but the current frame concept is wrong. -- yours sample=# select depname, empno, salary, last_value(empno) over(order by salary) from empsalary; depname | empno | salary | last_value -----------+-------+--------+------------personnel | 5 | 3500 | 5personnel | 2 | 3900 | 2develop | 7 | 4200 | 7develop | 9 | 4500 | 9sales | 4 | 4800 | 4sales | 3 | 4800 | 3sales |b1 | 5000 | 1develop | 11 | 5200 | 11develop | 10 | 5200 | 10develop | 8 | 6000 | 8 (10 rows) -- mine sample=# select depname, empno, salary, last_value(empno) over(order by salary) from empsalary; depname | empno | salary | last_value -----------+-------+--------+------------personnel | 5 | 3500 | 5personnel | 2 | 3900 | 2develop | 7 | 4200 | 7develop | 9 | 4500 | 9sales | 4 | 4800 | 3sales | 3 | 4800 | 3sales | 1 | 5000 | 1develop | 11 | 5200 | 10develop | 10 | 5200 | 10develop | 8 | 6000 | 8 (10 rows) Note that on empno=4 then last_value=4(yours)/3(mine), which means the frame is applied to last_value() as well as nth_value() and first_value(). Not only to aggregates. So these lines in nodeWindow.c, /* * If there is an ORDER BY (we don't support other window frame * specifications yet), the frame runs from first row ofthe partition * to the current row. Otherwise the frame is the whole partition. */if (((Window *) winobj->winstate->ss.ps.plan)->ordNumCols== 0) max_pos = winobj->partition_rows - 1;else max_pos = winobj->currentpos; max_pos is bogus. RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW means max_pos would be currentpos + rows_peers. I looked over the patch and aggregates care it with winobj->aggregatedupto but what about other window functions? Then window_gettupleslot looks like killing performance in some cases. What if the partition has millions of rows and wfunc1 seeks the head and wfunc2 seeks the tail? So as you point it is possible to define frame and partition while scanning current rows rather than before scanning, I felt it'd cost more. But I know this is work in progress, so you probably think about the solutions. What do you think? Regards, -- Hitoshi Harada
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Zdenek KotalaДата:
Сообщение: Re: Minor race-condition problem during database startup