Re: Windowing Function Patch Review -> Standard Conformance

Поиск
Список
Период
Сортировка
От Hitoshi Harada
Тема Re: Windowing Function Patch Review -> Standard Conformance
Дата
Msg-id e08cc0400811261921w332d463fvd5d4c43bc96a490c@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Windowing Function Patch Review -> Standard Conformance  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
2008/11/27 Tom Lane <tgl@sss.pgh.pa.us>:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Here's another updated patch, including all your bug fixes.
>
> I did a very fast pass through this and have a few comments.

Thanks for your comments. The executor part is now being refactored by
Heikki, but remaining are still on me. Note that some of those are
because of my earlier poor understanding.

>
> * Don't bother manually updating keywords.sgml.  As stated therein, that
> table is automatically generated.  All you'll accomplish is to cause merge
> problems.  (The effort would be a lot better spent on the non-boilerplate
> parts of the docs anyway.)

OK, I intend nothing here but didn't know the rule. will remove it.

> * I assume there's going to be some way to create user-defined window
> functions?

Yes, but for 8.4 no. The window functions will need specific window
function APIs rather than existing PG_XXX APIs and the design of them
affects how to design the Window executor node. So we are currently
desgining the APIs. If it completes in the future users can create
their own functions.

> * It seems fairly unlikely that you can get away with not supporting
> any qual expression on a Window plan node.  What will you do with HAVING
> qualifiers?

Window nodes are executed after any of WHERE, GROUP BY, HAVING, and
before ORDER BY. Window nodes don't have qual and HAVING doesn't give
any effect to Window operations.

> * The find_aggref code added to planagg.c (where it doesn't belong anyway)
> doesn't seem to be used anywhere.

It was needed to extract Aggref node in planner once, but not needed
anymore. will remove it.

> * In the same vein, I'm unimpressed with moving GetAggInitVal into
> execGrouping.c, which it isn't at all related to.  I'd just leave it alone
> and duplicate the code in nodeWindow.c --- it's not exactly large.  If you
> did insist on sharing this code it would be appropriate to change the
> name and comments to reflect the fact that it's being used for more than
> just aggregates, anyhow.

It is now in the discussion. Since nodeWindow has much duplicated code
in initialize/advance/finalize so we wonder if those codes should be
shared among the two nodes. If so, GetAggInitVal seems to be shared as
well as other aggregate specific code. If we decide to separate them,
your suggestion that GetAggInitVal should be duplicated will be sane.

> * And in the same vein. var.c is hardly the place to put a
> search-for-wfuncs routine.

Agreed, but where to go? clause.c may be, or under parser/ ?

> * It seems like a coin was flipped to determine whether struct and field
> names would use "window", "win", or just "w" (I find "WFunc" to be
> particularly unhelpful to a reader who doesn't already know what it is).
> Please try to reduce the surprise factor.  I'd suggest consistently using
> "window" in type names, though "win" is an OK prefix for field names
> within window-related structs.

I named WFunc as WinFunc once, but sounds too long for such heavily
used node. I liked it like Agg, but Win is not appropriate neither is
Func. And also, its name is consistent with the added pg_proc column
named proiswfunc. I wonder it would be proiswinfunc if we rename WFunc
as WinFunc.

> * This is a bad idea:
>
>  /*
> +  * OrderClause -
> +  *       representation of ORDER BY in Window
> +  */
> + typedef SortGroupClause OrderClause;
> +
> +
> + /*
> +  * PartitionClause -
> +  *       representaition of PATITION BY in Window
> +  */
> + typedef SortGroupClause PartitionClause;
>
> If they're just SortGroupClauses, call them that, don't invent an alias.
> (Yes, I know SortClause and GroupClause used to be aliases.  That was a
> bad idea: it confused matters and required lots of useless duplicated
> code, except for the places where we didn't duplicate code because we were
> willing to assume struct equivalence.  There's basically just nothing that
> wins about that approach.)  In any case, "order" and "partition" are
> really bad names to be using here given the number of possible other
> meanings for those terms in a DBMS context.  If you actually need separate
> struct types then names like WindowPartitionClause would be appropriate.

This is because I didn't know quite well about windowed table
specification earlier (and when I was started the Group and the Sort
was separated as you point). And now I can tell the two nodes can be
named SortGroupClause, nothing special.

> * The API changes chosen for func_get_detail seem pretty bizarre.
> Why didn't you just add a new return code FUNCDETAIL_WINDOW?

An aggregate that is existing currently can be used as a window
function. But we need to treat it as specialized case. A normal
aggregate without OVER clause is GROUP BY aggregate and with OVER
clause it's window aggregate. For func_get_detail to determine which
aggregate windef variable must be passed. Is it better?

And also, block starting with "Oops. Time to die" comment in
ParseFuncOrColumn can be shared among two types. So I thought the two
are similar and func_get_detail cannot determine them in it. Sometimes
to be the same, sometimes not.

> * The node support needs to be gone over more closely.  I noticed just in
> random checking that WFunc is missing parse-location support in
> nodeFuncs.c and the Query.hasWindow field got added to copyfuncs, outfuncs,
> readfuncs, but not equalfuncs.

Agree. will check it.

> * Please heed the comment at the top of parallel_schedule about the max
> number of tests per parallel group.

Oops. I need to be more heedful.

> * I don't find the test added to opr_sanity.sql to be particularly sane.
> We *will* have the ability to add window functions.  But it might be
> helpful to check that proisagg and proiswfunc aren't both set.

Hmmm, but we don't forbid the both set, so it seems slightly
confusing, especially when someone looks later. Ah, is it because
builtin functions don't have multiple src?

> * errcodes.h is not the only place that has to be touched to add a new
> error code --- see also sgml/errcodes.sgml, plpgsql/src/plerrcodes.h.
> And what is your precedent for using 42813?  I don't see that in the
> standard.  If it's coming from DB2 it's okay, otherwise we should be
> using a private 'P' code.

I'm not sure but just incremented from GROUPING_ERROR. I googled a bit
DB2 codes but 42813 doesn't make sense and not found appropriate code.
Maybe 'P' will do.

> * Please try to eliminate random whitespace changes from the patch
> ... *particularly* in files that otherwise wouldn't be touched at all
> (there are at least two cases of that in this patch)

OK.

>
> That's all I have time for right now ... more to come no doubt.

Particularly some of the planner part probably makes wrong. Thank you
for your check.


Regards,


-- 
Hitoshi Harada


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Jaime Casanova"
Дата:
Сообщение: Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1
Следующее
От: "Robert Haas"
Дата:
Сообщение: Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1