Re: WITHIN GROUP patch

Поиск
Список
Период
Сортировка
От Andrew Gierth
Тема Re: WITHIN GROUP patch
Дата
Msg-id 87eh5qh34u.fsf@news-spur.riddles.org.uk
обсуждение исходный текст
Ответ на Re: WITHIN GROUP patch  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: WITHIN GROUP patch
Re: WITHIN GROUP patch
Список pgsql-hackers
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Further questions about WITHIN GROUP:
Tom> I believe that the spec requires that the "direct" arguments ofTom> an inverse or hypothetical-set aggregate must
notcontain anyTom> Vars of the current query level.
 

False.

The spec requires that the direct arguments must not contain _ungrouped_
columns (see <set function specification>), but nowhere are grouped
columns forbidden.
Tom> They don't manage to say that in plain English, of course, butTom> in the <hypothetical set function> case the
behavioris definedTom> in terms of this sub-select:
 
Tom>     ( SELECT 0, SK1, ..., SKKTom>       FROM TNAME UNION ALLTom>       VALUES (1, VE1, ..., VEK) )

"TNAME" there is not the input table or an alias for it, but rather
the specific subset of rows to which the grouping operation is being
applied (after applying a FILTER if any).

We're in the General Rules here, not the Syntax Rules, so this is
describing _how to compute the result_ rather than a syntactic
transformation of the input.

In any event, going by the docs on the web, Oracle does not forbid
grouped columns there (their wording is "This expr must be constant
within each aggregation group.") MSSQL seems to require a literal
constant, but that's obviously not per spec. IBM doesn't seem to
have it in db2 for linux, but some of their other products have it
and include examples of using grouped vars: see


http://pic.dhe.ibm.com/infocenter/ntz/v7r0m3/index.jsp?topic=%2Fcom.ibm.nz.dbu.doc%2Fc_dbuser_hypothetical_set_family_syntax.html

So I'm going to say that the majority opinion is on my side here.
Tom> So that's all fine, but the patch seems a bit confused about it.

The patch treats vars in the direct args exactly as it would treat
them anywhere else where they must be ungrouped.

[snip a bunch of stuff based on false assumptions]
Tom> What I now think we should do about the added pg_aggregateTom> columns is to have:
Tom> aggnfixedargs    int    number of fixed arguments, excluding anyTom>             hypothetical ones (always 0 for
normalaggregates)
 
Tom> aggkind        char    'n' for normal aggregate,Tom>                   'o' for ordered set function,Tom>
 'h' for hypothetical-set function
 

I don't see any obvious problem with this.
Tom> with the parsing rules that given
Tom>      agg( n fixed arguments ) WITHIN GROUP ( ORDER BY k sort specifications )
Tom> 1. WITHIN GROUP is disallowed for normal aggregates.

(This is what the submitted patch does.)
Tom> 2. For an ordered set function, n must equal aggnfixedargs.  WeTom> treat all n fixed arguments as contributing to
theaggregate'sTom> result collation, but ignore the sort arguments.
 

That doesn't work for getting a sensible collation out of
percentile_disc applied to a collatable type. (Which admittedly is an
extension to the spec, which allows only numeric and interval, but it
seems to me to be worth having.)
Tom> 3. For a hypothetical-set function, n must equal aggnfixedargsTom> plus k, and we match up type and collation info
ofthe last kTom> fixed arguments with the corresponding sort arguments.  TheTom> first n-k fixed arguments contribute
tothe aggregate's resultTom> collation, the rest not.
 

The submitted patch does essentially this but taking the number of
non-variadic args in place of the suggested aggnfixedargs. Presumably
in your version the latter would be derived from the former?
Tom> Reading back over this email, I see I've gone back and forthTom> between using the terms "direct args" and "fixed
args"for theTom> evaluate-once stuff to the left of WITHIN GROUP.  I guess I'mTom> not really sold on either
terminology,but we need something weTom> can use consistently in the code and docs.  The spec is no help,Tom> it has no
genericterm at all for these args.  Does anybody elseTom> have a preference, or maybe another suggestion entirely?
 

We (Atri and I) have been using "direct args", but personally I'm not
amazingly happy with it. Documentation for other dbs tends to just call
them "arguments", and refer to the WITHIN GROUP expressions as "ordering
expressions" or similar.

-- 
Andrew (irc:RhodiumToad)



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Performance Improvement by reducing WAL for Update Operation
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: ANALYZE sampling is too good