Re: Function array_agg(array)

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: Function array_agg(array)
Дата
Msg-id CAFj8pRDurJzmwuBBvOKY7DQUE4fvGXFNbuT2VfLTdraQS53PdQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Function array_agg(array)  (Ali Akbar <the.apaan@gmail.com>)
Ответы Re: Function array_agg(array)  (Ali Akbar <the.apaan@gmail.com>)
Список pgsql-hackers


2014-10-22 16:58 GMT+02:00 Ali Akbar <the.apaan@gmail.com>:
Thanks for the review

2014-10-22 20:51 GMT+07:00 Pavel Stehule <pavel.stehule@gmail.com>:
I agree with your proposal. I have a few comments to design:

1. patch doesn't hold documentation and regress tests, please append it.
OK, i'll add the documentation and regression test 
 
2. this functionality (multidimensional aggregation) can be interesting more times, so maybe some interface like array builder should be preferred.
We already have accumArrayResult and makeArrayResult/makeMdArrayResult, haven't we?

Actually array_agg(anyarray) can be implemented by using accumArrayResult and makeMdArrayResult, but in the process we will deconstruct the array data and null bit-flags into ArrayBuildState->dvalues and dnulls. And we will reconstruct it through makeMdArrayResult. I think this way it's not efficient, so i decided to reimplement it and memcpy the array data and null flags as-is.

aha, so isn't better to fix a performance for accumArrayResult ?
 

In other places, i think it's clearer if we just use accumArrayResult and makeArrayResult/makeMdArrayResult as API for building (multidimensional) arrays.
 
3. array_agg was consistent with array(subselect), so it should be fixed too

postgres=# select array_agg(a) from test;
       array_agg      
-----------------------
 {{1,2,3,4},{1,2,3,4}}
(1 row)

postgres=# select array(select a from test);
ERROR:  could not find array type for data type integer[]

I'm pretty new in postgresql development. Can you point out where is array(subselect) implemented?

where you can start?

postgres=# explain select array(select a from test);
ERROR:  42704: could not find array type for data type integer[]
LOCATION:  exprType, nodeFuncs.c:116

look to code ... your magic keyword is a ARRAY_SUBLINK .. search in postgresql sources this keyword
 

 
4. why you use a magic constant (64) there?

+         astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes);
+         astate->aitems = 64 * nitems;

+             astate->nullbitmap = (bits8 *)
+                 repalloc(astate->nullbitmap, (astate->aitems + 7) / 8);

just follow the arbitrary size choosen in accumArrayResult (arrayfuncs.c):
astate->alen = 64; /* arbitrary starting array size */

it can be any number not too small and too big. Too small, and we will realloc shortly. Too big, we will end up wasting memory.

you can try to alloc 1KB instead as start -- it is used more times in Postgres. Then a overhead is max 1KB per agg call - what is acceptable.

You take this value from accumArrayResult, but it is targeted for shorted scalars - you should to expect so any array will be much larger.





Regards,
--
Ali Akbar

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kevin Grittner
Дата:
Сообщение: Re: Question about RI checks
Следующее
От: Pavel Stehule
Дата:
Сообщение: idea: allow AS label inside ROW constructor