Обсуждение: count_nulls(VARIADIC "any")

Поиск
Список
Период
Сортировка

count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
Hi,

I'd like to suggest $SUBJECT for inclusion in Postgres 9.6.  I'm sure
everyone would've found it useful at some point in their lives, and the
fact that it can't be properly implemented in any language other than C
I think speaks for the fact that we as a project should provide it.

A quick and dirty proof of concept (patch attached):

=# select count_nulls(null::int, null::text, 17, 'bar');
  count_nulls
-------------
            2
(1 row)

Its natural habitat would be CHECK constraints, e.g:

   CHECK (count_nulls(a,b,c) IN (0, 3))

Will finish this up for the next CF, unless someone wants to tell me how
stupid this idea is before that.


.m

Вложения

Re: count_nulls(VARIADIC "any")

От
Greg Stark
Дата:
On Wed, Aug 12, 2015 at 6:18 PM, Marko Tiikkaja <marko@joh.to> wrote:
> Will finish this up for the next CF, unless someone wants to tell me how
> stupid this idea is before that.

I'm kind of puzzled what kind of schema would need this.

-- 
greg



Re: count_nulls(VARIADIC "any")

От
Alvaro Herrera
Дата:
Greg Stark wrote:
> On Wed, Aug 12, 2015 at 6:18 PM, Marko Tiikkaja <marko@joh.to> wrote:
> > Will finish this up for the next CF, unless someone wants to tell me how
> > stupid this idea is before that.
> 
> I'm kind of puzzled what kind of schema would need this.

I've seen cases where you want some entity to be of either of some
number of types.  In those cases you have nullable FKs in separate
columns, and only one of them can be non null.

The name count_nulls() suggest an aggregate function to me, though.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:
Hi

2015-08-12 19:18 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
Hi,

I'd like to suggest $SUBJECT for inclusion in Postgres 9.6.  I'm sure everyone would've found it useful at some point in their lives, and the fact that it can't be properly implemented in any language other than C I think speaks for the fact that we as a project should provide it.

A quick and dirty proof of concept (patch attached):

=# select count_nulls(null::int, null::text, 17, 'bar');
 count_nulls
-------------
           2
(1 row)

Its natural habitat would be CHECK constraints, e.g:

  CHECK (count_nulls(a,b,c) IN (0, 3))

Will finish this up for the next CF, unless someone wants to tell me how stupid this idea is before that.

 It is not bad idea

+1

Pavel



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: count_nulls(VARIADIC "any")

От
Peter Geoghegan
Дата:
On Wed, Aug 12, 2015 at 10:30 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> The name count_nulls() suggest an aggregate function to me, though.

I thought the same.


-- 
Peter Geoghegan



Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 2015-08-12 7:23 PM, Greg Stark wrote:
> On Wed, Aug 12, 2015 at 6:18 PM, Marko Tiikkaja <marko@joh.to> wrote:
>> Will finish this up for the next CF, unless someone wants to tell me how
>> stupid this idea is before that.
>
> I'm kind of puzzled what kind of schema would need this.

The first example I could find from our schema was specifying the URL 
for a Remote Procedure Call.  You can either specify a single request 
URI, or you can specify the pieces: protocol, host, port, path.  So the 
constraints look roughly like this:
  CHECK ((fulluri IS NULL) <> (protocol IS NULL)),  CHECK ((protocol IS NULL) = ALL(ARRAY[host IS NULL, port IS NULL, 
path IS NULL]))

Obviously the second one would be much prettier with count_nulls().

The other example is an OOP inheritance-like schema where an object 
could be one of any X number of types.  You could write that:
  CHECK ((a IS NULL)::int + (b IS NULL)::int + (c IS NULL)::int) = 1)

or just:
  CHECK (count_nulls(a,b,c) = 1)

The first example could be redesigned with three tables, but that seems 
like a cure worse than the disease.


.m



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:


2015-08-12 19:32 GMT+02:00 Peter Geoghegan <pg@heroku.com>:
On Wed, Aug 12, 2015 at 10:30 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> The name count_nulls() suggest an aggregate function to me, though.

I thought the same.

maybe nulls_count ?

we have regr_count already

Regards

Pavel
 


--
Peter Geoghegan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 2015-08-12 7:35 PM, Pavel Stehule wrote:
> maybe nulls_count ?
>
> we have regr_count already

But that's an aggregate as well..


.m



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:


2015-08-12 19:37 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 2015-08-12 7:35 PM, Pavel Stehule wrote:
maybe nulls_count ?

we have regr_count already

But that's an aggregate as well..

my mistake

Pavel
 


.m

Re: count_nulls(VARIADIC "any")

От
Tom Lane
Дата:
Peter Geoghegan <pg@heroku.com> writes:
> On Wed, Aug 12, 2015 at 10:30 AM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> The name count_nulls() suggest an aggregate function to me, though.

> I thought the same.

Ditto.  I'd be fine with this if we can come up with a name that
doesn't sound like an aggregate.  The best I can do offhand is
"number_of_nulls()", which doesn't seem very pretty.
        regards, tom lane



Re: count_nulls(VARIADIC "any")

От
"David G. Johnston"
Дата:
On Wed, Aug 12, 2015 at 4:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Geoghegan <pg@heroku.com> writes:
> On Wed, Aug 12, 2015 at 10:30 AM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> The name count_nulls() suggest an aggregate function to me, though.

> I thought the same.

Ditto.  I'd be fine with this if we can come up with a name that
doesn't sound like an aggregate.  The best I can do offhand is
"number_of_nulls()", which doesn't seem very pretty.


nulls_in(a, b, c) IN (0, 3) - yes the repetition is not great...

David J.
 

Re: count_nulls(VARIADIC "any")

От
"Shulgin, Oleksandr"
Дата:
On Thu, Aug 13, 2015 at 2:19 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wed, Aug 12, 2015 at 4:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Geoghegan <pg@heroku.com> writes:
> On Wed, Aug 12, 2015 at 10:30 AM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> The name count_nulls() suggest an aggregate function to me, though.

> I thought the same.

Ditto.  I'd be fine with this if we can come up with a name that
doesn't sound like an aggregate.  The best I can do offhand is
"number_of_nulls()", which doesn't seem very pretty.


nulls_in(a, b, c) IN (0, 3) - yes the repetition is not great...

How about these:

nulls_rank() (the analogy being 0 <= "rank" <= "set size")
nnulls()

or just

nulls() (this one might be a bit confusing due to existing NULLS LAST/FIRST syntax, though)

--
Alex

Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 8/13/15 9:18 AM, Shulgin, Oleksandr wrote:
> nnulls()

I think I'd prefer num_nulls() over that.


.m



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:


2015-08-13 9:21 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 8/13/15 9:18 AM, Shulgin, Oleksandr wrote:
nnulls()

I think I'd prefer num_nulls() over that.

can be

what about similar twin function num_nonulls()?

Pavel


.m



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: count_nulls(VARIADIC "any")

От
Atri Sharma
Дата:


On Thu, Aug 13, 2015 at 12:55 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:


2015-08-13 9:21 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 8/13/15 9:18 AM, Shulgin, Oleksandr wrote:
nnulls()

I think I'd prefer num_nulls() over that.

can be

what about similar twin function num_nonulls()?

 
+1

Re: count_nulls(VARIADIC "any")

От
"Shulgin, Oleksandr"
Дата:
On Thu, Aug 13, 2015 at 9:25 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:


2015-08-13 9:21 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 8/13/15 9:18 AM, Shulgin, Oleksandr wrote:
nnulls()

I think I'd prefer num_nulls() over that.

can be

what about similar twin function num_nonulls()?

Yes.  But I'm can't see any precedent for naming it like num_*...  And if anything, should it be num_nonnulls() then?

Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:


2015-08-13 9:47 GMT+02:00 Shulgin, Oleksandr <oleksandr.shulgin@zalando.de>:
On Thu, Aug 13, 2015 at 9:25 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:


2015-08-13 9:21 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 8/13/15 9:18 AM, Shulgin, Oleksandr wrote:
nnulls()

I think I'd prefer num_nulls() over that.

can be

what about similar twin function num_nonulls()?

Yes.  But I'm can't see any precedent for naming it like num_*...  And if anything, should it be num_nonnulls() then?

it is detail -  depends on final naming convention.
 

Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
Hello,

Here's a patch implementing this under the name num_nulls().  For
January's CF, of course.



.m

Вложения

Re: count_nulls(VARIADIC "any")

От
Tom Lane
Дата:
Marko Tiikkaja <marko@joh.to> writes:
> Here's a patch implementing this under the name num_nulls().  For 
> January's CF, of course.

What's this do that "count(*) - count(x)" doesn't?
        regards, tom lane



Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 2015-11-21 06:06, Tom Lane wrote:
> Marko Tiikkaja <marko@joh.to> writes:
>> Here's a patch implementing this under the name num_nulls().  For
>> January's CF, of course.
>
> What's this do that "count(*) - count(x)" doesn't?

This is sort of a lateral version of count(x); the input is a list of 
expressions rather than an expression executed over a bunch of input rows.


.m



Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 2015-11-21 06:02, I wrote:
> Here's a patch implementing this under the name num_nulls().  For
> January's CF, of course.

I forgot to update the some references in the documentation.  Fixed in
v3, attached.


.m

Вложения

Re: count_nulls(VARIADIC "any")

От
Jim Nasby
Дата:
On 11/20/15 11:12 PM, Marko Tiikkaja wrote:
> On 2015-11-21 06:02, I wrote:
>> Here's a patch implementing this under the name num_nulls().  For
>> January's CF, of course.
>
> I forgot to update the some references in the documentation.  Fixed in
> v3, attached.

I thought there was going to be a not-null equivalent as well? I've 
definitely wanted both variations in the past.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 2015-11-21 06:52, Jim Nasby wrote:
> On 11/20/15 11:12 PM, Marko Tiikkaja wrote:
>> On 2015-11-21 06:02, I wrote:
>>> Here's a patch implementing this under the name num_nulls().  For
>>> January's CF, of course.
>>
>> I forgot to update the some references in the documentation.  Fixed in
>> v3, attached.
>
> I thought there was going to be a not-null equivalent as well? I've
> definitely wanted both variations in the past.

I'm not sure that's necessary.  It's quite simple to implement yourself 
using the  int - int  operator.


.m



Re: count_nulls(VARIADIC "any")

От
Jim Nasby
Дата:
On 11/20/15 11:55 PM, Marko Tiikkaja wrote:
> On 2015-11-21 06:52, Jim Nasby wrote:
>> On 11/20/15 11:12 PM, Marko Tiikkaja wrote:
>>> On 2015-11-21 06:02, I wrote:
>>>> Here's a patch implementing this under the name num_nulls().  For
>>>> January's CF, of course.
>>>
>>> I forgot to update the some references in the documentation.  Fixed in
>>> v3, attached.
>>
>> I thought there was going to be a not-null equivalent as well? I've
>> definitely wanted both variations in the past.
>
> I'm not sure that's necessary.  It's quite simple to implement yourself
> using the  int - int  operator.

Only if you know how many columns there already are.

Or does this not work if you hand it a row?
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 2015-11-22 18:29, Jim Nasby wrote:
> Only if you know how many columns there already are.
>
> Or does this not work if you hand it a row?

It "works" in the sense that it tells you whether the row is NULL or 
not.  I.e. the answer will always be 0 or 1.


.m



Re: count_nulls(VARIADIC "any")

От
Jim Nasby
Дата:
On 11/22/15 11:34 AM, Marko Tiikkaja wrote:
> On 2015-11-22 18:29, Jim Nasby wrote:
>> Only if you know how many columns there already are.
>>
>> Or does this not work if you hand it a row?
>
> It "works" in the sense that it tells you whether the row is NULL or
> not.  I.e. the answer will always be 0 or 1.

Hrm, I was hoping something like count_nulls(complex_type.*) would work.

I guess one could always create a wrapper function that does 
count_not_nulls() anyway.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 2015-11-22 21:17, Jim Nasby wrote:
> On 11/22/15 11:34 AM, Marko Tiikkaja wrote:
>> On 2015-11-22 18:29, Jim Nasby wrote:
>>> Only if you know how many columns there already are.
>>>
>>> Or does this not work if you hand it a row?
>>
>> It "works" in the sense that it tells you whether the row is NULL or
>> not.  I.e. the answer will always be 0 or 1.
>
> Hrm, I was hoping something like count_nulls(complex_type.*) would work.

Nope:

=# select num_nulls((f).*) from (select '(1,2,3)'::foo) ss(f);
ERROR:  row expansion via "*" is not supported here


.m



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:
<div dir="ltr">Hi<br /><div class="gmail_extra"><br /><div class="gmail_quote">2015-08-12 19:18 GMT+02:00 Marko
Tiikkaja<span dir="ltr"><<a href="mailto:marko@joh.to" target="_blank">marko@joh.to</a>></span>:<br /><blockquote
class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br /><br /> I'd like to
suggest$SUBJECT for inclusion in Postgres 9.6.  I'm sure everyone would've found it useful at some point in their
lives,and the fact that it can't be properly implemented in any language other than C I think speaks for the fact that
weas a project should provide it.<br /><br /> A quick and dirty proof of concept (patch attached):<br /><br /> =#
selectcount_nulls(null::int, null::text, 17, 'bar');<br />  count_nulls<br /> -------------<br />            2<br /> (1
row)<br/><br /> Its natural habitat would be CHECK constraints, e.g:<br /><br />   CHECK (count_nulls(a,b,c) IN (0,
3))<br/><br /> Will finish this up for the next CF, unless someone wants to tell me how stupid this idea is before
that.<spanclass="HOEnZb"><font color="#888888"><br /><br /><br /> .m<br /></font></span></blockquote></div><br
/></div><divclass="gmail_extra">I am sending updated version - support num_nulls and num_notnulls<br /><br /></div><div
class="gmail_extra">Regards<br/><br /></div><div class="gmail_extra">Pavel<br /></div></div> 

Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:


2016-01-03 21:37 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hi


2015-08-12 19:18 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
Hi,

I'd like to suggest $SUBJECT for inclusion in Postgres 9.6.  I'm sure everyone would've found it useful at some point in their lives, and the fact that it can't be properly implemented in any language other than C I think speaks for the fact that we as a project should provide it.

A quick and dirty proof of concept (patch attached):

=# select count_nulls(null::int, null::text, 17, 'bar');
 count_nulls
-------------
           2
(1 row)

Its natural habitat would be CHECK constraints, e.g:

  CHECK (count_nulls(a,b,c) IN (0, 3))

Will finish this up for the next CF, unless someone wants to tell me how stupid this idea is before that.


.m

I am sending updated version - support num_nulls and num_notnulls

and patch
 

Regards

Pavel

Вложения

Re: count_nulls(VARIADIC "any")

От
Jim Nasby
Дата:
On 1/3/16 2:37 PM, Pavel Stehule wrote:
> +         /* num_nulls(VARIADIC NULL) is defined as NULL */
> +         if (PG_ARGISNULL(0))
> +             return false;

Could you add to the comment explaining why that's the desired behavior?

> +         /*
> +          * Non-null argument had better be an array.  We assume that any call
> +          * context that could let get_fn_expr_variadic return true will have
> +          * checked that a VARIADIC-labeled parameter actually is an array.  So
> +          * it should be okay to just Assert that it's an array rather than
> +          * doing a full-fledged error check.
> +          */
> +         Assert(OidIsValid(get_base_element_type(get_fn_expr_argtype(fcinfo->flinfo, 0))));

Erm... is that really the way to verify that what you have is an array? 
ISTM there should be a macro for that somewhere...

> +         /* OK, safe to fetch the array value */
> +         arr = PG_GETARG_ARRAYTYPE_P(0);
> +
> +         ndims = ARR_NDIM(arr);
> +         dims = ARR_DIMS(arr);
> +         nitems = ArrayGetNItems(ndims, dims);
> +
> +         bitmap = ARR_NULLBITMAP(arr);
> +         if (bitmap)
> +         {
> +             bitmask = 1;
> +
> +             for (i = 0; i < nitems; i++)
> +             {
> +                 if ((*bitmap & bitmask) == 0)
> +                     count++;
> +
> +                 bitmask <<= 1;
> +                 if (bitmask == 0x100)
> +                 {
> +                     bitmap++;
> +                     bitmask = 1;

For brevity and example sake it'd probably be better to just use the 
normal iterator, unless there's a serious speed difference?

In the unit test, I'd personally prefer just building a table with the 
test cases and the expected NULL/NOT NULL results, at least for all the 
calls that would fit that paradigm. That should significantly reduce the 
size of the test. Not a huge deal though...

Also, I don't think anything is testing multiples of whatever value... 
how 'bout change the generate_series CASE statement to >40 instead of <>40?
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:
Hi

2016-01-03 22:49 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/3/16 2:37 PM, Pavel Stehule wrote:
+               /* num_nulls(VARIADIC NULL) is defined as NULL */
+               if (PG_ARGISNULL(0))
+                       return false;

Could you add to the comment explaining why that's the desired behavior?

This case should be different than num_nulls(VARIADIC ARRAY[..]) - this situation is really equivalent of missing data and NULL is correct answer. It should not be too clean in num_nulls, but when it is cleaner for num_notnulls. And more, it is consistent with other variadic functions in Postgres: see concat_internal and text_format.
 

+               /*
+                * Non-null argument had better be an array.  We assume that any call
+                * context that could let get_fn_expr_variadic return true will have
+                * checked that a VARIADIC-labeled parameter actually is an array.  So
+                * it should be okay to just Assert that it's an array rather than
+                * doing a full-fledged error check.
+                */
+               Assert(OidIsValid(get_base_element_type(get_fn_expr_argtype(fcinfo->flinfo, 0))));

Erm... is that really the way to verify that what you have is an array? ISTM there should be a macro for that somewhere...

really, it is. It is used more time. Although I am not against some macro, I don't think so it is necessary. The macro should not be too shorter than this text.
 

+               /* OK, safe to fetch the array value */
+               arr = PG_GETARG_ARRAYTYPE_P(0);
+
+               ndims = ARR_NDIM(arr);
+               dims = ARR_DIMS(arr);
+               nitems = ArrayGetNItems(ndims, dims);
+
+               bitmap = ARR_NULLBITMAP(arr);
+               if (bitmap)
+               {
+                       bitmask = 1;
+
+                       for (i = 0; i < nitems; i++)
+                       {
+                               if ((*bitmap & bitmask) == 0)
+                                       count++;
+
+                               bitmask <<= 1;
+                               if (bitmask == 0x100)
+                               {
+                                       bitmap++;
+                                       bitmask = 1;

For brevity and example sake it'd probably be better to just use the normal iterator, unless there's a serious speed difference?

The iterator does some memory allocations and some access to type cache. Almost all work of iterator is useless for this case. This code is developed by Marko, but I agree with this design. Using the iterator is big gun for this case. I didn't any performance checks, but it should be measurable  for any varlena arrays.
 
Regards

Pavel


In the unit test, I'd personally prefer just building a table with the test cases and the expected NULL/NOT NULL results, at least for all the calls that would fit that paradigm. That should significantly reduce the size of the test. Not a huge deal though...

Also, I don't think anything is testing multiples of whatever value... how 'bout change the generate_series CASE statement to >40 instead of <>40?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: count_nulls(VARIADIC "any")

От
Jim Nasby
Дата:
On 1/3/16 10:23 PM, Pavel Stehule wrote:
> Hi
>
> 2016-01-03 22:49 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com
> <mailto:Jim.Nasby@bluetreble.com>>:
>
>     On 1/3/16 2:37 PM, Pavel Stehule wrote:
>
>         +               /* num_nulls(VARIADIC NULL) is defined as NULL */
>         +               if (PG_ARGISNULL(0))
>         +                       return false;
>
>
>     Could you add to the comment explaining why that's the desired behavior?
>
>
> This case should be different than num_nulls(VARIADIC ARRAY[..]) - this
> situation is really equivalent of missing data and NULL is correct
> answer. It should not be too clean in num_nulls, but when it is cleaner
> for num_notnulls. And more, it is consistent with other variadic
> functions in Postgres: see concat_internal and text_format.

Makes sense, now that you explain it. Which is why I'm thinking it'd be 
good to add that explanation to the comment... ;)

>           Assert(OidIsValid(get_base_element_type(get_fn_expr_argtype(fcinfo->flinfo, 0))));
>
>
>     Erm... is that really the way to verify that what you have is an
>     array? ISTM there should be a macro for that somewhere...
>
>
> really, it is. It is used more time. Although I am not against some
> macro, I don't think so it is necessary. The macro should not be too
> shorter than this text.

Well, if there's other stuff doing that... would be nice to refactor 
that though.

>     For brevity and example sake it'd probably be better to just use the
>     normal iterator, unless there's a serious speed difference?
>
>
> The iterator does some memory allocations and some access to type cache.
> Almost all work of iterator is useless for this case. This code is
> developed by Marko, but I agree with this design. Using the iterator is
> big gun for this case. I didn't any performance checks, but it should be
> measurable  for any varlena arrays.

Makes sense then.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:
Hi

2016-01-04 5:49 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/3/16 10:23 PM, Pavel Stehule wrote:
Hi

2016-01-03 22:49 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com
<mailto:Jim.Nasby@bluetreble.com>>:

    On 1/3/16 2:37 PM, Pavel Stehule wrote:

        +               /* num_nulls(VARIADIC NULL) is defined as NULL */
        +               if (PG_ARGISNULL(0))
        +                       return false;


    Could you add to the comment explaining why that's the desired behavior?


This case should be different than num_nulls(VARIADIC ARRAY[..]) - this
situation is really equivalent of missing data and NULL is correct
answer. It should not be too clean in num_nulls, but when it is cleaner
for num_notnulls. And more, it is consistent with other variadic
functions in Postgres: see concat_internal and text_format.

Makes sense, now that you explain it. Which is why I'm thinking it'd be good to add that explanation to the comment... ;)

          Assert(OidIsValid(get_base_element_type(get_fn_expr_argtype(fcinfo->flinfo, 0))));


    Erm... is that really the way to verify that what you have is an
    array? ISTM there should be a macro for that somewhere...


really, it is. It is used more time. Although I am not against some
macro, I don't think so it is necessary. The macro should not be too
shorter than this text.

Well, if there's other stuff doing that... would be nice to refactor that though.

    For brevity and example sake it'd probably be better to just use the
    normal iterator, unless there's a serious speed difference?


The iterator does some memory allocations and some access to type cache.
Almost all work of iterator is useless for this case. This code is
developed by Marko, but I agree with this design. Using the iterator is
big gun for this case. I didn't any performance checks, but it should be
measurable  for any varlena arrays.

Makes sense then.


+ enhanced comment
+ rewritten regress tests

Regards

Pavel
 
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Вложения

Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 03/01/16 22:49, Jim Nasby wrote:
> In the unit test, I'd personally prefer just building a table with the
> test cases and the expected NULL/NOT NULL results, at least for all the
> calls that would fit that paradigm. That should significantly reduce the
> size of the test. Not a huge deal though...

I don't really see the point.  "The size of the test" doesn't seem like 
a worthwhile optimization target, unless the test scripts are somehow 
really unnecessarily large.

Further, if you were developing code related to this, previously you 
could just copy-paste the defective test case in order to easily 
reproduce a problem.  But now suddenly you need a ton of different setup.

I don't expect to really have a say in this, but I think the tests are 
now worse than they were before.


.m



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:


2016-01-12 17:27 GMT+01:00 Marko Tiikkaja <marko@joh.to>:
On 03/01/16 22:49, Jim Nasby wrote:
In the unit test, I'd personally prefer just building a table with the
test cases and the expected NULL/NOT NULL results, at least for all the
calls that would fit that paradigm. That should significantly reduce the
size of the test. Not a huge deal though...

I don't really see the point.  "The size of the test" doesn't seem like a worthwhile optimization target, unless the test scripts are somehow really unnecessarily large.

Further, if you were developing code related to this, previously you could just copy-paste the defective test case in order to easily reproduce a problem.  But now suddenly you need a ton of different setup.

I don't expect to really have a say in this, but I think the tests are now worse than they were before.

the form of regress tests is not pretty significant issue. Jim's design is little bit transparent, Marko's is maybe little bit practical. Both has sense from my opinion, and any hasn't significant advantage against other.

Regards

Pavel
 


.m

Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:
Hi

2016-01-17 8:43 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:


2016-01-12 17:27 GMT+01:00 Marko Tiikkaja <marko@joh.to>:
On 03/01/16 22:49, Jim Nasby wrote:
In the unit test, I'd personally prefer just building a table with the
test cases and the expected NULL/NOT NULL results, at least for all the
calls that would fit that paradigm. That should significantly reduce the
size of the test. Not a huge deal though...

I don't really see the point.  "The size of the test" doesn't seem like a worthwhile optimization target, unless the test scripts are somehow really unnecessarily large.

Further, if you were developing code related to this, previously you could just copy-paste the defective test case in order to easily reproduce a problem.  But now suddenly you need a ton of different setup.

I don't expect to really have a say in this, but I think the tests are now worse than they were before.

the form of regress tests is not pretty significant issue. Jim's design is little bit transparent, Marko's is maybe little bit practical. Both has sense from my opinion, and any hasn't significant advantage against other.

any possible agreement, how these tests should be designed?

simple patch, simple regress tests, so there are no reason for long waiting.

Regards

Pavel


Regards

Pavel
 


.m


Re: count_nulls(VARIADIC "any")

От
Jim Nasby
Дата:
On 1/21/16 1:48 PM, Pavel Stehule wrote:
>     the form of regress tests is not pretty significant issue. Jim's
>     design is little bit transparent, Marko's is maybe little bit
>     practical. Both has sense from my opinion, and any hasn't
>     significant advantage against other.
>
>
> any possible agreement, how these tests should be designed?
>
> simple patch, simple regress tests, so there are no reason for long waiting.

I don't really see how individual tests are more practical (you can 
still cut and paste a table...), but since there's no strong consensus 
either way I'd say it's up to you as author.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:


2016-01-22 13:34 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/21/16 1:48 PM, Pavel Stehule wrote:
    the form of regress tests is not pretty significant issue. Jim's
    design is little bit transparent, Marko's is maybe little bit
    practical. Both has sense from my opinion, and any hasn't
    significant advantage against other.


any possible agreement, how these tests should be designed?

simple patch, simple regress tests, so there are no reason for long waiting.

I don't really see how individual tests are more practical (you can still cut and paste a table...), but since there's no strong consensus either way I'd say it's up to you as author.

Marco is a author of this patch, so - Marco, please, send final version of this patch

Regards

Pavel
 

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 25/01/16 19:57, Pavel Stehule wrote:
> Marco is a author of this patch, so - Marco, please, send final version of
> this patch

I don't really care about the tests.  Can we not use the v5 patch 
already in the thread?  As far as I could tell there were no reviewer's 
comments on it anymore.


.m



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:


2016-01-26 11:42 GMT+01:00 Marko Tiikkaja <marko@joh.to>:
On 25/01/16 19:57, Pavel Stehule wrote:
Marco is a author of this patch, so - Marco, please, send final version of
this patch

I don't really care about the tests.  Can we not use the v5 patch already in the thread?  As far as I could tell there were no reviewer's comments on it anymore.

It was not my request, but I counted Jim as second reviewer.

So, I'll return back original regress tests. If I remember well, there are no any other objection, so I'll mark this version as ready for commiter.

1. the patch is rebased against master
2. now warning or errors due compilation
3. all tests are passed
4. the code is simple without side effects and possible negative performance impacts
6. there was not objections against the design
7. the iteration over null bitmap is used more times in our code, but this is new special case. We don't need iterate over array elements, so we should not to use existing array iterators.

I'll mark this patch as ready for commiter

Regards

Pavel



.m

Вложения

Re: count_nulls(VARIADIC "any")

От
Tom Lane
Дата:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> [ num_nulls_v6.patch ]

I started looking through this.  It seems generally okay, but I'm not
very pleased with the function name "num_notnulls".  I think it would
be better as "num_nonnulls", as I see Oleksandr suggested already.
        regards, tom lane



Re: count_nulls(VARIADIC "any")

От
Tom Lane
Дата:
I wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> [ num_nulls_v6.patch ]

> I started looking through this.  It seems generally okay, but I'm not
> very pleased with the function name "num_notnulls".  I think it would
> be better as "num_nonnulls", as I see Oleksandr suggested already.

Not hearing any complaints, I pushed it with that change and some other
cosmetic adjustments.
        regards, tom lane



Re: count_nulls(VARIADIC "any")

От
Pavel Stehule
Дата:
<p dir="ltr"><br /> Dne 5. 2. 2016 1:33 napsal uživatel "Tom Lane" <<a
href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>>:<br/> ><br /> > Pavel Stehule <<a
href="mailto:pavel.stehule@gmail.com">pavel.stehule@gmail.com</a>>writes:<br /> > > [ num_nulls_v6.patch ]<br
/>><br /> > I started looking through this.  It seems generally okay, but I'm not<br /> > very pleased with
thefunction name "num_notnulls".  I think it would<br /> > be better as "num_nonnulls", as I see Oleksandr suggested
already.<pdir="ltr">I have no problem with it.<p dir="ltr">Regards<p dir="ltr">Pavel<br /> ><br /> >            
           regards, tom lane<br /> 

Re: count_nulls(VARIADIC "any")

От
Marko Tiikkaja
Дата:
On 2016-02-05 05:06, Tom Lane wrote:
> I wrote:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>> [ num_nulls_v6.patch ]
>
>> I started looking through this.  It seems generally okay, but I'm not
>> very pleased with the function name "num_notnulls".  I think it would
>> be better as "num_nonnulls", as I see Oleksandr suggested already.
>
> Not hearing any complaints, I pushed it with that change and some other
> cosmetic adjustments.

Thanks Tom and Pavel and everyone who provided feedback.


.m



Re: count_nulls(VARIADIC "any")

От
Thomas Munro
Дата:
On Fri, Feb 5, 2016 at 5:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>> [ num_nulls_v6.patch ]
>
>> I started looking through this.  It seems generally okay, but I'm not
>> very pleased with the function name "num_notnulls".  I think it would
>> be better as "num_nonnulls", as I see Oleksandr suggested already.
>
> Not hearing any complaints, I pushed it with that change and some other
> cosmetic adjustments.

Would num_values be a better name than num_nonnulls?

-- 
Thomas Munro
http://www.enterprisedb.com



Re: count_nulls(VARIADIC "any")

От
Tom Lane
Дата:
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> Would num_values be a better name than num_nonnulls?

If "value" is a term that excludes null values, it's news to me.
        regards, tom lane



Re: count_nulls(VARIADIC "any")

От
Thomas Munro
Дата:
On Tue, Feb 9, 2016 at 9:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> Would num_values be a better name than num_nonnulls?
>
> If "value" is a term that excludes null values, it's news to me.

Ah, right, I was thinking of null as the absence of a value.  But in
fact it is a special value that indicates the absence of a "data
value".  And num_data_values doesn't sound great.

-- 
Thomas Munro
http://www.enterprisedb.com