Обсуждение: Add generate_series(date,date) and generate_series(date,date,integer)

Поиск
Список
Период
Сортировка

Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
This patch addresses a personal need: nearly every time I use generate_series for timestamps, I end up casting the result into date or the ISO string thereof. Like such:

SELECT d.dt::date as dt
FROM generate_series('2015-01-01'::date,
                     '2016-01-04'::date, 
                     interval '1 day') AS d(dt);

That's less than elegant.

With this patch, we can do this:


SELECT d.date_val FROM generate_series('1991-09-24'::date,'1991-10-01'::date) as d(date_val);
  date_val
------------
 1991-09-24
 1991-09-25
 1991-09-26
 1991-09-27
 1991-09-28
 1991-09-29
 1991-09-30
 1991-10-01
(8 rows)

SELECT d.date_val FROM generate_series('1991-09-24'::date,'1991-10-01'::date,7) as d(date_val);
  date_val
------------
 1991-09-24
 1991-10-01
(2 rows)

SELECT d.date_val FROM generate_series('1999-12-31'::date,'1999-12-29'::date,-1) as d(date_val);
  date_val
------------
 1999-12-31
 1999-12-30
 1999-12-29
(3 rows)

One thing I discovered in doing this patch is that if you do a timestamp generate_series involving infinity....it tries to do it. I didn't wait to see if it finished. 

For the date series, I put in checks to return an empty set:

SELECT d.date_val FROM generate_series('-infinity'::date,'1999-12-29'::date) as d(date_val);
 date_val
----------
(0 rows)

SELECT d.date_val FROM generate_series('1991-09-24'::date,'infinity'::date) as d(date_val);
 date_val
----------
(0 rows)


Notes:
- I borrowed the int4 implementation's check for step-size of 0 for POLA reasons. However, it occurred to me that the function might be leakproof if the behavior where changed to instead return an empty set. I'm not sure that leakproof is a goal in and of itself.

First attempt at this patch attached. The examples above are copied from the new test cases.

Вложения

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Michael Paquier
Дата:
On Mon, Jan 25, 2016 at 3:00 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
> This patch addresses a personal need: nearly every time I use
> generate_series for timestamps, I end up casting the result into date or the
> ISO string thereof. Like such:
>
> [...]
>
> One thing I discovered in doing this patch is that if you do a timestamp
> generate_series involving infinity....it tries to do it. I didn't wait to
> see if it finished.

Well, I would think that this is a bug that we had better address and
backpatch. It does not make much sense to use infinity for timestamps,
but letting it run infinitely is not good either.

> For the date series, I put in checks to return an empty set:
>
> SELECT d.date_val FROM generate_series('-infinity'::date,'1999-12-29'::date)
> as d(date_val);
>  date_val
> ----------
> (0 rows)
>
> SELECT d.date_val FROM generate_series('1991-09-24'::date,'infinity'::date)
> as d(date_val);
>  date_val
> ----------
> (0 rows)

Wouldn't a proper error be more adapted?
-- 
Michael



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Tom Lane
Дата:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Mon, Jan 25, 2016 at 3:00 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
>> One thing I discovered in doing this patch is that if you do a timestamp
>> generate_series involving infinity....it tries to do it. I didn't wait to
>> see if it finished.

> Well, I would think that this is a bug that we had better address and
> backpatch. It does not make much sense to use infinity for timestamps,
> but letting it run infinitely is not good either.

Meh.  Where would you cut it off?  AD 10000000000?  A few zeroes either
way doesn't really make much difference.

If it didn't respond to SIGINT, that would be an issue, but otherwise
this doesn't seem much more exciting than any other way to create a
query that will run longer than you want to wait.
        regards, tom lane



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px#ccc solid;padding-left:1ex"><br /> If it didn't respond to SIGINT, that would be an issue, but
otherwise<br/> this doesn't seem much more exciting than any other way to create a<br /> query that will run longer
thanyou want to wait.<br /><br />                         regards, tom lane<br /></blockquote></div><br /></div><div
class="gmail_extra">Itresponded to SIGINT, so yeah, meh.</div><div class="gmail_extra"><br /></div><div
class="gmail_extra">Ican see value in aligning the behavior of infinity queries between date and timestamp, but I have
nostrong opinion about which behavior is better: it's either set step = 0 or an ereport(), no biggie if we want to
handlethe condition, I rip out the DATE_NOT_FINITE() checks.</div><div class="gmail_extra"><br /></div><div
class="gmail_extra">Incidentally,is there a reason behind the tendency of internal functions to avoid parameter
defaultsin favor of multiple pg_proc entries? I copied the existing behavior of the int4 generate_series, but having
oneentry with the defaults seemed more self-documenting.</div><div class="gmail_extra"><br /></div><div
class="gmail_extra"><br/></div></div> 

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Tom Lane
Дата:
Corey Huinker <corey.huinker@gmail.com> writes:
> Incidentally, is there a reason behind the tendency of internal functions
> to avoid parameter defaults in favor of multiple pg_proc entries?

Yes: you can't specify defaults in pg_proc.h.

We work around that where absolutely necessary, see the last part of
system_views.sql.  But it's messy enough, and bug-prone enough, to be
better avoided --- for example, it's very easy for the redeclaration
in system_view.sql to forget a STRICT or other similar marking.

Personally I'd say that every one of the existing cases that simply has
a default for the last argument was a bad idea, and would better have
been done in the traditional way with two pg_proc.h entries.
        regards, tom lane



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Tomas Vondra
Дата:
On 01/25/2016 07:22 AM, Tom Lane wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> On Mon, Jan 25, 2016 at 3:00 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
>>> One thing I discovered in doing this patch is that if you do a timestamp
>>> generate_series involving infinity....it tries to do it. I didn't wait to
>>> see if it finished.
>
>> Well, I would think that this is a bug that we had better address and
>> backpatch. It does not make much sense to use infinity for timestamps,
>> but letting it run infinitely is not good either.
>
> Meh. Where would you cut it off? AD 10000000000? A few zeroes either
> way doesn't really make much difference.

Why cut off? Why not to check if any of the input values is an infinity 
and simply error out in that case?

>
> If it didn't respond to SIGINT, that would be an issue, but
> otherwise this doesn't seem much more exciting than any other way to
> create a query that will run longer than you want to wait.

I disagree. Sure, it's possible to construct queries that take forever, 
but that's difficult (or impossible) to detect at query start. I don't 
think that means we should not guard against cases that are provably 
infinite and can't possibly work.

Imagine for example a script that in some rare cases passes happens to 
pass infinity into generate_series() - in that case I'd much rather 
error out than wait till the end of the universe.

So +1 from me to checking for infinity.

regard

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Jan 25, 2016 at 2:07 AM, Tom Lane <span
dir="ltr"><<ahref="mailto:tgl@sss.pgh.pa.us" target="_blank">tgl@sss.pgh.pa.us</a>></span> wrote:<br
/><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">CoreyHuinker <<a href="mailto:corey.huinker@gmail.com">corey.huinker@gmail.com</a>> writes:<br /> >
Incidentally,is there a reason behind the tendency of internal functions<br /> > to avoid parameter defaults in
favorof multiple pg_proc entries?<br /><br /></span>Yes: you can't specify defaults in pg_proc.h.<br /><br /> We work
aroundthat where absolutely necessary, see the last part of<br /> system_views.sql.  But it's messy enough, and
bug-proneenough, to be<br /> better avoided --- for example, it's very easy for the redeclaration<br /> in
system_view.sqlto forget a STRICT or other similar marking.<br /><br /> Personally I'd say that every one of the
existingcases that simply has<br /> a default for the last argument was a bad idea, and would better have<br /> been
donein the traditional way with two pg_proc.h entries.<br /><br />                         regards, tom lane<br
/></blockquote></div><br/></div><div class="gmail_extra">...which answers my follow up question where make_interval was
gettingthe defaults I saw in the table but not the .h file. Thanks!</div><div class="gmail_extra"><br /></div></div> 

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Michael Paquier
Дата:
On Mon, Jan 25, 2016 at 6:55 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On 01/25/2016 07:22 AM, Tom Lane wrote:
>>
>> Michael Paquier <michael.paquier@gmail.com> writes:
>>>
>>> On Mon, Jan 25, 2016 at 3:00 PM, Corey Huinker <corey.huinker@gmail.com>
>>> wrote:
>>>>
>>>> One thing I discovered in doing this patch is that if you do a timestamp
>>>> generate_series involving infinity....it tries to do it. I didn't wait
>>>> to
>>>> see if it finished.
>>
>>
>>> Well, I would think that this is a bug that we had better address and
>>> backpatch. It does not make much sense to use infinity for timestamps,
>>> but letting it run infinitely is not good either.
>>
>>
>> Meh. Where would you cut it off? AD 10000000000? A few zeroes either
>> way doesn't really make much difference.
>
>
> Why cut off? Why not to check if any of the input values is an infinity and
> simply error out in that case?

That's exactly my point.

>> If it didn't respond to SIGINT, that would be an issue, but
>> otherwise this doesn't seem much more exciting than any other way to
>> create a query that will run longer than you want to wait.
>
> I disagree. Sure, it's possible to construct queries that take forever, but
> that's difficult (or impossible) to detect at query start. I don't think
> that means we should not guard against cases that are provably infinite and
> can't possibly work.
>
> Imagine for example a script that in some rare cases passes happens to pass
> infinity into generate_series() - in that case I'd much rather error out
> than wait till the end of the universe.

Or wait until all the memory of the system is consumed. In the worst
case the OOM killer would come by, say 'Hi!' and do the police work,
but that's not cool either.
-- 
Michael



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Simon Riggs
Дата:
On 25 January 2016 at 09:55, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
 
Imagine for example a script that in some rare cases passes happens to pass infinity into generate_series() - in that case I'd much rather error out than wait till the end of the universe.

So +1 from me to checking for infinity.

+1

ERROR infinite result sets are not supported, yet

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Torsten Zuehlsdorff
Дата:
On 26.01.2016 07:52, Simon Riggs wrote:

>> Imagine for example a script that in some rare cases passes happens to
>> pass infinity into generate_series() - in that case I'd much rather error
>> out than wait till the end of the universe.
>>
>> So +1 from me to checking for infinity.
>>
>
> +1
>
> ERROR infinite result sets are not supported, yet

Maybe we should skip the "yet". Or do we really plan to support them in 
(infinite) future? ;)

+1 from me to check infinity also.

Greetings,
Torsten




Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Michael Paquier
Дата:
On Tue, Jan 26, 2016 at 7:00 PM, Torsten Zuehlsdorff
<mailinglists@toco-domains.de> wrote:
>
> On 26.01.2016 07:52, Simon Riggs wrote:
>
>>> Imagine for example a script that in some rare cases passes happens to
>>> pass infinity into generate_series() - in that case I'd much rather error
>>> out than wait till the end of the universe.
>>>
>>> So +1 from me to checking for infinity.
>>>
>>
>> +1
>>
>> ERROR infinite result sets are not supported, yet
>
>
> Maybe we should skip the "yet". Or do we really plan to support them in
> (infinite) future? ;)
>
> +1 from me to check infinity also.

Something like the patch attached would be fine? This wins a backpatch
because the query continuously running eats memory, no?
--
Michael

Вложения

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Jan 26, 2016 at 7:53 AM, Michael Paquier <span
dir="ltr"><<ahref="mailto:michael.paquier@gmail.com" target="_blank">michael.paquier@gmail.com</a>></span>
wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">OnTue, Jan 26, 2016 at 7:00 PM, Torsten Zuehlsdorff<br /> <<a
href="mailto:mailinglists@toco-domains.de">mailinglists@toco-domains.de</a>>wrote:<br /> ><br /> > On
26.01.201607:52, Simon Riggs wrote:<br /> ><br /> >>> Imagine for example a script that in some rare cases
passeshappens to<br /> >>> pass infinity into generate_series() - in that case I'd much rather error<br />
>>>out than wait till the end of the universe.<br /> >>><br /> >>> So +1 from me to checking
forinfinity.<br /> >>><br /> >><br /> >> +1<br /> >><br /> >> ERROR infinite result
setsare not supported, yet<br /> ><br /> ><br /> > Maybe we should skip the "yet". Or do we really plan to
supportthem in<br /> > (infinite) future? ;)<br /> ><br /> > +1 from me to check infinity also.<br /><br
/></span>Somethinglike the patch attached would be fine? This wins a backpatch<br /> because the query continuously
runningeats memory, no?<br /><span class="HOEnZb"><font color="#888888">--<br /> Michael<br /></font></span><br /><br
/>--<br /> Sent via pgsql-hackers mailing list (<a
href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br/> To make changes to your
subscription:<br/><a href="http://www.postgresql.org/mailpref/pgsql-hackers" rel="noreferrer"
target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/><br /></blockquote></div><br /></div><div
class="gmail_extra">I'llmodify my generate_series_date to give similar errors.</div><div class="gmail_extra"><br
/></div><divclass="gmail_extra">I have no opinion on backpatch.<br /><br />+1 for flippant references to
infinity.</div><divclass="gmail_extra"><br /></div></div> 

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
David Fetter
Дата:
On Tue, Jan 26, 2016 at 09:53:26PM +0900, Michael Paquier wrote:
> On Tue, Jan 26, 2016 at 7:00 PM, Torsten Zuehlsdorff
> <mailinglists@toco-domains.de> wrote:
> >
> > On 26.01.2016 07:52, Simon Riggs wrote:
> >
> >>> Imagine for example a script that in some rare cases passes
> >>> happens to pass infinity into generate_series() - in that case
> >>> I'd much rather error out than wait till the end of the
> >>> universe.
> >>>
> >>> So +1 from me to checking for infinity.
> >>>
> >>
> >> +1
> >>
> >> ERROR infinite result sets are not supported, yet
> >
> >
> > Maybe we should skip the "yet". Or do we really plan to support
> > them in (infinite) future? ;)
> >
> > +1 from me to check infinity also.
> 
> Something like the patch attached would be fine? This wins a
> backpatch because the query continuously running eats memory, no?

+1 for back-patching.  There's literally no case where an infinite
input could be correct as the start or end of an interval for
generate_series.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Euler Taveira
Дата:
On 26-01-2016 09:53, Michael Paquier wrote:
> Something like the patch attached would be fine? This wins a backpatch
> because the query continuously running eats memory, no?
> 
+1. Although it breaks compatibility, a function that just eats
resources is not correct.


--   Euler Taveira                   Timbira - http://www.timbira.com.br/  PostgreSQL: Consultoria, Desenvolvimento,
Suporte24x7 e Treinamento
 



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
Revised patch:

Differences in this patch vs my first one:
- infinite bounds generate errors identical to Michael's timestamp patch (though I did like Simon's highly optimistic error message).
- Bounds checking moved before memory context allocation
- arg variable "finish" renamed "stop" to match parameter name in documentation and error message

On Tue, Jan 26, 2016 at 12:47 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
On Tue, Jan 26, 2016 at 7:53 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Tue, Jan 26, 2016 at 7:00 PM, Torsten Zuehlsdorff
<mailinglists@toco-domains.de> wrote:
>
> On 26.01.2016 07:52, Simon Riggs wrote:
>
>>> Imagine for example a script that in some rare cases passes happens to
>>> pass infinity into generate_series() - in that case I'd much rather error
>>> out than wait till the end of the universe.
>>>
>>> So +1 from me to checking for infinity.
>>>
>>
>> +1
>>
>> ERROR infinite result sets are not supported, yet
>
>
> Maybe we should skip the "yet". Or do we really plan to support them in
> (infinite) future? ;)
>
> +1 from me to check infinity also.

Something like the patch attached would be fine? This wins a backpatch
because the query continuously running eats memory, no?
--
Michael


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


I'll modify my generate_series_date to give similar errors.

I have no opinion on backpatch.

+1 for flippant references to infinity.


Вложения

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Alvaro Herrera
Дата:
Simon Riggs wrote:
> On 25 January 2016 at 09:55, Tomas Vondra <tomas.vondra@2ndquadrant.com>
> wrote:
> 
> 
> > Imagine for example a script that in some rare cases passes happens to
> > pass infinity into generate_series() - in that case I'd much rather error
> > out than wait till the end of the universe.
> >
> > So +1 from me to checking for infinity.
> 
> +1
> 
> ERROR infinite result sets are not supported, yet

In the future we may get lazy evaluation of functions.  When and if that
happens we may want to remove this limitation so that you can get a
streaming result producing timestamp values continuously.

(I don't necessarily think you'd use generate_series in such cases.
Maybe you'd want clock_timestamp to generate the present value at each
call, for example.  But there may be uses for synthetic values as well.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Torsten Zühlsdorff
Дата:
On 26.01.2016 13:53, Michael Paquier wrote:

>>>> Imagine for example a script that in some rare cases passes happens to
>>>> pass infinity into generate_series() - in that case I'd much rather error
>>>> out than wait till the end of the universe.
>>>>
>>>> So +1 from me to checking for infinity.
>>>>
>>>
>>> +1
>>>
>>> ERROR infinite result sets are not supported, yet
>>
>>
>> Maybe we should skip the "yet". Or do we really plan to support them in
>> (infinite) future? ;)
>>
>> +1 from me to check infinity also.
>
> Something like the patch attached would be fine? This wins a backpatch
> because the query continuously running eats memory, no?

Looks fine to me.

(Minor mention: is there no newline needed between the tests?)

Greetings,
Torsten



Re: Add generate_series(date, date) and generate_series(date, date, integer)

От
David Steele
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, failed

Everything looks good except for two minor issues:

1) There should be an informative comment for this struct:

+/* Corey BEGIN */
+typedef struct
+{
+    DateADT        current;
+    DateADT        stop;
+    int32        step;
+} generate_series_date_fctx;

2) There's an extra linefeed after the struct.  Needed?

Regards,
-David

The new status of this patch is: Waiting on Author

Re: Re: Add generate_series(date, date) and generate_series(date, date, integer)

От
Corey Huinker
Дата:
Doh, I left that comment to myself in there. :)

The corresponding structs in timestamp.c and int.c have no comment, so suggestions of what should be there are welcome. In the interim I put in this:
/* state for generate_series_date(date,date,[step]) */

Extra linefeed after struct removed.

Do you have any insight as to why the documentation test failed?

In the mean time, here's the updated patch.


On Tue, Feb 2, 2016 at 11:41 AM, David Steele <david@pgmasters.net> wrote:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, failed

Everything looks good except for two minor issues:

1) There should be an informative comment for this struct:

+/* Corey BEGIN */
+typedef struct
+{
+       DateADT         current;
+       DateADT         stop;
+       int32           step;
+} generate_series_date_fctx;

2) There's an extra linefeed after the struct.  Needed?

Regards,
-David

The new status of this patch is: Waiting on Author

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: Re: Add generate_series(date, date) and generate_series(date, date, integer)

От
David Steele
Дата:
On 2/2/16 1:01 PM, Corey Huinker wrote:
> Doh, I left that comment to myself in there. :)

If I had a dime for every time I've done that...

> The corresponding structs in timestamp.c and int.c have no comment, so
> suggestions of what should be there are welcome. In the interim I put in
> this:
>
>     /* state for generate_series_date(date,date,[step]) */

I think that's fine -- whoever commits it may feel otherwise.

> Do you have any insight as to why the documentation test failed?
>
> In the mean time, here's the updated patch.

I didn't pass the docs on account of the wonky comment since I consider
code comments to be part of the docs.  The sgml docs build fine and look
good to me.

I'll retest and update the review accordingly.

--
-David
david@pgmasters.net


Re: Re: Add generate_series(date, date) and generate_series(date, date, integer)

От
Corey Huinker
Дата:

> Do you have any insight as to why the documentation test failed?
>
> In the mean time, here's the updated patch.

I didn't pass the docs on account of the wonky comment since I consider
code comments to be part of the docs.  The sgml docs build fine and look
good to me.


Ah, understood. I rebuilt the docs thinking there was a make check failure I missed, then viewed the html in links (I develop on an AWS box) and saw nothing untoward.
 
I'll retest and update the review accordingly.

Thanks! 

Re: Add generate_series(date, date) and generate_series(date, date, integer)

От
David Steele
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Everything looks good to me.  Ready for a committer to have a look.

The new status of this patch is: Ready for Committer

Re: Re: Add generate_series(date, date) and generate_series(date, date, integer)

От
Simon Riggs
Дата:
On 2 February 2016 at 18:01, Corey Huinker <corey.huinker@gmail.com> wrote:
Doh, I left that comment to myself in there. :)

The corresponding structs in timestamp.c and int.c have no comment, so suggestions of what should be there are welcome. In the interim I put in this:
/* state for generate_series_date(date,date,[step]) */

Extra linefeed after struct removed.

Do you have any insight as to why the documentation test failed?

In the mean time, here's the updated patch.

[step] is in days, but is not documented as such.

My understanding is you want to replace this

SELECT d.dt::date as dt
FROM generate_series('2015-01-01'::date,
                     '2016-01-04'::date
                     interval '1 day') AS d(dt);

with this

SELECT d.dt
FROM generate_series('
2015-01-01'::date,
                     '2016-01-04'::date,
                     7) as d(dt);
 
Personally, I think writing  INTERVAL '7 days' to be clearer than just typing 7.

Other than that, the only difference is the ::date part. Is it really worth adding extra code just for that? I would say not.

No comments on the patch itself, which seems to do the job, so apologies to give this opinion on your work, I do hope it doesn't put you off further contributions.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Re: Add generate_series(date, date) and generate_series(date, date, integer)

От
Corey Huinker
Дата:

[step] is in days, but is not documented as such.


It is in days, and is not documented as such, but since a day is the smallest unit of time for a date, I felt there was no other interpretation.

 
My understanding is you want to replace this

SELECT d.dt::date as dt
FROM generate_series('2015-01-01'::date,
                     '2016-01-04'::date
                     interval '1 day') AS d(dt);

with this

SELECT d.dt
FROM generate_series('
2015-01-01'::date,
                     '2016-01-04'::date,
                     7) as d(dt);


I'd also like to be able to join the values of d.dt without typecasting them.
To me it's as awkward as (select 1::double + 1::double)::integer
 

Personally, I think writing  INTERVAL '7 days' to be clearer than just typing 7.

Well, nearly all my use cases involve the step being 1 (and thus omitted) or -1. 

Maybe this example will appeal to you

SELECT d.birth_date, COUNT(r.kiddo_name) 
FROM generate_series('2016-01-01'::date,'2016-01-10'::date) as d(birth_date)
LEFT OUTER JOIN birth_records r ON r.birth_date = d.birth_date
GROUP BY 1
ORDER BY 1;


Other than that, the only difference is the ::date part. Is it really worth adding extra code just for that? I would say not.

I would argue it belongs for the sake of completeness. 
We added generate_series for numerics when generate_series for floats already existed.

No comments on the patch itself, which seems to do the job, so apologies to give this opinion on your work, I do hope it doesn't put you off further contributions.

Thanks. I appreciate that.
 

Re: Re: Add generate_series(date, date) and generate_series(date, date, integer)

От
Vik Fearing
Дата:
On 02/21/2016 07:56 PM, Corey Huinker wrote:
> 
>     Other than that, the only difference is the ::date part. Is it
>     really worth adding extra code just for that? I would say not.
> 
> 
> I would argue it belongs for the sake of completeness. 

So would I.

+1 for adding this missing function.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Christoph Berg
Дата:
Re: David Fetter 2016-01-26 <20160126180011.GA16903@fetter.org>
> +1 for back-patching.  There's literally no case where an infinite
> input could be correct as the start or end of an interval for
> generate_series.

select * from generate_series(now(), 'infinity', '1 day') limit 10;

... seems pretty legit to me. If limit pushdown into SRFs happened to
work some day, it'd be a pity if the above query raised an error.

Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Tom Lane
Дата:
Christoph Berg <myon@debian.org> writes:
> Re: David Fetter 2016-01-26 <20160126180011.GA16903@fetter.org>
>> +1 for back-patching.  There's literally no case where an infinite
>> input could be correct as the start or end of an interval for
>> generate_series.

> select * from generate_series(now(), 'infinity', '1 day') limit 10;
> ... seems pretty legit to me. If limit pushdown into SRFs happened to
> work some day, it'd be a pity if the above query raised an error.

Oooh ... actually, that works today if you consider the SRF-in-targetlist
case:

regression=# select generate_series(now(), 'infinity', '1 day') limit 10;       generate_series        
-------------------------------2016-02-21 16:51:03.303064-052016-02-22 16:51:03.303064-052016-02-23
16:51:03.303064-052016-02-2416:51:03.303064-052016-02-25 16:51:03.303064-052016-02-26 16:51:03.303064-052016-02-27
16:51:03.303064-052016-02-2816:51:03.303064-052016-02-29 16:51:03.303064-052016-03-01 16:51:03.303064-05
 
(10 rows)

Time: 8.457 ms

Given that counterexample, I think we not only shouldn't back-patch such a
change but should reject it altogether.
        regards, tom lane



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Michael Paquier
Дата:
On Mon, Feb 22, 2016 at 6:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Oooh ... actually, that works today if you consider the SRF-in-targetlist
> case:
>
> regression=# select generate_series(now(), 'infinity', '1 day') limit 10;
>         generate_series
> -------------------------------
>  2016-02-21 16:51:03.303064-05
>  2016-02-22 16:51:03.303064-05
>  2016-02-23 16:51:03.303064-05
>  2016-02-24 16:51:03.303064-05
>  2016-02-25 16:51:03.303064-05
>  2016-02-26 16:51:03.303064-05
>  2016-02-27 16:51:03.303064-05
>  2016-02-28 16:51:03.303064-05
>  2016-02-29 16:51:03.303064-05
>  2016-03-01 16:51:03.303064-05
> (10 rows)
>
> Time: 8.457 ms
>
> Given that counterexample, I think we not only shouldn't back-patch such a
> change but should reject it altogether.

Ouch, good point. The overflows are a different problem that we had
better address though (still on my own TODO list)...
-- 
Michael



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
>
> Given that counterexample, I think we not only shouldn't back-patch such a
> change but should reject it altogether.

Ouch, good point. The overflows are a different problem that we had
better address though (still on my own TODO list)...

So I should remove the bounds checking from generate_series(date,date,[int]) altogether?


 

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Tomas Vondra
Дата:
Hi,

On 02/22/2016 08:04 PM, Corey Huinker wrote:
>     >
>     > Given that counterexample, I think we not only shouldn't back-patch such a
>     > change but should reject it altogether.
>
>     Ouch, good point. The overflows are a different problem that we had
>     better address though (still on my own TODO list)...
>
>
> So I should remove the bounds checking from
> generate_series(date,date,[int]) altogether?

I feel rather uneasy about simply removing the 'infinity' checks. Is 
there a way to differentiate those two cases, i.e. when the 
generate_series is called in target list and in the FROM part? If yes, 
we could do the check only in the FROM part, which is the case that does 
not work (and consumes arbitrary amounts of memory).

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Re: Add generate_series(date, date) and generate_series(date, date, integer)

От
David Steele
Дата:
On 2/21/16 2:24 PM, Vik Fearing wrote:
> On 02/21/2016 07:56 PM, Corey Huinker wrote:
>>
>>     Other than that, the only difference is the ::date part. Is it
>>     really worth adding extra code just for that? I would say not.
>>
>>
>> I would argue it belongs for the sake of completeness.
>
> So would I.
>
> +1 for adding this missing function.

+1. FWIW, a sample query I wrote for a customer yesterday would have
looked nicer with this function. Here's how the generate_series looked:

generate_series('2016-03-01'::date, '2016-03-31'::date, interval '1
day')::date

But it would have been cleaner to write:

generate_series('2016-03-01'::date, '2016-03-31'::date)

More importantly, though, I don't like that the timestamp version of the
function happily takes date parameters but returns timestamps. I feel
this could lead to some subtle bugs for the unwary.

--
-David
david@pgmasters.net


Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:

I feel rather uneasy about simply removing the 'infinity' checks. Is there a way to differentiate those two cases, i.e. when the generate_series is called in target list and in the FROM part? If yes, we could do the check only in the FROM part, which is the case that does not work (and consumes arbitrary amounts of memory).


It would be simple enough to remove the infinity test on the "stop" and leave it on the "start". Or yank both. Just waiting for others to agree which checks should remain.


 

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Robert Haas
Дата:
On Fri, Mar 4, 2016 at 3:17 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
>>
>> I feel rather uneasy about simply removing the 'infinity' checks. Is there
>> a way to differentiate those two cases, i.e. when the generate_series is
>> called in target list and in the FROM part? If yes, we could do the check
>> only in the FROM part, which is the case that does not work (and consumes
>> arbitrary amounts of memory).
>
> It would be simple enough to remove the infinity test on the "stop" and
> leave it on the "start". Or yank both. Just waiting for others to agree
> which checks should remain.

Let's yank 'em.  This is a minor issue which is distracting us from
the main point of this patch, and I don't think it's worth getting
distracted.

+     <row>
+      <entry><literal><function>generate_series(<parameter>start</parameter>,
<parameter>stop</parameter>, <parameter>step
integer</parameter>)</function></literal></entry>
+      <entry><type>date</type></entry>
+      <entry><type>setof date</type></entry>
+      <entry>
+       Generate a series of values, from <parameter>start</parameter>
to <parameter>stop</parameter>
+       with a step size of <parameter>step</parameter>

I think this should be followed by the word "days" and a period.

+       else
+               /* do when there is no more left */
+               SRF_RETURN_DONE(funcctx);

I think we should drop the "else" and unindent the next two lines.
That's the style I have seen elsewhere.  Plus less indentation equals
more happiness.

I'm pretty meh about the whole idea of this function, though,
actually, and I don't see a single clear +1 vote for this
functionality upthread.  (Apologies if I've missed one.)  In the
absence of a few of those, I recommend we reject this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
> It would be simple enough to remove the infinity test on the "stop" and
> leave it on the "start". Or yank both. Just waiting for others to agree
> which checks should remain.

Let's yank 'em.  This is a minor issue which is distracting us from
the main point of this patch, and I don't think it's worth getting
distracted.

+1. It leaves this function consistent with the others, and if we want to add checks later we can do them all at the same time.
 

+     <row>
+      <entry><literal><function>generate_series(<parameter>start</parameter>,
<parameter>stop</parameter>, <parameter>step
integer</parameter>)</function></literal></entry>
+      <entry><type>date</type></entry>
+      <entry><type>setof date</type></entry>
+      <entry>
+       Generate a series of values, from <parameter>start</parameter>
to <parameter>stop</parameter>
+       with a step size of <parameter>step</parameter>

I think this should be followed by the word "days" and a period.


No objections. I just followed the pattern of the other generate_series() docs.
 
+       else
+               /* do when there is no more left */
+               SRF_RETURN_DONE(funcctx);

I think we should drop the "else" and unindent the next two lines.
That's the style I have seen elsewhere.  Plus less indentation equals
more happiness.

No objections here either. I just followed the pattern of generate_series() for int there.
 

I'm pretty meh about the whole idea of this function, though,
actually, and I don't see a single clear +1 vote for this
functionality upthread.  (Apologies if I've missed one.)  In the
absence of a few of those, I recommend we reject this.

Just David and Vik so far. The rest were either against(Simon), meh(Robert) or +1ed/-1ed the backpatch, leaving their thoughts on the function itself unspoken.

Happy to make the changes above if we're moving forward with it.

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
"David G. Johnston"
Дата:
On Tue, Mar 8, 2016 at 3:57 PM, Corey Huinker <corey.huinker@gmail.com> wrote:

I'm pretty meh about the whole idea of this function, though,
actually, and I don't see a single clear +1 vote for this
functionality upthread.  (Apologies if I've missed one.)  In the
absence of a few of those, I recommend we reject this.

Just David and Vik so far. The rest were either against(Simon), meh(Robert) or +1ed/-1ed the backpatch, leaving their thoughts on the function itself unspoken.

Happy to make the changes above if we're moving forward with it.

​I'll toss a user-land +1 for this.

David J.

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Alvaro Herrera
Дата:
Robert Haas wrote:

> Let's yank 'em.  This is a minor issue which is distracting us from
> the main point of this patch, and I don't think it's worth getting
> distracted.

+1

> I'm pretty meh about the whole idea of this function, though,
> actually, and I don't see a single clear +1 vote for this
> functionality upthread.  (Apologies if I've missed one.)  In the
> absence of a few of those, I recommend we reject this.

+1

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Michael Paquier
Дата:
On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> I'm pretty meh about the whole idea of this function, though,
>> actually, and I don't see a single clear +1 vote for this
>> functionality upthread.  (Apologies if I've missed one.)  In the
>> absence of a few of those, I recommend we reject this.
>
> +1

I'm meh for this patch.
-- 
Michael



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Robert Haas
Дата:
On Thu, Mar 10, 2016 at 1:53 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> Robert Haas wrote:
>>> I'm pretty meh about the whole idea of this function, though,
>>> actually, and I don't see a single clear +1 vote for this
>>> functionality upthread.  (Apologies if I've missed one.)  In the
>>> absence of a few of those, I recommend we reject this.
>>
>> +1
>
> I'm meh for this patch.

+1: David Steele, Vik Fearing, David Johnson, Alvaro Herrera
meh: Robert Haas, Michael Paquier
-1: Simon Riggs

That's probably marginally enough support to proceed with this, but
I'm somewhat unenthused about putting time into it myself.  Alvaro,
any chance you want to handle this one?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Alvaro Herrera
Дата:
Robert Haas wrote:

> That's probably marginally enough support to proceed with this, but
> I'm somewhat unenthused about putting time into it myself.  Alvaro,
> any chance you want to handle this one?

OK, I can take it, but I have a few other things earlier in my queue so
it will be a few days.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Simon Riggs
Дата:
On 10 March 2016 at 06:53, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> I'm pretty meh about the whole idea of this function, though,
>> actually, and I don't see a single clear +1 vote for this
>> functionality upthread.  (Apologies if I've missed one.)  In the
>> absence of a few of those, I recommend we reject this.
>
> +1

I'm meh for this patch.

"meh" == +1

I thought it meant -1 

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Alvaro Herrera
Дата:
Simon Riggs wrote:
> On 10 March 2016 at 06:53, Michael Paquier <michael.paquier@gmail.com>
> wrote:
> 
> > On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
> > <alvherre@2ndquadrant.com> wrote:
> > > Robert Haas wrote:
> > >> I'm pretty meh about the whole idea of this function, though,
> > >> actually, and I don't see a single clear +1 vote for this
> > >> functionality upthread.  (Apologies if I've missed one.)  In the
> > >> absence of a few of those, I recommend we reject this.
> > >
> > > +1
> >
> > I'm meh for this patch.
> 
> "meh" == +1
> 
> I thought it meant -1

I would say that "meh" is +0, actually.  So the the tally is a small
positive number -- not great, but seems enough to press forward unless
we get more -1 votes.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Stephen Frost
Дата:
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> Simon Riggs wrote:
> > On 10 March 2016 at 06:53, Michael Paquier <michael.paquier@gmail.com>
> > wrote:
> >
> > > On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
> > > <alvherre@2ndquadrant.com> wrote:
> > > > Robert Haas wrote:
> > > >> I'm pretty meh about the whole idea of this function, though,
> > > >> actually, and I don't see a single clear +1 vote for this
> > > >> functionality upthread.  (Apologies if I've missed one.)  In the
> > > >> absence of a few of those, I recommend we reject this.
> > > >
> > > > +1
> > >
> > > I'm meh for this patch.
> >
> > "meh" == +1
> >
> > I thought it meant -1
>
> I would say that "meh" is +0, actually.  So the the tally is a small
> positive number -- not great, but seems enough to press forward unless
> we get more -1 votes.

I'm +1 on this also, for my part.

Thanks!

Stephen

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Robert Haas
Дата:
On Thu, Mar 10, 2016 at 10:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 10 March 2016 at 06:53, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>>
>> On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
>> <alvherre@2ndquadrant.com> wrote:
>> > Robert Haas wrote:
>> >> I'm pretty meh about the whole idea of this function, though,
>> >> actually, and I don't see a single clear +1 vote for this
>> >> functionality upthread.  (Apologies if I've missed one.)  In the
>> >> absence of a few of those, I recommend we reject this.
>> >
>> > +1
>>
>> I'm meh for this patch.
>
>
> "meh" == +1
>
> I thought it meant -1

In my case it meant, like, -0.5.  I don't really like adding lots of
utility functions like this to the default install, because I'm not
sure how widely they get used and it gradually increases the size of
the code, system catalogs, etc.  But I also don't want to block
genuinely useful things.  So basically, I'm not excited about this
patch, but I don't want to fight about it either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
On Thu, Mar 10, 2016 at 10:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 10, 2016 at 10:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 10 March 2016 at 06:53, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>>
>> On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
>> <alvherre@2ndquadrant.com> wrote:
>> > Robert Haas wrote:
>> >> I'm pretty meh about the whole idea of this function, though,
>> >> actually, and I don't see a single clear +1 vote for this
>> >> functionality upthread.  (Apologies if I've missed one.)  In the
>> >> absence of a few of those, I recommend we reject this.
>> >
>> > +1
>>
>> I'm meh for this patch.
>
>
> "meh" == +1
>
> I thought it meant -1

In my case it meant, like, -0.5.  I don't really like adding lots of
utility functions like this to the default install, because I'm not
sure how widely they get used and it gradually increases the size of
the code, system catalogs, etc.  But I also don't want to block
genuinely useful things.  So basically, I'm not excited about this
patch, but I don't want to fight about it either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

New patch for Alvaro's consideration.

Very minor changes since the last time, the explanations below are literally longer than the changes:
- Rebased, though I don't think any of the files had changed in the mean time
- Removed infinity checks/errors and the test cases to match
- Amended documentation to add 'days' after 'step' as suggested
- Did not add a period as suggested, to remain consistent with other descriptions in the same sgml table
- Altered test case and documentation of 7 day step example so that the generated dates do not land evenly on the end date. A reader might incorrectly infer that the end date must be in the result set, when it doesn't have to be.
- Removed unnecessary indentation that existed purely due to following of other generate_series implementations
Вложения

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Alvaro Herrera
Дата:
Corey Huinker wrote:

> New patch for Alvaro's consideration.

Thanks.  As I said, it will be a few days before I get to this; any
further reviews in the meantime are welcome.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Michael Paquier
Дата:
On Thu, Mar 10, 2016 at 4:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Mar 10, 2016 at 10:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 10 March 2016 at 06:53, Michael Paquier <michael.paquier@gmail.com>
>> wrote:
>>>
>>> On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
>>> <alvherre@2ndquadrant.com> wrote:
>>> > Robert Haas wrote:
>>> >> I'm pretty meh about the whole idea of this function, though,
>>> >> actually, and I don't see a single clear +1 vote for this
>>> >> functionality upthread.  (Apologies if I've missed one.)  In the
>>> >> absence of a few of those, I recommend we reject this.
>>> >
>>> > +1
>>>
>>> I'm meh for this patch.
>>
>>
>> "meh" == +1
>>
>> I thought it meant -1
>
> In my case it meant, like, -0.5.  I don't really like adding lots of
> utility functions like this to the default install, because I'm not
> sure how widely they get used and it gradually increases the size of
> the code, system catalogs, etc.  But I also don't want to block
> genuinely useful things.  So basically, I'm not excited about this
> patch, but I don't want to fight about it either.

I am of the same feeling, at -0.5. I don't feel like putting -1 for
this patch, as I don't really understand why this is worth adding more
complexity in the code for something that can be done with
generate_series using timestamps. Also, why only dates? And why not
other units like hours or seconds?
-- 
Michael



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
"David G. Johnston"
Дата:
On Thu, Mar 10, 2016 at 8:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 10, 2016 at 10:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 10 March 2016 at 06:53, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>>
>> On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
>> <alvherre@2ndquadrant.com> wrote:
>> > Robert Haas wrote:
>> >> I'm pretty meh about the whole idea of this function, though,
>> >> actually, and I don't see a single clear +1 vote for this
>> >> functionality upthread.  (Apologies if I've missed one.)  In the
>> >> absence of a few of those, I recommend we reject this.
>> >
>> > +1
>>
>> I'm meh for this patch.
>
>
> "meh" == +1
>
> I thought it meant -1

In my case it meant, like, -0.5.  I don't really like adding lots of
utility functions like this to the default install, because I'm not
sure how widely they get used and it gradually increases the size of
the code, system catalogs, etc.  But I also don't want to block
genuinely useful things.  So basically, I'm not excited about this
patch, but I don't want to fight about it either.

​I tend to think we err toward this too much.  This seems like development concerns trumping usability.  Consider that anything someone took the time to write and polish to make committable to core was obviously genuinely useful to them - and for every person capable of actually taking things that far there are likely many more like myself who cannot but have encountered the, albeit minor, usability annoyance that this kind of function seeks to remove.

I really don't care that the codebase is marginally larger or that there is another few records in the catalog - I hope that our code organization and performance is capable of handling it (and many more).

The overhead of adding an entirely new concept to core would give me more pause in that situation but this function in particular simply fleshes out what the user community sees to be a minor yet notable lack in our existing offering of the generate_series() feature.  Its threshold for meeting "genuinely" should be considerably lower than for more invasive or complex features such as those that require entirely new syntax (e.g., the recently rejected ALTER ROLE patch) to implement.

If something can be done with a user-defined function on stock PostgreSQL, especially for concepts or features we already have, the decision to commit a c-language function that someone else has written and others have reviewed should, IMO, be given the benefit of assumed usefulness.

My $0.02

David J.

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
"David G. Johnston"
Дата:
On Thu, Mar 10, 2016 at 9:30 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Thu, Mar 10, 2016 at 4:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Mar 10, 2016 at 10:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 10 March 2016 at 06:53, Michael Paquier <michael.paquier@gmail.com>
>> wrote:
>>>
>>> On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
>>> <alvherre@2ndquadrant.com> wrote:
>>> > Robert Haas wrote:
>>> >> I'm pretty meh about the whole idea of this function, though,
>>> >> actually, and I don't see a single clear +1 vote for this
>>> >> functionality upthread.  (Apologies if I've missed one.)  In the
>>> >> absence of a few of those, I recommend we reject this.
>>> >
>>> > +1
>>>
>>> I'm meh for this patch.
>>
>>
>> "meh" == +1
>>
>> I thought it meant -1
>
> In my case it meant, like, -0.5.  I don't really like adding lots of
> utility functions like this to the default install, because I'm not
> sure how widely they get used and it gradually increases the size of
> the code, system catalogs, etc.  But I also don't want to block
> genuinely useful things.  So basically, I'm not excited about this
> patch, but I don't want to fight about it either.

I am of the same feeling, at -0.5. I don't feel like putting -1 for
this patch, as I don't really understand why this is worth adding more
complexity in the code for something that can be done with
generate_series using timestamps. Also, why only dates? And why not
other units like hours or seconds?

​A date is a type, hours and seconds are not types.  To use hours and seconds you need timestamp (I guess we could offer a "time" version of this function too) which we already have.  Also, not choosing to implement something else generally shouldn't preclude something that exists and have genuine value from being committed.

Obviously there is some user-land annoyance at having to play with timestamp when all one really cares about is date.  Given the prevalence of date usage in user-land this is not surprising.

We're not forcing anyone to review this that doesn't see that it is worth their time.  We are asking th
​at​
the people that the community has placed in a position of authority spend some a limited amount of effort reviewing a minor addition that has been deemed desirable and that has already been reviewed and deemed something that meets the project's technical requirements.
​  The expectation is that the amount of ongoing support this function would require is similar to that of the existing generate_series functions.​


​This is something that can be easily added by the user as a SQL function - its complexity cannot be so great as to be deemed a risk to the system but including its c-language variant.  As you said, we already do something very similar for timestamps so the marginal complexity being added shouldn't be significant.

If you are going to -1 or -0.5 for "adds too much complexity" it would be helpful to know specifics.  Scanning the thread the only real concern was dealing with infinity - which is already a complexity the current functions have so there is no "additional" there - but maybe I've missed something.

I understand Robert's position and while I find it to be community-hostile this is an open source project and so I accept that this is a possible outcome.  But as soon as he asked for some +1s he got them (mostly without reasons but the reality is that the request was for desire) and a few "I vote -0.5 since my dislike is only half-baked".  And the fact is that a single +1 for the idea likely represents many people at large who would use the function if present while I suspect most of those who could offer an informed -1 are already on this list.  The vast majority probably don't care either way as long as we don't introduce bugs.

David J.

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
removed leftover development comment

On Thu, Mar 10, 2016 at 11:02 AM, Corey Huinker <corey.huinker@gmail.com> wrote:
On Thu, Mar 10, 2016 at 10:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 10, 2016 at 10:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 10 March 2016 at 06:53, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>>
>> On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
>> <alvherre@2ndquadrant.com> wrote:
>> > Robert Haas wrote:
>> >> I'm pretty meh about the whole idea of this function, though,
>> >> actually, and I don't see a single clear +1 vote for this
>> >> functionality upthread.  (Apologies if I've missed one.)  In the
>> >> absence of a few of those, I recommend we reject this.
>> >
>> > +1
>>
>> I'm meh for this patch.
>
>
> "meh" == +1
>
> I thought it meant -1

In my case it meant, like, -0.5.  I don't really like adding lots of
utility functions like this to the default install, because I'm not
sure how widely they get used and it gradually increases the size of
the code, system catalogs, etc.  But I also don't want to block
genuinely useful things.  So basically, I'm not excited about this
patch, but I don't want to fight about it either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

New patch for Alvaro's consideration.

Very minor changes since the last time, the explanations below are literally longer than the changes:
- Rebased, though I don't think any of the files had changed in the mean time
- Removed infinity checks/errors and the test cases to match
- Amended documentation to add 'days' after 'step' as suggested
- Did not add a period as suggested, to remain consistent with other descriptions in the same sgml table
- Altered test case and documentation of 7 day step example so that the generated dates do not land evenly on the end date. A reader might incorrectly infer that the end date must be in the result set, when it doesn't have to be.
- Removed unnecessary indentation that existed purely due to following of other generate_series implementations

Вложения

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Robert Haas
Дата:
On Thu, Mar 10, 2016 at 11:33 AM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> I tend to think we err toward this too much.  This seems like development
> concerns trumping usability.  Consider that anything someone took the time
> to write and polish to make committable to core was obviously genuinely
> useful to them - and for every person capable of actually taking things that
> far there are likely many more like myself who cannot but have encountered
> the, albeit minor, usability annoyance that this kind of function seeks to
> remove.

Sure, an individual function like this has almost no negative impact.
On the other hand, working around its absence is also trivial.  You
can create a wrapper function that does exactly this in a couple of
lines of SQL.  In my opinion, saying that people should do that in
they need it has some advantages over shipping it to everyone.  If you
don't have it and you want it, you can easily get it.  But what if you
have it and you don't want it, for example because what you really
want is a minimal postgres installation?  You can't take anything in
core back out again, or at least not easily.  Everything about core is
expanding very randomly - code size, shared memory footprint, all of
it.  If you think that has no downside for users, I respectfully
disagree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Simon Riggs
Дата:
On 10 March 2016 at 18:45, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 10, 2016 at 11:33 AM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> I tend to think we err toward this too much.  This seems like development
> concerns trumping usability.  Consider that anything someone took the time
> to write and polish to make committable to core was obviously genuinely
> useful to them - and for every person capable of actually taking things that
> far there are likely many more like myself who cannot but have encountered
> the, albeit minor, usability annoyance that this kind of function seeks to
> remove.

Sure, an individual function like this has almost no negative impact.
On the other hand, working around its absence is also trivial.  You
can create a wrapper function that does exactly this in a couple of
lines of SQL.  In my opinion, saying that people should do that in
they need it has some advantages over shipping it to everyone.  If you
don't have it and you want it, you can easily get it.  But what if you
have it and you don't want it, for example because what you really
want is a minimal postgres installation?  You can't take anything in
core back out again, or at least not easily.  Everything about core is
expanding very randomly - code size, shared memory footprint, all of
it.  If you think that has no downside for users, I respectfully
disagree.

Not sure what y'all are discussing, but I should add that I would have committed this based solely upon Vik's +1.

My objection was made, then overruled; that works because the objection wasn't "it won't work", only a preference so I'm happy.

But I still don't know "meh" means.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Robert Haas
Дата:
On Thu, Mar 10, 2016 at 2:35 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> But I still don't know "meh" means.

Maybe this helps?

https://en.wikipedia.org/wiki/Meh

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
"Igal @ Lucee.org"
Дата:
On 3/10/2016 11:44 AM, Robert Haas wrote:
> On Thu, Mar 10, 2016 at 2:35 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> But I still don't know "meh" means.
> Maybe this helps?
>
> https://en.wikipedia.org/wiki/Meh

LOL
https://en.wikipedia.org/wiki/LOL




Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
"David G. Johnston"
Дата:
On Thu, Mar 10, 2016 at 11:45 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 10, 2016 at 11:33 AM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> I tend to think we err toward this too much.  This seems like development
> concerns trumping usability.  Consider that anything someone took the time
> to write and polish to make committable to core was obviously genuinely
> useful to them - and for every person capable of actually taking things that
> far there are likely many more like myself who cannot but have encountered
> the, albeit minor, usability annoyance that this kind of function seeks to
> remove.

Sure, an individual function like this has almost no negative impact.
On the other hand, working around its absence is also trivial.  You
can create a wrapper function that does exactly this in a couple of
lines of SQL.  In my opinion, saying that people should do that in
they need it has some advantages over shipping it to everyone.  If you
don't have it and you want it, you can easily get it.  But what if you
have it and you don't want it, for example because what you really
want is a minimal postgres installation?

I'd rather cater to the people who are content that everything in PostgreSQL is of high quality and ignore those things that they have no immediate need for - and then are happy to find out that when they have a new need PostgreSQL already provides them well thought-out and tested tool that they can use.

We purport to be highly extensible but that doesn't preclude us from being well-stocked at the same time.

And by not including these kinds of things in core the broader ecosystem is harmed since not every provider or PostgreSQL services is willing or capable of allowing random third-party extensions; nor is adding 20 "important and generally useful to me but an annoyance to the project" functions to every database I create a trivial exercise.  But it is trivial to ignore something that exists but that I have no need for.

You can't take anything in
core back out again, or at least not easily.  Everything about core is
expanding very randomly - code size, shared memory footprint, all of
it.  If you think that has no downside for users, I respectfully
disagree.

Attempts to limit that expansion seemingly happen "very randomly" as well.​

If there is a goal and demand for a "minimalist" installation then we don't seem to have anything going on that is actively trying to make it a reality.  I'm likely being a bit myopic here but at the same time the increased footprint from JSON, parallel queries, and the like dwarf the contribution a handful of C-language functions would add.

I do understand why more trivial features are scrutinized the way they are but I still hold the opinion that features such as this should generally be given the benefit of inclusion unless there are concrete technical concerns.

David J.


Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Gavin Flower
Дата:
On 11/03/16 08:48, Igal @ Lucee.org wrote:
> On 3/10/2016 11:44 AM, Robert Haas wrote:
>> On Thu, Mar 10, 2016 at 2:35 PM, Simon Riggs <simon@2ndquadrant.com> 
>> wrote:
>>> But I still don't know "meh" means.
>> Maybe this helps?
>>
>> https://en.wikipedia.org/wiki/Meh
>
> LOL
> https://en.wikipedia.org/wiki/LOL
>
>
>
I'm pretending to work, so will you and Robert stop making that pretence 
more difficult!!!  :-)



Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
David Steele
Дата:
On 3/10/16 1:24 PM, Corey Huinker wrote:

>     New patch for Alvaro's consideration.
>
>     Very minor changes since the last time, the explanations below are
>     literally longer than the changes:
>     - Rebased, though I don't think any of the files had changed in the
>     mean time
>     - Removed infinity checks/errors and the test cases to match
>     - Amended documentation to add 'days' after 'step' as suggested
>     - Did not add a period as suggested, to remain consistent with other
>     descriptions in the same sgml table
>     - Altered test case and documentation of 7 day step example so that
>     the generated dates do not land evenly on the end date. A reader
>     might incorrectly infer that the end date must be in the result set,
>     when it doesn't have to be.
>     - Removed unnecessary indentation that existed purely due to
>     following of other generate_series implementations

As far as I can see overall support is in favor of this patch although 
it is not overwhelming by any means.

I think in this case it comes down to a committer's judgement so I have 
marked this "ready for committer" and passed the buck on to Álvaro.

-- 
-David
david@pgmasters.net



Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Dean Rasheed
Дата:
On 16 March 2016 at 23:32, David Steele <david@pgmasters.net> wrote:
> On 3/10/16 1:24 PM, Corey Huinker wrote:
>
>>     New patch for Alvaro's consideration.
>>
>>     Very minor changes since the last time, the explanations below are
>>     literally longer than the changes:
>>     - Rebased, though I don't think any of the files had changed in the
>>     mean time
>>     - Removed infinity checks/errors and the test cases to match
>>     - Amended documentation to add 'days' after 'step' as suggested
>>     - Did not add a period as suggested, to remain consistent with other
>>     descriptions in the same sgml table
>>     - Altered test case and documentation of 7 day step example so that
>>     the generated dates do not land evenly on the end date. A reader
>>     might incorrectly infer that the end date must be in the result set,
>>     when it doesn't have to be.
>>     - Removed unnecessary indentation that existed purely due to
>>     following of other generate_series implementations
>
>
> As far as I can see overall support is in favor of this patch although it is
> not overwhelming by any means.
>
> I think in this case it comes down to a committer's judgement so I have
> marked this "ready for committer" and passed the buck on to Álvaro.
>

So I was pretty much "meh" on this patch too, because I'm not
convinced it actually saves much typing, if any.

However, I now realise that it introduces a backwards-compatibility
breakage. Today it is possible to type

SELECT * FROM generate_series('01-01-2000'::date, '01-04-2000', '7 days');

and it works with just that one cast. However, this patch breaks that.

Now I'm not saying that I have used the above construct. Probably not
in fact, but maybe my work colleagues have. I honestly can't say,  but
the thought of having to grep through thousands of lines of
application code to check isn't particularly appealing.

So I'm afraid it's -1 from me.

Regards,
Dean



Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
David Steele
Дата:
On 3/17/16 4:49 AM, Dean Rasheed wrote:

> On 16 March 2016 at 23:32, David Steele <david@pgmasters.net> wrote:
>
>>
>> I think in this case it comes down to a committer's judgement so I have
>> marked this "ready for committer" and passed the buck on to Álvaro.
> 
> So I was pretty much "meh" on this patch too, because I'm not
> convinced it actually saves much typing, if any.
> 
> However, I now realise that it introduces a backwards-compatibility
> breakage. Today it is possible to type
> 
> SELECT * FROM generate_series('01-01-2000'::date, '01-04-2000', '7 days');

It can also be broken as below and this is even scarier to me:

postgres=# SELECT * FROM generate_series('01-01-2000'::date,
'01-04-2000'::date, '7 days');
ERROR:  invalid input syntax for integer: "7 days"
LINE 1: ...te_series('01-01-2000'::date, '01-04-2000'::date, '7 days');

And only works when:

postgres=# SELECT * FROM generate_series('01-01-2000'::date,
'01-04-2000'::date, '7 days'::interval);   generate_series
------------------------2000-01-01 00:00:00+00
(1 row)

One might argue that string constants for dates in actual code might be
rare but I think it's safe to say that string constants for intervals
are pretty common.  I also think it unlikely that they are all
explicitly cast.

I marked this "waiting for author" so Corey can respond.  I actually
tested with the v3 patch since the v4 patch seems to be broken with git
apply or patch:

$ patch -p1 < ../other/generate_series_date.v4.diff
patching file doc/src/sgml/func.sgml
Hunk #1 succeeded at 14787 (offset 87 lines).
Hunk #2 succeeded at 14871 (offset 87 lines).
patching file src/backend/utils/adt/date.c
patch: **** malformed patch at line 163: diff --git
a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h

$ git apply -3 ../other/generate_series_date.v4.diff
fatal: corrupt patch at line 163

-- 
-David
david@pgmasters.net



Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
On Thu, Mar 17, 2016 at 10:00 AM, David Steele <david@pgmasters.net> wrote:
On 3/17/16 4:49 AM, Dean Rasheed wrote:

> On 16 March 2016 at 23:32, David Steele <david@pgmasters.net> wrote:
>
>>
>> I think in this case it comes down to a committer's judgement so I have
>> marked this "ready for committer" and passed the buck on to Álvaro.
>
> So I was pretty much "meh" on this patch too, because I'm not
> convinced it actually saves much typing, if any.
>
> However, I now realise that it introduces a backwards-compatibility
> breakage. Today it is possible to type
>
> SELECT * FROM generate_series('01-01-2000'::date, '01-04-2000', '7 days');

It can also be broken as below and this is even scarier to me:

postgres=# SELECT * FROM generate_series('01-01-2000'::date,
'01-04-2000'::date, '7 days');
ERROR:  invalid input syntax for integer: "7 days"
LINE 1: ...te_series('01-01-2000'::date, '01-04-2000'::date, '7 days');

And only works when:

postgres=# SELECT * FROM generate_series('01-01-2000'::date,
'01-04-2000'::date, '7 days'::interval);
    generate_series
------------------------
 2000-01-01 00:00:00+00
(1 row)

One might argue that string constants for dates in actual code might be
rare but I think it's safe to say that string constants for intervals
are pretty common.  I also think it unlikely that they are all
explicitly cast.

I marked this "waiting for author" so Corey can respond.  I actually
tested with the v3 patch since the v4 patch seems to be broken with git
apply or patch:

$ patch -p1 < ../other/generate_series_date.v4.diff
patching file doc/src/sgml/func.sgml
Hunk #1 succeeded at 14787 (offset 87 lines).
Hunk #2 succeeded at 14871 (offset 87 lines).
patching file src/backend/utils/adt/date.c
patch: **** malformed patch at line 163: diff --git
a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h

$ git apply -3 ../other/generate_series_date.v4.diff
fatal: corrupt patch at line 163

--
-David
david@pgmasters.net

Not sure what's wrong with the patch, but I get a clean one to you pending the outcome of the design discussion. v4 just ripped out the infinity tests, so v3 is valid for the issue you found..

So the function clobbers the situation where the user meant a timestamp range but instead gave two dates, and meant an interval but gave a string. I'm curious how generate_series() for numeric didn't have the same issue with floats.

I see two ways around this: 

1. Drop the step parameter entirely. My own use cases only ever require the step values 1 and -1, and which one the user wants can be inferred from (start < end). This would still change the output type where a person wanted timestamps, but instead input two dates.
2. Rename the function date_series() or generate_series_date()

I still think this is an important function because at the last several places I've worked, I've found myself manufacturing a query where some event data is likely to have date gaps, but we want to see the date gaps or at least have the 0 values contribute to a later average aggregate.

Like this:

select d.dt, sum(s.v1), sum(s2.v2), sum(t.v1), sum(t2.v2)
from   date_series_function(input1,input2) as d(dt),
       some_dimensional_characteristic c
left join sparse_table s on s.event_date = d.dt and s.c1 = c.cval
left join sparse_table s2 on s2.event_date = d.dt and s2.c2 = c.cval
left join other_sparse t on t.event_date = d.dt and t.c1 = c.cval
left join other_sparse t2 on t2.event_date = d.dt and t2.c2 = c.cval
group by d.dt
order by d.dt

This gets cumbersome when you have to cast every usage of your date column.

Furthermore, this is 90%+ of my usage of generate_series(), the remaining 10% being the integer case to populate sample data for tests.

Any thoughts on how to proceed?

Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
"David G. Johnston"
Дата:
On Thu, Mar 17, 2016 at 7:57 AM, Corey Huinker <corey.huinker@gmail.com> wrote:
On Thu, Mar 17, 2016 at 10:00 AM, David Steele <david@pgmasters.net> wrote:
On 3/17/16 4:49 AM, Dean Rasheed wrote:

> On 16 March 2016 at 23:32, David Steele <david@pgmasters.net> wrote:
>
>>
>> I think in this case it comes down to a committer's judgement so I have
>> marked this "ready for committer" and passed the buck on to Álvaro.
>
> So I was pretty much "meh" on this patch too, because I'm not
> convinced it actually saves much typing, if any.
>
> However, I now realise that it introduces a backwards-compatibility
> breakage. Today it is possible to type
>
> SELECT * FROM generate_series('01-01-2000'::date, '01-04-2000', '7 days');

It can also be broken as below and this is even scarier to me:

Above and below are the same query​...
 

postgres=# SELECT * FROM generate_series('01-01-2000'::date,
'01-04-2000'::date, '7 days');
ERROR:  invalid input syntax for integer: "7 days"
LINE 1: ...te_series('01-01-2000'::date, '01-04-2000'::date, '7 days');

And only works when:

postgres=# SELECT * FROM generate_series('01-01-2000'::date,
'01-04-2000'::date, '7 days'::interval);
    generate_series
------------------------
 2000-01-01 00:00:00+00
(1 row)
--
-David
david@pgmasters.net

Not sure what's wrong with the patch, but I get a clean one to you pending the outcome of the design discussion. v4 just ripped out the infinity tests, so v3 is valid for the issue you found..

So the function clobbers the situation where the user meant a timestamp range but instead gave two dates, and meant an interval but gave a string. I'm curious how generate_series() for numeric didn't have the same issue with floats.


​The numeric forms use anyelement for all three arguments but the timestamp version uses an explicit interval for the third.​


I see two ways around this: 

1. Drop the step parameter entirely. My own use cases only ever require the step values 1 and -1, and which one the user wants can be inferred from (start < end). This would still change the output type where a person wanted timestamps, but instead input two dates.

​Undesirable.​

2. Rename the function date_series() or generate_series_date()

I still think this is an important function because at the last several places I've worked, I've found myself manufacturing a query where some event data is likely to have date gaps, but we want to see the date gaps or at least have the 0 values contribute to a later average aggregate.


​I'd call it "generate_dates(...)" and be done with it.

We would then have:

generate_series()
generate_subscripts()
generate_dates()

David J.
 

Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
David Steele
Дата:
On 3/17/16 11:30 AM, David G. Johnston wrote:
> On Thu, Mar 17, 2016 at 7:57 AM, Corey Huinker <corey.huinker@gmail.com
> <mailto:corey.huinker@gmail.com>>wrote:
> 
>     On Thu, Mar 17, 2016 at 10:00 AM, David Steele <david@pgmasters.net
>     <mailto:david@pgmasters.net>> wrote:
> 
>         On 3/17/16 4:49 AM, Dean Rasheed wrote:
> 
>         > On 16 March 2016 at 23:32, David Steele <david@pgmasters.net <mailto:david@pgmasters.net>> wrote:
>         >
>         >>
>         >> I think in this case it comes down to a committer's judgement so I have
>         >> marked this "ready for committer" and passed the buck on to Álvaro.
>         >
>         > So I was pretty much "meh" on this patch too, because I'm not
>         > convinced it actually saves much typing, if any.
>         >
>         > However, I now realise that it introduces a backwards-compatibility
>         > breakage. Today it is possible to type
>         >
>         > SELECT * FROM generate_series('01-01-2000'::date, '01-04-2000', '7 days');
> 
>         It can also be broken as below and this is even scarier to me:
> 
> 
> Above and below are the same query​...

Not sure I agree.  My point was that even if developers were pretty
careful with their casting (or are using two actual dates) then there's
still possibility for breakage.

>         postgres=# SELECT * FROM generate_series('01-01-2000'::date,
>         '01-04-2000'::date, '7 days');
>         ERROR:  invalid input syntax for integer: "7 days"
>         LINE 1: ...te_series('01-01-2000'::date, '01-04-2000'::date, '7
>         days');
> <...>
> 
>     I see two ways around this: 
> 
>     1. Drop the step parameter entirely. My own use cases only ever
>     require the step values 1 and -1, and which one the user wants can
>     be inferred from (start < end). This would still change the output
>     type where a person wanted timestamps, but instead input two dates.
> 
> ​Undesirable.​

Very undesirable.  Week intervals are a very valid use case and I don't
like the automagic interval idea.

> 
>     2. Rename the function date_series() or generate_series_date()
> 
>     I still think this is an important function because at the last
>     several places I've worked, I've found myself manufacturing a query
>     where some event data is likely to have date gaps, but we want to
>     see the date gaps or at least have the 0 values contribute to a
>     later average aggregate.
> 
> 
> ​I'd call it "generate_dates(...)" and be done with it.
> 
> We would then have:
> 
> generate_series()
> generate_subscripts()
> generate_dates()

To me this completely negates the idea of this "just working" which is
why it got a +1 from me in the first place.  If I have to remember to
use a different function name then I'd prefer to just cast on the
timestamp version of generate_series().

Sorry, but this is now -1 from me, at least for this commitfest.  While
I like the idea I think it is far too late to be redesigning such a
minor feature.

-- 
-David
david@pgmasters.net



Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:

On Thu, Mar 17, 2016 at 11:30 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
​I'd call it "generate_dates(...)" and be done with it.

Sold. Hope to have a revised patch for you today or tomorrow.

Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
"David G. Johnston"
Дата:
On Thu, Mar 17, 2016 at 8:41 AM, David Steele <david@pgmasters.net> wrote:
On 3/17/16 11:30 AM, David G. Johnston wrote:
> On Thu, Mar 17, 2016 at 7:57 AM, Corey Huinker <corey.huinker@gmail.com
> <mailto:corey.huinker@gmail.com>>wrote:
>
>     On Thu, Mar 17, 2016 at 10:00 AM, David Steele <david@pgmasters.net
>     <mailto:david@pgmasters.net>> wrote:
>
>         On 3/17/16 4:49 AM, Dean Rasheed wrote:
>
>         > On 16 March 2016 at 23:32, David Steele <david@pgmasters.net <mailto:david@pgmasters.net>> wrote:
>         >
>         >>
>         >> I think in this case it comes down to a committer's judgement so I have
>         >> marked this "ready for committer" and passed the buck on to Álvaro.
>         >
>         > So I was pretty much "meh" on this patch too, because I'm not
>         > convinced it actually saves much typing, if any.
>         >
>         > However, I now realise that it introduces a backwards-compatibility
>         > breakage. Today it is possible to type
>         >
>         > SELECT * FROM generate_series('01-01-2000'::date, '01-04-2000', '7 days');
>
>         It can also be broken as below and this is even scarier to me:
>
>
> Above and below are the same query​...

Not sure I agree.  My point was that even if developers were pretty
careful with their casting (or are using two actual dates) then there's
still possibility for breakage.

With the first argument casted to date it doesn't matter whether you cast the second argument as the pseudo-type anyelement will take its value from the first argument and force the second to date.

The main problem is that the system tries to cast unknown to integer based upon finding gs(date, date, integer) and fails without ever considering that gs(ts, ts, interval) would succeed.


>         postgres=# SELECT * FROM generate_series('01-01-2000'::date,
>         '01-04-2000'::date, '7 days');
>         ERROR:  invalid input syntax for integer: "7 days"
>         LINE 1: ...te_series('01-01-2000'::date, '01-04-2000'::date, '7
>         days');
> <...>
>
>     I see two ways around this:
>
>     1. Drop the step parameter entirely. My own use cases only ever
>     require the step values 1 and -1, and which one the user wants can
>     be inferred from (start < end). This would still change the output
>     type where a person wanted timestamps, but instead input two dates.
>
> ​Undesirable.​

Very undesirable.  Week intervals are a very valid use case and I don't
like the automagic interval idea.

>
>     2. Rename the function date_series() or generate_series_date()
>
>     I still think this is an important function because at the last
>     several places I've worked, I've found myself manufacturing a query
>     where some event data is likely to have date gaps, but we want to
>     see the date gaps or at least have the 0 values contribute to a
>     later average aggregate.
>
>
> ​I'd call it "generate_dates(...)" and be done with it.
>
> We would then have:
>
> generate_series()
> generate_subscripts()
> generate_dates()

To me this completely negates the idea of this "just working" which is
why it got a +1 from me in the first place.  If I have to remember to
use a different function name then I'd prefer to just cast on the
timestamp version of generate_series().


​So let the user decide whether to trade-off learning a new function name but getting dates instead of timestamps or going through the hassle of casting.​

For me, it would have been a nice bonus if the generate_series() could be used directly but I feel that the desired functionality is desirable and the name itself is of only minor consideration.

I can see that newbies might ponder why two functions exist but they should understand "backward compatibility".

I suspect that most people will simply think: "use generate_series for numbers and use generate_dates for, well, dates".  The ones left out in the cold are those wondering why they use "generate_series" for timestamp series with a finer precision than date.  I'd be willing to offer a "generate_timestamps" to placate them.  And might as well toss in "generate_numbers" for completeness - though now I likely doom the proposal...so I'll stick with just add generate_dates so the behavior is possible without superfluous casting or the need to write a custom function.  Dates are too common a unit of measure to leave working with them this cumbersome.

David J.

Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Tom Lane
Дата:
David Steele <david@pgmasters.net> writes:
> On 3/17/16 11:30 AM, David G. Johnston wrote:
>> ​I'd call it "generate_dates(...)" and be done with it.
>> We would then have:
>> generate_series()
>> generate_subscripts()
>> generate_dates()

> To me this completely negates the idea of this "just working" which is
> why it got a +1 from me in the first place.  If I have to remember to
> use a different function name then I'd prefer to just cast on the
> timestamp version of generate_series().

Yeah, this point greatly weakens the desirability of this function IMO.
I've also gone from "don't care" to "-1".
        regards, tom lane



Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
David Steele
Дата:
On 3/17/16 11:55 AM, David G. Johnston wrote:

> On Thu, Mar 17, 2016 at 8:41 AM, David Steele <david@pgmasters.net
> <mailto:david@pgmasters.net>>wrote:
> 
>> Not sure I agree.  My point was that even if developers were pretty
>> careful with their casting (or are using two actual dates) then there's
>> still possibility for breakage.
> 
> With the first argument casted to date it doesn't matter whether you
> cast the second argument as the pseudo-type anyelement will take its
> value from the first argument and force the second to date.

Ah, I see.

> The main problem is that the system tries to cast unknown to integer
> based upon finding gs(date, date, integer) and fails without ever
> considering that gs(ts, ts, interval) would succeed.

I'm tempted to say, "why don't we just fix that?" but I'm sure any
changes in that area would introduce a variety of new and interesting
behaviors.

> ​So let the user decide whether to trade-off learning a new function
> name but getting dates instead of timestamps or going through the hassle
> of casting.​
> 
> For me, it would have been a nice bonus if the generate_series() could
> be used directly but I feel that the desired functionality is desirable
> and the name itself is of only minor consideration.
> 
> I can see that newbies might ponder why two functions exist but they
> should understand "backward compatibility".
> 
> I suspect that most people will simply think: "use generate_series for
> numbers and use generate_dates for, well, dates".  The ones left out in
> the cold are those wondering why they use "generate_series" for
> timestamp series with a finer precision than date.  I'd be willing to
> offer a "generate_timestamps" to placate them.  And might as well toss
> in "generate_numbers" for completeness - though now I likely doom the
> proposal...so I'll stick with just add generate_dates so the behavior is
> possible without superfluous casting or the need to write a custom
> function.  Dates are too common a unit of measure to leave working with
> them this cumbersome.

I'm not finding this argument persuasive.

-- 
-David
david@pgmasters.net



Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Corey Huinker
Дата:
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Mar 17, 2016 at 12:04 PM, Tom Lane <span
dir="ltr"><<ahref="mailto:tgl@sss.pgh.pa.us" target="_blank">tgl@sss.pgh.pa.us</a>></span> wrote:<br
/><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">DavidSteele <<a href="mailto:david@pgmasters.net">david@pgmasters.net</a>> writes:<br /> > On 3/17/16
11:30AM, David G. Johnston wrote:<br /></span><span class="">>> ​I'd call it "generate_dates(...)" and be done
withit.<br /> >> We would then have:<br /> >> generate_series()<br /> >> generate_subscripts()<br />
>>generate_dates()<br /><br /> > To me this completely negates the idea of this "just working" which is<br />
>why it got a +1 from me in the first place.  If I have to remember to<br /> > use a different function name then
I'dprefer to just cast on the<br /> > timestamp version of generate_series().<br /><br /></span>Yeah, this point
greatlyweakens the desirability of this function IMO.<br /> I've also gone from "don't care" to "-1".<br /><br />      
                 regards, tom lane<br /></blockquote></div><br /></div><div class="gmail_extra">Since that diminishes
thealready moderate support for this patch, I'll withdraw it.</div><div class="gmail_extra"><br /></div><div
class="gmail_extra"><br/></div><div class="gmail_extra"><br /></div><div class="gmail_extra"><br /></div><div
class="gmail_extra"><br/></div><div class="gmail_extra"><br /></div></div> 

Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Tom Lane
Дата:
David Steele <david@pgmasters.net> writes:
> On 3/17/16 11:55 AM, David G. Johnston wrote:
>> With the first argument casted to date it doesn't matter whether you
>> cast the second argument as the pseudo-type anyelement will take its
>> value from the first argument and force the second to date.

> Ah, I see.

FWIW, there isn't any "anyelement" involved here, AFAICT.  The set
of functions we have today is:

regression=# \df generate*                                                                List of functions  Schema   |
      Name         |         Result data type          |                        Argument data types
   |  Type  
 

------------+---------------------+-----------------------------------+--------------------------------------------------------------------+--------pg_catalog
|generate_series     | SETOF bigint                      | bigint, bigint
     | normalpg_catalog | generate_series     | SETOF bigint                      | bigint, bigint, bigint
                              | normalpg_catalog | generate_series     | SETOF integer                     | integer,
integer                                                  | normalpg_catalog | generate_series     | SETOF integer
             | integer, integer, integer                                          | normalpg_catalog | generate_series
  | SETOF numeric                     | numeric, numeric                                                   |
normalpg_catalog| generate_series     | SETOF numeric                     | numeric, numeric, numeric
                      | normalpg_catalog | generate_series     | SETOF timestamp with time zone    | timestamp with
timezone, timestamp with time zone, interval       | normalpg_catalog | generate_series     | SETOF timestamp without
timezone | timestamp without time zone, timestamp without time zone, interval | normalpg_catalog | generate_subscripts
|SETOF integer                     | anyarray, integer                                                  |
normalpg_catalog| generate_subscripts | SETOF integer                     | anyarray, integer, boolean
                      | normal
 
(10 rows)

Now, generate_subscripts is a totally different function that *should*
have a separate name; it does not take a couple of endpoints.  That's
not much of an argument for inventing different names for some of
the functions that do work with a pair of endpoints.

The real problem is that if we invent generate_series(date,date,integer)
then, given the problem "generate_series(date,unknown,unknown)"
the parser's ambiguous-function rules[1] will resolve that as
generate_series(date,date,integer), on the grounds that that provides
more exact type matches (i.e. 1) than any of the other alternatives.
Previously, the same call would have been resolved as
generate_series(timestamptz,timestamptz,interval) on the grounds that
that's reachable by casting and timestamptz is a preferred type.
So if you had written such a call with an interval literal and didn't
bother to cast the literal, your code would break.

We could avoid that problem if the new function were defined as
generate_series(date,date,interval), but then I'm not certain
what the code ought to do if the given interval is not a multiple
of a day.

One idea that might be worth considering is to define the function
as generate_series(date,date,interval) returns timestamp (without
time zone).  The point here would be only to move the behavior for
date inputs as far as getting timestamp without tz rather than
timestamp with tz; which would at least save some timezone rotations
in typical use, as well as rather debatable semantics.  (The fact
that timestamptz is the preferred type in this hierarchy isn't
really doing us any favors here.)
        regards, tom lane

[1] http://www.postgresql.org/docs/devel/static/typeconv-func.html



Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Robert Haas
Дата:
On Thu, Mar 17, 2016 at 12:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> One idea that might be worth considering is to define the function
> as generate_series(date,date,interval) returns timestamp (without
> time zone).  The point here would be only to move the behavior for
> date inputs as far as getting timestamp without tz rather than
> timestamp with tz; which would at least save some timezone rotations
> in typical use, as well as rather debatable semantics.  (The fact
> that timestamptz is the preferred type in this hierarchy isn't
> really doing us any favors here.)

That's a fairly tenuous benefit, though, and a substantially different
patch.  I think it's time to give up here and move on.  We can revisit
this for another release after we've had more time to think about it,
if that seems like a smart thing to do when the time comes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
David Steele
Дата:
On 3/17/16 1:17 PM, Robert Haas wrote:
> On Thu, Mar 17, 2016 at 12:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> One idea that might be worth considering is to define the function
>> as generate_series(date,date,interval) returns timestamp (without
>> time zone).  The point here would be only to move the behavior for
>> date inputs as far as getting timestamp without tz rather than
>> timestamp with tz; which would at least save some timezone rotations
>> in typical use, as well as rather debatable semantics.  (The fact
>> that timestamptz is the preferred type in this hierarchy isn't
>> really doing us any favors here.)
> 
> That's a fairly tenuous benefit, though, and a substantially different
> patch.  I think it's time to give up here and move on.  We can revisit
> this for another release after we've had more time to think about it,
> if that seems like a smart thing to do when the time comes.

This has been marked "returned with feedback".

-- 
-David
david@pgmasters.net



Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Mar 17, 2016 at 12:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> One idea that might be worth considering is to define the function
>> as generate_series(date,date,interval) returns timestamp (without
>> time zone).  The point here would be only to move the behavior for
>> date inputs as far as getting timestamp without tz rather than
>> timestamp with tz; which would at least save some timezone rotations
>> in typical use, as well as rather debatable semantics.

> That's a fairly tenuous benefit, though, and a substantially different
> patch.

Well, some folks were already of the opinion that the patch's benefits
were tenuous ;-)

> I think it's time to give up here and move on.

Agreed, letting this go for now seems like the right move.  Maybe
someone will have a bright idea later.
        regards, tom lane