Thread: regexp_matches illegally restricts rows

regexp_matches illegally restricts rows

From
Josh Berkus
Date:
Severity: major (data loss)
Versions Tested: 8.4.2, 9.0 HEAD
Test Case:

create table regex_test ( id serial not null primary key, myname text );

insert into regex_test ( myname )
values ( 'josh'),('joe'),('mary'),('stephen'), ('jose'),
('kelley'),('alejandro');

select id, regexp_matches(myname, $x$(j[\w]+)$x$)
from regex_test;

The above will return 4 rows, not the 7 which are in the table.

I can't see how this is anything but a bug; as far as I know, nothing in
the target list is allowed to restrict the number of rows which are
returned by the query.  We should get 7 rows, 3 of which have an empty
array or a NULL in the 2nd column.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
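
A minimal sketch of one way to get all seven rows back, assuming the regex_test table from the test case above. Wrapping the call in a scalar subquery turns "no match" into a NULL instead of a missing row. This relies on the pattern being used without the 'g' flag, so that at most one row comes back per input row; with 'g' and several matches the subquery would raise an error. The column alias is only illustrative.

```sql
-- Scalar subquery: an empty result becomes NULL, so no input row is dropped.
select id,
       (select regexp_matches(myname, $x$(j[\w]+)$x$)) as matches
from regex_test
order by id;

-- On PostgreSQL 10 and later, regexp_match (singular) is a plain function
-- that returns text[] or NULL, so it can be called directly:
-- select id, regexp_match(myname, $x$(j[\w]+)$x$) from regex_test;
```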

Re: regexp_matches illegally restricts rows -- just a documentation issue?

From
Josh Berkus
Date:
On 4/5/10 9:16 PM, Josh Berkus wrote:

> I can't see how this is anything but a bug; as far as I know, nothing in
> the target list is allowed to restrict the number of rows which are
> returned by the query.  We should get 7 rows, 3 of which have an empty
> array or a NULL in the 2nd column.

Just noticed it's a SETOF[] function.  Which makes it odd that I can
call it in the target list at all, but explains the row restriction.

It's still confusing behavior (three regulars on IRC thought it was a
bug too) and users should be warned in the documentation.  Not sure
exactly where, though ... maybe in 9.7?

--Josh Berkus
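
Whether a function returns a set can be checked in the pg_proc system catalog, where proretset is true for set-returning functions. A short sketch of such a lookup (the column aliases are illustrative):

```sql
-- List every overload of regexp_matches with its return type
-- and whether it is declared as returning a set.
select oid::regprocedure      as signature,
       prorettype::regtype    as return_type,
       proretset              as returns_set
from pg_proc
where proname = 'regexp_matches';
```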

Re: Re: regexp_matches illegally restricts rows -- just a documentation issue?

From
Robert Haas
Date:
On Tue, Apr 6, 2010 at 1:06 AM, Josh Berkus <josh@postgresql.org> wrote:
> On 4/5/10 9:16 PM, Josh Berkus wrote:
>
>> I can't see how this is anything but a bug; as far as I know, nothing in
>> the target list is allowed to restrict the number of rows which are
>> returned by the query.  We should get 7 rows, 3 of which have an empty
>> array or a NULL in the 2nd column.
>
> Just noticed it's a SETOF[] function.  Which makes it odd that I can
> call it in the target list at all, but explains the row restriction.
>
> It's still confusing behavior (three regulars on IRC thought it was a
> bug too) and users should be warned in the documentation.  Not sure
> exactly where, though ... maybe in 9.7?

While I understand why this is confusing, it's really very normal
behavior for a SRF, and I don't really think it makes sense to
document that this SRF behaves just like other SRFs...

...Robert

Re: Re: regexp_matches illegally restricts rows -- just a documentation issue?

From
Josh Berkus
Date:
> While I understand why this is confusing, it's really very normal
> behavior for a SRF, and I don't really think it makes sense to
> document that this SRF behaves just like other SRFs...

It's likely to be used by people who do not otherwise use SRFs, and many
would not be prepared for the consequences.  It's not instinctive that a
regexp function would be an SRF in any case; if someone is not looking
closely at the docs, it would be easy to miss this entirely -- as 3
experienced PG people did yesterday.

Personally, I also think that PostgreSQL is wrong to allow an SRF in the
target list to restrict the number of rows output.  A subselect in the
target list does not do so.  However, that's completely another discussion.

--Josh Berkus

Re: Re: regexp_matches illegally restricts rows -- just a documentation issue?

From
"Erik Rijkers"
Date:
On Tue, April 6, 2010 21:42, Josh Berkus wrote:
>
>> While I understand why this is confusing, it's really very normal
>> behavior for a SRF, and I don't really think it makes sense to
>> document that this SRF behaves just like other SRFs...
>
> It's likely to be used by people who do not otherwise use SRFs, and many
> would not be prepared for the consequences.  It's not instinctive that a
> regexp function would be an SRF in any case; if someone is not looking
> closely at the docs, it would be easy to miss this entirely -- as 3
> experienced PG people did yesterday.
>
> Personally, I also think that PostgreSQL is wrong to allow an SRF in the
> target list to restrict the number of rows output.  A subselect in the
> target list does not do so.  However, that's completely another discussion.
>

You said:
  "users should be warned in the documentation.";

The documentation has this warning:

"Currently, functions returning sets can also be called in the select list
of a query. For each row that the query generates by itself, the function
returning set is invoked, and an output row is generated for each element
of the function’s result set. Note, however, that this capability is
deprecated and might be removed in future releases."

(8.4 docs, section 34.4.7.)


Erik Rijkers
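
The behavior described in the quoted documentation can be seen with any set-returning function, not only regexp_matches. A small sketch using generate_series against the regex_test table from the original test case (the alias n is illustrative):

```sql
-- Each input row is expanded once per element of the function's result set:
-- two elements per row, so the 7 rows in the table produce 14 output rows.
select id, myname, generate_series(1, 2) as n
from regex_test
order by id, n;

-- An empty result set produces no output row for that input row,
-- so this query returns no rows at all.
select id, myname, generate_series(1, 0) as n
from regex_test;
```

The second query is the same effect as in the test case: rows for which regexp_matches finds nothing yield an empty set and therefore vanish from the output.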