Обсуждение: Array behavior oddities

Поиск
Список
Период
Сортировка

Array behavior oddities

От
Bruce Momjian
Дата:
In reading our array documentation I came across two unusual behaviors. 
The issue relates to slices:
 We can also access arbitrary rectangular slices of an array, or subarrays.  An array slice is denoted by writing
<literal><replaceable>lower-bound</replaceable>:<replaceable>upper-bound</replaceable></literal>for one or more array
dimensions. For example, this query retrieves the first item on Bill's schedule for the first two days of the week:
 
 SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill';

First issue:
 If any dimension is written as a slice, i.e. contains a colon, then all dimensions are treated as slices.  Any
dimensionthat has only a single number (no colon) is treated as being from <literal>1</> to the number specified.  For
example,<literal>[2]</> is treated as <literal>[1:2]</>, as in this example:
 

Is the the behavior of assuming an entry with no colon is a slice what
we want, or are we just stuck with it?

Also:
 An array subscript expression will return null if either the array itself or any of the subscript expressions are
null. Also, null is returned if a subscript is outside the array bounds (this case does not raise an error).  For
example,if <literal>schedule</> currently has the dimensions <literal>[1:3][1:2]</> then referencing
<literal>schedule[3][3]</>yields NULL.  Similarly, an array reference with the wrong number of subscripts yields a null
ratherthan an error.
 
 An array slice expression likewise yields null if the array itself or any of the subscript expressions are null.
However,in other corner cases such as selecting an array slice that is completely outside the current array bounds, a
sliceexpression yields an empty (zero-dimensional) array instead of null.  If the requested slice partially overlaps
thearray bounds, then it is silently reduced to just the overlapping region.
 

Is there a reason out-of-bounds array accesses behave differently for
slices and non-slices?

Having slices and non-slices behave differently is very confusing to me.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Array behavior oddities

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
>   If any dimension is written as a slice, i.e. contains a colon, then all
>   dimensions are treated as slices.

> Is the the behavior of assuming an entry with no colon is a slice what
> we want, or are we just stuck with it?

Why do you find that surprising?  It's either a slice or it isn't,
there's no halfway point.  Are you proposing to throw an error if only
some of the subscripts have colons?  What would be the point?

> Is there a reason out-of-bounds array accesses behave differently for
> slices and non-slices?

History (although "sloppy original implementation" would do too).  I'm
not sure if we should try to clean it up --- there've not been that many
complaints, but I'm sure we'd get complaints from people whose code
stopped working, if we change it.
        regards, tom lane


Re: Array behavior oddities

От
Bruce Momjian
Дата:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> >   If any dimension is written as a slice, i.e. contains a colon, then all
> >   dimensions are treated as slices.
> 
> > Is the the behavior of assuming an entry with no colon is a slice what
> > we want, or are we just stuck with it?
> 
> Why do you find that surprising?  It's either a slice or it isn't,
> there's no halfway point.  Are you proposing to throw an error if only
> some of the subscripts have colons?  What would be the point?

What is confusing is if I see [1:3][2], I assume that is [1:3][2:2], not[1:3][1:2].  By what logic does [2] mean from 1
to2?
 

For example, in [1:3][2], [2] means [1:2], two elements, but in [1][2],[2] means one element.

> > Is there a reason out-of-bounds array accesses behave differently for
> > slices and non-slices?
> 
> History (although "sloppy original implementation" would do too).  I'm
> not sure if we should try to clean it up --- there've not been that many
> complaints, but I'm sure we'd get complaints from people whose code
> stopped working, if we change it.

OK, so there is no grand design I am missing;  it is just a wart in our
implementation, that at least we document.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Array behavior oddities

От
Gregory Stark
Дата:
"Bruce Momjian" <bruce@momjian.us> writes:

> Tom Lane wrote:
>> Why do you find that surprising?  It's either a slice or it isn't,
>> there's no halfway point.  Are you proposing to throw an error if only
>> some of the subscripts have colons?  What would be the point?
>
> What is confusing is if I see [1:3][2], I assume that is [1:3][2:2], not
>  [1:3][1:2].  By what logic does [2] mean from 1 to 2?

Here's something else which confused me just now. Why does the second query
return NULL instead of an array slice?

postgres=# select (array['{foo}'::text[],'{bar}'])[1][1];array 
-------foo
(1 row)

postgres=# select (array['{foo}'::text[],'{bar}'])[1];array 
-------
(1 row)


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!


Re: Array behavior oddities

От
Tom Lane
Дата:
Gregory Stark <stark@enterprisedb.com> writes:
> Here's something else which confused me just now. Why does the second query
> return NULL instead of an array slice?

Because it isn't a slice expression --- you used colon nowhere, so the
result type is going to be text not text[].  (Remember that the parser
must determine the expression's result type at parse time, so whether
there are enough subscripts can't enter into this.)  Our alternatives
here are to throw a subscripting error or return NULL.  I'd personally
have gone with throwing an error, I think, but it seems far too late to
revisit that decision.
        regards, tom lane


Re: Array behavior oddities

От
Jens-Wolfhard Schicke
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bruce Momjian wrote:
>   An array subscript expression will return null if either the array itself
>   or any of the subscript expressions are null.  Also, null is returned
>   if a subscript is outside the array bounds (this case does not raise an
>   error).  For example, if <literal>schedule</> currently has the
>   dimensions <literal>[1:3][1:2]</> then referencing
>   <literal>schedule[3][3]</> yields NULL.  Similarly, an array reference
>   with the wrong number of subscripts yields a null rather than an error.
> 
>   An array slice expression likewise yields null if the array itself or
>   any of the subscript expressions are null.  However, in other corner
>   cases such as selecting an array slice that is completely outside the
>   current array bounds, a slice expression yields an empty
>   (zero-dimensional) array instead of null.  If the requested slice
>   partially overlaps the array bounds, then it is silently reduced to just
>   the overlapping region.
> 
> Is there a reason out-of-bounds array accesses behave differently for
> slices and non-slices?
> 
> Having slices and non-slices behave differently is very confusing to me.
I think the case of partially-out-of-bound slices is a good reason to have
this difference:

fastgraph=# select ('{foo,bar}'::text[])[1:2];  text
- -----------{foo,bar}
(1 row)

fastgraph=# select ('{foo,bar}'::text[])[2:3];text
- -------{bar}
(1 row)

fastgraph=# select ('{foo,bar}'::text[])[3:4];text
- ------{}
(1 row)

We cannot return an empty array in case of unsliced out-of-bounds access
because the type wouldn't match at all.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHjdsOzhchXT4RR5ARAvBvAKCGVxgl6u2ZUcB/Bvl2jPN2/p6hzACdFXE3
9w01URr/xPYukzHhD5qhudE=
=iZxq
-----END PGP SIGNATURE-----


Re: Array behavior oddities

От
Bruce Momjian
Дата:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> >   If any dimension is written as a slice, i.e. contains a colon, then all
> >   dimensions are treated as slices.
>
> > Is the the behavior of assuming an entry with no colon is a slice what
> > we want, or are we just stuck with it?
>
> Why do you find that surprising?  It's either a slice or it isn't,
> there's no halfway point.  Are you proposing to throw an error if only
> some of the subscripts have colons?  What would be the point?
>
> > Is there a reason out-of-bounds array accesses behave differently for
> > slices and non-slices?
>
> History (although "sloppy original implementation" would do too).  I'm
> not sure if we should try to clean it up --- there've not been that many
> complaints, but I'm sure we'd get complaints from people whose code
> stopped working, if we change it.

I have updated the documention (patch attached) to be clearer on the odd
array slice behavior.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/array.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/array.sgml,v
retrieving revision 1.62
diff -c -c -r1.62 array.sgml
*** doc/src/sgml/array.sgml    7 Jun 2007 14:49:56 -0000    1.62
--- doc/src/sgml/array.sgml    26 Mar 2008 14:42:14 -0000
***************
*** 258,263 ****
--- 258,266 ----
   {{meeting,lunch},{training,presentation}}
  (1 row)
  </programlisting>
+
+   To avoid confusion with slices, use slice syntax for all dimmension
+   references, e.g.  <literal>[1:2][1:1]</>, not <literal>[2][1:1]</>.
   </para>

   <para>
***************
*** 275,281 ****
    any of the subscript expressions are null.  However, in other corner
    cases such as selecting an array slice that
    is completely outside the current array bounds, a slice expression
!   yields an empty (zero-dimensional) array instead of null.
    If the requested slice partially overlaps the array bounds, then it
    is silently reduced to just the overlapping region.
   </para>
--- 278,285 ----
    any of the subscript expressions are null.  However, in other corner
    cases such as selecting an array slice that
    is completely outside the current array bounds, a slice expression
!   yields an empty (zero-dimensional) array instead of null.  (This
!   does not match non-slice behavior and is done for historical reasons.)
    If the requested slice partially overlaps the array bounds, then it
    is silently reduced to just the overlapping region.
   </para>