Обсуждение: Array behavior oddities
In reading our array documentation I came across two unusual behaviors. The issue relates to slices: We can also access arbitrary rectangular slices of an array, or subarrays. An array slice is denoted by writing <literal><replaceable>lower-bound</replaceable>:<replaceable>upper-bound</replaceable></literal>for one or more array dimensions. For example, this query retrieves the first item on Bill's schedule for the first two days of the week: SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill'; First issue: If any dimension is written as a slice, i.e. contains a colon, then all dimensions are treated as slices. Any dimensionthat has only a single number (no colon) is treated as being from <literal>1</> to the number specified. For example,<literal>[2]</> is treated as <literal>[1:2]</>, as in this example: Is the the behavior of assuming an entry with no colon is a slice what we want, or are we just stuck with it? Also: An array subscript expression will return null if either the array itself or any of the subscript expressions are null. Also, null is returned if a subscript is outside the array bounds (this case does not raise an error). For example,if <literal>schedule</> currently has the dimensions <literal>[1:3][1:2]</> then referencing <literal>schedule[3][3]</>yields NULL. Similarly, an array reference with the wrong number of subscripts yields a null ratherthan an error. An array slice expression likewise yields null if the array itself or any of the subscript expressions are null. However,in other corner cases such as selecting an array slice that is completely outside the current array bounds, a sliceexpression yields an empty (zero-dimensional) array instead of null. If the requested slice partially overlaps thearray bounds, then it is silently reduced to just the overlapping region. Is there a reason out-of-bounds array accesses behave differently for slices and non-slices? Having slices and non-slices behave differently is very confusing to me. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > If any dimension is written as a slice, i.e. contains a colon, then all > dimensions are treated as slices. > Is the the behavior of assuming an entry with no colon is a slice what > we want, or are we just stuck with it? Why do you find that surprising? It's either a slice or it isn't, there's no halfway point. Are you proposing to throw an error if only some of the subscripts have colons? What would be the point? > Is there a reason out-of-bounds array accesses behave differently for > slices and non-slices? History (although "sloppy original implementation" would do too). I'm not sure if we should try to clean it up --- there've not been that many complaints, but I'm sure we'd get complaints from people whose code stopped working, if we change it. regards, tom lane
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > If any dimension is written as a slice, i.e. contains a colon, then all > > dimensions are treated as slices. > > > Is the the behavior of assuming an entry with no colon is a slice what > > we want, or are we just stuck with it? > > Why do you find that surprising? It's either a slice or it isn't, > there's no halfway point. Are you proposing to throw an error if only > some of the subscripts have colons? What would be the point? What is confusing is if I see [1:3][2], I assume that is [1:3][2:2], not[1:3][1:2]. By what logic does [2] mean from 1 to2? For example, in [1:3][2], [2] means [1:2], two elements, but in [1][2],[2] means one element. > > Is there a reason out-of-bounds array accesses behave differently for > > slices and non-slices? > > History (although "sloppy original implementation" would do too). I'm > not sure if we should try to clean it up --- there've not been that many > complaints, but I'm sure we'd get complaints from people whose code > stopped working, if we change it. OK, so there is no grand design I am missing; it is just a wart in our implementation, that at least we document. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
"Bruce Momjian" <bruce@momjian.us> writes: > Tom Lane wrote: >> Why do you find that surprising? It's either a slice or it isn't, >> there's no halfway point. Are you proposing to throw an error if only >> some of the subscripts have colons? What would be the point? > > What is confusing is if I see [1:3][2], I assume that is [1:3][2:2], not > [1:3][1:2]. By what logic does [2] mean from 1 to 2? Here's something else which confused me just now. Why does the second query return NULL instead of an array slice? postgres=# select (array['{foo}'::text[],'{bar}'])[1][1];array -------foo (1 row) postgres=# select (array['{foo}'::text[],'{bar}'])[1];array ------- (1 row) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Gregory Stark <stark@enterprisedb.com> writes: > Here's something else which confused me just now. Why does the second query > return NULL instead of an array slice? Because it isn't a slice expression --- you used colon nowhere, so the result type is going to be text not text[]. (Remember that the parser must determine the expression's result type at parse time, so whether there are enough subscripts can't enter into this.) Our alternatives here are to throw a subscripting error or return NULL. I'd personally have gone with throwing an error, I think, but it seems far too late to revisit that decision. regards, tom lane
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Bruce Momjian wrote: > An array subscript expression will return null if either the array itself > or any of the subscript expressions are null. Also, null is returned > if a subscript is outside the array bounds (this case does not raise an > error). For example, if <literal>schedule</> currently has the > dimensions <literal>[1:3][1:2]</> then referencing > <literal>schedule[3][3]</> yields NULL. Similarly, an array reference > with the wrong number of subscripts yields a null rather than an error. > > An array slice expression likewise yields null if the array itself or > any of the subscript expressions are null. However, in other corner > cases such as selecting an array slice that is completely outside the > current array bounds, a slice expression yields an empty > (zero-dimensional) array instead of null. If the requested slice > partially overlaps the array bounds, then it is silently reduced to just > the overlapping region. > > Is there a reason out-of-bounds array accesses behave differently for > slices and non-slices? > > Having slices and non-slices behave differently is very confusing to me. I think the case of partially-out-of-bound slices is a good reason to have this difference: fastgraph=# select ('{foo,bar}'::text[])[1:2]; text - -----------{foo,bar} (1 row) fastgraph=# select ('{foo,bar}'::text[])[2:3];text - -------{bar} (1 row) fastgraph=# select ('{foo,bar}'::text[])[3:4];text - ------{} (1 row) We cannot return an empty array in case of unsliced out-of-bounds access because the type wouldn't match at all. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFHjdsOzhchXT4RR5ARAvBvAKCGVxgl6u2ZUcB/Bvl2jPN2/p6hzACdFXE3 9w01URr/xPYukzHhD5qhudE= =iZxq -----END PGP SIGNATURE-----
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > If any dimension is written as a slice, i.e. contains a colon, then all > > dimensions are treated as slices. > > > Is the the behavior of assuming an entry with no colon is a slice what > > we want, or are we just stuck with it? > > Why do you find that surprising? It's either a slice or it isn't, > there's no halfway point. Are you proposing to throw an error if only > some of the subscripts have colons? What would be the point? > > > Is there a reason out-of-bounds array accesses behave differently for > > slices and non-slices? > > History (although "sloppy original implementation" would do too). I'm > not sure if we should try to clean it up --- there've not been that many > complaints, but I'm sure we'd get complaints from people whose code > stopped working, if we change it. I have updated the documention (patch attached) to be clearer on the odd array slice behavior. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: doc/src/sgml/array.sgml =================================================================== RCS file: /cvsroot/pgsql/doc/src/sgml/array.sgml,v retrieving revision 1.62 diff -c -c -r1.62 array.sgml *** doc/src/sgml/array.sgml 7 Jun 2007 14:49:56 -0000 1.62 --- doc/src/sgml/array.sgml 26 Mar 2008 14:42:14 -0000 *************** *** 258,263 **** --- 258,266 ---- {{meeting,lunch},{training,presentation}} (1 row) </programlisting> + + To avoid confusion with slices, use slice syntax for all dimmension + references, e.g. <literal>[1:2][1:1]</>, not <literal>[2][1:1]</>. </para> <para> *************** *** 275,281 **** any of the subscript expressions are null. However, in other corner cases such as selecting an array slice that is completely outside the current array bounds, a slice expression ! yields an empty (zero-dimensional) array instead of null. If the requested slice partially overlaps the array bounds, then it is silently reduced to just the overlapping region. </para> --- 278,285 ---- any of the subscript expressions are null. However, in other corner cases such as selecting an array slice that is completely outside the current array bounds, a slice expression ! yields an empty (zero-dimensional) array instead of null. (This ! does not match non-slice behavior and is done for historical reasons.) If the requested slice partially overlaps the array bounds, then it is silently reduced to just the overlapping region. </para>