Thread: proposal: row_to_array function


proposal: row_to_array function

From: Pavel Stehule
Hi

I am returning to the topic of processing records in plpgsql.

I think a record can be processed simply by transforming it to an array.

We already have similar functions - hstore(row), row_to_json, ... - but using these functions can be a useless step. Any row variable can be transformed to a 2D text array.

There are two possible transformations:

row_to_array --> [[key1, value1],[key2, value2], ...]
row_to_row_array --> [(key1, value1), (key2, value2), ... ]

Both transformations can be simply implemented.
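
To make the idea concrete, a hypothetical sketch (neither function exists yet; the table foo(a int, b text), its single row (10, 'hello'), and the outputs shown are illustrative only):

select row_to_array(f) from foo f;      -- proposed: {{a,10},{b,hello}}        (2D text array)
select row_to_row_array(f) from foo f;  -- proposed: {"(a,10)","(b,hello)"}    (array of key/value pairs)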

Comments, notices?

Regards

Pavel

Re: proposal: row_to_array function

From: Jim Nasby
On 1/16/15 3:45 AM, Pavel Stehule wrote:
> I am returning back to processing records in plpgsql.
>
> I am thinking so it can be simply processed with transformations to array.
>
> Now we have similar functions - hstore(row), row_to_json, ... but using of these functions can be a useless step. Any row variable can be transformed to 2D text array.

How is it useless? Why wouldn't you just use JSON and be done with it?

Do you have some use cases you can share?

> There two possible transformations:
>
> row_to_array --> [[key1, value1],[key2, value2], ...]
> row_to_row_array --> [(key1, value1), (key2, value2), ... ]

If we're going to go that route, I think it makes more sense to create an actual key/value type (ie: http://pgxn.org/dist/pair/doc/pair.html) and return an array of that.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: proposal: row_to_array function

From: Pavel Stehule


2015-01-16 18:03 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/16/15 3:45 AM, Pavel Stehule wrote:
I am returning back to processing records in plpgsql.

I am thinking so it can be simply processed with transformations to array.

Now we have similar functions - hstore(row), row_to_json, ... but using of these functions can be a useless step. Any row variable can be transformed to 2D text array.

How is it useless? Why wouldn't you just use JSON and be done with it?

We can use FOREACH IN ARRAY iteration in plpgsql (a second variant would be an implementation of FOREACH for jsonb),

so ROW->ARRAY is shorter than ROW->JSON->ARRAY or ROW->HSTORE->ARRAY
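
For comparison, a minimal sketch of the detour that already works today (the table foo is illustrative; row_to_json and json_each_text are existing functions):

do $$
declare
  r  record;
  kv record;
begin
  select * into r from foo limit 1;
  -- row -> json -> set of (key, value) text pairs
  for kv in select key, value from json_each_text(row_to_json(r)) loop
    raise notice 'key: %, value: %', kv.key, kv.value;
  end loop;
end;
$$ language plpgsql;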
 

Do you have some use cases you can share?

processing of NEW, OLD variables in triggers
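
A sketch of what that use case looks like today via hstore (requires the hstore extension; the trigger function below is illustrative only):

create or replace function log_new_columns() returns trigger as $$
declare
  kv record;
begin
  -- NEW smashed to (key, value) text pairs via hstore(record) and each()
  for kv in select key, value from each(hstore(NEW)) loop
    raise notice 'column %: %', kv.key, kv.value;
  end loop;
  return NEW;
end;
$$ language plpgsql;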
 

There two possible transformations:

row_to_array --> [[key1, value1],[key2, value2], ...]
row_to_row_array --> [(key1, value1), (key2, value2), ... ]

If we're going to go that route, I think it makes more sense to create an actual key/value type (ie: http://pgxn.org/dist/pair/doc/pair.html) and return an array of that.

ok
 
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: proposal: row_to_array function

From: Jim Nasby
On 1/16/15 11:22 AM, Pavel Stehule wrote:
>
>
> 2015-01-16 18:03 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com <mailto:Jim.Nasby@bluetreble.com>>:
>
>     On 1/16/15 3:45 AM, Pavel Stehule wrote:
>
>         I am returning back to processing records in plpgsql.
>
>         I am thinking so it can be simply processed with transformations to array.
>
>         Now we have similar functions - hstore(row), row_to_json, ... but using of these functions can be a useless step. Any row variable can be transformed to 2D text array.
>
>
>     How is it useless? Why wouldn't you just use JSON and be done with it?
>
>
> We can use a FOREACH IN ARRAY iteration in plpgsql (second variant is a implementation FOREACH for jsonb)
>
> so ROW->ARRAY is shorter than ROW->JSON->ARRAY or ROW->HSTORE->ARRAY

I think the real problem here is that we're inventing a bunch of different ways to do the same thing: iterate over a set. Instead of doing that, should we add the idea of an iterator to the type system? That would make sense for arrays, hstore, json and XML.
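
For context, a rough sketch of how that iteration is spelled today, with a different mechanism per type (all functions shown exist already; hstore needs the extension, and the xml line needs libxml support):

select x from unnest(array[1,2,3]) as x;                             -- arrays
select key, value from each('a=>1, b=>2'::hstore);                   -- hstore
select key, value from jsonb_each_text('{"a": 1, "b": 2}'::jsonb);   -- jsonb
select unnest(xpath('//i/text()', '<r><i>1</i><i>2</i></r>'::xml));  -- xml, via xpath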
 

>     Do you have some use cases you can share?
>
>
> processing of NEW, OLD variables in triggers

Note that last time I checked you couldn't do something like NEW.variable, and I don't think you could use EXEC to do it either. So there's more needed here than just converting a record to an array.

>         There two possible transformations:
>
>         row_to_array --> [[key1, value1],[key2, value2], ...]
>         row_to_row_array --> [(key1, value1), (key2, value2), ... ]
>
>
>     If we're going to go that route, I think it makes more sense to create an actual key/value type (ie: http://pgxn.org/dist/pair/doc/pair.html) and return an array of that.
>
>
> ok
>
>     --
>     Jim Nasby, Data Architect, Blue Treble Consulting
>     Data in Trouble? Get it in Treble! http://BlueTreble.com
>
>


-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: proposal: row_to_array function

From: Pavel Stehule


2015-01-16 18:42 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/16/15 11:22 AM, Pavel Stehule wrote:


2015-01-16 18:03 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com <mailto:Jim.Nasby@bluetreble.com>>:

    On 1/16/15 3:45 AM, Pavel Stehule wrote:

        I am returning back to processing records in plpgsql.

        I am thinking so it can be simply processed with transformations to array.

        Now we have similar functions - hstore(row), row_to_json, ... but using of these functions can be a useless step. Any row variable can be transformed to 2D text array.


    How is it useless? Why wouldn't you just use JSON and be done with it?


We can use a FOREACH IN ARRAY iteration in plpgsql (second variant is a implementation FOREACH for jsonb)

so ROW->ARRAY is shorter than ROW->JSON->ARRAY or ROW->HSTORE->ARRAY

I think the real problem here is that we're inventing a bunch of different ways to do the same thing: iterate over a set. Instead of doing that, should we add the idea of an iterator to the type system? That would make sense for arrays, hstore, json and XML.

What do you think? How could this be implemented?


 

    Do you have some use cases you can share?


processing of NEW, OLD variables in triggers

Note that last time I checked you couldn't do something like NEW.variable, and I don't think you could use EXEC to do it either. So there's more needed here than just converting a record to an array.

        There two possible transformations:

        row_to_array --> [[key1, value1],[key2, value2], ...]
        row_to_row_array --> [(key1, value1), (key2, value2), ... ]


    If we're going to go that route, I think it makes more sense to create an actual key/value type (ie: http://pgxn.org/dist/pair/doc/__pair.html <http://pgxn.org/dist/pair/doc/pair.html>) and return an array of that.


ok

    --
    Jim Nasby, Data Architect, Blue Treble Consulting
    Data in Trouble? Get it in Treble! http://BlueTreble.com




--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: proposal: row_to_array function

From: Andrew Dunstan
On 01/16/2015 12:22 PM, Pavel Stehule wrote:
>
>
>         There two possible transformations:
>
>         row_to_array --> [[key1, value1],[key2, value2], ...]
>         row_to_row_array --> [(key1, value1), (key2, value2), ... ]
>
>
>     If we're going to go that route, I think it makes more sense to
>     create an actual key/value type (ie:
>     http://pgxn.org/dist/pair/doc/pair.html) and return an array of that.
>
>
> ok
>
>     <http://BlueTreble.com>
>


I think we'd possibly be better off with simply returning a flat array, 
[key1, value1, ...]

That's what the hstore(text[]) and json_object(text[]) functions accept, along with the 2D variant, if we want a precedent.
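
For reference, a quick look at that precedent (hstore needs the extension; json_object exists as of 9.4):

select hstore(array['a','1','b','2']);        -- "a"=>"1", "b"=>"2"     (flat key/value form)
select hstore(array[['a','1'],['b','2']]);    -- same result, 2D form
select json_object(array['a','1','b','2']);   -- {"a" : "1", "b" : "2"}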

cheers

andrew




Re: proposal: row_to_array function

From: Pavel Stehule

2015-01-16 22:35 GMT+01:00 Andrew Dunstan <andrew@dunslane.net>:

On 01/16/2015 12:22 PM, Pavel Stehule wrote:


        There two possible transformations:

        row_to_array --> [[key1, value1],[key2, value2], ...]
        row_to_row_array --> [(key1, value1), (key2, value2), ... ]


    If we're going to go that route, I think it makes more sense to
    create an actual key/value type (ie:
    http://pgxn.org/dist/pair/doc/pair.html) and return an array of that.


ok

    <http://BlueTreble.com>



I think we'd possibly be better off with simply returning a flat array, [key1, value1, ...]

That's what the hstore(text[]) and json_object(text[]) functions accept, along with the 2D variant, if we want a precedent.

It can be one of the supported variants. It should not be the only one, because we cannot simply iterate over it.

Another possibility is to teach FOREACH to take a key and a value in one step.

Regards

Pavel
 

cheers

andrew


Re: proposal: row_to_array function

From: Pavel Stehule


2015-01-17 7:26 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2015-01-16 22:35 GMT+01:00 Andrew Dunstan <andrew@dunslane.net>:

On 01/16/2015 12:22 PM, Pavel Stehule wrote:


        There two possible transformations:

        row_to_array --> [[key1, value1],[key2, value2], ...]
        row_to_row_array --> [(key1, value1), (key2, value2), ... ]


    If we're going to go that route, I think it makes more sense to
    create an actual key/value type (ie:
    http://pgxn.org/dist/pair/doc/pair.html) and return an array of that.


ok

    <http://BlueTreble.com>



I think we'd possibly be better off with simply returning a flat array, [key1, value1, ...]

That's what the hstore(text[]) and json_object(text[]) functions accept, along with the 2D variant, if we want a precedent.

It can be one of supported variant. I should not be one, because we cannot to simply iterate over it

Next possibility is teach FOREACH to take key and value in one step.

I looked at the code, and iterating over (key, value) pairs is simple.

FOREACH already supports a target list, but the source must be an array of composite values:

postgres=# do $$
declare a int;
  b int;
begin
  foreach a,b in array ARRAY[(1,2),(3,4)]
  loop
    raise notice 'a = %, b = %', a,b;
  end loop;
end;
$$ language plpgsql;
NOTICE:  a = 1, b = 2
NOTICE:  a = 3, b = 4
DO

Conversion from ARRAY[k1, v1, k2, v2, ...] is not very consistent with the current design.
 

Regards

Pavel
 

cheers

andrew



Re: proposal: row_to_array function

From: Pavel Stehule
Hi

I tested a concept of iteration over an array in the format [key1, value1, key2, value2, ...] - what is nice, it works for [[key1, value1], [key2, value2], ...] too.

It adds only a few lines to the current code, and this change doesn't break compatibility.

Do you think this patch is acceptable?

Ideas, comments?

Regards

Pavel


2015-01-16 22:35 GMT+01:00 Andrew Dunstan <andrew@dunslane.net>:

On 01/16/2015 12:22 PM, Pavel Stehule wrote:


        There two possible transformations:

        row_to_array --> [[key1, value1],[key2, value2], ...]
        row_to_row_array --> [(key1, value1), (key2, value2), ... ]


    If we're going to go that route, I think it makes more sense to
    create an actual key/value type (ie:
    http://pgxn.org/dist/pair/doc/pair.html) and return an array of that.


ok

    <http://BlueTreble.com>



I think we'd possibly be better off with simply returning a flat array, [key1, value1, ...]

That's what the hstore(text[]) and json_object(text[]) functions accept, along with the 2D variant, if we want a precedent.

cheers

andrew


Attachments

Re: proposal: row_to_array function

From: Jim Nasby
On 1/25/15 4:23 AM, Pavel Stehule wrote:
>
> I tested a concept iteration over array in format [key1, value1, key2, value2, .. ] - what is nice, it works for [[key1,value1],[key2, value2], ...] too
>
> It is only a few lines more to current code, and this change doesn't break a compatibility.
>
> Do you think, so this patch is acceptable?
>
> Ideas, comments?

Aside from fixing the comments... I think this needs more tests on corner cases. For example, what happens when you do

foreach a, b, c in array(array(1,2),array(3,4)) ?

Or the opposite case of

foreach a,b in array(array(1,2,3))

Also, what about:

foreach a,b in '{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[] ?
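
(For reference, the existing single-target FOREACH without SLICE already walks that last array element by element, in storage order - a quick sketch:)

do $$
declare x int;
begin
  foreach x in array '{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[] loop
    raise notice '%', x;   -- prints 1 through 8, one element per iteration
  end loop;
end;
$$ language plpgsql;
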
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: proposal: row_to_array function

From: Pavel Stehule


2015-01-26 21:44 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/25/15 4:23 AM, Pavel Stehule wrote:

I tested a concept iteration over array in format [key1, value1, key2, value2, .. ] - what is nice, it works for [[key1,value1],[key2, value2], ...] too

It is only a few lines more to current code, and this change doesn't break a compatibility.

Do you think, so this patch is acceptable?

Ideas, comments?

Aside from fixing the comments... I think this needs more tests on corner cases. For example, what happens when you do

foreach a, b, c in array(array(1,2),array(3,4)) ?

The behavior is relatively simple -- missing values are NULL.

array(1,2),array(3,4) -- do you mean ARRAY[[1,2],[3,4]]? For FOREACH that is effectively ARRAY[1,2,3,4].
 

Or the opposite case of

foreach a,b in array(array(1,2,3))

Also, what about:

foreach a,b in '{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[] ?


 postgres=# select array(select unnest('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]));
       array       
-------------------
 {1,2,3,4,5,6,7,8}
(1 row)

so it generates the pairs {1,2},{3,4},{5,6},{7,8}

Regards

Pavel Stehule


--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: proposal: row_to_array function

From: Pavel Stehule
Hi

2015-01-27 11:41 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:


2015-01-26 21:44 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/25/15 4:23 AM, Pavel Stehule wrote:

I tested a concept iteration over array in format [key1, value1, key2, value2, .. ] - what is nice, it works for [[key1,value1],[key2, value2], ...] too

It is only a few lines more to current code, and this change doesn't break a compatibility.

Do you think, so this patch is acceptable?

Ideas, comments?

Aside from fixing the comments... I think this needs more tests on corner cases. For example, what happens when you do

foreach a, b, c in array(array(1,2),array(3,4)) ?

it is relative simple behave -- empty values are NULL

array(1,2),array(3,4) -- do you think ARRAY[[1,2],[3,4]] is effectively ARRAY[1,2,3,4]
 

Or the opposite case of

foreach a,b in array(array(1,2,3))

Also, what about:

foreach a,b in '{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[] ?


 postgres=# select array(select unnest('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]));
       array       
-------------------
 {1,2,3,4,5,6,7,8}
(1 row)

so it generate pairs {1,2}{3,4},{5,6},{7,8}

I fixed the situation when the array does not have enough elements.

Added more tests and a simple doc.

Regards

Pavel
 

Regards

Pavel Stehule


--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


Attachments

Re: proposal: row_to_array function

From: Pavel Stehule
Hello

here is an initial version of the row_to_array function - it transforms any row to an array in the format proposed by Andrew.

Regards

Pavel

2015-01-27 19:58 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hi

2015-01-27 11:41 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:


2015-01-26 21:44 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/25/15 4:23 AM, Pavel Stehule wrote:

I tested a concept iteration over array in format [key1, value1, key2, value2, .. ] - what is nice, it works for [[key1,value1],[key2, value2], ...] too

It is only a few lines more to current code, and this change doesn't break a compatibility.

Do you think, so this patch is acceptable?

Ideas, comments?

Aside from fixing the comments... I think this needs more tests on corner cases. For example, what happens when you do

foreach a, b, c in array(array(1,2),array(3,4)) ?

it is relative simple behave -- empty values are NULL

array(1,2),array(3,4) -- do you think ARRAY[[1,2],[3,4]] is effectively ARRAY[1,2,3,4]
 

Or the opposite case of

foreach a,b in array(array(1,2,3))

Also, what about:

foreach a,b in '{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[] ?


 postgres=# select array(select unnest('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]));
       array       
-------------------
 {1,2,3,4,5,6,7,8}
(1 row)

so it generate pairs {1,2}{3,4},{5,6},{7,8}

I fixed situation when array has not enough elements.

More tests, simple doc

Regards

Pavel
 

Regards

Pavel Stehule


--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Attachments

Re: proposal: row_to_array function

From: Pavel Stehule
Example:

postgres=# do $$
declare r record;
declare k text; v text;
begin
  for r in select * from foo loop
    foreach k,v in array row_to_array(r) loop
      raise notice 'k: %, v: %', k, v;
    end loop;
  end loop;
end;
$$;
NOTICE:  k: a, v: 2
NOTICE:  k: b, v: NAZDAR
NOTICE:  k: c, v: 2015-01-27
NOTICE:  k: a, v: 2
NOTICE:  k: b, v: AHOJ
NOTICE:  k: c, v: 2015-01-27
DO

Regards

Pavel

2015-01-27 21:26 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hello

here is a initial version of row_to_array function - transform any row to array in format proposed by Andrew.

Regards

Pavel

2015-01-27 19:58 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hi

2015-01-27 11:41 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:


2015-01-26 21:44 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/25/15 4:23 AM, Pavel Stehule wrote:

I tested a concept iteration over array in format [key1, value1, key2, value2, .. ] - what is nice, it works for [[key1,value1],[key2, value2], ...] too

It is only a few lines more to current code, and this change doesn't break a compatibility.

Do you think, so this patch is acceptable?

Ideas, comments?

Aside from fixing the comments... I think this needs more tests on corner cases. For example, what happens when you do

foreach a, b, c in array(array(1,2),array(3,4)) ?

it is relative simple behave -- empty values are NULL

array(1,2),array(3,4) -- do you think ARRAY[[1,2],[3,4]] is effectively ARRAY[1,2,3,4]
 

Or the opposite case of

foreach a,b in array(array(1,2,3))

Also, what about:

foreach a,b in '{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[] ?


 postgres=# select array(select unnest('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]));
       array       
-------------------
 {1,2,3,4,5,6,7,8}
(1 row)

so it generate pairs {1,2}{3,4},{5,6},{7,8}

I fixed situation when array has not enough elements.

More tests, simple doc

Regards

Pavel
 

Regards

Pavel Stehule


--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: proposal: row_to_array function

From: Jim Nasby
On 1/27/15 2:26 PM, Pavel Stehule wrote:
> here is a initial version of row_to_array function - transform any row to array in format proposed by Andrew.

Please start a new thread for this... does it depend on the key-value patch?
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: proposal: row_to_array function

From: Jim Nasby
On 1/27/15 12:58 PM, Pavel Stehule wrote:
>       postgres=# select array(select unnest('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]));
>             array
>     -------------------
>       {1,2,3,4,5,6,7,8}
>     (1 row)
>
>     so it generate pairs {1,2}{3,4},{5,6},{7,8}
>
>
> I fixed situation when array has not enough elements.
>
> More tests, simple doc

Hrm, this wasn't what I was expecting:

+ select foreach_test_ab(array[1,2,3,4]);
+ NOTICE:  a: 1, b: 2
+ NOTICE:  a: 3, b: 4

I was expecting that foreach a,b array would be expecting something in the array to have a dimension of 2. :(

I think this is bad, because this:

foreach_test_ab('{{1,2,3},{4,5,6}}'::int[]);

will give you 1,2; 3,4; 5,6. I don't see any way that that makes sense. Even if it did make sense, I'm more concerned that adding this will seriously paint us into a corner when it comes to the (to me) more rational case of returning {1,2,3},{4,5,6}.

I think we need to think some more about this, at least to make sure we're not painting ourselves into a corner for more appropriate array iteration.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: proposal: row_to_array function

From: Pavel Stehule
On 28.1.2015 0:25, "Jim Nasby" <Jim.Nasby@bluetreble.com> wrote:
>
> On 1/27/15 12:58 PM, Pavel Stehule wrote:
>>
>>       postgres=# select array(select unnest('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]));
>>             array
>>     -------------------
>>       {1,2,3,4,5,6,7,8}
>>     (1 row)
>>
>>     so it generate pairs {1,2}{3,4},{5,6},{7,8}
>>
>>
>> I fixed situation when array has not enough elements.
>>
>> More tests, simple doc
>
>
> Hrm, this wasn't what I was expecting:
>
> + select foreach_test_ab(array[1,2,3,4]);
> + NOTICE:  a: 1, b: 2
> + NOTICE:  a: 3, b: 4
>
> I was expecting that foreach a,b array would be expecting something in the array to have a dimension of 2. :(

Your expectation is inconsistent with the current implementation of FOREACH. It doesn't produce an array when SLICE is missing, and it doesn't take dimensions into account.

I would not change this rule. It is not ambiguous, and it allows working with 1D, 2D, and 3D arrays. You can process Andrew's format well and my proposed format (a 2D array) well too.

The behavior can be different when SLICE is used; there we can iterate exactly by dimensions. Should we design the behavior for this case?

>
> I think this is bad, because this:
>
> foreach_test_ab('{{1,2,3},{4,5,6}}'::int[]);
>
> will give you 1,2; 3,4; 5,6. I don't see any way that that makes sense. Even if it did make sense, I'm more concerned that adding this will seriously paint us into a corner when it comes to the (to me) more rational case of returning {1,2,3},{4,5,6}.
>
> I think we need to think some more about this, at least to make sure we're not painting ourselves into a corner for more appropriate array iteration.
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: proposal: row_to_array function

From: Pavel Stehule


2015-01-28 0:16 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/27/15 2:26 PM, Pavel Stehule wrote:
here is a initial version of row_to_array function - transform any row to array in format proposed by Andrew.

Please start a new thread for this... does it depend on the key-value patch?

partially - a selected format should be well supported by FOREACH statement

Regards

Pavel
 

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: proposal: row_to_array function

From: Pavel Stehule


2015-01-28 6:49 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:


On 28.1.2015 0:25, "Jim Nasby" <Jim.Nasby@bluetreble.com> wrote:
>
> On 1/27/15 12:58 PM, Pavel Stehule wrote:
>>
>>       postgres=# select array(select unnest('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]));
>>             array
>>     -------------------
>>       {1,2,3,4,5,6,7,8}
>>     (1 row)
>>
>>     so it generate pairs {1,2}{3,4},{5,6},{7,8}
>>
>>
>> I fixed situation when array has not enough elements.
>>
>> More tests, simple doc
>
>
> Hrm, this wasn't what I was expecting:
>
> + select foreach_test_ab(array[1,2,3,4]);
> + NOTICE:  a: 1, b: 2
> + NOTICE:  a: 3, b: 4
>
> I was expecting that foreach a,b array would be expecting something in the array to have a dimension of 2. :(

It is inconsist (your expectation) with current implementation of FOREACH. It doesnt produce a array when SLICING is missing. And it doesnt calculate with dimensions.

I would not to change this rule. It is not ambigonuous and it allows to work with
1d, 2d, 3d dimensions array. You can process Andrew format well and my proposed format (2d array) well too.

one small example

CREATE OR REPLACE FUNCTION iterate_over_pairs(text[])
RETURNS void AS $$
DECLARE v1 text; v2 text; e text; i int := 0;
BEGIN
  FOREACH e IN ARRAY $1 LOOP
    IF i % 2 = 0 THEN v1 := e;
    ELSE v2 := e; RAISE NOTICE 'v1: %, v2: %', v1, v2; END IF;
    i := i + 1;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

postgres=# SELECT iterate_over_pairs(ARRAY[1,2,3,4]::text[]);
NOTICE:  v1: 1, v2: 2
NOTICE:  v1: 3, v2: 4
 iterate_over_pairs
--------------------
 
(1 row)

postgres=# SELECT iterate_over_pairs(ARRAY[[1,2],[3,4]]::text[]);
NOTICE:  v1: 1, v2: 2
NOTICE:  v1: 3, v2: 4
 iterate_over_pairs
--------------------
 
(1 row)

I can use iterate_over_pairs for 1D or 2D arrays equally well -- FOREACH was designed in this direction: without SLICE, the dimensionality of the data is unimportant.

The discussed enhancement of FOREACH is a faster, shorter (more readable) form of the iterate_over_pairs use case:

FOREACH v1, v2 IN ARRAY $1 LOOP
  ..
END LOOP;

It is consistent with the current design.
 
You can look at the patch - at this moment, SLICE > 0 is disallowed for the situation when the target variable is a ROW and the source is not.

Regards

Pavel

There can be differen behave when SLICING is used. There we can iterate exactly with dimensions. We can design a behave in this case?

>
> I think this is bad, because this:
>
> foreach_test_ab('{{1,2,3},{4,5,6}}'::int[]);
>
> will give you 1,2; 3,4; 5,6. I don't see any way that that makes sense. Even if it did make sense, I'm more concerned that adding this will seriously paint us into a corner when it comes to the (to me) more rational case of returning {1,2,3},{4,5,6}.
>
> I think we need to think some more about this, at least to make sure we're not painting ourselves into a corner for more appropriate array iteration.
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: proposal: row_to_array function

From: Jeff Janes
On Tue, Jan 27, 2015 at 10:58 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

2015-01-27 11:41 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:


2015-01-26 21:44 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/25/15 4:23 AM, Pavel Stehule wrote:

I tested a concept iteration over array in format [key1, value1, key2, value2, .. ] - what is nice, it works for [[key1,value1],[key2, value2], ...] too

It is only a few lines more to current code, and this change doesn't break a compatibility.

Do you think, so this patch is acceptable?

Ideas, comments?

Aside from fixing the comments... I think this needs more tests on corner cases. For example, what happens when you do

foreach a, b, c in array(array(1,2),array(3,4)) ?

it is relative simple behave -- empty values are NULL

array(1,2),array(3,4) -- do you think ARRAY[[1,2],[3,4]] is effectively ARRAY[1,2,3,4]
 

Or the opposite case of

foreach a,b in array(array(1,2,3))

Also, what about:

foreach a,b in '{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[] ?


 postgres=# select array(select unnest('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]));
       array       
-------------------
 {1,2,3,4,5,6,7,8}
(1 row)

so it generate pairs {1,2}{3,4},{5,6},{7,8}

I fixed situation when array has not enough elements.


This no longer applies due to conflicts in src/pl/plpgsql/src/pl_exec.c caused by e524cbdc45ec6d677b1dd49

Also, what is the relationship of this patch to the row_to_array patch?  Are they independent, or does one depend on the other?  row_to_array by itself applies but doesn't compile.

Cheers,

Jeff

Re: proposal: row_to_array function

From: Pavel Stehule
Hi

here is rebased patch.

It contains both patches - row_to_array function and foreach array support.

This design is consistent with the hstore functions; there can be good synergy there.
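
For reference, the hstore functions in question (hstore extension assumed):

select hstore_to_array('a=>1, b=>2'::hstore);    -- {a,1,b,2}        the flat key/value array
select hstore_to_matrix('a=>1, b=>2'::hstore);   -- {{a,1},{b,2}}    the 2D variant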

Regards

Pavel

2015-03-28 23:53 GMT+01:00 Jeff Janes <jeff.janes@gmail.com>:
On Tue, Jan 27, 2015 at 10:58 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

2015-01-27 11:41 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:


2015-01-26 21:44 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/25/15 4:23 AM, Pavel Stehule wrote:

I tested a concept iteration over array in format [key1, value1, key2, value2, .. ] - what is nice, it works for [[key1,value1],[key2, value2], ...] too

It is only a few lines more to current code, and this change doesn't break a compatibility.

Do you think, so this patch is acceptable?

Ideas, comments?

Aside from fixing the comments... I think this needs more tests on corner cases. For example, what happens when you do

foreach a, b, c in array(array(1,2),array(3,4)) ?

it is relative simple behave -- empty values are NULL

array(1,2),array(3,4) -- do you think ARRAY[[1,2],[3,4]] is effectively ARRAY[1,2,3,4]
 

Or the opposite case of

foreach a,b in array(array(1,2,3))

Also, what about:

foreach a,b in '{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[] ?


 postgres=# select array(select unnest('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]));
       array       
-------------------
 {1,2,3,4,5,6,7,8}
(1 row)

so it generate pairs {1,2}{3,4},{5,6},{7,8}

I fixed situation when array has not enough elements.


This no longer applies due to conflicts in src/pl/plpgsql/src/pl_exec.c caused by e524cbdc45ec6d677b1dd49

Also, what is the relationship of this patch to the row_to_array patch?  Are they independent, or does one depend on the other?  row_to_array by itself applies but doesn't compile.

Cheers,

Jeff

Attachments

Re: proposal: row_to_array function

From: Tom Lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
> here is rebased patch.
> It contains both patches - row_to_array function and foreach array support.

While I don't have a problem with hstore_to_array, I don't think that
row_to_array is a very good idea; it's basically encouraging people to
throw away SQL datatypes altogether and imagine that everything is text.
They've already bought into that concept if they are using hstore or
json, so smashing elements of those containers to text is not a problem.
But that doesn't make this version a good thing.

(In any case, those who insist can get there through row_to_json, no?)

Also, could we please *not* mix up these two very independent features?
"foreach array" as implemented here may or may not be a good thing, but
it should get its own discussion.
        regards, tom lane



Re: proposal: row_to_array function

From: Pavel Stehule


2015-03-29 20:27 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> here is rebased patch.
> It contains both patches - row_to_array function and foreach array support.

While I don't have a problem with hstore_to_array, I don't think that
row_to_array is a very good idea; it's basically encouraging people to
throw away SQL datatypes altogether and imagine that everything is text.

This is a complement to the ARRAY API - we have row_to_json, and will probably have row_to_jsonb and row_to_hstore, so "row_to_array" is relatively logical. Casting to text is not fast, but on the other hand, working with text arrays is fast.

I know that casting to text is a problem, but if you iterate over a record's fields, then you have to find a common shared type because plans are shared - and text arrays can be a simple solution.

Now, with the current possibilities, I would use a full SQL expression SELECT key, value FROM each(hstore(ROW)), or FOREACH ... IN ARRAY hstore_to_matrix(hstore(ROW)).

row_to_array(ROW) can reduce the hstore overhead.

Any other solution based on PL/Perl or PL/Python is slower due to PL engine startup and the same transformation to some form of structured text.
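
A minimal sketch of that second spelling using only what exists today (hstore extension assumed; foo is an illustrative table):

do $$
declare
  t  foo%rowtype;
  kv text[];   -- one {key, value} pair per iteration
begin
  select * into t from foo limit 1;
  foreach kv slice 1 in array hstore_to_matrix(hstore(t)) loop
    raise notice 'key: %, value: %', kv[1], kv[2];
  end loop;
end;
$$ language plpgsql;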


 
They've already bought into that concept if they are using hstore or
json, so smashing elements of those containers to text is not a problem.
But that doesn't make this version a good thing.

(In any case, those who insist can get there through row_to_json, no?)

Also, could we please *not* mix up these two very independent features?
"foreach array" as implemented here may or may not be a good thing, but
it should get its own discussion.

ok, I'll send two patches.
 

                        regards, tom lane

Re: proposal: row_to_array function

From: Pavel Stehule


2015-03-29 21:20 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:


2015-03-29 20:27 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> here is rebased patch.
> It contains both patches - row_to_array function and foreach array support.

While I don't have a problem with hstore_to_array, I don't think that
row_to_array is a very good idea; it's basically encouraging people to
throw away SQL datatypes altogether and imagine that everything is text.

This is complementation of ARRAY API - we have row_to_json, probably will have row_to_jsonb, row_to_hstore and "row_to_array" is relative logical.  Casting to text is not fast, but on second hand - working with text arrays is fast.

I know so casting to text is a problem, but if you iterate over record's fields, then you have to find common shared type due sharing plans - and text arrays can be simple solution.

Now, with current possibilities I'll do full sql expression SELECT key, value FROM each(hstore(ROW)) or FOREACH ARRAY hstore_to_matrix(hstore(ROW))

row_to_array(ROW) can reduce a hstore overhead

any other solution based on PL/Perl or PL/Python are slower due PL engine start and due same transformation to some form of structured text.


 
They've already bought into that concept if they are using hstore or
json, so smashing elements of those containers to text is not a problem.
But that doesn't make this version a good thing.

(In any case, those who insist can get there through row_to_json, no?)

Also, could we please *not* mix up these two very independent features?
"foreach array" as implemented here may or may not be a good thing, but
it should get its own discussion.

ok, I'll send two patches.

The attachments contain the previous patch separated into two independent patches.

Regards

Pavel

 
 

                        regards, tom lane


Attachments

Re: proposal: row_to_array function

From: Merlin Moncure
On Sun, Mar 29, 2015 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> here is rebased patch.
>> It contains both patches - row_to_array function and foreach array support.
>
> While I don't have a problem with hstore_to_array, I don't think that
> row_to_array is a very good idea; it's basically encouraging people to
> throw away SQL datatypes altogether and imagine that everything is text.
> They've already bought into that concept if they are using hstore or
> json, so smashing elements of those containers to text is not a problem.
> But that doesn't make this version a good thing.
>
> (In any case, those who insist can get there through row_to_json, no?)

You have a point.  What does attached do that to_json does not do
besides completely discard type information?  Our json api is pretty
rich and getting richer.  For better or ill, we dumped all json
support into the already stupendously bloated public namespace and so
it's always available.

merlin



Re: proposal: row_to_array function

From: Brendan Jurd
On Thu, 2 Apr 2015 at 05:00 Merlin Moncure <mmoncure@gmail.com> wrote:
On Sun, Mar 29, 2015 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> While I don't have a problem with hstore_to_array, I don't think that
> row_to_array is a very good idea; it's basically encouraging people to
> throw away SQL datatypes altogether and imagine that everything is text.
...
>
> (In any case, those who insist can get there through row_to_json, no?)

You have a point.  What does attached do that to_json does not do
besides completely discard type information?
 
FWIW, I think row_to_array is nice, and I would make use of it.  If you have a record, and you want to iterate over its fields in a generic way, at least IMO converting to a text array is an obvious thing to reach for, and it makes for very clearly intentioned code.  While it's true that you could go through JSON or hstore to achieve much the same thing, it is a bit of a circumlocution.

I get Tom's point that smashing to text should not be done frivolously, but there are circumstances when it's a reasonable move.  Is it possible that it might be used unwisely?  Yes, but then you could say that about pretty much everything.

Would it alleviate your concerns at all if the function was named row_to_text_array, to stress the fact that you are throwing away data types?

If the patch was invasive, I would probably not support it, but from what I can see it's a pretty cheap add.

Cheers,
BJ

Re: proposal: row_to_array function

From: Craig Ringer
On 2 April 2015 at 01:59, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Sun, Mar 29, 2015 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>> here is rebased patch.
>>> It contains both patches - row_to_array function and foreach array support.
>>
>> While I don't have a problem with hstore_to_array, I don't think that
>> row_to_array is a very good idea; it's basically encouraging people to
>> throw away SQL datatypes altogether and imagine that everything is text.
>> They've already bought into that concept if they are using hstore or
>> json, so smashing elements of those containers to text is not a problem.
>> But that doesn't make this version a good thing.
>>
>> (In any case, those who insist can get there through row_to_json, no?)
>
> You have a point.  What does attached do that to_json does not do
> besides completely discard type information?  Our json api is pretty
> rich and getting richer.  For better or ill, we dumped all json
> support into the already stupendously bloated public namespace and so
> it's always available.


I can see plenty of utility for a function like Pavel speaks of, but
I'd personally rather see it as a function that returns table (colname
name, coltype regtype, coltypmod integer, coltextvalue text,
colordinal integer) so it can carry more complete information and
there's no need to worry about foreach(array). The main use of a
function that includes text representations of the values would IMO be
using it from plain SQL, rather than PL/PgSQL, when faced with
anonymous records.

I'd find it more useful to have lvalue-expressions for dynamic access
to record fields and a function to get record metadata - field names,
types and typmods. Some kind of "pg_get_record_info(record) returns
table(fieldname text, fieldtype regtype, fieldtypmod integer)" and a
PL/PgSQL lvalue-expression for record field access like
"RECORD_FIELD(therecord, fieldname)". I would _certainly_ want to be
able to get the type metadata without the values.

That way you could interact natively with the fields in their true
types, without forcing conversion into and out of 'text', which is a
known performance pain-point with PL/PgSQL. (PL/PgSQL doesn't have a
VARIANT type or support for using 'anyelement', which would be the
other way to solve the type flattening problem IMO).

Think:

DECLARE
    myrow record;
    fi record;
BEGIN
    EXECUTE user_supplied_dynamic_query INTO myrow;
    FOR fi IN
        SELECT fieldname, fieldtype, fieldtypmod
        FROM pg_get_record_info(myrow)
    LOOP
        IF fi.fieldtype == 'int4'::regtype THEN
            RECORD_FIELD(myrow, fi.fieldname) := RECORD_FIELD(myrow, fi.fieldname) + 1;
        END IF;
    END LOOP;
END;


OK, so it's a stupid example - increment all int4 fields by one. It
conveys the rough idea though - native use of the field types.

Note that RECORD_FIELD is distinct from the existing support for
   EXECUTE format('SELECT $1.%I', fieldname) USING therecord;

in that that approach doesn't work for all ways that a record can be
produced, it's slow, it doesn't have a good way to enumerate field
names, and there's no equivalent to write to the field. Current
approaches for that are ghastly:
http://stackoverflow.com/q/7711432/398670 .
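
For completeness, a minimal runnable sketch of that EXECUTE/format technique, shown here with a named rowtype (pg_class), where it does work; the caveats above apply to plain record variables:

do $$
declare
  r pg_class%rowtype;
  fieldname  text := 'relname';
  fieldvalue text;
begin
  select * into r from pg_class limit 1;
  execute format('SELECT ($1).%I::text', fieldname) into fieldvalue using r;
  raise notice '% = %', fieldname, fieldvalue;
end;
$$ language plpgsql;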






--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: proposal: row_to_array function

From: Pavel Stehule
Hi

2015-06-22 5:18 GMT+02:00 Craig Ringer <craig@2ndquadrant.com>:
On 2 April 2015 at 01:59, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Sun, Mar 29, 2015 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>> here is rebased patch.
>>> It contains both patches - row_to_array function and foreach array support.
>>
>> While I don't have a problem with hstore_to_array, I don't think that
>> row_to_array is a very good idea; it's basically encouraging people to
>> throw away SQL datatypes altogether and imagine that everything is text.
>> They've already bought into that concept if they are using hstore or
>> json, so smashing elements of those containers to text is not a problem.
>> But that doesn't make this version a good thing.
>>
>> (In any case, those who insist can get there through row_to_json, no?)
>
> You have a point.  What does attached do that to_json does not do
> besides completely discard type information?  Our json api is pretty
> rich and getting richer.  For better or ill, we dumped all json
> support into the already stupendously bloated public namespace and so
> it's always available.


I can see plenty of utility for a function like Pavel speaks of, but
I'd personally rather see it as a function that returns table (colname
name, coltype regtype, coltypmod integer, coltextvalue text,
colordinal integer) so it can carry more complete information and
there's no need to worry about foreach(array). The main use of a
function that includes text representations of the values would IMO be
using it from plain SQL, rather than PL/PgSQL, when faced with
anonymous records.

I'd find it more useful to have lvalue-expressions for dynamic access
to record fields and a function to get record metadata - field names,
types and typmods. Some kind of "pg_get_record_info(record) returns
table(fieldname text, fieldtype regtype, fieldtypmod integer)" and a
PL/PgSQL lvalue-expression for record field access like
"RECORD_FIELD(therecord, fieldname)". I would _certainly_ want to be
able to get the type metadata without the values.

That way you could interact natively with the fields in their true
types, without forcing conversion into and out of 'text', which is a
known performance pain-point with PL/PgSQL. (PL/PgSQL doesn't have a
VARIANT type or support for using 'anyelement', which would be the
other way to solve the type flattening problem IMO).

Think:

DECLARE
    myrow record;
    fi record;
BEGIN
    EXECUTE user_supplied_dynamic_query INTO myrow;
    FOR fi IN
        SELECT fieldname, fieldtype, fieldtypmod
        FROM pg_get_record_info(myrow)
    LOOP
        IF fi.fieldtype == 'int4'::regtype THEN
            RECORD_FIELD(myrow, fi.fieldname) := RECORD_FIELD(myrow,
fi.fieldname) + 1;
        END IF;
    END LOOP;
END;

I think this is a separate task, and one that cannot be solved simply either. I wrote a set of functions for working with records (https://github.com/okbob/pltoolbox/blob/master/record.c), but it doesn't solve the basic issues:

1. speed - FOR ... IN SELECT ... FROM is more expensive than just unpacking a row or record
2. the unclean game of creating more code paths for every special type.

I have a little bit different idea. FOREACH ... IN RECORD could change the type of an automatic variable in every iteration. Internally we would create more code paths - so your code could be rewritten to:

FOREACH key, val IN RECORD myrow
LOOP
  IF pg_typeof(val) IN ('int4', 'double precision', 'numeric') THEN
    val := val + 1; -- these variables can be mutable
    -- or maybe in futore
   myrow[key] := val + 1;
  END IF;
END LOOP;

What is important - "val" is an automatic variable, and it can have a different type in each step.

It is a little bit strange, but otherwise impossible to solve - we cannot support row[var] as a right-side value (without immutable casting), but we can do it as a left-side value.





OK, so it's a stupid example - increment all int4 fields by one. It
conveys the rough idea though - native use of the field types.

Note that RECORD_FIELD is distinct from the existing support for

    EXECUTE format('SELECT $1.%I', fieldname) USING therecord;

in that that approach doesn't work for all ways that a record can be
produced, it's slow, it doesn't have a good way to enumerate field
names, and there's no equivalent to write to the field. Current
approaches for that are ghastly:
http://stackoverflow.com/q/7711432/398670 .






--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: proposal: row_to_array function

From: Jim Nasby
On 6/22/15 2:46 AM, Pavel Stehule wrote:
>
> FOREACH key, val IN RECORD myrow
> LOOP
>    IF pg_typeof(val) IN ('int4', 'double precision', 'numeric') THEN
>      val := val + 1; -- these variables can be mutable
>      -- or maybe in futore
>     myrow[key] := val + 1;
>    END IF;
> END LOOP;
>
> What is important - "val" is automatic variable, and it can has
> different type in any step.
>
> It is little bit strange, but impossible to solve, so we cannot to
> support row[var] as right value (without immutable casting). But we can
> do it with left value.

Actually, you can (theoretically) solve it for the right value as well,
if val is an actual type and you have operators on that type that
know to search for a specific operator given the actual types that are 
involved. So if val is int4, val + 1 becomes int4 + int4.

The problem I've run into with this is by the time you've added enough 
casts to make this workable you've probably created a situation where 
val + something is going to recurse back to itself. I've partially 
solved this in [1], and intend to finish it by calling back in via SPI 
to do the final resolution, the same way the RI triggers do.

What would be a lot better is if we had better control over function and 
operator resolution.

[1] 

https://github.com/decibel/variant/commit/2b99067744a405da8a325de1ebabd213106f794f#diff-8aa2db4a577ee4201d6eb9041c2a457eR846
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: proposal: row_to_array function

From: Pavel Stehule


2015-06-23 1:56 GMT+02:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 6/22/15 2:46 AM, Pavel Stehule wrote:

FOREACH key, val IN RECORD myrow
LOOP
   IF pg_typeof(val) IN ('int4', 'double precision', 'numeric') THEN
     val := val + 1; -- these variables can be mutable
     -- or maybe in futore
    myrow[key] := val + 1;
   END IF;
END LOOP;

What is important - "val" is automatic variable, and it can has
different type in any step.

It is little bit strange, but impossible to solve, so we cannot to
support row[var] as right value (without immutable casting). But we can
do it with left value.

Actually, you can (theoretically) solve it for the right value as well with if val is an actual type and you have operators on that type that know to search for a specific operator given the actual types that are involved. So if val is int4, val + 1 becomes int4 + int4.

The problem I've run into with this is by the time you've added enough casts to make this workable you've probably created a situation where val + something is going to recurse back to itself. I've partially solved this in [1], and intend to finish it by calling back in via SPI to do the final resolution, the same way the RI triggers do.

What would be a lot better is if we had better control over function and operator resolution.

[1] https://github.com/decibel/variant/commit/2b99067744a405da8a325de1ebabd213106f794f#diff-8aa2db4a577ee4201d6eb9041c2a457eR846

The solution with dynamic operators changes the philosophy by about 180° - and I am afraid about performance.

Now that I am thinking about the possibilities - it is probably solvable on the right side too. It needs two steps:

1. a parametrized record reference syntax - something like SELECT $1[$]
2. the possibility to throw away the cached plan if the result has a different type than the cache expects.


Pavel

 
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX

Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: proposal: row_to_array function

From: Jim Nasby
On 6/23/15 9:45 AM, Pavel Stehule wrote:
>
> 2015-06-23 1:56 GMT+02:00 Jim Nasby <Jim.Nasby@bluetreble.com
> <mailto:Jim.Nasby@bluetreble.com>>:
>
>     On 6/22/15 2:46 AM, Pavel Stehule wrote:
>
>
>         FOREACH key, val IN RECORD myrow
>         LOOP
>             IF pg_typeof(val) IN ('int4', 'double precision', 'numeric')
>         THEN
>               val := val + 1; -- these variables can be mutable
>               -- or maybe in futore
>              myrow[key] := val + 1;
>             END IF;
>         END LOOP;
>
>         What is important - "val" is automatic variable, and it can has
>         different type in any step.
>
>         It is little bit strange, but impossible to solve, so we cannot to
>         support row[var] as right value (without immutable casting). But
>         we can
>         do it with left value.
>
>
>     Actually, you can (theoretically) solve it for the right value as
>     well with if val is an actual type and you have operators on that
>     type that know to search for a specific operator given the actual
>     types that are involved. So if val is int4, val + 1 becomes int4 + int4.
>
>     The problem I've run into with this is by the time you've added
>     enough casts to make this workable you've probably created a
>     situation where val + something is going to recurse back to itself.
>     I've partially solved this in [1], and intend to finish it by
>     calling back in via SPI to do the final resolution, the same way the
>     RI triggers do.
>
>     What would be a lot better is if we had better control over function
>     and operator resolution.
>
>     [1]
>
https://github.com/decibel/variant/commit/2b99067744a405da8a325de1ebabd213106f794f#diff-8aa2db4a577ee4201d6eb9041c2a457eR846
>
>
> The solution of dynamic operators changes philosophy about 180° - and I
> afraid about a performance.
>
> Now if I am thinking about possibilities - probably it is solvable on
> right side too. It needs to solve two steps:
>
> 1. parametrized record reference syntax - some like SELECT $1[$]
> 2. possibility to throw plan cache, if result has different type than is
> expected in cache.

Well, the other option is we allow for cases where we don't know in 
advance what the type will be. That would handle this, JSON, variant, 
and possibly some other scenarios.

BTW, I think this relates to the desire to be able to do more OO-ish 
things in the database. Like "do X to all elements in this array". And 
to have actual classes, private members, real arrays of arrays. It seems 
like there's a bigger need here that's only being addressed piecemeal. :/
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: proposal: row_to_array function

From: Merlin Moncure
On Tue, Jun 23, 2015 at 2:57 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 6/23/15 9:45 AM, Pavel Stehule wrote:
>> 1. parametrized record reference syntax - some like SELECT $1[$]
>> 2. possibility to throw plan cache, if result has different type than is
>> expected in cache.
>
>
> Well, the other option is we allow for cases where we don't know in advance
> what the type will be. That would handle this, JSON, variant, and possibly
> some other scenarios.
>
> BTW, I think this relates to the desire to be able to do more OO-ish things
> in the database. Like "do X to all elements in this array". And to have
> actual classes, private members, real arrays of arrays. It seems like
> there's a bigger need here that's only being addressed piecemeal. :/

I would rephrase that to: "do X to all fields of an object".
Array handling is pretty good now (minus arrays of arrays, but arrays
of objects containing arrays is 'good enough' for most real world
cases).  We've suffered for a while now with hstore/json as a
temporary container to handle operations that are not well supported
by postgres's particularly strongly typed flavor SQL.   The "OO" of
postgres has been gradually diluting away; it's not a 'object
relational' database anymore and the OO features, very much a product
of the silly 90's OO hysteria, have been recast into more useful
features like inheritance and/or pruned back.

I don't mind having to push everything to jsonb and back for tuple
manipulation and I expect that's how these types of things are going
to be done moving forwards. jsonb has clearly caught a bid judging by
what I'm reading in the blogosphere and will continue to accrete
features like this.
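
A sketch of that jsonb round trip - the || concatenation needs 9.5, and foo(a int, b text) with the patched value is illustrative only:

-- row -> jsonb, patch one field by name, jsonb -> row again
select p.*
from foo f,
     lateral jsonb_populate_record(
         null::foo,
         to_jsonb(f) || '{"b": "patched"}'::jsonb
     ) as p;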

merlin



Re: proposal: row_to_array function

From: Pavel Stehule


2015-06-23 21:57 GMT+02:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 6/23/15 9:45 AM, Pavel Stehule wrote:

2015-06-23 1:56 GMT+02:00 Jim Nasby <Jim.Nasby@bluetreble.com
<mailto:Jim.Nasby@bluetreble.com>>:


    On 6/22/15 2:46 AM, Pavel Stehule wrote:


        FOREACH key, val IN RECORD myrow
        LOOP
            IF pg_typeof(val) IN ('int4', 'double precision', 'numeric')
        THEN
              val := val + 1; -- these variables can be mutable
              -- or maybe in futore
             myrow[key] := val + 1;
            END IF;
        END LOOP;

        What is important - "val" is automatic variable, and it can has
        different type in any step.

        It is little bit strange, but impossible to solve, so we cannot to
        support row[var] as right value (without immutable casting). But
        we can
        do it with left value.


    Actually, you can (theoretically) solve it for the right value as
    well with if val is an actual type and you have operators on that
    type that know to search for a specific operator given the actual
    types that are involved. So if val is int4, val + 1 becomes int4 + int4.

    The problem I've run into with this is by the time you've added
    enough casts to make this workable you've probably created a
    situation where val + something is going to recurse back to itself.
    I've partially solved this in [1], and intend to finish it by
    calling back in via SPI to do the final resolution, the same way the
    RI triggers do.

    What would be a lot better is if we had better control over function
    and operator resolution.

    [1]
    https://github.com/decibel/variant/commit/2b99067744a405da8a325de1ebabd213106f794f#diff-8aa2db4a577ee4201d6eb9041c2a457eR846


The solution of dynamic operators changes philosophy about 180° - and I
afraid about a performance.

Now if I am thinking about possibilities - probably it is solvable on
right side too. It needs to solve two steps:

1. parametrized record reference syntax - some like SELECT $1[$]
2. possibility to throw plan cache, if result has different type than is
expected in cache.

Well, the other option is we allow for cases where we don't know in advance what the type will be. That would handle this, JSON, variant, and possibly some other scenarios.

BTW, I think this relates to the desire to be able to do more OO-ish things in the database. Like "do X to all elements in this array". And to have actual classes, private members, real arrays of arrays. It seems like there's a bigger need here that's only being addressed piecemeal. :/

I would not open this box - and I would not throw away or redesign almost the whole PostgreSQL type handling system. I am sure it is not necessary. A PL can be relatively static if the dynamic part is covered by the query language. The few features can be implemented without the necessity of redesigning everything. There are still other PLs - and we are not forced to design a new Perl, JavaScript, ...


--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: proposal: row_to_array function

From: Jim Nasby
On 6/23/15 3:22 PM, Merlin Moncure wrote:
> I would rephrase that to: "do X to all fields of an object".
> Array handling is pretty good now (minus arrays of arrays, but arrays

Except that still won't make it easy to do something to each element of 
an array in SQL, which I think would be nice to have.

> of objects containing arrays is 'good enough' for most real world
> cases).  We've suffered for a while now with hstore/json as a
> temporary container to handle operations that are not well supported
> by postgres's particularly strongly typed flavor SQL.   The "OO" of
> postgres has been gradually diluting away; it's not a 'object
> relational' database anymore and the OO features, very much a product
> of the silly 90's OO hysteria, have been recast into more useful
> features like inheritance and/or pruned back.

Admittedly I've never played with an OO database, but I think our data 
features are pretty good [1]. Where I do think we can improve though is 
developing/coding things in the database. For example, I'd love to have 
the equivalent to a class. Perhaps that could be accomplished by 
allowing multiple instances of an extension. I'd also like stronger 
support for private objects (permissions don't really fit that bill).

> I don't mind having to push everything to jsonb and back for tuple
> manipulation and I expect that's how these types of things are going
> to be done moving forwards. jsonb has clearly caught a bid judging by
> what I'm reading in the blogosphere and will continue to accrete
> features things like this.

I think it's unfortunate to lose the strong typing that we have. That 
can be especially important for something like numbers (was it 
originally a float or a numeric?). But maybe JSON is good enough.


[1] The one OO-ish data feature I'd like is the ability to de-reference 
a foreign key "pointer". So if

CREATE TABLE b( a_id int REFERENCES a);

then have

SELECT a_id.some_field FROM b;

transform to

SELECT a.some_field FROM b JOIN a ...;
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: proposal: row_to_array function

From: Jim Nasby
On 6/23/15 3:40 PM, Pavel Stehule wrote:
>     BTW, I think this relates to the desire to be able to do more OO-ish
>     things in the database. Like "do X to all elements in this array".
>     And to have actual classes, private members, real arrays of arrays.
>     It seems like there's a bigger need here that's only being addressed
>     piecemeal. :/
>
>
> I would not to open this box - and I would not to throw or redesign
> almost all PostgreSQL type handling system. I am sure, so it is not
> necessary. PL can be relative static if the dynamic is covered by query
> language. The few features can implemented without to necessity to
> redesign all. Still there are other PL - and we have not force to design
> new Perl, JavaScript, ...

By that argument why are we putting it into plpgsql either? You can 
easily do the stuff we've been talking about in plperl (and presumably 
most other pl's). So why mess around with adding it to plpgsql?

More importantly, these are things that would be extremely useful at the 
SQL level. When it comes to records for example, we frequently know 
exactly what's in them, so why do we force users to statically specify 
that at the SQL level? This is why we don't support pivot tables (which 
in the BI world is a Big Deal).

I think it's a mistake to try and solve this strictly through plpgsql 
without recognizing the larger desire and trying to move the ball that 
direction. I'm not saying a first effort should boil the ocean, but if 
we keep piecemealing this without more thought we're going to keep 
getting more warts (like a lot of the gotchas we have with arrays).
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: proposal: row_to_array function

From: Merlin Moncure
On Tue, Jun 23, 2015 at 3:45 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 6/23/15 3:22 PM, Merlin Moncure wrote:
>>
>> I would rephrase that to: "do X to all fields of an object".
>> Array handling is pretty good now (minus arrays of arrays, but arrays
>
>
> Except that still won't make it easy to do something to each element of an
> array in SQL, which I think would be nice to have.

Maybe, or maybe we're framing the problem incorrectly.  To me, it's
not really all that difficult to do:
select foo(x) from unnest(bar) x;

Unless you have to maintain state inside of foo(), in which case I'd
probably use the highly underutilized 'window over custom aggregate'
technique.
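
A minimal sketch of that technique - a toy aggregate (re-implementing sum, purely for illustration) run as a window function over the unnested array, so each element sees the state accumulated so far:

create function running_sum_sfunc(state int, x int) returns int
language sql immutable as 'select coalesce(state, 0) + x';

create aggregate running_sum(int) (sfunc = running_sum_sfunc, stype = int);

select x,
       running_sum(x) over (order by ord) as state_so_far   -- each row sees the accumulated state
from unnest(array[1,2,3,4]) with ordinality as t(x, ord);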

merlin