Обсуждение: proposal - function string_to_table

Поиск

Список

Период

Сортировка

proposal - function string_to_table

От

Pavel Stehule

Дата:

17 апреля 2020 г., 20:47:15

Hi

I propose new function string_to_table. This function is significantly faster (and simpler) variant of regexp_split_to_array function. There was same process years ago when we implemented string_agg as faster variant of array_to_string(array_agg()). string_to_table is faster variant (and little bit more intuitive alternative of unnest(string_to_array()).

string_to_table is about 15% faster than unnest(string_to_array()) and about 40% faster than regexp_split_to_array.

Initial patch is attached

Notes, comments?

Regards

Pavel

Вложения

string_to_table.patch

Re: proposal - function string_to_table

От

Justin Pryzby

Дата:

18 апреля 2020 г., 00:29:51

On Fri, Apr 17, 2020 at 07:47:15PM +0200, Pavel Stehule wrote:
> I propose new function string_to_table. This function is significantly

+1

> +/*
> + * Add text to result set (table or array). Build a table when set is a expected or build
> + * a array

as expected (??)
*an* array

> +select string_to_table('abc', '', 'abc');
> + string_to_table 
> +-----------------
> + 
> +(1 row)

Maybe you should \pset null '(null)' for this

-- 
Justin

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

18 апреля 2020 г., 06:45:22

pá 17. 4. 2020 v 23:29 odesílatel Justin Pryzby <pryzby@telsasoft.com> napsal:

On Fri, Apr 17, 2020 at 07:47:15PM +0200, Pavel Stehule wrote:
> I propose new function string_to_table. This function is significantly

+1

> +/*
> + * Add text to result set (table or array). Build a table when set is a expected or build
> + * a array

as expected (??)
*an* array

I tried to fix this comment

> +select string_to_table('abc', '', 'abc');
> + string_to_table
> +-----------------
> +
> +(1 row)

Maybe you should \pset null '(null)' for this

changing NULL output can break lot of existing tests, but I add second column with info about null

+select string_to_table('1,2,3,4,*,6', ',', '*'), string_to_table('1,2,3,4,*,6', ',', '*') IS NULL;
+ string_to_table | ?column?
+-----------------+----------
+ 1 | f
+ 2 | f
+ 3 | f
+ 4 | f
+ | t
+ 6 | f
+(6 rows)

Regards

Pavel

--
Justin

Вложения

string_to_table-20200418.patch

Re: proposal - function string_to_table

От

"movead.li@highgo.ca"

Дата:

04 июня 2020 г., 12:49:25

+{ oid => '2228', descr => 'split delimited text',

+ proname => 'string_to_table', prorows => '1000', proretset => 't',

+ prorettype => 'text', proargtypes => 'text text',

+ prosrc => 'text_to_table' },

+{ oid => '2282', descr => 'split delimited text with null string',

+ proname => 'string_to_table', prorows => '1000', proretset => 't',

+ prorettype => 'text', proargtypes => 'text text text',

+ prosrc => 'text_to_table_null' },

I go through the patch, and everything looks good to me. But I do not know

why it needs a 'text_to_table_null()', it's ok to put a 'text_to_table' there, I think.

Regards,

Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

05 июня 2020 г., 14:55:58

Hi

čt 4. 6. 2020 v 11:49 odesílatel movead.li@highgo.ca <movead.li@highgo.ca> napsal:

+{ oid => '2228', descr => 'split delimited text',
+ proname => 'string_to_table', prorows => '1000', proretset => 't',
+ prorettype => 'text', proargtypes => 'text text',
+ prosrc => 'text_to_table' },
+{ oid => '2282', descr => 'split delimited text with null string',
+ proname => 'string_to_table', prorows => '1000', proretset => 't',
+ prorettype => 'text', proargtypes => 'text text text',
+ prosrc => 'text_to_table_null' },

I go through the patch, and everything looks good to me. But I do not know
why it needs a 'text_to_table_null()', it's ok to put a 'text_to_table' there, I think.

It is a convention in Postgres - every SQL unique signature has its own unique internal C function.

I am sending a refreshed patch.

Regards

Pavel

Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Вложения

string_to_table-20200605.patch

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

05 июля 2020 г., 14:30:29

pá 5. 6. 2020 v 13:55 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:

Hi

čt 4. 6. 2020 v 11:49 odesílatel movead.li@highgo.ca <movead.li@highgo.ca> napsal:
+{ oid => '2228', descr => 'split delimited text',
+ proname => 'string_to_table', prorows => '1000', proretset => 't',
+ prorettype => 'text', proargtypes => 'text text',
+ prosrc => 'text_to_table' },
+{ oid => '2282', descr => 'split delimited text with null string',
+ proname => 'string_to_table', prorows => '1000', proretset => 't',
+ prorettype => 'text', proargtypes => 'text text text',
+ prosrc => 'text_to_table_null' },

I go through the patch, and everything looks good to me. But I do not know
why it needs a 'text_to_table_null()', it's ok to put a 'text_to_table' there, I think.

It is a convention in Postgres - every SQL unique signature has its own unique internal C function.

I am sending a refreshed patch.

rebase

Regards

Pavel

Regards

Pavel

Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Вложения

string_to_table-20200705.patch

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

06 июля 2020 г., 08:05:37

ne 5. 7. 2020 v 13:30 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:

pá 5. 6. 2020 v 13:55 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

čt 4. 6. 2020 v 11:49 odesílatel movead.li@highgo.ca <movead.li@highgo.ca> napsal:
+{ oid => '2228', descr => 'split delimited text',
+ proname => 'string_to_table', prorows => '1000', proretset => 't',
+ prorettype => 'text', proargtypes => 'text text',
+ prosrc => 'text_to_table' },
+{ oid => '2282', descr => 'split delimited text with null string',
+ proname => 'string_to_table', prorows => '1000', proretset => 't',
+ prorettype => 'text', proargtypes => 'text text text',
+ prosrc => 'text_to_table_null' },

I go through the patch, and everything looks good to me. But I do not know
why it needs a 'text_to_table_null()', it's ok to put a 'text_to_table' there, I think.

It is a convention in Postgres - every SQL unique signature has its own unique internal C function.

I am sending a refreshed patch.

rebase

two fresh fix

a) remove garbage from patch that breaks doc

b) these functions should not be strict - be consistent with string_to_array functions

Regards

Pavel

Regards

Pavel

Regards

Pavel

Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Вложения

string_to_table-20200706-2.patch

Re: proposal - function string_to_table

От

Peter Smith

Дата:

20 августа 2020 г., 05:06:42

Hi.

I have been looking at the patch: string_to_table-20200706-2.patch

Below are some review comments for your consideration.

====

COMMENT func.sgml (style)

+       <para>
+        splits string into table using supplied delimiter and
+        optional null string.
+       </para>

The format style of the short description is inconsistent with the
other functions.
e.g. Should start with Capital letter.
e.g. Should tag the parameter names properly

Something like:
<para>
Splits <parameter>string</parameter> into a table
using supplied <parameter>delimiter</parameter>
and optional null string <parameter>nullstr</parameter>.
</para>

====

COMMENT func.sgml (what does nullstr do)

The description does not sufficiently describe the purpose/behaviour
of the nullstr.

e.g. Firstly I thought that it meant if 2 consecutive delimiters were
encountered it would substitute this string as the row value. But it
is doing the opposite of what I guessed - if the extracted row value
is the same as nullstr then a NULL row is inserted instead.

====

COMMENT func.sgml (wrong sample output)

+<programlisting>xx
+yy,
+zz</programlisting>

This output is incorrect for the sample given. There is no "yy," in
the output because there is a 'yy' nullstr substitution.

Should be:
---
xx
NULL
zz
---

====

COMMENT func.sgml (related to regexp_split_to_table)

Because this new function is similar to the existing
regexp_split_to_table, perhaps they should cross-reference each other
so a reader of this documentation is made aware of the alternative
function?

====

COMMENT (test cases)

It is impossible to tell difference in the output between empty
strings and nulls currently, so maybe you can change all the tests to
have a form like below so they can be validated properly:

# select v, v IS NULL as "is null" from
string_to_table('a,b,*,c,d,',',','*') g(v);
 v | is null
---+---------
 a | f
 b | f
   | t
 c | f
 d | f
   | f
(6 rows)

or maybe like this is even easier:

# select quote_nullable(string_to_table('a|*||c|d|','|','*'));
 quote_nullable
----------------
 'a'
 NULL
 ''
 'c'
 'd'
 ''
(6 rows)

Something similar was already proposed before [1] but that never got
put into the test code.
[1] https://www.postgresql.org/message-id/CAFj8pRDSzDYmaS06dfMXBfbr8x%2B3xjDJxA5kbL3h8%2BeOGoRUcA%40mail.gmail.com

====

COMMENT (test cases)

There are multiple combinations of the parameters to this function and
MANY different results depending on different values they can take, so
the existing tests only cover a small sample.

I have attached a lot more test scenarios that you may want to include
for better test coverage. Everything seemed to work as expected.

PSA test-cases.pdf

====

COMMENT (accum_result)

+ Datum values[1];
+ bool nulls[1];
+
+ values[0] = PointerGetDatum(result_text);
+ nulls[0] = is_null;

Why not use variables instead of arrays with only 1 element?

====

COMMENT (text_to_array_internal)

+ if (!tstate.astate)
+ PG_RETURN_ARRAYTYPE_P(construct_empty_array(TEXTOID));

Maybe the condition is more readable when expressed as:
if (tstate.astate == NULL)

====

Kind Regards,
Peter Smith.
Fujitsu Australia

Вложения

test-cases.pdf

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

20 августа 2020 г., 22:21:10

Hi

čt 20. 8. 2020 v 4:07 odesílatel Peter Smith <smithpb2250@gmail.com> napsal:

Hi.

I have been looking at the patch: string_to_table-20200706-2.patch

Below are some review comments for your consideration.

====

COMMENT func.sgml (style)

+ <para>
+ splits string into table using supplied delimiter and
+ optional null string.
+ </para>

The format style of the short description is inconsistent with the
other functions.
e.g. Should start with Capital letter.
e.g. Should tag the parameter names properly

Something like:
<para>
Splits <parameter>string</parameter> into a table
using supplied <parameter>delimiter</parameter>
and optional null string <parameter>nullstr</parameter>.
</para>

done

====

COMMENT func.sgml (what does nullstr do)

The description does not sufficiently describe the purpose/behaviour
of the nullstr.

e.g. Firstly I thought that it meant if 2 consecutive delimiters were
encountered it would substitute this string as the row value. But it
is doing the opposite of what I guessed - if the extracted row value
is the same as nullstr then a NULL row is inserted instead.

done

====

COMMENT func.sgml (wrong sample output)

+<programlisting>xx
+yy,
+zz</programlisting>

This output is incorrect for the sample given. There is no "yy," in
the output because there is a 'yy' nullstr substitution.

Should be:
---
xx
NULL
zz
---

fixed

====

COMMENT func.sgml (related to regexp_split_to_table)

Because this new function is similar to the existing
regexp_split_to_table, perhaps they should cross-reference each other
so a reader of this documentation is made aware of the alternative
function?

I wrote new sentence with ref

====

COMMENT (test cases)

It is impossible to tell difference in the output between empty
strings and nulls currently, so maybe you can change all the tests to
have a form like below so they can be validated properly:

# select v, v IS NULL as "is null" from
string_to_table('a,b,*,c,d,',',','*') g(v);
v | is null
---+---------
a | f
b | f
| t
c | f
d | f
| f
(6 rows)

or maybe like this is even easier:

# select quote_nullable(string_to_table('a|*||c|d|','|','*'));
quote_nullable
----------------
'a'
NULL
''
'c'
'd'
''
(6 rows)

I prefer the first variant, it is clean. It is good idea, done

Something similar was already proposed before [1] but that never got
put into the test code.
[1] https://www.postgresql.org/message-id/CAFj8pRDSzDYmaS06dfMXBfbr8x%2B3xjDJxA5kbL3h8%2BeOGoRUcA%40mail.gmail.com

====

COMMENT (test cases)

There are multiple combinations of the parameters to this function and
MANY different results depending on different values they can take, so
the existing tests only cover a small sample.

I have attached a lot more test scenarios that you may want to include
for better test coverage. Everything seemed to work as expected.

ok, merged

PSA test-cases.pdf

====

COMMENT (accum_result)

+ Datum values[1];
+ bool nulls[1];
+
+ values[0] = PointerGetDatum(result_text);
+ nulls[0] = is_null;

Why not use variables instead of arrays with only 1 element?

Technically it is equivalent, but I think so using one element array is more correct, because function heap_form_tuple expects an array. Sure in C language there is no difference between pointer to value or pointer to array, but minimally the name of the argument "values" implies so argument is an array.

This pattern is used more times in Postgres. You can find a fragments where although we know so array has only one field, still we works with array

misc.c

hash.c

execTuples.c

but I can this code simplify little bit - I can use function tuplestore_putvalues(tupstore, tupdesc, values, nulls);

I see, so this code can be reduced more, and I don't need local variables, but I prefer to be consistent with other parts, and I feel better if I pass an array where the array is expected.

This is not extra important, and I can it change, just I think this variant is cleaner little bit

====

COMMENT (text_to_array_internal)

+ if (!tstate.astate)
+ PG_RETURN_ARRAYTYPE_P(construct_empty_array(TEXTOID));

Maybe the condition is more readable when expressed as:
if (tstate.astate == NULL)

done

new patch attached

Thank you for precious review

Regards

Pavel

====

Kind Regards,
Peter Smith.
Fujitsu Australia

Вложения

string_to_table-20200820.patch

Re: proposal - function string_to_table

От

Peter Smith

Дата:

21 августа 2020 г., 10:43:54

On Fri, Aug 21, 2020 at 5:21 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:

> new patch attached

Thanks for taking some of my previous review comments.

I have re-checked the string_to_table_20200820.patch.

Below are some remaining questions/comments:

====

COMMENT (help text)

+        Splits the <parameter>string</parameter> at occurrences
+        of <parameter>delimiter</parameter> and forms the remaining data
+        into a <type>text</type> tavke.

What did you mean by "remaining" in that description?
It gets a bit strange thinking about remaining NULLs, or remaining
empty strings.

Why not just say "... and forms the data into a <type>text</type> table."

---

+        Splits the <parameter>string</parameter> at occurrences
+        of <parameter>delimiter</parameter> and forms the remaining data
+        into a <type>text</type> tavke.

Typo: "tavke." -> "table."

====

COMMENT (help text reference to regexp_split_to_table)

+        input <parameter>string</parameter> can be done by function
+        <function>regexp_split_to_table</function> (see <xref
linkend="functions-posix-regexp"/>).
+       </para>

In the previous review I suggested adding a reference to the
regexp_split_to_table function.
A hyperlink would be a bonus, but maybe it is not possible.

The hyperlink added in the latest patch is to page for POSIX Regular
Expressions, which doesn't seem appropriate.

====

QUESTION (test cases)

Thanks for merging lots of my additional test cases!

Actually, the previous PDF I sent was 2 pages long but you only merged
the tests of page 1.
I wondered was it accidental to omit all those 2nd page tests?

====

QUESTION (function name?)

I noticed that ALL current string functions that use delimiters have
the word "split" in their name.

e.g.
* regexp_split_to_array
* regexp_split_to_table
* split_part

But "string_to_table" is not following this pattern.

Maybe a different choice of function name would be more consistent
with what is already there?
e.g.  split_to_table, string_split_to_table, etc.

====

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

21 августа 2020 г., 12:08:45

pá 21. 8. 2020 v 9:44 odesílatel Peter Smith <smithpb2250@gmail.com> napsal:

On Fri, Aug 21, 2020 at 5:21 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:

> new patch attached

Thanks for taking some of my previous review comments.

I have re-checked the string_to_table_20200820.patch.

Below are some remaining questions/comments:

====

COMMENT (help text)

+ Splits the <parameter>string</parameter> at occurrences
+ of <parameter>delimiter</parameter> and forms the remaining data
+ into a <type>text</type> tavke.

What did you mean by "remaining" in that description?
It gets a bit strange thinking about remaining NULLs, or remaining
empty strings.

Why not just say "... and forms the data into a <type>text</type> table."

---

+ Splits the <parameter>string</parameter> at occurrences
+ of <parameter>delimiter</parameter> and forms the remaining data
+ into a <type>text</type> tavke.

Typo: "tavke." -> "table."

This text is taken from doc for string_to_array

====

COMMENT (help text reference to regexp_split_to_table)

+ input <parameter>string</parameter> can be done by function
+ <function>regexp_split_to_table</function> (see <xref
linkend="functions-posix-regexp"/>).
+ </para>

In the previous review I suggested adding a reference to the
regexp_split_to_table function.
A hyperlink would be a bonus, but maybe it is not possible.

The hyperlink added in the latest patch is to page for POSIX Regular
Expressions, which doesn't seem appropriate.

ok I remove it

====

QUESTION (test cases)

Thanks for merging lots of my additional test cases!

Actually, the previous PDF I sent was 2 pages long but you only merged
the tests of page 1.
I wondered was it accidental to omit all those 2nd page tests?

I'll check it

====

QUESTION (function name?)

I noticed that ALL current string functions that use delimiters have
the word "split" in their name.

e.g.
* regexp_split_to_array
* regexp_split_to_table
* split_part

But "string_to_table" is not following this pattern.

Maybe a different choice of function name would be more consistent
with what is already there?
e.g. split_to_table, string_split_to_table, etc.

I don't agree. This function is twin (with almost identical behaviour) for "string_to_array" function, so I think so the name is correct.

====

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

21 августа 2020 г., 12:45:08

pá 21. 8. 2020 v 11:08 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:

pá 21. 8. 2020 v 9:44 odesílatel Peter Smith <smithpb2250@gmail.com> napsal:
On Fri, Aug 21, 2020 at 5:21 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:

> new patch attached

Thanks for taking some of my previous review comments.

I have re-checked the string_to_table_20200820.patch.

Below are some remaining questions/comments:

====

COMMENT (help text)

+ Splits the <parameter>string</parameter> at occurrences
+ of <parameter>delimiter</parameter> and forms the remaining data
+ into a <type>text</type> tavke.

What did you mean by "remaining" in that description?
It gets a bit strange thinking about remaining NULLs, or remaining
empty strings.

Why not just say "... and forms the data into a <type>text</type> table."

---

+ Splits the <parameter>string</parameter> at occurrences
+ of <parameter>delimiter</parameter> and forms the remaining data
+ into a <type>text</type> tavke.

Typo: "tavke." -> "table."

This text is taken from doc for string_to_array

I fixed typo. I hope and expect so doc will be finalized by native speakers.

====

COMMENT (help text reference to regexp_split_to_table)

+ input <parameter>string</parameter> can be done by function
+ <function>regexp_split_to_table</function> (see <xref
linkend="functions-posix-regexp"/>).
+ </para>

In the previous review I suggested adding a reference to the
regexp_split_to_table function.
A hyperlink would be a bonus, but maybe it is not possible.

The hyperlink added in the latest patch is to page for POSIX Regular
Expressions, which doesn't seem appropriate.

ok I remove it

====

QUESTION (test cases)

Thanks for merging lots of my additional test cases!

Actually, the previous PDF I sent was 2 pages long but you only merged
the tests of page 1.
I wondered was it accidental to omit all those 2nd page tests?

I'll check it

I forgot it - now it is merged. Maybe it is over dimensioned for one function, but it is (at the end) a test of string_to_array function too.

====

QUESTION (function name?)

I noticed that ALL current string functions that use delimiters have
the word "split" in their name.

e.g.
* regexp_split_to_array
* regexp_split_to_table
* split_part

But "string_to_table" is not following this pattern.

Maybe a different choice of function name would be more consistent
with what is already there?
e.g. split_to_table, string_split_to_table, etc.

I don't agree. This function is twin (with almost identical behaviour) for "string_to_array" function, so I think so the name is correct.

Unfortunately - there is not consistency in naming already, But I think so string_to_table is a better name, because this function is almost identical with string_to_array.

Regards

Pavel

====

Kind Regards,
Peter Smith.
Fujitsu Australia

Вложения

string_to_table-20200821.patch

Re: proposal - function string_to_table

От

Peter Smith

Дата:

24 августа 2020 г., 05:18:33

I have re-checked the string_to_table_20200821.patch.

Below is one remaining problem.

====

COMMENT (help text)

+        Splits the <parameter>string</parameter> at occurrences
+        of <parameter>delimiter</parameter> and forms the remaining data
+        into a table with one <type>text</type> type column.
+        If <parameter>delimiter</parameter> is <literal>NULL</literal>,
+        each character in the <parameter>string</parameter> will become a
+        separate element in the array.

Seems like here is a cut/paste error from the string_to_array help text.

"separate element in the array" should say "separate row of the table"

====

>>> Maybe a different choice of function name would be more consistent
>>> with what is already there?
>>> e.g.  split_to_table, string_split_to_table, etc.
>>
>> I don't agree. This function is twin (with almost identical behaviour) for "string_to_array" function, so I think so
thename is correct.
 

OK

====

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

24 августа 2020 г., 19:33:34

po 24. 8. 2020 v 4:19 odesílatel Peter Smith <smithpb2250@gmail.com> napsal:

I have re-checked the string_to_table_20200821.patch.

Below is one remaining problem.

====

COMMENT (help text)

+ Splits the <parameter>string</parameter> at occurrences
+ of <parameter>delimiter</parameter> and forms the remaining data
+ into a table with one <type>text</type> type column.
+ If <parameter>delimiter</parameter> is <literal>NULL</literal>,
+ each character in the <parameter>string</parameter> will become a
+ separate element in the array.

Seems like here is a cut/paste error from the string_to_array help text.

"separate element in the array" should say "separate row of the table"

fixed

====

>>> Maybe a different choice of function name would be more consistent
>>> with what is already there?
>>> e.g. split_to_table, string_split_to_table, etc.
>>
>> I don't agree. This function is twin (with almost identical behaviour) for "string_to_array" function, so I think so the name is correct.

OK

====

please, check attached patch

Regards

Pavel

Kind Regards,
Peter Smith.
Fujitsu Australia

Вложения

string_to_table-20200824.patch

Re: proposal - function string_to_table

От

Peter Smith

Дата:

25 августа 2020 г., 02:19:02

Hi.

I have re-checked the string_to_table_20200824.patch.

====

On Tue, Aug 25, 2020 at 2:34 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:

>> COMMENT (help text)
>>
>> +        Splits the <parameter>string</parameter> at occurrences
>> +        of <parameter>delimiter</parameter> and forms the remaining data
>> +        into a table with one <type>text</type> type column.
>> +        If <parameter>delimiter</parameter> is <literal>NULL</literal>,
>> +        each character in the <parameter>string</parameter> will become a
>> +        separate element in the array.
>>
>> Seems like here is a cut/paste error from the string_to_array help text.
>>
>> "separate element in the array" should say "separate row of the table"
>
>
> fixed
>

No. You wrote "separate row of table". Should say "separate row of the table".

====

QUESTION (pg_proc.dat)

I noticed the oids of the functions are modified in this latest patch.
They seem 1000's away from the next nearest oid.
I was curious about the reason for those particular numbers (8432, 8433)?

====

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

25 августа 2020 г., 09:57:27

út 25. 8. 2020 v 1:19 odesílatel Peter Smith <smithpb2250@gmail.com> napsal:

Hi.

I have re-checked the string_to_table_20200824.patch.

====

On Tue, Aug 25, 2020 at 2:34 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:

>> COMMENT (help text)
>>
>> + Splits the <parameter>string</parameter> at occurrences
>> + of <parameter>delimiter</parameter> and forms the remaining data
>> + into a table with one <type>text</type> type column.
>> + If <parameter>delimiter</parameter> is <literal>NULL</literal>,
>> + each character in the <parameter>string</parameter> will become a
>> + separate element in the array.
>>
>> Seems like here is a cut/paste error from the string_to_array help text.
>>
>> "separate element in the array" should say "separate row of the table"
>
>
> fixed
>

No. You wrote "separate row of table". Should say "separate row of the table".

should be fixed now

====

QUESTION (pg_proc.dat)

I noticed the oids of the functions are modified in this latest patch.
They seem 1000's away from the next nearest oid.
I was curious about the reason for those particular numbers (8432, 8433)?

When you run ./unused_oids script, then you get this message

[pavel@nemesis catalog]$ ./unused_oids
4 - 9
560 - 583
786 - 789
811 - 816
1136 - 1137
2121
2137
2228
3435
3585
4035
4142
4179 - 4180
4198 - 4199
4225 - 4301
4388 - 4401
4450 - 4451
4532 - 4565
4572 - 4999
5097 - 5999
6015 - 6099
6105
6107 - 6109
6116
6122 - 8431
8434 - 8455
8457 - 9999
Patches should use a more-or-less consecutive range of OIDs.
Best practice is to start with a random choice in the range 8000-9999.
Suggested random unused OID: 8973 (1027 consecutive OID(s) available starting here)

For me, this is simple protection against oid collision under development, and I expect so commiters does oid' space defragmentation.

Regards

Pavel

====

Kind Regards,
Peter Smith.
Fujitsu Australia

Вложения

string_to_table-20200825.patch

Re: proposal - function string_to_table

От

Peter Smith

Дата:

25 августа 2020 г., 12:18:26

On Tue, Aug 25, 2020 at 4:58 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
> When you run ./unused_oids script, then you get this message
>
> [pavel@nemesis catalog]$ ./unused_oids
<snip>
> Patches should use a more-or-less consecutive range of OIDs.
> Best practice is to start with a random choice in the range 8000-9999.
> Suggested random unused OID: 8973 (1027 consecutive OID(s) available starting here)
>
> For me, this is simple protection against oid collision under development, and I expect so commiters does oid' space
defragmentation.

I have not used that tool before. Thanks for teaching me!

===

I have re-checked the string_to_table_20200825.patch.

Everything looks good to me now, so I am marking this as "ready for committer".


Kind Regards,
Peter Smith.
Fujitsu Australia

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

25 августа 2020 г., 12:22:44

út 25. 8. 2020 v 11:19 odesílatel Peter Smith <smithpb2250@gmail.com> napsal:

On Tue, Aug 25, 2020 at 4:58 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
> When you run ./unused_oids script, then you get this message
>
> [pavel@nemesis catalog]$ ./unused_oids
<snip>
> Patches should use a more-or-less consecutive range of OIDs.
> Best practice is to start with a random choice in the range 8000-9999.
> Suggested random unused OID: 8973 (1027 consecutive OID(s) available starting here)
>
> For me, this is simple protection against oid collision under development, and I expect so commiters does oid' space defragmentation.

I have not used that tool before. Thanks for teaching me!

:)

===

I have re-checked the string_to_table_20200825.patch.

Everything looks good to me now, so I am marking this as "ready for committer".

Thank you very much :)

Regard

Pavel

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: proposal - function string_to_table

От

Tom Lane

Дата:

03 сентября 2020 г., 01:30:12

Pavel Stehule <pavel.stehule@gmail.com> writes:
> [ string_to_table-20200825.patch ]

I reviewed this, whacked it around a little, and pushed it.

Possibly the most controversial thing I did was to move the existing
documentation entry for string_to_array() into the string-functions
table.  I did not like it one bit that the patch was documenting
string_to_table() far away from string_to_array(), and on reflection
I concluded that you'd picked the right place and the issue here is
that string_to_array() was in the wrong place.

Also, I pared the proposed regression tests a great deal, ending up
with something that matches the existing tests for string_to_array().
The proposed tests seemed mighty duplicative, and they even contained
syntax errors, so I didn't believe that they were carefully considered.

            regards, tom lane

Re: proposal - function string_to_table

От

Peter Smith

Дата:

03 сентября 2020 г., 04:52:57

On Thu, Sep 3, 2020 at 8:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The proposed tests seemed mighty duplicative, and they even contained
> syntax errors, so I didn't believe that they were carefully considered.

Can you please share examples of what syntax errors were in those
previous tests?

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: proposal - function string_to_table

От

Tom Lane

Дата:

03 сентября 2020 г., 04:59:50

Peter Smith <smithpb2250@gmail.com> writes:
> On Thu, Sep 3, 2020 at 8:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The proposed tests seemed mighty duplicative, and they even contained
>> syntax errors, so I didn't believe that they were carefully considered.

> Can you please share examples of what syntax errors were in those
> previous tests?

At about line 415 of string_to_table-20200825.patch:

+select v, v is null as "is null" from string_to_table('1,2,3,4,,6', ',') g(v) g(v);
+ERROR:  syntax error at or near "g"
+LINE 1: ..."is null" from string_to_table('1,2,3,4,,6', ',') g(v) g(v);
+                                                                  ^

Without the duplicate "g(v)", this is identical to the preceding test
case.

            regards, tom lane

Re: proposal - function string_to_table

От

Pavel Stehule

Дата:

03 сентября 2020 г., 06:01:25

čt 3. 9. 2020 v 0:30 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Pavel Stehule <pavel.stehule@gmail.com> writes:
> [ string_to_table-20200825.patch ]

I reviewed this, whacked it around a little, and pushed it.

Possibly the most controversial thing I did was to move the existing
documentation entry for string_to_array() into the string-functions
table. I did not like it one bit that the patch was documenting
string_to_table() far away from string_to_array(), and on reflection
I concluded that you'd picked the right place and the issue here is
that string_to_array() was in the wrong place.

Also, I pared the proposed regression tests a great deal, ending up
with something that matches the existing tests for string_to_array().
The proposed tests seemed mighty duplicative, and they even contained
syntax errors, so I didn't believe that they were carefully considered.

Thank you

Pavel

regards, tom lane