Обсуждение: [PATCH v1] Fix parsing of a complex type that has an array of complex types
Hi hackers, I ran into an issue today when I was trying to insert a complex types where one of its attributes is also an array of complex types, As an example: CREATE TYPE inventory_item AS ( name text, supplier_id integer, price numeric ); CREATE TYPE item_2d AS (id int, items inventory_item[][]); CREATE TABLE item_2d_table (id int, item item_2d); INSERT INTO item_2d_table VALUES(1, '(1,{{("inv a",42,1.99),("inv b",42,1.99)},{("inv c",42,1.99),("inv d",42,2)}})'); The INSERT statement will fail due to how complex types are parsed, I have included a patch in this email to support this scenario. The reason why this fails is because record_in lacks support of detecting an array string when one of the attributes is of type array. Due to this, it will stop processing the column value prematurely, which results in a corrupted value for that particular column. As a result array_in will receive a malformed string which is bound to error. To fix this, record_in can detect columns that are of type array and in such cases leave the input intact. array_in will attempt to extract the elements one by one. In case it is dealing with unquoted elements, the logic needs to slightly change, since if the element is a record, quotes can be allowed, ex: {{("test field")} There are some adjustments that can be made to the patch, for example: We can detect the number of the dimensions of the array in record_in, do we want to error out in case the string has more dimensions than MAXDIM in array.h, (to prevent number over/underflow-ing) or whether we want to error out if number of dimensions is not the same with the number of dimensions that the attribute is supposed to have, or both? Regards, Arjan Marku
Вложения
Arjan Marku <arjanmarku02@gmail.com> writes: > INSERT INTO item_2d_table VALUES(1, '(1,{{("inv a",42,1.99),("inv > b",42,1.99)},{("inv c",42,1.99),("inv d",42,2)}})'); > The INSERT statement will fail due to how complex types are parsed, I > have included a patch in this email to support this scenario. The actual problem with this input is that it's inadequately quoted. The record fields need to be quoted according to the rules in https://www.postgresql.org/docs/current/rowtypes.html#ROWTYPES-IO-SYNTAX and then in the field that is an array, the array elements need to be quoted according to the rules in https://www.postgresql.org/docs/current/arrays.html#ARRAYS-IO and then the innermost record fields need yet another level of quoting (well, you might be able to skip that given that there are no special characters in this particular data, but in general you'd have to). It looks to me like your patch is trying to make array_in and record_in sufficiently aware of each other's rules that the outer quoting could be omitted, but I believe that that's a seriously bad idea. In the first place, it's far from clear that that's even possible without ambiguity (much less that this specific patch does it correctly). In the second place, this will make it even more difficult for these functions to issue on-point error messages for incorrect input. (They're already struggling with that, see e.g. [1].) And in the third place, these functions are nearly unmaintainable spaghetti already. (There have also been complaints that they're too slow, which this won't improve.) We don't need another big dollop of complexity here. Our normal recommendation for handwritten input is to not try to deal with the complications of correctly quoting nested array/record data. Instead use row() and array[] constructors. So you could write something like INSERT INTO item_2d_table VALUES(1, row(1, array[[row('inv a',42,1.99), row('inv b',42,1.99)], [row('inv c',42,1.99), row('inv d',42,2)]]::inventory_item[])); In this particular example we need an explicit cast to cue the parser about the type of the array elements, but then it can cope with casting the outer row() construct automatically. regards, tom lane [1] https://www.postgresql.org/message-id/flat/CACJufxExAcpvrkbLGrZGdZ%3DbFAuj7OVp1mOhk%2BfsBzeUbOGuHQ%40mail.gmail.com
Re: [PATCH v1] Fix parsing of a complex type that has an array of complex types
От
Arjan Marku
Дата:
I agree with all your points as I encountered them myself, especially ambiguity and error handling.
The introduced dependency between those functions also is for me a bad idea but it seemed like the only way to support that use case, but turns out my assumption that the array literal shouldn't be quoted was wrong,
I managed to execute this query fine:
INSERT INTO item_2d_table VALUES(1, '(1,"{{""(\\""inv a\\"",42,1.99)"",""(\\"inv b\\",42,1.99)""},{""(\\""inv c\\"",42,1.99)"",""(\\""inv d\\"",42,2)""}}")');
Thanks for your insights on this,
Kind regards,
Arjan Marku
On 7/15/24 9:15 PM, Tom Lane wrote:
INSERT INTO item_2d_table VALUES(1, '(1,{{("inv a",42,1.99),("inv b",42,1.99)},{("inv c",42,1.99),("inv d",42,2)}})');