Обсуждение: Re: About #13489, array dimensions and CREATE TABLE ... LIKE
On Fri, Sep 8, 2023 at 05:10:51PM -0400, Bruce Momjian wrote: > I knew we only considered the array dimension sizes to be documentation > _in_ the query, but I thought we at least properly displayed the number > of dimensions specified at creation when we described the table in psql, > but it seems we don't do that either. > > A big question is why we even bother to record the dimensions in > pg_attribute if is not accurate for LIKE and not displayed to the user > in a meaningful way by psql. > > I think another big question is whether the structure we are using to > supply the column information to BuildDescForRelation is optimal. The > typmod that has to be found for CREATE TABLE uses: > > typenameTypeIdAndMod(NULL, entry->typeName, &atttypid, &atttypmod); > > which calls typenameTypeIdAndMod() -> typenameType() -> LookupTypeName() > -> LookupTypeNameExtended() -> typenameTypeMod(). This seems very > complicated because the ColumnDef, at least in the LIKE case, already > has the value. Is there a need to revisit how we handle type such > cases? (Bug report moved to hackers, previous bug reporters added CCs.) I looked at this some more and found more fundamental problems. We have pg_attribute.attndims which does record the array dimensionality: CREATE TABLE test (data integer, data_array integer[5][5]); SELECT attndims FROM pg_attribute WHERE attrelid = 'test'::regclass AND attname = 'data_array'; attndims ---------- 2 The first new problem I found is that we don't dump the dimensionality: $ pg_dump test ... CREATE TABLE public.test ( data integer, --> data_array integer[] ); and psql doesn't display the dimensionality: \d test Table "public.test" Column | Type | Collation | Nullable | Default ------------+-----------+-----------+----------+--------- data | integer | | | --> data_array | integer[] | | | A report from 2015 reports that CREATE TABLE ... LIKE and CREATE TABLE ... AS doesn't propagate the dimensionality: https://www.postgresql.org/message-id/flat/20150707072942.1186.98151%40wrigleys.postgresql.org and this thread from 2018 supplied a fix: https://www.postgresql.org/message-id/flat/7862e882-8b9a-0c8e-4a38-40ad374d3634%40brandwatch.com though in my testing it only fixes LIKE, not CREATE TABLE ... AS. This report from April of this year also complains about LIKE: https://www.postgresql.org/message-id/flat/ZD%2B14YZ4IUue8Rhi%40gendo.asyd.net Here is the output from master for LIKE: CREATE TABLE test2 (LIKE test); SELECT attndims FROM pg_attribute WHERE attrelid = 'test2'::regclass AND attname = 'data_array'; attndims ---------- --> 0 and this is the output for CREATE TABLE ... AS: CREATE TABLE test3 AS SELECT * FROM test; SELECT attndims FROM pg_attribute WHERE attrelid = 'test3'::regclass AND attname = 'data_array'; attndims ---------- --> 0 The attached patch fixes pg_dump: $ pg_dump test ... CREATE TABLE public.test2 ( data integer, --> data_array integer[][] ); It uses repeat() at the SQL level rather then modifying format_type() at the SQL or C level. It seems format_type() is mostly used to get the type name, e.g. int4[], rather than the column definition so I added brackets at the call site. I used a similar fix for psql output: \d test Table "public.test" Column | Type | Collation | Nullable | Default ------------+-------------+-----------+----------+--------- data | integer | | | --> data_array | integer[][] | | | The 2018 patch from Alexey Bashtanov fixes the LIKE case: CREATE TABLE test2 (LIKE test); \d test2 Table "public.test2" Column | Type | Collation | Nullable | Default ------------+-------------+-----------+----------+--------- data | integer | | | --> data_array | integer[][] | | | It does not fix CREATE TABLE ... AS because it looks like fixing that would require adding an ndims column to Var for WITH NO DATA and adding ndims to TupleDesc for WITH DATA. I am not sure if that overhead is warrented to fix this item. I have added C comments where they should be added. I would like to apply this patch to master because I think our current deficiencies in this area are unacceptable. An alternate approach would be to remove pg_attribute.attndims so we don't even try to preserve dimensionality. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Вложения
Bruce Momjian <bruce@momjian.us> writes: > I would like to apply this patch to master because I think our current > deficiencies in this area are unacceptable. I do not think this is a particularly good idea, because it creates the impression in a couple of places that we track this data, when we do not really do so to any meaningful extent. > An alternate approach would > be to remove pg_attribute.attndims so we don't even try to preserve > dimensionality. I could get behind that, perhaps. It looks like we're not using the field in any meaningful way, and we could simplify TupleDescInitEntry and perhaps some other APIs. regards, tom lane
On Mon, Nov 20, 2023 at 09:04:21PM -0500, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > I would like to apply this patch to master because I think our current > > deficiencies in this area are unacceptable. > > I do not think this is a particularly good idea, because it creates > the impression in a couple of places that we track this data, when > we do not really do so to any meaningful extent. Okay, I thought we could get by without tracking the CREATE TABLE AS case, but it is inconsistent. My patch just makes it less inconsistent. > > An alternate approach would > > be to remove pg_attribute.attndims so we don't even try to preserve > > dimensionality. > > I could get behind that, perhaps. It looks like we're not using the > field in any meaningful way, and we could simplify TupleDescInitEntry > and perhaps some other APIs. So should I work on that patch or do you want to try? I think we should do something. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Bruce Momjian <bruce@momjian.us> writes: > On Mon, Nov 20, 2023 at 09:04:21PM -0500, Tom Lane wrote: >> Bruce Momjian <bruce@momjian.us> writes: >>> An alternate approach would >>> be to remove pg_attribute.attndims so we don't even try to preserve >>> dimensionality. >> I could get behind that, perhaps. It looks like we're not using the >> field in any meaningful way, and we could simplify TupleDescInitEntry >> and perhaps some other APIs. > So should I work on that patch or do you want to try? I think we should > do something. Let's wait for some other opinions, first ... regards, tom lane
On Mon, 2023-11-20 at 21:13 -0500, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Mon, Nov 20, 2023 at 09:04:21PM -0500, Tom Lane wrote: > > > Bruce Momjian <bruce@momjian.us> writes: > > > > An alternate approach would > > > > be to remove pg_attribute.attndims so we don't even try to preserve > > > > dimensionality. > > > > I could get behind that, perhaps. It looks like we're not using the > > > field in any meaningful way, and we could simplify TupleDescInitEntry > > > and perhaps some other APIs. > > > So should I work on that patch or do you want to try? I think we should > > do something. > > Let's wait for some other opinions, first ... Looking at the code, I get the impression that we wouldn't lose anything without "pg_attribute.attndims", so +1 for removing it. This would call for some documentation. We should remove most of the documentation about the non-existing difference between declaring a column "integer[]", "integer[][]" or "integer[3][3]" and just describe the first variant in detail, perhaps mentioning that the other notations are accepted for backward compatibility. I also think that it would be helpful to emphasize that while dimensionality does not matter to a column definition, it matters for individual array values. Perhaps it would make sense to recommend a check constraint if one wants to make sure that an array column should contain only a certain kind of array. Yours, Laurenz Albe
On Tue, Nov 21, 2023 at 09:33:18AM +0100, Laurenz Albe wrote: > On Mon, 2023-11-20 at 21:13 -0500, Tom Lane wrote: > > Bruce Momjian <bruce@momjian.us> writes: > > > On Mon, Nov 20, 2023 at 09:04:21PM -0500, Tom Lane wrote: > > > > Bruce Momjian <bruce@momjian.us> writes: > > > > > An alternate approach would > > > > > be to remove pg_attribute.attndims so we don't even try to preserve > > > > > dimensionality. > > > > > > I could get behind that, perhaps. It looks like we're not using the > > > > field in any meaningful way, and we could simplify TupleDescInitEntry > > > > and perhaps some other APIs. > > > > > So should I work on that patch or do you want to try? I think we should > > > do something. > > > > Let's wait for some other opinions, first ... > > Looking at the code, I get the impression that we wouldn't lose anything > without "pg_attribute.attndims", so +1 for removing it. > > This would call for some documentation. We should remove most of the > documentation about the non-existing difference between declaring a column > "integer[]", "integer[][]" or "integer[3][3]" and just describe the first > variant in detail, perhaps mentioning that the other notations are > accepted for backward compatibility. Agreed, I see: https://www.postgresql.org/docs/current/arrays.html However, the current implementation ignores any supplied array size limits, i.e., the behavior is the same as for arrays of unspecified length. The current implementation does not enforce the declared number of dimensions either. So both size limits and dimensions would be ignored. > I also think that it would be helpful to emphasize that while dimensionality > does not matter to a column definition, it matters for individual array values. > Perhaps it would make sense to recommend a check constraint if one wants > to make sure that an array column should contain only a certain kind of array. The CHECK constraint idea is very good. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.