Thread: [pgsql-admin] "Soft-hitting" the 1600 column limit


[pgsql-admin] "Soft-hitting" the 1600 column limit

From: nunks
Date:
Hello!

I'm trying to support an application in production at work, and for some obscure reason the developer made it drop and re-create a column periodically. 

I know this is a bad practice (to say the least), and I'm telling them to fix it, but after the 1600th drop/add cycle, PostgreSQL starts giving out the column limit error:

    ERROR:  tables can have at most 1600 columns

I reproduced this behavior in PostgreSQL 10.3 with a simple bash loop and a two-column table, one of which is fixed and the other is repeatedly dropped and re-created until the 1600 limit is reached.
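For reference, a roughly equivalent SQL-only reproduction (a sketch, not the original bash script; the table name matches the psql session later in this thread):

CREATE TABLE add_drop_text (fixed text);

DO $$
BEGIN
    FOR i IN 1..1700 LOOP
        EXECUTE 'ALTER TABLE add_drop_text ADD COLUMN scratch int';
        EXECUTE 'ALTER TABLE add_drop_text DROP COLUMN scratch';
    END LOOP;
END;
$$;
-- after roughly 1600 add/drop cycles the ADD COLUMN fails with
-- ERROR:  tables can have at most 1600 columns
-- (the failed DO block rolls back its own changes, so the table is left as created)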

To me this is pretty cool, since I can use this limit as leverage to push the developers to the right path, but should Postgres be doing that? It's as if it doesn't decrement some counter when a column is dropped.

Many thanks!
Bruno

----------
“Life beats down and crushes the soul and art reminds you that you have one.”

- Stella Adler

Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: "David G. Johnston"
Date:
On Wed, Jun 6, 2018 at 9:39 AM, nunks <nunks.lol@gmail.com> wrote:
I reproduced this behavior in PostgreSQL 10.3 with a simple bash loop and a two-column table, one of which is fixed and the other is repeatedly dropped and re-created until the 1600 limit is reached.

To me this is pretty cool, since I can use this limit as leverage to push the developers to the right path, but should Postgres be doing that? It's as if it doesn't decrement some counter when a column is dropped.

This is working as expected.  When dropping a column, or adding a new column that can contain nulls, PostgreSQL does not, and does not want to, rewrite the physically stored records/table.  Thus it must be capable of accepting records formed for prior table versions which means it must keep track of those now-deleted columns.
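A tiny illustration of that point (hypothetical table, just a sketch): rows written before an ADD COLUMN keep their old, shorter on-disk format and simply read back as NULL for the new column:

CREATE TABLE versions_demo (a int);
INSERT INTO versions_demo VALUES (1);           -- stored with a single attribute
ALTER TABLE versions_demo ADD COLUMN b text;    -- catalog-only change, no table rewrite
INSERT INTO versions_demo VALUES (2, 'new');    -- stored with two attributes
SELECT * FROM versions_demo;                    -- the old row shows b as NULL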

I'm sure that there is more to it that requires reading, and understanding, the source code to comprehend; but that does seem to explain why it works the way it does.

David J.

Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: Tom Lane
Date:
nunks <nunks.lol@gmail.com> writes:
> I'm trying to support an application in production at work, and for some
> obscure reason the developer made it drop and re-create a column
> periodically.

Not a good idea ...

> I know this is a bad practice (to say the least), and I'm telling them to
> fix it, but after the 1600th drop/add cycle, PostgreSQL starts giving out
> the column limit error:

>     ERROR:  tables can have at most 1600 columns

Yup.  "Dropping" a column just hides it, doesn't make it disappear.
There are a lot of reasons why that's unlikely to change anytime soon.

            regards, tom lane


Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: Wells Oliver
Date:
let's just stop for a moment and talk about what you're doing that requires *1600 columns* because my jaw is hitting the floor.

On Wed, Jun 6, 2018 at 9:39 AM, nunks <nunks.lol@gmail.com> wrote:
Hello!

I'm trying to support an application in production at work, and for some obscure reason the developer made it drop and re-create a column periodically. 

I know this is a bad practice (to say the least), and I'm telling them to fix it, but after the 1600th drop/add cycle, PostgreSQL starts giving out the column limit error:

    ERROR:  tables can have at most 1600 columns

I reproduced this behavior in PostgreSQL 10.3 with a simple bash loop and a two-column table, one of which is fixed and the other is repeatedly dropped and re-created until the 1600 limit is reached.

To me this is pretty cool, since I can use this limit as leverage to push the developers to the right path, but should Postgres be doing that? It's as if it doesn't decrement some counter when a column is dropped.

Many thanks!
Bruno

----------
“Life beats down and crushes the soul and art reminds you that you have one.”

- Stella Adler




Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: Scott Ribe
Date:
David's answer is right. Basically, every column added gets an attribute number (attnum) which is not recycled.

I just want to add that a dump/restore will not bring in the history of deleted columns, thus resetting the column counter for the table.
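A related SQL-only trick (a rough sketch, not pg_dump itself) is to copy the rows into a brand-new table, which starts with a clean column history; note it does not carry over indexes, constraints, defaults, triggers or privileges:

BEGIN;
CREATE TABLE add_drop_text_new AS SELECT * FROM add_drop_text;  -- fresh relation: only the live columns exist
DROP TABLE add_drop_text;
ALTER TABLE add_drop_text_new RENAME TO add_drop_text;
COMMIT;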

--
Scott Ribe
scott_ribe@elevated-dev.com
https://www.linkedin.com/in/scottribe/



> On Jun 6, 2018, at 10:51 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
>
> On Wed, Jun 6, 2018 at 9:39 AM, nunks <nunks.lol@gmail.com> wrote:
> I reproduced this behavior in PostgreSQL 10.3 with a simple bash loop and a two-column table, one of which is fixed and the other is repeatedly dropped and re-created until the 1600 limit is reached.
>
> To me this is pretty cool, since I can use this limit as leverage to push the developers to the right path, but should Postgres be doing that? It's as if it doesn't decrement some counter when a column is dropped.
>
> This is working as expected.  When dropping a column, or adding a new column that can contain nulls, PostgreSQL does not, and does not want to, rewrite the physically stored records/table.  Thus it must be capable of accepting records formed for prior table versions which means it must keep track of those now-deleted columns.
>
> I'm sure that there is more to it that requires reading, and understanding, the source code to comprehend; but that does seem to explain why it works the way it does.
>
> David J.



Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: Scott Ribe
Date:
He's *NOT* doing anything that requires 1600 columns--that was pretty much the point of his post.

--
Scott Ribe
scott_ribe@elevated-dev.com
https://www.linkedin.com/in/scottribe/



> On Jun 6, 2018, at 10:53 AM, Wells Oliver <wells.oliver@gmail.com> wrote:
>
> let's just stop for a moment and talk about what you're doing that requires *1600 columns* because my jaw is hitting the floor.



Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: nunks
Date:
Thank you all for your answers.

David and Tom's answers are more than enough to feed my argument to the developers. =)

Scott, you're right of course. My first action to reestablish their development database was to do a dump/restore, but by logging DDL statements I can see the application is dropping and adding the same column over and over again, so it's a matter of time before it hangs. I hope showing this thread to the developers will make them see some sense.
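One way to keep an eye on how close the table is getting again (a hypothetical monitoring query; dropped columns count toward the limit too):

SELECT count(*) AS attnums_used,
       count(*) FILTER (WHERE attisdropped) AS dropped
FROM pg_attribute
WHERE attrelid = 'add_drop_text'::regclass
  AND attnum > 0;
-- once attnums_used reaches 1600, the next ADD COLUMN will fail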

@Wells, don't ask me, I have no idea, hahah, but it's a small table and the application keeps dropping/re-adding the column, actually. I figure the developer is using some kind of crazy ORM that drops the column when he deletes an object property and re-adds it when he sets the property back or something.

@Ervin, a vacuum full doesn't "fix" this behavior, since it's expected after all =) 

collimittest=# \d add_drop_text 
          Table "public.add_drop_text"
 Column | Type | Collation | Nullable | Default 
--------+------+-----------+----------+---------
 fixed  | text |           |          | 

collimittest=# vacuum FULL ANALYZE VERBOSE add_drop_text ;
INFO:  vacuuming "public.add_drop_text"
INFO:  "add_drop_text": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  analyzing "public.add_drop_text"
INFO:  "add_drop_text": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows
VACUUM
collimittest=# alter table add_drop_text add column x int;
ERROR:  tables can have at most 1600 columns


----------
“Life beats down and crushes the soul and art reminds you that you have one.”

- Stella Adler

On Wed, Jun 6, 2018 at 1:57 PM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:
He's *NOT* doing anything that requires 1600 columns--that was pretty much the point of his post.

--
Scott Ribe
scott_ribe@elevated-dev.com
https://www.linkedin.com/in/scottribe/



> On Jun 6, 2018, at 10:53 AM, Wells Oliver <wells.oliver@gmail.com> wrote:
>
> let's just stop for a moment and talk about what you're doing that requires *1600 columns* because my jaw is hitting the floor.


Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: Wells Oliver
Date:
Is there a pointer to some deeper explanation of this? It's news to me and kind of fascinating that dropped columns don't disappear. I did this stupid test, which obviously failed:

mydb=# create table wells.foo (col1 text, col2 text);
CREATE TABLE
mydb=# insert into wells.foo values ('a','b'),('c','d');
INSERT 0 2
mydb=# alter table wells.foo drop column col2;
ALTER TABLE
mydb=# insert into wells.foo (col1,col2) values('a','b');
ERROR:  column "col2" of relation "foo" does not exist
LINE 1: insert into wells.foo (col1,col2) values('a','b');

Just curious then, in what meaningful way do dropped columns persist, what are the reasons?

On Wed, Jun 6, 2018 at 10:13 AM, nunks <nunks.lol@gmail.com> wrote:
Thank you all for your answers.

David and Tom's answers are more than enough to feed my argument to the developers. =)

Scott, you're right of course. My first action to reestablish their development database was to do a dump/restore, but by logging DDL statements I can see the application is dropping and adding the same column over and over again, so it's a matter of time before it hangs. I hope showing this thread to the developers will make them see some sense.

@Wells, don't ask me, I have no idea, hahah, but it's a small table and the application keeps dropping/re-adding the column, actually. I figure the developer is using some kind of crazy ORM that drops the column when he deletes an object property and re-adds it when he sets the property back or something.

@Ervin, a vacuum full doesn't "fix" this behavior, since it's expected after all =) 

collimittest=# \d add_drop_text 
          Table "public.add_drop_text"
 Column | Type | Collation | Nullable | Default 
--------+------+-----------+----------+---------
 fixed  | text |           |          | 

collimittest=# vacuum FULL ANALYZE VERBOSE add_drop_text ;
INFO:  vacuuming "public.add_drop_text"
INFO:  "add_drop_text": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  analyzing "public.add_drop_text"
INFO:  "add_drop_text": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows
VACUUM
collimittest=# alter table add_drop_text add column x int;
ERROR:  tables can have at most 1600 columns


----------
“Life beats down and crushes the soul and art reminds you that you have one.”

- Stella Adler

On Wed, Jun 6, 2018 at 1:57 PM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:
He's *NOT* doing anything that requires 1600 columns--that was pretty much the point of his post.

--
Scott Ribe
scott_ribe@elevated-dev.com
https://www.linkedin.com/in/scottribe/



> On Jun 6, 2018, at 10:53 AM, Wells Oliver <wells.oliver@gmail.com> wrote:
>
> let's just stop for a moment and talk about what you're doing that requires *1600 columns* because my jaw is hitting the floor.






Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: Scott Ribe
Date:
References to them persist in system catalogs. Otherwise, dropping a column would require rewriting every row in the
table.

--
Scott Ribe
scott_ribe@elevated-dev.com
https://www.linkedin.com/in/scottribe/



> On Jun 6, 2018, at 12:43 PM, Wells Oliver <wells.oliver@gmail.com> wrote:
>
> Just curious then, in what meaningful way do dropped columns persist, what are the reasons?
>



Re: [**EXTERNAL**] Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: "Moradhassel, Kavian"
Date:

To my mind, it makes perfect sense for columns to persist in the table structure when dropped…the only question I have is whether the column would survive a VACUUM FULL?  i.e. if the table is rewritten after the column is dropped, would that change things?

Kav Moradhassel | Ciena
kmoradha@ciena.com | 385 Terry Fox Drive | Ottawa, ON, K2K 0L1  Canada

From: Wells Oliver <wells.oliver@gmail.com>
Date: Wednesday, June 6, 2018 at 2:46 PM
To: nunks <nunks.lol@gmail.com>
Cc: Scott Ribe <scott_ribe@elevated-dev.com>, pgsql-admin <pgsql-admin@postgresql.org>
Subject: [**EXTERNAL**] Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

Is there a pointer to some deeper explanation of this? It's news to me and kind of fascinating that dropped columns don't disappear. I did this stupid test, which obviously failed:

mydb=# create table wells.foo (col1 text, col2 text);
CREATE TABLE
mydb=# insert into wells.foo values ('a','b'),('c','d');
INSERT 0 2
mydb=# alter table wells.foo drop column col2;
ALTER TABLE
mydb=# insert into wells.foo (col1,col2) values('a','b');
ERROR:  column "col2" of relation "foo" does not exist
LINE 1: insert into wells.foo (col1,col2) values('a','b');

Just curious then, in what meaningful way do dropped columns persist, what are the reasons?

On Wed, Jun 6, 2018 at 10:13 AM, nunks <nunks.lol@gmail.com> wrote:

Thank you all for your answers.

David and Tom's answers are more than enough to feed my argument to the developers. =)

Scott, you're right of course. My first action to reestablish their development database was to do a dump/restore, but by logging DDL statements I can see the application is dropping and adding the same column over and over again, so it's a matter of time before it hangs. I hope showing this thread to the developers will make them see some sense.

@Wells, don't ask me, I have no idea, hahah, but it's a small table and the application keeps dropping/re-adding the column, actually. I figure the developer is using some kind of crazy ORM that drops the column when he deletes an object property and re-adds it when he sets the property back or something.

@Ervin, a vacuum full doesn't "fix" this behavior, since it's expected after all =)

collimittest=# \d add_drop_text
          Table "public.add_drop_text"
 Column | Type | Collation | Nullable | Default
--------+------+-----------+----------+---------
 fixed  | text |           |          |

collimittest=# vacuum FULL ANALYZE VERBOSE add_drop_text ;
INFO:  vacuuming "public.add_drop_text"
INFO:  "add_drop_text": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  analyzing "public.add_drop_text"
INFO:  "add_drop_text": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows
VACUUM
collimittest=# alter table add_drop_text add column x int;
ERROR:  tables can have at most 1600 columns


----------
“Life beats down and crushes the soul and art reminds you that you have one.”

- Stella Adler

On Wed, Jun 6, 2018 at 1:57 PM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:

He's *NOT* doing anything that requires 1600 columns--that was pretty much the point of his post.

--
Scott Ribe
scott_ribe@elevated-dev.com
https://www.linkedin.com/in/scottribe/


> On Jun 6, 2018, at 10:53 AM, Wells Oliver <wells.oliver@gmail.com> wrote:
>
> let's just stop for a moment and talk about what you're doing that requires *1600 columns* because my jaw is hitting the floor.

Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: Tom Lane
Date:
Wells Oliver <wells.oliver@gmail.com> writes:
> Is there a pointer to some deeper explanation of this? It's news to me and
> kind of fascinating that dropped columns don't disappear. I did this stupid
> test, which obviously failed:

> mydb=# create table wells.foo (col1 text, col2 text);
> CREATE TABLE
> mydb=# insert into wells.foo values ('a','b'),('c','d');
> INSERT 0 2
> mydb=# alter table wells.foo drop column col2;
> ALTER TABLE
> mydb=# insert into wells.foo (col1,col2) values('a','b');
> ERROR:  column "col2" of relation "foo" does not exist
> LINE 1: insert into wells.foo (col1,col2) values('a','b');

> Just curious then, in what meaningful way do dropped columns persist, what
> are the reasons?

I don't recall if it's documented explicitly in any user-facing places,
but poking into the source code or the system catalogs will show you what
happens:

regression=# create table wells.foo (col1 text, col2 text);
CREATE TABLE
regression=# select attname, attnum, attisdropped from pg_attribute where attrelid = 'wells.foo'::regclass and attnum > 0;
 attname | attnum | attisdropped
---------+--------+--------------
 col1    |      1 | f
 col2    |      2 | f
(2 rows)

regression=# alter table wells.foo drop column col2;
ALTER TABLE
regression=# select attname, attnum, attisdropped from pg_attribute where attrelid = 'wells.foo'::regclass and attnum > 0;
           attname            | attnum | attisdropped
------------------------------+--------+--------------
 col1                         |      1 | f
 ........pg.dropped.2........ |      2 | t
(2 rows)

Most SQL operations ignore "attisdropped" entries in pg_attribute,
which is why those seem to be hidden.  But they're still valid as
far as the physical representation of the table is concerned.

As for why it's like this, the most obvious practical benefit is that it
makes ALTER TABLE DROP COLUMN cheap: we just have to change that one
entry in pg_attribute, not rewrite the entire table to physically remove
the column from each table row.

There are also some more-theoretical benefits involving having a stable
identifier (a/k/a primary key) for a column.  While that could be done
in different ways, the way we do it is that attrelid (the table's OID)
plus attnum is the unique identifier for a column.
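For example, pg_depend records a dependency on a table column by exactly that (table OID, attnum) pair, which is one reason an attnum can't simply be handed out again (hypothetical query, reusing the wells.foo table from above):

select classid::regclass as dependent_catalog, objid, deptype
from pg_depend
where refclassid = 'pg_class'::regclass
  and refobjid = 'wells.foo'::regclass
  and refobjsubid = 1;    -- anything (views, indexes, defaults, ...) that depends on column 1 of wells.foo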

            regards, tom lane


Re: [**EXTERNAL**] Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: Ron
Date:
On 06/06/2018 01:54 PM, Moradhassel, Kavian wrote:

To my mind, it makes perfect sense for columns to persist in the table structure when dropped…the only question I have is whether the column would survive a VACUUM FULL?  i.e. if the table is rewritten after the column is dropped, would that change things?

Kav Moradhassel | Ciena
kmoradha@ciena.com | 385 Terry Fox Drive | Ottawa, ON, K2K 0L1  Canada

+1

Another solution would be to dump/drop/create/load just that table.

--
Angular momentum makes the world go 'round.

Re: [**EXTERNAL**] Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: "David G. Johnston"
Date:
On Wed, Jun 6, 2018 at 4:15 PM, Ron <ronljohnsonjr@gmail.com> wrote:
On 06/06/2018 01:54 PM, Moradhassel, Kavian wrote:

To my mind, it makes perfect sense for columns to persist in the table structure when dropped…the only question I have is whether the column would survive a VACUUM FULL?  i.e. if the table is rewritten after the column is dropped, would that change things?

A cursory skim of cluster.c, plus general reasoning, leads me to think that the extent of the smarts of the table rewrite (for vacuum full at least, not cluster) is to evaluate headers for visibility and omit copying the physical tuples to the new heap.  The contents of each tuple are otherwise copied as-is (except toast pointers...).  So, yes, the variant column structures would indeed survive vacuum full and clustering operations.  There isn't any motivation to do more - a 1600 column limit, even with dropped columns counted, is not unreasonable - and doing more would make an already expensive operation even more expensive (and the risk of serious data-losing bugs in the first release is scary).

Another solution would be to dump/drop/create/load just that table.

Any non-trivial table is generally challenging to drop due to dependencies...

David J.

Re: [**EXTERNAL**] Re: [pgsql-admin] "Soft-hitting" the 1600 column limit

From: Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Wed, Jun 6, 2018 at 4:15 PM, Ron <ronljohnsonjr@gmail.com> wrote:
>> To my mind, it makes perfect sense for columns to persist in the table
>> structure when dropped…the only question I have is whether the column would
>> survive a VACUUM FULL?  i.e. if the table is rewritten after the column is
>> dropped, would that change things?

> A cursory skim of cluster.c, plus general reasoning, leads me to think
> that the extent of the smarts of the table rewrite (for vacuum full at
> least, not cluster) is to evaluate headers for visibility and omit copying
> the physical tuples to the new heap.  The contents of each tuple are
> otherwise copied as-is (except toast pointers...).  So, yes, the variant

No; values in dropped columns get replaced by nulls during a table rewrite.
See reform_and_rewrite_tuple(), whose comments say

 * We cannot simply copy the tuple as-is, for several reasons:
 *
 * 1. We'd like to squeeze out the values of any dropped columns, both
 * to save space and to ensure we have no corner-case failures. (It's
 * possible for example that the new table hasn't got a TOAST table
 * and so is unable to store any large values of dropped cols.)
 *
 * 2. The tuple might not even be legal for the new table; this is
 * currently only known to happen as an after-effect of ALTER TABLE
 * SET WITHOUT OIDS.
 *
 * So, we must reconstruct the tuple from component Datums.

Likewise, dropped columns get replaced by nulls when an existing
tuple is updated.

So, while the dropped column never disappears from the table's structure,
over time the space it consumed goes away.
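For illustration, a minimal sketch of that effect (hypothetical table; exact sizes will vary with compression and TOAST):

CREATE TABLE drop_space_demo (keep int, payload text);
INSERT INTO drop_space_demo
    SELECT g, repeat(md5(g::text), 40) FROM generate_series(1, 10000) g;

SELECT pg_size_pretty(pg_table_size('drop_space_demo'));  -- baseline, payload included

ALTER TABLE drop_space_demo DROP COLUMN payload;          -- instant, catalog-only change
SELECT pg_size_pretty(pg_table_size('drop_space_demo'));  -- unchanged: the old values are still on disk

VACUUM FULL drop_space_demo;                              -- the rewrite squeezes out the dropped values
SELECT pg_size_pretty(pg_table_size('drop_space_demo'));  -- noticeably smaller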

            regards, tom lane