Обсуждение: Altering a column type - Most efficient way
Is there any quick hacks to do this quickly? There's around 20-30million rows of data. I want to change a column type from varchar(4) to varchar() table size is ~10-15GB (and another 10-15G for indexes) What would be the preferrred way of doing it? SHould I be dropping the indexes 1st to make things faster? Would it matter? The punch line is that since the databases are connected via slony, this makes it even harder to pull it off. My last try got the DDL change completed in like 3 hours (smallest table of the bunch) and it hung everything.
Ow Mun Heng schrieb: > Is there any quick hacks to do this quickly? There's around > 20-30million > rows of data. > > I want to change a column type from varchar(4) to varchar() > > table size is ~10-15GB (and another 10-15G for indexes) > > What would be the preferrred way of doing it? SHould I be dropping the > indexes 1st to make things faster? Would it matter? > > The punch line is that since the databases are connected via slony, this > makes it even harder to pull it off. My last try got the DDL change > completed in like 3 hours (smallest table of the bunch) and it hung > everything > Before Postgresql supported "alter table ... type ... " conversions, I did it a few times when I detected later that my varchar() fields were too short, and it worked perfectly. Example: {OLDLEN} = 4 {NEWLEN} = 60 update pg_attribute set atttypmod={NEWLEN}+4 where attname='the-name-of-the-column' and attrelid=(select oid from pg_class where relname='the-name-of-the-table') and atttypmod={OLDLEN}+4; This worked very well when you want to increase the maximum length, don't try to reduce the maximum length this way! Disclaimer: I do not know if slony might be have a problem with this.
On Thu, 2008-07-10 at 10:36 +0200, Mario Weilguni wrote: > Ow Mun Heng schrieb: > > > > I want to change a column type from varchar(4) to varchar() > > > > > Example: > {OLDLEN} = 4 > {NEWLEN} = 60 > > update pg_attribute > set atttypmod={NEWLEN}+4 > where attname='the-name-of-the-column' > and attrelid=(select oid from pg_class where > relname='the-name-of-the-table') > and atttypmod={OLDLEN}+4; > This is what I see on the table NEW attypmod = -1 OLD attypmod = 8 I'm not very confident in doint this change since this is not a development server. If would help though if someone else has any other suggestion or something.. Thanks for the information though, it's enlightening knowledge.
Ow Mun Heng wrote: > This is what I see on the table > > NEW attypmod = -1 > OLD attypmod = 8 8 means varchar(4) which is what you said you had (4+4) -1 means unlimited size. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Thu, 2008-07-10 at 09:57 -0400, Alvaro Herrera wrote: > Ow Mun Heng wrote: > > > This is what I see on the table > > > > NEW attypmod = -1 > > OLD attypmod = 8 > > 8 means varchar(4) which is what you said you had (4+4) > -1 means unlimited size. > This is cool. If it were this simple a change, I'm not certain why (I believe) PG is checking each and every row to see if it will fit into the new column definition/type. Thus, I'm still a bit hesitant to do the change, although it is definitely a very enticing thing to do. ( I presume also that this change will be instantaneous and does not need to check on each and every row of the table?) Thanks./
Ow Mun Heng schrieb: > On Thu, 2008-07-10 at 09:57 -0400, Alvaro Herrera wrote: > >> Ow Mun Heng wrote: >> >> >>> This is what I see on the table >>> >>> NEW attypmod = -1 >>> OLD attypmod = 8 >>> >> 8 means varchar(4) which is what you said you had (4+4) >> -1 means unlimited size. >> >> > > This is cool. > > If it were this simple a change, I'm not certain why (I believe) PG is > checking each and every row to see if it will fit into the new column > definition/type. > > Thus, I'm still a bit hesitant to do the change, although it is > definitely a very enticing thing to do. ( I presume also that this > change will be instantaneous and does not need to check on each and > every row of the table?) > > Thanks./ > > It should be safe, because the length limit is checked at insert/update time, and internally, a varchar(20) is treated as something like this: foo varchar(1000000000) check (length(foo) <= 20) The change is done without re-checking all rows, and will not fail IF the new size is longer than the old size.
Ow Mun Heng wrote: > If it were this simple a change, I'm not certain why (I believe) PG is > checking each and every row to see if it will fit into the new column > definition/type. Because the code that does the ALTER TYPE is very generic, and it doesn't (yet) have an optimization that tells it to skip the check and the possible table rewrite in the cases where it's obviously not needed (like this one). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Ow Mun Heng wrote: >> If it were this simple a change, I'm not certain why (I believe) PG is >> checking each and every row to see if it will fit into the new column >> definition/type. > Because the code that does the ALTER TYPE is very generic, and it > doesn't (yet) have an optimization that tells it to skip the check and > the possible table rewrite in the cases where it's obviously not needed > (like this one). Awhile back I looked into teaching ALTER TYPE that it needn't rewrite if the type conversion expression parses out as just a Var with RelabelType, but it seemed that that wouldn't cover very much of the use-cases where a human thinks that it's "obvious" that no rewrite is needed. You'd really need to build in hard-wired knowledge about the behavior of specific coercion functions, which seems entirely unappealing. regards, tom lane
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> Ow Mun Heng wrote: >>> If it were this simple a change, I'm not certain why (I believe) PG is >>> checking each and every row to see if it will fit into the new column >>> definition/type. > >> Because the code that does the ALTER TYPE is very generic, and it >> doesn't (yet) have an optimization that tells it to skip the check and >> the possible table rewrite in the cases where it's obviously not needed >> (like this one). > > Awhile back I looked into teaching ALTER TYPE that it needn't rewrite > if the type conversion expression parses out as just a Var with > RelabelType, but it seemed that that wouldn't cover very much of the > use-cases where a human thinks that it's "obvious" that no rewrite > is needed. We wouldn't have to cover all possible cases for it to be useful. If there's some low-hanging fruit here, +1 for getting that. The cases which would most often save staff here some time are switching a varchar to a longer maximum length or switching between a domain which is varchar to plain varchar (or vice versa). -Kevin
On Fri, 2008-07-11 at 09:55 -0500, Kevin Grittner wrote: > > Alvaro Herrera <alvherre@commandprompt.com> writes: > >>> Ow Mun Heng wrote: > >>> If it were this simple a change, I'm not certain why (I believe) PG > >>>is checking each and every row to see if it will fit into the new column > >>> definition/type. > >Because the code that does the ALTER TYPE is very generic, and it > > doesn't (yet) have an optimization that tells it to skip the check > > and the possible table rewrite in the cases where it's obviously not > >needed(like this one). > If there's some low-hanging fruit here, +1 for getting that. I just tested this out and everything seems to be working fine. (cross fingers - for now and if I do report back, it means we've crashed and burned, but as of now... the low hanging fruit is tasty) This 2 sec change is much preferred over the 3+ hour per table. I agree with Tom that this is not useful in _all_ cases and may seem to look like a hack, but it really isn't. Given that the condition that we're expaning the min length rather than the opposite, it should be pretty safe. Guys(/gals) Thanks very much for brightening up a dreadry Monday morning.