Обсуждение: Altering a column type - Most efficient way

Поиск
Список
Период
Сортировка

Altering a column type - Most efficient way

От
Ow Mun Heng
Дата:
Is there any quick hacks to do this quickly? There's around
20-30million
rows of data.

I want to change a column type from varchar(4) to varchar()

table size is ~10-15GB (and another 10-15G for indexes)

What would be the preferrred way of doing it? SHould I be dropping the
indexes 1st to make things faster? Would it matter?

The punch line is that since the databases are connected via slony, this
makes it even harder to pull it off. My last try got the DDL change
completed in like 3 hours (smallest table of the bunch) and it hung
everything.






Re: Altering a column type - Most efficient way

От
Mario Weilguni
Дата:
Ow Mun Heng schrieb:
> Is there any quick hacks to do this quickly? There's around
> 20-30million
> rows of data.
>
> I want to change a column type from varchar(4) to varchar()
>
> table size is ~10-15GB (and another 10-15G for indexes)
>
> What would be the preferrred way of doing it? SHould I be dropping the
> indexes 1st to make things faster? Would it matter?
>
> The punch line is that since the databases are connected via slony, this
> makes it even harder to pull it off. My last try got the DDL change
> completed in like 3 hours (smallest table of the bunch) and it hung
> everything
>
Before Postgresql supported "alter table ... type ... " conversions, I
did it a few times when I detected later that my varchar() fields were
too short, and it worked perfectly.

Example:
{OLDLEN} = 4
{NEWLEN} = 60

update pg_attribute
   set atttypmod={NEWLEN}+4
 where attname='the-name-of-the-column'
   and attrelid=(select oid from pg_class where
relname='the-name-of-the-table')
   and atttypmod={OLDLEN}+4;


This worked very well when you want to increase the maximum length,
don't try to reduce the maximum length this way!

Disclaimer: I do not know if slony might be have a problem with this.




Re: Altering a column type - Most efficient way

От
Ow Mun Heng
Дата:
On Thu, 2008-07-10 at 10:36 +0200, Mario Weilguni wrote:
> Ow Mun Heng schrieb:
> >
> > I want to change a column type from varchar(4) to varchar()
> >
> >
> Example:
> {OLDLEN} = 4
> {NEWLEN} = 60
>
> update pg_attribute
>    set atttypmod={NEWLEN}+4
>  where attname='the-name-of-the-column'
>    and attrelid=(select oid from pg_class where
> relname='the-name-of-the-table')
>    and atttypmod={OLDLEN}+4;
>

This is what I see on the table

NEW attypmod = -1
OLD attypmod =  8

I'm not very confident in doint this change since this is not a
development server. If would help though if someone else has any other
suggestion or something.. Thanks for the information though, it's
enlightening knowledge.

Re: Altering a column type - Most efficient way

От
Alvaro Herrera
Дата:
Ow Mun Heng wrote:

> This is what I see on the table
>
> NEW attypmod = -1
> OLD attypmod =  8

8 means varchar(4) which is what you said you had (4+4)
-1 means unlimited size.


--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Altering a column type - Most efficient way

От
Ow Mun Heng
Дата:
On Thu, 2008-07-10 at 09:57 -0400, Alvaro Herrera wrote:
> Ow Mun Heng wrote:
>
> > This is what I see on the table
> >
> > NEW attypmod = -1
> > OLD attypmod =  8
>
> 8 means varchar(4) which is what you said you had (4+4)
> -1 means unlimited size.
>

This is cool.

If it were this simple a change, I'm not certain why (I believe) PG is
checking each and every row to see if it will fit into the new column
definition/type.

Thus, I'm still a bit hesitant to do the change, although it is
definitely a very enticing thing to do. ( I presume also that this
change will be instantaneous and does not need to check on each and
every row of the table?)

Thanks./

Re: Altering a column type - Most efficient way

От
Mario Weilguni
Дата:
Ow Mun Heng schrieb:
> On Thu, 2008-07-10 at 09:57 -0400, Alvaro Herrera wrote:
>
>> Ow Mun Heng wrote:
>>
>>
>>> This is what I see on the table
>>>
>>> NEW attypmod = -1
>>> OLD attypmod =  8
>>>
>> 8 means varchar(4) which is what you said you had (4+4)
>> -1 means unlimited size.
>>
>>
>
> This is cool.
>
> If it were this simple a change, I'm not certain why (I believe) PG is
> checking each and every row to see if it will fit into the new column
> definition/type.
>
> Thus, I'm still a bit hesitant to do the change, although it is
> definitely a very enticing thing to do. ( I presume also that this
> change will be instantaneous and does not need to check on each and
> every row of the table?)
>
> Thanks./
>
>

It should be safe, because the length limit is checked at insert/update
time, and internally, a varchar(20) is treated as something like this:
foo      varchar(1000000000) check (length(foo) <= 20)


The change is done without re-checking all rows, and will not fail IF
the new size is longer than the old size.


Re: Altering a column type - Most efficient way

От
Alvaro Herrera
Дата:
Ow Mun Heng wrote:

> If it were this simple a change, I'm not certain why (I believe) PG is
> checking each and every row to see if it will fit into the new column
> definition/type.

Because the code that does the ALTER TYPE is very generic, and it
doesn't (yet) have an optimization that tells it to skip the check and
the possible table rewrite in the cases where it's obviously not needed
(like this one).

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Altering a column type - Most efficient way

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Ow Mun Heng wrote:
>> If it were this simple a change, I'm not certain why (I believe) PG is
>> checking each and every row to see if it will fit into the new column
>> definition/type.

> Because the code that does the ALTER TYPE is very generic, and it
> doesn't (yet) have an optimization that tells it to skip the check and
> the possible table rewrite in the cases where it's obviously not needed
> (like this one).

Awhile back I looked into teaching ALTER TYPE that it needn't rewrite
if the type conversion expression parses out as just a Var with
RelabelType, but it seemed that that wouldn't cover very much of the
use-cases where a human thinks that it's "obvious" that no rewrite
is needed.  You'd really need to build in hard-wired knowledge about
the behavior of specific coercion functions, which seems entirely
unappealing.

            regards, tom lane

Re: Altering a column type - Most efficient way

От
"Kevin Grittner"
Дата:
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Ow Mun Heng wrote:
>>> If it were this simple a change, I'm not certain why (I believe) PG
is
>>> checking each and every row to see if it will fit into the new
column
>>> definition/type.
>
>> Because the code that does the ALTER TYPE is very generic, and it
>> doesn't (yet) have an optimization that tells it to skip the check
and
>> the possible table rewrite in the cases where it's obviously not
needed
>> (like this one).
>
> Awhile back I looked into teaching ALTER TYPE that it needn't
rewrite
> if the type conversion expression parses out as just a Var with
> RelabelType, but it seemed that that wouldn't cover very much of the
> use-cases where a human thinks that it's "obvious" that no rewrite
> is needed.

We wouldn't have to cover all possible cases for it to be useful.  If
there's some low-hanging fruit here, +1 for getting that.  The cases
which would most often save staff here some time are switching a
varchar to a longer maximum length or switching between a domain which
is varchar to plain varchar (or vice versa).

-Kevin

[SOLVED] Re: Altering a column type - Most efficient way

От
Ow Mun Heng
Дата:
On Fri, 2008-07-11 at 09:55 -0500, Kevin Grittner wrote:
> > Alvaro Herrera <alvherre@commandprompt.com> writes:
> >>> Ow Mun Heng wrote:
> >>> If it were this simple a change, I'm not certain why (I believe) PG
> >>>is checking each and every row to see if it will fit into the new column
> >>> definition/type.
> >Because the code that does the ALTER TYPE is very generic, and it
> > doesn't (yet) have an optimization that tells it to skip the check
> > and the possible table rewrite in the cases where it's obviously not
> >needed(like this one).

>  If there's some low-hanging fruit here, +1 for getting that.

I just tested this out and everything seems to be working fine. (cross
fingers - for now and if I do report back, it means we've crashed and
burned, but as of now... the low hanging fruit is tasty)

This 2 sec change is much preferred over the 3+ hour per table.

I agree with Tom that this is not useful in _all_ cases and may seem to
look like a hack, but it really isn't. Given that the condition that
we're expaning the min length rather than the opposite, it should be
pretty safe.


Guys(/gals) Thanks very much for brightening up a dreadry Monday
morning.