Re: Alter the column data type of the large data volume table.

Поиск
Список
Период
Сортировка
От Olivier Gautherot
Тема Re: Alter the column data type of the large data volume table.
Дата
Msg-id CAJ7S9TVKUA6FK0zXqsGjokB6E7pAdm0vPR3z9vPaJN_heM3yWQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Alter the column data type of the large data volume table.  (charles meng <xlyybz@gmail.com>)
Ответы Re: Alter the column data type of the large data volume table.
Список pgsql-general
Hi Charles,

On Fri, Dec 4, 2020 at 9:12 AM charles meng <xlyybz@gmail.com> wrote:
What I mean is that it will take a lot of time to complete all data processing.I have to say that it is a good solution to adjust the column type without affecting online users.

I found a tool on github, see the link below, unfortunately, this is for MySQL...


MySQL has its own strategy with regard to column handling so what works there does not necessarily fit here. 

There are some good ideas in this thread but I would start with a few checks:

1) What version of PG are you using ?

2) If you can try on your current setup (not necessarily in Production) to add a column typed bigint - if it is recent enough it will be a simple catalog update. Michael's suggestion is viable

3) Given the massive number of rows, I would update as suggested, progressively in batches of a few tens of thousands. Make sure you commit and vacuum after each to retrieve the space (or you may end up with a storage space issue in addition to all the rest). In the meantime, add a trigger to set the new column to the index value. Once the new column is complete, drop the old column and set the new one as primary key (it may take a while to recalculate the index).

4) If your table is still growing, I would definitely look into partitioning as it can hardly make things worse.

5) If you're brave enough, convert your current table as a partition (rename it to something like table_hist), duplicate the table model under the same name as now (adjusting the primary key type) and set the INHERITS on the primary key range. The inheritance should take care of the type conversion (haven't tried it but it's worth a try). If it works, you will reach your goal without downtime or significant overhead.

 
regards.

Michael Lewis <mlewis@entrata.com> 于2020年12月4日周五 下午1:04写道:
Please do reply all.

Not sure what you mean about it taking too much time. It's rewriting a bunch of data. It's going to take a while. The best you can do is break the work up into small pieces and commit each piece.

On Thu, Dec 3, 2020, 7:11 PM charles meng <xlyybz@gmail.com> wrote:
Thanks for your help, I think the first method I tried (adding temporary column) is similar to what you said, but it takes too much time for me.


Thanks again.

Michael Lewis <mlewis@entrata.com> 于2020年12月4日周五 上午1:11写道:
On Wed, Dec 2, 2020 at 11:53 PM charles meng <xlyybz@gmail.com> wrote:
Hi all,

I have a table with 1.6 billion records. The data type of the primary key column is incorrectly used as integer. I need to replace the type of the column with bigint. Is there any ideas for this?

Solutions that have been tried:
Adding temporary columns was too time-consuming, so I gave up.
Using a temporary table, there is no good way to migrate the original table data to the temporary table

Thanks in advance.

You can add a new column with NO default value and null as default and have it be very fast. Then you can gradually update rows in batches (if on PG11+, perhaps use do script with a loop to commit after X rows) to set the new column the same as the primary key. Lastly, in a transaction, update any new rows where the bigint column is null, and change which column is the primary key & drop the old one. This should keep each transaction reasonably sized to not hold up other processes.

Cheers
--
Olivier Gautherot
 

Libre de virus. www.avast.com

В списке pgsql-general по дате отправления:

Предыдущее
От: charles meng
Дата:
Сообщение: Re: Alter the column data type of the large data volume table.
Следующее
От: Wim Bertels
Дата:
Сообщение: Re: psql > split > queries & output