Re: ALTER TABLE without long waiting is possibly ?

Поиск
Список
Период
Сортировка
От Vitaly Burovoy
Тема Re: ALTER TABLE without long waiting is possibly ?
Дата
Msg-id CAKOSWNmc+xBOTf5Gb0pmgPak5YKMLo0qcme6GtF8znZnUNm8JQ@mail.gmail.com
обсуждение исходный текст
Ответ на ALTER TABLE without long waiting is possibly ?  (Condor <condor@stz-bg.com>)
Ответы Re: ALTER TABLE without long waiting is possibly ?  (Condor <condor@stz-bg.com>)
Список pgsql-general
On 10/11/16, Condor <condor@stz-bg.com> wrote:
>
> Hello,
>
> today I need to alter one of our biggest tables to add two new columns
> with default value 0.
> Server version: PostgreSQL 9.5.4 on x86_64-slackware-linux-gnu, compiled
> by x86_64-slackware-linux-gcc (GCC) 5.3.0, 64-bit
>
> when I do: ALTER TABLE stocks ADD COLUMN promo INTEGER DEFAULT 0;
>
> long waiting is coming,

The doc[1] explicitly explains why:
"Adding a column with a DEFAULT clause or changing the type of an
existing column will require the entire table and its indexes to be
rewritten."

> so I try to find a way how to avoid that
> waiting. I know isn't possibly to alter table without lock it,
> but Im wondering do will be more fast if I do:
>
> ALTER TABLE stocks ADD COLUMN promo INTEGER;
> UPDATE TABLE stocks SET promo = 0;
> ALTER TABLE stocks ALTER COLUMN promo SET DEFAULT 0;

You are close to the best solution but you should use "SET DEFAULT"
before update and split "UPDATE" into several commands to update
smaller parts of the table at a time, in the other case you ends up
with full rewrite of the table at once as the original "ALTER TABLE"
does.
All rows which has been updated are locked until the UPDATE commits,
so when your code tries to update or delete it, commands wait until
the UPDATE completes.

Usual solution looks like this (change _pk_column_ to a column(s) name
mentioned in the primary key of the table):

ALTER TABLE stocks ADD COLUMN promo INTEGER;
ALTER TABLE stocks ALTER COLUMN promo SET DEFAULT 0;  -- set for
inserted columns
CREATE INDEX CONCURRENTLY set_default_idx_tmp
    ON stocks(_pk_column_) WHERE promo IS NULL;
-- repeat the next command (five lines!) until it returns 0 affected
rows (you can choose different LIMIT value):
UPDATE stocks s SET promo = 0 FROM (
    SELECT _pk_column_ FROM stocks
    WHERE promo IS NULL ORDER BY _pk_column_
    FOR UPDATE LIMIT 10000
)t WHERE s._pk_column_=t._pk_column_;


The "ORDER BY" clause allows you to decrease chance to block current
transactions by the UPDATE which sets the default value.

If you have PG 9.5 and higher, add "SKIP LOCKED" just after the "FOR
UPDATE" clause.

If your table is big enough you may run:
VACUUM VERBOSE stocks;
when 1/2 or 1/3 (and 2/3) table is done to mark old tuples as free
space and reuse it for new tuples generated by the next UPDATEs (and
prevent bloating table).

P.S.: then DROP INDEX CONCURRENTLY set_default_idx_tmp;

P.P.S.: If you have to add two columns you can update both of them by
one UPDATE:

ALTER TABLE stocks ADD COLUMN promo INTEGER;
ALTER TABLE stocks ADD COLUMN column2 INTEGER;
ALTER TABLE stocks ALTER COLUMN promo SET DEFAULT 0;  -- set for
inserted columns
ALTER TABLE stocks ALTER COLUMN column2 SET DEFAULT 65536;  -- whatever you need

CREATE INDEX CONCURRENTLY set_default_idx_tmp
    ON stocks(_pk_column_) WHERE promo IS NULL AND column2 IS NULL;

-- repeat the next command (six lines!) until it returns 0 affected
rows (you can choose different LIMIT value):
UPDATE stocks s SET promo = DEFAULT, column2 = DEFAULT FROM (
    SELECT _pk_column_ FROM stocks
    WHERE promo IS NULL AND column2 IS NULL
    ORDER BY _pk_column_
    FOR UPDATE LIMIT 10000
)t WHERE s._pk_column_=t._pk_column_;


> Unfortunately I can't test on product servers, so Im looking for some
> advice or some one to point me the right direction how I can alter table
> today without clients to notice their query is locked and need to wait.

[1] https://www.postgresql.org/docs/current/static/sql-altertable.html#AEN75605

--
Best regards,
Vitaly Burovoy


В списке pgsql-general по дате отправления:

Предыдущее
От: arnaud gaboury
Дата:
Сообщение: confusion about user paring with pg_hba and pg_ident
Следующее
От: Andreas Kretschmer
Дата:
Сообщение: Re: HA Cluster Solution?