Re: Updating a large table

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Updating a large table
Дата
Msg-id 59776cbc-bd4a-7ce9-133c-be5665075429@2ndquadrant.com
обсуждение исходный текст
Ответ на Updating a large table  (Timokhin Maxim <ncx2@yandex.com>)
Список pgsql-performance
On 12/22/2017 05:46 PM, Timokhin Maxim wrote:
> 
> Hello! We have a large table 11GB ( about 37 million records ) and we
> need to alter a table - add a new column with default values is 
> false. Also 'NOT NULL' is required.
>
> So, first I've done:
>
> ALTER TABLE clusters ALTER COLUMN "is_paid";
> 

That seems somewhat incomplete ... what exactly did the ALTER do?

> after that:
> 
> UPDATE clusters SET is_paid = DEFAULT where ctime <= now() - interval '720h' AND is_paid != FALSE;
> 
> Everything went ok. Then I tried to run it again for an interval of 1
> years. And I got that no one can't see - the was no available space
> on a disk. The reason was WAL-files ate everything.
> Master-server couldn't send some WAL-file to their replicas. Bandwidth wasn't enough.
> 

Well, then perhaps the best solution is to add more disk space and/or
make sure the network bandwidth is sufficient?

In any case, don't forget this may also need to update all indexes on
the table, because the new row versions will end up on different pages.
So while the table has 11GB, this update may need much more WAL space
than that.

> Well, I'm searching for a better idea to update the table.
> Solutions I found.
> 1. Separate my UPDATE by chunks.

If this is a one-time change, this is probably the best option.

> 2. Alter a table using a new temporary table, but it's not convenient
> for me because there is a lot of foreign keys and indexes.
Right.

> 3. Hot-update. This is the most interesting case for me.
> Speaking of HOT-update
https://www.dbrnd.com/2016/03/postgresql-the-awesome-table-fillfactor-to-speedup-update-and-select-statement/
> The article says: it might be useful for tables that change often and moreover It would be the best way to increase
thespeed of UPDATE.
 

First of all, to make HOT possible there would have to be enough free
space on the pages. As you need to update the whole table, that means
each table would have to be only 50% full. That's unlikely to be true,
and you can't fix that at this point.

> So, my questions are will it work for all tuples? It says that - no 
> https://www.dbrnd.com/2016/03/postgresql-alter-table-to-change- 
> fillfactor-value/, but I could not find a confirmation in official 
> postresql's documentation.
Not sure I understand your question, but HOT can only happen when two
conditions are met:

1) the update does not change any indexed column

This is likely met, assuming you don't have an index on is_paid.

2) there's enough space on the same page for the new row version

This is unlikely to be true, because the default fillfactor for tables
is 90%. You may change fillfactor using ALTER TABLE, but that only
applies to new data.

Moreover, as the article says - this is useful for tables that change
often. Which is not quite what one-time table rewrite does.

So HOT is not the solution you're looking for.

> Why do I need to launch vacuum after updating?

You don't need to launch vacuum - autovacuum will take care of that
eventually. But you may do that, to do the cleanup when it's convenient
for you.

> How should I reduce the better fillfactor?

For example to change fillfactor to 75% (i.e. 25% free space):

ALTER TABLE t SET (fillfactor = 75);

But as I said, it's not a solution for you.

> What will be with WAL-files it this case?

Not sure what you mean.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В списке pgsql-performance по дате отправления:

Предыдущее
От: salah jubeh
Дата:
Сообщение: Re: Updating a large table
Следующее
От: Jean Baro
Дата:
Сообщение: Batch insert heavily affecting query performance.