Re: Recommendation to run vacuum FULL in parallel

Поиск
Список
Период
Сортировка
От Perumal Raj
Тема Re: Recommendation to run vacuum FULL in parallel
Дата
Msg-id CALvqh4rATReNKQLmrU0xHe5tpr_EFO5h2-n-w-_QFwYmdGVbbg@mail.gmail.com
обсуждение исходный текст
Ответ на RE: Recommendation to run vacuum FULL in parallel  (Kevin Brannen <KBrannen@efji.com>)
Ответы Re: Recommendation to run vacuum FULL in parallel  (Ron <ronljohnsonjr@gmail.com>)
Список pgsql-general
Thanks Kevin for the inputs,

In my Case there are 500+ Tables and biggest chunk 30GB ( Table only) + its indexes.
So i have created 6 batches and executed in parallel . All my scripts completed in 2 Hours and my DB size came down from 500GB to 300GB.

Yes i do see CPU spike, But i did whole activity with full apps down time.

Going forward i am going to run vacuum daily basis to maintain the DB size.

Also Table/DB Age came down drastically.

Thanks
Raj

On Thu, Apr 4, 2019 at 12:53 PM Kevin Brannen <KBrannen@efji.com> wrote:

From: Perumal Raj <perucinci@gmail.com>

So conclude the requirement here , The only way to parallelism is multiple script. And no need to do REINDEX exclusively.

Question : Do we need to consider  Table dependencies while preparing script in order to avoid table locks during vacuum full ?

 

We have a small bash script (see below) that get the list of tables and their sizes, sorted smallest to largest, and do “vacuum full” one at a time because (as someone else pointed out) this is very I/O intensive. That order also helps to ensure we finish because some of our installs are at the edge of running out of space (an issue we’re dealing with). I probably wouldn’t have a problem doing 2 at a time, but we do this in the middle of the night when activity is lowest and it only takes 1-2 hours, so we’re good with it. It sounds like you have a lot more data though.

 

You might also consider putting the data into different tablespaces which are spread over multiple disks to help I/O. If you can, use SSD drives, they help with speed quite a bit. 😊

 

Don’t worry about table dependencies. This is a physical operation, not a data operation.

 

HTH,

Kevin

 

    $PGPATH/psql -t -c "

        WITH s AS (SELECT nspname || '.' || relname AS TABLE_NAME, pg_total_relation_size(c.oid) AS total_bytes

                  FROM pg_class c

                  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace

                  WHERE relkind = 'r' AND nspname NOT IN ( 'pg_catalog', 'information_schema' )

                  ORDER BY 2 )

        SELECT table_name FROM s

        " |

    while read t ; do echo "" ; echo $t; $PGPATH/vacuumdb -w -z -f -t $t ; done

###

This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential information. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, distribution, review, copy or use of any of the information contained in or attached to this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately notify us by reply e-mail, and destroy the original transmission and its attachments without reading them or saving them to disk. Thank you.

В списке pgsql-general по дате отправления:

Предыдущее
От: Prakash Ramakrishnan
Дата:
Сообщение: Re: os upgrade 7.3 to 7.5 (postgres version 10.5)
Следующее
От: Ron
Дата:
Сообщение: Re: Recommendation to run vacuum FULL in parallel