Re: File Fragmentation

From: Jasen Betts
Subject: Re: File Fragmentation
Date:
Msg-id: kildea$d01$1@gonzo.reversiblemaps.ath.cx
In reply to: File Fragmentation ("jg" <jg@rilk.com>)
List: pgsql-general
On 2013-03-20, jg <jg@rilk.com> wrote:
> Hi,
>
> I have a PostgreSQL database with 50 tables.
> Every minute, sequentially, a batch loads 10,000 rows of 250 bytes with a COPY.
>
> After a day, I get a database with 50 tables, each holding 1,440 sets of 10,000 rows.
> The tables are cleanly and naturally clustered by the inserted timestamp.
> Each table's data file accumulates 1,440 fragments per day.
>
> Now, a partition rotation script drops old tables when a size limit is reached.
> Suppose this script runs and drops only one table holding a few days of data, then recreates a new empty one.
>
> This leaves the disk free space very fragmented: the space previously used by the rotated table.
>
> Then some COPY inserts new data; the existing tables append to their files and remain fragmented.
> The new table grows from the beginning of the free space and ends up more fragmented than ever.
>
> But the data is still clustered within each table.
>
> After a few more table rotations, all the tables are heavily fragmented, and even though the data is clustered inside the files, the file fragments are spread all over the drive.
>
> After a few days, I see IO wait grow and grow, even though the size of the database stabilises thanks to the rotation script.
>
> I suspect the heavily fragmented files are the cause of the growing IO wait (PostgreSQL on Windows).
> How can I cope with that?
> It seems I cannot pre-grow a file in PostgreSQL.

> I found a trick: if I create an empty table, insert dummy data, then
> insert the good data, then delete the dummy data, then vacuum the
> table (but not a full vacuum), I get a large file with free space at
> the beginning of the file. If all the files were created with that
> trick, and larger than the maximum amount of data COPYed, I would
> have no fragmented files.

I assume you're using generate_series to create the dummy data in a
single INSERT.

It might be faster to insert the dummy data through one connection
(leaving the transaction open), insert the first good data through a
second connection, then roll back the dummy insert and vacuum the table.
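That variant might look like this, assuming the same hypothetical table as above. Rolling back skips the DELETE pass entirely; the aborted transaction's tuples are dead and a plain VACUUM makes their space reusable:

```sql
-- Session A: extend the file with dummy rows, but do not commit.
BEGIN;
INSERT INTO samples
SELECT now(), repeat('x', 250)
FROM generate_series(1, 14400000);
-- leave the transaction open: the file has already been extended

-- Session B: load the first good data while session A is still open.
-- COPY samples FROM STDIN;

-- Session A: discard the dummy rows instantly, no DELETE needed.
ROLLBACK;

-- Any session: mark the dead dummy tuples' space as reusable.
VACUUM samples;
```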





--
⚂⚃ 100% natural
