Same-page UPDATEs in bloated tables

Поиск
Список
Период
Сортировка
От Ian Dowse
Тема Same-page UPDATEs in bloated tables
Дата
Msg-id 200610141746.aa21135@nowhere.iedowse.com
обсуждение исходный текст
Ответы Re: Same-page UPDATEs in bloated tables  (Alvaro Herrera <alvherre@commandprompt.com>)
Список pgsql-general
Hi,

I've been seeing an issue with 8.1.4 that seems to be caused by the
way UPDATE operations prefer to place the new row version in the
same page as the original row. The issue is specific to UPDATEs;
it does not occur when each UPDATE is replaced by a DELETE/INSERT
pair. The problem can prevent a temporarily bloated table from ever
returning to its normal size even though all rows are frequently
changing and regular vacuuming is taking place.

A simple way to demonstrate the issue is to insert 10001 rows into
an empty table and delete the first 10000 rows. Now, repeatedly
performing (lazy) vacuums and UPDATEs will never result in the table
size shrinking:

    x=# SELECT * FROM foo;
      id   | val
    -------+------
     10001 | foo2
    (1 row)

    x=# VACUUM ANALYSE foo;
    VACUUM
    x=# SELECT relpages, reltuples FROM pg_class WHERE relname = 'foo';
     relpages | reltuples
    ----------+-----------
          527 |         1
    (1 row)

    x=# UPDATE foo SET val = 'foo3' WHERE id = 10001;
    UPDATE 1
    x=# VACUUM ANALYSE foo;
    VACUUM
    x=# SELECT relpages, reltuples FROM pg_class WHERE relname = 'foo';
     relpages | reltuples
    ----------+-----------
          527 |         1
    (1 row)

However, using DELETE/INSERT instead causes the vacuum to immediately
shrink the table:

    x=# DELETE from foo WHERE id = 10001;
    DELETE 1
    x=# INSERT INTO foo(val) VALUES('foo4');
    INSERT 0 1
    x=# VACUUM ANALYSE foo;
    VACUUM
    x=# SELECT relpages, reltuples FROM pg_class WHERE relname = 'foo';
     relpages | reltuples
    ----------+-----------
        1 |         1
    (1 row)

(Note that the above is just intended as a simple example to
demonstrate the effect; the actual case where this behaviour was
observed involved a high-churn table that never emptied but where
every row was updated at least a few times per day and (lazy)
vacuuming was performed approx every 10 minutes. The table would
occasionally become bloated, e.g. while performing a slow backup
of the database, and it would never recover its original size if
the rows were updated with UPDATE operations, but would quickly
return to its original size if DELETE/INSERT pairs were used instead).

This behaviour seemed fairly surprising given that UPDATEs are
supposed to be quite similar to INSERT/DELETE pairs in PostgreSQL.
Would it be possible to come up with some logic (in heap_update()?)
that could avoid using the same page if the page offset is way
beyond the nominal size of the table? Maybe some appropriate
statistics could be recorded by vacuum/analyse?

Ian

В списке pgsql-general по дате отправления:

Предыдущее
От: "Niederland"
Дата:
Сообщение: Aggregate in Correlated SubQuery
Следующее
От: "Niederland"
Дата:
Сообщение: Re: Aggregate in Correlated SubQuery