Re: [GENERAL] Multixact members limit exceeded

From: Peter Hunčár
Subject: Re: [GENERAL] Multixact members limit exceeded
Date:
Msg-id: CADOjAB=AaaKYcj6e7KRbZXdxe0BY9gZTirStUyBERi5Ex4rEdg@mail.gmail.com
In reply to: Re: [GENERAL] Multixact members limit exceeded  (Andres Freund <andres@anarazel.de>)
Responses: Re: [GENERAL] Multixact members limit exceeded  (Alvaro Herrera <alvherre@2ndquadrant.com>)
           Re: [GENERAL] Multixact members limit exceeded  (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-general
Hi,

Thank you, yes, those are the 'urgent' tables; I'll talk to the developers regarding the locks. I too think there's something 'fishy' going on.

Anyway, could it be that autovacuum blocks a manual vacuum? I ran VACUUM (FULL, VERBOSE) and some tables finished quite fast, with a huge amount of I/O recorded in the monitoring, but some of them seem to be stuck.
Which brings me to the second question: how can I cancel autovacuum?
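A minimal sketch of cancelling an autovacuum worker, assuming superuser access (the pid below is a placeholder):

-- List running autovacuum workers:
SELECT pid, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';

-- Cancel one of them by pid (12345 is a placeholder):
SELECT pg_cancel_backend(12345);

Note that a "(to prevent wraparound)" autovacuum gets launched again shortly afterwards, so cancelling it only helps if a manual VACUUM of that table follows immediately.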

And the last thing I don't understand:

One particular table before vacuum full:

           relname            | relminmxid | table_size
------------------------------+------------+------------
 delayed_jobs                 | 1554151198 | 21 GB

And after vacuum full:
 
   relname    | relminmxid | table_size
--------------+------------+------------
 delayed_jobs | 1554155465 | 6899 MB

Shouldn't the relminmxid have changed after VACUUM FULL, or am I not understanding something?
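The numbers above come from a catalog query along these lines (a sketch; not the exact statement used):

SELECT relname,
       relminmxid,
       pg_size_pretty(pg_total_relation_size(oid)) AS table_size
FROM pg_class
WHERE relname = 'delayed_jobs';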

Thanks a lot


On Wed, Aug 9, 2017 at 7:57 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2017-08-09 10:06:48 +0000, Peter Hunčár wrote:
> We started feeding it several weeks ago and everything went smoothly until
> we hit this issue:
>
> 2017-08-09 05:21:50.946 WIB >DETAIL:  This command would create a multixact
> with 2 members, but the remaining space is only enough for 0 members.
> 2017-08-09 05:21:50.946 WIB >HINT:  Execute a database-wide VACUUM in
> database with OID 20101 with reduced vacuum_multixact_freeze_min_age and
> vacuum_multixact_freeze_table_age settings.

Ugh, that's not good.


> I did what the hint proposes, but on a such large database a vacuum takes
> several days.
> We are currently in data loading phase and we are sending only INSERT
> statements, there should be very little UPDATEs or DELETEs.
> Yesterday, the message disappeared shortly, but today it's back (vacuum is
> still running)

I suggest manually vacuuming the tables with the oldest xids first, that
ought to move the limits ahead. A query like
SELECT oid::regclass, relminmxid, mxid_age(relminmxid) FROM pg_class WHERE relminmxid <> '0' ORDER BY mxid_age(relminmxid) DESC;
might help you make that determination.
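A sketch of that per-table approach, combined with the reduced freeze settings from the HINT (the table name is a placeholder):

SET vacuum_multixact_freeze_min_age = 0;
SET vacuum_multixact_freeze_table_age = 0;
VACUUM (VERBOSE) some_old_table;  -- repeat for the tables with the oldest relminmxid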


> *Is there a way how to prevent/fix this so we can finish the loading (97%
> done), because the performance went down from 100 ops/sec to 15ops/min.*
>
> Most tables have around 150 M rows with toast data.
> There are several huge tables with toast data, currently autovacuumed, I
> guess this is the reason for the performance drop:
>
>  | usename  | application_name | state  | backend_xmin | query
> -+----------+------------------+--------+--------------+------------------------------------------------------------------------
>  | postgres |                  | active |   1683428686 | autovacuum: VACUUM pg_toast.pg_toast_14548803 (to prevent wraparound)
>  | postgres |                  | active |   1683428693 | autovacuum: VACUUM pg_toast.pg_toast_14548821 (to prevent wraparound)
>  | postgres |                  | active |   1683428705 | autovacuum: VACUUM pg_toast.pg_toast_14548828 (to prevent wraparound)
>  | postgres |                  | active |   1683428719 | autovacuum: VACUUM pg_toast.pg_toast_14548835 (to prevent wraparound)
>  | postgres |                  | active |   1683428732 | autovacuum: VACUUM pg_toast.pg_toast_14549150 (to prevent wraparound)

If these are the most "urgent" tables from the query above, I'd let
autovac finish, otherwise I'd cancel them and manually choose the most
urgent ones.
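To map those TOAST tables back to the tables that own them, a catalog join along these lines should work (a sketch):

SELECT c.oid::regclass AS parent_table,
       t.relname       AS toast_table
FROM pg_class c
JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE t.relname IN ('pg_toast_14548803', 'pg_toast_14548821', 'pg_toast_14548828',
                    'pg_toast_14548835', 'pg_toast_14549150');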


> After changing the vacuum_cost_limit to 10000 for one night, I saw 200MB/s
> of writes the whole night, but I had to change it back to 2000, because the
> insert perf went to 0.

You have a fair amount of autovacuum workers, perhaps that's also a bit
too much?  I'd consider making autovacuum's cleanup duties *less*
aggressive, but trying to keep the rate of cleanup high.  E.g. by
increasing vacuum_freeze_min_age and vacuum_multixact_freeze_min_age, so
only urgently old stuff gets cleaned up.
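A sketch of what that could look like, with purely illustrative values and assuming PostgreSQL 9.4+ for ALTER SYSTEM:

ALTER SYSTEM SET vacuum_freeze_min_age = 200000000;            -- default is 50 million
ALTER SYSTEM SET vacuum_multixact_freeze_min_age = 50000000;   -- default is 5 million
SELECT pg_reload_conf();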


Do you know how you come to have a lot of large multixacts?  That's
often indicative of way too many sessions trying to lock the same rows
and such, might be worthwhile to work on that too, to reduce the
frequency at which new ones are needed.
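One rough way to see sessions waiting on each other's locks (a sketch, assuming PostgreSQL 9.6+ for pg_blocking_pids()):

SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       state,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;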


Greetings,

Andres Freund
