pg_restore out of memory

Поиск
Список
Период
Сортировка
От Miguel Ramos
Тема pg_restore out of memory
Дата
Msg-id e9adbc37-a7a5-66d6-26f0-d2820de86a7f@miguel.ramos.name
обсуждение исходный текст
Ответы Re: pg_restore out of memory  (Felipe Santos <felipepts@gmail.com>)
Re: pg_restore out of memory  (Sameer Kumar <sameer.kumar@ashnik.com>)
Re: pg_restore out of memory  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: pg_restore out of memory  (Karsten Hilbert <Karsten.Hilbert@gmx.net>)
Список pgsql-general
Hi,

We have backed up a database and now when trying to restore it to the
same server we get this:

 > # pg_restore -d recovery /mnt/paysdeloire2013_convertida2.1.dump
 > pg_restore: [custom archiver] out of memory
 > 12:09:56.58  9446.593u+1218.508s 24.3%  167+2589k  6+0io  0pf+0sw
6968822cs


Some information about the application:

- We have sensor data, including pictures, and number crunshing output,
then so the large tables on this database have 319, 279, 111 and 26GB.
Mostly on TOAST pages, but the 279GB one divides it evenly. This
database is 765GB. We try to keep them under 4TB.
- Transactions are large, some 100 MB at a time.
- We also use PostGIS.

About the server (dedicated):

- FreeBSD 9.1-RELEASE #0 on AMD64
- 16 GB of RAM
- 8x3GB hardware RAID 10
- 10TB slice for pgdata UFS-formatted and 32kB block
- PostgreSQL 9.1.8 custom compiled to get 32kB blocks
- Installed in 2013 with ~10 people working with it, 145 days uptime today.

I found two relevant threads on the mailing-lists.
The most recent one sugested that postgresql was being configured to use
more memory than what's available.
The older one sugested that the system limits on the size of the data or
stack segments were lower than required.

So here are some server parameters (relevant or otherwise):

 > max_connections = 100
 > shared_buffers = 4GB  -- 25% of RAM
 > temp_buffers = 32MB  -- irrelevant?
 > work_mem = 64MB
 > maintenance_work_mem = was 1G lowered to 256M then 64M
 > wal_buffers = -1  -- should mean 1/32 of shared_buffers = 128MB
 > checkpoint_segments = 64  -- WAL segments are 16MB
 > effective_cache_size = 8GB  -- irrelevant?


I suspect that the restore fails when constructing the indices. After
the process is aborted, the data appears to be all or most there, but no
indices.
So, all I did so far, was lowering maintenance_work_mem and it didn't work.

System limits, as you can see, are at defaults (32GB for data and 512MB
for stack):

 > # limit
 > cputime      unlimited
 > filesize     unlimited
 > datasize     33554432 kbytes
 > stacksize    524288 kbytes
 > coredumpsize unlimited
 > memoryuse    unlimited
 > vmemoryuse   unlimited
 > descriptors  11095
 > memorylocked unlimited
 > maxproc      5547
 > sbsize       unlimited
 > swapsize     unlimited

Shared memory is configured to allow for the single shared memory
segment postgresql appears to use, plus a bit of extra (8GB):

 > # ipcs -M
 > shminfo:
 >         shmmax:   8589934592    (max shared memory segment size)
 >         shmmin:            1    (min shared memory segment size)
 >         shmmni:          192    (max number of shared memory identifiers)
 >         shmseg:          128    (max shared memory segments per process)
 >         shmall:      2097152    (max amount of shared memory in pages)

And semaphores (irrelevant?)...

 > # ipcs -S
 > seminfo:
 >         semmni:          256    (# of semaphore identifiers)
 >         semmns:          512    (# of semaphores in system)
 >         semmnu:          256    (# of undo structures in system)
 >         semmsl:          340    (max # of semaphores per id)
 >         semopm:          100    (max # of operations per semop call)
 >         semume:           50    (max # of undo entries per process)
 >         semusz:          632    (size in bytes of undo structure)
 >         semvmx:        32767    (semaphore maximum value)
 >         semaem:        16384    (adjust on exit max value)


I don't know what else to try.
I lowered maintenance_work_mem without restarting the server.
In some of the attempts, but not all, the restore was done while people
were working.

Each attempt takes 12 hours...
We couldn't use the directory -Fd dump because it's postgresql 9.1.
The original database is still on the server, this is a test restore.

We have about one or two months of slack before we really need to remove
them from the server to recover space.


--
Miguel Ramos


В списке pgsql-general по дате отправления:

Предыдущее
От: Bjørn T Johansen
Дата:
Сообщение: Re: Slow SQL?
Следующее
От: Felipe Santos
Дата:
Сообщение: Re: pg_restore out of memory