Discussion: pg_restore out of memory


pg_restore out of memory

From: Miguel Ramos
Hi,

We have backed up a database and now when trying to restore it to the
same server we get this:

 > # pg_restore -d recovery /mnt/paysdeloire2013_convertida2.1.dump
 > pg_restore: [custom archiver] out of memory
 > 12:09:56.58  9446.593u+1218.508s 24.3%  167+2589k  6+0io  0pf+0sw
6968822cs


Some information about the application:

- We have sensor data, including pictures, and number-crunching output,
so the large tables in this database are 319, 279, 111 and 26GB.
Mostly in TOAST pages, though the 279GB table splits about evenly
between heap and TOAST. This database is 765GB. We try to keep them
under 4TB.
- Transactions are large, some 100 MB at a time.
- We also use PostGIS.

About the server (dedicated):

- FreeBSD 9.1-RELEASE #0 on AMD64
- 16 GB of RAM
- 8x3TB hardware RAID 10
- 10TB slice for pgdata, UFS-formatted with 32kB blocks
- PostgreSQL 9.1.8 custom compiled to get 32kB blocks
- Installed in 2013 with ~10 people working with it, 145 days uptime today.

I found two relevant threads on the mailing lists.
The more recent one suggested that PostgreSQL was configured to use
more memory than is available.
The older one suggested that the system limits on the size of the data or
stack segments were lower than required.

So here are some server parameters (relevant or otherwise):

 > max_connections = 100
 > shared_buffers = 4GB  -- 25% of RAM
 > temp_buffers = 32MB  -- irrelevant?
 > work_mem = 64MB
 > maintenance_work_mem = was 1G lowered to 256M then 64M
 > wal_buffers = -1  -- auto-tuned: 1/32 of shared_buffers, capped at one 16MB WAL segment
 > checkpoint_segments = 64  -- WAL segments are 16MB
 > effective_cache_size = 8GB  -- irrelevant?


I suspect that the restore fails while constructing the indices. After
the process is aborted, the data appears to be all or mostly there, but
no indices.
So all I did so far was lowering maintenance_work_mem, and it didn't work.

System limits, as you can see, are at defaults (32GB for data and 512MB
for stack):

 > # limit
 > cputime      unlimited
 > filesize     unlimited
 > datasize     33554432 kbytes
 > stacksize    524288 kbytes
 > coredumpsize unlimited
 > memoryuse    unlimited
 > vmemoryuse   unlimited
 > descriptors  11095
 > memorylocked unlimited
 > maxproc      5547
 > sbsize       unlimited
 > swapsize     unlimited

Shared memory is configured to allow for the single shared memory
segment postgresql appears to use, plus a bit of extra (8GB):

 > # ipcs -M
 > shminfo:
 >         shmmax:   8589934592    (max shared memory segment size)
 >         shmmin:            1    (min shared memory segment size)
 >         shmmni:          192    (max number of shared memory identifiers)
 >         shmseg:          128    (max shared memory segments per process)
 >         shmall:      2097152    (max amount of shared memory in pages)

And semaphores (irrelevant?)...

 > # ipcs -S
 > seminfo:
 >         semmni:          256    (# of semaphore identifiers)
 >         semmns:          512    (# of semaphores in system)
 >         semmnu:          256    (# of undo structures in system)
 >         semmsl:          340    (max # of semaphores per id)
 >         semopm:          100    (max # of operations per semop call)
 >         semume:           50    (max # of undo entries per process)
 >         semusz:          632    (size in bytes of undo structure)
 >         semvmx:        32767    (semaphore maximum value)
 >         semaem:        16384    (adjust on exit max value)


I don't know what else to try.
I lowered maintenance_work_mem without restarting the server.
In some of the attempts, but not all, the restore was done while people
were working.

Each attempt takes 12 hours...
We couldn't use the directory (-Fd) dump because it's PostgreSQL 9.1.
The original database is still on the server, this is a test restore.

We have about one or two months of slack before we really need to remove
them from the server to recover space.


--
Miguel Ramos


Re: pg_restore out of memory

From: Felipe Santos


On 2016-07-12 at 08:25 GMT-03:00, Miguel Ramos <org.postgresql@miguel.ramos.name> wrote:

> # pg_restore -d recovery /mnt/paysdeloire2013_convertida2.1.dump
> pg_restore: [custom archiver] out of memory


Hi Miguel,

I would try lowering max_connections to 50 and then set work_mem to 128MB.

After that restart your server and retry the restore.

Tell us if that helps.

Regards,

Re: pg_restore out of memory

From: Miguel Ramos
At 12:32 on 12-07-2016, Felipe Santos wrote:
> I would try lowering max_connections to 50 and then set work_mem to 128MB.
>
> After that restart your server and retry the restore.

Ok, I will try restarting tonight.
work_mem is the parameter I was most afraid of.

I'll post some news in 24h...

--
Miguel Ramos


Re: pg_restore out of memory

From: Felipe Santos


On 2016-07-12 at 08:54 GMT-03:00, Miguel Ramos <org.postgresql@miguel.ramos.name> wrote:

> Ok, I will try restarting tonight.
> work_mem is the parameter I was most afraid of.

After the restore (whether it succeeds or fails), don't forget to set the parameters back to their original values (max_connections = 100 and work_mem = 64MB).
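Settings like work_mem and maintenance_work_mem do not require a restart, so for a one-off restore they can also be overridden per session through libpq's PGOPTIONS environment variable. A sketch (the value is illustrative; settings such as max_connections and shared_buffers still require a restart):

```shell
# Per-session override via libpq's PGOPTIONS: pg_restore links against
# libpq, so the option applies to its connection only, leaving
# postgresql.conf untouched.  The value shown is illustrative.
PGOPTIONS='-c maintenance_work_mem=512MB' \
    pg_restore -d recovery /mnt/paysdeloire2013_convertida2.1.dump
```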

BR

Re: pg_restore out of memory

From: Sameer Kumar


On Tue, 12 Jul 2016, 7:25 p.m., Miguel Ramos <org.postgresql@miguel.ramos.name> wrote:

> We have backed up a database and now when trying to restore it to the
> same server we get this:
>
>  > # pg_restore -d recovery /mnt/paysdeloire2013_convertida2.1.dump
>  > pg_restore: [custom archiver] out of memory
>
> [...]
>
>  > maintenance_work_mem = was 1G lowered to 256M then 64M


Why did you lower it? I think increasing it should help more. But 1GB seems fine.


 > wal_buffers = -1  -- auto-tuned: 1/32 of shared_buffers, capped at one 16MB WAL segment

Increase this during the restore, maybe to 512MB.

 > checkpoint_segments = 64  -- WAL segments are 16MB
 > effective_cache_size = 8GB  -- irrelevant?


I suspect that the restore fails when constructing the indices. After
the process is aborted, the data appears to be all or most there, but no
indices.

What is logged in database log files? Have you checked that?

> So, all I did so far was lowering maintenance_work_mem, and it didn't work.
>
> [...]

What are your vm.dirty_ratio and vm.dirty_background_ratio? I think reducing them may help, but I cannot really say what exactly would help unless you are able to find the error source in the db logs.
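Gathered into a postgresql.conf sketch, the restore-time suggestions from this reply might look like this (values are illustrative, not tested on this workload):

```ini
# Temporary postgresql.conf settings for the duration of a bulk restore
# (illustrative values; revert after the restore finishes).
maintenance_work_mem = 1GB   # more sort memory for index builds
wal_buffers = 512MB          # explicit, instead of the -1 auto-tune
checkpoint_segments = 128    # fewer checkpoint cycles during bulk load
```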




--
Best Regards
Sameer Kumar | DB Solution Architect 
ASHNIK PTE. LTD.

101 Cecil Street, #11-11 Tong Eng Building, Singapore 069 533

T: +65 6438 3504 | M: +65 8110 0350 | www.ashnik.com

Re: pg_restore out of memory

From: Tom Lane
Miguel Ramos <org.postgresql@miguel.ramos.name> writes:
> We have backed up a database and now when trying to restore it to the
> same server we get this:

>>> # pg_restore -d recovery /mnt/paysdeloire2013_convertida2.1.dump
>>> pg_restore: [custom archiver] out of memory

It looks to me like this error is pg_restore itself running out of memory,
not reporting a server-side OOM condition.  You could verify that by
looking in the server log to see whether any out-of-memory error appeared
there.  But assuming that I'm right, the other responses suggesting
tweaking server configurations are not on-point at all.

Unless you're running pg_restore under a really small ulimit, this would
seem to suggest some kind of memory leak in pg_restore itself.  I wonder
how many objects are in your dump (how long is "pg_restore -l" output)?

> - PostgreSQL 9.1.8 custom compiled to get 32kB blocks

9.1.8 is pretty old ...

            regards, tom lane


Re: pg_restore out of memory

From: Miguel Ramos
At 15:40 on 12-07-2016, Tom Lane wrote:
> Miguel Ramos <org.postgresql@miguel.ramos.name> writes:
>> We have backed up a database and now when trying to restore it to the
>> same server we get this:
>
>>>> # pg_restore -d recovery /mnt/paysdeloire2013_convertida2.1.dump
>>>> pg_restore: [custom archiver] out of memory
>
> It looks to me like this error is pg_restore itself running out of memory,
> not reporting a server-side OOM condition.  You could verify that by
> looking in the server log to see whether any out-of-memory error appeared
> there.  But assuming that I'm right, the other responses suggesting
> tweaking server configurations are not on-point at all.

The logs of the last attempt are already gone.
Obviously, I'll keep tonight's logs.


> Unless you're running pg_restore under a really small ulimit, this would
> seem to suggest some kind of memory leak in pg_restore itself.  I wonder
> how many objects in your dump (how long is "pg_restore -l" output)?

pg_restore -l | wc gives me:
     1055    7984   70675

It looks small to me.
We don't have a lot of tables, instead we have really huge tables.
We try to keep the schema normalized when possible; we only denormalize
when the resulting tuples would become too small for PostgreSQL, when
the row overhead becomes prohibitive.

>
>> - PostgreSQL 9.1.8 custom compiled to get 32kB blocks
>
> 9.1.8 is pretty old ...

Oh, I'll forward your email to those who are older than I, the many far
wiser than I, but whose love of database servers is certainly not
stronger than mine, by far.

Thanks,


--
Miguel Ramos


Re: pg_restore out of memory

From: Miguel Ramos
At 16:23 on 12-07-2016, Miguel Ramos wrote:
>> It looks to me like this error is pg_restore itself running out of
>> memory,
>> not reporting a server-side OOM condition.  You could verify that by
>> looking in the server log to see whether any out-of-memory error appeared
>> there.  But assuming that I'm right, the other responses suggesting
>> tweaking server configurations are not on-point at all.
[...]
>>
>>> - PostgreSQL 9.1.8 custom compiled to get 32kB blocks
>>
>> 9.1.8 is pretty old ...

Of course, I could try a recent pg_restore.
That'll take me a bit of time too, but I can try.

--
Miguel Ramos



Re: pg_restore out of memory

From: Tom Lane
Miguel Ramos <org.postgresql@miguel.ramos.name> writes:
> At 15:40 on 12-07-2016, Tom Lane wrote:
>> Unless you're running pg_restore under a really small ulimit, this would
>> seem to suggest some kind of memory leak in pg_restore itself.  I wonder
>> how many objects in your dump (how long is "pg_restore -l" output)?

> pg_restore -l | wc gives me:
>      1055    7984   70675

> It looks small to me.

Yeah, the archive TOC is clearly not large enough to cause any problem in
itself.  I'm wondering at this point about libpq buffer bloat.  We've
fixed a number of problems in that area over the last few years, though
in a quick review of the commit logs I don't see anything that clearly
bears on your problem.  (Commit 86888054a92aeca4 is pretty interesting
but applies to mostly-server-to-client data transfer, which is the wrong
direction.)  I wonder whether your data is such that loading it would
trigger a bunch of NOTICEs from the server?

Anyway, it would be useful to try running the restore with a more modern
version of pg_restore, to see if that helps.

            regards, tom lane


Re: pg_restore out of memory

From: Miguel Ramos
On Tue, 12-07-2016 at 11:58 -0400, Tom Lane wrote:
> 
> Anyway, it would be useful to try running the restore with a more
> modern
> version of pg_restore, to see if that helps.
>
>             regards, tom lane
>
>

I have the restart scheduled for tonight.
So I will do the other test first, nevertheless.

This is because I have the impression that it fails during index
creation, where I think the client's role would be minimal.
Maybe I saw something in the logs when this problem was first reported
internally.

I can't spare a second 700G to do both tests concurrently.

I will get back when I have more news.

Thanks,

--
Miguel


Re: pg_restore out of memory

From: Tom Lane
Miguel Ramos <org.postgresql@miguel.ramos.name> writes:
> This because I have the impression that it is during index creation,
> where I think client role would be minimal.

Hard to believe really, given the spelling of the message.  But anyway,
be sure you do the run with log_statement = all so that it's clear what
is being worked on when the error happens.
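A minimal sketch of that logging setup as a postgresql.conf fragment (log_statement is reloadable, so no restart is needed):

```ini
# postgresql.conf -- log every statement during the test restore
log_statement = 'all'
```

Running pg_ctl reload (or SELECT pg_reload_conf() as a superuser) makes it take effect.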

            regards, tom lane


Re: pg_restore out of memory

From: Karsten Hilbert
On Tue, Jul 12, 2016 at 12:25:08PM +0100, Miguel Ramos wrote:

> > # pg_restore -d recovery /mnt/paysdeloire2013_convertida2.1.dump
> > pg_restore: [custom archiver] out of memory
> > 12:09:56.58  9446.593u+1218.508s 24.3%  167+2589k  6+0io  0pf+0sw 6968822cs
...
> I suspect that the restore fails when constructing the indices. After the
> process is aborted, the data appears to be all or most there, but no
> indices.
...
> I don't know what else to try.

You could try restoring w/o indices and re-adding them later.
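A sketch of that approach using pg_restore's list-file support (-l/-L); the dump path is taken from the thread, and the grep patterns are illustrative and worth checking against the actual TOC output:

```shell
# Split the archive TOC into "everything but indexes" and "indexes
# only", then restore in two passes.  pg_restore -L restores only the
# entries present in the given list file.
pg_restore -l /mnt/paysdeloire2013_convertida2.1.dump > full.list

# Index and constraint entries (primary/unique keys build indexes too):
grep -Ev ' (INDEX|CONSTRAINT) ' full.list > data.list
grep -E  ' (INDEX|CONSTRAINT) ' full.list > index.list

pg_restore -d recovery -L data.list  /mnt/paysdeloire2013_convertida2.1.dump
pg_restore -d recovery -L index.list /mnt/paysdeloire2013_convertida2.1.dump
```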

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: pg_restore out of memory

From: Miguel Ramos

Finally, here are the log messages at the moment of the error.
It is clearly not while building indices.

The table in question is a big one, 111GB.
The latitude, longitude and height fields are arrays of around 500-700
elements per row (double and real).

So, what does this mean?
Did the client abort? I think I have seen that "unexpected message
type 0x58" with other kinds of interruptions.



Jul 13 20:10:10 ema postgres[97889]: [867-1] LOG:  statement: COPY positioned_scan (id_dataset, id_acquired_set,
sequence_number,id_scan_dataset, latitude, longitude, height, srid, srid_vertical) FROM stdin; 
Jul 13 20:10:10 ema postgres[97889]: [867-2]
Jul 13 21:08:06 ema postgres[97889]: [868-1] ERROR:  unexpected message type 0x58 during COPY from stdin
Jul 13 21:08:06 ema postgres[97889]: [868-2] CONTEXT:  COPY positioned_scan, line 2779323
Jul 13 21:08:06 ema postgres[97889]: [868-3] STATEMENT:  COPY positioned_scan (id_dataset, id_acquired_set,
sequence_number,id_scan_dataset, latitude, longitude, height, srid, srid_vertical) FROM stdin; 
Jul 13 21:08:06 ema postgres[97889]: [868-4]
Jul 13 21:08:06 ema postgres[97889]: [869-1] LOG:  could not send data to client: Broken pipe
Jul 13 21:08:06 ema postgres[97889]: [869-2] STATEMENT:  COPY positioned_scan (id_dataset, id_acquired_set,
sequence_number,id_scan_dataset, latitude, longitude, height, srid, srid_vertical) FROM stdin; 
Jul 13 21:08:06 ema postgres[97889]: [869-3]
Jul 13 21:08:06 ema postgres[97889]: [870-1] LOG:  unexpected EOF on client connection

Thanks,

-- Miguel


On Tue, 12-07-2016 at 15:10 -0400, Tom Lane wrote:
> Miguel Ramos <org.postgresql@miguel.ramos.name> writes:
> > This because I have the impression that it is during index
> > creation, where I think client role would be minimal.
>
> Hard to believe really, given the spelling of the message.  But anyway,
> be sure you do the run with log_statement = all so that it's clear what
> is being worked on when the error happens.
>
>             regards, tom lane




Re: pg_restore out of memory

From: John R Pierce
On 7/13/2016 1:51 PM, Miguel Ramos wrote:
> Finally, here are the log messages at the moment of the error.
> It is clearly not while building indices.
>
> The table in question is a big one, 111GB.
> Fields latitude, longitude and height are arrays of length around 500-
> 700 on each row (double and real).
>
> So, what does this mean?
> Was it the client that aborted? I think I saw that "unexpected message
> type 0x58" on other types of interruptions.

Are pg_restore and the postgres server the same version?

$ pg_restore --version
pg_restore (PostgreSQL) 9.3.13

$ su - postgres
-bash-4.1$ psql -c "select version()"
version
----------------------------------------------------------------------------------------------------------------
  PostgreSQL 9.3.13 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
(1 row)


--
john r pierce, recycling bits in santa cruz



Re: pg_restore out of memory

From: Miguel Ramos

On Tue, 12-07-2016 at 13:08 +0000, Sameer Kumar wrote:
> On Tue, 12 Jul 2016, 7:25 p.m., Miguel Ramos
> <org.postgresql@miguel.ramos.name> wrote:
> > I found two relevant threads on the mailing lists.
> > The more recent one suggested that PostgreSQL was configured to use
> > more memory than is available.
> > The older one suggested that the system limits on the size of the
> > data or stack segments were lower than required.
> >
> > So here are some server parameters (relevant or otherwise):
> >
> >  > max_connections = 100
> >  > shared_buffers = 4GB  -- 25% of RAM
> >  > temp_buffers = 32MB  -- irrelevant?
> >  > work_mem = 64MB
> >  > maintenance_work_mem = was 1G lowered to 256M then 64M
>
> Why did you lower it? I think increasing it should help more. But
> 1GB seems fine.

The advice was in that thread, and maybe the problem was very different.
The idea I get is that PostgreSQL can always find a way to do its work,
maybe using an out-of-core algorithm.

If you tell it to use a lot of memory, then it will try to use RAM and
then it really may run out of memory.

So, basically, increasing the memory available is a performance
improvement, if you feel safe that the memory really is available.

But maybe that logic applies only to work_mem...
And it's also work_mem that is difficult to bound, according to the
manual.
I don't really know...



> >  > wal_buffers = -1  -- should mean 1/32 of shared_buffers = 128MB
>
> Increase this during the restore, may be 512MB

I'll keep that advice in mind, but meanwhile I have posted the log
messages to the list.

> >  > checkpoint_segments = 64  -- WAL segments are 16MB
> >  > effective_cache_size = 8GB  -- irrelevant?
> >
> > I suspect that the restore fails when constructing the indices. After
> > the process is aborted, the data appears to be all or most there, but
> > no indices.
>
> What is logged in database log files? Have you checked that?

This time I collected the logs.
I posted the 10 relevant lines as a reply to another message.
I'll repeat only the ERROR line here:

Jul 13 21:08:06 ema postgres[97889]: [868-1] ERROR:  unexpected message type 0x58 during COPY from stdin



> What are your vm.dirty_ratio and vm.dirty_background_ratio? I think
> reducing them may help. But can not really say what exactly would
> help unless you are able to get the error source in db logs

This is a FreeBSD server.
I'm not really sure what the equivalent would be.
Also, I don't think tuning the VM would help.

The abort is quite deterministic, 12-13 hours after the beginning of
the restore, and it does not change much whether the restore runs during
the night or during the day with 10 people working intensively.


Thanks,

-- Miguel



Re: pg_restore out of memory

From: Miguel Ramos
Yes.
Both 9.1.8, I checked right now.

-- Miguel

On Wed, 13-07-2016 at 13:59 -0700, John R Pierce wrote:
> is pg_restore, and the postgres server all the same version?


Re: pg_restore out of memory

From: Adrian Klaver
On 07/13/2016 01:51 PM, Miguel Ramos wrote:
>
>
> Finally, here are the log messages at the moment of the error.
> It is clearly not while building indices.
>
> The table in question is a big one, 111GB.
> Fields latitude, longitude and height are arrays of length around 500-
> 700 on each row (double and real).
>
> So, what does this mean?
> Was it the client that aborted? I think I saw that "unexpected message
> type 0x58" on other types of interruptions.

So where are you running pg_restore from, manually from the command line
or from within a script?



--
Adrian Klaver
adrian.klaver@aklaver.com


Re: pg_restore out of memory

From: Tom Lane
Miguel Ramos <org.postgresql@miguel.ramos.name> writes:
> So, what does this mean?
> Was it the client that aborted? I think I saw that "unexpected message
> type 0x58" on other types of interruptions.

Yeah, 0x58 is ASCII 'X' which is a Terminate message.  Between that and
the unexpected-EOF report, it's quite clear that the client side went
belly-up, not the server.  We still don't know exactly why, but given
that pg_restore reports "out of memory" before quitting, there must be
some kind of memory leak going on inside pg_restore.
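As a quick sanity check of the byte value (0x58 is octal 130, ASCII 'X'):

```shell
# Print byte 0x58 (octal \130): ASCII 'X', libpq's Terminate message type.
printf '\130\n'        # prints: X
```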

> Jul 13 20:10:10 ema postgres[97889]: [867-1] LOG:  statement: COPY positioned_scan (id_dataset, id_acquired_set,
sequence_number,id_scan_dataset, latitude, longitude, height, srid, srid_vertical) FROM stdin; 

I'm guessing from the column names that you've got some PostGIS data
types in this table.  I wonder if that's a contributing factor.

I'm still suspicious that this might be some sort of NOTICE-processing-
related buffer bloat.  Could you try loading the data with the server's
log_min_messages level cranked down to NOTICE, so you can see from the
postmaster log whether any NOTICEs are being issued to the pg_restore
session?

            regards, tom lane


Re: pg_restore out of memory

From: John R Pierce
On 7/13/2016 2:11 PM, Miguel Ramos wrote:
> Yes.
> Both 9.1.8, I checked right now.

9.1 is up to 9.1.22; that's a lot of bug fixes you're missing. 9.1.8 was
released 2013-02-07, 9.1.22 on 2016-05-12.


--
john r pierce, recycling bits in santa cruz



Re: pg_restore out of memory

From: Tom Lane
I wrote:
> I'm still suspicious that this might be some sort of NOTICE-processing-
> related buffer bloat.  Could you try loading the data with the server's
> log_min_messages level cranked down to NOTICE, so you can see from the
> postmaster log whether any NOTICEs are being issued to the pg_restore
> session?

BTW, I experimented with that theory by creating a table with a BEFORE
INSERT trigger function that emits a NOTICE, and then making pg_restore
restore a lot of data into it.  I could not see any memory growth in
the pg_restore process.  However, I was testing 9.1.22, not 9.1.8.
Also, some of the misbehaviors we've discovered along these lines have
been timing-sensitive, meaning that the problem might or might not
reproduce for another person even with the same software version.
Are you running pg_restore locally on the same machine as the server,
or across a network --- and if the latter, how fast is the network?
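
The experiment described above can be reconstructed roughly like this (a sketch; the table, function, and trigger names are invented for illustration):

```sql
-- Scratch table whose data restore will fire one NOTICE per row.
CREATE TABLE notice_test (id integer, payload text);

CREATE FUNCTION emit_notice() RETURNS trigger AS $$
BEGIN
    RAISE NOTICE 'inserting row %', NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER notice_before_insert
    BEFORE INSERT ON notice_test
    FOR EACH ROW EXECUTE PROCEDURE emit_notice();

-- Then pg_dump -Fc the database and watch pg_restore's memory use
-- while the COPY for notice_test is replayed.
```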

            regards, tom lane


Re: pg_restore out of memory

From
Miguel Ramos
Date:
That's 3 years and 3 months with absolutely zero maintenance.
Apart from the scripts I left back then.

During that time, it was used by an average of 10 people, some 9T of
sensor data entered at the rate of 60G/week, and another 3T of analysis
data was produced.

The expression "cutting down on maintenance" could be appropriate.
But that was not my decision.

Right now, updating is really difficult.
They're in a rush, using the server 15 hours a day, and maybe they'll
need the weekend as well.

Plus, the OS is now old as well, the ports tree is out of sync with the
OS... it will all be a lot harder.

-- Miguel

A Qua, 13-07-2016 às 14:19 -0700, John R Pierce escreveu:
> On 7/13/2016 2:11 PM, Miguel Ramos wrote:
> > Yes.
> > Both 9.1.8, I checked right now.
>
> 9.1 is up to 9.1.22; that's a lot of bug fixes you're missing. 9.1.8
> was released 2013-02-07, 9.1.22 on 2016-05-12.
>
>
> -- 
> john r pierce, recycling bits in santa cruz
>
>
>


Re: pg_restore out of memory

From
Miguel Ramos
Date:
A Qua, 13-07-2016 às 17:15 -0400, Tom Lane escreveu:
> Miguel Ramos <org.postgresql@miguel.ramos.name> writes:
> > So, what does this mean?
> > Was it the client that aborted? I think I saw that "unexpected
> > message
> > type 0x58" on other types of interruptions.
>
> Yeah, 0x58 is ASCII 'X' which is a Terminate message.  Between that
> and
> the unexpected-EOF report, it's quite clear that the client side went
> belly-up, not the server.  We still don't know exactly why, but given
> that pg_restore reports "out of memory" before quitting, there must
> be
> some kind of memory leak going on inside pg_restore.
>
> > Jul 13 20:10:10 ema postgres[97889]: [867-1] LOG:  statement: COPY
> > positioned_scan (id_dataset, id_acquired_set, sequence_number,
> > id_scan_dataset, latitude, longitude, height, srid, srid_vertical)
> > FROM stdin;
>
> I'm guessing from the column names that you've got some PostGIS data
> types in this table.  I wonder if that's a contributing factor.
>
> I'm still suspicious that this might be some sort of NOTICE-
> processing-
> related buffer bloat.  Could you try loading the data with the
> server's
> log_min_messages level cranked down to NOTICE, so you can see from
> the
> postmaster log whether any NOTICEs are being issued to the pg_restore
> session?
>
>             regards, tom lane


No, no PostGIS here. The columns latitude, longitude and height are
just arrays. The first two are arrays of double precision and height is
an array of single precision (real). So, if anything, this could be
related to array processing.

Thanks,

-- Miguel Ramos



Re: pg_restore out of memory

From
Miguel Ramos
Date:

A Qua, 13-07-2016 às 18:42 -0400, Tom Lane escreveu:
> I wrote:
> > I'm still suspicious that this might be some sort of NOTICE-
> > processing-
> > related buffer bloat.  Could you try loading the data with the
> > server's
> > log_min_messages level cranked down to NOTICE, so you can see from
> > the
> > postmaster log whether any NOTICEs are being issued to the
> > pg_restore
> > session?
>
> BTW, I experimented with that theory by creating a table with a
> BEFORE
> INSERT trigger function that emits a NOTICE, and then making
> pg_restore
> restore a lot of data into it.  I could not see any memory growth in
> the pg_restore process.  However, I was testing 9.1.22, not 9.1.8.
> Also, some of the misbehaviors we've discovered along these lines
> have
> been timing-sensitive, meaning that the problem might or might not
> reproduce for another person even with the same software version.
> Are you running pg_restore locally on the same machine as the server,
> or across a network --- and if the latter, how fast is the network?
>
>             regards, tom lane
>

I was running pg_restore locally.
The disk containing the backup, however, is on NAS.
The NAS is mounted on the server using SMB and the FreeBSD kernel
implementation of smbfs (mount_smbfs -I ... /mnt).
The kernel smbfs is notoriously outdated and sometimes we get timeouts.

However, those timeouts happen randomly and this "out of memory"
happens consistently.
This time, the server was no longer under heavy load, the log lines are
consecutive, and there was no other activity between the start of the
COPY statement and the error.

The network is 1Gbps with a single unmanaged 24-port switch.
The server has two aggregated links to the switch.


I ran pg_restore locally because the server is in another office, connected to mine through a VPN.

Now I have arranged for a PC to be there for me, and my next test will be to do the restore using the latest pgAdmin.


Thanks,

-- Miguel Ramos


Re: pg_restore out of memory

From
Miguel Ramos
Date:
Yes.
Both 9.1.8, I checked right now.

-- Miguel

A Qua, 13-07-2016 às 13:59 -0700, John R Pierce escreveu:
> On 7/13/2016 1:51 PM, Miguel Ramos wrote:
> > Finally, here are the log messages at the moment of the error.
> > It is clearly not while building indices.
> >
> > The table in question is a big one, 111GB.
> > Fields latitude, longitude and height are arrays of length around
> > 500-
> > 700 on each row (double and real).
> >
> > So, what does this mean?
> > Was it the client that aborted? I think I saw that "unexpected
> > message
> > type 0x58" on other types of interruptions.
>
> is pg_restore, and the postgres server all the same version?
>
> $ pg_restore --version
> pg_restore (PostgreSQL) 9.3.13
>
> $ su - postgres
> -bash-4.1$ psql -c "select version()"
> version
> -------------------------------------------------------------------
> ---------------------------------------------
>   PostgreSQL 9.3.13 on x86_64-unknown-linux-gnu, compiled by gcc
> (GCC) 
> 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
> (1 row)
>
>
> -- 
> john r pierce, recycling bits in santa cruz
>
>
>


Re: pg_restore out of memory

From
Miguel Ramos
Date:
A Qui, 14-07-2016 às 10:52 +0100, Miguel Ramos escreveu:
>
> A Qua, 13-07-2016 às 18:42 -0400, Tom Lane escreveu:
> >
> > I wrote:
> > >
> > > I'm still suspicious that this might be some sort of NOTICE-
> > > processing-
> > > related buffer bloat.  Could you try loading the data with the
> > > server's
> > > log_min_messages level cranked down to NOTICE, so you can see
> > > from
> > > the
> > > postmaster log whether any NOTICEs are being issued to the
> > > pg_restore
> > > session?
> >
> > BTW, I experimented with that theory by creating a table with a
> > BEFORE
> > INSERT trigger function that emits a NOTICE, and then making
> > pg_restore
> > restore a lot of data into it.  I could not see any memory growth
> > in
> > the pg_restore process.  However, I was testing 9.1.22, not 9.1.8.
> > Also, some of the misbehaviors we've discovered along these lines
> > have
> > been timing-sensitive, meaning that the problem might or might not
> > reproduce for another person even with the same software version.
> > Are you running pg_restore locally on the same machine as the
> > server,
> > or across a network --- and if the latter, how fast is the network?
> >
> >             regards, tom lane
> >
>
> I was running pg_restore locally.
> The disk containing the backup, however, is on NAS.
> The NAS is mounted on the server using SMB and the FreeBSD kernel
> implementation of smbfs (mount_smbfs -I ... /mnt).
> The kernel smbfs is notoriously outdated and sometimes we get
> timeouts.
>
> However, those timeouts happen randomly and this "out of memory"
> happens consistently.
> This time, the server was no longer under heavy load, the log lines
> are
> consecutive, there was no activity during the start of the COPY
> statement and the error.
>
> The network is 1Gbps with a single unmanaged 24-port switch.
> The server has two aggregated links to the switch.
>
>
> I ran pg_restore locally because the server is in another office,
> connected to mine through a VPN.
>
> Now I have arranged for a PC to be there for me and my next test will
> be to do the restore using the latest pgadmin.
>
>
> Thanks,
>
> -- Miguel Ramos
>

I tried the restore using pgAdmin III 1.22.1.
This time from a Windows PC connected to the server through a 1Gbps
switch.

Unfortunately the result was the same, and this was my best bet.


I see (transcribed by hand from screenshot):
...
pg_restore: processing data for table "inspection.positioned_scan"
out of memory

Process returned exit code 1.


I hadn't yet set log_min_messages to 'notice'. But since
client_min_messages is at 'notice', shouldn't these be displayed by a
verbose pg_restore?
Maybe during the weekend I can have more verbose logging.

Now I'm repeating the backup (maybe the file is bad) and then I will
repeat the restore with log_min_messages to 'notice'.

I suppose setting log_statement to 'all' is no longer necessary?

What else?


-- 
Miguel Ramos



Re: pg_restore out of memory

From
Tom Lane
Date:
Miguel Ramos <mail@miguel.ramos.name> writes:
> I see (transcribed by hand from screenshot):
> ...
> pg_restore: processing data for table "inspection.positioned_scan"
> out of memory

> Process returned exit code 1.

Right, so that confirms that the OOM happens while sending data for that
table; but we're still no closer as to why.

> I hadn't yet set log_min_messages to 'notice'. But since
> client_min_messages is at 'notice', shouldn't these be displayed by a
> verbose pg_restore?

The theory I'd been considering was that NOTICE messages were being sent
by the server during the COPY (and not logged in the postmaster log
because log_min_messages wasn't high enough), but for some reason they
were not immediately processed and printed by pg_restore.  In such a case
they'd accumulate in libpq's input buffer.  After enough such messages
you'd eventually get an OOM failure.  Now the big hole in this theory is
that it's unclear why the server would be sending any notices.  But I
can't think of other good ideas.

> Now I'm repeating the backup (maybe the file is bad) and then I will
> repeat the restore with log_min_messages to 'notice'.

OK.

> I suppose log_statement to 'all' is no longer necessary?

I agree; we already know which statement is failing.

            regards, tom lane


Re: pg_restore out of memory

From
Adrian Klaver
Date:
On 07/15/2016 12:37 AM, Miguel Ramos wrote:
>
> A Qui, 14-07-2016 às 10:52 +0100, Miguel Ramos escreveu:
>>
>> A Qua, 13-07-2016 às 18:42 -0400, Tom Lane escreveu:
>>>
>>> I wrote:
>>>>
>>>> I'm still suspicious that this might be some sort of NOTICE-
>>>> processing-
>>>> related buffer bloat.  Could you try loading the data with the
>>>> server's
>>>> log_min_messages level cranked down to NOTICE, so you can see
>>>> from
>>>> the
>>>> postmaster log whether any NOTICEs are being issued to the
>>>> pg_restore
>>>> session?
>>>
>>> BTW, I experimented with that theory by creating a table with a
>>> BEFORE
>>> INSERT trigger function that emits a NOTICE, and then making
>>> pg_restore
>>> restore a lot of data into it.  I could not see any memory growth
>>> in
>>> the pg_restore process.  However, I was testing 9.1.22, not 9.1.8.
>>> Also, some of the misbehaviors we've discovered along these lines
>>> have
>>> been timing-sensitive, meaning that the problem might or might not
>>> reproduce for another person even with the same software version.
>>> Are you running pg_restore locally on the same machine as the
>>> server,
>>> or across a network --- and if the latter, how fast is the network?
>>>
>>>             regards, tom lane
>>>
>>
>> I was running pg_restore locally.
>> The disk containing the backup, however, is on NAS.
>> The NAS is mounted on the server using SMB and the FreeBSD kernel
>> implementation of smbfs (mount_smbfs -I ... /mnt).
>> The kernel smbfs is notoriously outdated and sometimes we get
>> timeouts.
>>
>> However, those timeouts happen randomly and this "out of memory"
>> happens consistently.
>> This time, the server was no longer under heavy load, the log lines
>> are
>> consecutive, there was no activity during the start of the COPY
>> statement and the error.
>>
>> The network is 1Gbps with a single unmanaged 24-port switch.
>> The server has two aggregated links to the switch.
>>
>>
>> I ran pg_restore locally because the server is in another office,
>> connected to mine through a VPN.
>>
>> Now I have arranged for a PC to be there for me and my next test will
>> be to do the restore using the latest pgadmin.
>>
>>
>> Thanks,
>>
>> -- Miguel Ramos
>>
>
> I tried the restore using pgAdmin III 1.22.1.
> This time from a Windows PC connected to the server through a 1Gbps
> switch.
>
> Unfortunately the result was the same, and this was my best bet.
>
>
> I see (transcribed by hand from screenshot):
> ...
> pg_restore: processing data for table "inspection.positioned_scan"
> out of memory
>
> Process returned exit code 1.
>
>
> I hadn't yet set log_min_messages to 'notice'. But since
> client_min_messages is at 'notice', shouldn't these be displayed by a
> verbose pg_restore?
> Maybe during the weekend I can have more verbose logging.
>
> Now I'm repeating the backup (maybe the file is bad) and then I will
> repeat the restore with log_min_messages to 'notice'.
>
> I suppose log_statement to 'all' is no longer necessary?
>
> What else?

The pg_dump file you are restoring from is a custom format.

Do you have room to do something like this?:

1) pg_restore -d some_db -U some_user -t inspection.positioned_scan
/mnt/paysdeloire2013_convertida2.1.dump


>
>
> --
> Miguel Ramos
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: pg_restore out of memory

From
Miguel Ramos
Date:
A Sex, 15-07-2016 às 07:16 -0700, Adrian Klaver escreveu:
> On 07/15/2016 12:37 AM, Miguel Ramos wrote:
> > What else?
>
> The pg_dump file you are restoring from is a custom format.
>
> Do you have room to do something like?:
>
> 1) pg_restore -d some_db -U some_user -t inspection.positioned_scan 
> /mnt/paysdeloire2013_convertida2.1.dump
>
> 

Of course, I came up with that too.
I'll get back soon.

If that fails, then it really starts looking like a bug.

Thanks,

--
Miguel Ramos



Re: pg_restore out of memory

From
Miguel Ramos
Date:
Greetings,

To all those who helped out with this problem, I'm sorry it took me so
long to respond. For the record, the matter is solved, at least for us,
but we had to repeat a lot of things to make sure.


First, the "out of memory" problem repeated itself when restoring that
single table, after doing a schema-only restore and removing all
foreign key constraints.

Second, the other weirdness on this table was a couple of constraints
to make sure that the three arrays on each row were of the same length.
But that was not the problem.
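
For the record, array-length constraints of the kind mentioned would look something like this (a sketch; the table and constraint names here are guesses based on the thread):

```sql
-- CHECK that the three per-row arrays have matching lengths.
ALTER TABLE positioned_scan
    ADD CONSTRAINT arrays_same_length CHECK (
        array_length(latitude, 1) = array_length(longitude, 1)
        AND array_length(latitude, 1) = array_length(height, 1)
    );
```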


It turned out that the backup file was the problem.
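
A quick sanity check could have caught this before a long restore attempt. Custom-format pg_dump archives begin with the 5-byte magic "PGDMP", so a hypothetical helper like this detects a file that is not (or is no longer) a valid custom archive:

```shell
# Hypothetical sanity check: pg_dump custom-format archives start with "PGDMP".
# A truncated or corrupted copy usually will not.
looks_like_custom_dump() {
    [ "$(head -c 5 "$1" 2>/dev/null)" = "PGDMP" ]
}

# Usage (path from this thread):
# looks_like_custom_dump /mnt/paysdeloire2013_convertida2.1.dump \
#     && echo "header OK" || echo "not a custom-format dump"
```

For a stronger check, `pg_restore --list file.dump` reads the whole table of contents and fails on many kinds of corruption, without touching any database.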

Plus, I'm convinced that this was our mistake: someone simply missed an
error message during the backup, or during some copy of the file.
We also should have repeated the backup by now, and maybe we didn't.


We did a new custom dump from within the latest pgAdmin III, and that
one restored just fine (using the same old 9.1.8 pg_restore).

It could have been a bug in pg_dump 9.1.8, since the good backup was
made with a newer version. But this didn't seem very likely, and because
repeating a full backup would force us to keep the original database on
the server for a few more days (dumps take so long), we didn't do it.

We did, however, dump that single table with pg_dump 9.1.8, and that
one also restored just fine.



Our immediate practical problem is over.
The only complaint would be that the "out of memory" message is
unfriendly.
If you would find it useful for us to run some additional tests, or to
examine the file, we would be glad to return the help.


Best Regards,

--
Miguel Ramos

