Thread: Re: ERROR: out of memory DETAIL: Failed on request of size ???

Re: ERROR: out of memory DETAIL: Failed on request of size ???

From:
Brian Wong
Date:
I've tried work_mem values from 1GB all the way up to 40GB, with no effect on the error.  I'd like to think of this problem as a server process memory (not the server's buffers) or client process memory issue, primarily because when we tested the error there was no other load whatsoever.  Unfortunately, the error doesn't say what kind of memory ran out.

--- Original Message ---

From: "bricklen" <bricklen@gmail.com>
Sent: November 18, 2013 7:25 PM
To: "Brian Wong" <bwong64@hotmail.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] ERROR: out of memory DETAIL: Failed on request of size ???

On Mon, Nov 18, 2013 at 12:40 PM, Brian Wong <bwong64@hotmail.com> wrote:
We'd like to seek out your expertise on postgresql regarding this error that we're getting in an analytical database.

Some specs:
proc: Intel Xeon X5650 @ 2.67Ghz dual procs 6-core, hyperthreading on.
memory: 48GB
OS: Oracle Enterprise Linux 6.3
postgresql version: 9.1.9
shared_buffers: 18GB

After doing a lot of googling, I've tried setting FETCH_COUNT in psql and/or setting work_mem.  I'm just not able to work around this issue unless I take out all of the MAX() functions but one.

What is your work_mem set to?
Did testing show that shared_buffers set to 18GB was effective? That seems about 2 to 3 times beyond what you probably want.
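
In case it helps, the current values can be checked with plain psql, e.g. (the database name is just a placeholder):

$ psql -d yourdb -c "SHOW work_mem;"
$ psql -d yourdb -c "SHOW shared_buffers;"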

Re: ERROR: out of memory DETAIL: Failed on request of size ???

From:
Igor Neyman
Date:

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Brian Wong
Sent: Monday, November 18, 2013 11:30 PM
To: bricklen
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] ERROR: out of memory DETAIL: Failed on request of size ???

I've tried work_mem values from 1GB all the way up to 40GB, with no effect on the error.  I'd like to think of this
problem as a server process memory (not the server's buffers) or client process memory issue, primarily because when we
tested the error there was no other load whatsoever.  Unfortunately, the error doesn't say what kind of memory ran out.
 

-----------------------------------------------------------------

You are testing with work_mem set to between 1GB and 40GB.
You were asked to lower this setting. 
Even 1GB is too much, try something like work_mem=64MB.
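
For example, a lower value can be tested for a single session without touching postgresql.conf; the database, table and column names below are only placeholders:

$ psql -d yourdb -c "SET work_mem = '64MB'; SELECT max(some_col) FROM some_table GROUP BY other_col;"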

Regards,
Igor Neyman



Re: ERROR: out of memory DETAIL: Failed on request of size ???

From:
Edson Richter
Date:
On 19/11/2013 02:30, Brian Wong wrote:
I've tried any work_mem value from 1gb all the way up to 40gb, with no effect on the error.  I'd like to think of this problem as a server process memory (not the server's buffers) or client process memory issue, primarily because when we tested the error there was no other load whatsoever.  Unfortunately,  the error doesn't say what kinda memory ran out.

--- Original Message ---

From: "bricklen" <bricklen@gmail.com>
Sent: November 18, 2013 7:25 PM
To: "Brian Wong" <bwong64@hotmail.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] ERROR: out of memory DETAIL: Failed on request of size ???

On Mon, Nov 18, 2013 at 12:40 PM, Brian Wong <bwong64@hotmail.com> wrote:
We'd like to seek out your expertise on postgresql regarding this error that we're getting in an analytical database.

Some specs:
proc: Intel Xeon X5650 @ 2.67Ghz dual procs 6-core, hyperthreading on.
memory: 48GB
OS: Oracle Enterprise Linux 6.3
postgresql version: 9.1.9
shared_buffers: 18GB

After doing a lot of googling, I've tried setting FETCH_COUNT on psql AND/OR setting work_mem.  I'm just not able to work around this issue, unless if I take most of the MAX() functions out but just one.

Excuse me (or just ignore me) if it is a stupid question, but have you configured sysctl.conf accordingly?
For instance, to use larger memory settings, I had to configure my EL as follows:

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296
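
If it helps, after editing /etc/sysctl.conf the new values can be loaded and verified without a reboot, roughly like this (run as root):

$ sysctl -p                        # reload settings from /etc/sysctl.conf
$ cat /proc/sys/kernel/shmmax      # confirm the values the kernel is actually using
$ cat /proc/sys/kernel/shmall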


Regards,

Edson



What is your work_mem set to?
Did testing show that shared_buffers set to 18GB was effective? That seems about 2 to 3 times beyond what you probably want.

Re: ERROR: out of memory DETAIL: Failed on request of size ???

From:
"Tomas Vondra"
Date:
On 19 November 2013, 5:30, Brian Wong wrote:
> I've tried any work_mem value from 1gb all the way up to 40gb, with no
> effect on the error.  I'd like to think of this problem as a server
> process memory (not the server's buffers) or client process memory issue,
> primarily because when we tested the error there was no other load
> whatsoever.  Unfortunately,  the error doesn't say what kinda memory ran
> out.

Hi Brian,

first of all, please don't top-post, especially if the previous response
used bottom-post. Thank you ;-)

Regarding the issue you're seeing:

* Increasing work_mem in the hope that it will make the issue go away is
pointless. If work_mem is too low, PostgreSQL will automatically spill
the data to disk (e.g. it won't do a sort in memory, but will do an
on-disk merge sort). It will never fail because of that, and the message
"failed on request of size" actually comes from malloc, when requesting
another chunk of memory from the OS. So you're hitting an OS-level memory
limit.

Note: AFAIK the only operation that does not spill to disk, and may fail
with OOM-like errors, is hash aggregate. But by increasing work_mem you're
actually encouraging PostgreSQL to make this planning error.

I see the query you're running is doing MAX() so it might be hitting this
issue. How much data are you dealing with? How many groups are in the
result?

* Setting shared_buffers to 18GB is almost certainly a bad choice. I have
yet to see a real-world database where shared_buffers over 8GB actually
makes a measurable difference. More is not always better, and you're
actually reserving memory that can't be used for work_mem (so you're
making the issue worse). Please decrease shared_buffers to e.g. 4GB, then
try to increase it and measure the performance difference.

* So how much memory does the query actually allocate? Can you watch it
via top/free to get an idea whether it allocates all available memory, or
allocates only 1GB and then fails, or something else?

* I believe you're hitting some sort of limit imposed by the operating
system. Please check the ulimit and overcommit settings (a quick way to
check them is sketched after this list).

* BTW the SO post you mentioned as a perfect match was talking about query
executed over dblink - are you doing the same? If so, try to move the
aggregation into the query (so that the aggregation happens on the other
end).
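
For reference, checking those OS-level limits can be as simple as this
(standard Linux commands, nothing PostgreSQL-specific; run them as the
user that starts postgres):

$ ulimit -v                            # per-process virtual memory limit ("unlimited" is typical)
$ cat /proc/sys/vm/overcommit_memory   # 2 = strict accounting, malloc() can fail even with free RAM
$ cat /proc/sys/vm/overcommit_ratio    # only consulted when overcommit_memory = 2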

regards
Tomas

>
> --- Original Message ---
>
> From: "bricklen" <bricklen@gmail.com>
> Sent: November 18, 2013 7:25 PM
> To: "Brian Wong" <bwong64@hotmail.com>
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] ERROR: out of memory DETAIL: Failed on request of
> size ???
>
> On Mon, Nov 18, 2013 at 12:40 PM, Brian Wong <bwong64@hotmail.com> wrote:
>
>> We'd like to seek out your expertise on postgresql regarding this error
>> that we're getting in an analytical database.
>>
>> Some specs:
>> proc: Intel Xeon X5650 @ 2.67Ghz dual procs 6-core, hyperthreading on.
>> memory: 48GB
>> OS: Oracle Enterprise Linux 6.3
>> postgresql version: 9.1.9
>> shared_buffers: 18GB
>>
>> After doing a lot of googling, I've tried setting FETCH_COUNT on psql
>> AND/OR setting work_mem.  I'm just not able to work around this issue,
>> unless if I take most of the MAX() functions out but just one.
>>
>
> What is your work_mem set to?
> Did testing show that shared_buffers set to 18GB was effective? That seems
> about 2 to 3 times beyond what you probably want.
>




Re: ERROR: out of memory DETAIL: Failed on request of size ???

From:
"Tomas Vondra"
Date:
Hi,

On 22 November 2013, 20:09, Edson Richter wrote:
>
> Excuse me (or just ignore me) if it is a stupid question, but have you
> configured sysctl.conf accordingly?
> For instance, to use larget memory settings, I had to configure my EL as
> follows:
>
> # Controls the maximum shared segment size, in bytes
> kernel.shmmax = 68719476736
>
> # Controls the maximum number of shared memory segments, in pages
> kernel.shmall = 4294967296

This only influences shared memory, which is used for shared buffers (not
for memory allocated during query execution, as that's translated to plain
malloc). With sysctl.conf misconfigured, the database would not start at
all and you'd get a reasonable error message in the log, suggesting pretty
clearly where/how to fix it.

Tomas



Re: ERROR: out of memory DETAIL: Failed on request of size ???

From:
"Christofer C. Bell"
Date:
On Fri, Nov 22, 2013 at 1:09 PM, Edson Richter <edsonrichter@hotmail.com> wrote:
On 19/11/2013 02:30, Brian Wong wrote:
I've tried any work_mem value from 1gb all the way up to 40gb, with no effect on the error.  I'd like to think of this problem as a server process memory (not the server's buffers) or client process memory issue, primarily because when we tested the error there was no other load whatsoever.  Unfortunately,  the error doesn't say what kinda memory ran out.

--- Original Message ---

From: "bricklen" <bricklen@gmail.com>
Sent: November 18, 2013 7:25 PM
To: "Brian Wong" <bwong64@hotmail.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] ERROR: out of memory DETAIL: Failed on request of size ???

On Mon, Nov 18, 2013 at 12:40 PM, Brian Wong <bwong64@hotmail.com> wrote:
We'd like to seek out your expertise on postgresql regarding this error that we're getting in an analytical database.

Some specs:
proc: Intel Xeon X5650 @ 2.67Ghz dual procs 6-core, hyperthreading on.
memory: 48GB
OS: Oracle Enterprise Linux 6.3
postgresql version: 9.1.9
shared_buffers: 18GB

After doing a lot of googling, I've tried setting FETCH_COUNT on psql AND/OR setting work_mem.  I'm just not able to work around this issue, unless if I take most of the MAX() functions out but just one.

Excuse me (or just ignore me) if it is a stupid question, but have you configured sysctl.conf accordingly?
For instance, to use larger memory settings, I had to configure my EL as follows:

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

On most systems:
kernel.shmmax = 68719476736 (64 GB) implies a kernel.shmall = 16777216 (4k pages)

While:
kernel.shmall = 4294967296 (4k pages) implies a kernel.shmmax = 17592186044416 (16 TB)

kernel.shmmax is the amount of shared memory you want to reserve, in bytes.  kernel.shmall is the same amount expressed in pages (normally 4096 bytes each unless you are using big pages or huge pages; the rest of this assumes 4096, adjust accordingly for your setup).  So generally, kernel.shmall will be kernel.shmmax divided by 4096 (and likewise, kernel.shmmax will be kernel.shmall multiplied by 4096).

To find out your page size, use this command:
$ getconf PAGE_SIZE

To find out the maximum physical pages available in the system, use this command:
$ getconf _PHYS_PAGES
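
For example, to derive kernel.shmall from a desired kernel.shmmax using the system's own page size (plain shell arithmetic; 8589934592 bytes is the 8 GB value used below):

$ echo $(( 8589934592 / $(getconf PAGE_SIZE) ))    # prints 2097152 with 4096-byte pages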

To use the 8G recommended by Tomáš Vondra earlier, and assuming 4kb pages, your settings should be:

kernel.shmmax = 8589934592
kernel.shmall = 2097152

If the database does not start (assuming you've set shared_buffers to exactly 8GB in postgresql.conf), it will give you an error message with a corrected (slightly higher) value.  Use that value for kernel.shmmax, and use that number divided by your page size (again, generally 4096) for kernel.shmall.

For example, for my small database, I am using 256 MB of shared buffers.  So I initially try to use these settings:

kernel.shmmax = 268435456
kernel.shmall = 65536

However, my database does not start.  The kernel.shmmax suggested by PostgreSQL is 298156032 (~284 MB).  So I replace the above with these values:

kernel.shmmax = 288940032
kernel.shmall = 70542
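
As a sanity check after reloading the settings, ipcs will print the shared memory limits the kernel is actually enforcing:

$ ipcs -lm    # the reported max seg size (in kbytes) should correspond to kernel.shmmax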

I'm open to correction where I've misspoken and I hope this is helpful to you.

Good luck!
 
--
Chris

"If you wish to make an apple pie from scratch, you must first invent the Universe." -- Carl Sagan


Re: ERROR: out of memory DETAIL: Failed on request of size ???

From:
"Tomas Vondra"
Date:
On 27 November 2013, 22:39, Brian Wong wrote:
>> Date: Fri, 22 Nov 2013 20:11:47 +0100
>> Subject: Re: [GENERAL] ERROR: out of memory DETAIL: Failed on request of
>> size ???
>> From: tv@fuzzy.cz
>> To: bwong64@hotmail.com
>> CC: bricklen@gmail.com; pgsql-general@postgresql.org
>>
>> On 19 November 2013, 5:30, Brian Wong wrote:
>> > I've tried any work_mem value from 1gb all the way up to 40gb, with no
>> > effect on the error.  I'd like to think of this problem as a server
>> > process memory (not the server's buffers) or client process memory
>> issue,
>> > primarily because when we tested the error there was no other load
>> > whatsoever.  Unfortunately,  the error doesn't say what kinda memory
>> ran
>> > out.
>>
>> Hi Brian,
>>
>> first of all, please don't top-post, especially if the previous response
>> used bottom-post. Thank you ;-)
>>
>> Regarding the issue you're seeing:
>>
>> * Increasing work_mem in hope that it will make the issue go away is
>> pointless. In case work_mem is too low, PostgreSQL will automatically
>> spill the data to disk (e.g. it won't do a sort in memory, but will do a
>> on-disk merge sort). It will never fail, and messages "failed on request
>> of size" is actually coming from malloc, when requesting another chunk
>> of
>> memory from the OS. So you're hitting a OS-level memory limit.
>
> After changing the shared_buffers setting to 200MB, the developer has
> confirmed that the Out Of Memory error no longer happens.  So thanks
> folks.
>
> Playing with work_mem was out of desperation.  Postgresql simply giving
> the "Out of memory" error wasn't informative enough about the problem.
> For example, is it the server buffer, the server process, or the client
> process that's having a problem?

PostgreSQL is unable to give you more detailed information about the
cause because it simply does not have it. You're hitting some limit set at
the kernel level, so PostgreSQL calls malloc() and the kernel responds
with NULL. What more would you expect from PostgreSQL in that case than
the simple statement 'we tried to allocate X bytes and it failed'?

Also, this kind of error tends to be 'random', i.e. the allocation error
may happen at a different place every time you run the query, depending on
how many backends are running etc.

The fact that decreasing shared_buffers to 200 MB made the problem go away
for now only suggests this is a problem with some kernel limit (ulimit,
overcommit, ...). It also means you haven't solved it, and it will likely
happen again in the future, e.g. if the amount of data grows and you hit
the limit again. Find and fix the actual issue.

Have you checked the ulimit / overcommit as I asked you to?

>> Note: AFAIK the only operation that does not spill to disk, and may fail
>> with OOM-like errors is hash aggregate. But by increasing the work_mem
>> you're actually encouraging PostgreSQL to do this planning error.
>>
>> I see the query you're running is doing MAX() so it might be hitting
>> this
>> issue. How much data are you dealing with? How many groups are in the
>> result?
>>
>> * Setting shared buffers to 18GB is almost certainly a bad choice. I'm
>> yet
>> to see a real-world database where shared_buffers over 8GB actually make
>> a
>> measurable difference. More is not always better, and you're actually
>> reserving memory that can't be used for work_mem (so you're making the
>> issue worse). Please, decrease shared_buffers to e.g. 4GB, then try to
>> increase it and measure the performance difference.
>>
> If you look at the documentation about how to configure shared_buffers, it
> is very unclear to me how I can maximize performance by allocating as much
> memory to the buffer as possible.  On one hand, the documentation says I
> shouldn't go high on the shared_buffers setting.  On the other hand, the
> more memory you allocate to the buffers, the better the performance is
> supposedly.  So at least as of 9.1, this is annoying.  I heard that
> starting from 9.2, this behavior changed dramatically?

I'm not sure I understand what you're trying to say. It's true the
documentation does not give perfectly clear instructions on how to set
optimal shared_buffers. The reason is very simple - there's no such value,
optimal for all workloads, operating systems and PostgreSQL versions. The
assumption that more is always better is incorrect, for several reasons.

There's a more detailed wiki page about this:

    https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

The 200MB you've set seems way too low in your case, and aside from making
it slower than necessary, you most probably haven't fixed the actual
issue.
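
Purely as an illustration (the right values depend on your workload and
have to be measured, per the wiki page above), a more conventional
starting point on a 48GB box might look like this in postgresql.conf:

shared_buffers = 4GB    # the 4GB starting point suggested earlier; measure before going higher
work_mem = 64MB         # per-sort/per-hash limit; note it is multiplied across concurrent operations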

regards
Tomas



Re: ERROR: out of memory DETAIL: Failed on request of size ???

From:
Brian Wong
Date:
> Date: Fri, 22 Nov 2013 20:11:47 +0100
> Subject: Re: [GENERAL] ERROR: out of memory DETAIL: Failed on request of size ???
> From: tv@fuzzy.cz
> To: bwong64@hotmail.com
> CC: bricklen@gmail.com; pgsql-general@postgresql.org
>
> On 19 November 2013, 5:30, Brian Wong wrote:
> > I've tried any work_mem value from 1gb all the way up to 40gb, with no
> > effect on the error. I'd like to think of this problem as a server
> > process memory (not the server's buffers) or client process memory issue,
> > primarily because when we tested the error there was no other load
> > whatsoever. Unfortunately, the error doesn't say what kinda memory ran
> > out.
>
> Hi Brian,
>
> first of all, please don't top-post, especially if the previous response
> used bottom-post. Thank you ;-)
>
> Regarding the issue you're seeing:
>
> * Increasing work_mem in hope that it will make the issue go away is
> pointless. In case work_mem is too low, PostgreSQL will automatically
> spill the data to disk (e.g. it won't do a sort in memory, but will do a
> on-disk merge sort). It will never fail, and messages "failed on request
> of size" is actually coming from malloc, when requesting another chunk of
> memory from the OS. So you're hitting a OS-level memory limit.

After changing the shared_buffers setting to 200MB, the developer has confirmed that the Out Of Memory error no longer happens.  So thanks folks.

Playing with work_mem was out of desperation.  PostgreSQL simply giving the "Out of memory" error wasn't informative enough about the problem.  For example, is it the server buffer, the server process, or the client process that's having a problem?

> Note: AFAIK the only operation that does not spill to disk, and may fail
> with OOM-like errors is hash aggregate. But by increasing the work_mem
> you're actually encouraging PostgreSQL to do this planning error.
>
> I see the query you're running is doing MAX() so it might be hitting this
> issue. How much data are you dealing with? How many groups are in the
> result?
>
> * Setting shared buffers to 18GB is almost certainly a bad choice. I'm yet
> to see a real-world database where shared_buffers over 8GB actually make a
> measurable difference. More is not always better, and you're actually
> reserving memory that can't be used for work_mem (so you're making the
> issue worse). Please, decrease shared_buffers to e.g. 4GB, then try to
> increase it and measure the performance difference.

If you look at the documentation about how to configure shared_buffers, it is very unclear to me how I can maximize performance by allocating as much memory to the buffers as possible.  On one hand, the documentation says I shouldn't go high on the shared_buffers setting.  On the other hand, the more memory you allocate to the buffers, the better the performance is supposed to be.  So at least as of 9.1, this is annoying.  I heard that starting from 9.2, this behavior changed dramatically?

> * So how much memory does the query allocate? Can you watch it over
> top/free to get an idea if it e.g. allocates all available memory, or if
> it allocates only 1GB and then fail, or something?
>
> * I believe you're hitting some sort of limit, imposed by the operating
> system. Please check ulimit and overcommit settings.
>
> * BTW the SO post you mentioned as a perfect match was talking about query
> executed over dblink - are you doing the same? If so, try to move the
> aggregation into the query (so that the aggregation happens on the other
> end).

Nope, we're not using dblink in this case.

>
> regards
> Tomas
>
> >
> > --- Original Message ---
> >
> > From: "bricklen" <bricklen@gmail.com>
> > Sent: November 18, 2013 7:25 PM
> > To: "Brian Wong" <bwong64@hotmail.com>
> > Cc: pgsql-general@postgresql.org
> > Subject: Re: [GENERAL] ERROR: out of memory DETAIL: Failed on request of
> > size ???
> >
> > On Mon, Nov 18, 2013 at 12:40 PM, Brian Wong <bwong64@hotmail.com> wrote:
> >
> >> We'd like to seek out your expertise on postgresql regarding this error
> >> that we're getting in an analytical database.
> >>
> >> Some specs:
> >> proc: Intel Xeon X5650 @ 2.67Ghz dual procs 6-core, hyperthreading on.
> >> memory: 48GB
> >> OS: Oracle Enterprise Linux 6.3
> >> postgresql version: 9.1.9
> >> shared_buffers: 18GB
> >>
> >> After doing a lot of googling, I've tried setting FETCH_COUNT on psql
> >> AND/OR setting work_mem. I'm just not able to work around this issue,
> >> unless if I take most of the MAX() functions out but just one.
> >>
> >
> > What is your work_mem set to?
> > Did testing show that shared_buffers set to 18GB was effective? That seems
> > about 2 to 3 times beyond what you probably want.
> >
>
>