Discussion: OOM-killer issue with a specific query


OOM-killer issue with a specific query

From:
nabble.30.miller_2555@spamgourmet.com
Date:
Hi - I'm running into an OOM-killer issue when running a specific query (no virtual machine running) and, based on researching the issue, I can probably fix it by making the following sysctl adjustments:
    vm.overcommit_memory = 2
    vm.overcommit_ratio = 0
However, I am perplexed as to why I am running into the issue in the first place. The machine (running Linux 2.6.34.7-61.fc13.x86_64) is dedicated to Postgres (v9.0.0 [RPM package: postgresql90-9.0.0-1PGDG.fc13.1.x86_64]) and the following memory usage is pretty typical for the system (via "top"):
    Mem:   8121992k total,  2901960k used,  5220032k free,   237408k buffers
    Swap:  1048572k total,   235940k used,   812632k free,  2053768k cached
Under steady-state conditions, the following shows the virtual memory size for postgres backend processes:
     PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    8506 postgres  20   0 2327m 3084 1792 S  0.0  0.0   0:00.33 postgres
    8504 postgres  20   0 2326m  14m  13m S  0.0  0.2   0:01.32 postgres
    8505 postgres  20   0 2326m  728  452 S  0.0  0.0   0:00.91 postgres
    3582 postgres  20   0 2325m  54m  53m S  0.0  0.7   0:02.03 postgres
My current relevant postgresql.conf settings are the following:
    shared_buffers = 2100MB
    temp_buffers = 8MB
    work_mem = 32MB
    maintenance_work_mem = 16MB
    max_stack_depth = 2MB
    constraint_exclusion = partition
When executing the query, I've been watching the "top" activity, sorted by resident memory. Upon execution, no other processes appear to take additional resident memory, except the postgres backend process servicing the query, which grows to over 6 GB (triggering the OOM-killer). Given the settings in postgresql.conf, and my anecdotal understanding of Postgres memory management, I am uncertain why Postgres exhausts physical memory instead of spilling to temporary files. Do I need to lower my work_mem setting since the subquery involves a partitioned table, causing a multiplier effect on the memory used (I have tried a per-connection setting of 6MB)? Would tweaking query planning settings help?

Thanks in advance!

If it helps, I have included the query (with column names aliased to their data types), a brief description of the applicable tables' contents, and an abridged copy of the EXPLAIN ANALYZE output.

SELECT "bigint", "date", "text"
FROM tableA AS A
WHERE A."boolean" = 'true' AND
(A."text" = 'abc' OR A."text" = 'xyz') AND
A."bigint" NOT IN (SELECT "bigint" FROM tableB)
ORDER BY A."date" DESC;

tableA:
    - total table contains ~11 million records (total width: 109 bytes)
    - partitioned by month (180 partitions)
    - each table partition contains ~100k records
tableB:
    - total table contains ~400k records (total width: 279 bytes)
    - partitioned by month (96 partitions)
    - each table partition contains ~30k records


EXPLAIN ANALYZE output:
    Note: could not produce output for the exact query due to the OOM-killer, but ran the query with the subquery limited to the first 50 results. The planner iterates over all partitions, but only the first few partitions are shown for brevity.

                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=451279.67..451279.70 rows=10 width=55) (actual time=18343.085..18343.090 rows=10 loops=1)
   ->  Sort  (cost=451279.67..456398.37 rows=2047480 width=55) (actual time=18343.083..18343.087 rows=10 loops=1)
         Sort Key: A."Date"
         Sort Method:  top-N heapsort  Memory: 26kB
         ->  Result  (cost=1.21..407034.37 rows=2047480 width=55) (actual time=0.793..17014.726 rows=4160606 loops=1)
               ->  Append  (cost=1.21..407034.37 rows=2047480 width=55) (actual time=0.792..16119.298 rows=4160606 loops=1)
                     ->  Seq Scan on tableA A  (cost=1.21..19.08 rows=1 width=44) (actual time=0.002..0.002 rows=0 loops=1)
                           Filter: ("boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
                           SubPlan 1
                             ->  Limit  (cost=0.00..1.08 rows=50 width=8) (actual time=0.010..0.054 rows=50 loops=210)
                                   ->  Result  (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.009..0.044 rows=50 loops=210)
                                         ->  Append  (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.008..0.031 rows=50 loops=210)
                                               ->  Seq Scan on tableB  (cost=0.00..15.30 rows=530 width=8) (actual time=0.001..0.001 rows=0 loops=210)
                                               ->  Seq Scan on tableB_201201 tableB  (cost=0.00..15.30 rows=530 width=8) (actual time=0.000..0.000 rows=0 loops=210)
                                               ->  Seq Scan on tableB_201112 tableB  (cost=0.00..251.25 rows=12125 width=8) (actual time=0.006..0.019 rows=50 loops=210)
                                               ->  ...
                     ->  Seq Scan on tableA_201201 A  (cost=1.21..19.08 rows=1 width=44) (actual time=0.001..0.001 rows=0 loops=1)
                           Filter: ("boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
                           SubPlan 1
                             ->  Limit  (cost=0.00..1.08 rows=50 width=8) (actual time=0.010..0.054 rows=50 loops=210)
                                   ->  Result  (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.009..0.044 rows=50 loops=210)
                                         ->  Append  (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.008..0.031 rows=50 loops=210)
                                               ->  Seq Scan on tableB  (cost=0.00..15.30 rows=530 width=8) (actual time=0.001..0.001 rows=0 loops=210)
                                               ->  Seq Scan on tableB_201201 tableB  (cost=0.00..15.30 rows=530 width=8) (actual time=0.000..0.000 rows=0 loops=210)
                                               ->  Seq Scan on tableB_201112 tableB  (cost=0.00..251.25 rows=12125 width=8) (actual time=0.006..0.019 rows=50 loops=210)
                                               ->  ...
                     ->  Seq Scan on tableA_201112 A  (cost=1.21..794.69 rows=5980 width=55) (actual time=0.789..12.686 rows=12075 loops=1)
                           Filter: ("boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
                           SubPlan 1
                             ->  Limit  (cost=0.00..1.08 rows=50 width=8) (actual time=0.010..0.054 rows=50 loops=210)
                                   ->  Result  (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.009..0.044 rows=50 loops=210)
                                         ->  Append  (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.008..0.031 rows=50 loops=210)
                                               ->  Seq Scan on tableB  (cost=0.00..15.30 rows=530 width=8) (actual time=0.001..0.001 rows=0 loops=210)
                                               ->  Seq Scan on tableB_201201 tableB  (cost=0.00..15.30 rows=530 width=8) (actual time=0.000..0.000 rows=0 loops=210)
                                               ->  Seq Scan on tableB_201112 tableB  (cost=0.00..251.25 rows=12125 width=8) (actual time=0.006..0.019 rows=50 loops=210)
                                               ->  Seq Scan on tableB_201111 tableB  (cost=0.00..604.89 rows=29189 width=8) (never executed)
                                               ->  ...
                     ->  Seq Scan on tableA_201111 A  (cost=1.21..2666.12 rows=14670 width=55) (actual time=0.441..36.680 rows=29189 loops=1)
                           Filter: ("boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
                           SubPlan 1
                             ->  Limit  (cost=0.00..1.08 rows=50 width=8) (actual time=0.010..0.054 rows=50 loops=210)
                                   ->  Result  (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.009..0.044 rows=50 loops=210)
                                         ->  Append  (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.008..0.031 rows=50 loops=210)
                                               ->  Seq Scan on tableB  (cost=0.00..15.30 rows=530 width=8) (actual time=0.001..0.001 rows=0 loops=210)
                                               ->  Seq Scan on tableB_201201 tableB  (cost=0.00..15.30 rows=530 width=8) (actual time=0.000..0.000 rows=0 loops=210)
                                               ->  Seq Scan on tableB_201112 tableB  (cost=0.00..251.25 rows=12125 width=8) (actual time=0.006..0.019 rows=50 loops=210)
                                               ->  Seq Scan on tableB_201111 tableB  (cost=0.00..604.89 rows=29189 width=8) (never executed)
                                               ->  ...
                     ->  ...
 Total runtime: 18359.851 ms
(23327 rows)


Re: OOM-killer issue with a specific query

From:
Scott Marlowe
Date:
On Mon, Dec 19, 2011 at 8:52 AM,  <nabble.30.miller_2555@spamgourmet.com> wrote:
> Under steady-state conditions, the following shows the virtual memory size
> for postgres backend processes:
>      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>     8506 postgres  20   0 2327m 3084 1792 S  0.0  0.0   0:00.33 postgres
>     8504 postgres  20   0 2326m  14m  13m S  0.0  0.2   0:01.32 postgres
>     8505 postgres  20   0 2326m  728  452 S  0.0  0.0   0:00.91 postgres
>     3582 postgres  20   0 2325m  54m  53m S  0.0  0.7   0:02.03 postgres

FYI, this is not swap usage.  VIRT is all the memory a process has
handles open to, including libraries it's linked against that might
not even be loaded.  Generally speaking, VIRT is close to worthless
for troubleshooting.

> My current relevant postgresql.conf settings are the following:
>     shared_buffers = 2100MB
>     temp_buffers = 8MB
>     work_mem = 32MB
>     maintenance_work_mem = 16MB
>     max_stack_depth = 2MB
>     constraint_exclusion = partition

What's max_connections?

> When executing the query, I've been watching the "top" activity, sorted by
> resident memory. Upon execution, no other processes appear to take
> additional resident memory, except a postgres backend process servicing the
> query, which goes to +6Gb (triggering the OOM-killer). Given the settings in
> postgresql.conf, and my anecdotal understanding of Postgres memory
> management functions, I am uncertain why Postgres exhausts physical memory
> instead of swapping to temporary files.

> EXPLAIN ANALYZE output:
>     Note: could not produce output for exact query due to OOM-killer, but
> ran query by limiting the subquery to the first 50 results. The planner
> iterates over all partitions, but only the first two partitions are noted
> for brevity.

This may be one instance where plain EXPLAIN will be more useful.
It's quite likely that the plan changes when there is no limit.  If
you compare what EXPLAIN for the full query says with what EXPLAIN
(ANALYZE) for the abridged one says, the part that's causing you to
run out of memory may be more obvious.

Re: OOM-killer issue with a specific query (9 of 20)

From:
nabble.30.miller_2555@spamgourmet.com
Date:
On Tue, Dec 20, 2011 at 8:24 AM, Scott Marlowe -
scott.marlowe@gmail.com
<+nabble+miller_2555+3b65e832a3.scott.marlowe#gmail.com@spamgourmet.com>
wrote:
>
> On Mon, Dec 19, 2011 at 8:52 AM,  <nabble.30.miller_2555@spamgourmet.com> wrote:
> > I can probably fix by making the following sysctl adjustments:
> >     vm.overcommit_memory = 2
> >     vm.overcommit_ratio = 0

FYI - for the sake of others visiting this post, disabling OS memory
overcommit does not appear to be an easy solution in my case, as the
box fails to boot due to insufficient memory.
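The boot failure is consistent with how strict overcommit accounting works: with vm.overcommit_memory = 2, the kernel caps total committed memory at swap plus overcommit_ratio percent of RAM. A quick sketch using the "top" figures quoted earlier in this thread:

```python
# With vm.overcommit_memory = 2, the kernel's commit limit is roughly:
#   CommitLimit = swap + (overcommit_ratio / 100) * RAM
# Values below are the "top" figures from the start of this thread.
swap_kb = 1048572
ram_kb = 8121992
overcommit_ratio = 0  # the setting proposed above

commit_limit_kb = swap_kb + (overcommit_ratio / 100) * ram_kb
print(round(commit_limit_kb / 1024))  # ~1024 MB of commit allowed - too little to boot
```

With ratio 0, only swap counts toward the limit, so a ratio of 50 or so (the kernel default) would be needed for the box to come up under strict accounting.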

> > Under steady-state conditions, the following shows the virtual memory size
> > for postgres backend processes:
> >      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >     8506 postgres  20   0 2327m 3084 1792 S  0.0  0.0   0:00.33 postgres
> >     8504 postgres  20   0 2326m  14m  13m S  0.0  0.2   0:01.32 postgres
> >     8505 postgres  20   0 2326m  728  452 S  0.0  0.0   0:00.91 postgres
> >     3582 postgres  20   0 2325m  54m  53m S  0.0  0.7   0:02.03 postgres
>
> FYI, this is not swap usage.  VIRT is all the memory a process has
> handles open to everywhere, including libs that it's linked against
> that might not even be loaded.  Generally speaking, VIRT is close to
> worthless for troubleshooting.
>
Thanks - I misunderstood the meaning of VIRT (it's been a while since
I've consulted the man page).

> > My current relevant postgresql.conf settings are the following:
> >     shared_buffers = 2100MB
> >     temp_buffers = 8MB
> >     work_mem = 32MB
> >     maintenance_work_mem = 16MB
> >     max_stack_depth = 2MB
> >     constraint_exclusion = partition
>
> What's max_connections?
>
max_connections = 20. As a side note, this is a development box and
there were no other active connections to the database while this
test case was run.

> > When executing the query, I've been watching the "top" activity, sorted by
> > resident memory. Upon execution, no other processes appear to take
> > additional resident memory, except a postgres backend process servicing the
> > query, which goes to +6Gb (triggering the OOM-killer). Given the settings in
> > postgresql.conf, and my anecdotal understanding of Postgres memory
> > management functions, I am uncertain why Postgres exhausts physical memory
> > instead of swapping to temporary files.
>
> > EXPLAIN ANALYZE output:
> >     Note: could not produce output for exact query due to OOM-killer, but
> > ran query by limiting the subquery to the first 50 results. The planner
> > iterates over all partitions, but only the first two partitions are noted
> > for brevity.
>
> This may be one instance where the regular explain will be more
> useful.  it's quite likely that the query changes when there is no
> limit.  If you compare what explain for the full query says, and what
> explain (analyze) for the abridged one says, the part that's causing
> you to run out of memory may be more obvious.
>
I've run EXPLAIN on the query, but AFAICS the query plan does not
appear significantly different from that of the abridged version for
this particular query (output attached below). In an effort to
analyze the base case, I re-ran the query (without LIMIT) against a
selected partition of tableA and tableB (both tables are partitioned
by "Date", and the "Date" column on each partition of tableB
references the "Date" column of the corresponding partition of tableA
as a foreign key constraint). The tableA partition holds 82,939
records (record width 108 bytes, per EXPLAIN) and the tableB
partition holds 13,718 records (record width 312 bytes, per EXPLAIN).
For a single table partition, `top` shows the following resource
usage for the running postmaster processes:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1818 postgres  20   0 2325m  56m  55m S  0.0  0.7   0:32.10 postmaster
 2810 postgres  20   0  156m 1044  420 S  0.0  0.0   0:01.50 postmaster
 2813 postgres  20   0 2326m 256m 255m S  0.0  3.2   0:09.61 postmaster
 2814 postgres  20   0 2326m 2220 1592 S  0.0  0.0   0:04.30 postmaster
 2815 postgres  20   0 2327m 3996 2148 S  0.0  0.0   0:00.66 postmaster
 2816 postgres  20   0  156m 1272  504 S  0.0  0.0   0:09.14 postmaster
29661 postgres  20   0 2335m  49m  40m S  0.0  0.6   0:00.24 postmaster

While I could run the query partition-by-partition, I'd still like to
be able to run a full query across all partitions.

EXPLAIN output excerpt:
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=3125664.13..3130782.83 rows=2047480 width=55)
   Sort Key: A."Date"
   ->  Result  (cost=11553.15..2856046.51 rows=2047480 width=55)
         ->  Append  (cost=11553.15..2856046.51 rows=2047480 width=55)
               ->  Seq Scan on tableA A  (cost=11553.15..11571.02 rows=1 width=44)
                     Filter: ("Boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
                     SubPlan 1
                       ->  Result  (cost=0.00..10357.52 rows=478252 width=8)
                             ->  Append  (cost=0.00..10357.52 rows=478252 width=8)
                                   ->  Seq Scan on tableB  (cost=0.00..15.30 rows=530 width=8)
                                   ->  Seq Scan on tableB_201201 tableB  (cost=0.00..15.30 rows=530 width=8)
                                   ->  Seq Scan on tableB_201112 tableB  (cost=0.00..251.25 rows=12125 width=8)
                                   ->  Seq Scan on tableB_201111 tableB  (cost=0.00..604.89 rows=29189 width=8)
                                   ->  Seq Scan on tableB_201110 tableB  (cost=0.00..490.30 rows=23630 width=8)
                                   ->  ...
               ->  Seq Scan on tableA_201201 A  (cost=11553.15..11571.02 rows=1 width=44)
                     Filter: ("Boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
                     SubPlan 1
                       ->  Result  (cost=0.00..10357.52 rows=478252 width=8)
                             ->  Append  (cost=0.00..10357.52 rows=478252 width=8)
                                   ->  Seq Scan on tableB  (cost=0.00..15.30 rows=530 width=8)
                                   ->  Seq Scan on tableB_201201 tableB  (cost=0.00..15.30 rows=530 width=8)
                                   ->  Seq Scan on tableB_201112 tableB  (cost=0.00..251.25 rows=12125 width=8)
                                   ->  Seq Scan on tableB_201111 tableB  (cost=0.00..604.89 rows=29189 width=8)
                                   ->  Seq Scan on tableB_201110 tableB  (cost=0.00..490.30 rows=23630 width=8)
                                   ->  ...
               ->  Seq Scan on tableA_201112 A  (cost=11553.15..12346.63 rows=5980 width=55)
                     Filter: ("Boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
                     SubPlan 1
                       ->  Result  (cost=0.00..10357.52 rows=478252 width=8)
                             ->  Append  (cost=0.00..10357.52 rows=478252 width=8)
                                   ->  Seq Scan on tableB  (cost=0.00..15.30 rows=530 width=8)
                                   ->  Seq Scan on tableB_201201 tableB  (cost=0.00..15.30 rows=530 width=8)
                                   ->  Seq Scan on tableB_201112 tableB  (cost=0.00..251.25 rows=12125 width=8)
                                   ->  Seq Scan on tableB_201111 tableB  (cost=0.00..604.89 rows=29189 width=8)
                                   ->  Seq Scan on tableB_201110 tableB  (cost=0.00..490.30 rows=23630 width=8)
                                   ->  ...
               ->  ...
(23112 rows)


Re: OOM-killer issue with a specific query (9 of 20)

From:
Tom Lane
Date:
nabble.30.miller_2555@spamgourmet.com writes:
> I've run EXPLAIN on the query, but AFAICS the query plan does not
> appear significantly different than the abridged version for this
> particular query (output attached below).

I think what's happening is that you've got the hashed NOT IN being
pushed down separately to each of the 180 child tables, so each of those
hashtables thinks it can use work_mem (32MB), which means you're pushing
6GB of memory usage before accounting for anything else.
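As a rough sanity check (a sketch using the figures from earlier in the thread), that per-partition accounting matches the observed backend size:

```python
# One hashed NOT IN subplan per child table, each allowed up to work_mem.
# Both figures come from the thread; the multiplication is the point,
# not the exact ceiling.
partitions = 180   # tableA is partitioned by month into 180 children
work_mem_mb = 32   # work_mem = 32MB from postgresql.conf

total_mb = partitions * work_mem_mb
print(total_mb)                    # 5760 MB
print(round(total_mb / 1024, 1))   # 5.6 GB, in line with the ~6 GB backend
```

This is also why dropping work_mem to 6MB per connection helps only partially: 180 x 6MB is still over 1 GB for this one construct.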

NOT IN is really hard to optimize because of its weird behavior for
nulls, so the planner doesn't have much of any intelligence about it.
I'd suggest seeing if you can transform it to a NOT EXISTS, if you
don't have any nulls in the bigint columns or don't really want the
spec-mandated behavior for them anyway.  A quick check suggests that 9.0
should give you a vastly better plan from a NOT EXISTS.
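For reference, a NOT EXISTS version of the query from the start of the thread might look like the sketch below. It returns the same rows as the NOT IN form only when tableB."bigint" contains no NULLs, which is exactly the caveat above:

```sql
SELECT "bigint", "date", "text"
FROM tableA AS A
WHERE A."boolean"
  AND (A."text" = 'abc' OR A."text" = 'xyz')
  AND NOT EXISTS (
        SELECT 1
        FROM tableB AS B
        WHERE B."bigint" = A."bigint")
ORDER BY A."date" DESC;
```

On 9.0 this form can be planned as an anti join rather than a hashed subplan pushed down into every child table.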

Another suggestion is that you ought to be running something newer than
9.0.0; you're missing over a year's worth of bug fixes (some of which
were memory leaks...).  If you are going to pick a PG version to sit on
and not bother to update, a dot-zero release is about your worst
possible choice; it will always have more bugs than a more mature
release series.  With my red fedora on, I'd also mutter that F13 is well
past its use-by date.

            regards, tom lane

Re: OOM-killer issue with a specific query (11 of 20)

From:
nabble.30.miller_2555@spamgourmet.com
Date:
On Tue, Dec 20, 2011 at 11:15 AM, F. BROUARD / SQLpro -
sqlpro@club-internet.fr
<+nabble+miller_2555+ca434688eb.sqlpro#club-internet.fr@spamgourmet.com>
wrote:
> I should think your query is not correct.
>
> Le 19/12/2011 16:52, nabble.30.miller_2555@spamgourmet.com a écrit :
>>
>> SELECT "bigint", "date", "text"
>> FROM tableA AS A
>> WHERE A."boolean" = 'true' AND
>> (A."text" = 'abc' OR A."text" = 'xyz') AND
>> A."bigint" NOT IN (SELECT "bigint" FROM tableB)
>> ORDER BY A."date" DESC;
>
>
> Why do you cast the true as a string ?
> Can't you be more simple like :
> WHERE A."boolean" = true
> and that's all ?
> or much more simple :
> WHERE A."boolean"
>
That is true - I was actually quoting the value as a literal to make
the query more explicit in the post ... probably not the best judgment
in hindsight, given I was posting to a performance-focused mailing
list :-). I do use the WHERE A."boolean" clause in the actual SQL
query to avoid the unnecessary parsing & type casting. Apologies for
any confusion.