DELETE taking too much memory

Поиск
Список
Период
Сортировка
От vincent dephily
Тема DELETE taking too much memory
Дата
Msg-id CAHdkwt5b=g_Nj0mogB-8u62U9wEB1KdCbPiBkJKGJyjM4q=FvQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: [GENERAL] DELETE taking too much memory
Список pgsql-performance
Hi,

I have a delete query taking 7.2G of ram (and counting) but I do not
understant why so much memory is necessary. The server has 12G, and
I'm afraid it'll go into swap. Using postgres 8.3.14.

I'm purging some old data from table t1, which should cascade-delete
referencing rows in t2. Here's an anonymized rundown :


# \d t1
                             Table "public.t1"
  Column   |            Type             |             Modifiers
-----------+-----------------------------+---------------------------------
 t1id      | integer                     | not null default
nextval('t1_t1id_seq'::regclass)
(...snip...)
Indexes:
    "message_pkey" PRIMARY KEY, btree (id)
(...snip...)

# \d t2
                               Table "public.t2"
     Column      |            Type             |        Modifiers
-----------------+-----------------------------+-----------------------------
 t2id            | integer                     | not null default
nextval('t2_t2id_seq'::regclass)
 t1id            | integer                     | not null
 foo             | integer                     | not null
 bar             | timestamp without time zone | not null default now()
Indexes:
    "t2_pkey" PRIMARY KEY, btree (t2id)
    "t2_bar_key" btree (bar)
    "t2_t1id_key" btree (t1id)
Foreign-key constraints:
    "t2_t1id_fkey" FOREIGN KEY (t1id) REFERENCES t1(t1id) ON UPDATE
RESTRICT ON DELETE CASCADE

# explain delete from t1 where t1id in (select t1id from t2 where
foo=0 and bar < '20101101');
                               QUERY PLAN
-----------------------------------------------------------------------------
 Nested Loop  (cost=5088742.39..6705282.32 rows=30849 width=6)
   ->  HashAggregate  (cost=5088742.39..5089050.88 rows=30849 width=4)
         ->  Index Scan using t2_bar_key on t2  (cost=0.00..5035501.50
rows=21296354 width=4)
               Index Cond: (bar < '2010-11-01 00:00:00'::timestamp
without time zone)
               Filter: (foo = 0)
   ->  Index Scan using t1_pkey on t1  (cost=0.00..52.38 rows=1 width=10)
         Index Cond: (t1.t1id = t2.t1id)
(7 rows)


Note that the estimate of 30849 rows is way off : there should be
around 55M rows deleted from t1, and 2-3 times as much from t2.

When looking at the plan, I can easily imagine that data gets
accumulated below the nestedloop (thus using all that memory), but why
isn't each entry freed once one row has been deleted from t1 ? That
entry isn't going to be found again in t1 or in t2, so why keep it
around ?

Is there a better way to write this query ? Would postgres 8.4/9.0
handle things better ?



Thanks in advance.


--
Vincent de Phily

В списке pgsql-performance по дате отправления:

Предыдущее
От: bakkiya
Дата:
Сообщение: Re: 100% CPU Utilization when we run queries.
Следующее
От: jtkells@verizon.net
Дата:
Сообщение: very large record sizes and ressource usage