Re: Deleting Rows From Large Tables

Поиск
Список
Период
Сортировка
От Vitalii Tymchyshyn
Тема Re: Deleting Rows From Large Tables
Дата
Msg-id CABWW-d3H5yq6FxdYk951OekwEUOSZNwSxH-iWSWtqSdBe8WRuA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Deleting Rows From Large Tables  (Sékine Coulibaly <scoulibaly@gmail.com>)
Список pgsql-performance

Analyze your temp tables after filling and before using!

17 трав. 2013 17:27, "Sékine Coulibaly" <scoulibaly@gmail.com> напис.
Oh, sorry, overlooked that part.
Maybe refreshing stats with VACUUM FULL ?


2013/5/17 Robert Emery <robertemery@codeweavers.net>
Hi Sékine,

Unfortunately I'm not trying to empty the table completely, just
delete about 10-15% of the data in it.

Thanks,

On 17 May 2013 14:11, Sékine Coulibaly <scoulibaly@gmail.com> wrote:
> Rob,
>
> Did you tried TRUNCATE ?
> http://www.postgresql.org/docs/8.4/static/sql-truncate.html
>
> This is is supposed to be quicker since it does scan the table.
>
> Regards
>
>
> 2013/5/17 Rob Emery <re-pgsql@codeweavers.net>
>>
>> Hi All,
>>
>> We've got 3 quite large tables that due to an unexpected surge in
>> usage (!) have grown to about 10GB each, with 72, 32 and 31 million
>> rows in. I've been tasked with cleaning out about half of them, the
>> problem I've got is that even deleting the first 1,000,000 rows seems
>> to take an unreasonable amount of time. Unfortunately this is on quite
>> an old server (Dell 2950 with a RAID-10 over 6 disks) running Postgres
>> 8.4; which serves other things like our logging systems.
>>
>> If I run a sustained (more than about 5 minutes) delete it'll have a
>> detrimental effect on the other services. I'm trying to batch up the
>> deletes into small chunks of approximately 1 month of data ; even this
>> seems to take too long, I originally reduced this down to a single
>> day's data and had the same problem. I can keep decreasing the size of
>> the window I'm deleting but I feel I must be doing something either
>> fundamentally wrong or over-complicating this enormously. I've
>> switched over to retrieving a list of IDs to delete, storing them in
>> temporary tables and deleting based on the primary keys on each of the
>> tables with something similar to this:
>>
>> BEGIN TRANSACTION;
>>
>> CREATE TEMPORARY TABLE table_a_ids_to_delete (id INT);
>> CREATE TEMPORARY TABLE table_b_ids_to_delete (id INT);
>>
>> INSERT INTO table_a_ids_to_delete
>>     SELECT id FROM table_a WHERE purchased ='-infinity' AND created_at
>> < '2007-01-01T00:00:00';
>>
>> INSERT INTO table_b_ids_to_delete
>>     SELECT table_b_id FROM table_a_table_b_xref
>>     INNER JOIN table_a_ids_to_delete ON (table_a_ids_to_delete.id =
>> table_a_table_b.quote_id);
>>
>> DELETE FROM table_a_table_b_xref USING table_a_ids_to_delete
>>     WHERE table_a_table_b_xref.table_a_id = table_a_ids_to_delete.id;
>>
>> DELETE FROM table_b USING table_b_ids_to_delete
>>     WHERE table_b.id = table_b_ids_to_delete.id;
>>
>> DELETE FROM table_a USING table_a_ids_to_delete
>>     WHERE table_a.id =  table_a_ids_to_delete.id;
>>
>> COMMIT;
>>
>> There're indices on table_a on the queried columns, table_b's primary
>> key is it's id, and table_a_table_b_xref has an index on (table_a_id,
>> table_b_id). There're FK defined on the xref table, hence why I'm
>> deleting from it first.
>>
>> Does anyone have any ideas as to what I can do to make the deletes any
>> faster? I'm running out of ideas!
>>
>> Thanks in advance,
>>
>> --
>> Rob Emery
>>
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>
>



--
Robert Emery
Database Administrator

| T: 0800 021 0888 | www.codeweavers.net |
| Codeweavers Limited | Barn 4 | Dunston Business Village | Dunston | ST18 9AB |
| Registered in England and Wales No. 04092394 | VAT registration no.
974 9705 63 |

 CUSTOMERS' BLOG     TWITTER       FACEBOOK      LINKED IN
DEVELOPERS' BLOG      YOUTUBE

В списке pgsql-performance по дате отправления:

Предыдущее
От: Sékine Coulibaly
Дата:
Сообщение: Re: Deleting Rows From Large Tables
Следующее
От: ach
Дата:
Сообщение: Re: statistics target for columns in unique constraint?