trying to delete most of the table by range of date col

Поиск
Список
Период
Сортировка
От Mariel Cherkassky
Тема trying to delete most of the table by range of date col
Дата
Msg-id CA+t6e1krNiXeqbXPi7pSYA8vvu7KsRCcE8yexzFBGrnBhS7sZg@mail.gmail.com
обсуждение исходный текст
Ответы Re: trying to delete most of the table by range of date col
Список pgsql-performance
Hi,
I have a big table (with 1.6 milion records). One of the columns is called end_date and it`s type is timestamp. I'm trying to find the best way to delete most of the table but not all of it according to a range of dates. The table structure : 
afa=# \d my_table;
                                              Table "public.my_table"
             Column              |           Type           |                        Modifiers
---------------------------------+--------------------------+----------------------------------------------------------
 id                              | bigint                   | not null default nextval('my_table_id_seq'::regclass)
 devid| integer                  | not null
 column_name| integer                  | not null
 column_name| integer                  | not null
 column_name| integer                  | not null
 column_name| timestamp with time zone |
 column_name| integer                  | not null
 column_name| integer                  | not null
 column_name| integer                  | not null
 column_name| integer                  | not null
 column_name| integer                  | not null
 column_name| integer                  | not null
 column_name| integer                  | not null
 column_name| text                     | not null
 column_name| integer                  | not null default 0
 column_name| integer                  | not null default 0
 column_name| integer                  | not null default 0
 column_name| integer                  | not null default 0
 column_name| integer                  | not null default 0
 column_name| integer                  | not null default 0
 column_name| integer                  | not null default 0
 column_name| integer                  | default 0
 column_name| integer                  | default 0
 column_name| integer                  | default 0
 column_name| integer                  | default 0
 column_name| integer                  | default 0
 column_name| integer                  | default 0
 end_date                        | timestamp with time zone |

Indexes:
    "my_table_pkey" PRIMARY KEY, btree (id)
    "my_table_date_idx" btree (date)
    "my_table_device_idx" btree (devid)
    "end_date_idx" btree (end_date)
Foreign-key constraints:
    "fk_aaaaa" FOREIGN KEY (devid) REFERENCES device_data(id)
Referenced by:
    TABLE "table1" CONSTRAINT "application_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
    TABLE "table2" CONSTRAINT "configuration_changes_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
    TABLE "table3" CONSTRAINT "fk_57hmvnx423bw9h203260r8gic" FOREIGN KEY (my_table) REFERENCES my_table(id)
    TABLE "table3" CONSTRAINT "interface_change_my_table_fk" FOREIGN KEY (my_table) REFERENCES my_table(id)
    TABLE "table4" CONSTRAINT "my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id) ON DELETE CASCADE
    TABLE "table5" CONSTRAINT "my_table_report_my_table_fk" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
    TABLE "table6" CONSTRAINT "my_table_to_policy_change_my_table_foreign_key" FOREIGN KEY (my_table) REFERENCES my_table(id)
    TABLE "table7" CONSTRAINT "network_object_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
    TABLE "table8" CONSTRAINT "orig_nat_rule_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
    TABLE "table9" CONSTRAINT "risk_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
    TABLE "table10" CONSTRAINT "rule_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
    TABLE "table11" CONSTRAINT "service_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)

As you can see alot of other tables uses the id col as a foreign key which make the delete much slower.

Solution I tried for the query : 

delete  from my_table where end_date <= to_date('12/12/2018','DD/MM/YYYY') and end_date > to_date('11/12/2018','DD/MM/YYYY');

                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Delete on my_table  (cost=0.00..65183.30 rows=1573862 width=6) (actual time=5121.344..5121.344 rows=0 loops=1)
   ->  Seq Scan on my_table  (cost=0.00..65183.30 rows=1573862 width=6) (actual time=0.012..2244.393 rows=1572864 loops=1)
         Filter: ((end_date <= to_date('12/12/2018'::text, 'DD/MM/YYYY'::text)) AND (end_date > to_date('11/12/2018'::text, 'DD/MM/YYYY'::text)))
         Rows Removed by Filter: 40253
 Planning time: 0.210 ms
 Trigger for constraint table1: time=14730.816 calls=1572864
 Trigger for constraint table2: time=30718.084 calls=1572864
 Trigger for constraint table3: time=28170.363 calls=1572864
 Trigger for constraint table4: time=29573.681 calls=1572864
 Trigger for constraint table5: time=29629.263 calls=1572864
 Trigger for constraint table6: time=29628.489 calls=1572864
 Trigger for constraint table7: time=29798.121 calls=1572864
 Trigger for constraint table8: time=29645.705 calls=1572864
 Trigger for constraint table9: time=29657.177 calls=1572864
 Trigger for constraint table10: time=29487.054 calls=1572864
 Trigger for constraint table11: time=30010.978 calls=1572864
 Trigger for constraint table12: time=26383.924 calls=1572864
 Execution time: 350603.047 ms
(18 rows)

-----------------------

DELETE FROM my_table WHERE id IN (select id from my_table where end_date <= to_date('12/12/2018','DD/MM/YYYY') and end_date > to_date('11/12/2018','DD/MM/YYYY'));



                                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Delete on my_table  (cost=92522.54..186785.27 rows=1572738 width=12) (actual time=9367.477..9367.477 rows=0 loops=1)
   ->  Hash Join  (cost=92522.54..186785.27 rows=1572738 width=12) (actual time=2871.906..5503.732 rows=1572864 loops=1)
         Hash Cond: (my_table.id = my_table_1.id)
         ->  Seq Scan on my_table  (cost=0.00..49052.16 rows=1613116 width=14) (actual time=0.004..669.184 rows=1613117 loops=1)
         ->  Hash  (cost=65183.32..65183.32 rows=1572738 width=14) (actual time=2871.301..2871.301 rows=1572864 loops=1)
               Buckets: 131072  Batches: 32  Memory Usage: 3332kB
               ->  Seq Scan on my_table my_table_1  (cost=0.00..65183.32 rows=1572738 width=14) (actual time=0.009..2115.826 rows=1572864 loops=1)
                     Filter: ((end_date <= to_date('12/12/2018'::text, 'DD/MM/YYYY'::text)) AND (end_date > to_date('11/12/2018'::text, 'DD/MM/YYYY'::text)))
                     Rows Removed by Filter: 40253
 Planning time: 0.419 ms
 Trigger for constraint my_table_id_fkey: time=14291.206 calls=1572864
 Trigger for constraint table2_fk: time=29171.591 calls=1572864
 Trigger for constraint table3_fk: time=26356.711 calls=1572864
 Trigger for constraint table4_fk: time=27579.694 calls=1572864
 Trigger for constraint table5_fk: time=27537.491 calls=1572864
 Trigger for constraint table6_fk: time=27574.169 calls=1572864
 Trigger for constraint table7_fk: time=27716.636 calls=1572864
 Trigger for constraint table8_fk: time=27780.192 calls=1572864
....
....

 Execution time: 333166.233 ms ~ 5.5 minutes
(23 rows)


Loading into a temp table the data isnt option because I cant truncate the table because of all the dependencies...

Any idea what else can I check ?

В списке pgsql-performance по дате отправления:

Предыдущее
От: Andreas Kretschmer
Дата:
Сообщение: Re: Extremely slow when query uses GIST exclusion index
Следующее
От: Justin Pryzby
Дата:
Сообщение: Re: trying to delete most of the table by range of date col