Re: Slow queries on big table
От | Tyrrill, Ed |
---|---|
Тема | Re: Slow queries on big table |
Дата | |
Msg-id | A23190A408F7094FAF446C1538222F7603EE4416@avaexch01.avamar.com обсуждение исходный текст |
Ответ на | Re: Slow queries on big table (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-performance |
Tom Lane <tgl@sss.pgh.pa.us> writes: >> Thanks for the help guys! That was my problem. I actually need the >> backup_location_rid index for a different query so I am going to keep >> it. > > Well, you don't really *need* it; the two-column index on (record_id, > backup_id) will serve perfectly well for queries on its leading column > alone. It'll be physically >>bigger and hence slightly slower to scan > than a single-column index; but unless the table is almost completely > read-only, the update overhead of maintaining all three indexes is > probably going to cost more than you can save with it. Try that other > query with and without backup_location_rid and see how much you're > really saving. Well, the query that got me to add backup_location_rid took 105 minutes using only the primary key index. After I added backup_location_rid the query was down to about 45 minutes. Still not very good, and I am still fiddling around with it. The query is: mdsdb=# explain analyze select backupobjects.record_id from backupobjects left outer join backup_location using(record_id) where backup_id is null; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------- Merge Left Join (cost=0.00..21408455.06 rows=11790970 width=8) (actual time=2784967.410..2784967.410 rows=0 loops=1) Merge Cond: ("outer".record_id = "inner".record_id) Filter: ("inner".backup_id IS NULL) -> Index Scan using backupobjects_pkey on backupobjects (cost=0.00..443484.31 rows=11790970 width=8) (actual time=0.073..47865.957 rows=11805996 loops=1) -> Index Scan using backup_location_rid on backup_location (cost=0.00..20411495.21 rows=162435366 width=12) (actual time=0.110..2608485.437 rows=162426837 loops=1) Total runtime: 2784991.612 ms (6 rows) It is of course the same backup_location, but backupobjects is: mdsdb=# \d backupobjects Table "public.backupobjects" Column | Type | Modifiers ----------------+-----------------------------+----------- record_id | bigint | not null dir_record_id | integer | name | text | extension | character varying(64) | hash | character(40) | mtime | timestamp without time zone | size | bigint | user_id | integer | group_id | integer | meta_data_hash | character(40) | Indexes: "backupobjects_pkey" PRIMARY KEY, btree (record_id) "backupobjects_meta_data_hash_key" UNIQUE, btree (meta_data_hash) "backupobjects_extension" btree (extension) "backupobjects_hash" btree (hash) "backupobjects_mtime" btree (mtime) "backupobjects_size" btree (size) record_id has in backupobjects has a many to many relationship to record_id in backup_location. Ed
В списке pgsql-performance по дате отправления: