Re: Help with slow table update

Поиск
Список
Период
Сортировка
От Pawel Veselov
Тема Re: Help with slow table update
Дата
Msg-id CAMnJ+Bc0kTZ+4PGvbC7u6e2xR2ZS0q0V3zy3Qv4DZxYvAtw6gQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Help with slow table update  ("David G. Johnston" <david.g.johnston@gmail.com>)
Список pgsql-general
On Mon, Apr 13, 2015 at 6:03 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Apr 13, 2015 at 5:01 PM, Pawel Veselov <pawel.veselov@gmail.com> wrote:

r_agrio_hourly - "good", r_agrio_total - "bad".

 Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329) (actual time=2.248..2.248 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207 rows=1 loops=1)
         Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND (device_type = 3::numeric) AND (placement = 2::numeric))
 Total runtime: 2.281 ms
 Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321) (actual time=106.766..106.766 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626 rows=1 loops=1)
         Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric) AND (device_type = 3::numeric) AND (placement = 2::numeric))
 Total runtime: 106.793 ms

What it is you expect to see here?

​What are the results (count and times) for:

SELECT count(*) FROM r_agrio_total WHERE tagid = 1002 and unitid = 1002;

Result: 8 (the whole table is 24 rows). It returns somewhat with a stumble, but relatively quickly.
db=> explain analyze SELECT count(*) FROM r_agrio_total WHERE tagid = 1002 and unitid = 1002;
                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4.45..4.46 rows=1 width=0) (actual time=327.194..327.195 rows=1 loops=1)
   ->  Index Scan using tag_r_agrio_total on r_agrio_total  (cost=0.42..4.45 rows=1 width=0) (actual time=0.039..327.189 rows=8 loops=1)
         Index Cond: (tagid = 1002::numeric)
         Filter: (unitid = 1002::numeric)
 Total runtime: 327.228 ms
 
SELECT count(*) FROM r_agrio_hourly WHERE tagid = 1002 and unitid = 1002;

Result is 2869. Returns somewhat quckly. Explain analyze is crazy though:
db=> explain analyze SELECT count(*) FROM r_agrio_hourly WHERE tagid = 1002 and unitid = 1002;
                                                                         QUERY PLAN                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=68134.68..68134.69 rows=1 width=0) (actual time=15177.211..15177.211 rows=1 loops=1)
   ->  Index Scan using adunit_r_agrio_hourly on r_agrio_hourly  (cost=0.42..67027.10 rows=443035 width=0) (actual time=0.096..15175.730 rows=2869 loops=1)
         Index Cond: (unitid = 1002::numeric)
         Filter: (tagid = 1002::numeric)
 Total runtime: 15177.240 ms

​More queries along this line might be needed.  The underlying question is how many index rows need to be skipped over on "total" to get the final result - or rather are the columns in the index in descending order of cardinality? 

Idea is - both tables have unique multi-field indices, and each update hits exactly one row from that index, no more, and all fields from the index are locked with equality condition on the update. All of the updates (within a transaction) would always work on a small subset of rows (max a few hundred, ever; in this case, it's may be around 10). I expect it to be possible for the server to keep the active working set in the cache at all times. Since the index is unique, there shouldn't be a reason to re-scan the table, if a cached row is found, no?
 
Any chance you can perform a "REINDEX" - maybe there is some bloat present?  There are queries to help discern if that may be the case, I do not know then off the top of my head, but just doing it might be acceptable and is definitely quicker if so. 

That's the thing - I've done both vacuum full, and re-index. The very first time I did vacuum full things improved (60 seconds to 7 seconds). Re-index didn't improve anything (but it was done after vacuum full).
 
​I'm still not really following your presentation but maybe my thoughts will spark something.​

Thank you! I hope I clarified this some :)


В списке pgsql-general по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: Hot standby problems: consistent state not reached, no connection to master server.
Следующее
От: Pawel Veselov
Дата:
Сообщение: Re: Help with slow table update