Re: [PERFORM] Querying with multicolumn index
От | Eric Jiang |
---|---|
Тема | Re: [PERFORM] Querying with multicolumn index |
Дата | |
Msg-id | CAOfJSTxCfzoKxZ72nFV-Ow8mSch8b6C8qS6efH_1_Oi7SA7acQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PERFORM] Querying with multicolumn index (Daniel Blanch Bataller <daniel.blanch.bataller@gmail.com>) |
Ответы |
Re: [PERFORM] Querying with multicolumn index
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-performance |
We aren't using any special planner settings - all enable_* options are "on". On Sat, Dec 10, 2016 at 12:06 AM, Daniel Blanch Bataller <daniel.blanch.bataller@gmail.com> wrote: > As a quick fix: Have you considered dropping ix_updates_time index? We do occasionally want to use ix_updates_time, although not very often. > I’ve been able to reproduce the issue, but with bigger numbers than you. When I dropped ix_updates_time it went much muchfaster. It uses updates_driver_id_time_idx index instead. > > For some reason the planner does not make a good estimation in this case. Can you show us EXPLAIN (ANALYZE, BUFFERS) beforeand after dropping ix_updates_time index? Can you show us too the output of SHOW shared_buffers; ? Here's EXPLAIN (ANALYZE, BUFFERS) with the above bad query on a cold cache: Limit (cost=0.09..0.70 rows=1 width=56) (actual time=244846.915..244846.915 rows=1 loops=1) Buffers: shared hit=3999254 read=57831 I/O Timings: read=242139.661 -> Index Scan Backward using ix_updates_time on updates (cost=0.09..271696.74 rows=442550 width=56) (actual time=244846.913..244846.913 rows=1 loops=1) Filter: (driver_id = 30132) Rows Removed by Filter: 5316811 Buffers: shared hit=3999254 read=57831 I/O Timings: read=242139.661 Total runtime: 244846.946 ms and after dropping ix_updates_time: Limit (cost=0.11..0.98 rows=1 width=56) (actual time=2.270..2.271 rows=1 loops=1) Buffers: shared hit=1 read=4 I/O Timings: read=2.230 -> Index Scan Backward using updates_driver_id_time_idx on updates (cost=0.11..382307.69 rows=442550 width=56) (actual time=2.270..2.270 rows=1 loops=1) Index Cond: (driver_id = 30132) Buffers: shared hit=1 read=4 I/O Timings: read=2.230 Total runtime: 2.305 ms and `SHOW shared_buffers;` shared_buffers ---------------- 244MB > I suspect the issue has to do with low shared_buffers configuration and cache misses, and maybe some costs suboptimal configurationI’ll try to find it out, if anyone can enlighten us it will be very welcomed. > > > P.S. Meanwhile If you still need 'time' index, you can create an index using ‘time' and ‘customer_id' in this order. Did you mean an index on (time, driver_id)? I did: CREATE INDEX CONCURRENTLY ix_updates_time_driver_id ON updates ("time", driver_id) but seems like the planner will use it for driver_id having more than ~300k rows: Limit (cost=0.11..0.79 rows=1 width=56) (actual time=115.051..115.052 rows=1 loops=1) Buffers: shared hit=20376 -> Index Scan Backward using ix_updates_time_driver_id on updates (cost=0.11..302189.90 rows=443924 width=56) (actual time=115.048..115.048 rows=1 loops=1) Index Cond: (driver_id = 30132) Buffers: shared hit=20376 Total runtime: 115.091 ms It does seem faster than when having an index on just "time", but still not optimal. Really appreciate everyone's help! -- Eric Jiang, DoubleMap eric@doublemap.com | www.doublemap.com
В списке pgsql-performance по дате отправления:
Предыдущее
От: Daniel Blanch BatallerДата:
Сообщение: Re: [PERFORM] Querying with multicolumn index