Обсуждение: does the planner "learn"?

Поиск
Список
Период
Сортировка

does the planner "learn"?

От
TJ O'Donnell
Дата:
I understand the value of indexes and of ANALYZE for the efficient use of them.
In the following statement, you can see that the index scan is being used.
Even though it takes 80 seconds (for a 1.25 million row table), it is
much faster than without the index.
But, if I repeat this search, it speeds up by more than a factor of 2!
I love it, but I'd sure like to understand why.  When I do it a third time,
it speeds up again.  A fourth try does not speed it up more.
Is this speedup due to some memory/disk buffering from which I'm
benefiting?  I'm using linux (actually under VMware on WinXP, so it's even less
efficient that it could be on it's own).  Or is the planner learning
something from previous runs of this search?  It appears not, since the
rows it thinks it needs to search are the same in the EXPLAIN ANALYZE
outputs below.  Can someone help me understand why my searches are speeding
up so I can make it happen the first time, if possible?

Thanks,
TJ


Try #1:
Select distinct smiles from structure where (_c, _o, _arom_c, _c_double, _o_double, _n, _ring_c, _ring_hetero,
_halogen, 
_n_double, _arom_n, _s, _s_double, _other_atoms, _c_triple, _n_triple, _p, _arom_s, _p_double, _arom_o) >=
(4,2,6,2,2,1,4,1,0,0,0,0,0,0,0,0,0,0,0,0) and oe_matches(smiles,'c1ccc(cc1)C2CC(=O)NC2=O') limit 500

Limit  (cost=92649.53..92652.35 rows=500 width=49) (actual time=81544.566..81545.522 rows=117 loops=1)
   ->  Unique  (cost=92649.53..92688.60 rows=6924 width=49) (actual time=81544.561..81545.174 rows=117 loops=1)
         ->  Sort  (cost=92649.53..92669.06 rows=7813 width=49) (actual time=81544.553..81544.726 rows=117 loops=1)
               Sort Key: smiles
               ->  Index Scan using fingerprint on structure  (cost=0.00..92144.36 rows=7813 width=49) (actual
time=36.179..81533.872 rows=117 loops=1)
                     Index Cond: ((_c >= 4) AND (_o >= 2) AND (_arom_c >= 6) AND (_c_double >= 2) AND (_o_double >= 2)
AND (_n >= 1) AND (_ring_c >= 4) AND (_ring_hetero >= 1) AND (_halogen >= 0) AND (_n_double >= 0) AND (_arom_n >= 0)
AND 
(_s >= 0) AND (_s_double >= 0) AND (_other_atoms >= 0) AND (_c_triple >= 0) AND (_n_triple >= 0) AND (_p >= 0) AND
(_arom_s >= 0) AND (_p_double >= 0) AND (_arom_o >= 0))
                     Filter: oe_matches(smiles, 'c1ccc(cc1)C2CC(=O)NC2=O'::character varying)
Total runtime: 81545.903 ms

Try #2:
Limit  (cost=92649.53..92652.35 rows=500 width=49) (actual time=36924.436..36925.450 rows=117 loops=1)
   ->  Unique  (cost=92649.53..92688.60 rows=6924 width=49) (actual time=36924.431..36925.051 rows=117 loops=1)
         ->  Sort  (cost=92649.53..92669.06 rows=7813 width=49) (actual time=36924.423..36924.596 rows=117 loops=1)
               Sort Key: smiles
               ->  Index Scan using fingerprint on structure  (cost=0.00..92144.36 rows=7813 width=49) (actual
time=14.591..36891.589 rows=117 loops=1)
                     Index Cond: ((_c >= 4) AND (_o >= 2) AND (_arom_c >= 6) AND (_c_double >= 2) AND (_o_double >= 2)
AND (_n >= 1) AND (_ring_c >= 4) AND (_ring_hetero >= 1) AND (_halogen >= 0) AND (_n_double >= 0) AND (_arom_n >= 0)
AND 
(_s >= 0) AND (_s_double >= 0) AND (_other_atoms >= 0) AND (_c_triple >= 0) AND (_n_triple >= 0) AND (_p >= 0) AND
(_arom_s >= 0) AND (_p_double >= 0) AND (_arom_o >= 0))
                     Filter: oe_matches(smiles, 'c1ccc(cc1)C2CC(=O)NC2=O'::character varying)
Total runtime: 36925.820 ms

Try #3:
Limit  (cost=92649.53..92652.35 rows=500 width=49) (actual time=23712.435..23713.394 rows=117 loops=1)
   ->  Unique  (cost=92649.53..92688.60 rows=6924 width=49) (actual time=23712.430..23713.046 rows=117 loops=1)
         ->  Sort  (cost=92649.53..92669.06 rows=7813 width=49) (actual time=23712.422..23712.599 rows=117 loops=1)
               Sort Key: smiles
               ->  Index Scan using fingerprint on structure  (cost=0.00..92144.36 rows=7813 width=49) (actual
time=17.548..23631.915 rows=117 loops=1)
                     Index Cond: ((_c >= 4) AND (_o >= 2) AND (_arom_c >= 6) AND (_c_double >= 2) AND (_o_double >= 2)
AND (_n >= 1) AND (_ring_c >= 4) AND (_ring_hetero >= 1) AND (_halogen >= 0) AND (_n_double >= 0) AND (_arom_n >= 0)
AND 
(_s >= 0) AND (_s_double >= 0) AND (_other_atoms >= 0) AND (_c_triple >= 0) AND (_n_triple >= 0) AND (_p >= 0) AND
(_arom_s >= 0) AND (_p_double >= 0) AND (_arom_o >= 0))
                     Filter: oe_matches(smiles, 'c1ccc(cc1)C2CC(=O)NC2=O'::character varying)
Total runtime: 23713.765 ms


Re: does the planner "learn"?

От
Martijn van Oosterhout
Дата:
On Mon, Feb 07, 2005 at 11:07:16AM -0800, TJ O'Donnell wrote:
> I understand the value of indexes and of ANALYZE for the efficient use of
> them.
> In the following statement, you can see that the index scan is being used.
> Even though it takes 80 seconds (for a 1.25 million row table), it is
> much faster than without the index.
> But, if I repeat this search, it speeds up by more than a factor of 2!
> I love it, but I'd sure like to understand why.  When I do it a third time,
> it speeds up again.  A fourth try does not speed it up more.
> Is this speedup due to some memory/disk buffering from which I'm
> benefiting?  I'm using linux (actually under VMware on WinXP, so it's even

Yep. Buffering improves performance considerably.

> less
> efficient that it could be on it's own).  Or is the planner learning
> something from previous runs of this search?  It appears not, since the
> rows it thinks it needs to search are the same in the EXPLAIN ANALYZE
> outputs below.  Can someone help me understand why my searches are speeding
> up so I can make it happen the first time, if possible?

Nope, the plan would be the same everytime. There's no way the planner
is doing anything different.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Вложения