Re: Fix for seg picksplit function

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: Fix for seg picksplit function
Дата
Msg-id AANLkTi=cM_RQ47nPyVoj8x0pkX25zWFKzC6DdMPFBc9u@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Fix for seg picksplit function  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Sat, Nov 20, 2010 at 6:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
Well, the problem with just comparing on < is that it takes very
little account of the upper bounds.  I think the cases where a simple
split would hurt you the most are those where examining the upper
bound is necessary to to get a good split.
Yes, also such asymmetric solution seems not beautiful enough for me :). It's easy to sort segs by their center, in this case lower and upper bound will be used equally. New patch is attached. I checked it on various data distributions.

1) Uniform distribution
test=# insert into seg_test (select (a || ' .. ' || a + 0.00005*b)::seg from (select random() as a, random() as b from generate_series(1,1000000)) x);
INSERT 0 1000000
Time: 79121,830 ms
test=# create index seg_test_idx on seg_test using gist (a);
CREATE INDEX
Time: 176409,434 ms
test=# explain (buffers, analyze) select * from seg_test where a @> '0.5 .. 0.5'::seg;
                                                        QUERY PLAN                                                      
  
--------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on seg_test  (cost=28.19..2500.32 rows=1000 width=12) (actual time=0.251..0.886 rows=27 loops=1)
   Recheck Cond: (a @> '0.5'::seg)
   Buffers: shared hit=3 read=27
   ->  Bitmap Index Scan on seg_test_idx  (cost=0.00..27.94 rows=1000 width=0) (actual time=0.193..0.193 rows=27 loops=1)
         Index Cond: (a @> '0.5'::seg)
         Buffers: shared hit=3
 Total runtime: 1.091 ms
(7 rows)

Time: 41,884 ms

2) Natural distribution (Box–Muller transform was used for data generation)
test=# insert into seg_test (select ( a - 0.00005*abs(b) || ' .. ' || a + 0.00005*abs(b))::seg from (select cos(2.0*pi()*random())*sqrt(-2.0*ln(random())) as a, cos(2.0*pi()*random())*sqrt(-2.0*ln(random())) as b from generate_series(1,1000000)) x);
INSERT 0 1000000
Time: 98614,305 ms
test=# create index seg_test_idx on seg_test using gist(a);
CREATE INDEX
Time: 212513,540 ms
test=# explain (buffers, analyze) select * from seg_test where a @> '0.3 .. 0.3'::seg;
                                                        QUERY PLAN                                                      
  
--------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on seg_test  (cost=28.18..2500.31 rows=1000 width=12) (actual time=0.132..0.428 rows=27 loops=1)
   Recheck Cond: (a @> '0.3'::seg)
   Buffers: shared hit=3 read=27
   ->  Bitmap Index Scan on seg_test_idx  (cost=0.00..27.93 rows=1000 width=0) (actual time=0.103..0.103 rows=27 loops=1)
         Index Cond: (a @> '0.3'::seg)
         Buffers: shared hit=3
 Total runtime: 0.504 ms
(7 rows)

Time: 0,967 ms

3) Many distinct values
test=# insert into seg_test (select (a||'..'||(a+1))::seg from (select (random()*13000)::integer as a from generate_series(1,1000000)) x);
INSERT 0 1000000
Time: 90775,952 ms
test=# create index seg_test_idx on seg_test using gist(a);
CREATE INDEX
Time: 200960,758 ms
test=# explain (buffers, analyze) select * from seg_test where a @> '700.0 .. 700.0'::seg;
                                                        QUERY PLAN                                                      
   
---------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on seg_test  (cost=28.19..2500.33 rows=1000 width=12) (actual time=0.358..3.531 rows=138 loops=1)
   Recheck Cond: (a @> '700.0'::seg)
   Buffers: shared hit=3 read=135
   ->  Bitmap Index Scan on seg_test_idx  (cost=0.00..27.94 rows=1000 width=0) (actual time=0.270..0.270 rows=138 loops=1)
         Index Cond: (a @> '700.0'::seg)
         Buffers: shared hit=3
 Total runtime: 3.882 ms
(7 rows)

Time: 5,271 ms

----
With best regards,
Alexander Korotkov.
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Yeb Havinga
Дата:
Сообщение: Re: Fix for seg picksplit function
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons