Re: Double sorting split patch

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: Double sorting split patch
Дата
Msg-id CAPpHfdvt-8vkD3p5wUCH0A3fCWtmgKxzG6bp23K4aBvOrS5Vxw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Double sorting split patch  (Alexander Korotkov <aekorotkov@gmail.com>)
Ответы Re: Double sorting split patch
Список pgsql-hackers
Path without allocating extra bytes is attached.
I run some more detailed tests on geonames and two smaller datasets from rtreeportal.org.
Test tables are following:
1) test1 - geonames
2) test2 - California Roads, containing the MBRs of 2,249,727 streets
3) test3 - Tiger Streams, containing the MBRs of 194,971 streams
Scripts for test queries generation and execution are attached (scripts.tar.gz).

Build times are given in the following table:
      New linear Double sorting
test1 277.889630 273.766355
test2  73.262561  75.114079
test3   4.251563   4.425139
As we can see, build times differ insignificantly.

Index sizes are given in the following table:
      New linear Double sorting
test1 572383232  578977792
test2 179929088  178569216
test3  15409152   15532032
As we can see, index sizes differ insignificantly.

Data about index quality testing are in the table test_results of the following structure:
tablename - name of the queried table
count - actual count of rows matching query
nominal_count - count of rows matching query which test generator tried to provide (test generator not always can create test query which return exactly same amount of rows as it was expected)
strategy - split strategy: 1 - new liner split(current), 2 - double sorting(this patch)
hits - number of pages hits for query execution.

Raw data are in the attachment (test_result.sql.gz). Report is below.
test=# select tablename, nominal_count, round(avg(count),2) as avg_count, strategy, round(avg(hits),2) as avg_hits from test_results group by tablename, nominal_count, strategy order by tablename, nominal_count, strategy;
 tablename | nominal_count | avg_count | strategy | avg_hits 
-----------+---------------+-----------+----------+----------
 test1     |             1 |      4.87 |        1 |    19.94
 test1     |             1 |      4.87 |        2 |     6.46
 test1     |            10 |     11.07 |        1 |    23.95
 test1     |            10 |     11.07 |        2 |     7.36
 test1     |           100 |    101.36 |        1 |    43.30
 test1     |           100 |    101.36 |        2 |    10.19
 test1     |          1000 |    998.70 |        1 |    86.48
 test1     |          1000 |    998.70 |        2 |    24.21
 test2     |             1 |      1.32 |        1 |     8.67
 test2     |             1 |      1.32 |        2 |     5.99
 test2     |            10 |     11.32 |        1 |     9.40
 test2     |            10 |     11.32 |        2 |     6.71
 test2     |           100 |    102.93 |        1 |    13.10
 test2     |           100 |    102.93 |        2 |     9.02
 test2     |          1000 |    999.67 |        1 |    32.16
 test2     |          1000 |    999.67 |        2 |    23.51
 test3     |             1 |      0.99 |        1 |     6.03
 test3     |             1 |      0.99 |        2 |     4.32
 test3     |            10 |      9.95 |        1 |     7.52
 test3     |            10 |      9.95 |        2 |     5.09
 test3     |           100 |     99.92 |        1 |    10.73
 test3     |           100 |     99.92 |        2 |     7.73
 test3     |          1000 |    999.75 |        1 |    27.47
 test3     |          1000 |    999.75 |        2 |    22.44
(24 rows)

As we can see, double sorting split outperfoms new linear split in all the presented cases. Difference in test2 and test3 is not so big as it is in test1. It can be explained by fact that test2 and test3 are smaller than test1.

------
With best regards,
Alexander Korotkov.
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: [REVIEW] prepare plans of embedded sql on function start
Следующее
От: Magnus Hagander
Дата:
Сообщение: Bug in walsender when calling out to do_pg_stop_backup (and others?)