Re: speeding up planning with partitions

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: speeding up planning with partitions
Дата
Msg-id 3f280722-46f2-c2a4-4c19-2cfa28c6c1cd@lab.ntt.co.jp
обсуждение исходный текст
Ответ на speeding up planning with partitions  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Ответы Re: speeding up planning with partitions
Список pgsql-hackers
On 2018/08/29 21:06, Amit Langote wrote:
> I measured the gain in performance due to each patch on a modest virtual
> machine.  Details of the measurement and results follow.
>
> UPDATE:
>
> nparts  master    0001   0002   0003
> ======  ======    ====   ====   ====
> 0         2856    2893   2862   2816
> 8          507    1115   1447   1872
> 16         260     765   1173   1892
> 32         119     483    922   1884
> 64          59     282    615   1881
> 128         29     153    378   1835
> 256         14      79    210   1803
> 512          5      40    113   1728
> 1024         2      17     57   1616
> 2048         0*      9     30   1471
> 4096         0+      4     15   1236
> 8192         0=      2      7    975
>
> For SELECT:
>
> nparts  master    0001   0002   0003
> ======  ======    ====   ====   ====
> 0         2290    2329   2319   2268
> 8         1058    1077   1414   1788
> 16         711     729   1124   1789
> 32         450     475    879   1773
> 64         265     272    603   1765
> 128        146     149    371   1685
> 256          76      77    214   1678
> 512          39      39    112   1636
> 1024          16      17     59   1525
> 2048           8       9     29   1416
> 4096           4       4     15   1195
> 8192           2       2      7    932

Prompted by Tsunakawa-san's comment, I tried to look at the profiles when
running the benchmark with partitioning and noticed a few things that made
clear why, even with 0003 applied, tps numbers decreased as the number of
partitions increased.  Some functions that appeared high up in the
profiles were related to partitioning:

* set_relation_partition_info calling partition_bounds_copy(), which calls
  datumCopy() on N Datums, where N is the number of partitions.  The more
  the number of partitions, higher up it is in profiles.  I suspect that
  this copying might be redundant; planner can keep using the same pointer
  as relcache

There are a few existing and newly introduced sites in the planner where
the code iterates over *all* partitions of a table where processing just
the partition selected for scanning would suffice.  I observed the
following functions in profiles:

* make_partitionedrel_pruneinfo, which goes over all partitions to
  generate subplan_map and subpart_map arrays to put into the
  PartitionedRelPruneInfo data structure that it's in the charge of
  generating

* apply_scanjoin_target_to_paths, which goes over all partitions to adjust
  their Paths for applying required scanjoin target, although most of
  those are dummy ones that won't need the adjustment

* For UPDATE, a couple of functions I introduced in patch 0001 were doing
  the same thing as apply_scanjoin_target_to_paths, which is unnecessary

To fix the above three instances of redundant processing, I added a
Bitmapset 'live_parts' to the RelOptInfo which stores the set of indexes
of only the unpruned partitions (into the RelOptInfo.part_rels array) and
replaced the for (i = 0; i < rel->nparts; i++) loops in those sites with
the loop that iterates over the members of 'live_parts'.

Results looked were promising indeed, especially after applying 0003 which
gets rid of locking all partitions.

UPDATE:

nparts  master    0001    0002   0003
======  ======    ====    ====   ====
0         2856    2893    2862   2816
8          507    1115    1466   1845
16         260     765    1161   1876
32         119     483     910   1862
64          59     282     609   1895
128         29     153     376   1884
256         14      79     212   1874
512          5      40     115   1859
1024         2      17      58   1847
2048         0       9      29   1883
4096         0       4      15   1867
8192         0       2       7   1826

SELECT:

nparts  master    0001    0002   0003
======  ======    ====    ====   ====
0       2290      2329    2319   2268
8       1058      1077    1431   1800
16       711       729    1158   1781
32       450       475     908   1777
64       265       272     612   1791
128      146       149     379   1777
256       76        77     213   1785
512       39        39     114   1776
1024      16        17      59   1756
2048       8         9      30   1746
4096       4         4      15   1722
8192       2         2       7   1706

Note that with 0003, tps doesn't degrade as the number of partitions increase.

Attached updated patches, with 0002 containing the changes mentioned above.

Thanks,
Amit

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: [HACKERS] [PATCH] kNN for SP-GiST
Следующее
От: Alexander Korotkov
Дата:
Сообщение: Re: [HACKERS] [PATCH] kNN for SP-GiST