pgsql: Fix O(N^2) performance problems in regular-expression compiler.

Поиск
Список
Период
Сортировка
От Tom Lane
Тема pgsql: Fix O(N^2) performance problems in regular-expression compiler.
Дата
Msg-id E1ZnB7A-0005hT-6k@gemulon.postgresql.org
обсуждение исходный текст
Список pgsql-committers
Fix O(N^2) performance problems in regular-expression compiler.

Change the singly-linked in-arc and out-arc lists to be doubly-linked,
so that arc deletion is constant time rather than having worst-case time
proportional to the number of other arcs on the connected states.

Modify the bulk arc transfer operations copyins(), copyouts(), moveins(),
moveouts() so that they use a sort-and-merge algorithm whenever there's
more than a small number of arcs to be copied or moved.  The previous
method is O(N^2) in the number of arcs involved, because it performs
duplicate checking independently for each copied arc.  The new method may
change the ordering of existing arcs for the destination state, but nothing
really cares about that.

Provide another bulk arc copying method mergeins(), which is unused as
of this commit but is needed for the next one.  It basically is like
copyins(), but the source arcs might not all come from the same state.

Replace the O(N^2) bubble-sort algorithm used in carcsort() with a qsort()
call.

These changes greatly improve the performance of regex compilation for
large or complex regexes, at the cost of extra space for arc storage during
compilation.  The original tradeoff was probably fine when it was made, but
now we care more about speed and less about memory consumption.

Back-patch to all supported branches.

Branch
------
REL9_5_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/cff9e0659e8b79d4e075d30f04dac5a5587b8ac2

Modified Files
--------------
src/backend/regex/regc_nfa.c |  808 +++++++++++++++++++++++++++++++++++++-----
src/backend/regex/regcomp.c  |   10 +-
src/include/regex/regguts.h  |    4 +-
3 files changed, 728 insertions(+), 94 deletions(-)


В списке pgsql-committers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: pgsql: Fix regular-expression compiler to handle loops of constraint ar
Следующее
От: Tom Lane
Дата:
Сообщение: pgsql: Improve performance of fixempties() pass in regular-expression c