pgsql: Improve memory-usage accounting in regular-expression compiler.

From Tom Lane
Subject pgsql: Improve memory-usage accounting in regular-expression compiler.
Date
Msg-id E1ZnB7A-0005hw-8f@gemulon.postgresql.org
List pgsql-committers
Improve memory-usage accounting in regular-expression compiler.

This code previously counted the number of NFA states it created, and
complained if a limit was exceeded, so as to prevent bizarre regex patterns
from consuming unreasonable time or memory.  That was fine as far as it went,
but the code paid no attention to how many arcs linked those states.  Since
regexes can be contrived that have O(N) states but will need O(N^2) arcs
after fixempties() processing, it was still possible to blow out memory,
and take a long time doing it too.  To fix, modify the bookkeeping to count
space used by both states and arcs.
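
The idea of charging both states and arcs against one byte budget can be
sketched roughly as follows.  This is only an illustration, not the code
in regc_nfa.c: the struct layouts, the SKETCH_MAX_SPACE value and all
function names are hypothetical.  Sizing the budget with sizeof is one
way a limit could come out smaller on 32-bit machines than on 64-bit
ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real structs in regguts.h differ. */
struct sketch_arc   { int type, color; void *from, *to, *chains[4]; };
struct sketch_state { int no, nins, nouts; void *ins, *outs, *next, *prev; };

/* Assumed budget (not the real limit), expressed in bytes via sizeof. */
#define SKETCH_MAX_SPACE \
    (1000 * sizeof(struct sketch_state) + 1000 * sizeof(struct sketch_arc))

struct sketch_budget { size_t used; int too_big; };

/* Charge nbytes to the budget; returns 1 on success, 0 if over the limit. */
static int charge(struct sketch_budget *b, size_t nbytes)
{
    assert(b->used <= SKETCH_MAX_SPACE);
    if (b->too_big || nbytes > SKETCH_MAX_SPACE - b->used)
    {
        b->too_big = 1;     /* sticky, like a "too complex" error flag */
        return 0;
    }
    b->used += nbytes;
    return 1;
}

static int new_state(struct sketch_budget *b)
{
    return charge(b, sizeof(struct sketch_state));
}

static int new_arc(struct sketch_budget *b)
{
    return charge(b, sizeof(struct sketch_arc));
}

/*
 * Account for n states plus one arc per ordered pair of states:
 * O(n) states but O(n^2) arcs.  Returns 1 if it all fit in the budget.
 */
int build_dense(struct sketch_budget *b, size_t n)
{
    size_t i, j;

    for (i = 0; i < n; i++)
        if (!new_state(b))
            return 0;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            if (!new_arc(b))
                return 0;
    return 1;
}
```

With these assumed numbers, build_dense() on 100 states stays far below
a states-only cap of 1000, yet its 10000 arcs exceed the combined
budget, which is the kind of case a states-only count would miss.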

I did not bother with including the "color map" in the accounting; it
can only grow to a few megabytes, which is not a lot in comparison to
what we're allowing for states+arcs (about 150MB on 64-bit machines
or half that on 32-bit machines).

Looking at some of the larger real-world regexes captured in the Tcl
regression test suite suggests that the most that is likely to be needed
for regexes found in the wild is under 10MB, so I believe that the current
limit has enough headroom to make it okay to keep it as a hard-wired limit.

In connection with this, redefine REG_ETOOBIG as meaning "regular
expression is too complex"; the previous wording of "nfa has too many
states" was already somewhat inapropos because of the error code's use
for stack depth overrun, and it was not very user-friendly either.
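
As a rough illustration of why the broader wording fits better, the
sketch below maps two different failure causes onto one error code and
one message.  The enum names, numeric values and functions here are
hypothetical and are not the definitions in regex.h or regerrs.h; only
the message text is taken from this commit.

```c
/* Hypothetical codes; the real values live in src/include/regex/regex.h. */
enum sketch_err   { SKETCH_OKAY = 0, SKETCH_ETOOBIG = 1 };
enum sketch_cause { CAUSE_NONE, CAUSE_SPACE_BUDGET, CAUSE_STACK_DEPTH };

/* Both resource failures report the same error code. */
enum sketch_err sketch_classify(enum sketch_cause cause)
{
    switch (cause)
    {
        case CAUSE_SPACE_BUDGET:    /* states + arcs exceeded the budget */
        case CAUSE_STACK_DEPTH:     /* recursion went too deep */
            return SKETCH_ETOOBIG;
        default:
            return SKETCH_OKAY;
    }
}

/* One message that is accurate for either cause. */
const char *sketch_errmsg(enum sketch_err code)
{
    static const char too_big[] = "regular expression is too complex";
    static const char okay[] = "no errors detected";   /* placeholder text */

    return (code == SKETCH_ETOOBIG) ? too_big : okay;
}
```

A message that mentions NFA states would be misleading for the
stack-depth case, whereas "too complex" describes both.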

Back-patch to all supported branches.

Branch
------
REL9_4_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/0ecf4a9e55d7a9322f3aaee31bbd68ba01b2820e

Modified Files
--------------
src/backend/regex/regc_nfa.c |   75 ++++++++----------------------------------
src/backend/regex/regcomp.c  |    2 ++
src/include/regex/regerrs.h  |    2 +-
src/include/regex/regex.h    |    2 +-
src/include/regex/regguts.h  |   15 +++++----
5 files changed, 27 insertions(+), 69 deletions(-)

