Commit 57a2d4a1 authored by Tom Lane

Fix performance bug in regexp's citerdissect/creviterdissect.

After detecting a sub-match "dissect" failure (i.e., a backref match
failure) in the i'th sub-match of an iteration node, we should proceed
by adjusting the attempted length of the i'th submatch.  As coded,
though, these functions changed the attempted length of the *last*
sub-match, and only after exhausting all possibilities for that would
they back up to adjust the next-to-last sub-match, and then the
second-from-last, etc; all of which is wasted effort, since only
changing the start or length of the i'th sub-match can possibly make
it succeed.  This oversight creates the possibility for exponentially
bad performance.  Fortunately the problem is masked in most cases by
optimizations or constraints applied elsewhere; which explains why
we'd not noticed it before.  But it is possible to reach the problem
with fairly simple, if contrived, regexps.

Oversight in my commit 173e29aa.  That's pretty ancient now,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/1808998.1629412269@sss.pgh.pa.us
parent 92ce7f52
@@ -1133,8 +1133,8 @@ citerdissect(struct vars *v,
      * Our strategy is to first find a set of sub-match endpoints that are
      * valid according to the child node's DFA, and then recursively dissect
      * each sub-match to confirm validity. If any validity check fails,
-     * backtrack the last sub-match and try again. And, when we next try for
-     * a validity check, we need not recheck any successfully verified
+     * backtrack that sub-match and try again. And, when we next try for a
+     * validity check, we need not recheck any successfully verified
      * sub-matches that we didn't move the endpoints of. nverified remembers
      * how many sub-matches are currently known okay.
      */
@@ -1222,12 +1222,13 @@ citerdissect(struct vars *v,
         return REG_OKAY;
     }
-    /* match failed to verify, so backtrack */
+    /* i'th match failed to verify, so backtrack it */
+    k = i;
 backtrack:
     /*
-     * Must consider shorter versions of the current sub-match. However,
+     * Must consider shorter versions of the k'th sub-match. However,
      * we'll only ask for a zero-length match if necessary.
      */
     while (k > 0)
@@ -1338,8 +1339,8 @@ creviterdissect(struct vars *v,
      * Our strategy is to first find a set of sub-match endpoints that are
      * valid according to the child node's DFA, and then recursively dissect
      * each sub-match to confirm validity. If any validity check fails,
-     * backtrack the last sub-match and try again. And, when we next try for
-     * a validity check, we need not recheck any successfully verified
+     * backtrack that sub-match and try again. And, when we next try for a
+     * validity check, we need not recheck any successfully verified
      * sub-matches that we didn't move the endpoints of. nverified remembers
      * how many sub-matches are currently known okay.
      */
@@ -1433,12 +1434,13 @@ creviterdissect(struct vars *v,
         return REG_OKAY;
     }
-    /* match failed to verify, so backtrack */
+    /* i'th match failed to verify, so backtrack it */
+    k = i;
 backtrack:
     /*
-     * Must consider longer versions of the current sub-match.
+     * Must consider longer versions of the k'th sub-match.
      */
     while (k > 0)
     {
......