-
Tom Lane authored
I had committer's remorse almost immediately after pushing cb76fbd7e, upon finding that removing capturing subexpressions' subREs from the data structure broke my proposed patch for REG_NOSUB optimization. Revert that data structure change. Instead, address the concern about not changing capturing subREs' endpoints by not changing the endpoints. We don't need to, because the point of that bit was just to ensure that the atom has endpoints distinct from the outer state pair that we're stringing the branch between. We already made suitable states in the parenthesized-subexpression case, so the additional ones were just useless overhead. This seems more understandable than Spencer's original coding, and it ought to be a shade faster too by saving a few state creations and arc changes. (I actually see a couple percent improvement on Jacobson's web corpus, though that's barely above the noise floor so I wouldn't put much stock in that result.) Also, fix the logic added by ea1268f6 to ensure that the subRE recorded in v->subs[subno] is exactly the one with capno == subno. Spencer's original coding recorded the child subRE of the capture node, which is okay so far as having the right endpoint states is concerned, but as of cb76fbd7e the capturing subRE itself always has those endpoints too. I think the inconsistency is confusing for the REG_NOSUB optimization. As before, backpatch to v14. Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com
5227d998