diff: fix "multiple regexp" semantics to find hunk header comment

When multiple regular expressions are concatenated with "\n", they were
traditionally AND'ed together, and only a line that matches _all_ of them
is taken as a match.  This however is unwieldy when multiple regexp
feature is used to specify alternatives.

This fixes the semantics to take the first match.  A nagative pattern, if
matches, makes the line to fail as before.  A match with a positive
pattern will be the final match, and what it captures in $1 is used as the
hunk header comment.

We could write alternatives using "|" in ERE, but the machinery can only
use captured $1 as the hunk header comment (or $0 if there is no match in
$1), so you cannot write:

    "junk ( A | B ) | garbage ( C | D )"

and expect both "junk" and "garbage" to get stripped with the existing
code.  With this fix, you can write it as:

    "junk ( A | B ) \n garbage ( C | D )"

and the way capture works would match the user expectation more
naturally.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Junio C Hamano
2008-09-20 00:52:11 -07:00
parent 1883a0d3b7
commit 3d8dccd74a
2 changed files with 11 additions and 8 deletions

2
diff.c
View File

@ -1414,7 +1414,7 @@ static const struct funcname_pattern_entry builtin_funcname_pattern[] = {
{ "pascal",
"^((procedure|function|constructor|destructor|interface|"
"implementation|initialization|finalization)[ \t]*.*)$"
"|"
"\n"
"^(.*=[ \t]*(class|record).*)$",
REG_EXTENDED },
{ "php", "^[\t ]*((function|class).*)", REG_EXTENDED },