grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data
If we attempt to grep non-ascii log message text with an ascii pattern, we
run into the following issue:
    $ git log --color --author='.var.*Bjar' -1 origin/master | grep ^Author
    grep: (standard input): binary file matches
So, to fix this, teach the grep code to use PCRE2_UTF, as long as the log
output is encoded in UTF-8.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
committed by
Junio C Hamano
parent
6a5c337922
commit
ae39ba431a
6
grep.c
@@ -382,8 +382,10 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 		}
 		options |= PCRE2_CASELESS;
 	}
-	if (!opt->ignore_locale && is_utf8_locale() && has_non_ascii(p->pattern) &&
-	    !(!opt->ignore_case && (p->fixed || p->is_fixed)))
+	if ((!opt->ignore_locale && !has_non_ascii(p->pattern)) ||
+	    (!opt->ignore_locale && is_utf8_locale() &&
+	     has_non_ascii(p->pattern) && !(!opt->ignore_case &&
+	     (p->fixed || p->is_fixed))))
 		options |= (PCRE2_UTF | PCRE2_MATCH_INVALID_UTF);
 
 #ifdef GIT_PCRE2_VERSION_10_36_OR_HIGHER
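The condition changed in the hunk above can be easier to follow as a pair of small predicates. The following is a minimal sketch, not code from grep.c: `struct utf_inputs`, `pattern_has_non_ascii()`, `wants_utf_before()` and `wants_utf_after()` are hypothetical names introduced here, and the high-bit check is only a simplified stand-in for git's `has_non_ascii()`. It shows that the new condition is the old one plus a single extra case: an ASCII-only pattern with `ignore_locale` unset.

```c
/*
 * Hypothetical stand-in for the values the condition reads from
 * struct grep_opt and struct grep_pat; these are not git's real structs.
 */
struct utf_inputs {
	int ignore_locale;   /* models opt->ignore_locale */
	int utf8_locale;     /* models the result of is_utf8_locale() */
	int ignore_case;     /* models opt->ignore_case */
	int fixed;           /* models (p->fixed || p->is_fixed) */
	const char *pattern; /* models p->pattern */
};

/* Simplified stand-in for has_non_ascii(): any byte with the high bit set. */
static int pattern_has_non_ascii(const char *s)
{
	for (; s && *s; s++)
		if ((unsigned char)*s & 0x80)
			return 1;
	return 0;
}

/* The condition as it read before this commit. */
static int wants_utf_before(const struct utf_inputs *in)
{
	return !in->ignore_locale && in->utf8_locale &&
	       pattern_has_non_ascii(in->pattern) &&
	       !(!in->ignore_case && in->fixed);
}

/* The condition after this commit: ASCII-only patterns now qualify too. */
static int wants_utf_after(const struct utf_inputs *in)
{
	return (!in->ignore_locale && !pattern_has_non_ascii(in->pattern)) ||
	       wants_utf_before(in);
}
```

With the pattern from the commit message, `.var.*Bjar`, and `ignore_locale` unset, `wants_utf_before()` returns 0 and `wants_utf_after()` returns 1, which corresponds to the case the commit sets out to fix. Patterns that already contain non-ASCII bytes take the same path as before.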