send-email: improve RFC2047 quote parsing

The RFC2047 unquoting, used to parse email addresses in From and Cc
headers, is broken in several ways:

* It erroneously substitutes ' ' for '_' in *the whole* header, even
  outside the quoted field.  [Noticed by Christoph.]

* It is too liberal in its matching, and happily matches the start of
  one quoted chunk against the end of another, or even just something
  that looks like such an end.  [Noticed by Junio.]

* It fundamentally cannot cope with encodings that are not a superset
  of ASCII, nor several (incompatible) encodings in the same header.

This patch fixes the first two by doing a more careful decoding of the
outer quoting (e.g. "=AB" to represent an octet whose value is 0xAB).
Fixing the fundamental issues is left for a future, more intrusive,
patch.

Noticed-by: Christoph Miebach <christoph.miebach@web.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Committed by: Junio C Hamano
Parent: d0f1ea6003
Commit: b622d4d11d

@@ -862,11 +862,13 @@ $time = time - scalar $#files;
 sub unquote_rfc2047 {
 	local ($_) = @_;
 	my $encoding;
-	if (s/=\?([^?]+)\?q\?(.*)\?=/$2/g) {
+	s{=\?([^?]+)\?q\?(.*?)\?=}{
 		$encoding = $1;
-		s/_/ /g;
-		s/=([0-9A-F]{2})/chr(hex($1))/eg;
-	}
+		my $e = $2;
+		$e =~ s/_/ /g;
+		$e =~ s/=([0-9A-F]{2})/chr(hex($1))/eg;
+		$e;
+	}eg;
 	return wantarray ? ($_, $encoding) : $_;
 }
 
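
The two fixed problems can be reproduced with a small standalone script. This is only a sketch: the names unquote_old and unquote_new, and both sample headers, are made up for illustration and do not appear in the patch. The function bodies are copied from the removed and added lines of the hunk above.

```perl
use strict;
use warnings;

# Pre-patch body (hypothetical name): greedy match, then the '_' and
# '=XX' substitutions are applied to the whole header.
sub unquote_old {
	local ($_) = @_;
	my $encoding;
	if (s/=\?([^?]+)\?q\?(.*)\?=/$2/g) {
		$encoding = $1;
		s/_/ /g;
		s/=([0-9A-F]{2})/chr(hex($1))/eg;
	}
	return wantarray ? ($_, $encoding) : $_;
}

# Post-patch body (hypothetical name): non-greedy match, and only the
# payload of each encoded word is decoded.
sub unquote_new {
	local ($_) = @_;
	my $encoding;
	s{=\?([^?]+)\?q\?(.*?)\?=}{
		$encoding = $1;
		my $e = $2;
		$e =~ s/_/ /g;
		$e =~ s/=([0-9A-F]{2})/chr(hex($1))/eg;
		$e;
	}eg;
	return wantarray ? ($_, $encoding) : $_;
}

# Made-up header with '_' outside the encoded word (first bullet).
my $addr = '=?UTF-8?q?Jane_Q=2E_Doe?= <jane_doe@example.com>';
# Made-up header with two encoded words (second bullet).
my $two = '=?UTF-8?q?A_B?= and =?UTF-8?q?C_D?=';

for my $header ($addr, $two) {
	# scalar() forces the non-wantarray return value.
	print "old: ", scalar(unquote_old($header)), "\n";
	print "new: ", scalar(unquote_new($header)), "\n";
}
```

With these inputs the old code turns the address into "jane doe@example.com" and leaves "?= and =?UTF-8?q?" in the middle of the second header, while the new code keeps "jane_doe@example.com" intact and yields "A B and C D". The third bullet (non-ASCII-superset or mixed encodings) is not addressed by either version.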