
Under ci/ hierarchy, we run scripts under either "sh" (any Bourne compatible POSIX shell would work) or specifically "bash" (as they require features from bash, e.g., ${parameter/pattern/string} expansion). As we have the CI environment under our control, we can expect that /bin/sh will always be fine to run the scripts that only require a Bourne shell, but we may not know where "bash" is installed depending on the distro used.

So let's make sure we start these scripts with either one of these:

	#!/bin/sh
	#!/usr/bin/env bash

Yes, the latter has to assume that everybody installs "env" at that path and not as /bin/env or /usr/local/bin/env, but this currently is the best we could do.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
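
As an illustration of the kind of bash-only feature the message refers to, here is a minimal sketch of ${parameter/pattern/string} expansion. The function name `swap_prefix` and the sample path are hypothetical and are not taken from any script under ci/.

```bash
#!/usr/bin/env bash

# Hypothetical example, not taken from any script under ci/.
# ${parameter/pattern/string} replaces the first match of pattern in the
# value of parameter. It is a bash extension, so a script that uses it
# cannot assume that a plain POSIX sh will run it.
swap_prefix () {
	local path=$1
	printf '%s\n' "${path/ci/CI}"
}

swap_prefix ci/lib.sh    # prints "CI/lib.sh"
```

A script containing such an expansion would be one that needs the "#!/usr/bin/env bash" line rather than "#!/bin/sh".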
28 lines
1.3 KiB
Bash
Executable File
#!/usr/bin/env bash

# This script verifies that the non-binary files tracked in the Git index do
# not contain any Unicode directional formatting: such formatting could be used
# to deceive reviewers into interpreting code differently from the compiler.
# This is intended to run on an Ubuntu agent in a GitHub workflow.
#
# To allow translated messages to introduce such directional formatting in the
# future, we exclude the `.po` files from this validation.
#
# Neither GNU grep nor `git grep` (not even with `-P`) handles `\u` as a way to
# specify UTF-8.
#
# To work around that, we use `printf` to produce the pattern as a byte
# sequence, and then feed that to `git grep` as a byte sequence (setting
# `LC_CTYPE` to make sure that the arguments are interpreted as intended).
#
# Note: we need to use Bash here because its `printf` interprets `\uNNNN` as
# UTF-8 code points, as desired. Running this script through Ubuntu's `dash`,
# for example, would use a `printf` that does not understand that syntax.

# U+202a..U+202e: LRE, RLE, PDF, LRO and RLO
# U+2066..U+2069: LRI, RLI, FSI and PDI
regex='(\u202a|\u202b|\u202c|\u202d|\u202e|\u2066|\u2067|\u2068|\u2069)'

! LC_CTYPE=C git grep -El "$(LC_CTYPE=C.UTF-8 printf "$regex")" \
	-- ':(exclude,attr:binary)' ':(exclude)*.po'
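
To make the byte-level matching concrete, here is a minimal sketch of the same idea applied to a single string. It deliberately differs from the script: the pattern is spelled with octal escapes (the UTF-8 bytes of the nine code points) instead of `\u`, and it pipes the string into plain `grep` rather than running `git grep` over the index. The names `bidi_pattern` and `has_bidi` are hypothetical, and the sketch assumes a `grep` that compares raw bytes when run with `LC_ALL=C`, as GNU grep does.

```bash
#!/usr/bin/env bash

# Hypothetical helper, not part of the script above: print an extended
# regular expression matching the UTF-8 encoding of U+202A..U+202E
# (e2 80 aa..ae) and U+2066..U+2069 (e2 81 a6..a9).
# Octal escapes are used instead of \u, so the bytes produced do not
# depend on the locale.
bidi_pattern () {
	printf '(\342\200\252|\342\200\253|\342\200\254|\342\200\255|\342\200\256'
	printf '|\342\201\246|\342\201\247|\342\201\250|\342\201\251)'
}

# Hypothetical helper: succeed if the string in $1 contains any of the
# byte sequences above. LC_ALL=C asks grep to treat pattern and input
# as plain bytes.
has_bidi () {
	printf '%s' "$1" | LC_ALL=C grep -Eq "$(bidi_pattern)"
}

# Usage: the sample string embeds U+202E (RLO) as raw bytes.
if has_bidi "$(printf 'safe \342\200\256 hidden')"
then
	echo "directional formatting found"
fi
```

The explicit alternation mirrors the `regex` variable in the script; a byte range inside a bracket expression would be shorter, but its behavior varies more between grep implementations and locales.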