Since [1] running "make coccicheck" has resulted in [2] being emitted
to the *.log files for the "spatch" run, and in the case of "make
coccicheck-test" we'd emit these to the user's terminal.
Nothing was broken as a result, but let's refactor the relevant rules
to eliminate the ambiguity between a possible variable and an
identifier.
1. 0e6550a2c6 (cocci: add a index-compatibility.pending.cocci,
2022-11-19)
2. warning: line 257: should active_cache be a metavariable?
warning: line 260: should active_cache_changed be a metavariable?
warning: line 263: should active_cache_tree be a metavariable?
warning: line 271: should active_nr be a metavariable?
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since GNU make 4.4 the semantics of the $(MAKEFLAGS) variable has
changed in a backward-incompatible way, as its "NEWS" file notes:
Previously only simple (one-letter) options were added to the MAKEFLAGS
variable that was visible while parsing makefiles. Now, all options are
available in MAKEFLAGS. If you want to check MAKEFLAGS for a one-letter
option, expanding "$(firstword -$(MAKEFLAGS))" is a reliable way to return
the set of one-letter options which can be examined via findstring, etc.
This upstream change meant that e.g.:
make man
Would become very noisy, because in shared.mak we rely on extracting
"s" from the $(MAKEFLAGS), which now contains long options like
"--jobserver-auth=fifo:<path>", which we'll conflate with the "-s"
option.
So, let's change this idiom we've been carrying since [1], [2] and [3]
as the "NEWS" suggests.
Note that the "-" in "-$(MAKEFLAGS)" is critical here, as the variable
will always contain leading whitespace if there are no short options,
but long options are present. Without it e.g. "make --debug=all" would
yield "--debug=all" as the first word, but with it we'll get "-" as
intended. Then "-s" for "-s", "-Bs" for "-s -B" etc.
1. 0c3b4aac8e (git-gui: Support of "make -s" in: do not output
anything of the build itself, 2007-03-07)
2. b777434383 (Support of "make -s": do not output anything of the
build itself, 2007-03-07)
3. bb2300976b (Documentation/Makefile: make most operations "quiet",
2009-03-27)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the GitHub CI to a newer Ubuntu release.
* jx/ci-ubuntu-fix:
ci: install python on ubuntu
ci: use the same version of p4 on both Linux and macOS
ci: remove the pipe after "p4 -V" to catch errors
github-actions: run gcc-8 on ubuntu-20.04 image
The format of a line in /proc/cpuinfo that describes a CPU on s390x
looked different from everybody else, and the code in chainlint.pl
failed to parse it.
* ah/chainlint-cpuinfo-parse-fix:
chainlint.pl: fix /proc/cpuinfo regexp
Resolve symbolic links when processing the locations of alternate
object stores, since failing to do so can lead to confusing and buggy
behavior.
* gc/resolve-alternate-symlinks:
object-file: use real paths when adding alternates
A handful of leaks in the line-log machinery have been plugged.
* sg/plug-line-log-leaks:
diff.c: use diff_free_queue()
line-log: free the diff queues' arrays when processing merge commits
line-log: free diff queue when processing non-merge commits
Add one more candidate directory that may house httpd modules while
running tests.
* es/locate-httpd-module-location-in-test:
lib-httpd: extend module location auto-detection
"git prune" may try to iterate over .git/objects/pack for trash
files to remove in it, and loudly fail when the directory is
missing, which is not necessary. The command has been taught to
ignore such a failure.
* ew/prune-with-missing-objects-pack:
prune: quiet ENOENT on missing directories
Assorted fixes of parsing end-user input as integers.
* pw/config-int-parse-fixes:
git_parse_signed(): avoid integer overflow
config: require at least one digit when parsing numbers
git_parse_unsigned: reject negative values
`parse_object()` hardening when checking for the existence of a
suspected blob object.
* jk/parse-object-type-mismatch:
parse_object(): simplify blob conditional
parse_object(): check on-disk type of suspected blob
parse_object(): drop extra "has" check before checking object type
A GNU make pattern rule with multiple targets has always meant that
a single invocation of the recipe will build all the targets.
However in older versions of GNU make a recipe that did not really
build all the targets would be tolerated.
Starting with GNU make 4.4 this behavior is deprecated and pattern
rules are expected to generate files to match all the patterns.
If not all targets are created then GNU make will not consider any
target up to date and will re-run the recipe when it is run again.
Modify Documentation/Makefile to split the man page-creating pattern
rule into a separate pattern rule for each pattern.
Reported-by: Alexander Kanavin <alex.kanavin@gmail.com>
Signed-off-by: Paul Smith <psmith@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python is missing from the default ubuntu-22.04 runner image, which
prevents git-p4 from working. To install python on ubuntu, we need
to provide the correct package names:
* On Ubuntu 18.04 (bionic), "/usr/bin/python2" is provided by the
"python" package, and "/usr/bin/python3" is provided by the "python3"
package.
* On Ubuntu 20.04 (focal) and above, "/usr/bin/python2" is provided by
the "python2" package which has a different name from bionic, and
"/usr/bin/python3" is provided by "python3".
Since the "ubuntu-latest" runner image has a higher version, its
safe to use "python2" or "python3" package name.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There would be a segmentation fault when running p4 v16.2 on ubuntu
22.04, which is the latest version of the ubuntu runner image for
GitHub Actions.
Checking each version from [1] shows that p4d version 21.1 and above
work properly on ubuntu 22.04, but version 22.x breaks some p4 test
cases. So p4 version 21.x is exactly the version we can use.
With this update, the versions of p4 for Linux and macOS happen to be
the same. So we can add the version number directly into the "P4WHENCE"
variable, and reuse it in p4 installation for macOS.
By removing the "LINUX_P4_VERSION" variable from "ci/lib.sh", the
comment left above has nothing to do with p4, but still applies to
git-lfs. Since we have a fixed version of git-lfs installed on Linux,
we may have a different version on macOS.
[1]: https://cdist2.perforce.com/perforce/
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When installing p4 as a dependency, we used to pipe the output of "p4 -V"
and "p4d -V" to validate the installation and print condensed version
information. But this would hide potential errors from p4 and leave us
with empty output. E.g.: p4d version 16.2 running on ubuntu 22.04
segfaults, even before it produces any output.
By removing the pipe after "p4 -V" and "p4d -V", we may get a
verbose output, and stop immediately on errors because we have "set
-e" in "ci/lib.sh". Since we won't look at these trace logs unless
something fails, just including the raw output seems most sensible.
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GitHub has started to upgrade its runner image "ubuntu-latest" from
version "ubuntu-20.04" to version "ubuntu-22.04". The "gcc-8" package
cannot be found and installed on the new runner image.
Change some of the runner images from "ubuntu-latest" to "ubuntu-20.04"
in order to install "gcc-8" as a dependency.
The first revision of this patch tried to replace "$runs_on_pool" in
"ci/*.sh" with a new "$runs_on_os" environment variable based on the
"os" field in the matrix strategy. But these "os" fields in matrix
strategies are obsolete legacies from commit [1] and commit [2], and
are no longer useful. So remove these unused "os" fields.
[1]: c08bb26010 (CI: rename the "Linux32" job to lower-case "linux32",
2021-11-23)
[2]: 25715419bf (CI: don't run "make test" twice in one job, 2021-11-23)
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adding an alternate ODB, we check if the alternate has the same
path as the object dir, and if so, we do nothing. However, that
comparison does not resolve symlinks. This makes it possible to add the
object dir as an alternate, which may result in bad behavior. For
example, it can trick "git repack -a -l -d" (possibly run by "git gc")
into thinking that all packs come from an alternate and delete all
objects.
rm -rf test &&
git clone https://github.com/git/git test &&
(
cd test &&
ln -s objects .git/alt-objects &&
# -c repack.updateserverinfo=false silences a warning about not
# being able to update "info/refs", it isn't needed to show the
# bad behavior
GIT_ALTERNATE_OBJECT_DIRECTORIES=".git/alt-objects" git \
-c repack.updateserverinfo=false repack -a -l -d &&
# It's broken!
git status
# Because there are no more objects!
ls .git/objects/pack
)
Fix this by resolving symlinks and relative paths before comparing the
alternate and object dir. This lets us clean up a number of issues noted
in 37a95862c6 (alternates: re-allow relative paths from environment,
2016-11-07):
- Now that we compare the real paths, duplicate detection is no longer
foiled by relative paths.
- Using strbuf_realpath() allows us to "normalize" paths that
strbuf_normalize_path() can't, so we can stop silently ignoring errors
when "normalizing" paths from the environment.
- We now store an absolute path based on getcwd() (the "future
direction" named in 37a95862c6), so chdir()-ing in the process no
longer changes the directory pointed to by the alternate. This is a
change in behavior, but a desirable one.
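As a rough illustration, the realpath-based comparison described above
boils down to something like this (a sketch only; "alternate_path" and
"real_objdir" are made-up names, not taken from the patch):

    struct strbuf pathbuf = STRBUF_INIT;

    strbuf_realpath(&pathbuf, alternate_path, 1);  /* resolve symlinks, "..", "." */
    if (!fspathcmp(pathbuf.buf, real_objdir)) {
        /* the "alternate" is really the object dir itself; ignore it */
        strbuf_release(&pathbuf);
        return;
    }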
Signed-off-by: Glen Choo <chooglen@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 81071626ba (trace2: add global counter mechanism, 2022-10-24)
these tests have been failing when git is compiled with NO_PTHREADS=Y,
which is always the case e.g. if 'uname -s' is "NONSTOP_KERNEL".
Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git receive-pack" used to use all the local refs as the boundary for
checking connectivity of the data "git push" sent, but now it uses
only the refs that it advertised to the pusher. In a repository with
the .hideRefs configuration, this reduces the resources needed to
perform the check.
cf. <221028.86bkpw805n.gmgdl@evledraar.gmail.com>
cf. <xmqqr0yrizqm.fsf@gitster.g>
* ps/receive-use-only-advertised:
receive-pack: only use visible refs for connectivity check
rev-parse: add `--exclude-hidden=` option
revision: add new parameter to exclude hidden refs
revision: introduce struct to handle exclusions
revision: move together exclusion-related functions
refs: get rid of global list of hidden refs
refs: fix memory leak when parsing hideRefs config
Fix an issue where core.fsmonitor on macOS would not notice created
or modified symbolic links.
* sz/macos-fsmonitor-symlinks:
fsmonitor--daemon: on macOS support symlink
A pair of bugfixes to the Documentation/howto/maintain-git.txt guide.
* tb/howto-maintain-git-fixes:
Documentation: build redo-seen.sh from jch..seen
Documentation: build redo-jch.sh from master..jch
Teach chainlint.pl to show corresponding line numbers when printing
the source of a test.
* es/chainlint-lineno:
chainlint: prefix annotated test definition with line numbers
chainlint: latch line numbers at which each token starts and ends
chainlint: sidestep impoverished macOS "terminfo"
Fix a source of flakiness in CI when compiling with SANITIZE=leak.
* ab/t7610-timeout:
t7610: use "file:///dev/null", not "/dev/null", fixes MinGW
t7610: fix flaky timeout issue, don't clone from example.com
'git maintenance register' is taught to write configuration to an
arbitrary path, and 'git for-each-repo' is taught to expand tilde
characters in paths.
* rp/maintenance-qol:
builtin/gc.c: fix use-after-free in maintenance_unregister()
maintenance --unregister: fix uninit'd data use & -Wdeclaration-after-statement
maintenance: add option to register in a specific config
for-each-repo: interpolate repo path arguments
Correct an error where `git rebase` would mistakenly use a branch or
tag named "refs/rewritten/xyz" when missing a rebase label.
* pw/strict-label-lookups:
sequencer: tighten label lookups
sequencer: unify label lookup
Redact headers from cURL's h2h3 module in GIT_CURL_VERBOSE and
others.
* gc/redact-h2h3-headers:
http: redact curl h2h3 headers in info
t: run t5551 tests with both HTTP and HTTP/2
"make coccicheck" is time consuming. It has been made to run more
incrementally.
* ab/coccicheck-incremental:
Makefile: don't create a ".build/.build/" for cocci, fix output
spatchcache: add a ccache-alike for "spatch"
cocci: run against a generated ALL.cocci
cocci rules: remove <id>'s from rules that don't need them
Makefile: copy contrib/coccinelle/*.cocci to build/
cocci: optimistically use COMPUTE_HEADER_DEPENDENCIES
cocci: make "coccicheck" rule incremental
cocci: split off "--all-includes" from SPATCH_FLAGS
cocci: split off include-less "tests" from SPATCH_FLAGS
Makefile: split off SPATCH_BATCH_SIZE comment from "cocci" heading
Makefile: have "coccicheck" re-run if flags change
Makefile: add ability to TAB-complete cocci *.patch rules
cocci rules: remove unused "F" metavariable from pending rule
Makefile + shared.mak: rename and indent $(QUIET_SPATCH_T)
Teach chainlint.pl to annotate the original test definition instead
of the token stream.
* es/chainlint-output:
chainlint: annotate original test definition rather than token stream
chainlint: latch start/end position of each token
chainlint: tighten accuracy when consuming input stream
chainlint: add explanatory comments
'scalar reconfigure -a' is taught to automatically remove
scalar.repo entries that no longer exist.
* js/remove-stale-scalar-repos:
tests(scalar): tighten the stale `scalar.repo` test some
scalar reconfigure -a: remove stale `scalar.repo` entries
Fix a regression in the bisect-helper which mistakenly treats
arguments to the command given to 'git bisect run' as arguments to
the helper.
* dd/bisect-helper-subcommand:
bisect--helper: parse subcommand with OPT_SUBCOMMAND
bisect--helper: move all subcommands into their own functions
bisect--helper: remove unused options
Preparation to remove git-submodule.sh and replace it with a builtin.
* ab/submodule-helper-prep-only:
submodule--helper: use OPT_SUBCOMMAND() API
submodule--helper: drop "update --prefix <pfx>" for "-C <pfx> update"
submodule--helper: remove --prefix from "absorbgitdirs"
submodule API & "absorbgitdirs": remove "----recursive" option
submodule.c: refactor recursive block out of absorb function
submodule tests: test for a "foreach" blind-spot
submodule--helper: fix a memory leak in "status"
submodule tests: add tests for top-level flag output
submodule--helper: move "config" to a test-tool
29fb2ec3 (chainlint.pl: validate test scripts in parallel,
2022-09-01) introduced a function that gets the number of cores from
/proc/cpuinfo on some systems, notably Linux.
The regexp it uses (^processor\s*:) fails to match the desired lines in
the s390x architecture, where they look like this:
processor 0: version = FF, identification = 148F67, machine = 2964
As a result, on s390x that function returns 0 as the number of cores,
and the chainlint.pl script exits without doing anything.
Signed-off-by: Andreas Hasenack <andreas.hasenack@canonical.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 8db2dad7a0 (parse_object(): check on-disk type of suspected blob,
2022-11-17) simplified the conditional for checking if we might have a
blob. But we can simplify it further. In:
!obj || (obj && obj->type == OBJ_BLOB)
the short-circuit "OR" means "obj" will always be true on the right-hand
side. The compiler almost certainly optimized that out anyway, but
dropping it makes the conditional easier to understand for humans.
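For context, the resulting check has roughly this shape (a sketch, not
the verbatim patch; "r" and "oid" stand in for the repository and
object id the function already has at hand):

    if ((!obj || obj->type == OBJ_BLOB) &&
        oid_object_info(r, oid, NULL) == OBJ_BLOB) {
        /* take the streaming path for the blob */
    }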
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although it is possible to manually set LIB_HTTPD_PATH and
LIB_HTTPD_MODULE_PATH to point at the location of `httpd` and its
modules, doing so is cumbersome and easily forgotten. To address this,
0d344738dc (t/lib-http.sh: Restructure finding of default httpd
location, 2010-01-02) enhanced lib-httpd.sh to automatically detect the
location of `httpd` and its modules in order to facilitate out-of-the-
box testing on a wider range of platforms. Follow that lead by further
enhancing it to automatically detect the `httpd` modules on Void Linux,
as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test case "push with config push.useBitmap" of t5516 was introduced
in commit 82f67ee13f (send-pack.c: add config push.useBitmaps,
2022-06-17). It won't work in verbose mode, e.g.:
$ sh t5516-fetch-push.sh --run='1,115' -v
This is because "git-push" will run in a tty in this case, and the
subcommand "git pack-objects" will contain an argument "--progress"
instead of "-q". Adding a specific option "--quiet" to "git push" will
get a stable result for t5516.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
filter_combine__init() allocates a struct combine_filter_data object and
assigns it to the filter_data member of struct filter_options. Release
it in the complementing filter_combine__free().
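A minimal sketch of such a complementing free, assuming the ownership
described above (the actual patch may release more than shown here):

    static void filter_combine__free(void *filter_data)
    {
        struct combine_filter_data *d = filter_data;

        /* ... release anything "d" owns (its sub-filters) ... */
        free(d);
    }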
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
$GIT_DIR/objects/pack may be removed to save inodes in shared
repositories. Quiet down prune in cases where either
$GIT_DIR/objects or $GIT_DIR/objects/pack is non-existent,
but emit the system error in other cases to help users diagnose
permissions problems or resource constraints.
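The pattern being applied is the usual "ignore ENOENT, report
everything else" idiom; a self-contained illustration (not the actual
prune code):

    #include <dirent.h>
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>

    static void prune_dir_quietly(const char *path)
    {
        DIR *dir = opendir(path);

        if (!dir) {
            if (errno != ENOENT) /* a missing directory is fine; other errors are not */
                fprintf(stderr, "unable to open %s: %s\n", path, strerror(errno));
            return;
        }
        /* ... iterate with readdir(dir) and remove trash files ... */
        closedir(dir);
    }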
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply "index-compatibility.pending.cocci" rule to "builtin/*", but
exclude those where we conflict with in-flight changes.
As a result some of them end up using only "the_index", so let's have
them use the more narrow "USE_THE_INDEX_VARIABLE" rather than
"USE_THE_INDEX_COMPATIBILITY_MACROS".
Manual changes not made by coccinelle, that were squashed in:
* Whitespace-wrap argument lists for repo_hold_locked_index(),
repo_read_index_preload() and repo_refresh_and_write_index(), in cases
where the line became too long after the transformation.
* Change "refresh_cache()" to "refresh_index()" in a comment in
"builtin/update-index.c".
* For those whose call was followed by perror("<macro-name>"), change
it to perror("<function-name>"), referring to the new function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a preceding commit we fully applied the
"index-compatibility.pending.cocci" rule to "t/helper/*".
Let's now stop defining "USE_THE_INDEX_COMPATIBILITY_MACROS" in
test-tool.h itself, and instead define "USE_THE_INDEX_VARIABLE" in
the individual test helpers that need it. This mirrors how we do the
same thing in the "builtin/" directory.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split up the "USE_THE_INDEX_COMPATIBILITY_MACROS" into that setting
and a more narrow "USE_THE_INDEX_VARIABLE". In the case of these
built-ins we only need "the_index" variable, but not the compatibility
wrapper for functions we're not using.
Let's then have some users of "USE_THE_INDEX_COMPATIBILITY_MACROS" use
this more narrow and descriptive define.
For context: The USE_THE_INDEX_COMPATIBILITY_MACROS macro was added to
test-tool.h in f8adbec9fe (cache.h: flip
NO_THE_INDEX_COMPATIBILITY_MACROS switch, 2019-01-24).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the "index-compatibility.pending.cocci" rule to the "t/helper/*"
directory, a subsequent commit will extend cache.h to further narrow
down the use of "USE_THE_INDEX_COMPATIBILITY_MACROS" in this area.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mostly apply the part of "index-compatibility.pending.cocci" that
renames the global variables like "active_nr", which are a shorthand
to referencing (in that case) a struct member as "the_index.cache_nr".
In doing so move more of "index-compatibility.pending.cocci" to
"index-compatibility.cocci".
In the case of "active_nr" we'd have a textual conflict with
"ab/various-leak-fixes" in "next"[1]. Let's exclude that specific case
while moving the rule over from "pending".
1. 407b94280f8 (commit: discard partial cache before (re-)reading it,
2022-11-08)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply a selection of rules in "index-compatibility.pending.cocci"
tree-wide, and in doing so migrate them to
"index-compatibility.cocci".
As in preceding commits the only manual changes here are the macro
removals in "cache.h", and the update to the '*.cocci" rules. The rest
of the C code changes are the result of applying those updated rules.
Move rules for some rarely used cache compatibility macros from
"index-compatibility.pending.cocci" to "index-compatibility.cocci" and
apply them.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a coccinelle rule which covers the rest of the macros guarded by
"USE_THE_INDEX_COMPATIBILITY_MACROS" in cache.h. If the result of this
were applied, that guarded section could be reduced down to just:
#ifdef USE_THE_INDEX_COMPATIBILITY_MACROS
extern struct index_state the_index;
#endif
But that patch is just under 2000 lines, so let's first add this as a
"pending", and then incrementally pick changes from it in subsequent
commits. In doing that we'll migrate rules from this
"index-compatibility.pending.cocci" to the "index-compatibility.cocci"
created in a preceding commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The discard_index() function has not returned non-zero since
7a51ed66f6 (Make on-disk index representation separate from in-core
one, 2008-01-14), but we've had various code in-tree still acting as
though that might be the case.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 4aab5b46f4 (Make read-cache.c "the_index" free., 2007-04-01)
we've been undergoing a slow migration away from these macros, but
haven't made much progress since f8adbec9fe (cache.h: flip
NO_THE_INDEX_COMPATIBILITY_MACROS switch, 2019-01-24).
Let's move forward a bit by changing the users of those macros that
are rare enough that we can convert them in one go, and then remove
the compatibility shim.
The only manual change to the C code here is to "cache.h", the rest is
all the result of applying the new "index-compatibility.cocci".
Even though it's a one-off, let's keep the coccinelle rules for
now. We'll extend them in subsequent commits, and this will help
anything that's in-flight or out-of-tree to migrate.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adding "USE_THE_INDEX_COMPATIBILITY_MACROS" to these two appears to
have been unnecessary from the start, as going back and compiling
f8adbec9fe (cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch,
2019-01-24) without that addition works.
Let's not have these ask for the compatibility macros from cache.h
that they don't need.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "active_alloc" macro added in 228e94f935 (Move index-related
variables into a structure., 2007-04-01) has not been used since
4aab5b46f4 (Make read-cache.c "the_index" free., 2007-04-01). Let's
remove it.
The rest of these are likewise unused, so let's not keep them
around. E.g. 12cd0bf9b0 (dir: stop using the index compatibility
macros, 2017-05-05) is the last use of "cache_dir_exists".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid calling 'cache_tree_update()' when doing so would be redundant.
* vd/skip-cache-tree-update:
rebase: use 'skip_cache_tree_update' option
read-tree: use 'skip_cache_tree_update' option
reset: use 'skip_cache_tree_update' option
unpack-trees: add 'skip_cache_tree_update' option
cache-tree: add perf test comparing update and prime
Update the credential-cache documentation to provide a more realistic
example.
* mh/increase-credential-cache-timeout:
Documentation: increase example cache timeout to 1 hour
`git rebase --update-refs` would delete references when all `update-ref`
commands in the sequencer were removed, which has been corrected.
* vd/update-refs-delete:
rebase --update-refs: avoid unintended ref deletion
"git repack" learns to send cruft objects out of the way into
packfiles outside the repository.
* tb/repack-expire-to:
builtin/repack.c: implement `--expire-to` for storing pruned objects
builtin/repack.c: write cruft packs to arbitrary locations
builtin/repack.c: pass "cruft_expiration" to `write_cruft_pack`
builtin/repack.c: pass "out" to `prepare_pack_objects`
Makefile comments updates and reordering to clarify knobs used to
choose SHA implementations.
* ab/sha-makefile-doc:
Makefile: discuss SHAttered in *_SHA{1,256} discussion
Makefile: document default SHA-1 backend on OSX
Makefile & test-tool: replace "DC_SHA1" variable with a "define"
Makefile: document SHA-1 and SHA-256 default and selection order
Makefile: document default SHA-256 backend
Makefile: rephrase the discussion of *_SHA1 knobs
Makefile: create and use sections for "define" flag listing
Makefile: correct DC_SHA1 documentation
INSTALL: remove discussion of SHA-1 backends
Makefile: always (re)set DC_SHA1 on fallback
Various test updates.
* ab/misc-hook-submodule-run-command:
run-command tests: test stdout of run_command_parallel()
submodule tests: reset "trace.out" between "grep" invocations
hook tests: fix redirection logic error in 96e7225b31
On my use case involving 771 islands of Linux on kernel.org,
this reduces memory usage by around 25MB. The bulk of that
comes from free_remote_islands, since free_config_regexes only
saves around 40k.
This memory is saved early in the memory-intensive pack process,
making it available for the remainder of the long process.
Signed-off-by: Eric Wong <e@80x24.org>
Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In parse_object(), we try to handle blobs by streaming rather than
loading them entirely into memory. The most common case here will be
that we haven't seen the object yet and check oid_object_info(), which
tells us we have a blob.
But we trigger this code on one other case: when we have an in-memory
object struct with type OBJ_BLOB (and without its "parsed" flag set,
since otherwise we'd return early from the function). This indicates
that some other part of the code suspected we have a blob (e.g., it was
mentioned by a tree or tag) but we haven't yet looked at the on-disk
copy.
In this case before hitting the streaming path, we check if we have the
object on-disk at all. This is mostly pointless extra work, as the
streaming path would complain if it couldn't open the object (albeit
with the message "hash mismatch", which is a little misleading).
But it's also insufficient to catch all problems. The streaming code
will only tell us "yes, the on-disk object matches the oid". But it
doesn't actually confirm that what we found was indeed a blob, and
neither does repo_has_object_file().
One way to improve this would be to teach stream_object_signature() to
check the type (either by returning it to us to check, or taking an
"expected" type). But there's an even simpler fix here: if we suspect
the object is a blob, just call oid_object_info() to confirm that we
have it on-disk, and that it really is a blob.
This is slightly less efficient than teaching stream_object_signature()
to do it (since it has to open the object already). But this case very
rarely comes up. In practice, we usually don't have any clue what the
type is, in which case we already call oid_object_info(). This
"suspected" case happens only when some other code created an object
struct but didn't actually parse the blob, which is actually tricky to
trigger at all (see the discussion of the test below).
I reworked the conditional a bit so that instead of:
if ((suspected_blob && oid_object_info() == OBJ_BLOB) ||
    (no_clue && oid_object_info() == OBJ_BLOB))
we have the simpler:
if ((suspected_blob || no_clue) && oid_object_info() == OBJ_BLOB)
This is shorter, but also reflects what we really want to say, which is
"have we ruled out this being a blob; if not, check it on-disk".
In either case, if oid_object_info() fails to tell us it's a blob, we'll
skip the streaming code path and call repo_read_object_file(), just as
before. And if we really do have a mismatch with the existing object
struct, we'll eventually call lookup_commit(), etc, via
parse_object_buffer(), which will complain that it doesn't match our
existing obj->type.
So this fixes one of the lingering expect_failure cases from 0616617c7e
(t: introduce tests for unexpected object types, 2019-04-09). That test
works by peeling a tag that claims to point to a blob (triggering us to
create the struct), but really points to something else, which we later
discover when we call parse_object() as part of the actual traversal).
Prior to this commit, we'd quietly check the sha1 and mark the blob as
"parsed". Now we correctly complain about the mismatch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When parsing an object of unknown type, we check to see if it's a blob,
so we can use our streaming code path. This uses oid_object_info() to
check the type, but before doing so we call repo_has_object_file(). This
latter is pointless, as oid_object_info() will already fail if the
object is missing. Checking it ahead of time just complicates the code
and is a waste of resources (albeit small).
Let's drop the redundant check.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.
Marking all preexisting references as uninteresting can be a problem
when it comes to performance. Git forges tend to do internal bookkeeping
to keep alive sets of objects for internal use or make them easy to find
via certain references. These references are typically hidden away from
the user so that they are neither advertised nor writeable. At GitLab,
we have one particular repository that contains a total of 7 million
references, of which 6.8 million are indeed internal references. With
the current connectivity check we are forced to load all these
references in order to mark them as uninteresting, and this alone takes
around 15 seconds to compute.
We can optimize this by only taking into account the set of visible refs
when marking objects as uninteresting. This means that we may now walk
more objects until we hit any object that is marked as uninteresting.
But it is rather unlikely that clients send objects that make large
parts of objects reachable that have previously only ever been hidden,
whereas the common case is to push incremental changes that build on top
of the visible object graph.
This provides a huge boost to performance in the mentioned repository,
where the vast majority of its refs are hidden. Pushing a new commit
into this repo with `transfer.hideRefs` set up to hide 6.8 million of
its 7 million refs, as it is configured in Gitaly, leads to a 4.5-fold
speedup:
Benchmark 1: main
Time (mean ± σ): 30.977 s ± 0.157 s [User: 30.226 s, System: 1.083 s]
Range (min … max): 30.796 s … 31.071 s 3 runs
Benchmark 2: pks-connectivity-check-hide-refs
Time (mean ± σ): 6.799 s ± 0.063 s [User: 6.803 s, System: 0.354 s]
Range (min … max): 6.729 s … 6.850 s 3 runs
Summary
'pks-connectivity-check-hide-refs' ran
4.56 ± 0.05 times faster than 'main'
As we mostly go through the same codepaths as before, even when there
are no hidden refs at all, there is no change in performance when no
refs are hidden:
Benchmark 1: main
Time (mean ± σ): 48.188 s ± 0.432 s [User: 49.326 s, System: 5.009 s]
Range (min … max): 47.706 s … 48.539 s 3 runs
Benchmark 2: pks-connectivity-check-hide-refs
Time (mean ± σ): 48.027 s ± 0.500 s [User: 48.934 s, System: 5.025 s]
Range (min … max): 47.504 s … 48.500 s 3 runs
Summary
'pks-connectivity-check-hide-refs' ran
1.00 ± 0.01 times faster than 'main'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add a new `--exclude-hidden=` option that is similar to the one we just
added to git-rev-list(1). Given a section name `uploadpack` or `receive`
as argument, it causes us to exclude all references that would be hidden
by the respective `$section.hideRefs` configuration.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Users can optionally hide refs from remote users in git-upload-pack(1),
git-receive-pack(1) and others via the `transfer.hideRefs`
configuration, but there is no easy way to obtain the list of all
visible or hidden refs right now. We will need exactly that, though,
for a performance improvement in our connectivity check.
Add a new option `--exclude-hidden=` that excludes any hidden refs from
the next pseudo-ref like `--all` or `--branches`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The functions that handle exclusion of refs work on a single string
list. We're about to add a second mechanism for excluding refs though,
and it makes sense to reuse much of the same architecture for both kinds
of exclusion.
Introduce a new `struct ref_exclusions` that encapsulates all the logic
related to excluding refs and move the `struct string_list` that holds
all wildmatch patterns of excluded refs into it. Rename functions that
operate on this struct to match its name.
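A sketch of the shape of the new struct (the field name here is
illustrative; see the patch for the exact definition):

    struct ref_exclusions {
        /*
         * Wildmatch patterns of refs to exclude; previously this
         * string list was passed around on its own.
         */
        struct string_list excluded_refs;
    };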
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Move together the definitions of functions that handle exclusions of
refs so that related functionality sits in a single place, only.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We're about to add a new argument to git-rev-list(1) that allows it to
add all references that are visible when taking `transfer.hideRefs` et
al into account. This will require us to potentially parse multiple sets
of hidden refs, which is not easily possible right now as there is only
a single, global instance of the list of parsed hidden refs.
Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
take the list of hidden references as input and adjust callers to keep a
local list, instead. This allows us to easily use multiple hidden-ref
lists. Furthermore, it allows us to properly free this list before we
exit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When parsing the hideRefs configuration, we first duplicate the config
value so that we can modify it. We then subsequently append it to the
`hide_refs` string list, which is initialized with `strdup_strings`
enabled. As a consequence we again reallocate the string, but never
free the first duplicate and thus have a memory leak.
While we never clean up the static `hide_refs` variable anyway, this is
no excuse to make the leak worse by leaking every value twice. We are
also about to change the way this variable will be handled so that we do
indeed start to clean it up. So let's fix the memory leak by using the
`string_list_append_nodup()` so that we pass ownership of the allocated
string to `hide_refs`.
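The ownership hand-off amounts to something like this (a sketch; the
parsing and trimming of the value is elided):

    char *ref = xstrdup(value);

    /* ... normalize "ref" in place ... */

    /*
     * hide_refs has strdup_strings enabled, so a plain string_list_append()
     * would copy "ref" a second time; _nodup() hands over our copy instead.
     */
    string_list_append_nodup(&hide_refs, ref);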
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When `git notes` prepares the template it adds an empty newline between
the comment header and the content:
>
> #
> # Write/edit the notes for the following object:
>
> # commit 0f3c55d4c2b7864bffb2d92278eff08d0b2e083f
> # etc
This is wrong structurally because that newline is part of the comment,
too, and thus should be commented. Also, it throws off some positioning
strategies of editors and plugins, and it differs from how we do commit
templates.
Change this to follow the standard set by `git commit`:
>
> #
> # Write/edit the notes for the following object:
> #
> # commit 0f3c55d4c2b7864bffb2d92278eff08d0b2e083f
>
Tests pass unchanged after this code change.
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
On MinGW the "/dev/null" is translated to "nul" on command-lines, even
though as in this case it'll never end up referring to an actual file.
So on Windows the fix for the previous "example.com" timeout issue in
8354cf752e (t7610: fix flaky timeout issue, don't clone from
example.com, 2022-11-05) would yield:
fatal: repo URL: 'nul' must be absolute or begin with ./|../
Let's evade this yet again by prefixing this with "file://", which
makes this pass in the Windows CI.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
While trying to fix a move based on an uninitialized value (along with a
declaration after the first statement), be0fd57228
(maintenance --unregister: fix uninit'd data use &
-Wdeclaration-after-statement, 2022-11-15) unintentionally introduced a
use-after-free.
The problem arises when `maintenance_unregister()` sees a non-NULL
`config_file` string and thus tries to call
git_configset_get_value_multi() to lookup the corresponding values.
We store the result off, and then call git_configset_clear(), which
frees the pointer that we just stored. We then try to read that
now-freed pointer a few lines below, and there we have our
use-after-free:
$ ./t7900-maintenance.sh -vxi --run=23 --valgrind
[...]
+ git maintenance unregister --config-file ./other
==3048727== Invalid read of size 8
==3048727== at 0x1869CA: maintenance_unregister (gc.c:1590)
==3048727== by 0x188F42: cmd_maintenance (gc.c:2651)
==3048727== by 0x128C62: run_builtin (git.c:466)
==3048727== by 0x12907E: handle_builtin (git.c:721)
==3048727== by 0x1292EC: run_argv (git.c:788)
==3048727== by 0x12988E: cmd_main (git.c:926)
==3048727== by 0x21ED39: main (common-main.c:57)
==3048727== Address 0x4b38bc8 is 24 bytes inside a block of size 64 free'd
==3048727== at 0x484617B: free (vg_replace_malloc.c:872)
==3048727== by 0x2D207E: free_individual_entries (hashmap.c:188)
==3048727== by 0x2D2153: hashmap_clear_ (hashmap.c:207)
==3048727== by 0x270B5C: git_configset_clear (config.c:2375)
==3048727== by 0x1869AC: maintenance_unregister (gc.c:1585)
==3048727== by 0x188F42: cmd_maintenance (gc.c:2651)
==3048727== by 0x128C62: run_builtin (git.c:466)
==3048727== by 0x12907E: handle_builtin (git.c:721)
==3048727== by 0x1292EC: run_argv (git.c:788)
==3048727== by 0x12988E: cmd_main (git.c:926)
==3048727== by 0x21ED39: main (common-main.c:57)
[...]
Resolve this via a partial-revert of be0fd57228. The config_set struct
now gets a zero initialization, which makes free()-ing it a noop even
without calling git_configset_init(). When we do initialize it to a
non-zero value, it is only free()'d after our last read of `list`.
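Schematically, the corrected ordering looks something like this (a
sketch using the configset API from config.h, not the verbatim patch;
"key" stands in for the maintenance config key):

    struct config_set cs = { 0 };  /* zero-init: git_configset_clear() below is a noop */
    const struct string_list *list;

    if (config_file) {
        git_configset_init(&cs);
        git_configset_add_file(&cs, config_file);
        list = git_configset_get_value_multi(&cs, key);
    } else {
        list = git_config_get_value_multi(key);
    }

    /* ... inspect "list" ... */

    git_configset_clear(&cs);  /* only after the last read of "list" */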
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since (maintenance: add option to register in a specific config,
2022-11-09) we've been unable to build with "DEVELOPER=1" without
"DEVOPTS=no-error", as the added code triggers a
"-Wdeclaration-after-statement" warning.
And worse than that, the data handed to git_configset_clear() is
uninitialized, as can be spotted with e.g.:
./t7900-maintenance.sh -vixd --run=23 --valgrind
[...]
+ git maintenance unregister --force
Conditional jump or move depends on uninitialised value(s)
at 0x6B5F1E: git_configset_clear (config.c:2367)
by 0x4BA64E: maintenance_unregister (gc.c:1619)
by 0x4BD278: cmd_maintenance (gc.c:2650)
by 0x409905: run_builtin (git.c:466)
by 0x40A21C: handle_builtin (git.c:721)
by 0x40A58E: run_argv (git.c:788)
by 0x40AF68: cmd_main (git.c:926)
by 0x5D39FE: main (common-main.c:57)
Uninitialised value was created by a stack allocation
at 0x4BA22C: maintenance_unregister (gc.c:1557)
Let's fix both of these issues, and also move the scope of the
variable to the "if" statement it's used in, to make it obvious where
it's used.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
maintenance register currently records the maintenance repo exclusively
within the user's global configuration, but other configuration files
may be relevant when running maintenance if they are included from the
global config. This option allows the user to choose where maintenance
repos are recorded.
Signed-off-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
This is a quality of life change for git-maintenance, so repos can be
recorded with the tilde syntax. The register subcommand will not record
repos in this format by default.
Signed-off-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add trace2 counters to the region to clear skip worktree bits in a
sparse checkout.
* al/trace2-clearing-skip-worktree:
index: raise a bug if the index is materialised more than once
index: add trace2 region for clear skip worktree
With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
"Authorization" and "Cookie" get redacted. However, since [1], curl's
h2h3 module (invoked when using HTTP/2) also prints headers in its
"info", which don't get redacted. For example,
echo 'github.com TRUE / FALSE 1698960413304 o foo=bar' >cookiefile &&
GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
-c 'http.cookiefile=cookiefile' \
-c 'http.version=' \
ls-remote https://github.com/git/git refs/heads/main 2>output &&
grep 'cookie' output
produces output like:
23:04:16.920495 http.c:678 == Info: h2h3 [cookie: o=foo=bar]
23:04:16.920562 http.c:637 => Send header: cookie: o=<redacted>
Teach http.c to check for h2h3 headers in info and redact them using the
existing header redaction logic. This fixes the broken redaction logic
that we noted in the previous commit, so mark the redaction tests as
passing under HTTP2.
[1] f8c3724aa9
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We have occasionally seen bugs that affect Git running only against an
HTTP/2 web server, not an HTTP one. For instance, b66c77a64e (http:
match headers case-insensitively when redacting, 2021-09-22). But since
we have no test coverage using HTTP/2, we only uncover these bugs in the
wild.
That commit gives a recipe for converting our Apache setup to support
HTTP/2, but:
- it's not necessarily portable
- we don't want to just test HTTP/2; we really want to do a variety of
basic tests for _both_ protocols
This patch handles both problems by running a duplicate of t5551
(labeled as t5559 here) with an alternate-universe setup that enables
HTTP/2. So we'll continue to run t5551 as before, but run the same
battery of tests again with HTTP/2. If HTTP/2 isn't supported on a given
platform, then t5559 should bail during the webserver setup, and
gracefully skip all tests (unless GIT_TEST_HTTPD has been changed from
"auto" to "yes", where the point is to complain when webserver setup
fails).
In theory other http-related test scripts could benefit from the same
duplication, but doing t5551 should give us a reasonable check of basic
functionality, and would have caught both bugs we've seen in the wild
with HTTP/2.
A few notes on the implementation:
- a script enables the server side config by calling enable_http2
before starting the webserver. This avoids even trying to load any
HTTP/2 config for t5551 (which is what lets it keep working with
regular HTTP even on systems that don't support it). This also sets
a prereq which can be used by individual tests.
- As discussed in b66c77a64e, the http2 module isn't compatible with
the "prefork" mpm, so we need to pick something else. I chose
"event" here, which works on my Debian system, but it's possible
there are platforms which would prefer something else. We can adjust
that later if somebody finds such a platform.
- The test "large fetch-pack requests can be sent using chunked
encoding" makes sure we use a chunked transfer-encoding by looking
for that header in the trace. But since HTTP/2 has its own streaming
mechanisms, we won't find such a header. We could skip the test
entirely by marking it with !HTTP2. But there's some value in making
sure that the fetch itself succeeded. So instead, we'll confirm that
either we're using HTTP2 _or_ we saw the expected chunked header.
- the redaction tests fail under HTTP/2 with recent versions of curl.
This is a bug! I've marked them with !HTTP2 here to skip them under
t5559 for the moment. Using test_expect_failure would be more
appropriate, but would require a bunch of boilerplate. Since we'll
be fixing them momentarily, let's just skip them for now to keep the
test suite bisectable, and we can re-enable them in the commit that
fixes the bug.
- one alternative layout would be to push most of t5551 into a
lib-t5551.sh script, then source it from both t5551 and t5559.
Keeping t5551 intact seemed a little simpler, as its one less level
of indirection for people fixing bugs/regressions in the non-HTTP/2
tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Git learned pushing submodules without pushing the superproject by
the user specifying --recurse-submodules=only through 6c656c3fe4
("submodules: add RECURSE_SUBMODULES_ONLY value", 2016-12-20) and
225e8bf778 ("push: add option to push only submodules", 2016-12-20).
For users who use this feature regularly, it is desirable to have an
equivalent configuration.
It turns out that such a configuration (push.recurseSubmodules=only) is
already supported, even though it is neither documented nor mentioned
in the commit messages, due to the way the --recurse-submodules=only
feature was implemented (a function used to parse --recurse-submodules
was updated to support "only", but that same function is used to parse
push.recurseSubmodules too). What is left is to document it and test it,
which is what this commit does.
There is a possible point of confusion when recursing into a submodule
that itself has the push.recurseSubmodules=only configuration, because
if a repository has only its submodules pushed and not itself, its
superproject can never be pushed. Therefore, treat such configurations
as being "on-demand", and print a warning message.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
It was previously unclear how unrecognised attributes are handled.
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
As pointed out by Stolee, the previous incarnation of this test case was
not stringent enough: we want to verify that _only_ the stale entries
are removed (previously, the test case would have succeeded even if all
entries had been removed).
Let's rectify this and verify that the other entries are left intact.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Remotes are considered "promisor" if extensions.partialClone and some
other configuration variables are set. The casing for this in
Documentation/technical/repository-version.txt is not proper and may
cause confusion. This change corrects this casing.
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix a couple of issues in the recently merged 0f3c55d4c2b (Merge
branch 'ab/coccicheck-incremental' into next, 2022-11-08):
In copying over the "contrib/coccinelle/" rules to
".build/contrib/coccinelle/" we inadvertently ended up with a
".build/.build/contrib/coccinelle/" as well. We'd generate the
per-file patches in the former, and keep the rule and overall result
in the latter. E.g. running:
make contrib/coccinelle/free.cocci.patch COCCI_SOURCES="attr.c grep.c"
Would, per "tree -a .build" yield the following result:
.build
├── .build
│ └── contrib
│ └── coccinelle
│ └── free.cocci.patch
│ ├── attr.c
│ ├── attr.c.log
│ ├── grep.c
│ └── grep.c.log
└── contrib
└── coccinelle
├── FOUND_H_SOURCES
├── free.cocci
└── free.cocci.patch
Now we'll instead generate all of our files in
".build/contrib/coccinelle/". Fixing this required renaming the
directory where we keep our per-file patches, as we'd otherwise
conflict with the result.
Now the per-file patch directory is named e.g. "free.cocci.d". And the
end result will now be:
.build
└── contrib
└── coccinelle
├── FOUND_H_SOURCES
├── free.cocci
├── free.cocci.d
│ ├── attr.c.patch
│ ├── attr.c.patch.log
│ ├── grep.c.patch
│ └── grep.c.patch.log
└── free.cocci.patch
The per-file patches now have a ".patch" file suffix, which fixes
another issue reported against 0f3c55d4c2b: The summary output was
confusing. Before for the "make" command above we'd emit:
[...]
MKDIR -p .build/contrib/coccinelle
CP contrib/coccinelle/free.cocci .build/contrib/coccinelle/free.cocci
GEN .build/contrib/coccinelle/FOUND_H_SOURCES
MKDIR -p .build/.build/contrib/coccinelle/free.cocci.patch
SPATCH .build/.build/contrib/coccinelle/free.cocci.patch/grep.c
SPATCH .build/.build/contrib/coccinelle/free.cocci.patch/attr.c
SPATCH CAT $^ >.build/contrib/coccinelle/free.cocci.patch
CP .build/contrib/coccinelle/free.cocci.patch contrib/coccinelle/free.cocci.patch
But now we'll instead emit (identical output at the start omitted):
[...]
MKDIR -p .build/contrib/coccinelle/free.cocci.d
SPATCH grep.c >.build/contrib/coccinelle/free.cocci.d/grep.c.patch
SPATCH attr.c >.build/contrib/coccinelle/free.cocci.d/attr.c.patch
SPATCH CAT .build/contrib/coccinelle/free.cocci.d/**.patch >.build/contrib/coccinelle/free.cocci.patch
CP .build/contrib/coccinelle/free.cocci.patch contrib/coccinelle/free.cocci.patch
I.e. we have an "SPATCH" line that makes it clear that we're running
against the "{attr,grep}.c" file. The "SPATCH CAT" is then altered to
correspond to it, showing that we're concatenating the
"free.cocci.d/**.patch" files into one generated "free.cocci.patch" at
the end.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
As it is, we're parsing subcommands with OPT_CMDMODE, which will
continue to parse more options even after the command has been found.
When we're running "git bisect run" with a command that expects a
"--log" or "--no-log" argument, or one of those "--bisect-..."
arguments, bisect--helper may mistakenly treat those options as
bisect--helper's own options.
We may fix those problems by passing "--" when calling from
git-bisect.sh, and skip that "--" in bisect--helper. However, it may
interfere with user's "--".
Let's parse subcommand with OPT_SUBCOMMAND since that API was born for
this specific use-case.
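Schematically, OPT_SUBCOMMAND-based parsing looks like this (a sketch
with made-up callback names, not the actual bisect--helper code):

    parse_opt_subcommand_fn *fn = NULL;
    struct option options[] = {
        OPT_SUBCOMMAND("start", &fn, cmd_bisect__start),
        OPT_SUBCOMMAND("run", &fn, cmd_bisect__run),
        /* ... one entry per subcommand ... */
        OPT_END()
    };

    argc = parse_options(argc, argv, prefix, options, usage, 0);
    /*
     * Unlike OPT_CMDMODE, option parsing stops once the subcommand is
     * found, so "git bisect run <cmd> --log" is passed through to "fn".
     */
    return fn(argc, argv, prefix);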
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In a later change, we will use OPT_SUBCOMMAND to parse sub-commands to
avoid consuming non-option opts.
Since OPT_SUBCOMMAND needs a function pointer to operate,
let's move it now.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
'git-bisect.sh' used to have a 'bisect_next_check' to check if we have
both good/bad, old/new terms set or not. In commit 129a6cf344
(bisect--helper: `bisect_next_check` shell function in C, 2019-01-02),
a subcommand for bisect--helper was introduced to port the check to C.
Since d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13), all users of 'bisect_next_check' were
re-implemented in C; this subcommand was no longer used, but we forgot
to remove '--bisect-next-check'.
'git-bisect.sh' also used to have a 'bisect_write' function, whose
third positional parameter was a "nolog" flag. This flag was only used
when 'bisect_start' invoked 'bisect_write' to write the starting good
and bad revisions. Then 0f30233a11 (bisect--helper: `bisect_write`
shell function in C, 2019-01-02) ported it to C as a command mode of
'bisect--helper', which (incorrectly) added the '--no-log' option,
and converted the only place ('bisect_start') that called 'bisect_write'
with 'nolog' to invoke 'git bisect--helper --bisect-write' with 'nolog'
instead of '--no-log'. Since 'bisect--helper' has command modes, not
subcommands, all other command modes saw and handled that option as well.
This bogus state didn't last long, however, because in the same patch
series 06f5608c14 (bisect--helper: `bisect_start` shell function
partially in C, 2019-01-02) the C reimplementation of bisect_start()
started calling the bisect_write() C function, this time with the
right 'nolog' function parameter. From then on there was no need for
the '--no-log' option in 'bisect--helper'. Eventually all bisect
subcommands were ported to C as 'bisect--helper' command modes, each
calling the bisect_write() C function instead, but when the
'--bisect-write' command mode was removed in 68efed8c8a
(bisect--helper: retire `--bisect-write` subcommand, 2021-02-03) it
forgot to remove that '--no-log' option.
The '--no-log' option had never been used, and it's unused now.
Let's remove --bisect-next-check and --no-log from option parsing.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When chainlint detects problems in a test, it prints out the name of the
test script, the name of the problematic test, and a copy of the test
definition with "?!FOO?!" annotations inserted at the locations where
problems were detected. Taken together this information is sufficient
for the test author to identify the problematic code in the original
test definition. However, in a lengthy script or a lengthy test
definition, the author may still end up using the editor's search
feature to home in on the exact problem location.
To further assist the test author, display line numbers along with the
annotated test definition, thus allowing the author to jump directly to
each problematic line.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When chainlint detects problems in a test, it prints out the name of the
test script, the name of the problematic test, and a copy of the test
definition with "?!FOO?!" annotations inserted at the locations where
problems were detected. Taken together this information is sufficient
for the test author to identify the problematic code in the original
test definition. However, in a lengthy script or a lengthy test
definition, the author may still end up using the editor's search
feature to home in on the exact problem location.
To further assist the test author, an upcoming change will display line
numbers along with the annotated test definition, thus allowing the
author to jump directly to each problematic line. As preparation,
upgrade Lexer to latch the line numbers at which each token starts and
ends, and return that information with the token itself.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Although the macOS Terminal.app is "xterm"-compatible, its corresponding
"terminfo" entries -- such as "xterm", "xterm-256color", and
"xterm-new"[1] -- neglect to mention capabilities which Terminal.app
actually supports (such as "dim text"). This oversight on Apple's part
ends up penalizing users of "good citizen" console programs which
consult "terminfo" to tailor their output based upon reported terminal
capabilities (as opposed to programs which assume that the terminal
supports ANSI codes). The same problem is present in other Apple
"terminfo" entries, such as "nsterm"[2], with which macOS Terminal.app
may be configured.
Sidestep this Apple problem by imbuing get_colors() with specific
knowledge of capabilities common to "xterm" and "nsterm", rather than
trusting "terminfo" to report them correctly. Although hard-coding such
knowledge is ugly, "xterm" support is nearly ubiquitous these days, and
Git itself sets precedence by assuming support for ANSI color codes. For
other terminal types, fall back to querying "terminfo" via `tput` as
usual.
FOOTNOTES
[1] iTerm2 FAQ suggests "xterm-new": https://iterm2.com/faq.html
[2] Neovim documentation recommends terminal type "nsterm" with
Terminal.app: https://neovim.io/doc/user/term.html#terminfo
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The `label` command creates a ref refs/rewritten/<label> that the
`reset` and `merge` commands resolve by calling lookup_label(). That
uses lookup_commit_reference_by_name() to look up the label ref. As
lookup_commit_reference_by_name() uses the dwim rules when looking up
the label it will look for a branch named
refs/heads/refs/rewritten/<label> and return that instead of an error if
the branch exists and the label does not. Fix this by using read_ref()
followed by lookup_commit_object() when looking up labels.
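A minimal sketch of that lookup, assuming the usual repository handle
and error reporting (the real lookup_label() differs in its details):

    struct object_id oid;
    struct commit *commit = NULL;

    /* "ref" holds the full "refs/rewritten/<label>" name */
    if (!read_ref(ref, &oid))
        commit = lookup_commit_object(the_repository, &oid);
    if (!commit)
        error(_("could not resolve '%s'"), ref);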
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The arguments to the `reset` and `merge` commands may be a label created
with a `label` command or an arbitrary commit name. The `merge` command
uses the lookup_label() function to lookup its arguments but `reset` has
a slightly different version of that function in do_reset(). Reduce this
code duplication by calling lookup_label() from do_reset() as well.
This change improves the behavior of `reset` when the argument is a
tree. Previously `reset` would accept a tree only for the rebase to
fail with
update_ref failed for ref 'HEAD': cannot update ref 'HEAD': trying to write non-commit object da5497437fd67ca928333aab79c4b4b55036ea66 to branch 'HEAD'
Using lookup_label() means do_reset() will now error out straight away
if its argument is not a commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Enable the 'skip_cache_tree_update' option in both 'do_reset()'
('sequencer.c') and 'reset_head()' ('reset.c'). Both of these callers invoke
'prime_cache_tree()' after 'unpack_trees()', so we can remove an unnecessary
cache tree rebuild by skipping 'cache_tree_update()'.
When testing with 'p3400-rebase.sh' and 'p3404-rebase-interactive.sh', the
performance change of this update was negligible, likely due to the
operation being dominated by more expensive operations (like checking out
trees). However, since the change doesn't harm performance, it's worth
keeping this 'unpack_trees()' usage consistent with others that subsequently
invoke 'prime_cache_tree()'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When running 'read-tree' with a single tree and no prefix,
'prime_cache_tree()' is called after the tree is unpacked. In that
situation, skip a redundant call to 'cache_tree_update()' in
'unpack_trees()' by enabling the 'skip_cache_tree_update' unpack option.
Removing the redundant cache tree update provides a substantial performance
improvement to 'git read-tree <tree-ish>', as shown by a test added to
'p0006-read-tree-checkout.sh':
Test before after
----------------------------------------------------------------------
read-tree br_ballast_plus_1 3.94(1.80+1.57) 3.00(1.14+1.28) -23.9%
Note that the 'read-tree' in 't1022-read-tree-partial-clone.sh' is updated
to read two trees, rather than one. The test was first introduced in
d3da223f22 (cache-tree: prefetch in partial clone read-tree, 2021-07-23) to
exercise the 'cache_tree_update()' code path, as used in 'git merge'. Since
this patch drops the call to 'cache_tree_update()' in single-tree 'git
read-tree', change the test to use the two-tree variant so that
'cache_tree_update()' is called as intended.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Enable the 'skip_cache_tree_update' option in the variants that call
'prime_cache_tree()' after 'unpack_trees()' (specifically, 'git reset
--mixed' and 'git reset --hard'). This avoids redundantly rebuilding the
cache tree in both 'cache_tree_update()' at the end of 'unpack_trees()' and
in 'prime_cache_tree()', resulting in a small (but consistent) performance
improvement. From the newly-added 'p7102-reset.sh' test:
Test before after
--------------------------------------------------------------------
7102.1: reset --hard (...) 2.11(0.40+1.54) 1.97(0.38+1.47) -6.6%
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add (disabled by default) option to skip the 'cache_tree_update()' at the
end of 'unpack_trees()'. In many cases, this cache tree update is redundant
because the caller of 'unpack_trees()' immediately follows it with
'prime_cache_tree()', rebuilding the entire cache tree from scratch. While
these operations aren't the most expensive part of operations like 'git
reset', the duplicate calls still create a minor unnecessary slowdown.
Introduce an option for callers to skip the 'cache_tree_update()' in
'unpack_trees()' if it is redundant (that is, if 'prime_cache_tree()' is
called afterwards). At the moment, no 'unpack_trees()' callers use the new
option; they will be updated in subsequent patches.
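For illustration, a future caller opting in might look roughly like
this (a sketch only; the callers updated later set up many more unpack
options):

    struct unpack_trees_options opts = { 0 };
    struct tree_desc desc;

    /* ... set opts.src_index, opts.dst_index, opts.fn, ... */
    opts.skip_cache_tree_update = 1; /* prime_cache_tree() runs below */

    parse_tree(tree);
    init_tree_desc(&desc, tree->buffer, tree->size);
    if (unpack_trees(1, &desc, &opts))
        return -1;
    prime_cache_tree(the_repository, the_repository->index, tree);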
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add a performance test comparing the execution times of 'prime_cache_tree()'
and 'cache_tree_update(_, WRITE_TREE_SILENT | WRITE_TREE_REPAIR)'. The goal
of comparing these two is to identify which is the faster method for
rebuilding an invalid cache tree, ultimately to remove one when both are
(redundantly) called in immediate succession.
Both methods are fast, so the new tests in 'p0090-cache-tree.sh' must call
each tested function multiple times to ensure the reported times (to 0.01s
resolution) convey the differences between them.
The tests compare the timing of a 'test-tool cache-tree' run as a no-op (to
capture a baseline for the overhead associated with running the tool),
'cache_tree_update()', and 'prime_cache_tree()' on four scenarios:
- A completely valid cache tree
- A cache tree with 2 invalid paths
- A cache tree with 50 invalid paths
- A completely empty cache tree
Example results:
Test this tree
-----------------------------------------------------------
0090.2: no-op, clean 1.27(0.48+0.52)
0090.3: prime_cache_tree, clean 2.02(0.83+0.85)
0090.4: cache_tree_update, clean 1.30(0.49+0.54)
0090.5: no-op, invalidate 2 1.29(0.48+0.54)
0090.6: prime_cache_tree, invalidate 2 1.98(0.81+0.83)
0090.7: cache_tree_update, invalidate 2 2.12(0.94+0.86)
0090.8: no-op, invalidate 50 1.32(0.50+0.55)
0090.9: prime_cache_tree, invalidate 50 2.10(0.86+0.89)
0090.10: cache_tree_update, invalidate 50 2.35(1.14+0.90)
0090.11: no-op, empty 1.33(0.50+0.54)
0090.12: prime_cache_tree, empty 2.04(0.84+0.87)
0090.13: cache_tree_update, empty 2.51(1.27+0.92)
These timings show that, while 'cache_tree_update()' is faster when the
cache tree is completely valid, it is equal to or slower than
'prime_cache_tree()' when there are any invalid paths. Since the redundant
calls are mostly in scenarios where the cache tree will be at least
partially invalid (e.g., 'git reset --hard'), 'prime_cache_tree()' will
likely perform better than 'cache_tree_update()' in typical cases.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When deleting a branch, "git branch -d" has a safety check that ensures
the branch is merged to its upstream (if any), or to HEAD. To do that,
naturally we try to resolve HEAD to a commit object. If we're on an
orphan branch (i.e., HEAD points to a branch that does not yet exist),
that will fail, and we'll bail with an error:
$ git branch -d to-delete
fatal: Couldn't look up commit object for HEAD
This usually isn't that big of a deal. The deletion would fail anyway,
since the branch isn't merged to HEAD, and you'd need to use "-D" (or
"-f"). And doing so skips the HEAD resolution, courtesy of 67affd5173
(git-branch -D: make it work even when on a yet-to-be-born branch,
2006-11-24).
But there are still two problems:
1. The error message isn't very helpful. We should give the usual "not
fully merged" message, which points the user at "branch -D". That
was a problem even back in 67affd5173.
2. Even without a HEAD, these days it's still possible for the
deletion to succeed. After 67affd5173, commit 99c419c915 (branch
-d: base the "already-merged" safety on the branch it merges with,
2009-12-29) made it OK to delete a branch if it is merged to its
upstream.
We can fix both by removing the die() in delete_branches() completely,
leaving head_rev NULL in this case. It's tempting to stop there, as it
appears at first glance that the rest of the code does the right thing
with a NULL. But sadly, it's not quite true.
We end up feeding the NULL to repo_is_descendant_of(). In the
traditional code path there, we call repo_in_merge_bases_many(). It
feeds the NULL to repo_parse_commit(), which is smart enough to return
an error, and we immediately return "no, it's not a descendant".
But there's an alternate code path: if we have a commit graph with
generation numbers, we end up in can_all_from_reach(), which does
eventually try to set a flag on the NULL commit and segfaults.
So instead, we'll teach the local branch_merged() helper to treat a NULL
as "not merged". This would be a little more elegant in in_merge_bases()
itself, but that function is called in a lot of places, and it's not
clear that quietly returning "not merged" is the right thing everywhere
(I'd expect in many cases, feeding a NULL is a sign of a bug).
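The core of that change can be sketched as a small helper (a
hypothetical distillation; the real branch_merged() also considers the
configured upstream as the merge target):

    static int merged_to(struct commit *rev, struct commit *target)
    {
        struct commit_list *with = NULL;
        int ret;

        if (!target)
            return 0; /* e.g. HEAD on an orphan/unborn branch */
        commit_list_insert(target, &with);
        ret = repo_is_descendant_of(the_repository, rev, with);
        free_commit_list(with);
        return ret;
    }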
There are four tests here:
a. The first one confirms that deletion succeeds with an orphaned HEAD
when the branch is merged to its upstream. This is case (2) above.
b. Same, but with commit graphs enabled. Even if it is merged to
upstream, we still check head_rev so that we can say "deleting
because it's merged to upstream, even though it's not merged to
HEAD". Without the second hunk in branch_merged(), this test would
segfault in can_all_from_reach().
c. The third one confirms that we correctly say "not merged to HEAD"
when we can't resolve HEAD, and reject the deletion.
d. Same, but with commit graphs enabled. Without the first hunk in
branch_merged(), this one would segfault.
Reported-by: Martin von Zweigbergk <martinvonz@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
git_parse_signed() checks that the absolute value of the parsed string
is less than or equal to a caller supplied maximum value. When
calculating the absolute value there is an integer overflow if `val ==
INTMAX_MIN`. To fix this, avoid negating `val` when it is negative by
having separate overflow checks for positive and negative values.
An alternative would be to special case INTMAX_MIN before negating `val`
as it is always out of range. That would enable us to keep the existing
code but I'm not sure that the current two-stage check is any clearer
than the new version.
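Schematically the new check looks something like this (a sketch; the
real code also folds in the unit factor):

    static int fits_signed_range(intmax_t val, intmax_t max)
    {
        /* never negate "val": val == INTMAX_MIN would overflow */
        if (val < 0)
            return val >= -max;
        return val <= max;
    }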
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
If the input to strtoimax() or strtoumax() does not contain any digits
then they return zero and set `end` to point to the start of the input
string. git_parse_[un]signed() do not check `end` and so fail to return
an error and instead return a value of zero if the input string is a
valid units factor without any digits (e.g "k").
Tests are added to check that 'git config --int' and OPT_MAGNITUDE()
reject a units specifier without a leading digit.
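The missing check boils down to this kind of guard (a sketch; the
surrounding parser also handles units and range checking):

    char *end;

    errno = 0;
    val = strtoimax(value, &end, 0);
    if (end == value) { /* no digits consumed, e.g. "k" */
        errno = EINVAL;
        return 0;
    }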
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
git_parse_unsigned() relies on strtoumax() which unfortunately parses
negative values as large positive integers. Fix this by rejecting any
string that contains '-' as we do in strtoul_ui(). I've chosen to treat
negative numbers as invalid input and set errno to EINVAL rather than
ERANGE on the basis that they are never acceptable if we're looking for
an unsigned integer. This is also consistent with the existing behavior
of rejecting "1-2" with EINVAL.
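The rejection itself is tiny, roughly (a sketch):

    /* strtoumax() would happily wrap "-1", so refuse any minus sign */
    if (strchr(value, '-')) {
        errno = EINVAL;
        return 0;
    }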
As we do not have unit tests for this function, it is tested indirectly
by checking that negative values for core.bigFileThreshold are
rejected. As this function is also used by OPT_MAGNITUDE(), a test is
added to check that it rejects negative values too.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Now that struct replay_opts has a reflog_action member we no longer
need to export GIT_REFLOG_ACTION when starting a rebase. If the user
has set GIT_REFLOG_ACTION then we use it when initializing
reflog_action.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Each time it picks a commit the sequencer copies the GIT_REFLOG_ACTION
environment variable so it can temporarily change it and then restore
the previous value. This results in code that is hard to follow and also
leaks memory because (i) we fail to free the copy when we've finished
with it and (ii) each call to setenv() leaks the previous value. Instead
pass the reflog action around in a variable and use it to set
GIT_REFLOG_ACTION in the child environment when running "git commit".
Within the sequencer GIT_REFLOG_ACTION is no longer set and is only read
by sequencer_reflog_action(). It is still set by rebase before calling
the sequencer; that will be addressed in the next commit. cherry-pick
and revert are unaffected as they do not set GIT_REFLOG_ACTION before
calling the sequencer.
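Conceptually the child invocation now looks something like this (a
sketch; it assumes sequencer_reflog_action() hands back the string to
use, and the real call site configures much more of the command):

    struct child_process cmd = CHILD_PROCESS_INIT;

    cmd.git_cmd = 1;
    strvec_push(&cmd.args, "commit");
    strvec_pushf(&cmd.env, "GIT_REFLOG_ACTION=%s",
                 sequencer_reflog_action(opts));
    run_command(&cmd);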
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When t7610-mergetool.sh runs without failures the git://example.com
submodule URLs will never be used. That's because we "git submodule
add" them, but then manually populate them so that a subsequent "git
submodule update -N" won't attempt to clone them, only update them
without fetching.
But if we fail in an earlier test it'll have the knock-on effect of
having later tests hang on that "git submodule update -N" as we
attempt to clone this repository from example.com.
This can be reproduced on "master" by running the test with
SANITIZE=leak without "--immediate". With
"GIT_TEST_PASSING_SANITIZE_LEAK=true" (which the linux-leaks job uses)
we'll skip the test entirely. So we'll only run into this when running
it manually, or with the "GIT_TEST_PASSING_SANITIZE_LEAK=check" mode.
That's not because the failure has anything to do with leak detection
per-se. It just so happens that we have a leak that'll fail before
we've managed to fully set these up, and therefore "git submodule
update -N" ends up spawning "git clone".
Let's instead continue lying about the origin of this submodule by
providing a URL for it that doesn't work, but now one that *really*
doesn't work: /dev/null. If the test is passing we won't ever use
this, and if we have knock-on failures we'll fail early, instead of
waiting for a timeout.
The behavior of "-N" here might be surprising to some, since it's
explained as "[if you use -N we] don’t fetch new objects from the
remote site". But (perhaps counter-intuitively) it's only talking
about if it needs to do so via "git fetch". In this case we'll end up
spawning a "git clone", as we have no submodule set up.
See ff7f089ed1 (mergetool: Teach about submodules, 2011-04-13) for
the commit that implemented these "example.com" tests.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
* es/chainlint-output:
chainlint: annotate original test definition rather than token stream
chainlint: latch start/end position of each token
chainlint: tighten accuracy when consuming input stream
chainlint: add explanatory comments
Simplify the run-command API.
* rs/no-more-run-command-v:
replace and remove run_command_v_opt()
replace and remove run_command_v_opt_cd_env_tr2()
replace and remove run_command_v_opt_tr2()
replace and remove run_command_v_opt_cd_env()
use child_process members "args" and "env" directly
use child_process member "args" instead of string array variable
sequencer: simplify building argument list in do_exec()
bisect--helper: factor out do_bisect_run()
bisect: simplify building "checkout" argument list
am: simplify building "show" argument list
run-command: fix return value comment
merge: remove always-the-same "verbose" arguments
"git archive" mistakenly complained twice about a missing executable,
which has been corrected.
* rs/archive-filter-error-once:
archive-tar: report filter start error only once
A redundant diagnostic message is dropped from test_path_is_missing().
* ma/drop-redundant-diagnostic:
test-lib-functions: drop redundant diagnostic print
The ref-filter code mis-parsed tag signatures with CRLF line endings
and no body, as well as signatures not preceded by a blank line; both
have been fixed.
* jk/ref-filter-parsing-bugs:
ref-filter: fix parsing of signatures with CRLF and no body
ref-filter: fix parsing of signatures without blank lines
The glossary entries for "commit-graph file" and "reachability
bitmap" have been added.
* po/glossary-around-traversal:
glossary: add reachability bitmap description
glossary: add "commit graph" description
doc: use 'object database' not ODB or abbreviation
doc: use "commit-graph" hyphenation consistently
The adjust_shared_perm() helper function learned to refrain from
setting the "g+s" bit on directories when it is not necessary.
* jc/set-gid-bit-less-aggressively:
adjust_shared_perm(): leave g+s alone when the group does not matter
Enable gc.cruftpacks by default for those who opt into
feature.experimental setting.
* es/mark-gc-cruft-as-experimental:
config: let feature.experimental imply gc.cruftPacks=true
gc: add tests for --cruft and friends
Resolves a problem where symbolic links were not showing up in diff when
created or modified.
Treat kFSEventStreamEventFlagItemIsSymlink as a file update as well.
This is needed because kFSEventStreamEventFlagItemIsFile is not
included in FSEvents when creating or deleting symbolic links. For example:
$ ln -snf t test
fsevent: '/path/to/dir/test', flags=0x40100 ItemCreated|ItemIsSymlink|
$ ln -snf ci test
fsevent: '/path/to/dir/test', flags=0x40200 ItemIsSymlink|ItemRemoved|
fsevent: '/path/to/dir/test', flags=0x40100 ItemCreated|ItemIsSymlink|
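The change amounts to also honoring the symlink flag when classifying
an event as a file update, roughly (a sketch of the idea; the macOS
fsmonitor backend does more per event):

    if (flags & (kFSEventStreamEventFlagItemIsFile |
                 kFSEventStreamEventFlagItemIsSymlink)) {
        /* record this path as a file-level change */
    }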
Signed-off-by: srz_zumix <zumix.cpp@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Have the REV_INFO_INIT macro added in [1] declare more members of
"struct rev_info" that we can initialize statically, and have
repo_init_revisions() do so with the memcpy(..., &blank) idiom
introduced in [2].
As the comment for the "REV_INFO_INIT" macro notes this still isn't
sufficient to initialize a "struct rev_info" for use yet. But we are
getting closer to that eventual goal.
Even though we can't fully initialize a "struct rev_info" with
REV_INFO_INIT it's useful for readability to clearly separate those
things that we can statically initialize, and those that we can't.
This change could replace the:
list_objects_filter_init(&revs->filter);
in repo_init_revisions() with this line, at the end of the
REV_INFO_INIT declaration in revisions.h:
.filter = LIST_OBJECTS_FILTER_INIT, \
But doing so would produce a minor conflict with an outstanding
topic[3]. Let's skip that for now. I have follow-ups to initialize
more of this statically, e.g. changes to get rid of grep_init(). We
can initialize more members with the macro in a future series.
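The memcpy(..., &blank) idiom referenced above looks like this (a
sketch; the real function goes on to initialize the members that cannot
be set up statically):

    void repo_init_revisions(struct repository *r,
                             struct rev_info *revs, const char *prefix)
    {
        struct rev_info blank = REV_INFO_INIT;

        memcpy(revs, &blank, sizeof(*revs));
        /* ... runtime-only initialization follows ... */
    }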
1. f196c1e908 (revisions API users: use release_revisions() needing
REV_INFO_INIT, 2022-04-13)
2. 5726a6b401 (*.c *_init(): define in terms of corresponding *_INIT
macro, 2021-07-01)
3. https://lore.kernel.org/git/265b292ed5c2de19b7118dfe046d3d9d932e2e89.1667901510.git.ps@pks.im/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The old version we currently use runs in node.js v12.x, which is being
deprecated in GitHub Actions. The new version uses node.js v16.x.
Incidentally, this also avoids the warning about the deprecated
`::set-output::` workflow command because the newer version of the
`github-script` Action uses the recommended new way to specify outputs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.
However, now that each parsed token carries positional information, the
location of a detected problem can be pinpointed precisely in the
original test definition. Therefore, take advantage of this information
to annotate the test definition itself rather than annotating the parsed
token stream, thus making it easier for a test author to relate a
problem back to the source.
Maintaining the positional meta-information associated with each
detected problem requires a slight change in how the problems are
managed internally. In particular, shell syntax such as:
msg="total: $(cd data; wc -w *.txt) words"
requires the lexical analyzer to recursively invoke the parser in order
to detect problems within the $(...) expression inside the double-quoted
string. In this case, the recursive parse context will detect the broken
&&-chain between the `cd` and `wc` commands, returning the token stream:
cd data ; ?!AMP?! wc -w *.txt
However, the parent parse context will see everything inside the
double-quotes as a single string token:
"total: $(cd data ; ?!AMP?! wc -w *.txt) words"
losing whatever positional information was attached to the ";" token
where the problem was detected.
One way to preserve the positional information of a detected problem in
a recursive parse context within a string would be to attach the
positional information to the annotation textually; for instance:
"total: $(cd data ; ?!AMP:21:22?! wc -w *.txt) words"
and then extract the positional information when annotating the original
test definition.
However, a cleaner and much simpler approach is to maintain the list of
detected problems separately rather than embedding the problems as
annotations directly in the parsed token stream. Not only does this
ensure that positional information within recursive parse contexts is
not lost, but it keeps the token stream free from non-token pollution,
which may simplify implementation of validations added in the future
since they won't have to handle non-token "?!FOO!?" items specially.
Finally, the chainlint self-test "expect" files need a few mechanical
adjustments now that the original test definitions are emitted rather
than the parsed token stream. In particular, the following items missing
from the historic parsed-token output are now preserved verbatim:
* indentation (and whitespace, in general)
* comments
* here-doc bodies
* here-doc tag quoting (i.e. "\EOF")
* line-splices (i.e. "\" at the end of a line)
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.
To address this shortcoming, an upcoming change will make it print out
an annotated copy of the original test definition rather than the
tokenized representation. In order to do so, it will need to know the
start and end positions of each token in the original test definition.
As preparation, upgrade TestParser::scan_token() to latch the start and
end position of the token being scanned, and return that information
along with the token itself. A subsequent change will take advantage of
this positional information.
In terms of implementation, TestParser::scan_token() is retrofitted to
return a tuple consisting of the token's lexeme and its start and end
positions, rather than returning just the lexeme. However, an
alternative would be to define a class which represents a token:
    package Token;
    sub new {
        my ($class, $lexeme, $start, $end) = @_;
        bless [$lexeme, $start, $end] => $class;
    }
    sub as_string {
        my $self = shift @_;
        return $self->[0];
    }
    sub compare {
        my ($x, $y) = @_;
        if (UNIVERSAL::isa($y, 'Token')) {
            return $x->[0] cmp $y->[0];
        }
        return $x->[0] cmp $y;
    }
    use overload (
        '""' => 'as_string',
        'cmp' => 'compare'
    );
The major benefit of the class-based approach is that it is entirely
non-invasive; it requires no additional changes to the rest of the
script since a Token converts automatically to a string, which is what
scan_token() historically returned.
The big downside to the Token approach, however, is that it is _slow_;
on this developer's (old) machine, it increases user-time by an
unacceptable seven seconds when scanning all test scripts in the
project. Hence, the simple tuple approach is employed instead since it
adds only a fraction of a second user-time.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To extract the next token in the input stream, Lexer::scan_token() finds
the start of the token by skipping whitespace, then consumes characters
belonging to the token until it encounters a non-token character, such
as an operator, punctuation, or whitespace. In the case of an operator
or punctuation which ends a token, before returning the just-scanned
token, it pushes that operator or punctuation character back onto the
input stream to ensure that it will be the first character consumed by
the next call to scan_token().
However, scan_token() is intentionally lax when whitespace ends a token;
it doesn't bother pushing the whitespace character back onto the token
stream since it knows that the next call to scan_token() will, as its
first step, skip over whitespace anyhow when looking for the start of
the token.
Although such laxity is harmless for the proper functioning of the
lexical analyzer, it does make it difficult to precisely identify the
token's end position in the input stream. Accurate token position
information may be desirable, for instance, to annotate problems or
highlight other interesting facets of the input found during the parsing
phase. To accommodate such possibilities, tighten scan_token() by making
it push the token-ending whitespace character back onto the input
stream, just as it does for other token-ending characters.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The logic in TestParser::accumulate() for detecting broken &&-chains is
mostly well-commented, but a couple branches which were deemed obvious
and straightforward lack comments. In retrospect, though, these cases
may give future readers pause, so comment them, as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Have the cmd_submodule__helper() use the OPT_SUBCOMMAND() API
introduced in fa83cc834d (parse-options: add support for parsing
subcommands, 2022-08-19).
This is only a marginal reduction in line count, but once we start
unifying this with a yet-to-be-added "builtin/submodule.c" it'll be
much easier to reason about those changes, as they'll both use
OPT_SUBCOMMAND().
We don't need to worry about "argv[0]" being NULL in the die() because
we'd have errored out in parse_options() as we're not using
"PARSE_OPT_SUBCOMMAND_OPTIONAL".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since 29a5e9e1ff (submodule--helper update-clone: learn --init,
2022-03-04) we've been passing "-C <prefix>" from "git-submodule.sh"
whenever we pass "--prefix <prefix>", so the latter is redundant to
the former. Let's drop the "--prefix" option.
Suggested-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Let's pass the "-C <prefix>" option instead to "absorbgitdirs" from
its only caller.
When it was added in f6f8586140 (submodule: add absorb-git-dir
function, 2016-12-12) there were other "submodule--helper" subcommands
that were invoked with "-C <prefix>", so we could have done this all
along.
Suggested-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Remove the "----recursive" option to "git submodule--helper
absorbgitdirs" (yes, with 4 dashes, not 2).
This option, and all the "else" code run when "flags &
ABSORB_GITDIR_RECURSE_SUBMODULES" is false, have never been used since
they were added in f6f8586140 (submodule: add absorb-git-dir function,
2016-12-12); we'd have had to spell the option "----recursive", as a
plain "--recursive" would have errored out.
It would be nice to follow up with an optbug() assertion in
parse-options.c for such funnily named options. I manually validated
that this was the only long option whose name started with "-", but
let's skip adding such an assertion for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
A move- and indentation-only change that moves the
ABSORB_GITDIR_RECURSE_SUBMODULES case into its own function, which, as
we'll see, makes the subsequent commit changing this code much smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We tested for "--" followed by command names, but not for "--"
followed by an argument that looks like an option, let's do that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The "status" sub-command was leaking the "struct strvec" it was
setting up for the reasons explained in f92dbdbc6a (revisions API:
don't leak memory on argv elements that need free()-ing, 2022-08-02),
so let's use the "free_removed_argv_elements" option to
setup_revisions() to fix the leak.
Even if we did that, clobbering the "diff_files_args.nr" with the
return value of setup_revisions() would leave leaks in place, but we
can just stop clobbering it.
Ever since that code was added in a9f8a37584 (submodule: port
submodule subcommand 'status' from shell to C, 2017-10-06) we've had
no reason to modify the "nr" member ("argc" at the time): The next use
of "diff_files_args" after this is the "strvec_clear()" at the end of
the function.
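The fix boils down to something like this (a sketch; the rest of the
function is unchanged):

    struct setup_revision_opt opt = {
        .free_removed_argv_elements = 1,
    };

    /* no longer clobber diff_files_args.nr with the return value */
    setup_revisions(diff_files_args.nr, diff_files_args.v, &rev, &opt);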
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Exhaustively test how combining various "mixed-level" "git
submodule" options works. "Mixed-level" here means options that are
accepted by a mixture of the top-level "submodule" command, and
e.g. the "status" sub-command.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
As with other moves to "test-tool" in f322e9f51b (Merge branch
'ab/submodule-helper-prep', 2022-09-13) the "config" sub-command was
only used by our own tests.
It was last used by "git submodule" itself in code that went away with
a6226fd772 (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10).
Let's move it over, and while doing so make it easier to reason about
by splitting up the various uses for it into separate sub-commands, so
that we don't need to count arguments to see what it does.
This also has the advantage that we stop wasting future translator
time on this command; currently the usage information for this
internal-only tool has been translated into several languages. The use
of the "_" function has also been removed from the "please make
sure..." message.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Let's mention the SHAttered attack and more generally why we use the
sha1collisiondetection backend by default, and note that for SHA-256
the user should feel free to pick any of the supported backends as far
as hashing security is concerned.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since [1] the default SHA-1 backend on OSX has been
APPLE_COMMON_CRYPTO. Per [2] we'll skip using it on anything older
than Mac OS X 10.4 "Tiger"[3].
When "DC_SHA1" was made the default in [4] this interaction between it
and APPLE_COMMON_CRYPTO seems to have been missed in. Ever since
DC_SHA1 was "made the default" we've still used Apple's CommonCrypto
instead of sha1collisiondetection on modern versions of Darwin and
OSX.
1. 61067954ce (cache.h: eliminate SHA-1 deprecation warnings on Mac
OS X, 2013-05-19)
2. 9c7a0beee0 (config.mak.uname: set NO_APPLE_COMMON_CRYPTO on older
systems, 2014-08-15)
3. We could probably drop "NO_APPLE_COMMON_CRYPTO", as nobody's likely
to care about such an old version of OSX anymore. But let's leave that
for now.
4. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Address the root cause of technical debt we've been carrying since
sha1collisiondetection was made the default in [1]. In a preceding
commit we narrowly fixed a bug where the "DC_SHA1" variable would be
unset (in combination with "NO_APPLE_COMMON_CRYPTO=" on OSX), even
though we had the sha1collisiondetection library enabled.
But the only reason we needed to have such a user-exposed knob went
away with [1], and it's been doing nothing useful since then. We don't
care if you define DC_SHA1=*, we only care that you don't ask for any
other SHA-1 implementation. If it turns out that you didn't, we'll use
sha1collisiondetection, whether you had "DC_SHA1" set or not.
As a result of this being confusing we had e.g. [2] for cmake and the
recent [3] for ci/lib.sh setting "DC_SHA1" explicitly, even though
this was always a NOOP.
A much simpler way to do this is to stop having the Makefile and
CMakeLists.txt set "DC_SHA1" to be picked up by the test-lib.sh, let's
instead add a trivial "test-tool sha1-is-sha1dc". It returns zero if
we're using sha1collisiondetection, non-zero otherwise.
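Such a helper can be as small as this (a sketch; it assumes SHA1_DC is
the preprocessor macro the build defines when the sha1collisiondetection
backend is selected):

    int cmd__sha1_is_sha1dc(int argc, const char **argv)
    {
    #ifdef SHA1_DC
        return 0; /* sha1collisiondetection in use */
    #else
        return 1;
    #endif
    }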
1. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
2. c4b2f41b5f (cmake: support for testing git with ctest, 2020-06-26)
3. 1ad5c3df35 (ci: use DC_SHA1=YesPlease on osx-clang job for CI,
2022-10-20)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
For the *_SHA1 and *_SHA256 flags we've discussed the various flags,
but not the fact that when you define multiple flags we'll pick one.
Which one we pick depends on the order they're listed in the Makefile,
which differed from the order we discussed them in this documentation.
Let's be explicit about how we select these, and re-arrange the
listings so that they're listed in the priority order we've picked.
I'd personally prefer that the selection was more explicit, and that
we'd error out if conflicting flags were provided, but per the
discussion downthread of [1] the consensus was to keep these semantics.
This behavior makes it easier to e.g. integrate with autoconf-like
systems, where the configuration can provide everything it can
support, and Git is tasked with picking the first one it prefers.
1. https://lore.kernel.org/git/220710.86mtdh81ty.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since 27dc04c545 (sha256: add an SHA-256 implementation using
libgcrypt, 2018-11-14) we've claimed to support a BLK_SHA256 flag, but
there's no such SHA-256 backend.
Instead we fall back on adding "sha256/block/sha256.o" to "LIB_OBJS"
and adding "-DSHA256_BLK" to BASIC_CFLAGS.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In the preceding commit the discussion of the *_SHA1 knobs was left
as-is to benefit from a smaller diff, but since we're changing these
let's use the same phrasing we use for most other knobs. E.g. "define
X", not "define X environment variable", and get rid of the "when
running make to link with" entirely.
Furthermore the discussion of DC_SHA1* options is now under a "Options
for the sha1collisiondetection implementation" heading, so we don't
need to clarify that these options go along with DC_SHA1=Y, so let's
rephrase them accordingly.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since the "Define ..." template of comments at the top of the Makefile
was started in 5bdac8b326 ([PATCH] Improve the compilation-time
settings interface, 2005-07-29) we've had a lot more flags added,
including flags that come in "groups". Not having any obvious
structure to the >500 line comment at the top of the Makefile has made
it hard to follow.
This change is almost entirely a move-only change, the two paragraphs
at the start of the first two sections are new, and so are the added
sections themselves, but other than that no lines are changed, only
moved.
We now list Makefile-only flags at the start, followed by stand-alone
flags, and then cover "optional library" flags in their respective
groups, followed by SHA-1 and SHA-256 flags, and finally
DEVELOPER-specific flags.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The claim that DC_SHA1 takes priority over other *_SHA1 knobs was true
when it was added in [1], but that hasn't been the case since it was
made the fallback default in [2].
We should be making it not only the default, but something that takes
priority over other *_SHA1 knobs, but that's outside the scope of this
change. For now let's correct the documentation to match reality.
Let's also remove the "unconditionally enable" wording, per the above
the enabling of "DC_SHA1" is conditional on these other flags.
The "Define DC_SHA1" here is also a lie, actually it's "we don't care
if you define DC_SHA1, just don't define anything else", but that's a
more general issue that'll be addressed in a subsequent commit. Let's
first stop pretending that this setting (which we actually don't even
use) takes priority over anything else.
1. 8325e43b82 (Makefile: add DC_SHA1 knob, 2017-03-16)
2. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The claim that OpenSSL is the default SHA-1 backend hasn't been true
since e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17),
but more importantly tweaking the SHA-1 backend isn't something that's
common enough to warrant discussing in the INSTALL document, so let's
remove this paragraph.
This discussion was originally added in c538d2d34a (Add some
installation notes in INSTALL, 2005-06-17) when tweaking the default
backend was more common. The current wording was added in
5beb577db8 (INSTALL: Describe dependency knobs from Makefile,
2009-09-10).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix an edge case introduced in e6b07da278 (Makefile: make DC_SHA1
the default, 2017-03-17): when DC_SHA1 was made the default fallback
we started unconditionally adding to BASIC_CFLAGS and LIB_OBJS, so
we'd use sha1collisiondetection by default.
But the "DC_SHA1" variable remained unset, so e.g.:
make test DC_SHA1= T=t0013*.sh
Would skip the sha1collisiondetection tests, as we'd write
"DC_SHA1=''" to "GIT-BUILD-OPTIONS", but if we manually removed that
test prerequisite we'd pass the test (which we couldn't if we weren't
using sha1collisiondetection).
So let's have the fallback assignment use the 'override' directive
instead of the ":=" simply expanded variable introduced in
e6b07da278. In this case we explicitly want to override the user's
choice.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Once upon a time, Matheus wrote some patches to make
git grep [--cached | <REVISION>] ...
restrict its output to the sparsity specification when working in a
sparse checkout[1]. That effort got derailed by two things:
(1) The --sparse-index work just beginning which we wanted to avoid
creating conflicts for
(2) Never deciding on flag and config names and planned high level
behavior for all commands.
More recently, Shaoxuan implemented a more limited form of Matheus'
patches[2] that only affected --cached, using a different flag name,
but also changing the default behavior in line with what Matheus did.
This again highlighted the fact that we never decided on command line
flag names, config option names, and the big picture path forward.
The --sparse-index work has been mostly complete (or at least released
into production even if some small edges remain) for quite some time
now. We have also had several discussions on flag and config names,
though we never came to solid conclusions. Stolee once upon a time
suggested putting all these into some document in
Documentation/technical[3], which Victoria recently also requested[4].
I'm behind the times, but here's a patch attempting to finally do that.
[1] https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/
(See his second link in that email in particular)
[2] https://lore.kernel.org/git/20220908001854.206789-2-shaoxuan.yuan02@gmail.com/
[3] https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/
(Scroll to the very end for the final few paragraphs)
[4] https://lore.kernel.org/git/cafcedba-96a2-cb85-d593-ef47c8c8397c@github.com/
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In b3b1a21d1a (sequencer: rewrite update-refs as user edits todo list,
2022-07-19), the 'todo_list_filter_update_refs()' step was added to handle
the removal of 'update-ref' lines from a 'rebase-todo'. Specifically, it
removes potential ref updates from the "update refs state" if a ref does not
have a corresponding 'update-ref' line.
However, because 'write_update_refs_state()' will not update the state if
the 'refs_to_oids' list was empty, removing *all* 'update-ref' lines will
result in the state remaining unchanged from how it was initialized (with
all refs' "after" OID being null). Then, when the ref update is applied, all
refs will be updated to null and consequently deleted.
To fix this, delete the 'update-refs' state file when 'refs_to_oids' is
empty. Additionally, add tests covering the "all update-ref lines
removed" cases.
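In outline the fix looks like this (a sketch with stand-in names;
"state_path" is hypothetical and the surrounding sequencer code is
elided):

    if (!refs_to_oids->nr) {
        unlink(state_path); /* remove the now-stale update-refs state */
        return 0;
    }
    /* otherwise write the state as before */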
Reported-by: herr.kaste <herr.kaste@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Every once in a while, a Git for Windows installation fails because the
attempt to reconfigure a Scalar enlistment failed because it was deleted
manually without removing the corresponding entries in the global Git
config.
In f5f0842d0b (scalar: let 'unregister' handle a deleted enlistment
directory gracefully, 2021-12-03), we already taught `scalar delete` to
handle the case of a manually deleted enlistment gracefully. This patch
adds the same graceful handling to `scalar reconfigure --all`.
This patch is best viewed with `--color-moved`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
If clear_skip_worktree_from_present_files() encounters a sparse
directory, it fully materialises the index, which should expand any
sparse directories, and starts going through each entry again. If this
happens more than once, raise it with a BUG.
Signed-off-by: Anh Le <anh@canva.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When using sparse checkout, clear_skip_worktree_from_present_files() must
enumerate index entries to find ones with the SKIP_WORKTREE bit to
determine whether those index entries exist on disk (in which case their
SKIP_WORKTREE bit should be removed).
In a large repository, this may take considerable time depending on the
size of the index.
Add a trace2 region to surface this information, keeping a count of how
many paths have been checked. Separately, keep counts after a full index is
materialized.
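For example, such a region could be emitted roughly like this (a
sketch; the category, label, and key names are illustrative):

    trace2_region_enter("index", "clear_skip_worktree", istate->repo);
    /* walk the index, clearing SKIP_WORKTREE where the path is
     * present on disk, counting the paths checked */
    trace2_data_intmax("index", istate->repo,
                       "sparse_path_count", path_count);
    trace2_region_leave("index", "clear_skip_worktree", istate->repo);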
Signed-off-by: Anh Le <anh@canva.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Test scripts verifying the presence/absence of files, paths,
directories, symlinks and other features in the 'git mv' tests use the
command format:
'test (-e|f|d|h|...)'
Replace them with helper functions of the format:
'test_path_is_*'
and replace the negated idiom:
'! test_path_is_*'
with
'test_path_is_missing'
Using 'test_path_is_bar' in place of '! test_path_is_foo' brings the
added benefit of a helpful diagnostic when a test fails after the mv
command has been used, that is, it reports when the path/feature
unexpectedly exists.
Signed-off-by: Debra Obondo <debraobondo@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Various tests exercising the transfer.credentialsInUrl configuration
are taught to avoid making requests which require resolving localhost
to reduce CI-flakiness.
* jk/avoid-localhost:
t5516/t5601: be less strict about the number of credential warnings
t5516: move plaintext-password tests from t5601 and t5516
This commit fixes a bug when parsing tags that have CRLF line endings, a
signature, and no body, like this (the "^M" are marking the CRs):
this is the subject^M
-----BEGIN PGP SIGNATURE-----^M
^M
...some stuff...^M
-----END PGP SIGNATURE-----^M
When trying to find the start of the body, we look for a blank line
separating the subject and body. In this case, there isn't one. But we
search for it using strstr(), which will find the blank line in the
signature.
In the non-CRLF code path, we check whether the line we found is past
the start of the signature, and if so, put the body pointer at the start
of the signature (effectively making the body empty). But the CRLF code
path doesn't catch the same case, and we end up with the body pointer in
the middle of the signature field. This has two visible problems:
- printing %(contents:subject) will show part of the signature, too,
since the subject length is computed as (body - subject)
- the length of the body is (sig - body), which makes it negative.
Asking for %(contents:body) causes us to cast this to a very large
size_t when we feed it to xmemdupz(), which then complains about
trying to allocate too much memory.
These are essentially the same bugs fixed in the previous commit, except
that they happen when there is a CRLF blank line in the signature,
rather than no blank line at all. Both are caused by the refactoring in
9f75ce3d8f (ref-filter: handle CRLF at end-of-line more gracefully,
2020-10-29).
We can fix this by doing the same "sigstart" check that we do in the
non-CRLF case. And rather than repeat ourselves, we can just use
short-circuiting OR to collapse both cases into a single conditional.
I.e., rather than:
if (strstr("\n\n"))
...found blank, check if it's in signature...
else if (strstr("\r\n\r\n"))
...found blank, check if it's in signature...
else
...no blank line found...
we can collapse this to:
if (strstr("\n\n")) ||
strstr("\r\n\r\n")))
...found blank, check if it's in signature...
else
...no blank line found...
The tests show the problem and the fix. Though it wasn't broken, I
included contents:signature here to make sure it still behaves as
expected, but note the shell hackery needed to make it work. A
less-clever option would be to skip using test_atom and just "append_cr
>expected" ourselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When ref-filter is asked to show %(content:subject), etc, we end up in
find_subpos() to parse out the three major parts: the subject, the body,
and the signature (if any).
When searching for the blank line between the subject and body, if we
don't find anything, we try to treat the whole message as the subject,
with no body. But our idea of "the whole message" needs to take into
account the signature, too. Since 9f75ce3d8f (ref-filter: handle CRLF at
end-of-line more gracefully, 2020-10-29), the code instead goes all the
way to the end of the buffer, which produces confusing output.
Here's an example. If we have a tag message like this:
this is the subject
-----BEGIN SSH SIGNATURE-----
...some stuff...
-----END SSH SIGNATURE-----
then the current parser will put the start of the body at the end of the
whole buffer. This produces two buggy outcomes:
- since the subject length is computed as (body - subject), showing
%(contents:subject) will print both the subject and the signature,
rather than just the single line
- since the body length is computed as (sig - body), and the body now
starts _after_ the signature, we end up with a negative length!
Fortunately we never access out-of-bounds memory, because the
negative length is fed to xmemdupz(), which casts it to a size_t,
and xmalloc() bails trying to allocate an absurdly large value.
In theory it would be possible for somebody making a malicious tag
to wrap it around to a more reasonable value, but it would require a
tag on the order of 2^63 bytes. And even if they did, all they get
is an out of bounds string read. So the security implications are
probably not interesting.
We can fix both by correctly putting the start of the body at the same
index as the start of the signature (effectively making the body empty).
Note that this is a real issue with signatures generated with gpg.format
set to "ssh", which would look like the example above. In the new tests
here I use a hard-coded tag message, for a few reasons:
- regardless of what the ssh-signing code produces now or in the
future, we should be testing this particular case
- skipping the actual signature makes the tests simpler to write (and
allows them to run on more systems)
- t6300 has helpers for working with gpg signatures; for the purposes
of this bug, "BEGIN PGP" is just as good a demonstration, and this
simplifies the tests
Curiously, the same issue doesn't happen with real gpg signatures (and
there are even existing tests in t6300 which cover this). Those have a
blank line between the header and the content, like:
this is the subject
-----BEGIN PGP SIGNATURE-----
...some stuff...
-----END PGP SIGNATURE-----
Because we search for the subject/body separator line with a strstr(),
we find the blank line in the signature, even though it's outside of
what we'd consider the body. But that puts us onto a separate code path,
which realizes that we're now in the signature and adjusts the line back
to "sigstart". So this patch is basically just making the "no line found
at all" case match that. And note that "sigstart" is always defined (if
there is no signature, it points to the end of the buffer as you'd
expect).
Reported-by: Martin Englund <martin@englund.nu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add a rather trivial "spatchcache", with this running e.g.:
make cocciclean
make contrib/coccinelle/free.cocci.patch \
SPATCH=contrib/coccinelle/spatchcache \
SPATCH_FLAGS=--very-quiet
Is cut down from ~20s to ~5s on my system. Much of that is either
fixable shell overhead, or the around 40 files we "CANTCACHE" (see the
implementation).
This uses "redis" as a cache by default, but it's configurable. See
the embedded documentation.
This is *not* like ccache in that we won't cache failed spatch
invocations, or those where spatch suggests changes for us. Those
cases are so rare that I didn't think it was worth the bother, by far
the most common case is that it has no suggested changes. We'll also
refuse to cache any "spatch" invocation that has output on stderr,
which means that "--very-quiet" must be added to "SPATCH_FLAGS".
Because we narrow the cache to that case we don't need to save away stdout,
stderr & the exit code. We simply cache the cases where we had no
suggested changes.
Another benchmark is to compare this with the previous
SPATCH_BATCH_SIZE=N, as noted in [1]. Before this (on my 8 core system) running:
make clean; time make contrib/coccinelle/array.cocci.patch SPATCH_BATCH_SIZE=0
Would take 33s, but with the preceding changes running without this
"spatchcache" is slightly slower, or around 35s:
make clean; time make contrib/coccinelle/array.cocci.patch
Now doing the same with SPATCH=contrib/coccinelle/spatchcache will
take around 6s, but we'll need to compile the *.o files first to take
full advantage of it (which can be fast with "ccache"):
make clean; make; time make contrib/coccinelle/array.cocci.patch SPATCH=contrib/coccinelle/spatchcache
1. https://lore.kernel.org/git/YwdRqP1CyUAzCEn2@coredump.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The preceding commits to make the "coccicheck" target incremental made
it slower in some cases. As an optimization let's not have the
many=many mapping of <*.cocci>=<*.[ch]>, but instead concat the
<*.cocci> into an ALL.cocci, and then run one-to-many
ALL.cocci=<*.[ch]>.
A "make coccicheck" is now around 2x as fast as it was on "master",
and around 1.5x as fast as the preceding change to make the run
incremental:
$ git hyperfine -L rev origin/master,HEAD~,HEAD -p 'make clean' 'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' -r 3
Benchmark 1: make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'origin/master
Time (mean ± σ): 4.258 s ± 0.015 s [User: 27.432 s, System: 1.532 s]
Range (min … max): 4.241 s … 4.268 s 3 runs
Benchmark 2: make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD~
Time (mean ± σ): 5.365 s ± 0.079 s [User: 36.899 s, System: 1.810 s]
Range (min … max): 5.281 s … 5.436 s 3 runs
Benchmark 3: make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD
Time (mean ± σ): 2.725 s ± 0.063 s [User: 14.796 s, System: 0.233 s]
Range (min … max): 2.667 s … 2.792 s 3 runs
Summary
'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD' ran
1.56 ± 0.04 times faster than 'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'origin/master'
1.97 ± 0.05 times faster than 'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD~'
This can be turned off with SPATCH_CONCAT_COCCI, but as the
beneficiaries of "SPATCH_CONCAT_COCCI=" would mainly be those
developing the *.cocci rules themselves, let's leave this optimization
on by default.
For more information see my "Optimizing *.cocci rules by concat'ing
them" (<220901.8635dbjfko.gmgdl@evledraar.gmail.com>) on the
cocci@inria.fr mailing list.
This potentially changes the results of our *.cocci rules, but as
noted in that discussion it should be safe for our use. We don't name
rules, or if we do their names don't conflict across our *.cocci
files.
To the extent that we'd have any inter-dependencies between rules this
doesn't make that worse, as we'd have them now if we ran "make
coccicheck", applied the results, and would then have (due to
hypothetical interdependencies) suggested changes on the subsequent
"make coccicheck".
Our "coccicheck-test" target makes use of the ALL.cocci when running
tests, e.g. when testing unused.{c,out} we test it against ALL.cocci,
not unused.cocci. We thus assert (to the extent that we have test
coverage) that this concatenation doesn't change the expected results
of running these rules.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The <id> in the <rulename> part of the coccinelle syntax[1] is, for our
purposes, there to declare whether we have inter-dependencies between
different rules.
But such <id>'s must be unique within a given semantic patch file. As
we'll be processing a concatenated version of our rules in the
subsequent commit let's remove these names. They weren't being used
for the semantic patches themselves, and equated to a short comment
about the rule.
Both the filename and context of the rules makes it clear what they're
doing, so we're not gaining anything from keeping these. Retaining
them goes against recommendations that "contrib/coccinelle/README"
will be making in the subsequent commit.
This leaves only one named rule in our sources, where it's needed for
a "<id> <-> <extends> <id>" relationship:
$ git -P grep '^@ ' -- contrib/coccinelle/
contrib/coccinelle/swap.cocci:@ swap @
contrib/coccinelle/swap.cocci:@ extends swap @
1. https://coccinelle.gitlabpages.inria.fr/website/docs/main_grammar.html
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Change the "coccinelle" rule so that we first copy the *.cocci source
in e.g. "contrib/coccinelle/strbuf.cocci" to
".build/contrib/coccinelle/strbuf.cocci" before operating on it.
For now this serves as a rather pointless indirection, but prepares us
for the subsequent commit where we'll be able to inject generated
*.cocci files. Having the entire dependency tree live inside .build/*
simplifies both the globbing we'd need to do, and any "clean" rules.
It will also help for future targets which will want to act on the
generated patches or the logs, e.g. targets to alert if we can't parse
certain files (or can parse fewer of them than usual) with "spatch",
and e.g. a replacement for "ci/run-static-analysis.sh". Such a
replacement won't care about placing the patches in-tree, only about
whether they're "OK" (and about the diff).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Improve the incremental rebuilding support of "coccicheck" by
piggy-backing on the computed dependency information of the
corresponding *.o file, rather than rebuilding all <RULE>/<FILE> pairs
if either their corresponding file changes, or if any header changes.
This in effect uses the same method that the "sparse" target was made
to use in c234e8a0ec (Makefile: make the "sparse" target non-.PHONY,
2021-09-23), except that the dependency on the *.o file isn't a hard
one: we check with $(wildcard) whether the *.o file exists, and if so
we'll depend on it.
This means that the common case of:
make
make coccicheck
Will benefit from incremental rebuilding: now changing e.g. a header
will only re-run "spatch" on those *.c files that make use of it.
By depending on the *.o we piggy-back on
COMPUTE_HEADER_DEPENDENCIES. See c234e8a0ec (Makefile: make the
"sparse" target non-.PHONY, 2021-09-23) for prior art of doing that
for the *.sp files. E.g.:
make contrib/coccinelle/free.cocci.patch
make -W column.h contrib/coccinelle/free.cocci.patch
Will take around 15 seconds for the second command on my 8 core box if
I didn't run "make" beforehand to create the *.o files. But around 2
seconds if I did and we have those "*.o" files.
Notes about the approach of piggy-backing on *.o for dependencies:
* It *is* a trade-off since we'll pay the extra cost of running the C
compiler, but we're probably doing that anyway. The compiler is much
faster than "spatch", so even though we need to re-compile the *.o to
create the dependency info for the *.c for "spatch" it's
faster (especially if using "ccache").
* There *are* use-cases where some would like to have *.o files
around, but to have the "make coccicheck" ignore them. See:
https://lore.kernel.org/git/20220826104312.GJ1735@szeder.dev/
For those users a:
make
make coccicheck SPATCH_USE_O_DEPENDENCIES=
Will avoid considering the *.o files.
* If that *.o file doesn't exist we'll depend on an intermediate file
of ours which in turn depends on $(FOUND_H_SOURCES).
This covers both an initial build, or where "coccicheck" is run
without running "all" beforehand, and because we run "coccicheck"
on e.g. files in compat/* that we don't know how to build unless
the requisite flag was provided to the Makefile.
Most of the runtime of "incremental" runs is now spent on various
compat/* files, i.e. we conditionally add files to COMPAT_OBJS, and
therefore conflate whether we *can* compile an object and generate
dependency information for it with whether we'd like to link it
into our binary.
Before this change the distinction didn't matter, but now one way
to make this even faster on incremental builds would be to peel
those concerns apart so that we can see that e.g. compat/mmap.c
doesn't depend on column.h.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Optimize the very slow "coccicheck" target to take advantage of
incremental rebuilding, and fix outstanding dependency problems with
the existing rule.
The rule is now faster both on the initial run as we can make better
use of GNU make's parallelism than the old ad-hoc combination of
make's parallelism with $(SPATCH_BATCH_SIZE) and/or the
"--jobs" argument to "spatch(1)".
It also makes us *much* faster when incrementally building; it's now
viable to "make coccicheck" as topic branches are merged down.
The rule didn't use FORCE (or its equivalents) before, so a:
make coccicheck
make coccicheck
Would report nothing to do on the second iteration. But all of our
patch output depended on all $(COCCI_SOURCES) files, therefore e.g.:
make -W grep.c coccicheck
Would do a full re-run, i.e. a change in a single file would force
us to do a full re-run.
The reason for this (not the initial rationale, but my analysis) is:
* Since we create a single "*.cocci.patch+" we don't know where to
pick up where we left off, or how to incrementally merge e.g. a
"grep.c" change with an existing *.cocci.patch.
* We've been carrying forward the dependency on the *.c files since
63f0a758a0 (add coccicheck make target, 2016-09-15), when the rule
was initially added, as a sort of poor man's dependency discovery.
As we don't include other *.c files, depending on other *.c files
has always been broken, as could be trivially demonstrated
e.g. with:
make coccicheck
make -W strbuf.h coccicheck
However, depending on the corresponding *.c files has been doing
something, namely that *if* an API change modified both *.c and *.h
files we'd catch the change to the *.h we care about via the *.c
being changed.
For API changes that happened only via *.h files we'd do the wrong
thing before this change, but e.g. for function additions (not
"static inline" ones) we'd catch the *.h change by proxy.
Now we'll instead:
* Create a <RULE>/<FILE> pair in the .build directory, e.g. for
swap.cocci and grep.c we'll create
.build/contrib/coccinelle/swap.cocci.patch/grep.c.
That file is the diff we'll apply for that <RULE>-<FILE>
combination; if there are no changes to be made (the common case)
it'll be an empty file.
* Our generated *.patch file (e.g. contrib/coccinelle/swap.cocci.patch)
is now a simple "cat $^" of all of the <RULE>/<FILE> files for a
given <RULE>.
In the case discussed above of "grep.c" being changed we'll do the
full "cat" every time, so the resulting *.cocci.patch will always
be correct and up-to-date, even if it's "incrementally updated".
See 1cc0425a27 (Makefile: have "make pot" not "reset --hard",
2022-05-26) for another recent rule that used that technique.
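In shell terms the aggregation for e.g. swap.cocci amounts to roughly
this (a sketch of what the rule does, not the rule itself):
    cat .build/contrib/coccinelle/swap.cocci.patch/* >contrib/coccinelle/swap.cocci.patch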
As before we'll:
* End up generating a contrib/coccinelle/swap.cocci.patch, if we
"fail" by creating a non-empty patch we'll still exit with a zero
exit code.
Arguably we should move to a more Makefile-native way of doing
this, i.e. fail early, and if we want all of the "failed" changes
we can use "make -k", but as the current
"ci/run-static-analysis.sh" expects us to behave this way let's
keep the existing behavior of exhaustively discovering all cocci
changes, and only failing if spatch itself errors out.
Further implementation details & notes:
* Before this change running "make coccicheck" would by default end
up pegging just one CPU at the very end for a while, usually as
we'd finish whichever *.cocci rule was the most expensive.
This could be mitigated by combining "make -jN" with
SPATCH_BATCH_SIZE, see 960154b9c1 (coccicheck: optionally batch
spatch invocations, 2019-05-06).
There will be cases where getting rid of "SPATCH_BATCH_SIZE" makes
things worse, but a from-scratch "make coccicheck" with the default
of SPATCH_BATCH_SIZE=1 (and tweaking it doesn't make a difference)
is faster (~3m36s vs. ~3m56s) with this approach, as we can feed
the CPU more work in a less staggered way.
* Getting rid of "SPATCH_BATCH_SIZE" particularly helps in cases
where the default of 1 yields parallelism under "make coccicheck",
but then running e.g.:
make -W contrib/coccinelle/swap.cocci coccicheck
I.e. before this change that would use only one CPU core, until the user
remembered to adjust "SPATCH_BATCH_SIZE" differently than the
setting that makes sense when doing a non-incremental run of "make
coccicheck".
* Before the "make coccicheck" rule would have to clean
"contrib/coccinelle/*.cocci.patch*", since we'd create "*+" and
"*.log" files there. Now those are created in
.build/contrib/coccinelle/, which is covered by the "cocciclean" rule
already.
Outstanding issues & future work:
* We could get rid of "--all-includes" in favor of manually
specifying a list of includes to give to "spatch(1)".
As noted upthread of [1] a naïve removal of "--all-includes" will
result in broken *.cocci patches, but if we know the exhaustive
list of includes via COMPUTE_HEADER_DEPENDENCIES we don't need to
re-scan for them, we could grab the headers to include from the
.depend.d/<file>.o.d and supply them with the "--include" option to
spatch(1).
1. https://lore.kernel.org/git/87ft18tcog.fsf@evledraar.gmail.com/
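As a rough sketch of that idea, the headers recorded in an existing
dependency file could be listed with something like (file name per the
.depend.d/<file>.o.d convention noted above):
    tr -s ' \\' '\n' <.depend.d/grep.o.d | grep '\.h$' | sort -u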
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Per the rationale in 7b63ea5750 (Makefile: remove mandatory "spatch"
arguments from SPATCH_FLAGS, 2022-07-05) we have certain flags that
are truly mandatory, such as "--sp-file" and "--patch .". The
"--all-includes" flag is also critical, but per [1] we might want to
ad-hoc tweak it occasionally for testing or one-offs.
But being unable to set e.g. SPATCH_FLAGS="--verbose-parsing" without
breaking how our "spatch" works isn't ideal, i.e. before this we'd
need to know about the default include flags, and specify:
SPATCH_FLAGS="--all-includes --verbose-parsing".
If we were then to change the default include flag (e.g. to
"--recursive-includes") in the future any such one-off commands would
need to be correspondingly updated.
Let's instead leave the SPATCH_FLAGS for the user, while creating a
new SPATCH_INCLUDE_FLAGS to allow for ad-hoc testing of the include
strategy itself.
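With that split a one-off run no longer needs to restate the include
strategy, e.g. (both flags here are the ones discussed above):
    make coccicheck SPATCH_FLAGS=--verbose-parsing
    make coccicheck SPATCH_INCLUDE_FLAGS=--recursive-includes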
1. https://lore.kernel.org/git/20220823095733.58685-1-szeder.dev@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Amend the "coccicheck-test" rule added in f7ff6597a7 (cocci: add a
"coccicheck-test" target and test *.cocci rules, 2022-07-05) to stop
using "--all-includes". The flags we'll need for the tests are
different than the ones we'll need for our main source code.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Split off the "; setting[...]" part of the comment added in
960154b9c1 (coccicheck: optionally batch spatch invocations,
2019-05-06), and restore what we had before that, which was a comment
indicating that variables for the "coccicheck" target were being set
here.
When 960154b9c1 amended the heading to discuss SPATCH_BATCH_SIZE it
left no natural place to add a new comment about other flags that
preceded it. As subsequent commits will add such comments we need to
split the existing comment up.
The wrapping for the "SPATCH_BATCH_SIZE" is now a bit odd, but
minimizes the diff size. As a subsequent commit will remove that
feature altogether this is worth it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix an issue with the "coccicheck" family of rules that's been here
since 63f0a758a0 (add coccicheck make target, 2016-09-15), unlike
e.g. "make grep.o" we wouldn't re-run it when $(SPATCH) or
$(SPATCH_FLAGS) changed. To test new flags we needed to first do a
"make cocciclean".
This now uses the same (copy/pasted) pattern as other "DEFINES"
rules. As a result we'll re-run properly. This can be demonstrated
e.g. on the issue noted in [1]:
$ make contrib/coccinelle/xcalloc.cocci.patch COCCI_SOURCES=promisor-remote.c V=1
[...]
SPATCH contrib/coccinelle/xcalloc.cocci
$ make contrib/coccinelle/xcalloc.cocci.patch COCCI_SOURCES=promisor-remote.c SPATCH_FLAGS="--all-includes --recursive-includes"
* new spatch flags
SPATCH contrib/coccinelle/xcalloc.cocci
SPATCH result: contrib/coccinelle/xcalloc.cocci.patch
$
1. https://lore.kernel.org/git/20220823095602.GC1735@szeder.dev/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Declare the contrib/coccinelle/<rule>.cocci.patch rules in such a way
as to allow TAB-completion, and slightly optimize the Makefile by
cutting down on the number of $(wildcard) in favor of defining
"coccicheck" and "coccicheck-pending" in terms of the same
incrementally filtered list.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix an issue with a rule added in 9b45f49981 (object-store: prepare
has_{sha1, object}_file to handle any repo, 2018-11-13). We've been
spewing out this warning into our $@.log since that rule was added:
warning: rule starting on line 21: metavariable F not used in the - or context code
We should do a better job of scouring our coccinelle log files for
such issues, but for now let's fix this as a one-off.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In f7ff6597a7 (cocci: add a "coccicheck-test" target and test *.cocci
rules, 2022-07-05) we abbreviated "_TEST" to "_T" to have it align
with the rest of the "="'s above it.
Subsequent commits will add more QUIET_SPATCH_* variables, so let's
stop abbreviating this, and indent it in preparation for adding more
of these variables.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Use diff_free_queue() instead of open-coding it. This shortens the
code and makes it less repetitive.
Note that the second hunk in diff_flush() is interesting, because the
'free_queue' label separates the loop freeing the queue's filepairs
from free()-ing the queue's internal array. This is somewhat
suspicious, but it was not an issue before: there is only one place
from where we jump to this label with a goto, and that is protected by
an 'if (!q->nr && ...)' condition, i.e. we only skipped the loop
freeing the filepairs when there were no filepairs in the queue to
begin with.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When processing merge commits, the line-level log first creates an
array of diff queues, each comparing the merge commit with one of its
parents, to check whether any of the files in the given line ranges
were modified. Alas, when freeing these queues it only frees the
filepairs in the queues, but not the queues' internal arrays holding
pointers to those filepairs.
Use the diff_free_queue() helper function introduced in the previous
commit to free the diff queues' internal arrays as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When processing a non-merge commit, the line-level log first asks the
tree-diff machinery whether any of the files in the given line ranges
were modified between the current commit and its parent, and if some
of them were, then it loads the contents of those files from both
commits to see whether their line ranges were modified and/or need to
be adjusted. Alas, it doesn't free() the diff queue holding the
results of that query and the contents of those files once it's done.
This can add up to a substantial amount of leaked memory, especially
when the file in question is big and is frequently modified: a user
reported "Out of memory, malloc failed" errors with a 2MB text file
that was modified ~2800 times [1] (I estimate the leak would use up
almost 11GB memory in that case).
Free that diff queue to plug this memory leak. However, instead of
simply open-coding the necessary three lines, add them as a helper
function to the diff API, because it will be useful elsewhere as well.
[1] https://public-inbox.org/git/CAFOPqVXz2XwzX8vGU7wLuqb2ZuwTuOFAzBLRM_QPk+NJa=eC-g@mail.gmail.com/
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
It is unclear as to _why_, but under certain circumstances the warning
about credentials being passed as part of the URL seems to be swallowed
by the `git remote-https` helper in the Windows jobs of Git's CI builds.
Since it is not actually important how many times Git prints the
warning/error message, as long as it prints it at least once, let's just
make the test a bit more lenient and test for the latter instead of the
former, which works around these CI issues.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Commit 6dcbdc0d66 (remote: create fetch.credentialsInUrl config,
2022-06-06) added tests for our handling of passwords in URLs. Since the
obvious URL to be affected is git-over-http, the tests use http. However
they don't set up a test server; they just try to access
https://localhost, assuming it will fail (because nothing is
listening there).
This causes some possible problems:
- There might be a web server running on localhost, and we do not
actually want to connect to that.
- The DNS resolver, or the local firewall, might take a substantial
amount of time (or forever, whichever comes first) to fail to
connect, slowing down the test cases unnecessarily.
- Since there's no server, our tests for "allow" and "warn" still
expect the clone/fetch/push operations to fail, even though in the
real world we'd expect these to succeed. We scrape stderr to see
what happened, but it's not as robust as a more realistic test.
Let's instead move these to t5551, which is all about testing http and
where we have a real server. That eliminates any issues with contacting
a strange URL, and lets the "allow" and "warn" tests confirm that the
operation actually succeeds.
It's not quite a verbatim move for a few reasons:
- we can drop the LIBCURL dependency; it's already part of
lib-httpd.sh
- we'll use HTTPD_URL_USER_PASS, etc, instead of our fake URL. To
avoid repetition, we'll add a few extra variables.
- the "https://username:@localhost" test uses a funny URL that
lib-httpd.sh doesn't provide. We'll similarly construct it in a
variable. Note that we're hard-coding the lib-httpd username here,
but t5551 already does that everywhere.
- for the "domain:port" test, the URL provided by lib-httpd is fine,
since our test server will always be on an exotic port. But we'll
confirm in the test that this is so.
- since our message-matching is done via grep, I simplified it to use
a regex, rather than trying to massage lib-httpd's variables.
Arguably this makes it more readable, too, while retaining the bits
we care about: the fatal/warning distinction, the "uses plaintext"
message, and the fact that the password was redacted.
- we'll use the /auth/ path for the repo, which shows that we are
indeed making use of the auth information when needed.
- we'll also use /smart/; most of these tests could be done via /dumb/
in t5550, but setting up pushes there requires extra effort and
dependencies. The smart protocol is what most everyone is using
these days anyway.
This patch is my own, but I stole the analysis and a few bits of the
commit message from a patch by Johannes Schindelin.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
`test_path_is_missing` was introduced back in 2caf20c52b ("test-lib:
user-friendly alternatives to test [-d|-f|-e]", 2010-08-10). It took the
path that was supposed to be missing, as well as an optional "diagnosis"
that would be echoed if the path was found to be alive.
Commit 45a2686441 ("test-lib-functions: remove bug-inducing
"diagnostics" helper param", 2021-02-12) dropped this diagnostic
functionality from several `test_path_is_foo` helpers, but note how it
tweaked the README entry on `test_path_is_missing` without actually
adjusting its implementation.
Commit e7884b353b ("test-lib-functions: assert correct parameter count",
2021-02-12) then followed up by asserting that we get just a single
argument.
This history leaves us in a state where we assert that we have exactly
one argument, then go on to check for extra arguments anyway, echoing
them all. It's clear that we can simplify this code. We should also
note that
we run `ls -ld "$1"`, so printing the filename a second time doesn't
really buy us anything. Thus, we can drop the whole `if` block as
redundant.
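The resulting helper then looks roughly like this (a sketch of the
simplified shape; see test-lib-functions.sh for the real definition):
    test_path_is_missing () {
        test "$#" -ne 1 && BUG "1 param"
        if test -e "$1"
        then
            echo "Path exists:"
            ls -ld "$1"
            false
        fi
    }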
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In a similar spirit as the previous commit, the 'seen' branch gets
rebuilt by reintegrating topics between 'jch' and the (old) tip of
'seen'.
Update the instructions on how to generate Meta/redo-seen.sh for the
first time to reflect this.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Rebuilding the 'jch' branch begins by reintegrating any topics between
'master' and 'jch', not 'master' and 'seen'.
In the maintainer guide, the documentation isn't quite right, since the
initial input to Meta/Reintegrate is "master..seen", not "master..jch".
This can lead to confusing results when generating the Meta/redo-jch.sh
script for the first time.
Additionally, rebuilding 'jch' takes place in two steps. First, running
the script up to the first "### match next" cut-line, and then comparing
the result with what's on 'next' (i.e. with "git diff jch next"). Then,
the remaining set of topics get merged down to 'jch' (which aren't on
'next') by running the entire "redo-jch.sh" script.
Clarify the documentation to reflect this.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Extend the tests added in c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) to test stdout in
addition to stderr.
When the "ungroup" feature was added in fd3aaf53f7 (run-command: add
an "ungroup" option to run_process_parallel(), 2022-06-07) its tests
were made to test both the stdout and stderr, but these existing tests
were left alone. Let's also exhaustively test our expected output
here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix test patterns added in 62104ba14a (submodules: allow parallel
fetching, add tests and documentation, 2015-12-15) and
a028a1930c (fetching submodules: respect `submodule.fetchJobs` config
option, 2016-02-29).
In the former case we were leaving a trace.out file at the top-level
for any subsequent tests (there are none, currently). Let's clean the
file up instead.
In the latter case we were testing that a given configuration would
result in "N tasks" in the log, but we were grepping through the log
for all previous such tests, when we really meant to clear the logs
between the "grep" invocations.
In practice this resulted in no logic error, as e.g. "--fetch 7" would
not print out a "9 tasks" line, but let's be paranoid and stop
implicitly assuming that that's the case.
This change was originally left out of 51243f9f0f (run-command API:
don't fall back on online_cpus(), 2022-10-12), which added the
">trace.out" seen at the end of the context.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The tests added in 96e7225b31 (hook: add 'run' subcommand,
2021-12-22) were redirecting to "actual" both in the body of the hook
itself and in the testing code below.
The net result was that the "2>>actual" redirection later in the test
wasn't doing anything. Let's have those redirections do what it looks
like they're doing.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Rewrite a deep recursion in the skipping negotiator to use a loop
with on-heap prio queue to avoid stack wastage.
* jt/skipping-negotiator-wo-recursion:
negotiator/skipping: avoid stack overflow
"git merge-tree --stdin" is a new way to request a series of merges
and report the merge results.
* en/merge-tree-sequence:
merge-tree: support multiple batched merges with --stdin
merge-tree: update documentation for differences in -z output
Define the logical elements of a "bundle list", data structure to
store them in-core, format to transfer them, and code to parse
them.
* ds/bundle-uri-3:
bundle-uri: suppress stderr from remote-https
bundle-uri: quiet failed unbundlings
bundle: add flags to verify_bundle()
bundle-uri: fetch a list of bundles
bundle: properly clear all revision flags
bundle-uri: limit recursion depth for bundle lists
bundle-uri: parse bundle list in config format
bundle-uri: unit test "key=value" parsing
bundle-uri: create "key=value" line parsing
bundle-uri: create base key-value pair parsing
bundle-uri: create bundle_list struct and helpers
bundle-uri: use plain string in find_temp_filename()
"git branch --edit-description" can exit with status -1 which is
not a good practice; it learned to use 1 as everybody else instead.
* rj/branch-do-not-exit-with-minus-one-status:
branch: error code with --edit-description
The role the security mailing list plays in an embargoed release
has been documented.
* jr/embargoed-releases-doc:
embargoed releases: also describe the git-security list and the process
Merging a branch with directory renames into a branch that changes
the directory to a symlink was mishandled by the ort merge
strategy, which has been corrected.
* en/ort-dir-rename-and-symlink-fix:
merge-ort: fix bug with dir rename vs change dir to symlink
A bugfix to "git subtree" in its split and merge features.
* pb/subtree-split-and-merge-after-squashing-tag-fix:
subtree: fix split after annotated tag was squashed merged
subtree: fix squash merging after annotated tag was squashed merged
subtree: process 'git-subtree-split' trailer in separate function
subtree: use named variables instead of "$@" in cmd_pull
subtree: define a variable before its first use in 'find_latest_squash'
subtree: prefix die messages with 'fatal'
subtree: add 'die_incompatible_opt' function to reduce duplication
subtree: use 'git rev-parse --verify [--quiet]' for better error messages
test-lib-functions: mark 'test_commit' variables as 'local'
Fix some bugs in the reflog messages when rebasing and change the
reflog messages of "rebase --apply" to match "rebase --merge" with
the aim of making the reflog easier to parse.
* pw/rebase-reflog-fixes:
rebase: cleanup action handling
rebase --abort: improve reflog message
rebase --apply: make reflog messages match rebase --merge
rebase --apply: respect GIT_REFLOG_ACTION
rebase --merge: fix reflog message after skipping
rebase --merge: fix reflog when continuing
t3406: rework rebase reflog tests
rebase --apply: remove duplicated code
"git rebase --keep-base" used to discard the commits that are
already cherry-picked to the upstream, even when "keep-base" meant
that the base, on top of which the history is being rebuilt, does
not yet include these cherry-picked commits. The --keep-base
option now implies --reapply-cherry-picks and --no-fork-point
options.
* pw/rebase-keep-base-fixes:
rebase --keep-base: imply --no-fork-point
rebase --keep-base: imply --reapply-cherry-picks
rebase: factor out branch_base calculation
rebase: rename merge_base to branch_base
rebase: store orig_head as a commit
rebase: be stricter when reading state files containing oids
t3416: set $EDITOR in subshell
t3416: tighten two tests
Two new facilities, "timer" and "counter", are introduced to the
trace2 API.
* jh/trace2-timers-and-counters:
trace2: add global counter mechanism
trace2: add stopwatch timers
trace2: convert ctx.thread_name from strbuf to pointer
trace2: improve thread-name documentation in the thread-context
trace2: rename the thread_name argument to trace2_thread_start
api-trace2.txt: elminate section describing the public trace2 API
tr2tls: clarify TLS terminology
trace2: use size_t alloc,nr_open_regions in tr2tls_thread_ctx
"git shortlog" learned to group by the "format" string.
* tb/shortlog-group:
shortlog: implement `--group=committer` in terms of `--group=<format>`
shortlog: implement `--group=author` in terms of `--group=<format>`
shortlog: extract `shortlog_finish_setup()`
shortlog: support arbitrary commit format `--group`s
shortlog: extract `--group` fragment for translation
shortlog: make trailer insertion a noop when appropriate
shortlog: accept `--date`-related options
Code simplification by using strvec_pushf() instead of building an
argument in a separate strbuf.
* rs/absorb-git-dir-simplify:
submodule: use strvec_pushf() for --super-prefix
The way "git repack" creared temporary files when it received a
signal was prone to deadlocking, which has been corrected.
* jk/repack-tempfile-cleanup:
t7700: annotate cruft-pack failure with ok=sigpipe
repack: drop remove_temporary_files()
repack: use tempfiles for signal cleanup
repack: expand error message for missing pack files
repack: populate extension bits incrementally
repack: convert "names" util bitfield to array
Make sure generated dependency file is stably sorted to help
developers debugging their build issues.
* sg/stable-docdep:
Documentation/build-docdep.perl: generate sorted output
A new "--include-whitespace" option is added to "git patch-id", and
existing bugs in the internal patch-id logic that did not match
what "git patch-id" produces have been corrected.
* jz/patch-id:
builtin: patch-id: remove unused diff-tree prefix
builtin: patch-id: add --verbatim as a command mode
patch-id: fix patch-id for mode changes
builtin: patch-id: fix patch-id with binary diffs
patch-id: use stable patch-id for rebases
patch-id: fix stable patch id for binary / header-only
Git has an additional "commit graph" capability that supplements the
normal commit object's directed acyclic graph (DAG). The supplemental
commit graph file is designed for speed of access.
Describe the commit graph both from the normative DAG view point and
from the commit graph file perspective.
Also, clarify the link between the branch ref and branch tip
by linking to the `ref` glossary entry, matching this commit graph
entry.
The commit-graph file is also distinguished by its hyphenation.
A subsequent commit catches the few cases where the hyphenation of
commit-graph was missing.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The abbreviation 'ODB' is used in the technical documentation
sections for commit-graph and parallel-checkout, along with an
'odb' option in `git-pack-redundant`, without expansion.
Use 'object database' in full, in those entries. The text has not
been reflowed to keep the changes minimal.
While in the glossary for `object` terms, add the common `oid`
abbreviation to its entry.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Note, historical release notes have not been updated.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
A missing tar filter is reported by start_command() using error(), but
also by its caller, write_tar_filter_archive(), using die():
$ git -c tar.invalid.command=foo archive --format=invalid HEAD
error: cannot run foo: No such file or directory
fatal: unable to start 'foo' filter: No such file or directory
The second message contains all relevant information and even says that
the failed command was intended to be used as a filter. Silence the
first one because it's redundant.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Replace the remaining calls of run_command_v_opt() with run_command()
calls and explicit struct child_process variables. This is more verbose,
but not by much overall. The code becomes more flexible, e.g. it's easy
to extend to conditionally add a new argument.
Then remove the now unused function and its own flag names, simplifying
the run-command API.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The convenience function run_command_v_opt_cd_env_tr2() has no external
callers left. Inline it and remove it from the API.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The convenience function run_command_v_opt_tr2() is only used by a
single caller. Use struct child_process and run_command() directly
instead and remove the underused function.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
run_command_v_opt_cd_env() is only used in an example in a comment. Use
the struct child_process member "env" and run_command() directly instead
and then remove the unused convenience function.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Build argument list and environment of child processes by using
struct child_process and populating its members "args" and "env"
directly instead of maintaining separate strvecs and letting
run_command_v_opt() and friends populate these members. This is
simpler, shorter and slightly more efficient.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Use run_command() with a struct child_process variable and populate its
"args" member directly instead of building a string array and passing it
to run_command_v_opt(). This avoids the use of magic index numbers and
simplifies the possible addition of more arguments in the future.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Build child_argv during initialization, taking advantage of the C99
support for initialization expressions that are not compile time
constants. This avoids the use of a magic index constant and is shorter
and simpler.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Deduplicate the code for reporting and starting the bisect run command
by moving it to a short helper function. Use a string array instead of
a strvec to prepare the arguments, for simplicity.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reduce the scope of argv_checkout, which allows us to fully build it during
initialization. Use oid_to_hex() instead of oid_to_hex_r(), because
that's simpler and using the static buffer of the former is just as safe
as the old static argv_checkout.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Build the string array av during initialization, without any magic
numbers or heap allocations. Not duplicating the result of oid_to_hex()
is safe because run_command_v_opt() duplicates all arguments already.
(It would even be safe if it didn't, but that's a different story.)
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
483bbd4e4c (run-command: introduce child_process_init(), 2014-08-19) and
2d71608ec0 (run-command: factor out child_process_clear(), 2015-10-24)
added help texts about child_process_init() and child_process_clear()
without updating the immediately following documentation of return codes
that only applied to the preexisting functions.
4c4066d95d (run-command: move doc to run-command.h, 2019-11-17) started
to list the functions explicitly that this paragraph applies to, but
still wrongly included child_process_init() and child_process_clear().
Remove their names from that list.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Simplify the code that builds the arguments for the "read-tree"
invocation in reset_hard() and read_empty() to remove the "verbose"
parameter.
Before 172b6428d0 (do not overwrite untracked during merge from
unborn branch, 2010-11-14) there was a "reset_hard()" function that
would be called in two places, one of those passed a "verbose=1", the
other a "verbose=0".
After 172b6428d0 when read_empty() was split off from reset_hard()
both of these functions only had one caller. The "verbose" in
read_empty() would always be false, and the one in reset_hard() would
always be true.
There was never a good reason for the code to act this way, it
happened because the read_empty() function was a copy/pasted and
adjusted version of reset_hard().
Since we're no longer conditionally adding the "-v" parameter
here (and we'd only add it for "reset_hard()") we'll be able to move to
a simpler and safer run-command API in the subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Julien Moutinho reports that in an environment where directories do
not have BSD group semantics and require the g+s bit to be set (aka
FORCE_DIR_SET_GID), but the system forbids chmod() to touch the g+s
bit, adjust_shared_perm() fails even when the repository is for
private use with perm = 0600, because we unconditionally try to set
the g+s bit.
When we grant extra access based on group membership (i.e. the
directory has either g+r or g+w bit set), which group the directory
and its contents are owned by matters. But otherwise (e.g. perm is
set to 0600, in Julien's case), flipping g+s bit is not necessary.
Reported-by: Julien Moutinho <julm+git@sourcephile.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --stat" etc. were invented back when everything was ASCII
and strlen() was a way to measure the display width of a string;
adjust them to compute the display width assuming UTF-8 pathnames.
* tb/diffstat-with-utf8-strwidth:
diff: leave NEEDWORK notes in show_stats() function
diff.c: use utf8_strwidth() to count display width
Fix a longstanding syntax error in Git.pm error codepath.
* mm/git-pm-try-catch-syntax-fix:
Git.pm: trust rev-parse to find bare repositories
Git.pm: add semicolon after catch statement
When creating a multi-pack bitmap, remove per-pack bitmap files
unconditionally as they will never be consulted.
* tb/remove-unused-pack-bitmap:
builtin/repack.c: remove redundant pack-based bitmaps
The short-help text shown by "git cmd -h" and the synopsis text
shown at the beginning of "git help cmd" have been made more
consistent.
* ab/doc-synopsis-and-cmd-usage: (34 commits)
tests: assert consistent whitespace in -h output
tests: start asserting that *.txt SYNOPSIS matches -h output
doc txt & -h consistency: make "worktree" consistent
worktree: define subcommand -h in terms of command -h
reflog doc: list real subcommands up-front
doc txt & -h consistency: make "commit" consistent
doc txt & -h consistency: make "diff-tree" consistent
doc txt & -h consistency: use "[<label>...]" for "zero or more"
doc txt & -h consistency: make "annotate" consistent
doc txt & -h consistency: make "stash" consistent
doc txt & -h consistency: add missing options
doc txt & -h consistency: use "git foo" form, not "git-foo"
doc txt & -h consistency: make "bundle" consistent
doc txt & -h consistency: make "read-tree" consistent
doc txt & -h consistency: make "rerere" consistent
doc txt & -h consistency: add missing options and labels
doc txt & -h consistency: make output order consistent
doc txt & -h consistency: add or fix optional "--" syntax
doc txt & -h consistency: fix mismatching labels
doc SYNOPSIS & -h: use "-" to separate words in labels, not "_"
...
Work around older clang that warns against C99 zero initialization
syntax for struct.
* jh/struct-zero-init-with-older-clang:
config.mak.dev: disable suggest braces error on old clang versions
"git branch --edit-description" on an unborh branch misleadingly
said that no such branch exists, which has been corrected.
* rj/branch-edit-desc-unborn:
branch: description for non-existent branch errors
Clarify that "the sentence after <area>: prefix does not begin with
a capital letter" rule applies only to the commit title.
* jc/use-of-uc-in-log-messages:
SubmittingPatches: use usual capitalization in the log message body
The code to clean temporary object directories (used for
quarantine) tried to remove them inside its signal handler, which
was a no-no.
* jc/tmp-objdir:
tmp-objdir: skip clean up when handling a signal
Update comment in the Makefile about the RUNTIME_PREFIX config knob.
* dd/document-runtime-prefix-better:
Makefile: clarify runtime relative gitexecdir
"git multi-pack-index repack/expire" used to repack unreachable
cruft into a new pack, which has been corrected.
cf. <63a1c3d4-eff3-af10-4263-058c88e74594@github.com>
* tb/midx-repack-ignore-cruft-packs:
midx.c: avoid cruft packs with non-zero `repack --batch-size`
midx.c: remove unnecessary loop condition
midx.c: replace `xcalloc()` with `CALLOC_ARRAY()`
midx.c: avoid cruft packs with `repack --batch-size=0`
midx.c: prevent `expire` from removing the cruft pack
Documentation/git-multi-pack-index.txt: clarify expire behavior
Documentation/git-multi-pack-index.txt: fix typo
Documentation on various Boolean GIT_* environment variables have
been clarified.
* jc/environ-docs:
environ: GIT_INDEX_VERSION affects not just a new repository
environ: simplify description of GIT_INDEX_FILE
environ: GIT_FLUSH should be made a usual Boolean
environ: explain Boolean environment variables
environ: document GIT_SSL_NO_VERIFY
Cherry pick commit d3775de0 (Makefile: force -O0 when compiling with
SANITIZE=leak, 2022-10-18), as otherwise the leak checker at GitHub
Actions CI seems to fail with a false positive.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update to build procedure with VS using CMake/CTest.
* js/cmake-updates:
cmake: increase time-out for a long-running test
cmake: avoid editing t/test-lib.sh
add -p: avoid ambiguous signed/unsigned comparison
cmake: copy the merge tools for testing
cmake: make it easier to diagnose regressions in CTest runs
Move a global variable added as a hack during regression fixes to
its proper place in the API.
* ab/run-hook-api-cleanup:
run-command.c: remove "max_processes", add "const" to signal() handler
run-command.c: pass "opts" further down, and use "opts->processes"
run-command.c: use "opts->processes", not "pp->max_processes"
run-command.c: don't copy "data" to "struct parallel_processes"
run-command.c: don't copy "ungroup" to "struct parallel_processes"
run-command.c: don't copy *_fn to "struct parallel_processes"
run-command.c: make "struct parallel_processes" const if possible
run-command API: move *_tr2() users to "run_processes_parallel()"
run-command API: have run_process_parallel() take an "opts" struct
run-command.c: use designated init for pp_init(), add "const"
run-command API: don't fall back on online_cpus()
run-command API: make "n" parameter a "size_t"
run-command tests: use "return", not "exit"
run-command API: have "run_processes_parallel{,_tr2}()" return void
run-command test helper: use "else if" pattern
When the geometric repacking feature is in use together with the
--pack-kept-objects option, we lost packs marked with .keep files.
* tb/save-keep-pack-during-geometric-repack:
repack: don't remove .keep packs with `--pack-kept-objects`
More UNUSED annotation to help using -Wunused option with the
compiler.
* jk/unused-anno-more:
ll-merge: mark unused parameters in callbacks
diffcore-pickaxe: mark unused parameters in pickaxe functions
convert: mark unused parameter in null stream filter
apply: mark unused parameters in noop error/warning routine
apply: mark unused parameters in handlers
date: mark unused parameters in handler functions
string-list: mark unused callback parameters
object-file: mark unused parameters in hash_unknown functions
mark unused parameters in trivial compat functions
update-index: drop unused argc from do_reupdate()
submodule--helper: drop unused argc from module_list_compute()
diffstat_consume(): assert non-zero length
A bugfix with tracing support in the midx codepath.
* tb/midx-bitmap-selection-fix:
pack-bitmap-write.c: instrument number of reused bitmaps
midx.c: instrument MIDX and bitmap generation with trace2 regions
midx.c: consider annotated tags during bitmap selection
midx.c: fix whitespace typo
We are interested in exploring whether gc.cruftPacks=true should become
the default value.
To determine whether it is safe to do so, let's encourage more users to
try it out.
Users who have set feature.experimental=true have already volunteered to
try new and possibly-breaking config changes, so let's try this new
default with that set of users.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 5b92477f89 (builtin/gc.c: conditionally avoid pruning objects via
loose, 2022-05-20) gc learned to respect '--cruft' and 'gc.cruftPacks'.
'--cruft' is exercised in t5329-pack-objects-cruft.sh, but in a way that
doesn't check whether a lone gc run generates these cruft packs.
'gc.cruftPacks' is never exercised.
Add some tests to exercise these options to gc in the gc test suite.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Meta/redo-jch.sh script is generated a few lines earlier by running:
$ Meta/Reintegrate master..seen >Meta/redo-jch.sh
But the resulting script is not necessarily executable. Later mentions
of this script invoke it with sh (instead of directly), but this one is
an odd one out.
Update the documentation to invoke the Meta/redo-jch.sh script with sh
in case the maintainer has not made the script executable.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since c2d17ba3db (branch --edit-description: protect against mistyped
branch name, 2012-02-05) we return -1 on error editing the branch
description.
Let's change it to 1, which follows the established convention and is
better for portability reasons.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c847f53712 (Detached HEAD (experimental), 2007-01-01) an error
condition was introduced in rename_branch() to prevent renaming, later
also copying, a detached HEAD.
The condition used was checking for NULL in oldname, the source branch
to rename/copy. That condition cannot be satisfied because if no source
branch is specified, HEAD is going to be used in the call.
The error issued instead is:
fatal: Invalid branch name: 'HEAD'
Let's remove the condition in copy_or_rename_branch() (the current
function name) and check for HEAD before calling it, dying with the
originally intended error if we're on a detached HEAD.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
mark_common() in negotiator/skipping.c may overflow the stack due to
recursive function calls. Avoid this by instead recursing using a
heap-allocated data structure.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Give a bit more diversity to macOS CI by using sha1dc in one of the
jobs (the other one tests Apple Common Crypto).
* jc/ci-osx-with-sha1dc:
ci: use DC_SHA1=YesPlease on osx-clang job for CI
Allow configuration files in "protected" scopes to include other
configuration files.
* gc/bare-repo-discovery:
config: respect includes in protected config
"git diff rev^!" did not show combined diff to go to the rev from
its parents.
* rs/diff-caret-bang-with-parents:
diff: support ^! for merges
revisions.txt: unspecify order of resolved parts of ^!
revision: use strtol_i() for exclude_parent
Code clean-up.
* jk/cleanup-callback-parameters:
attr: drop DEBUG_ATTR code
commit: avoid writing to global in option callback
multi-pack-index: avoid writing to global in option callback
test-submodule: inline resolve_relative_url() function
"GIT_EDITOR=: git branch --edit-description" resulted in failure,
which has been corrected.
* jc/branch-description-unset:
branch: do not fail a no-op --edit-desc
The codepath to sign learned to report errors when it fails to read
from "ssh-keygen".
* pw/ssh-sign-report-errors:
ssh signing: return an error when signature cannot be read
Fix logic in "mailinfo -b" that miscomputed the length of a
substring, which led to an out-of-bounds access.
* pw/mailinfo-b-fix:
mailinfo -b: fix an out of bounds access
Force C locale while running tests around httpd to make sure we can
find expected error messages in the log.
* rs/test-httpd-in-C-locale:
t/lib-httpd: pass LANG and LC_ALL to Apache
In read-only repositories, "git merge-tree" tried to come up with a
merge result tree object, which it failed (which is not wrong) and
led to a segfault (which is bad), which has been corrected.
* js/merge-ort-in-read-only-repo:
merge-ort: return early when failing to write a blob
merge-ort: fix segmentation fault in read-only repositories
"git rebase -i" can mistakenly attempt to apply a fixup to a commit
itself, which has been corrected.
* ja/rebase-i-avoid-amending-self:
sequencer: avoid dropping fixup commit that targets self via commit-ish
"git fsck" failed to release contents of tree objects already used
from the memory, which has been fixed.
* jk/fsck-on-diet:
parse_object_buffer(): respect save_commit_buffer
fsck: turn off save_commit_buffer
fsck: free tree buffers after walking unreachable objects
"git clone" did not like to see the "--bare" and the "--origin"
options used together without a good reason.
* jk/clone-allow-bare-and-o-together:
clone: allow "--bare" with "-o"
"git remote rename" failed to rename a remote without fetch
refspec, which has been corrected.
* jk/remote-rename-without-fetch-refspec:
remote: handle rename of remote without fetch refspec
The codepath that reads from the index v4 had unaligned memory
accesses, which has been corrected.
* vd/fix-unaligned-read-index-v4:
read-cache: avoid misaligned reads in index v4
Update CodingGuidelines to clarify what features to use and avoid
in C99.
* ab/coding-guidelines-c99:
CodingGuidelines: recommend against unportable C99 struct syntax
CodingGuidelines: mention C99 features we can't use
CodingGuidelines: allow declaring variables in for loops
CodingGuidelines: mention dynamic C99 initializer elements
CodingGuidelines: update for C99
During the initial development of the fsck-msgids.txt feature, it
became apparent that it is very error-prone to make sure the
descriptions in the documentation file are sorted and correctly
match what is in the fsck.h header file.
Add a quick-and-dirty Perl script and doc-lint target to sanity
check that the fsck-msgids.txt is consistent with the error type
list in the fsck.h header file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation lacks mention of specific <msg-id> that are supported.
While git-help --config will display a list of these options, often
developers' first instinct is to consult the git docs to find valid
config values.
Add a list of fsck error messages, and link to it from the git-fsck
documentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This error type has never been used since it was introduced in
159e7b08 (fsck: detect gitmodules files, 2018-05-02).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2175a0c6 (fsck: stop checking tag->tagged, 2019-10-18) stopped
checking the tagged object referred to by a tag object, which is what the
error message BAD_TAG_OBJECT was for. Since then the BAD_TAG_OBJECT
message is no longer used anywhere.
Remove the BAD_TAG_OBJECT msg-id.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The apply code is not prepared to handle extremely large files. It uses
"int" in some places, and "unsigned long" in others.
This combination leads to unfortunate problems when switching between
the two types. Using "int" prevents us from handling large files, since
large offsets will wrap around and spill into small negative values,
which can result in wrong behavior (like accessing the patch buffer with
a negative offset).
Converting from "unsigned long" to "int" also has truncation problems
even on LLP64 platforms where "long" is the same size as "int", since
the former is unsigned but the latter is not.
To avoid potential overflow and truncation issues in `git apply`, apply
similar treatment as in dcd1742e56 (xdiff: reject files larger than
~1GB, 2015-09-24), where the xdiff code was taught to reject large
files for similar reasons.
The maximum size was chosen somewhat arbitrarily, but picking a value
just shy of a gigabyte allows us to double it without overflowing 2^31-1
(after which point our value would wrap around to a negative number).
To give ourselves a bit of extra margin, the maximum patch size is a MiB
smaller than a full GiB, which gives us some slop in case we allocate
"(records + 1) * sizeof(int)" or similar.
Luckily, the security implications of these conversion issues are
relatively uninteresting, because a victim needs to be convinced to
apply a malicious patch.
Reported-by: 정재우 <thebound7@gmail.com>
Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the recent turnover on the git-security list, questions came up how
things are usually run. Rather than answering questions individually,
extend Git's existing documentation about security vulnerabilities to
describe the git-security mailing list, how things are run on that list,
and what to expect throughout the process from the time a security bug
is reported all the way to the time when a fix is released.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Julia Ramer <gitprplr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The last git version that had "diff-tree" in the header text
of "git diff-tree" output was v1.3.0 from 2006. The header text
was changed from "diff-tree" to "commit" in 91539833
("Log message printout cleanups").
Given how long ago this change was made, it is highly unlikely that
anyone is still feeding in outputs from that git version.
Remove the handling of the "diff-tree" prefix and document the
source of the other prefixes so that the overall functionality
is more clear.
Signed-off-by: Jerry Zhang <Jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are situations where the user might not want the default
setting where patch-id strips all whitespace. They might be working
in a language where white space is syntactically important, or they
might have CI testing that enforces strict whitespace linting. In
these cases, a whitespace change would result in the patch
fundamentally changing, and thus deserving of a different id.
Add a new mode that is exclusive of --stable and --unstable called
--verbatim. It also corresponds to the config
patchid.verbatim = true. In this mode, the stable algorithm is
used and whitespace is not stripped from the patch text.
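For example (usage of the new mode and its config equivalent):
    git show HEAD | git patch-id --verbatim
    git config patchid.verbatim true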
Users of --unstable mainly care about compatibility with old git
versions, which unstripping the whitespace would break. Thus there
isn't a use case for the combination of --verbatim and --unstable,
and we don't expose it so as not to add a maintenance burden.
Signed-off-by: Jerry Zhang <jerry@skydio.com>
fixes https://github.com/Skydio/revup/issues/2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently patch-id as used in rebase and cherry-pick does not account
for file modes if the file is modified. One consequence of this is
that if you have a local patch that changes modes, but upstream
has applied an outdated version of the patch that doesn't include
that mode change, "git rebase" will drop your local version of the
patch along with your mode changes. It also means that internal
patch-id doesn't produce the same output as the builtin, which does
account for mode changes due to them being part of diff output.
Fix by adding mode to the patch-id if it has changed, in the same
format that would be produced by diff, so that it is compatible
with builtin patch-id.
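For reference, a mode-only change is represented in diff output with a
pair of lines like these (the path is illustrative), and it is this
representation that now feeds into the patch-id:
    diff --git a/build.sh b/build.sh
    old mode 100644
    new mode 100755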
Signed-off-by: Jerry Zhang <Jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git patch-id" currently doesn't produce correct output if the
incoming diff has any binary files. Add logic to get_one_patchid
to handle the different possible styles of binary diff. This
attempts to keep the resulting patch-ids identical to what would be
produced by the counterpart logic in diff.c; that is, it produces
the id by hashing the "a" and "b" oids in succession.
In general we handle binary diffs by first caching the object ids from
the "index" line and using those if we then find an indication
that the diff is binary.
The input could contain patches generated with "git diff --binary". This
currently breaks the parse logic and results in multiple patch-ids
output for a single commit. Here we have to skip the contents of the
patch itself since those do not go into the patch id. --binary
implies --full-index so the object ids are always available.
When the diff is generated with --full-index there is no patch content
to skip over.
When a diff is generated without --full-index or --binary, it will
contain abbreviated object ids. This will still result in a sufficiently
unique patch-id when hashed, but does not match internal patch id
output. We'll call this ok for now as we already need specialized
arguments to diff in order to match internal patch id (namely -U3).
Signed-off-by: Jerry Zhang <Jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git doesn't persist patch-ids during the rebase process, so there is
no need to specifically invoke the unstable variant. Use the stable
logic for all internal patch-id calculations to minimize the number of
code paths and improve test coverage.
Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Patch-ids for binary patches are found by hashing the object
ids of the before and after objects in succession. However in
the --stable case, there is a bug where hunks are not flushed
for binary and header-only patch ids, which would always result
in a patch-id of 0000. The --unstable case is currently correct.
Reorder the logic to branch into 3 cases for populating the
patch body: header-only which populates nothing, binary which
populates the object ids, and normal which populates the text
diff. All branches will end up flushing the hunk.
Don't populate the ---a/ and +++b/ lines for binary diffs, to correspond
to those lines not being present in the "git diff" text output.
This is necessary because we advertise that the patch-id calculated
internally and used in format-patch is the same as what the
builtin "git patch-id" would produce when piped from a diff.
Update the test to run on both binary and normal files.
Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the same spirit as the previous commit, reimplement
`--group=committer` as a special case of `--group=<format>`, too.
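In other words, assuming the committer analogue of the author format
used elsewhere in this series, the grouping presumably boils down to:
    $ git shortlog --group='%cN <%cE>' ...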
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of handling SHORTLOG_GROUP_AUTHOR separately, reimplement it as
a special case of the new `--group=<format>` mode, where the author mode
is a shorthand for `--group='%aN <%aE>'`.
Note that we still need to keep the SHORTLOG_GROUP_AUTHOR enum since it
has a different meaning in `read_from_stdin()`, where it is still used
for a different purpose.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract a function which finishes setting up the shortlog struct for
use. The caller in `make_cover_letter()` does not care about trailer
sorting, so it isn't strictly necessary to add a call there in this
patch.
But the next patch will add additional functionality to the new
`shortlog_finish_setup()` function, which the caller in
`make_cover_letter()` will care about.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In addition to generating a shortlog based on committer, author, or the
identity in one or more specified trailers, it can be useful to generate
a shortlog based on an arbitrary commit format.
This can be used, for example, to generate a distribution of commit
activity over time, like so:
$ git shortlog --group='%cd' --date='format:%Y-%m' -s v2.37.0..
117 2022-06
274 2022-07
324 2022-08
263 2022-09
7 2022-10
Arbitrary commit formats can be used. In fact, `git shortlog`'s default
behavior (to count by commit authors) can be emulated as follows:
$ git shortlog --group='%aN <%aE>' ...
and future patches will make the default behavior (as well as
`--committer`, and `--group=trailer:<trailer>`) special cases of the
more flexible `--group` option.
Note also that the SHORTLOG_GROUP_FORMAT enum value is used only to
detect that `--group=<format>` is in use while in stdin mode, so that
the combination can be declared invalid.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The subsequent commit will add another unhandled case in
`read_from_stdin()` which will want to use the same message as with
`--group=trailer`.
Extract the "--group=trailer" part from this message so the same
translation key can be used for both cases.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When there are no trailers to insert, it is natural that
insert_records_from_trailers() should return without having done any
work.
But instead we guard this call unnecessarily by first checking whether
`log->groups` has the `SHORTLOG_GROUP_TRAILER` bit set.
Prepare to match a similar pattern in the future where a function which
inserts records of a certain type does no work when no specifiers
matching that type are given.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare for a future patch which will introduce arbitrary pretty formats
via the `--group` argument.
To allow additional customizability (for example, to support something
like `git shortlog -s --group='%aD' --date='format:%Y-%m' ...`, which
groups commits by the datestring 'YYYY-mm' according to author date), we
must store off the `--date` parsed from calling `parse_revision_opt()`.
Note that this also affects custom output `--format` strings in `git
shortlog`. Though this is a behavior change, this is arguably fixing a
long-standing bug (ie., that `--format` strings are not affected by
`--date` specifiers as they should be).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pruning objects with `--cruft`, `git repack` offers some
flexibility when selecting the set of which objects are pruned via the
`--cruft-expiration` option.
This is useful for expiring objects which are older than the grace
period, which makes races (where to-be-pruned objects become reachable
again and then ancestors of freshly pushed objects, leaving the
repository in a corrupt state after pruning) substantially less
likely [1].
But in practice, such races are impossible to avoid entirely, no matter
how long the grace period is. To prevent them, it is often advisable to
temporarily put a repository into a read-only state, but that is not
always practical, and so some middle ground would be nice.
This patch introduces a new option, `--expire-to`, which teaches `git
repack` to write an additional cruft pack containing just the objects
which were pruned from the repository. The caller can specify a
directory outside of the current repository as the destination for this
second cruft pack.
This makes it possible to prune objects from a repository, while still
holding onto a supplemental copy of them outside of the original
repository. Having this copy on-disk makes it substantially easier to
recover objects when the aforementioned race is encountered.
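A hedged example of the sort of invocation this enables (the expiration
and destination directory are purely illustrative):
    $ git repack --cruft --cruft-expiration=2.weeks.ago -d --expire-to=/backup/my-repo-expired/pack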
`--expire-to` is implemented in a somewhat convoluted manner, which is
to take advantage of the fact that the first time `write_cruft_pack()`
is called, it adds the name of the cruft pack to the `names` string
list. That means the second time we call `write_cruft_pack()`, objects
in the previously-written cruft pack will be excluded.
As long as the caller ensures that no objects are expired during the
second pass, this is sufficient to generate a cruft pack containing all
objects which don't appear in any of the new packs written by `git
repack`, including the cruft pack. In other words, all of the objects
which are about to be pruned from the repository.
It is important to note that the destination in `--expire-to` does not
necessarily need to be a Git repository (though it can be). Notably, the
expired packs do not contain all ancestors of expired objects. So if the
source repository contains something like:
                <unreachable>
               /
      C1 --- C2
        \
         refs/heads/master
where C2 is unreachable, but has a parent (C1) which is reachable, and
C2 would be pruned, then the expiry pack will contain only C2, not C1.
[1]: https://lore.kernel.org/git/20190319001829.GL29661@sigill.intra.peff.net/
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the following commit, a new write_cruft_pack() caller will be added
which wants to write a cruft pack to an arbitrary location. Prepare for
this by adding a parameter which controls the destination of the cruft
pack.
For now, provide "packtmp" so that this commit does not change any
behavior.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`builtin/repack.c`'s `write_cruft_pack()` is used to generate the cruft
pack when `--cruft` is supplied. It uses a static variable
"cruft_expiration" which is filled in by option parsing.
A future patch will add an `--expire-to` option which allows `git
repack` to write a cruft pack containing the pruned objects out to a
separate repository. In order to implement this functionality, some
callers will have to pass a value for `cruft_expiration` different than
the one filled out by option parsing.
Prepare for this by teaching `write_cruft_pack` to take a
"cruft_expiration" parameter, instead of reading a single static
variable.
The (sole) existing caller of `write_cruft_pack()` will pass the value
for "cruft_expiration" filled in by option parsing, retaining existing
behavior. This means that we can make the variable local to
`cmd_repack()`, and eliminate the static declaration.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`builtin/repack.c`'s `prepare_pack_objects()` is used to prepare a set
of arguments to a `pack-objects` process which will generate a desired
pack.
A future patch will add an `--expire-to` option which allows `git
repack` to write a cruft pack containing the pruned objects out to a
separate repository. Prepare for this by teaching that function to write
packs to an arbitrary location specified by the caller.
All existing callers of `prepare_pack_objects()` will pass `packtmp` for
`out`, retaining the existing behavior.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add global counters mechanism to Trace2.
The Trace2 counters mechanism adds the ability to create a set of
global counter variables and an API to increment them efficiently.
Counters can optionally report per-thread usage in addition to the sum
across all threads.
Counter events are emitted to the Trace2 logs when a thread exits and
at process exit.
Counters are an alternative to `data` and `data_json` events.
Counters are useful when you want to measure something across the life
of the process, when you don't want per-measurement events for
performance reasons, when the data does not fit conveniently within a
region, or when your control flow does not easily let you write the
final total. For example, you might use this to report the number of
calls to unzip() or the number of de-delta steps during a checkout.
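For example, running a command with a perf-format trace enabled (the
target path is illustrative) is enough to see the summary counter
events, which are written at thread and process exit as described above:
    $ GIT_TRACE2_PERF=/tmp/trace.perf git status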
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add stopwatch timer mechanism to Trace2.
Timers are an alternative to Trace2 Regions. Regions are useful for
measuring the time spent in various computation phases, such as the
time to read the index, time to scan for unstaged files, time to scan
for untracked files, and etc.
However, regions are not appropriate in all places. For example,
during a checkout, it would be very inefficient to use regions to
measure the total time spent inflating objects from the ODB across
the entire lifetime of the process; a per-unzip() region would
flood the output and significantly slow the command; and some form of
post-processing would be required to compute the time spent in unzip().
Timers can be used to measure a series of timer intervals and emit
a single summary event (at thread and/or process exit).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the `tr2tls_thread_ctx.thread_name` field from a `strbuf`
to a "const char*" pointer.
The `thread_name` field is a constant string that is constructed when
the context is created. Using a (non-const) `strbuf` structure for it
caused some confusion in the past because it implied that someone
could rename a thread after it was created. That usage was not
intended. Change it to a const pointer to make the intent more clear.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the documentation of the tr2tls_thread_ctx.thread_name field
and its relation to the tr2tls_thread_ctx.thread_id field.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the `thread_name` argument in `tr2tls_create_self()` and
`trace2_thread_start()` to be `thread_base_name` to make it clearer
that the passed argument is a component used in the construction of
the actual `struct tr2tls_thread_ctx.thread_name` variable.
The base name will be used along with the thread id to create a
unique thread name.
This commit does not change how the `thread_name` field is
allocated or stored within the `tr2tls_thread_ctx` structure.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eliminate the mostly obsolete `Public API` sub-section from the
`Trace2 API` section in the documentation. Strengthen the referral
to `trace2.h`.
Most of the technical information in this sub-section was moved to
`trace2.h` in 6c51cb525d (trace2: move doc to trace2.h, 2019-11-17) to
be adjacent to the function prototypes. The remaining text wasn't
that useful by itself.
Furthermore, the text would need a bit of overhaul to add routines
that do not immediately generate a message, such as stopwatch timers.
So it seemed simpler to just get rid of it.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce or eliminate use of the term "TLS" in the Trace2 code.
The term "TLS" has two popular meanings: "thread-local storage" and
"transport layer security". In the Trace2 source, the term is associated
with the former. There was concern on the mailing list about it referring
to the latter.
Update the source and documentation to eliminate the use of the "TLS" term
or replace it with the phrase "thread-local storage" to reduce ambiguity.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use "size_t" rather than "int" for the "alloc" and "nr_open_regions"
fields in the "tr2tls_thread_ctx". These are used by ALLOC_GROW().
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
absorb_git_dir_into_superproject() uses a strbuf and strvec_pushl() to
build and add the --super-prefix option and its argument. Use a single
strvec_pushf() call to add the stuck form instead, which reduces the
code size and avoids a strbuf allocation and release. The same is
already done in submodule_reset_index() and submodule_move_head().
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of our tests intentionally causes the cruft-pack generation phase of
repack to fail, in order to stimulate an exit from repack at the desired
moment. It does so by feeding a bogus option argument to pack-objects.
This is a simple and reliable way to get pack-objects to fail, but it
has one downside: pack-objects will die before reading its stdin, which
means the caller repack may racily get SIGPIPE writing to it.
For the purposes of this test, that's OK. We are checking whether repack
cleans up already-created .tmp files, and it will do so whether it exits
or dies by signal (because the tempfile API hooks both).
But we have to tell test_must_fail that either outcome is OK, or it
complains about the signal. Arguably this is a workaround (compared to
fixing repack), as repack dying to SIGPIPE means that it loses the
opportunity to give a more detailed message. But we don't actually write
such a message anyway; we rely on pack-objects to have written something
useful to stderr, and it does. In either case (signal or exit), that is
the main thing the user will see.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an option, --stdin, to merge-tree which will accept lines of input
with two branches to merge per line, and which will perform all the
merges and give output for each in turn. This option implies -z, and
modifies the output to also include a merge status since the exit code
of the program can no longer convey that information now that multiple
merges are involved.
This could be useful, for example, by Git hosting providers. When one
branch is updated, one may want to check whether all code reviews
targeting that branch can still cleanly merge. Avoiding the overhead
of starting up a separate process for each of those code reviews might
provide significant savings in a repository with many code reviews.
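A hedged sketch of the input format described above (branch names are
illustrative):
    $ printf '%s\n' 'topic-1 main' 'topic-2 main' | git merge-tree --stdin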
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "Informational messages" section was updated in de90581141
("merge-ort: optionally produce machine-readable output", 2022-06-18) to
provide more detailed and machine-parseable output when `-z` is passed,
but the documentation was not updated to reflect these changes. Update
it now.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When initializing a repository object, we run "git rev-parse --git-dir"
to let the C version of Git find the correct directory. But curiously,
if this fails we don't automatically say "not a git repository".
Instead, we do our own pure-perl check to see if we're in a bare
repository.
This makes little sense, as rev-parse will report both bare and non-bare
directories. This logic comes from d5c7721d58 (Git.pm: Add support for
subdirectories inside of working copies, 2006-06-24), but I don't see
any reason given why we can't just rely on rev-parse. Worse, because we
treat any non-error response from rev-parse as a non-bare repository,
we'll erroneously set the object's WorkingCopy, even in a bare
repository.
But it gets worse. Since 8959555cee (setup_git_directory(): add an owner
check for the top-level directory, 2022-03-02), it's actively wrong (and
dangerous). The perl code doesn't implement the same ownership checks.
And worse, after "finding" the bare repository, it sets GIT_DIR in the
environment, which tells any subsequent Git commands that we've
confirmed the directory is OK, and to trust us. I.e., it re-opens the
vulnerability plugged by 8959555cee when using Git.pm's repository
discovery code.
We can fix this by just relying on rev-parse to tell us when we're not
in a repository, which fixes the vulnerability. Furthermore, we'll ask
its --is-bare-repository function to tell us if we're bare or not, and
rely on that.
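For illustration, in a non-bare repository at its top level the combined
query looks like this (output shown for a typical layout), with the
second line flipping to "true" in a bare repository:
    $ git rev-parse --git-dir --is-bare-repository
    .git
    false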
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When changing a directory to a symlink on one side of history, and
renaming the parent of that directory to a different directory name
on the other side, e.g. with this kind of setup:
Base commit: Has a file named dir/subdir/file
Side1: Rename dir/ -> renamed-dir/
Side2: delete dir/subdir/file, add dir/subdir as symlink
Then merge-ort was running into an assertion failure:
git: merge-ort.c:2622: apply_directory_rename_modifications: Assertion `ci->dirmask == 0' failed
merge-recursive did not have as obvious an issue handling this case,
likely because we never fixed it to handle the case from commit
902c521a35 ("t6423: more involved directory rename test", 2020-10-15)
where we need to be careful about nested renames when a directory rename
occurs (dir/ -> renamed-dir/ implies dir/subdir/ ->
renamed-dir/subdir/). However, merge-recursive does have multiple
problems with this testcase:
* Incorrect stages for the file: merge-recursive omits the stage in
the index corresponding to the base stage, making `git status`
report "added by us" for renamed-dir/subdir/file instead of the
expected "deleted by them".
* Poor directory/file conflict handling: For the renamed-dir/subdir
symlink, instead of reporting a file/directory conflict as
expected, it reports "Error: Refusing to lose untracked file at
renamed-dir/subdir". This is a lie because there is no untracked
file at that location. It then does the normal suboptimal
merge-recursive thing of having the symlink be tracked in the index
at a location where it can't be written due to D/F conflicts
(namely, renamed-dir/subdir), but writes it to the working tree at
a different location as a new untracked file (namely,
renamed-dir/subdir~B^0)
Technically, these problems don't prevent the user from resolving the
merge if they can figure out to ignore the confusion, but because both
pieces of output are quite confusing, I don't want to modify the test
to claim that merge-recursive also passes it, even though it doesn't
have the bug that ort did.
So, fix the bug in ort by splitting the conflict_info for "dir/subdir"
into two, one for the directory part, one for the file (i.e. symlink)
part, since the symlink is being renamed by directory rename detection.
The directory part is needed for proper nesting, since there are still
conflict_info fields for files underneath it (though those are marked
as is_null, they are still present until the entries are processed,
and the entry processing wants every non-toplevel entry to have a
parent directory).
Reported-by: Stefano Rivera <stefano@rivera.za.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After we've successfully finished the repack, we call
remove_temporary_files(), which looks for and removes any files matching
".tmp-$$-pack-*", where $$ is the pid of the current process. But this
is pointless. If we make it this far in the process, we've already
renamed these tempfiles into place, and there is nothing left to delete.
Nor is there a point in trying to call it to clean up when we _aren't_
successful. It's not safe for using in a signal handler, and the
previous commit already handed that job over to the tempfile API.
It might seem like it would be useful to clean up stray .tmp files left
by other invocations of git-repack. But it won't clean those files; it
only matches ones with its pid, and leaves the rest. Fortunately, those
are cleaned up naturally by successive calls to git-repack; we'll
consider .tmp-*.pack the same as normal packfiles, so "repack -ad", etc,
will roll up their contents and eventually delete them.
The one case that could matter is if pack-objects generates an extension
we don't know about, like ".tmp-pack-$$-$hash.some-new-ext". The current
code will quietly delete such a file, while after this patch we'd leave
it in place. In practice this doesn't happen, and would be indicative of
a bug. Leaving the file as cruft is arguably a better behavior, as it
means somebody is more likely to eventually notice and fix the bug. If
we really wanted to be paranoid, we could scan for and warn about such
files, but that seems like overkill.
There's nothing to test with regard to the removal of this function. It
was doing nothing, so the behavior should be the same. However, we can
verify (and protect) our assumption that "repack -ad" will eventually
remove stray files by adding a test for that.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git-repack exits due to a signal, it tries to clean up by calling
its remove_temporary_files() function, which walks through the packs dir
looking for ".tmp-$$-pack-*" files to delete (where "$$" is the pid of
the current process).
The biggest problem here is that remove_temporary_files() is not safe to
call in a signal handler. It uses opendir(), which isn't on the POSIX
async-signal-safe list. The details will be platform-specific, but a
likely issue is that it needs to allocate memory; if we receive a signal
while inside malloc(), etc, we'll conflict on the allocator lock and
deadlock with ourselves.
We can fix this by just cleaning up the files directly, without walking
the directory. We already know the complete list of .tmp-* files that
were generated, because we recorded them via populate_pack_exts(). When
we find files there, we can use register_tempfile() to record the
filenames. If we receive a signal, then the tempfile API will clean them
up for us, and it's async-safe and pretty battle-tested.
Note that this is slightly racier than the existing scheme. We don't
record the filenames until pack-objects tells us the hash over stdout.
So during the period between it generating the file and reporting the
hash, we'd fail to clean up. However, that period is very small. During
most of the pack generation process pack-objects is using its own
internal tempfiles. It's only at the very end that it moves them into
the names git-repack expects, and then it immediately reports the name
to us. Given that cleanup like this is best effort (after all, we may
get SIGKILL), this level of race is acceptable.
When we register the tempfiles, we'll record them locally and use the
result to call rename_tempfile(), rather than renaming by hand. This
isn't strictly necessary, as once we've renamed the files they're gone,
and the tempfile API's cleanup unlink() would simply become a pointless
noop. But managing the lifetimes of the tempfile objects is the cleanest
thing to do, and the tempfile pointers naturally fill the same role as
the old booleans.
This patch also fixes another small problem. We only hook signals, and
don't set up an atexit handler. So if we see an error that causes us to
die(), we'll leave the .tmp-* files in place. But since the tempfile API
handles this for us, this is now fixed for free. The new test covers
this by stimulating a failure of pack-objects when generating a cruft
pack. Before this patch, the .tmp-* file for the main pack would have
been left, but now we correctly clean it up.
Two small subtleties on the implementation:
- in the renaming loop, we can stop re-constructing fname_old; we only
use it when we have a tempfile to rename, so we can just ask the
tempfile for its path (which, barring bugs, should be identical)
- when renaming fails, our error message mentions fname_old. But since
a failed rename_tempfile() invalidates the tempfile struct, we'll
lose access to that string. Instead, let's mention the destination
filename, which is what most other callers do.
Reported-by: Jan Pokorný <poki@fnusa.cz>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If pack-objects tells us it generated pack $hash, we expect to find
.tmp-$$-pack-$hash.pack, .idx, .rev, and so on. Some of these files are
optional, but others are not. For the required ones, we'll bail with an
error if any of them is missing.
The error message is just "missing required file", which is a bit vague.
We should be more clear that it is not the user's fault, but rather that
the sub-program we called is not operating as expected. In practice,
nobody should ever see this message, as it would generally only be
caused by a bug in Git.
It probably doesn't make sense to convert this to a BUG(), though, as
there are other (unlikely) possibilities, such as somebody else racily
deleting the files, filesystem errors causing stat() to fail, and so on.
A nice side effect here is that we stop relying on fname_old in this
code path, which will let us deal with it only in the first part of the
conditional.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After generating the main pack and then any additional cruft packs, we
iterate over the "names" list (which contains hashes of packs generated
by pack-objects), and call populate_pack_exts() for each.
There's one small problem with this. In repack_promisor_objects(), we
may add entries to "names" and call populate_pack_exts() for them.
Calling it again is mostly just wasteful, as we'll stat() the filename
with each possible extension, get the same result, and just overwrite
our bits.
So we could drop the call there, and leave the final loop to populate
all of the bits. But instead, this patch does the reverse: drops the
final loop, and teaches the other two sites to populate the bits as they
add entries.
This makes the code easier to reason about, as you never have to worry
about when the util field is valid; it is always valid for each entry.
It also serves my ulterior purpose: recording the generated filenames as
soon as possible will make it easier for a future patch to use them for
cleaning up from a failed operation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We keep a string_list "names" containing the hashes of packs generated
on our behalf by pack-objects. The util field of each item is treated as
a bitfield that tells us which extensions (.pack, .idx, .rev, etc) are
present for each name.
Let's switch this to allocating a real array. That will give us room in
a future patch to store more data than just a single bit per extension.
And it makes the code a little easier to read, as we avoid casting back
and forth between uintptr_t and a void pointer.
Since the only thing we're storing is an array, we could just allocate
it directly. But instead I've put it into a named struct here. That
further increases readability around the casts, and in particular helps
differentiate us from other string_lists in the same file which use
their util field differently. E.g., the existing_*_packs lists still do
bit-twiddling, but their bits have different meaning than the ones in
"names". This makes it hard to grep around the code to see how the util
fields are used; now you can look for "generated_pack_data".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous step made an attempt to correctly compute the display
columns allocated and padded for different parts of the diffstat output.
There are at least two known codepaths in the function that still
mix up display width and byte length and need to be fixed.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit fixed a failure in 'git subtree merge --squash' when
the previous squash-merge merged an annotated tag of the subtree
repository which is missing locally.
The same failure happens in 'git subtree split', either directly or when
called by 'git subtree push', under the same circumstances: 'cmd_split'
invokes 'find_existing_splits', which loops through previous commits and
invokes 'git rev-parse' (via 'process_subtree_split_trailer') on the
value of any 'git-subtree-split' trailer it finds. This fails if this
value is the hash of an annotated tag which is missing locally.
Add a new optional argument 'repository' to 'cmd_split' and
'find_existing_splits', and invoke 'cmd_split' with that argument from
'cmd_push'. This allows 'process_subtree_split_trailer' to try to fetch
the missing tag from the 'repository' if it's not available locally,
mirroring the new behaviour of 'git subtree pull' and 'git subtree
merge'.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git subtree merge --squash $ref' is invoked, either directly or
through 'git subtree pull --squash $repo $ref', the code looks for the
latest squash merge of the subtree in order to create the new merge
commit as a child of the previous squash merge.
This search is done in function 'process_subtree_split_trailer', invoked
by 'find_latest_squash', which looks for the most recent commit with a
'git-subtree-split' trailer; that trailer's value is the object name in
the subtree repository of the ref that was last squash-merged. The
function verifies that this object is present locally with 'git
rev-parse', and aborts if it's not.
The hash referenced by the 'git-subtree-split' trailer is guaranteed to
correspond to a commit since it is the result of running 'git rev-parse
-q --verify "$1^{commit}"' on the first argument of 'cmd_merge' (this
corresponds to 'rev' in 'cmd_merge' which is passed through to
'new_squash_commit' and 'squash_msg').
But this is only the case since e4f8baa88a (subtree: parse revs in
individual cmd_ functions, 2021-04-27), which went into Git 2.32. Before
that commit, 'cmd_merge' verified the revision it was given using 'git
rev-parse --revs-only "$@"'. Such an invocation, when fed the name of an
annotated tag, would return the hash of the tag, not of the commit
referenced by the tag.
This leads to a failure in 'find_latest_squash' when squash-merging if
the most recent squash-merge merged an annotated tag of the subtree
repository, using a pre-2.32 version of 'git subtree', unless that
previous annotated tag is present locally (which is not usually the
case).
We can fix this by fetching the object directly by its hash in
'process_subtree_split_trailer' when 'git rev-parse' fails, but in order
to do so we need to know the name or URL of the subtree repository.
This is not possible in general for 'git subtree merge', but is easy
when it is invoked through 'git subtree pull' since in that case the
subtree repository is passed by the user at the command line.
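For example, in an invocation like the following (URL and ref are
illustrative), the repository is right there on the command line:
    $ git subtree pull --prefix=vendor/lib https://example.com/lib.git v1.2.3 --squash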
Allow the 'git subtree pull' scenario to work out-of-the-box by adding
an optional 'repository' argument to functions 'cmd_merge',
'find_latest_squash' and 'process_subtree_split_trailer', and invoke
'cmd_merge' with that 'repository' argument in 'cmd_pull'.
If 'repository' is absent in 'process_subtree_split_trailer', instruct
the user to try fetching the missing object directly.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both functions 'find_latest_squash' (called by 'git subtree merge
--squash' and 'git subtree split --rejoin') and 'find_existing_splits'
(called by git 'subtree split') loop through commits that have a
'git-subtree-dir' trailer, and then process the 'git-subtree-mainline'
and 'git-subtree-split' trailers for those commits.
The processing done for the 'git-subtree-split' trailer is simple: we
check if the object exists with 'rev-parse' and set the variable
'sub' to the object name, or we die if the object does not exist.
In a future commit we will add more steps to the processing of this
trailer in order to make the code more robust.
To reduce code duplication, move the processing of the
'git-subtree-split' trailer to a dedicated function,
'process_subtree_split_trailer'.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'cmd_pull' already checks that only two arguments are given,
'repository' and 'ref'. Define variables with these names instead of
using the positional parameter $2 and "$@".
This will allow a subsequent commit to pass 'repository' to 'cmd_merge'.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function 'find_latest_squash' takes a single argument, 'dir', but a
debug statement uses this variable before it takes its value from $1.
This statement thus gets the value of 'dir' from the calling function,
which currently is the same as the 'dir' argument, so it works but it
is confusing.
Move the definition of 'dir' before its first use.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just as was done in 0008d12284 (submodule: prefix die messages with
'fatal', 2021-07-10) for 'git-submodule.sh', make the 'die' messages
output by 'git-subtree.sh' more in line with the rest of the code base
by prefixing them with "fatal: ", and do not capitalize their first
letter.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9a3e3ca2ba (subtree: be stricter about validating flags, 2021-04-27)
added validation code to check that options given to 'git subtree <cmd>'
made sense with the command being used.
Refactor these checks by adding a 'die_incompatible_opt' function to
reduce code duplication.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are three occurrences of 'git rev-parse <rev>' in 'git-subtree.sh'
where the command expects a revision and the script dies or exits if the
revision can't be found. In that case, the error message from 'git
rev-parse' is:
$ git rev-parse <bad rev>
<bad rev>
fatal: ambiguous argument '<bad rev>': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
This is a little confusing to the user, since this error message is
output by 'git subtree'.
At these points in the script, we know that we are looking for a single
revision, so be explicit by using '--verify', resulting in a little
better error message:
$ git rev-parse --verify <bad rev>
fatal: Needed a single revision
In the two occurrences where we 'die' if 'git rev-parse' fails, 'git
subtree' outputs "could not rev-parse split hash $b from commit $sq", so
we actually do not need the supplementary error message from 'git
rev-parse'; add '--quiet' to silence it.
In the third occurrence, we 'exit', so keep the error message from 'git
rev-parse'. Note that this message is still suboptimal since it can be
understood to mean that 'git rev-parse' did not receive a single
revision as argument, which is not the case here: the command did
receive a single revision, but the revision is not resolvable to an
available object.
The alternative would be to use '--' after the revision, as suggested by
the first error message, resulting in a clearer error message:
$ git rev-parse <bad rev> --
fatal: bad revision '<bad rev>'
Unfortunately we can't use that syntax because in the more common case
of the revision resolving to a known object, the command outputs the
object's hash, a newline, and the dashdash, which breaks the 'git
subtree' script.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some variables in 'test_commit' have names that are common enough that
it is very likely that test authors might use them in a test. If they do
so and use 'test_commit' between setting such a variable and using it,
the variable value from 'test_commit' will leak back into the test and
most likely break it.
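A minimal illustration of the kind of leak this prevents, assuming
'file' is among the helper's internal variable names:
    file=my-important-file &&
    test_commit some-change &&
    echo more >>"$file"   # without 'local', this no longer appends to my-important-file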
Prevent that by marking all variables in 'test_commit' as 'local'. This
allows a subsequent commit to use a 'tag' variable.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To make sure that our manpages are rebuilt when any of the included
source files change and only the affected manpages are rebuilt,
'build-docdep.perl' scans our documentation source files for include
directives, and outputs 'make' dependencies to be included by
'Documentation/Makefile'. This script relies on Perl's hash data
structures and generates its output while iterating over them; since
hashes in Perl are very much unordered, the output varies greatly
from run to run, both in the order of targets and in the order of
each target's dependencies.
This lack of ordering doesn't matter for 'make', because it cares
neither about the order of targets in a Makefile nor about the order
of a target's dependencies. However, it does matter to developers
looking into build issues potentially involving these generated
dependencies, as it's rather hard to tell whether there are any
relevant (i.e. not order-only) changes among the dependencies compared
to the previous run.
So let's make 'build-docdep.perl's output stable and ordered by
sorting the keys of the hashes before iterating over them.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch --edit-description @{-1}" is now a way to edit branch
description of the branch you were on before switching to the
current branch.
* rj/branch-edit-description-with-nth-checkout:
branch: support for shortcuts like @{-1}, completed
Giving "--invert-grep" and "--all-match" without "--grep" to the
"git log" command resulted in an attempt to access grep pattern
expression structure that has not been allocated, which has been
corrected.
* ab/grep-simplify-extended-expression:
grep.c: remove "extended" in favor of "pattern_expression", fix segfault
After checking out a "branch" that is a symbolic-ref that points at
another branch, "git symbolic-ref HEAD" reports the underlying
branch, not the symbolic-ref the user gave checkout as argument.
The command learned the "--no-recurse" option to stop after
dereferencing a symbolic-ref only once.
* jc/symbolic-ref-no-recurse:
symbolic-ref: teach "--[no-]recurse" option
Avoid false-positive from LSan whose assumption may be broken with
higher optimization levels.
* jk/use-o0-in-leak-sanitizer:
Makefile: force -O0 when compiling with SANITIZE=leak
7b8cfe34 (Merge branch 'ed/fsmonitor-on-networked-macos',
2022-10-17) broke the build on macOS with sha1dc by bypassing our
hash abstraction (git_SHA_CTX etc.), but it wasn't caught before the
problematic topic was merged down to the 'master' branch. Nobody
was even compile-testing with DC_SHA1 set, although these days it is
the recommended choice for folks who use SHA-1.
This was because the default for macOS uses Apple Common Crypto, and
both of the two CI jobs did not override the default. Tweak one of
them to use DC_SHA1 to improve the coverage.
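The knob in question is the usual Makefile one; building and testing
with it locally looks something like this (illustrative invocation):
    $ make DC_SHA1=YesPlease test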
We may want to give similar diversity for Linux jobs so that some of
them build with other implementations of SHA-1; they currently all
build and test with DC_SHA1 as that is the default everywhere
other than macOS.
But let's start small to fill only the immediate need.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current code is clean with these two sanitizers, and we would
like to keep it that way by running the checks for any new code.
The signal of "passed with asan, but not ubsan" (or vice versa) is
not that useful in practice, so it is tempting to run both sanitizers
in a single task, but it seems to take forever, so tentatively let's
try having two separate ones.
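Locally, the rough equivalents of the two new tasks would be invocations
along these lines (a sketch; the CI jobs set things up slightly
differently):
    $ make SANITIZE=address test
    $ make SANITIZE=undefined test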
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Work around older clang that warns against C99 zero initialization
syntax for struct.
* jh/struct-zero-init-with-older-clang:
config.mak.dev: disable suggest braces error on old clang versions
Update CodingGuidelines to clarify what features to use and avoid
in C99.
* ab/coding-guidelines-c99:
CodingGuidelines: recommend against unportable C99 struct syntax
CodingGuidelines: mention C99 features we can't use
CodingGuidelines: allow declaring variables in for loops
CodingGuidelines: mention dynamic C99 initializer elements
CodingGuidelines: update for C99
As suggested in
https://github.com/git-for-windows/git/issues/3966#issuecomment-1221264238,
t7112 can run for well over one hour, which seems to be the default
maximum run time at least when running CTest-based tests in Visual
Studio.
Let's increase the time-out as a stop gap to unblock developers wishing
to run Git's test suite in Visual Studio.
Note: The actual run time is highly dependent on the circumstances. For
example, in Git's CI runs, the Windows-based tests typically take a bit
over 5 minutes to run. CI runs have the added benefit that Windows
Defender (the common anti-malware scanner on Windows) is turned off,
something many developers are not at liberty to do on their work
stations. When Defender is turned on, even on this developer's high-end
Ryzen system, t7112 takes over 15 minutes to run.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7f5397a07c (cmake: support for testing git when building out of the
source tree, 2020-06-26), we implemented support for running Git's test
scripts even after building Git in a different directory than the source
directory.
The way we did this was to edit the file `t/test-lib.sh` to override
`GIT_BUILD_DIR` to point somewhere else than the parent of the `t/`
directory.
This is not ideal because it always leaves a tracked file marked as
modified, and it is all too easy to commit that change by mistake.
Let's change the strategy by teaching `t/test-lib.sh` to detect the
presence of a file called `GIT-BUILD-DIR` in the source directory. If it
exists, its contents are interpreted as the location of the _actual_
build directory. We then write this file as part of the CTest
definition.
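A minimal sketch of the detection, not the exact code, looks like this:
    if test -f "$GIT_BUILD_DIR/GIT-BUILD-DIR"
    then
        GIT_BUILD_DIR="$(cat "$GIT_BUILD_DIR/GIT-BUILD-DIR")"
    fi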
To support building Git via a regular `make` invocation after building
it using CMake, we ensure that the `GIT-BUILD-DIR` file is deleted (for
convenience, this is done as part of the Makefile rule that is already
run with every `make` invocation to ensure that `GIT-BUILD-OPTIONS` is
up to date).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the interactive `add` operation, users can choose to jump to specific
hunks, and Git will present the hunk list in that case. To avoid showing
too many lines at once, only a maximum of 21 hunks are shown, skipping
the "mode change" pseudo hunk.
The comparison performed to skip the "mode change" pseudo hunk (if any)
compares a signed integer `i` to the unsigned value `mode_change` (which
can be 0 or 1 because it is a 1-bit type).
According to section 6.3.1.8 of the C99 standard (see e.g.
https://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf), what should
happen is an automatic conversion of the "lesser" type to the "greater"
type, but since the types differ in signedness, it is ill-defined what
is the correct "usual arithmetic conversion".
Which means that Visual C's behavior can (and does) differ from GCC's:
When compiling Git with Visual C, `add -p`'s `goto` command shows no
hunks by default because it casts a negative start offset to a pretty
large unsigned value, breaking the "goto hunk" test case in
`t3701-add-interactive.sh`.
Let's avoid that by converting the unsigned bit explicitly to a signed
integer.
Note: This is a long-standing bug in the Visual C build of Git, but it
has never been caught because t3701 is skipped when `NO_PERL` is set,
which is the case in the `vs-test` jobs of Git's CI runs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even when running the tests via CTest, t7609 and t7610 rely on more than
just a few mergetools being copied to the build directory. Let's make it
so.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a test script fails in Git's test suite, the usual course of action
is to re-run it using options to increase the verbosity of the output,
e.g. `-v` and `-x`.
Like in Git's CI runs, when running the tests in Visual Studio via the
CTest route, it is cumbersome or at least requires a very unintuitive
approach to pass options to the test scripts: the CMakeLists.txt file
would have to be modified, passing the desired options to _all_ test
scripts, and then the CMake Cache would have to be reconfigured before
running the test in question individually. Unintuitive at best, and
opposite to the niceties IDE users expect.
So let's just pass those options by default: this will not clutter any
output window, but the log file that is written will contain the
information necessary to figure out test failures.
While at it, also imitate what the Windows jobs in Git's CI runs do to
accelerate running the test scripts: pass the `--no-bin-wrappers` and
`--no-chain-lint` options.
This makes the test runs noticeably faster because the `bin-wrappers/`
scripts as well as the `chain-lint` code make heavy use of POSIX shell
scripting, which is really, really slow on Windows due to the need to
emulate POSIX behavior via the MSYS2 runtime. In a test by Eric
Sunshine, it added two minutes (!) just to perform the chain-lint task.
The idea of adding a CMake config option (à la `GIT_TEST_OPTS`) was
considered during the development of this patch, but then dropped: such
a setting is global, across _all_ tests, where e.g. `--run=...` would
not make sense. Users wishing to override these new defaults are better
advised running the test script manually, in a Git Bash, with full
control over the command line.
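For example, something along these lines runs a single script with full
verbosity and a restricted set of tests (script name and range are
illustrative):
    $ cd t && sh t3701-add-interactive.sh -v -x --run=1-5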
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we'll address in subsequent commits, "DC_SHA1=YesPlease" is not
on by default on OSX; instead we use Apple Common Crypto's SHA-1
implementation.
In 6beb2688d3 (fsmonitor: relocate socket file if .git directory is
remote, 2022-10-04) the build was broken with "DC_SHA1=YesPlease" (and
probably other non-"APPLE_COMMON_CRYPTO" SHA-1 backends).
So let's extract the fix for this from [1] to get the build working
again with "DC_SHA1=YesPlease". In addition to the fix in [1] we also
need to replace "SHA_DIGEST_LENGTH" with "GIT_MAX_RAWSZ".
1. https://lore.kernel.org/git/c085fc15b314abcb5e5ca6b4ee5ac54a28327cab.1665326258.git.gitgitgadget@gmail.com/
Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compiling with -O2 can interact badly with LSan's leak-checker, causing
false positives. Imagine a simplified example like:
    char *str = allocate_some_string();
    if (some_func(str) < 0)
            die("bad str");
    free(str);
The compiler may eliminate "str" as a stack variable, and just leave it
in a register. The register is preserved through most of the function,
including across the call to some_func(), since we'd eventually need to
free it. But because die() is marked with NORETURN, the compiler knows
that it doesn't need to save registers, and just clobbers it.
When die() eventually exits, the leak-checker runs. It looks in
registers and on the stack for any reference to the memory allocated by
str (which would indicate that it's not leaked), but can't find one. So
it reports it as a leak.
Neither system is wrong, really. The C standard (mostly section 5.1.2.3)
defines an abstract machine, and compilers are allowed to modify the
program as long as the observable behavior of that abstract machine is
unchanged. Looking at random memory values on the stack is undefined
behavior, and not something that the optimizer needs to support. But
there really isn't any other way for a leak checker to work; it
inherently has to do undefined things like scouring memory for pointers.
So the two things are inherently at odds with each other. We can't fix
it by changing the code, because from the perspective of the program
running in an abstract machine, there is no leak.
This has caused real false positives in the past, like:
- https://lore.kernel.org/git/patch-v3-5.6-9a44204c4c9-20211022T175227Z-avarab@gmail.com/
- https://lore.kernel.org/git/Yy4eo6500C0ijhk+@coredump.intra.peff.net/
- https://lore.kernel.org/git/Y07yeEQu+C7AH7oN@nand.local/
This patch makes those go away by forcing -O0 when compiling with LSan.
There are a few ways we could do this:
- we could just teach the linux-leaks CI job to set -O0. That's the
smallest change, and means we wouldn't get spurious CI failures. But
it doesn't help people looking for leaks manually or in a specific
test (and because the problem depends on the vagaries of the
optimizer, investigating these can waste a lot of time in
head-scratching as the problem comes and goes)
- we default to -O2 in CFLAGS; we could pull this out to a separate
variable ("-O$(O)" or something) and modify "O" when LSan is in use.
This is the most flexible, in that you could still build with "make
O=2 SANITIZE=leak" if you really wanted to (say, for experimenting).
But it would also fail to kick in if the user defines their own
CFLAGS variable, which again leads to head-scratching.
- we can just stick -O0 into BASIC_CFLAGS when enabling LSan. Since
this comes after the user-provided CFLAGS, it will override any
previous -O setting found there. This is more foolproof, albeit less
flexible. If you want to experiment with an optimized leak-checking
build, you'll have to put "-O2 -fsanitize=leak" into CFLAGS
manually, rather than using our SANITIZE=leak Makefile magic.
Since the final one is the least likely to break in normal use, this
patch uses that approach.
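In other words, the escape hatch mentioned above looks roughly like this
(illustrative invocations):
    $ make SANITIZE=leak test                   # now implies -O0
    $ make CFLAGS='-O2 -fsanitize=leak' test    # opt back in to an optimized leak-checking build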
The resulting build is a little slower, of course, but since LSan is
already about 2x slower than a regular build, another 10% slowdown isn't
that big a deal.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When attempting to initialize a repository object in an unsafe
directory, a confusing runtime error is reported (Can't use string as a
HASH ref while strict refs in use). Fix it by adding the required
semicolon after the catch statement.
Without the semicolon, the result of the following line (i.e., the
result of Cwd::abs_path) is passed as the third argument to Error.pm's
catch function. That function expects that its third argument,
$clauses, is a hash reference, and trying to access a string as a hash
reference is a fatal error.
[1] https://lore.kernel.org/git/20221011182607.f1113fff-9333-427d-ba45-741a78fa6040@korelogic.com/
Reported-by: Hank Leininger <hlein@korelogic.com>
Signed-off-by: Michael McClimon <michael@mcclimon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git repack` supports a `--pack-kept-objects` flag which more or less
translates to whether or not we pass `--honor-pack-keep` down to `git
pack-objects` when assembling a new pack.
This behavior has existed since ee34a2bead (repack: add
`repack.packKeptObjects` config var, 2014-03-03). In that commit, the
documentation was extended to say:
[...] Note that we still do not delete `.keep` packs after
`pack-objects` finishes.
Unfortunately, this is not the case when `--pack-kept-objects` is
combined with a `--geometric` repack. When doing a geometric repack, we
include `.keep` packs when enumerating available packs only when
`pack_kept_objects` is set.
So this all works fine when `--no-pack-kept-objects` (or similar) is
given. Kept packs are excluded from the geometric roll-up, so when we go
to delete redundant packs (with `-d`), no `.keep` packs appear "below
the split" in our geometric progression.
But when `--pack-kept-objects` is given, things can go awry. Namely,
when a kept pack is included in the list of packs tracked by the
`pack_geometry` struct *and* part of the pack roll-up, we will delete
the `.keep` pack when we shouldn't.
Note that this *doesn't* result in object corruption, since the `.keep`
pack's objects are still present in the new pack. But the `.keep` pack
itself is removed, which violates our promise from back in ee34a2bead.
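For example, an invocation along these lines (purely illustrative) could
end up deleting a `.keep` pack whose objects were rolled up into the
newly written pack:
    $ git repack --geometric=2 -d --pack-kept-objects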
But there's more. Because `repack` computes the geometric roll-up
independently from selecting which packs belong in a MIDX (with
`--write-midx`), this can lead to odd behavior. Consider when a `.keep`
pack appears below the geometric split (ie., its objects will be part of
the new pack we generate).
We'll write a MIDX containing the new pack along with the existing
`.keep` pack. But because the `.keep` pack appears below the geometric
split line, we'll (incorrectly) try to remove it. While this doesn't
corrupt the repository, it does cause us to remove the MIDX we just
wrote, since removing that pack would invalidate the new MIDX.
Funny enough, this behavior became far less noticeable after e4d0c11c04
(repack: respect kept objects with '--write-midx -b', 2021-12-20), which
made `pack_kept_objects` be enabled by default only when we were writing
a non-MIDX bitmap.
But e4d0c11c04 didn't resolve this bug, it just made it harder to notice
unless callers explicitly passed `--pack-kept-objects`.
The solution is to avoid trying to remove `.keep` packs during
`--geometric` repacks, even when they appear below the geometric split
line, which is the approach this patch implements.
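As a rough illustration of the idea only (not the actual repack.c data
structures; "mark_pack_for_deletion" is a hypothetical helper), the
deletion step simply has to skip anything that is kept:

    for (p = get_all_packs(the_repository); p; p = p->next) {
            if (p->pack_keep || p->pack_keep_in_core)
                    continue; /* never delete kept packs, even below the split */
            mark_pack_for_deletion(p); /* hypothetical helper */
    }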
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we write a MIDX bitmap after repacking, it is possible that the
repository would be left in a state with both pack- and multi-pack
reachability bitmaps.
This can occur, for instance, if a pack that was kept (either by having
a .keep file, or during a geometric repack in which it is not rolled up)
has a bitmap file, and the repack wrote a multi-pack index and bitmap.
When loading a reachability bitmap for the repository, the multi-pack
one is always preferred, so the pack-based one is redundant. Let's
remove it unconditionally, even if '-d' isn't passed, since there is no
practical reason to keep both around. The patch below does just that.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a generic ll_merge_fn, but not every implementation needs every
parameter. In particular, neither binary nor ext merges care about names
(since they do not generate conflict markers), and most do not need to
look at the ll_merge_driver itself.
Ironically, neither ll_xdl_merge() nor ll_union_merge() needs to have
their driver parameter annotated (even though both are named
drv_unused!). This is because they may fall back to calling
ll_binary_merge() directly. And even though that function won't look at
it, we still pass it along, and hence it is "used" in the caller.
We could get away with passing NULL, but that's likely more confusing
and brittle than just passing along our own driver. And we have to keep
the driver parameter in all callbacks, since ll_ext_merge() uses it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a virtual pickaxe_fn for handling -G versus -S pickaxe options.
They need to take the same set of parameters, but of course they care
about different ones (e.g., a regex -G will never use a kwset).
Mark the unused ones to appease -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The null stream filter unsurprisingly does not look at its "filter"
argument, since it just eats bytes. But we can't drop it, since it has
to conform to the same virtual interface that real filters do. Mark the
unused parameter to appease -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We squelch error/warning output by passing a noop handler to
set_error_routine(). We need to tell the compiler that this is intended
so that it doesn't trigger -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In parse_git_diff_header(), we have a table-driven parser that maps
strings to handler functions. Not all handlers need all of the
parameters; let's mark the unused ones to appease -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing approxidates, we use a table to map special strings (like
"noon") to functions which handle them. Not all functions need the "now"
parameter, as they are not relative (e.g., "yesterday" does, but "pm"
does not). Let's annotate those to make -Wunused-parameter happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
String-lists may be used with callbacks for clearing or iteration. These
callbacks need to conform to a particular interface, even though not
every callback needs all of its parameters. Mark the unused ones to make
-Wunused-parameter happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 0'th entry of our hash_algos array fills out the virtual methods
with a series of functions which simply BUG(). This is the right thing
to do, since the point is to catch use of an invalid algo parameter, but
we need to annotate them to appease -Wunused-parameter.
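For illustration only (the stand-in macro and stub below are
simplified, not the exact code in hash.h), the pattern is an annotated
parameter on a function that exists only to BUG():

    #define UNUSED __attribute__((unused)) /* stand-in for the tree's macro */

    static void unknown_hash_init(void *ctx UNUSED)
    {
            BUG("trying to init unknown hash algorithm");
    }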
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a platform feature isn't available or in use, we sometimes
conditionally compile empty or trivial functions to turn these into
noops. We need to annotate their parameters so that -Wunused-parameter
won't complain about them.
Note that there are many more of these in compat/mingw.h, but we'll
leave them for now, as there's some trickery required to get the UNUSED
macro available there.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The parse-options callback for --again soaks up all remaining options by
manipulating the parse_opt_ctx's argc and argv fields. Even though it
has to look at both, the actual parsing happens via the do_reupdate()
helper, which only looks at the argv half (by passing it along to
parse_pathspec). So that helper doesn't need to see argc at all.
Note that the helper does look at "argv + 1" without confirming that
argc is greater than 0. We know this is correct because it is skipping
past the actual "--again" string, which will always be present. However,
to make what's going on more obvious, let's move that "+1" into the
caller, which has the matching "-1" when fixing up the ctx's argc/argv.
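A sketch of the resulting shape (simplified from the description
above; error handling and surrounding context omitted):

    /* the caller skips the literal "--again" itself ... */
    has_errors = do_reupdate(ctx->argv + 1, prefix);

    /* ... and keeps its existing argc/argv fixup */
    ctx->argv += ctx->argc - 1;
    ctx->argc = 1;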
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The module_list_compute() function takes an argc/argv pair, but never
looks at argc. This is OK, as the NULL terminator in argv is sufficient
for our purposes (we feed it to parse_pathspec(), which takes only the
array, not a count).
Note that one of the callers _looks_ like it would be buggy, but isn't:
we pass 0/NULL for argc/argv from module_foreach(), so finding the
terminating NULL in that argv naively would segfault. However,
parse_pathspec() is smart enough to interpret a bare NULL as an empty
argv.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callback interface for xdiff_emit_line_fn gives us a line/len pair,
but diffstat_consume() never looks at "len". At first glance this seems
like a bug that could cause us to read further than xdiff intends. But
in practice, we read only the first character, and xdiff would never
pass us an empty line.
Let's add a run-time assertion that this is true, which clarifies our
assumption and silences -Wunused-parameter.
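A minimal sketch of the guard meant here, with the body reduced to
roughly the shape described above (field names are illustrative):

    static int diffstat_consume(void *priv, char *line, unsigned long len)
    {
            struct diffstat_t *diffstat = priv;

            if (!len)
                    BUG("xdiff fed us an empty line");
            if (line[0] == '+')
                    diffstat->files[diffstat->nr - 1]->added++;
            else if (line[0] == '-')
                    diffstat->files[diffstat->nr - 1]->deleted++;
            return 0;
    }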
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch --edit-description" on an unborh branch misleadingly
said that no such branch exists, which has been corrected.
* rj/branch-edit-desc-unborn:
branch: description for non-existent branch errors
Remove error detection from a function that fetches from promisor
remotes, and make it die when such a fetch fails to bring all the
requested objects, to give an early failure to various operations.
* jt/promisor-remote-fetch-tweak:
promisor-remote: die upon failing fetch
promisor-remote: remove a return value
Clarify that "the sentence after <area>: prefix does not begin with
a capital letter" rule applies only to the commit title.
* jc/use-of-uc-in-log-messages:
SubmittingPatches: use usual capitalization in the log message body
Update comment in the Makefile about the RUNTIME_PREFIX config knob.
* dd/document-runtime-prefix-better:
Makefile: clarify runtime relative gitexecdir
The code to clean temporary object directories (used for
quarantine) tried to remove them inside its signal handler, which
was a no-no.
* jc/tmp-objdir:
tmp-objdir: skip clean up when handling a signal
"GIT_EDITOR=: git branch --edit-description" resulted in failure,
which has been corrected.
* jc/branch-description-unset:
branch: do not fail a no-op --edit-desc
Code clean-up.
* jk/cleanup-callback-parameters:
attr: drop DEBUG_ATTR code
commit: avoid writing to global in option callback
multi-pack-index: avoid writing to global in option callback
test-submodule: inline resolve_relative_url() function
By default, use of fsmonitor on a repository on networked
filesystem is disabled. Add knobs to make it workable on macOS.
* ed/fsmonitor-on-networked-macos:
fsmonitor: fix leak of warning message
fsmonitor: add documentation for allowRemote and socketDir options
fsmonitor: check for compatability before communicating with fsmonitor
fsmonitor: deal with synthetic firmlinks on macOS
fsmonitor: avoid socket location check if using hook
fsmonitor: relocate socket file if .git directory is remote
fsmonitor: refactor filesystem checks to common interface
Treating the action as a string is a holdover from the scripted
rebase. The last commit removed the only remaining use of the action
that required a string, so let's convert the other action users to use
the existing action enum instead. If we ever need the action name as a
string in the future the action_names array exists exactly for that
purpose.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When aborting a rebase the reflog message looks like
rebase (abort): updating HEAD
which is not very informative. Improve the message by mentioning the
branch that we are returning to, as we do at the end of a successful
rebase, so that it looks like:
rebase (abort): returning to refs/heads/topic
If GIT_REFLOG_ACTION is set in the environment we no longer omit
"(abort)" from the reflog message. We don't omit "(start)" and
"(finish)" when starting and finishing a rebase in that case so we
shouldn't omit "(abort)".
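For illustration, the message is composed roughly like this (a sketch
only, assuming the resolved reflog action is in "action" and the
branch name in "head_name"):

    strbuf_addf(&msg, "%s (abort): returning to %s", action, head_name);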
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The apply backend creates slightly different reflog messages from the
merge backend when starting or finishing a rebase and when picking
commits. These differences make it harder than it needs to be to parse
the reflog (I have a script that reads the finishing messages from
rebase and it is a pain to have to accommodate two different message
formats). While it is possible to determine the backend used for a
rebase from the reflog messages, the differences are not designed for
that purpose. c2417d3af7 (rebase: drop '-i' from the reflog for
interactive-based rebases, 2020-02-15) removed the clear distinction
between the reflog messages of the two backends without complaint.
As the merge backend is the default it is likely to be the format most
common in existing reflogs. For that reason the apply backend is changed
to format its reflog messages to match the merge backend as closely as
possible. Note that there is still a difference as when committing a
conflict resolution the apply backend will use "(pick)" rather than
"(continue)" because it is not currently possible to change the message
for a single commit.
In addition to c2417d3af7 we also changed the reflog messages in
68aa495b59 (rebase: implement --merge via the interactive machinery,
2018-12-11) and 2ac0d6273f (rebase: change the default backend from "am"
to "merge", 2020-02-15). This commit makes the same change to "git
rebase --apply" that 2ac0d6273f made to "git rebase" without any backend
specific options. As the messages are changed to use an existing format
any scripts that can parse the reflog messages of the default rebase
backend should be unaffected by this change.
There are existing tests for the messages from both backends which are
adjusted to ensure that they do not get out of sync in the future.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog messages when finishing a rebase hard-code "rebase" rather
than using GIT_REFLOG_ACTION; use GIT_REFLOG_ACTION instead.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog message for every pick after running "rebase --skip" looks
like
rebase (skip) (pick): commit subject line
Fix this by not appending " (skip)" to the reflog action.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog message for a conflict resolution committed by "rebase
--continue" looks like
rebase (continue): commit subject line
Unfortunately the reflog message for each subsequent pick looks like
rebase (continue) (pick): commit subject line
Fix this by setting the reflog message for "rebase --continue" in
sequencer_continue() so it does not affect subsequent commits. This
introduces a memory leak similar to the one leaking GIT_REFLOG_ACTION
in pick_commits(). Both of these will be fixed in a future series that
stops the sequencer calling setenv().
If we fail to commit the staged changes then we error out so
GIT_REFLOG_ACTION does not need to be reset in that case.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the tests in preparation for adding more tests in the next
few commits. The reworked tests use the same function for testing both
the "merge" and "apply" backends. The test coverage for the "apply"
backend now includes setting GIT_REFLOG_ACTION.
Note that rebasing the "conflicts" branch does not create any
conflicts yet. A commit to do that will be added in the next commit
and the diff ends up smaller if we don't rename the branch when
it is added.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use move_to_original_branch() when reattaching HEAD after a fast-forward
rather than open coding a copy of that code. move_to_original_branch()
does not call reset_head() if head_name is NULL, but there should be no
user-visible changes even though we currently call reset_head() in that
case. The reason for this is that the reset_head() call does not add a
message to the reflog because we're not changing the commit that HEAD
points to and so lock_ref_for_update() elides the update. When head_name
is not NULL then reset_head() behaves like "git symbolic-ref" and so the
reflog is updated.
Note that the removal of "strbuf_release(&msg)" is safe as there is an
identical call just above this hunk which can be seen by viewing the
diff with -U6.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* pw/rebase-keep-base-fixes:
rebase --keep-base: imply --no-fork-point
rebase --keep-base: imply --reapply-cherry-picks
rebase: factor out branch_base calculation
rebase: rename merge_base to branch_base
rebase: store orig_head as a commit
rebase: be stricter when reading state files containing oids
t3416: set $EDITOR in subshell
t3416: tighten two tests
Given the name of the option it is confusing if --keep-base actually
changes the base of the branch without --fork-point being explicitly
given on the command line.
The combination of --keep-base with an explicit --fork-point is still
supported even though --fork-point means we do not keep the same base
if the upstream branch has been rewound. We do this in case anyone is
relying on this behavior, which is tested in t3431 [1].
[1] https://lore.kernel.org/git/20200715032014.GA10818@generichostname/
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As --keep-base does not rebase the branch it is confusing if it
removes commits that have been cherry-picked to the upstream branch.
As --reapply-cherry-picks is not supported by the "apply" backend this
commit ensures that cherry-picks are reapplied by forcing the upstream
commit to match the onto commit unless --no-reapply-cherry-picks is
given.
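A sketch of the mechanism described above (field names follow the
rebase options loosely and are illustrative, not the exact patch):

    /* with --keep-base, compare against the onto commit itself so that
     * upstream cherry-picks are no longer detected and dropped */
    if (keep_base && options.reapply_cherry_picks)
            options.upstream = options.onto;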
Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Separate out calculating the merge base between 'onto' and 'HEAD' from
the check for whether we can fast-forward or not. This means we can skip
the fast-forward checks when the rebase is forced and avoid calculating
the merge-base between 'HEAD' and 'onto' when --keep-base is given.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge_base is not a very descriptive name; the variable always holds
the merge-base of 'branch' and 'onto', which is the commit at the base
of the branch being rebased, so rename it to branch_base.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using a struct commit rather than a struct oid to hold orig_head means
that we error out straight away if the branch being rebased does not
point to a commit. It also simplifies the code that handles finding
the merge base and fork point as it no longer has to convert from an
oid to a commit.
To avoid changing the behavior of "git rebase <upstream> <branch>" we
keep the existing call to read_ref() and use lookup_commit_object()
on the oid returned by that rather than calling
lookup_commit_reference_by_name() which applies the ref dwim rules to
its argument.
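A sketch of the lookup described above (simplified; "buf" holds the
fully qualified branch ref and the error messages are illustrative):

    struct object_id oid;

    if (read_ref(buf.buf, &oid))
            die(_("no such ref: '%s'"), buf.buf);
    options.orig_head = lookup_commit_object(the_repository, &oid);
    if (!options.orig_head)
            die(_("'%s' does not point to a commit"), buf.buf);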
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The state files for 'onto' and 'orig_head' should contain a full hex
oid, change the reading functions from get_oid() to get_oid_hex() to
reflect this. They should also name commits and not tags so add and use
a function that looks up a commit from an oid like
lookup_commit_reference() but without dereferencing tags.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As $EDITOR is exported, setting it in one test affects all subsequent
tests. Avoid this by always setting it in a subshell. Also remove a
couple of unnecessary calls to set_fake_editor where the editor does
not change the todo list.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a check for the correct error message to the tests that check we
require a single merge base so we can be sure the rebase failed for
the correct reason. Also rename the tests to reflect what they are
testing.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests in this script use an unusual and hard-to-reason-about
conditional construct:
    if expression; then false; else :; fi
Change them to use the more idiomatic construct:
    ! expression
Cc: Christian Couder <christian.couder@gmail.com>
Cc: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Nsengiyumva Wilberforce <nsengiyumvawilberforce@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When debugging bitmap generation performance, it is useful to know how
many bitmaps were generated from scratch, and how many were the result
of permuting the bit-order of an existing bitmap.
Keep track of the latter, and emit the count as a trace2_data line to
aid in debugging.
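A sketch of the kind of line meant here (the category and key strings
are illustrative, not necessarily the exact ones emitted):

    trace2_data_intmax("bitmap", the_repository,
                       "building_bitmaps_reused", reused_bitmaps_nr);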
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When debugging MIDX and MIDX-bitmap related issues, it is useful to
figure out where Git is spending its time.
GitHub has been using the below trace2 regions to instrument various
components of generating a MIDX itself, as well time spent preparing to
build a MIDX bitmap.
These are limited to instrumenting the following functions:
- midx.c::find_commits_for_midx_bitmap()
- midx.c::midx_pack_order()
- midx.c::prepare_midx_packing_data()
- midx.c::write_midx_bitmap()
- midx.c::write_midx_internal()
- midx.c::write_midx_reverse_index()
to start and end with a trace2_region_enter() and trace2_region_leave(),
respectively.
The category for all of these is "midx", which matches the existing
convention. The region description matches the name of the function.
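The pattern for each of the listed functions looks roughly like this
(a sketch; the real calls wrap the existing function bodies):

    trace2_region_enter("midx", "write_midx_bitmap", the_repository);
    /* ... existing body of write_midx_bitmap() ... */
    trace2_region_leave("midx", "write_midx_bitmap", the_repository);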
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When generating a multi-pack bitmap without a `--refs-snapshot` (e.g.,
by running `git multi-pack-index write --bitmap` directly), we determine
the set of bitmap-able commits by enumerating each reference, and adding
the referent as the tip of a reachability traversal when it appears
somewhere in the MIDX. (Any commit we encounter during the reachability
traversal then becomes a candidate for bitmap selection).
But we incorrectly avoid peeling the object at the tip of each
reference. So if we see some reference that points at an annotated tag
(which in turn points through zero or more additional annotated tags at
a commit), we will not add it as a tip for the reachability
traversal. This means that if some commit C is only referenced through
one or more annotated tag(s), then C won't become a bitmap candidate.
Correct this by peeling the reference tips as we enumerate them to
ensure that we consider commits which are the targets of annotated tags,
in addition to commits which are referenced directly.
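A sketch of the peeling step (using the refs API; the surrounding
callback is simplified):

    struct object_id peeled;
    const struct object_id *tip = oid;

    if (!peel_iterated_oid(oid, &peeled))
            tip = &peeled; /* an annotated tag: use the commit it points to */
    /* ... then check "tip" against the MIDX and add it as a walk tip ... */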
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was unintentionally introduced via 893b563505 (midx: inline
nth_midxed_pack_entry(), 2021-09-11) where "struct repository *r"
became "struct repository * r".
The latter does not adhere to our usual style conventions, so fix that
up to look more like our usual declarations.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Protected config is implemented by reading a fixed set of paths,
which ignores config [include]-s. Replace this implementation with a
call to config_with_options(), which handles [include]-s and saves us
from duplicating the logic of 1) identifying which paths to read and 2)
reading command line config.
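A sketch of what the replacement looks like (the option fields and the
callback name are illustrative; the point is that config_with_options()
already knows which scopes to read and follows [include]-s):

    struct config_options opts = {
            .respect_includes = 1,
            .ignore_repo = 1,
            .ignore_worktree = 1,
    };

    config_with_options(config_set_callback, &protected_config, NULL, &opts);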
As a result, git_configset_add_parameters() is unused, so remove it. It
was introduced alongside protected config in 5b3c650777 (config: learn
`git_protected_config()`, 2022-07-14) as a way to handle command line
config.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test for the *.txt and *.c output assertions which asserts that
for "-h" lines that aren't the "usage: " or " or: " lines they start
with the same amount of whitespace. This ensures that we won't have
buggy output like:
    [...]
       or: git tag [-n[<num>]]
    [...]
    [--create-reflog] [...]
Which should instead be like this, i.e. the options lines should be
aligned:
    [...]
       or: git tag [-n[<num>]]
    [...]
           [--create-reflog] [...]
It would be better to be able to use "test_cmp" here, i.e. to
construct the output we expect, and compare it against the actual
output.
For most built-in commands this would be rather straightforward. In
"t0450-txt-doc-vs-help.sh" we already compute the whitespace that a
"git-$builtin" needs, and strip away "usage: " or " or: " from the
start of lines. The problem is:
* For commands that implement subcommands, such as "git bundle", we
don't know whether e.g. "git bundle create" is the subcommand
"create", or the argument "create" to "bundle" for the purposes of
alignment.
We *do* have that information from the *.txt version, since the
part within the ''-quotes should be the command & subcommand, but
that isn't consistent (e.g. see "git bundle" and "git
commit-graph", only the latter is correct), and parsing that out
would be non-trivial.
* If we were to make this stricter we have various
non-parse_options() users (e.g. "git diff-tree") that don't have the
nicely aligned output which we've had since
4631cfc20b (parse-options: properly align continued usage output,
2021-09-21).
So rather than make perfect the enemy of the good let's assert that
for those lines that are indented they should all use the same
indentation.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's been a lot of incremental effort to make the SYNOPSIS output
in our documentation consistent with the -h output,
e.g. cbe485298b (git reflog [expire|delete]: make -h output
consistent with SYNOPSIS, 2022-03-17) is one recent example, but that
effort has been an uphill battle due to the lack of regression
testing.
This adds such regression testing: we can parse out the SYNOPSIS
output with "sed", and it turns out it's relatively easy to normalize
it and the "-h" output to match one another.
We now ensure that we won't have regressions when it comes to the list
of commands in "expect_help_to_match_txt" below, and in subsequent
commits we'll make more of them consistent.
The naïve parser here gets quite a few things wrong, but it doesn't
need to be perfect, just good enough that we can compare /some/ of
this help output. There are no cases where the output would match except
for the parser's stupidity; it's all cases of e.g. comparing the *.txt
to non-parse_options() output.
Since that output is wildly different than the *.txt anyway let's
leave this for now, we can fix the parser some other time, or it won't
become necessary as we'll e.g. convert more things to using
parse_options().
Having a special-case for "merge-tree"'s 1f0c3a29da (merge-tree:
implement real merges, 2022-06-18) is a bit ugly, but preferred to
blessing that " (deprecated)" pattern for other commands. We'd
probably want to add some other way of marking deprecated commands in
the SYNOPSIS syntax. Syntactically 1f0c3a29da's way of doing it is
indistinguishable from the command taking an optional literal
"deprecated" string as an argument.
Some of the issues that are left:
* "git show -h", "git whatchanged -h" and "git reflog --oneline -h"
all showing "git log" and "git show" usage output. I.e. the
"builtin_log_usage" in builtin/log.c doesn't take into account what
command we're running.
* Commands which implement subcommands, such as
"multi-pack-index", "notes", "remote" etc., having their subcommands
in a very different order in the *.txt and *.c. Fixing it would
require some verbose diffs, so it's been left alone for now.
* Commands such as "format-patch" have a very long argument list in
the *.txt, but just "[<options>]" in the *.c.
What to do about these has been left out of this series, except to
the extent that preceding commits changed "[<options>]" (or
equivalent) to the list of options in cases where that list of
options was tiny, or we clearly meant to exhaustively list the
options in both *.txt and *.c.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the "worktree" -h output consistent with the *.txt version.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid repeating the "-h" output for the "git worktree" command, and
instead define the usage of each subcommand with macros, so that the
"-h" output for the command itself can re-use those definitions. See
[1], [2] and [3] for prior art using the same pattern.
1. b25b727494 (builtin/multi-pack-index.c: define common usage with a
macro, 2021-03-30)
2. 8757b35d44 (commit-graph: define common usage with a macro,
2021-08-23)
3. 1e91d3faf6 (reflog: move "usage" variables and use macros,
2022-03-17)
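The pattern looks roughly like this (strings abbreviated; see the
prior art above):

    #define BUILTIN_WORKTREE_ADD_USAGE \
            N_("git worktree add [<options>] <path> [<commit-ish>]")
    #define BUILTIN_WORKTREE_LIST_USAGE \
            N_("git worktree list [<options>]")

    static const char * const git_worktree_usage[] = {
            BUILTIN_WORKTREE_ADD_USAGE,
            BUILTIN_WORKTREE_LIST_USAGE,
            NULL
    };

    /* each subcommand then reuses only its own macro */
    static const char * const git_worktree_add_usage[] = {
            BUILTIN_WORKTREE_ADD_USAGE,
            NULL
    };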
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "git reflog" documentation to exhaustively list the
subcommands it accepts in the SYNOPSIS, as opposed to leaving that for
a "[verse]" in the DESCRIPTION section. This documentation style was
added in cf39f54efc (git reflog show, 2007-02-08), but isn't how
other commands which take subcommands are documented.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the "-h" output of "git commit" consistent with the *.txt version
by exhaustively listing the options that it takes.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the "diff-tree -h" output consistent with the *.txt version.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct uses of "<label>..." where we really meant to say
"[<label>...]", i.e. the command in question taken an optional set of
"<label>". As the CodingGuidelines notes "[o]ptional parts [should be]
enclosed in square brackets".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cmd_blame() already detected whether it was processing "blame" or
"annotate", but it didn't adjust its usage output accordingly. Let's
do that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend both the -h output and *.txt to match one another. In this case
the *.txt didn't list the "save" subcommand, and the "-h" was
similarly missing some commands.
Let's also convert the *.c code to use a macro definition, similar to
that used in preceding commits. This avoids duplication.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change those built-in commands that were attempting to exhaustively
list the options in the "-h" output to actually do so, and always
have *.txt documentation know about the exhaustive list of options.
Let's also fix the documentation and -h output for those built-in
commands where the *.txt and -h output was a mismatch of missing
options on both sides.
In the case of "interpret-trailers" fixing the missing options reveals
that the *.txt version was implicitly claiming that the command had
two operating modes, which a look at the -h version (and studying the
documentation) will show is not the case.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the "git cmd" form instead of "git-cmd" for both "git
receive-pack" and "git credential-cache--daemon".
For "git-receive-pack" we do have a binary with that name, even when
installed with SKIP_DASHED_BUILT_INS=YesPlease, but for the purposes
of the SYNOPSIS let's use the "git cmd" form like everywhere else. It
can be invoked like that (and our tests do so); the parts of our
documentation that explain when you need to use the dashed form
already do so, and use it there.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the -h output to match that of the *.txt output, the differences
were fairly small. In the case of "[<options>]" we only have a few of
them, so let's exhaustively list them as in the *.txt.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The C version was right to use "()" in place of "[]" around the option
listing, let's update the *.txt version accordingly, and furthermore
list the *.c options in the same order as the *.txt.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For "rerere" say "pathspec" consistently, and list the subcommands in
the order that they're discussed in the "COMMANDS" section of the
documentation.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix various issues of SYNOPSIS and -h output syntax where:
* Options such as --force were missing entirely
* ...or the short option, such as -f
* We said "opts" or "options", but could instead enumerate
the (small) set of supported options
* Options that were missing entirely (ls-remote's --sort=<key>)
As we can specify "--sort" multiple times (it's backed by a
string-list" it should really be "[(--sort=<key>)...]", which is
what "git for-each-ref" lists it as, but let's leave that issue for
a subsequent cleanup, and stop at making these consistent. Other
"ref-filter.h" users share the same issue, e.g. "git-branch.txt".
* For "verify-tag" and "verify-commit" we were missing the "--raw"
option.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix cases where the SYNOPSIS and -h output was presented in a
different order.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the "[--]" for those cases where the *.txt and -h were
inconsistent, or where we incorrectly stated in one but not the other
that the "--" was mandatory.
In the case of "rev-list" both sides were wrong, as we we don't
require one or more paths if "--" is used, e.g. this is OK:
git rev-list HEAD --
That part of this change is not a "doc txt & -h consistency" change,
as we're changing both versions, doing so here makes both sides
consistent.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix various inconsistencies between command SYNOPSIS and the
corresponding -h output where our translatable labels didn't match
up.
In some cases we need to adjust the prose that follows the SYNOPSIS
accordingly, as it refers back to the changed label.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change "builtin/credential-cache--daemon.c" to use "<socket-path>" not
"<socket_path>" in a placeholder label, almost all of our
documentation uses this form.
This is now consistent with the "If a placeholder has multiple words,
they are separated by dashes" guideline added in
9c9b4f2f8b (standardize usage info string format, 2015-01-13); let's
add a now-passing test to assert that that's the case.
To do this we need to introduce a very sed-powered parser to extract
the SYNOPSIS from the *.txt, and handle the fact that not all commands
with "-h" have a corresponding *.txt (e.g. "bisect--helper"). We'll still want
to handle syntax edge cases in the *.txt in subsequent commits for
other checks, but let's do that then.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's arguably more correct to say "[<option>...]" than either of these
forms, but the vast majority of our documentation uses the
"[<options>]" form to indicate an arbitrary number of options, let's
do the same in these cases, which were the odd ones out.
In the case of "mv" and "sparse-checkout" let's add the missing "[]"
to indicate that these are optional.
In the case of "t/helper/test-proc-receive.c" there is no *.txt
version, making it the only hunk in this commit that's not a "doc txt
& -h consistency" change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The whitespace padding of alternatives should be of the form "[-f |
--force]" not "[-f|--force]". Likewise we should not have padding
before the first option, so "(--all | <pack-filename>...)" is correct,
not "( --all | <pack-filename>... )".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The whitespace padding of alternatives should be of the form "[-f |
--force]" not "[-f|--force]". Likewise we should not have padding
before the first option, so "(--all | <pack-filename>...)" is correct,
not "( --all | <pack-filename>... )".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a "-h" output syntax issue introduced when "--diagnose" was added
in aac0e8ffee (builtin/bugreport.c: create '--diagnose' option,
2022-08-12): We need to close the "[" we opened. The
corresponding *.txt change did not have the same issue.
The "help -h" output then had one "]" too many, which is an issue
introduced in b40845293b (help: correct the usage string in -h and
documentation, 2021-09-10).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug in db9d67f2e9 (builtin/cat-file.c: support NUL-delimited
input with `-z`, 2022-07-22): before that change the SYNOPSIS and "-h"
output were the same, but not afterwards.
That change followed a similar earlier divergence in
473fa2df08 (Documentation: add --batch-command to cat-file synopsis,
2022-04-07). Subsequent commits will fix this sort of thing more
systematically, but let's fix this one as a one-off.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the incorrect "[-o | --option <argument>]" syntax, which should be
"[(-o | --option) <argument>]", we were previously claiming that only
the long option accepted the "<argument>", which isn't what we meant.
This syntax issue for "bugreport" originated in
238b439d69 (bugreport: add tool to generate debugging info,
2020-04-16), and for "diagnose" in 6783fd3cef (builtin/diagnose.c:
create 'git diagnose' builtin, 2022-08-12), which copied and adjusted
"bugreport" documentation and code.
In the case of "Documentation/git-stash.txt" and "builtin/stash.c"
this is not a "doc txt & -h consistency" change, as we're changing
both versions, doing so here makes a subsequent change smaller.
In that case we fix the same incorrect "[-o | --option <argument>]"
syntax in both the *.txt and the -h output.
The "stash" issue has been with us in both the "-h" and *.txt versions
since bd514cada4 (stash: introduce 'git stash store', 2013-06-15).
We could claim that this isn't a syntax issue if a "vertical bar binds
tighter than option and its argument", but such a rule would change
e.g. this "cat-file" SYNOPSIS example to mean something we don't:
... [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]
We have various other examples where the post-image here is already
used, e.g. for "format-patch" ("-o"), "grep" ("-m"),
"submodule" ("set-branch -b") etc.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the documentation and -h output for those built-in commands
where both the -h output and *.txt were lacking in word-wrapping.
There are many more built-ins that could use this treatment, this
change is narrowed to those where this whitespace change is needed to
make the -h and *.txt consistent in the end.
In the case of "Documentation/git-hash-object.txt" and
"builtin/hash-object.c" this is not a "doc txt & -h consistency"
change, as we're changing both versions, doing so here makes a
subsequent change smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change commands in the "diff" family and "rev-list" to separate the
usage information and option listing with an empty line.
In the case of "git diff -h" we did this already (but let's use a
consistent "\n" pattern there), for the rest these are now consistent
with how the parse_options() API would emit usage.
As we'll see in a subsequent commit this also helps to make the "git
<cmd> -h" output more easily machine-readable, as we can assume that
the usage information is separated from the options by an empty line.
Note that "COMMON_DIFF_OPTIONS_HELP" starts with a "\n", so the
seeming omission of a "\n" here is correct; the second one is provided
by the macro.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of our commands use ''-quotation only for the name of the command
itself, and not its (optional) arguments. Let's do the same for these.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Almost all of our documentation doesn't use "'" syntax for
subcommands, but these did, let's make them consistent with the
rest.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid repeating the "-h" output for the "git bundle" command, and
instead define the usage of each subcommand with macros, so that the
"-h" output for the command itself can re-use those definitions. See
[1], [2] and [3] for prior art using the same pattern.
1. b25b727494 (builtin/multi-pack-index.c: define common usage with a
macro, 2021-03-30)
2. 8757b35d44 (commit-graph: define common usage with a macro,
2021-08-23)
3. 1e91d3faf6 (reflog: move "usage" variables and use macros,
2022-03-17)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix indentation issues introduced with 73c3253d75 (bundle: framework
for options before bundle file, 2019-11-10), and carried forward in
some subsequent commits.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Edit the section which explains how to create a good SYNOPSIS section
for clarity and accuracy, it was mostly introduced in
c455bd8950 (CodingGuidelines: Add a section on writing documentation,
2010-11-04):
* Change "extra" example to "file", which now naturally follows from
previous "<file>..." example (one or more) to "[<file>...]" (zero or
more).
* Explain how we prefer spacing around "[]()" tokens and "|"
alternatives, this is not a new policy, but just codifies what's
already the pattern in the most wide use in the documentation.
Having a space around " | " for flags, but not for flag values is
inconsistent, but this style guide codifies existing
patterns. Grepping shows that we don't have any instance matching the
second "Don't" example:
git grep -E -h -o '=\([^)]+\)' -- builtin Documentation/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test to assert basic compliance with the CodingGuidelines in the
SYNOPSIS and builtin -h output. For now we only assert that the "-h"
output doesn't have "\t" characters, as a very basic syntax check.
Subsequent commits will expand on the checks here as various issues
are fixed, but let's first add the test scaffolding.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the *_fn members removed in a preceding commit, let's not copy
the "processes" member of the "struct run_process_parallel_opts" over
to the "struct parallel_processes".
In this case we need the number of processes for the kill_children()
function, which will be called from a signal handler. To do that
adjust this code added in c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) so that we use a
dedicated "struct parallel_processes_for_signal" for passing data to
the signal handler, in addition to the "struct parallel_process" it'll
now have access to our "opts" variable.
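A sketch of the shape of that helper (member names as described above;
the file-scope instance name is illustrative, and is what the signal
handler reads):

    static struct parallel_processes_for_signal {
            const struct run_process_parallel_opts *opts;
            const struct parallel_processes *pp;
    } pp_sig;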
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue the migration away from the "max_processes" member of "struct
parallel_processes" to the "processes" member of the "struct
run_process_parallel_opts", in this case we needed to pass the "opts"
further down into pp_cleanup() and pp_buffer_stderr().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Neither the "processes" nor "max_processes" members ever change after
their initialization, and they're always equivalent, but some existing
code used "pp->max_processes" when we were already passing the "opts"
to the function, let's use the "opts" directly instead.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the *_fn members removed in a preceding commit, let's not copy
the "data" member of the "struct run_process_parallel_opts" over to
the "struct parallel_processes". Now that we're passing the "opts"
down there's no reason to do so.
This makes the code easier to follow, as we have a "const" attribute
on the "struct run_process_parallel_opts", but not "struct
parallel_processes". We do not alter the "ungroup" argument, so
storing it in the non-const structure would make this control flow
less obvious.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the *_fn members removed in the preceding commit, let's not
copy the "ungroup" member of the "struct run_process_parallel_opts"
over to the "struct parallel_processes". Now that we're passing the
"opts" down there's no reason to do so.
This makes the code easier to follow, as we have a "const" attribute
on the "struct run_process_parallel_opts", but not "struct
parallel_processes". We do not alter the "ungroup" argument, so
storing it in the non-const structure would make this control flow
less obvious.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only remaining reason for copying the callbacks in the "struct
run_process_parallel_opts" over to the "struct parallel_processes" was
to avoid two if/else statements in case the "start_failure" and
"task_finished" callbacks were NULL.
Let's handle those cases in pp_start_one() and pp_collect_finished()
instead, and avoid the default_* stub functions, and the need to copy
this data around.
Organizing the code like this made more sense before the "struct
run_parallel_parallel_opts" existed, as we'd have needed to pass each
of these as a separate parameter.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a "const" to two "struct parallel_processes" parameters where
we're not modifying anything in "pp". For kill_children() we'll call
it from both the signal handler, and from run_processes_parallel()
itself. Adding a "const" there makes it clear that we don't need to
modify any state when killing our children.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Have the users of the "run_processes_parallel_tr2()" function use
"run_processes_parallel()" instead. In preceding commits the latter
was refactored to take a "struct run_process_parallel_opts" argument,
since the only reason for "run_processes_parallel_tr2()" to exist was
to take arguments that are now a part of that struct we can do away
with it.
See ee4512ed48 (trace2: create new combined trace facility,
2019-02-22) for the addition of the "*_tr2()" variant of the function,
it was used by every caller except "t/helper/test-run-command.c".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As noted in fd3aaf53f7 (run-command: add an "ungroup" option to
run_process_parallel(), 2022-06-07), which added the "ungroup" option,
passing it to "run_process_parallel()" via the global
"run_processes_parallel_ungroup" variable was a compromise to get the
smallest possible regression fix for "maint" at the time.
This follow-up to that is a start at passing that parameter and others
via a new "struct run_process_parallel_opts", as the earlier
version[1] of what became fd3aaf53f7 did.
Since we need to change all of the occurrences of "n" to
"opt->SOMETHING" let's take the opportunity and rename the terse "n"
to "processes". We could also have picked "max_processes", "jobs",
"threads" etc., but as the API is named "run_processes_parallel()"
let's go with "processes".
Since the new "run_processes_parallel()" function is able to take an
optional "tr2_category" and "tr2_label" via the struct we can at this
point migrate all of the users of "run_processes_parallel_tr2()" over
to it.
But let's not migrate all the API users yet, only the two users that
passed the "ungroup" parameter via the
"run_processes_parallel_ungroup" global
1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/
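After this change a caller looks roughly like this (the callback and
data names are illustrative, not any particular in-tree caller):

    const struct run_process_parallel_opts opts = {
            .tr2_category = "my-category",
            .tr2_label = "my-label",
            .processes = jobs,
            .get_next_task = get_next_task_cb,
            .start_failure = start_failure_cb,
            .task_finished = task_finished_cb,
            .data = &state,
    };

    run_processes_parallel(&opts);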
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use a designated initializer to initialize those parts of pp_init()
that don't need any conditionals for their initialization; this sets
us on a path to turning pp_init() itself into mostly a validation and
allocation function.
Since we're doing that we can add "const" to some of the members of
the "struct parallel_processes", which helps to clarify and
self-document this code. E.g. we never alter the "data" pointer we
pass to user callbacks, nor (after the preceding change to stop
invoking online_cpus()) do we change "max_processes", the same goes
for the "ungroup" option.
We can also do away with a call to strbuf_init() in favor of macro
initialization, and to rely on other fields being NULL'd or zero'd.
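A sketch of the initialization this enables (member names are
illustrative):

    struct parallel_processes pp = {
            .max_processes = n,
            .data = data,
            .ungroup = ungroup,
            .buffered_output = STRBUF_INIT,
    };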
Making members of a struct "const" rather than the pointer to the
struct itself is usually painful, as e.g. it precludes us from
incrementally setting up the structure. In this case we only set it up
with the assignment in run_process_parallel() and pp_init(), and don't
pass the struct pointer around as "const", so making individual
members "const" is worth the potential hassle for extra safety.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a "jobs = 0" is passed let's BUG() out rather than fall back on
online_cpus(). The default behavior was added when this API was
implemented in c553c72eed (run-command: add an asynchronous parallel
child processor, 2015-12-15).
Most of our code in-tree that scales up to "online_cpus()" by default
calls that function by itself. Keeping this default behavior just for
the sake of two callers means that we'd need to maintain this one spot
where we're second-guessing the config passed down into pp_init().
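The check itself is tiny; a sketch (the message wording is
illustrative):

    if (!n)
            BUG("you must provide a non-zero number of processes!");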
The preceding commit has an overview of the API callers that passed
"jobs = 0". There were only two of them (actually three, but they
resolved to these two config parsing codepaths).
The "fetch.parallel" caller already had a test for the
"fetch.parallel=0" case added in 0353c68818 (fetch: do not run a
redundant fetch from submodule, 2022-05-16), but there was no such
test for "submodule.fetchJobs". Let's add one here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the "n" variable added in c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) a "size_t". As
we'll see in a subsequent commit we do pass "0" here, but never "jobs
< 0".
We could have made it an "unsigned int", but as we're having to change
this let's not leave another case in the codebase where a size_t and
"unsigned int" size differ on some platforms. In this case it's likely
to never matter, but it's easier to not need to worry about it.
After this and preceding changes:
make run-command.o DEVOPTS=extra-all CFLAGS=-Wno-unused-parameter
Only has one (and new) -Wsign-compare warning relevant to a
comparison about our "n" or "{nr,max}_processes": about using our
"n" (size_t) in the same expression as online_cpus() (int). A
subsequent commit will adjust & deal with online_cpus() and that
warning.
The only users of the "n" parameter are:
* builtin/fetch.c: defaults to 1, reads from the "fetch.parallel"
config. As seen in the code that parses the config added in
d54dea77db (fetch: let --jobs=<n> parallelize --multiple, too,
2019-10-05) will die if the git_config_int() return value is < 0.
It will however pass us n = 0, as we'll see in a subsequent commit.
* submodule.c: defaults to 1, reads from "submodule.fetchJobs"
config. Read via code originally added in a028a1930c (fetching
submodules: respect `submodule.fetchJobs` config option, 2016-02-29).
It now piggy-backs on the submodule.fetchJobs code and
validation added in f20e7c1ea2 (submodule: remove
submodule.fetchjobs from submodule-config parsing, 2017-08-02).
Like builtin/fetch.c it will die if the git_config_int() return
value is < 0, but like builtin/fetch.c it will pass us n = 0.
* builtin/submodule--helper.c: defaults to 1. Read via code
originally added in 2335b870fa (submodule update: expose parallelism
to the user, 2016-02-29).
Since f20e7c1ea2 (submodule: remove submodule.fetchjobs from
submodule-config parsing, 2017-08-02) it shares a config parser and
semantics with the submodule.c caller.
* hook.c: hardcoded to 1, see 96e7225b31 (hook: add 'run'
subcommand, 2021-12-22).
* t/helper/test-run-command.c: can be -1 after parsing the arguments,
but will then be overridden to online_cpus() before passing it to
this API. See be5d88e112 (test-tool run-command: learn to run (parts
of) the testsuite, 2019-10-04).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "run-command" test helper to "return" instead of calling
"exit", see 338abb0f04 (builtins + test helpers: use return instead
of exit() in cmd_*, 2021-06-08).
Because we'd previously gotten past the SANITIZE=leak check by using
exit() here we need to move to a "goto cleanup" pattern.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "run_processes_parallel{,_tr2}()" functions to return void,
instead of int. Ever since c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) they have
unconditionally returned 0.
To get a "real" return value out of this function the caller needs to
get it via the "task_finished_fn" callback, see the example in hook.c
added in 96e7225b31 (hook: add 'run' subcommand, 2021-12-22).
So the "result = " and "if (!result)" code added to "builtin/fetch.c"
d54dea77db (fetch: let --jobs=<n> parallelize --multiple, too,
2019-10-05) has always been redundant, we always took that "if"
path. Likewise the "ret =" in "t/helper/test-run-command.c" added in
be5d88e112 (test-tool run-command: learn to run (parts of) the
testsuite, 2019-10-04) wasn't used, instead we got the return value
from the "if (suite.failed.nr > 0)" block seen in the context.
Subsequent commits will alter this API interface, getting rid of this
always-zero return value makes it easier to understand those changes.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the cmd__run_command() to use an "if/else if" chain rather than
mutually exclusive "if" statements. This non-functional change makes a
subsequent commit smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
New explanation for the difference between these values, since it's
hard to understand what they do based only on the names. New
description of the default ports used.
Signed-off-by: Sotir Danailov <sndanailov@wired4ever.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When downloading bundles from a git-remote-https subprocess, the bundle
URI logic wants to be opportunistic and download as much as possible and
work with what did succeed. This is particularly important in the "any"
mode, where any single bundle success will work.
If the URI is not available, the git-remote-https process will die()
with a "fatal:" error message, even though that error is not actually
fatal to the super process. Since stderr is passed through, it looks
like a fatal error to the user.
Suppress stderr to keep these errors from bubbling to the surface. The
bundle URI API adds its own warning() messages on these failures.
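A sketch of the suppression (using the run-command child_process API;
the surrounding setup of the git-remote-https invocation is omitted):

    struct child_process cp = CHILD_PROCESS_INIT;

    /* ... set up the git-remote-https invocation as before ... */
    cp.no_stderr = 1; /* its die() output would otherwise look fatal to the user */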
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When downloading a list of bundles in "all" mode, Git has no
understanding of the dependencies between the bundles. Git attempts to
unbundle the bundles in some order, but some may not pass the
verify_bundle() step because of missing prerequisites. These failures
are passed on as error messages to the user, even when the bundles
eventually succeed in later attempts after their dependent bundles are
unbundled.
Add a new VERIFY_BUNDLE_QUIET flag to verify_bundle() that avoids the
error messages from the missing prerequisite commits. The method still
returns the number of missing prerequisite commits, allowing callers to
unbundle() to notice that the bundle failed to apply.
Use this flag in bundle-uri.c and test that the messages go away for
'git clone --bundle-uri' commands.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The verify_bundle() method has a 'verbose' option, but we will want to
extend this method to have more granular control over its output. First,
replace this 'verbose' option with a new 'flags' option with a single
possible value: VERIFY_BUNDLE_VERBOSE.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the content at a given bundle URI is not understood as a bundle
(based on inspecting the initial content), then Git currently gives up
and ignores that content. Independent bundle providers may want to split
up the bundle content into multiple bundles, but still make them
available from a single URI.
Teach Git to attempt parsing the bundle URI content as a Git config file
providing the key=value pairs for a bundle list. Git then looks at the
mode of the list to see if ANY single bundle is sufficient or if ALL
bundles are required. The content at the selected URIs is downloaded
and inspected again, creating a recursive process.
To guard the recursion against malformed or malicious content, limit the
recursion depth to a reasonable four for now. This can be converted to a
configured value in the future if necessary. The value of four is twice
as high as expected to be useful (a bundle list is unlikely to point to
more bundle lists).
To test this scenario, create an interesting bundle topology where three
incremental bundles are built on top of a single full bundle. By using a
merge commit, the two middle bundles are "independent" in that they do
not require each other in order to unbundle themselves. They each only
need the base bundle. The bundle containing the merge commit requires
both of the middle bundles, though. This leads to some interesting
decisions when unbundling, especially when we later implement heuristics
that promote downloading bundles until the prerequisite commits are
satisfied.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The verify_bundle() method checks two things for a bundle's
prerequisites:
1. Are these objects in the object store?
2. Are these objects reachable from our references?
For this second question, multiple uses of verify_bundle() in the same
process can report a bundle as invalid even though it is correct. The
reason is that not all of the commit marks on the previously walked
commits are cleared.
The revision walk machinery was first introduced in-process by
fb9a54150d (git-bundle: avoid fork() in verify_bundle(), 2007-02-22).
This implementation used "-1" as the set of flags to clear. The next
meaningful change came in 2b064697a5 (revision traversal: retire
BOUNDARY_SHOW, 2007-03-05), which introduced the PREREQ_MARK flag
instead of a flag normally controlled by the revision-walk machinery.
In 86a0a408b9 (commit: factor out
clear_commit_marks_for_object_array, 2011-10-01), the loop over the
array of commits was replaced with a new
clear_commit_marks_for_object_array(), but simultaneously the "-1" value
was replaced with "ALL_REV_FLAGS", which stopped un-setting the
PREREQ_MARK flag. This means that if multiple commits were marked by the
PREREQ_MARK in a previous run of verify_bundle(), then this loop could
terminate early due to 'i' going to zero:
	while (i && (commit = get_revision(&revs)))
		if (commit->object.flags & PREREQ_MARK)
			i--;
The flag clearing work was changed again in 63647391e6 (bundle: avoid
using the rev_info flag leak_pending, 2017-12-25), but that was only
cosmetic and did not change the behavior.
It may seem that it would be sufficient to add the PREREQ_MARK flag to
the clear_commit_marks() call in its current location. However, we
actually need to do it in the "cleanup:" step, since the first loop
checking "Are these objects in the object store?" might add the
PREREQ_MARK flag to some objects and then terminate without performing a
walk due to one missing object. By clearing the flags in all cases, we
avoid this issue when running verify_bundle() multiple times in the same
process.
Moving this loop to the cleanup step alone would cause a segfault when
running 'git bundle verify' outside of a repository, but only because
that error condition uses "goto cleanup" when returning immediately
would be perfectly safe: nothing has been initialized at that point, so
we can return right away without causing any leaks.
This behavior is verified carefully by a test that will be added soon
when Git learns to download bundle lists in a 'git clone --bundle-uri'
command.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next change will start allowing us to parse bundle lists that are
downloaded from a provided bundle URI. Those lists might point to other
lists, which could proceed to an arbitrary depth (and even create
cycles). Restructure fetch_bundle_uri() to have an internal version that
has a recursion depth. Compare that to a new max_bundle_uri_depth
constant that is twice as high as we expect this depth to be for any
legitimate use of bundle list linking.
We can consider making max_bundle_uri_depth a configurable value if
there is demonstrated value in the future.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a bundle provider wants to operate independently from a Git remote,
they want to provide a single, consistent URI that users can use in
their 'git clone --bundle-uri' commands. At this point, the Git client
expects that URI to be a single bundle that can be unbundled and used to
bootstrap the rest of the clone from the Git server. This single bundle
cannot be re-used to assist with future incremental fetches.
To allow for the incremental fetch case, teach Git to understand a
bundle list that could be advertised at an independent bundle URI. Such
a bundle list is likely to be inspected by human readers, even if only
by the bundle provider creating the list. For this reason, we can take
our expected "key=value" pairs and instead format them using Git config
format.
Create bundle_uri_parse_config_format() to parse a file in config format
and convert that into a 'struct bundle_list' filled with its
understanding of the contents.
Be careful to use error_action CONFIG_ERROR_ERROR when calling
git_config_from_file_with_options() because the default action for
git_config_from_file() is to die() on a parsing error. The current
warning isn't particularly helpful if it reaches a user, but it will
be made more verbose at a higher layer later.
Update 'test-tool bundle-uri' to take this config file format as input.
It uses a filename instead of stdin because there is no existing way to
parse a FILE pointer in the config machinery. Using
git_config_from_mem() is overly complicated and more likely to introduce
bugs than this simpler version.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new 'test-tool bundle-uri' test helper. This helper will assist
in testing logic deep in the bundle URI feature.
This change introduces the 'parse-key-values' subcommand, which parses
an input file as a list of lines. These are fed into
bundle_uri_parse_line() to test how we construct a 'struct bundle_list'
from that data. The list is then output to stdout as if the key-value
pairs were a Git config file.
We use an input file instead of stdin because a future change will
parse input in config-file format, which works better with a file.
Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When advertising a bundle list over Git's protocol v2, we will use
packet lines. Each line will be of the form "key=value" representing a
bundle list. Connect the API necessary for Git's transport to the
key-value pair parsing created in the previous change.
We are not currently implementing this protocol v2 functionality, but
instead preparing to expose this parsing to be unit-testable.
Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There will be two primary ways to advertise a bundle list: as a list of
packet lines in Git's protocol v2 and as a config file served from a
bundle URI. Both of these fundamentally use a list of key-value pairs.
We will use the same set of key-value pairs across these formats.
Create a new bundle_list_update() method that is currently unused, but
will be used in the next change. It inspects each key to see if it is
understood and then applies it to the given bundle_list. Here are the
keys that we teach Git to understand:
* bundle.version: This value should be an integer. Git currently
understands only version 1 and will ignore the list if the version is
any other value. This version can be increased in the future if we
need to add new keys that Git should not ignore. We can add new
"heuristic" keys without incrementing the version.
* bundle.mode: This value should be one of "all" or "any". If this
mode is not understood, then Git will ignore the list. This mode
indicates whether Git needs all of the bundle list items to make a
complete view of the content or if any single item is sufficient.
The rest of the keys use a bundle identifier "<id>" as part of the key
name. Keys using the same "<id>" describe a single bundle list item.
* bundle.<id>.uri: This stores the URI of the bundle item. This
currently is expected to be an absolute URI, but will be relaxed to be
a relative URI in the future.
While parsing, return an error if a URI key is repeated, since we can
make that restriction with bundle lists.
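For illustration, a list of such key-value pairs might look like this
(the "<id>" values and URIs below are made up):

    bundle.version=1
    bundle.mode=all
    bundle.base.uri=https://example.com/bundles/base.bundle
    bundle.daily.uri=https://example.com/bundles/daily.bundle

Here "base" and "daily" play the role of "<id>".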
Make the git_parse_int() method global so we can parse the integer
version value carefully.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It will likely be rare that a user uses a single bundle URI and expects
that URI to point to a bundle. Instead, that URI will likely be a list
of bundles provided in some format. Alternatively, the Git server could
advertise a list of bundles.
In anticipation of these two ways of advertising multiple bundles,
create a data structure that represents such a list. This will be
populated using a common API, but for now focus on what data can be
represented.
Each list contains a number of remote_bundle_info structs. These contain
an 'id' that is used to uniquely identify them in the list, and also a
'uri' that contains the location of its data. Finally, there is a strbuf
containing the filename used when Git downloads the contents to disk.
The list itself stores these remote_bundle_info structs in a hashtable
using 'id' as the key. The order of the structs in the input is
considered unimportant, but future modifications to the format and these
data structures will place ordering possibilities on the set. The list
also has a few "global" properties, including the version (used when
parsing the list) and the mode. The mode is one of these two options:
1. BUNDLE_MODE_ALL: all listed URIs are intended to be combined
together. The client should download all of the advertised data to
have a complete copy of the data.
2. BUNDLE_MODE_ANY: any one listed item is sufficient to have a complete
copy of the data. The client can choose arbitrarily from these
options. In the future, the client may use pings to find the closest
URI among geodistributed replicas, or use some other heuristic
information added to the format.
This API is currently unused, but will soon be expanded with parsing
logic and then be consumed by the bundle URI download logic.
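As a rough sketch of the shape described above (the field names and
types here are illustrative; the real definitions live in the bundle
URI header and may differ):

    enum bundle_list_mode {
    	BUNDLE_MODE_ALL,
    	BUNDLE_MODE_ANY,
    };

    struct remote_bundle_info {
    	struct hashmap_entry ent; /* hashed by 'id' */
    	char *id;                 /* unique identifier within the list */
    	char *uri;                /* location of the bundle data */
    	struct strbuf file;       /* filename used when downloaded to disk */
    };

    struct bundle_list {
    	int version;
    	enum bundle_list_mode mode;
    	struct hashmap bundles;   /* of struct remote_bundle_info */
    };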
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The find_temp_filename() method was created in 53a50892be (bundle-uri:
create basic file-copy logic, 2022-08-09) and uses odb_mkstemp() to
create a temporary filename. The odb_mkstemp() method uses a strbuf in
its interface, but we do not need to continue carrying a strbuf
throughout the bundle URI code.
Convert the find_temp_filename() method to use a 'char *' and modify its
only caller. This makes sense because we don't actually need to modify
this filename directly later, so using a strbuf is overkill.
This change will simplify the data structure for tracking a bundle list
to use plain strings instead of strbufs.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codepath to sign learned to report errors when it fails to read
from "ssh-keygen".
* pw/ssh-sign-report-errors:
ssh signing: return an error when signature cannot be read
Fix logic in "mailinfo -b" that miscomputed the length of a
substring, which led to an out-of-bounds access.
* pw/mailinfo-b-fix:
mailinfo -b: fix an out of bounds access
Force C locale while running tests around httpd to make sure we can
find expected error messages in the log.
* rs/test-httpd-in-C-locale:
t/lib-httpd: pass LANG and LC_ALL to Apache
Per 33665d98e6 (reftable: make assignments portable to AIX xlc
v12.01, 2022-03-28) forms like ".a.b = *c" can be replaced by using
".a = { .b = *c }" instead.
We'll probably allow these sooner than later, but since the workaround
is trivial let's note it among the C99 features we'd like to hold off
on for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 79d3696cfb (git-grep: boolean expression on pattern matching.,
2006-06-30) the "pattern_expression" member has been used for complex
queries (AND/OR...), with "pattern_list" being used for the simple OR
queries. Since then we've used both "pattern_expression" and its
associated boolean "extended" member to see if we have a complex
expression.
Since f41fb662f5 (revisions API: have release_revisions() release
"grep_filter", 2022-04-13) we've had a subtle bug relating to that: If
we supplied options that were only used for "complex queries", but
didn't supply the query itself, we'd set "opt->extended", but would
have a NULL "pattern_expression". As a result these would segfault as
we tried to call "free_grep_patterns()" from "release_revisions()":
git -P log -1 --invert-grep
git -P log -1 --all-match
The root cause of this is that we were conflating the state management
we needed in "compile_grep_patterns()" itself with whether or not we
had an "opt->pattern_expression" later on.
In these cases, as we're going through "compile_grep_patterns()", we have
no "opt->pattern_list" but have "opt->no_body_match" or
"opt->all_match". So we'd set "opt->extended = 1", but not "return" on
"opt->extended" as that's an "else if" in the same "if" statement.
That behavior is intentional and required, as the common case is that
we have an "opt->pattern_list" that we're about to parse into the
"opt->pattern_expression".
But we don't need to keep track of this "extended" flag beyond the
state management in compile_grep_patterns() itself. It needs it, but
once we're out of that function we can rely on
"opt->pattern_expression" being non-NULL instead for using these
extended patterns.
As 79d3696cfb itself shows we've assumed that there's a one-to-one
mapping between the two since the very beginning. I.e. "match_line()"
would check "opt->extended" to see if it should call "match_expr()",
and the first thing we do in that function is assume that we have a
"opt->pattern_expression". We'd then call "match_expr_eval()", which
would have died if that "opt->pattern_expression" was NULL.
The "die" was added in c922b01f54 (grep: fix segfault when "git grep
'('" is given, 2009-04-27), and can now be removed as it's now clearly
unreachable. We still do the right thing in the case that prompted
that fix:
git grep '('
fatal: unmatched parenthesis
Arguably neither the "--invert-grep" option added in [1] nor the
earlier "--all-match" option added in [2] were intended to be used
stand-alone, and another approach[3] would be to error out in those
cases. But since we've been treating them as a NOOP when given without
--grep for a long time let's keep doing that.
We could also return early in "free_pattern_expr()" if the argument is
NULL, as an alternative fix for this segfault does [4]. That would
be more elegant in making the "free_*()" function behave like
"free()", but it would also remove a sanity check: The
"free_pattern_expr()" function calls itself recursively, and only the
top-level is allowed to be NULL, let's not conflate those two
conditions.
1. 22dfa8a23d (log: teach --invert-grep option, 2015-01-12)
2. 0ab7befa31 (grep --all-match, 2006-09-27)
3. https://lore.kernel.org/git/patch-1.1-f4b90799fce-20221010T165711Z-avarab@gmail.com/
4. http://lore.kernel.org/git/7e094882c2a71894416089f894557a9eae07e8f8.1665423686.git.me@ttaylorr.com
Reported-by: orygaw <orygaw@protonmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
94bc671a1f (Add directory pattern matching to attributes, 2012-12-08)
moved the code for adding the trailing slash to names of directories and
submodules up. This left both branches of the if statement starting
with the same conditional fprintf call. Deduplicate it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fsm_settings__get_incompatible_msg() function returns an allocated
string. So we can't pass its result directly to warning(); we must hold
on to the pointer and free it to avoid a leak.
The leak here is small and fixed size, but Coverity complained, and
presumably SANITIZE=leak would eventually.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The branch command with the options "edit-description", "set-upstream-to" and
"unset-upstream" expects a branch name. Since ae5a6c3684 (checkout:
implement "@{-N}" shortcut name for N-th last branch, 2009-01-17) a
branch can be specified using shortcuts like @{-1}. Those shortcuts
need to be resolved when considering the arguments.
We can modify the description of the previously checked out branch with:
$ git branch --edit-description @{-1}
We can modify the upstream of the previously checked out branch with:
$ git branch --set-upstream-to upstream @{-1}
$ git branch --unset-upstream @{-1}
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The C99 section of the CodingGuidelines is a good overview of what we
can use, but is sorely lacking in what we can't use. Something that
comes up occasionally is the portability of %z.
Per [1] we couldn't use it for the longest time because MSVC did not
support it. Nowadays, by requiring C99, we rely on an MSVC version that
does support it, but we still can't use it because a C library that
MinGW uses doesn't support it.
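The usual workaround is to cast to uintmax_t and print with PRIuMAX
instead, along these lines (illustrative snippet only):

    #include <inttypes.h>
    #include <stdio.h>

    static void report_size(size_t len)
    {
    	/* printf("%zu\n", len); relies on %z, which we avoid */
    	printf("%"PRIuMAX"\n", (uintmax_t)len);
    }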
1. https://lore.kernel.org/git/a67e0fd8-4a14-16c9-9b57-3430440ef93c@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 44ba10d671 (revision: use C99 declaration of variable in for()
loop, 2021-11-14), released with v2.35.0, we've had a variable declared
within a for loop.
Since then we've had inadvertent follow-ups to that with at least
cb2607759e (merge-ort: store more specific conflict information,
2022-06-18) released with v2.38.0.
As November 2022 is within the window of this upcoming release,
let's update the guideline to allow this. We can have the promised
"revisit" discussion while this patch cooks, and drop it if it turns
out that it is still premature, which is not expected to happen at
this moment.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first use of variables in initializer elements appears to have
been 2b6854c863 (Cleanup variables in cat-file, 2007-04-21) released
with v1.5.2.
Some of those caused portability issues, and e.g. that "cat-file" use
was changed in 66dbfd55e3 (Rewrite dynamic structure initializations
to runtime assignment, 2010-05-14) which went out with v1.7.2.
But curiously 66dbfd55e3 missed some of them, e.g. an archive.c use
added in d5f53d6d6f (archive: complain about path specs that don't
match anything, 2009-12-12), and another one in merge-index.c (later
builtin/merge-index.c) in 0077138cd9 (Simplify some instances of
run_command() by using run_command_v_opt()., 2009-06-08).
As far as I can tell there's been no point since 2b6854c863 in 2007
where a compiler that didn't support this has been able to compile
git. Presumably 66dbfd55e3 was an attempt to make headway with wider
portability that ultimately wasn't completed.
In any case, we are thoroughly reliant on this syntax at this point,
so let's update the guidelines, see
https://lore.kernel.org/git/xmqqy1tunjgp.fsf@gitster.g/ for the
initial discussion.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 7bc341e21b (git-compat-util: add a test balloon for C99
support, 2021-12-01) we've had a hard dependency on C99, but the prose
in CodingGuidelines was written under the assumption that we were
using C89 with a few C99 features.
As the updated prose notes we'd still like to hold off on novel C99
features, but let's make it clear that we target that C version, and
then enumerate new C99 features that are safe to use.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rebase --preserve-merges no longer exists, so there is no point in
carrying this failing test case.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the "-Wno-missing-braces" option when building with an old version
of clang to suppress the "suggest braces around initialization" error
in developer mode.
For example, using an old version of clang gives the following errors
(when in DEVELOPER=1 mode):
$ make builtin/merge-file.o
CC builtin/merge-file.o
builtin/merge-file.c:29:23: error: suggest braces around initialization \
of subobject [-Werror,-Wmissing-braces]
mmfile_t mmfs[3] = { 0 };
^
{}
builtin/merge-file.c:31:20: error: suggest braces around initialization \
of subobject [-Werror,-Wmissing-braces]
xmparam_t xmp = { 0 };
^
{}
2 errors generated.
This example compiles without error/warning with updated versions of
clang. Since this is an obsolete error, use the -Wno-missing-braces
option to silence the warning when using an older compiler. This
avoids the need to update the code to use "{{0}}" style
initializations.
Upstream clang version 8 has the problem. It was fixed in version 9.
The version of clang distributed by Apple with XCode has its own
unique set of version numbers. Apple clang version 11 has the
problem. It was fixed in version 12.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In read-only repositories, "git merge-tree" tried to come up with a
merge result tree object, which it failed to do (which is not wrong)
and led to a segfault (which is bad); the latter has been corrected.
* js/merge-ort-in-read-only-repo:
merge-ort: return early when failing to write a blob
merge-ort: fix segmentation fault in read-only repositories
"git multi-pack-index repack/expire" used to repack unreachable
cruft into a new pack, which has been corrected.
* tb/midx-repack-ignore-cruft-packs:
midx.c: avoid cruft packs with non-zero `repack --batch-size`
midx.c: remove unnecessary loop condition
midx.c: replace `xcalloc()` with `CALLOC_ARRAY()`
midx.c: avoid cruft packs with `repack --batch-size=0`
midx.c: prevent `expire` from removing the cruft pack
Documentation/git-multi-pack-index.txt: clarify expire behavior
Documentation/git-multi-pack-index.txt: fix typo
"git rebase -i" can mistakenly attempt to apply a fixup to a commit
itself, which has been corrected.
* ja/rebase-i-avoid-amending-self:
sequencer: avoid dropping fixup commit that targets self via commit-ish
Documentation on various Boolean GIT_* environment variables have
been clarified.
* jc/environ-docs:
environ: GIT_INDEX_VERSION affects not just a new repository
environ: simplify description of GIT_INDEX_FILE
environ: GIT_FLUSH should be made a usual Boolean
environ: explain Boolean environment variables
environ: document GIT_SSL_NO_VERIFY
"git grep" learned to expand the sparse-index more lazily and on
demand in a sparse checkout.
* sy/sparse-grep:
builtin/grep.c: integrate with sparse index
"scalar unregister" in a repository that is already been
unregistered reported an error.
* ds/scalar-unregister-idempotent:
string-list: document iterator behavior on NULL input
gc: replace config subprocesses with API calls
scalar: make 'unregister' idempotent
maintenance: add 'unregister --force'
Most credential helpers ignored unknown entries in a credential
description, but a few died upon seeing them. The latter were
taught to ignore them, too.
* mc/cred-helper-ignore-unknown:
osxkeychain: clarify that we ignore unknown lines
netrc: ignore unknown lines (do not die)
wincred: ignore unknown lines (do not die)
"git remote rename" failed to rename a remote without fetch
refspec, which has been corrected.
* jk/remote-rename-without-fetch-refspec:
remote: handle rename of remote without fetch refspec
"git clone" did not like to see the "--bare" and the "--origin"
options used together without a good reason.
* jk/clone-allow-bare-and-o-together:
clone: allow "--bare" with "-o"
"git fsck" failed to release contents of tree objects already used
from the memory, which has been fixed.
* jk/fsck-on-diet:
parse_object_buffer(): respect save_commit_buffer
fsck: turn off save_commit_buffer
fsck: free tree buffers after walking unreachable objects
Suppose you are managing many maintenance tracks in your project,
and some of the more recent ones are maint-2.36 and maint-2.37.
Further imagine that your project recently tagged the official 2.38
release, which means you would need to start maint-2.38 track soon,
by doing:
$ git checkout -b maint-2.38 v2.38.0^0
$ git branch --list 'maint-2.3[6-9]'
* maint-2.38
maint-2.36
maint-2.37
So far, so good. But it also is reasonable to want not to have to
worry about which maintenance track is the latest, by pointing a
more generic-sounding 'maint' branch at it, by doing:
$ git symbolic-ref refs/heads/maint refs/heads/maint-2.38
which would allow you to say "whichever it is, check out the latest
maintenance track", by doing:
$ git checkout maint
$ git branch --show-current
maint-2.38
It is arguably better to say that we are on 'maint-2.38' rather than
on 'maint', and "git merge/pull" would record "into maint-2.38" and
not "into maint", so I think what we have is a good behaviour.
One thing that is slightly irritating, however, is that I do not
think there is a good way (other than "cat .git/HEAD") to learn that
you checked out 'maint' to get into that state. Just like the output
of "git branch --show-current" shows above, "git symbolic-ref HEAD"
would report 'refs/heads/maint-2.38', bypassing the intermediate
symbolic ref at 'refs/heads/maint' that is pointed at by HEAD.
The internal resolve_ref() API already has the necessary support for
stopping after resolving a single level of a symbolic-ref, and we
can expose it by adding a "--[no-]recurse" option to the command.
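With such an option, the example above would presumably let you see
the intermediate symbolic ref, something like:

    $ git symbolic-ref --no-recurse HEAD
    refs/heads/maint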
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check for an error condition whose body unconditionally exits
first, and then perform the special casing of "version" and "help"
as part of the preparation for the "normal codepath". This makes
the code simpler to read.
Signed-off-by: Daniel Sonbolian <dsal3389@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call fspathncmp() instead of open-coding it. This shortens the code and
makes it less repetitive.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the repository does not yet have commits, some errors describe that
there is no branch:
$ git init -b first
$ git branch --edit-description first
error: No branch named 'first'.
$ git branch --set-upstream-to=upstream
fatal: branch 'first' does not exist
$ git branch -c second
error: refname refs/heads/first not found
fatal: Branch copy failed
That "first" branch is unborn but to say it doesn't exists is confusing.
Options "-c" (copy) and "-m" (rename) show the same error when the
origin branch doesn't exists:
$ git branch -c non-existent-branch second
error: refname refs/heads/non-existent-branch not found
fatal: Branch copy failed
$ git branch -m non-existent-branch second
error: refname refs/heads/non-existent-branch not found
fatal: Branch rename failed
Note that "--edit-description" without an explicit argument is already
considering the _empty repository_ circumstance in its error. Also note
that "-m" on the initial branch it is an allowed operation.
Make the error descriptions for those branch operations with unborn or
non-existent branches, more informative.
This is the result of the change:
$ git init -b first
$ git branch --edit-description first
error: No commit on branch 'first' yet.
$ git branch --set-upstream-to=upstream
fatal: No commit on branch 'first' yet.
$ git branch -c second
fatal: No commit on branch 'first' yet.
$ git branch [-c/-m] non-existent-branch second
fatal: No branch named 'non-existent-branch'.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The version numbers do not mean much, but we may want to call the
first one in 2023 version 3.1 or something; for this cycle, let's just
increment the second digit from the previous one.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codepath that reads from the index v4 had unaligned memory
accesses, which has been corrected.
* vd/fix-unaligned-read-index-v4:
read-cache: avoid misaligned reads in index v4
Prepare for GNU [ef]grep that throw a warning about their use.
* dd/retire-efgrep:
t: convert fgrep usage to "grep -F"
t: convert egrep usage to "grep -E"
t: remove \{m,n\} from BRE grep usage
CodingGuidelines: allow grep -E
With a bit of header twiddling, use the native regexp library on
macOS instead of the compat/ one.
* ds/use-platform-regex-on-macos:
grep: fix multibyte regex handling under macOS
Update the description of the summary section to clarify that the
"do not capitalize" rule applies only the word after the "<area>:"
prefix of the title and nowhere else. This hopefully will prevent
folks from writing their proposed log message in all lowercase.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two documentation issues exist in the technical docs for the bundle URI
feature.
First, there is an extraneous "the" across a linebreak, making the
nonsensical phrase "the bundle the list" which should just be "the
bundle list".
Secondly, the asciidoc update treats the string "`have`s" as starting a
"<code>" block, but the second tick is interpreted as an apostrophe
instead of a closing "</code>" tag. This causes entire sentences to be
formatted as code until the next one comes along. Simply adding a space
here does not work properly as the rendered HTML keeps that space.
Instead, restructure the sentence slightly to avoid using a plural,
allowing the HTML to render correctly.
Reported-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strvec "argv" is used to build a command for run_command_v_opt(),
but never freed. Use a constant string array instead, which doesn't
require any cleanup.
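The resulting pattern looks roughly like this (a generic illustration
of the idiom, not necessarily the call site touched here):

    const char *argv_gc_auto[] = { "gc", "--auto", NULL };
    run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);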
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Explicitly cloning over the "file://" protocol in t7527 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since its inception in d0bfd026a8 (Add basic infrastructure to assign
attributes to paths, 2007-04-12), the attribute code has carried a little
bit of debug code that is conditionally compiled only when DEBUG_ATTR is
set. But since you have to know about it and make a special build of Git
to use it, it's not clear that it's helping anyone (and there are very
few mentions of it on the list over the years).
Meanwhile, it causes slight headaches. Since it's not built as part of a
regular compile, it's subject to bitrot. E.g., this was dealt with in
712efb1a42 (attr: make it build with DEBUG_ATTR again, 2013-01-15), and
it currently fails to build with DEVELOPER=1 since e810e06357 (attr:
tighten const correctness with git_attr and match_attr, 2017-01-27).
And it causes confusion with -Wunused-parameter; the "what" parameter of
fill_one() is unused in a normal build, but needed in a debug build.
Let's just get rid of this code (and the now-useless parameter).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callback function for --trailer writes directly to the global
trailer_args and ignores opt->value completely. This is OK, since that's
where we expect to find the value. But it does mean the option
declaration isn't as clear. E.g., we have:
OPT_BOOL(0, "reset-author", &renew_authorship, ...),
OPT_CALLBACK_F(0, "trailer", NULL, ..., opt_pass_trailer)
In the first one we can see where the result will be stored, but in the
second, we get only NULL, and you have to go read the callback.
Let's pass &trailer_args, and use it in the callback. As a bonus, this
silences a -Wunused-parameter warning.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We declare the --object-dir option like:
OPT_CALLBACK(0, "object-dir", &opts.object_dir, ...);
but the pointer to opts.object_dir is completely unused. Instead, the
callback writes directly to a global, which fortunately happens to be
opts.object_dir. So everything works as expected, but it's unnecessarily
confusing.
Instead, let's have the callback write to the option value pointer that
has been passed in. This also quiets a -Wunused-parameter warning (since
we don't otherwise look at "opt").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The resolve_relative_url() function takes argc and argv parameters; it
then reads up to 3 elements of argv without looking at argc at all. At
first glance, this seems like a bug. But it has only one caller,
cmd__submodule_resolve_relative_url(), which does confirm that argc is
3.
The main reason this is a separate function is that it was moved from
library code in 96a28a9bc6 (submodule--helper: move
"resolve-relative-url-test" to a test-tool, 2022-09-01).
We can make this code simpler and more obviously safe by just inlining
the function in its caller. As a bonus, this silences a
-Wunused-parameter warning.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t5411 starts a web server with no explicit language setting, so it uses
the system default. Ten of its tests expect it to return error messages
containing the prefix "fatal: ", emitted by die(). This prefix can be
localized since a1fd2cf8cd (i18n: mark message helpers prefix for
translation, 2022-06-21), however. As a result these ten tests break
for me on a system with LANG="de_DE.UTF-8" because the web server sends
localized messages with "Schwerwiegend: " instead of "fatal: ".
Fix these tests by passing LANG and LC_ALL to the web server, which are
set to "C" by t/test-lib.sh, to get untranslated messages on both sides.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
https://gcc.gnu.org/gcc-4.5/changes.html says
The deprecated attribute now takes an optional string argument, for
example, __attribute__((deprecated("text string"))), that will be
printed together with the deprecation warning.
While GCC 4.5 is already 12 years old, git checks for even older
versions in places. Let's not needlessly break older compilers when
a small and simple fix is readily available.
Signed-off-by: Alejandro R. Sedeño <asedeno@mit.edu>
Signed-off-by: Alejandro R Sedeño <asedeno@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git" built with RUNTIME_PREFIX flag turned on could figure out
gitexecdir and other paths as relative to "git" executable.
However, in the section specifies gitexecdir, RUNTIME_PREFIX wasn't
mentioned, thus users may wrongly assume that "git" always locates
gitexecdir as relative path to the executable.
Let's clarify that only "git" built with RUNTIME_PREFIX will locate
gitexecdir as relative path.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Explicitly cloning over the "file://" protocol in t5537 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Explicitly cloning over the "file://" protocol in t3206 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Pass a constant string array directly to run_command_v_opt() instead of
copying it into a strvec first. This shortens the code and avoids heap
allocations.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a partial clone, an attempt to read a missing object results in an
attempt to fetch that single object. In order to avoid multiple
sequential fetches, which would occur when multiple objects are missing
(which is the typical case), some commands have been taught to prefetch
in a batch: such a command would, in a partial clone, notice that
several objects that it will eventually need are missing, and call
promisor_remote_get_direct() with all such objects at once.
When this batch prefetch fails, these commands fall back to the
sequential fetches. But at $DAYJOB we have noticed that this results in
a bad user experience: a command would take unexpectedly long to finish
(and possibly use up a lot of bandwidth) if the batch prefetch would
fail for some intermittent reason, but all subsequent fetches would
work. It would be a better user experience for such a command to
just fail.
Therefore, make it a fatal error if the prefetch fails and at least one
object being fetched is known to be a promisor object. (The latter
criterion is to make sure that we are not misleading the user that such
an object would be present from the promisor remote. For example, a
missing object may be a result of repository corruption and not because
it is expectedly missing due to the repository being a partial clone.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No caller of promisor_remote_get_direct() is checking its return value,
so remove it.
Not checking the return value means that the user would not know
whether the failure of reading an object is due to the promisor remote
not supplying the object or because of local repository corruption, but
this will be fixed in a subsequent patch.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add documentation for 'fsmonitor.allowRemote' and 'fsmonitor.socketDir'.
Call out the experimental nature of 'fsmonitor.allowRemote' and the
limited filesystem support for 'fsmonitor.socketDir'.
Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If fsmonitor is not in a compatible state, warn with an appropriate message.
Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Starting with macOS 10.15 (Catalina), Apple introduced a new feature
called 'firmlinks' in order to separate the boot volume into two
volumes, one read-only and one writable, while still presenting them to
the user as a single volume. Along with this change, Apple removed the
ability to create symlinks in the root directory and replaced them with
'synthetic firmlinks'. See 'man synthetic.conf'.
When FSEvents reports the path of changed files, if the path involves
a synthetic firmlink, the path is reported from the point of the
synthetic firmlink and not the real path. For example:
Real path:
/System/Volumes/Data/network/working/directory/foo.txt
Synthetic firmlink:
/network -> /System/Volumes/Data/network
FSEvents path:
/network/working/directory/foo.txt
This causes the FSEvents path to not match against the worktree
directory.
There are several ways in which synthetic firmlinks can be created:
they can be defined in /etc/synthetic.conf, the automounter can create
them, and there may be other means. Simply reading /etc/synthetic.conf
is insufficient. No matter what process creates synthetic firmlinks,
they all get created in the root directory.
Therefore, in order to deal with synthetic firmlinks, the root directory
is scanned and the first possible synthetic firmlink that, when resolved,
is a prefix of the worktree is used to map FSEvents paths to worktree
paths.
Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If monitoring is done via fsmonitor hook rather than IPC there is no
need to check if the location of the Unix Domain socket (UDS) file is
on a remote filesystem.
Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the .git directory is on a remote filesystem, create the socket
file in 'fsmonitor.socketDir' if it is defined, else create it in $HOME.
Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Provide a common interface for getting basic filesystem information
including filesystem type and whether the filesystem is remote.
Refactor the existing code for getting basic filesystem info and
detecting remote filesystems to use the new interface, and refactor the
filesystem checks to leverage it. For macOS, error out if the Unix
domain socket (UDS) file is on a remote filesystem.
Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the signature file cannot be read we print an error message but do
not return an error to the caller. In practice it seems unlikely that
the file would be unreadable if the call to ssh-keygen succeeds.
The unlink_or_warn() call is moved to the end of the function so that
we always try and remove the signature file. This isn't strictly
necessary at the moment but it protects us against any extra code
being added between trying to read the signature file and the cleanup
at the end of the function in the future. unlink_or_warn() only prints
a warning if the file exists and cannot be removed.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we parse the author-script file, we check for missing or duplicate
lines for GIT_AUTHOR_NAME, etc. But after reading the whole file, our
final error conditional checks "date_i" twice and "name_i" not at all.
This not only leads to us failing to abort, but we may do an
out-of-bounds read on the string_list array.
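For reference, the author-script file being parsed holds lines of
roughly this shape (the values are examples):

    GIT_AUTHOR_NAME='A U Thor'
    GIT_AUTHOR_EMAIL='author@example.com'
    GIT_AUTHOR_DATE='@1234567890 +0000'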
The bug goes back to 442c36bd08 (am: improve author-script error
reporting, 2018-10-31), though the code was soon after moved to this
spot by bcd33ec25f (add read_author_script() to libgit, 2018-10-31).
It was presumably just a typo in 442c36bd08.
We'll add test coverage for all the error cases here, though only the
GIT_AUTHOR_NAME ones fail (even in a vanilla build they segfault
consistently, but certainly with SANITIZE=address).
Reported-by: Michael V. Scovetta <michael.scovetta@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To remove bracketed strings containing "PATCH" from the subject line,
cleanup_subject() scans the subject for the opening bracket using an
offset from the beginning of the line. It then searches for the
closing bracket with strchr(). To calculate the length of the
bracketed string it unfortunately adds rather than subtracts the
offset from the result of strchr(). This leads to an out-of-bounds
access in memmem() when looking to see if the brackets contain
"PATCH".
We have tests that trigger this bug that were added in ae52d57f0b
(t5100: add some more mailinfo tests, 2017-05-31). The commit message
mentions that they are marked test_expect_failure as they trigger an
assertion in strbuf_splice(). While it is reassuring that
strbuf_splice() detects the problem and dies, in retrospect that should
perhaps have warranted a little more investigation. The bug was
introduced by 17635fc900 (mailinfo: -b option keeps [bracketed]
strings that is not a [PATCH] marker, 2009-07-15). I think the reason
it has survived so long is that '-b' is not a popular option and
without it the offset is always zero.
This was found by the address sanitizer while I was cleaning up the
test_todo idea in [1].
[1] https://lore.kernel.org/git/db558292-2783-3270-4824-43757822a389@gmail.com/
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
revision.c::handle_revision_arg_1() resolves <rev>^! by first adding the
negated parents and then <rev> itself. builtin_diff_combined() expects
the first tree to be the merge and the remaining ones to be the parents,
though. This mismatch results in bogus diff output.
Remember the first tree that doesn't belong to a parent and use it
instead of blindly picking the first one. This makes "git diff <rev>^!"
consistent with "git show <rev>^!".
Reported-by: Tim Jaacks <tim.jaacks@garz-fricke.com>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitrevisions(7) says that <rev>^! resolves to <rev> and then all the
parents of <rev>. revision.c::handle_revision_arg_1() actually adds
all parents first, then <rev>. Change the documentation to leave the
order unspecified, to avoid misleading readers.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid silent overflow of the int exclude_parent by using the appropriate
function, strtol_i(), to parse its value.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Explicitly cloning over the "file://" protocol in t7814 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Explicitly cloning over the "file://" protocol in t5537 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Explicitly cloning over the "file://" protocol in t5516 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Explicitly cloning over the "file://" protocol in t3207 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In the tmp-objdir API, tmp_objdir_create() will create a temporary
directory but also register signal handlers responsible for removing
the directory's contents and the directory itself. However, the
function responsible for recursively removing the contents and the
directory, remove_dir_recurse(), calls opendir(3) and closedir(3).
This can be problematic because these functions allocate and free
memory, which makes them not async-signal-safe. This can lead to
deadlocks.
One place we call tmp_objdir_create() is in git-receive-pack, where
we create a temporary quarantine directory "incoming". Incoming
objects will be written to this directory before they get moved to
the object directory.
We have observed this code leading to a deadlock:
Thread 1 (Thread 0x7f621ba0b200 (LWP 326305)):
#0 __lll_lock_wait_private (futex=futex@entry=0x7f621bbf8b80
<main_arena>) at ./lowlevellock.c:35
#1 0x00007f621baa635b in __GI___libc_malloc
(bytes=bytes@entry=32816) at malloc.c:3064
#2 0x00007f621bae9f49 in __alloc_dir (statp=0x7fff2ea7ed60,
flags=0, close_fd=true, fd=5)
at ../sysdeps/posix/opendir.c:118
#3 opendir_tail (fd=5) at ../sysdeps/posix/opendir.c:69
#4 __opendir (name=<optimized out>)
at ../sysdeps/posix/opendir.c:92
#5 0x0000557c19c77de1 in remove_dir_recurse ()
#6 0x0000557c19d81a4f in remove_tmp_objdir_on_signal ()
#7 <signal handler called>
#8 _int_malloc (av=av@entry=0x7f621bbf8b80 <main_arena>,
bytes=bytes@entry=7160) at malloc.c:4116
#9 0x00007f621baa62c9 in __GI___libc_malloc (bytes=7160)
at malloc.c:3066
#10 0x00007f621bd1e987 in inflateInit2_ ()
from /opt/gitlab/embedded/lib/libz.so.1
#11 0x0000557c19dbe5f4 in git_inflate_init ()
#12 0x0000557c19cee02a in unpack_compressed_entry ()
#13 0x0000557c19cf08cb in unpack_entry ()
#14 0x0000557c19cf0f32 in packed_object_info ()
#15 0x0000557c19cd68cd in do_oid_object_info_extended ()
#16 0x0000557c19cd6e2b in read_object_file_extended ()
#17 0x0000557c19cdec2f in parse_object ()
#18 0x0000557c19c34977 in lookup_commit_reference_gently ()
#19 0x0000557c19d69309 in mark_uninteresting ()
#20 0x0000557c19d2d180 in do_for_each_repo_ref_iterator ()
#21 0x0000557c19d21678 in for_each_ref ()
#22 0x0000557c19d6a94f in assign_shallow_commits_to_refs ()
#23 0x0000557c19bc02b2 in cmd_receive_pack ()
#24 0x0000557c19b29fdd in handle_builtin ()
#25 0x0000557c19b2a526 in cmd_main ()
#26 0x0000557c19b28ea2 in main ()
Since we can't do the cleanup in a portable and signal-safe way, skip
the cleanup when we're handling a signal.
This means that when signal handling, the temporary directory may not
get cleaned up properly. This is mitigated by b3cecf49ea (tmp-objdir: new
API for creating temporary writable databases, 2021-12-06) which changed
the default name and allows gc to clean up these temporary directories.
In the event of a normal exit, we should still be cleaning up via the
atexit() handler.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The split_cmdline() function improperly uses an int to represent the number of entries
in the resulting argument array. This allows a malicious actor to
intentionally overflow the return value, leading to arbitrary heap
writes.
Because the resulting argv array is typically passed to execv(), it may
be possible to leverage this attack to gain remote code execution on a
victim machine. This was almost certainly the case for certain
configurations of git-shell until the previous commit limited the size
of input it would accept. Other calls to split_cmdline() are typically
limited by the size of argv the OS is willing to hand us, so are
similarly protected.
So this is not strictly fixing a known vulnerability, but is a hardening
of the function that is worth doing to protect against possible unknown
vulnerabilities.
One approach to fixing this would be modifying the signature of
`split_cmdline()` to look something like:
int split_cmdline(char *cmdline, const char ***argv, size_t *argc);
Where the return value of `split_cmdline()` is negative for errors, and
zero otherwise. If non-NULL, the `*argc` pointer is modified to contain
the size of the `**argv` array.
But this implies an absurdly large `argv` array, which is more than
likely larger than the system's argument limit. So even if split_cmdline()
allowed this, it would fail immediately afterwards when we called
execv(). So instead of converting all of `split_cmdline()`'s callers to
work with `size_t` types in this patch, pursue the minimal fix
here to prevent ever returning an array with more than INT_MAX entries
in it.
Signed-off-by: Kevin Backhouse <kevinbackhouse@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When git-shell is run in interactive mode (which must be enabled by
creating $HOME/git-shell-commands), it reads commands from stdin, one
per line, and executes them.
We read the commands with git_read_line_interactively(), which uses a
strbuf under the hood. That means we'll accept an input of arbitrary
size (limited only by how much heap we can allocate). That creates two
problems:
- the rest of the code is not prepared to handle large inputs. The
most serious issue here is that split_cmdline() uses "int" for most
of its types, which can lead to integer overflow and out-of-bounds
array reads and writes. But even with that fixed, we assume that we
can feed the command name to snprintf() (via xstrfmt()), which is
stuck for historical reasons using "int", and causes it to fail (and
even trigger a BUG() call).
- since the point of git-shell is to take input from untrusted or
semi-trusted clients, it's a mild denial-of-service. We'll allocate
as many bytes as the client sends us (actually twice as many, since
we immediately duplicate the buffer).
We can fix both by just limiting the amount of per-command input we're
willing to receive.
We should also fix split_cmdline(), of course, which is an accident
waiting to happen, but that can come on top. Most calls to
split_cmdline(), including the other one in git-shell, are OK because
they are reading from an OS-provided argv, which is limited in practice.
This patch should eliminate the immediate vulnerabilities.
I picked 4MB as an arbitrary limit. It's big enough that nobody should
ever run into it in practice (since the point is to run the commands via
exec, we're subject to OS limits which are typically much lower). But
it's small enough that allocating it isn't that big a deal.
The code is mostly just swapping out fgets() for the strbuf call, but we
have to add a few niceties like flushing and trimming line endings. We
could simplify things further by putting the buffer on the stack, but
4MB is probably a bit much there. Note that we'll _always_ allocate 4MB,
which for normal, non-malicious requests is more than we would before
this patch. But on the other hand, other git programs are happy to use
96MB for a delta cache. And since we'd never touch most of those pages,
on a lazy-allocating OS like Linux they won't even get allocated to
actual RAM.
The ideal would be a version of strbuf_getline() that accepted a maximum
value. But for a minimal vulnerability fix, let's keep things localized
and simple. We can always refactor further on top.
The included test fails in an obvious way with ASan or UBSan (which
notice the integer overflow and out-of-bounds reads). Without them, it
fails in a less obvious way: we may segfault, or we may try to xstrfmt()
a long string, leading to a BUG(). Either way, it fails reliably before
this patch, and passes with it. Note that we don't need an EXPENSIVE
prereq on it. It does take 10-15s to fail before this patch, but with
the new limit, we fail almost immediately (and the perl process
generating 2GB of data exits via SIGPIPE).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We have no tests of even basic functionality of git-shell. Let's add a
couple of obvious ones. This will serve as a framework for adding tests
for new things we fix, as well as making sure we don't screw anything up
too badly while doing so.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
An earlier patch discussed and fixed a scenario where Git could be used
as a vector to exfiltrate sensitive data through a Docker container when
a potential victim clones a suspicious repository with local submodules
that contain symlinks.
That security hole has since been plugged, but a similar one still
exists. Instead of convincing a would-be victim to clone an embedded
submodule via the "file" protocol, an attacker could convince an
individual to clone a repository that has a submodule pointing to a
valid path on the victim's filesystem.
For example, if an individual (with username "foo") has their home
directory ("/home/foo") stored as a Git repository, then an attacker
could exfiltrate data by convincing a victim to clone a malicious
repository containing a submodule pointing at "/home/foo/.git" with
`--recurse-submodules`. Doing so would expose any sensitive contents
stored in "/home/foo" that are tracked in Git.
For systems (such as Docker) that consider everything outside of the
immediate top-level working directory containing a Dockerfile as
inaccessible to the container (with the exception of volume mounts, and
so on), this is a violation of trust by exposing unexpected contents in
the working copy.
To mitigate the likelihood of this kind of attack, adjust the "file://"
protocol's default policy to be "user" to prevent commands that execute
without user input (including recursive submodule initialization) from
taking place by default.
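To illustrate (this example is not part of the patch, and the URL is
made up), a user who trusts a particular repository can still opt back
in, either for a single invocation or globally:

    # allow file:// submodules for just this clone
    git -c protocol.file.allow=always \
        clone --recurse-submodules https://example.com/super.git

    # or restore the old default everywhere
    git config --global protocol.file.allow always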
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.
Tests that interact with submodules a handful of times use
`test_config_global`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.
Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.
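As a rough sketch (the test name and paths are made up, not taken from
any particular script), the three styles look like this:

    # a single command annotated in place
    git -c protocol.file.allow=always clone --recurse-submodules . super

    # a handful of interactions within one script
    test_config_global protocol.file.allow always

    # scripts that rely on submodules throughout
    test_expect_success 'setup: allow file protocol for submodules' '
        git config --global protocol.file.allow always
    '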
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.
Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.
Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.
Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.
Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.
Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.
Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To prepare for changing the default value of `protocol.file.allow` to
"user", update the `prolog()` function in lib-submodule-update to allow
submodules to be cloned over the file protocol.
This is used by a handful of submodule-related test scripts, which
themselves will have to tweak the value of `protocol.file.allow` in
certain locations. Those will be done in subsequent commits.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When cloning a repository with `--local`, Git relies on either making a
hardlink or copy to every file in the "objects" directory of the source
repository. This is done through the callpath `cmd_clone()` ->
`clone_local()` -> `copy_or_link_directory()`.
The way this optimization works is by enumerating every file and
directory recursively in the source repository's `$GIT_DIR/objects`
directory, and then either making a copy or hardlink of each file. The
only exception to this rule is when copying the "alternates" file, in
which case paths are rewritten to be absolute before writing a new
"alternates" file in the destination repo.
One quirk of this implementation is that it dereferences symlinks when
cloning. This behavior was most recently modified in 36596fd2df (clone:
better handle symlinked files at .git/objects/, 2019-07-10), which
attempted to support `--local` clones of repositories with symlinks in
their objects directory in a platform-independent way.
Unfortunately, this behavior of dereferencing symlinks (that is,
creating a hardlink or copy of the source's link target in the
destination repository) can be used as a component in attacking a
victim by inadvertently exposing the contents of files stored outside of
the repository.
Take, for example, a repository that stores a Dockerfile and is used to
build Docker images. When building an image, Docker copies the directory
contents into the VM, and then instructs the VM to execute the
Dockerfile at the root of the copied directory. This protects against
directory traversal attacks by copying symbolic links as-is without
dereferencing them.
That is, if a user has a symlink pointing at their private key material
(where the symlink is present in the same directory as the Dockerfile,
but the key itself is present outside of that directory), the key is
unreadable to a Docker image, since the link will appear broken from the
container's point of view.
This behavior enables an attack whereby a victim is convinced to clone a
repository containing an embedded submodule (with a URL like
"file:///proc/self/cwd/path/to/submodule") which has a symlink pointing
at a path containing sensitive information on the victim's machine. If a
user is tricked into doing this, the contents at the destination of
those symbolic links are exposed to the Docker image at runtime.
One approach to preventing this behavior is to recreate symlinks in the
destination repository. But this is problematic, since symlinks in the
objects directory are not well-supported. (One potential problem is that
when sharing, e.g. a "pack" directory via symlinks, different writers
performing garbage collection may consider different sets of objects to
be reachable, enabling a situation whereby garbage collecting one
repository may remove reachable objects in another repository).
Instead, prohibit the local clone optimization when any symlinks are
present in the `$GIT_DIR/objects` directory of the source repository.
Users may clone the repository again by prepending the "file://" scheme
to their clone URL, or by adding the `--no-local` option to their `git
clone` invocation.
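For example (paths are hypothetical), either of the following still
clones such a repository, just without the hardlink/copy optimization:

    git clone --no-local /path/to/src.git dst.git
    git clone "file:///path/to/src.git" dst.git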
The directory iterator used by `copy_or_link_directory()` must no longer
dereference symlinks (i.e., it *must* call `lstat()` instead of `stat()`
in order to discover whether or not there are symlinks present). This has
no bearing on the overall behavior, since we will immediately `die()`
upon encountering a symlink.
Note that t5604.33 suggests that we do support local clones with
symbolic links in the source repository's objects directory, but this
was likely unintentional, or at least did not take into consideration
the problem with sharing parts of the objects directory with symbolic
links at the time. Update this test to reflect which options are and
aren't supported.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Imagine running "git branch --edit-description" while on a branch
without the branch description, and then exit the editor after
emptying the edit buffer, which is the way to tell the command that
you changed your mind and you do not want the description after all.
The command should just happily oblige, adding no branch description
for the current branch, and exit successfully. But it fails to do
so:
$ git init -b main
$ git commit --allow-empty -m commit
$ GIT_EDITOR=: git branch --edit-description
fatal: could not unset 'branch.main.description'
The end result is OK in that the configuration variable does not
exist in the resulting repository, but we should do better. If we
know we didn't have a description, and if we are asked not to have a
description by the editor, we can just return doing nothing.
This of course introduces TOCTOU. If you add a branch description
to the same branch from another window, while you had the editor
open to edit the description, and then exit the editor without
writing anything there, we'd end up not removing the description you
added in the other window. But you are fooling yourself in your own
repository at that point, and if it hurts, you'd be better off not
doing so ;-).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 131b94a10a (test-lib.sh: Use GLIBC_TUNABLES instead of
MALLOC_CHECK_ on glibc >= 2.34, 2022-03-04) compiling with
SANITIZE=leak has missed reporting some leaks. The old MALLOC_CHECK
method used before glibc 2.34 seems to have been (mostly?) compatible
with it, but after 131b94a10a e.g. running:
TEST_NO_MALLOC_CHECK=1 make SANITIZE=leak test T=t6437-submodule-merge.sh
Would report a leak in builtin/commit.c, but this would not:
TEST_NO_MALLOC_CHECK= make SANITIZE=leak test T=t6437-submodule-merge.sh
Since the interaction is clearly breaking the SANITIZE=leak mode,
let's mark them as explicitly incompatible.
A related regression for SANITIZE=address was fixed in
067109a5e7 (tests: make SANITIZE=address imply TEST_NO_MALLOC_CHECK,
2022-04-09).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"upstream branches" is plural but "name" and "local branch" are
singular. Make them all singular. And because we're talking about a
hypothetical branch that doesn't exist yet, use the future tense.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The process for reading the index into memory from disk is to first read its
contents into a single memory-mapped file buffer (type 'char *'), then
sequentially convert each on-disk index entry into a corresponding incore
'cache_entry'. To access the contents of the on-disk entry for processing, a
moving pointer within the memory-mapped file is cast to type 'struct
ondisk_cache_entry *'.
In index v4, the entries in the on-disk index file are written *without*
aligning their first byte to a 4-byte boundary; entries are a variable
length (depending on the entry name and whether or not extended flags are
used). As a result, casting the 'char *' buffer pointer to 'struct
ondisk_cache_entry *' then accessing its contents in a 'SANITIZE=undefined'
build can trigger the following error:
read-cache.c:1886:46: runtime error: member access within misaligned
address <address> for type 'struct ondisk_cache_entry', which requires 4
byte alignment
Avoid this error by reading fields directly from the 'char *' buffer,
using offsetof() to locate the individual fields of 'struct
ondisk_cache_entry'.
Additionally, add documentation describing why the new approach avoids the
misaligned address error, as well as advice on how to improve the
implementation in the future.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we fixed a segmentation fault when a tree object
could not be written.
However, before the tree object is written, `merge-ort` wants to write
out a blob object (except in cases where the merge results in a blob
that already exists in the database). And this can fail, too, but we
ignore that write failure so far.
Let's pay close attention and error out early if the blob could not be
written. This reduces the error output of t4301.25 ("merge-ort fails
gracefully in a read-only repository") from:
error: insufficient permission for adding an object to repository database ./objects
error: error: Unable to add numbers to database
error: insufficient permission for adding an object to repository database ./objects
error: error: Unable to add greeting to database
error: insufficient permission for adding an object to repository database ./objects
fatal: failure to merge
to:
error: insufficient permission for adding an object to repository database ./objects
error: error: Unable to add numbers to database
fatal: failure to merge
This is _not_ just a cosmetic change: one might assume that the
operation would have failed anyway at the point when the new tree
object is written (and the corresponding tree object _will_ be new if
it contains a blob that is new), but that is not so: As pointed out by
Elijah Newren, when Git has previously been allowed to add loose objects
via `sudo` calls, it is very possible that the blob object cannot be
written (because the corresponding `.git/objects/??/` directory may be
owned by `root`) but the tree object can be written (because the
corresponding objects directory is owned by the current user). This
would result in a corrupt repository because it is missing the blob
object, and with this patch we prevent that.
Note: This patch adjusts two variable declarations from `unsigned` to
`int` because their purpose is to hold the return value of
`handle_content_merge()`, which is of type `int`. The existing users of
those variables are only interested whether that variable is zero or
non-zero, therefore this type change does not affect the existing code.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the blob/tree objects cannot be written, we really need the merge
operations to fail, and not to continue (and then try to access the tree
object which is however still set to `NULL`).
Let's stop ignoring the return value of `write_object_file()` and
`write_tree()` and set `clean = -1` in the error case.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have small updates since -rc1, but none of them is about a new
thing, and there are no updates to the release notes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The for_each_string_list_item() macro takes a string_list and
automatically constructs a for loop to iterate over its contents. This
macro will segfault if the list is NULL.
We cannot change the macro to be careful around NULL values because
there are many callers that use the address of a local variable, which
will never be NULL and will cause compile errors with -Werror=address.
For now, leave a documentation comment to try to avoid mistakes in the
future where a caller does not check for a NULL list.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git maintenance [un]register' commands set or unset the multi-
valued maintenance.repo config key with the absolute path of the current
repository. These are set in the global config file.
Instead of calling a subcommand and creating a new process, create the
proper API calls to git_config_set_multivar_in_file_gently(). It
requires loading the filename for the global config file (and erroring
out if no $HOME value is set). We also need to be careful about using
CONFIG_REGEX_NONE when adding the value and using
CONFIG_FLAGS_FIXED_VALUE when removing the value. In both cases, we
check that the value already exists (this check already existed for
'unregister').
Also, remove the transparent translation of the error code from the
config API to the exit code of 'git maintenance'. Instead, use die() to
recover from failures at that level. In the case of 'unregister
--force', allow the CONFIG_NOTHING_SET error code to be a success. This
allows a possible race where another process removes the config value.
The end result is that the config value is not set anymore, so we can
treat this as a success.
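The observable end state in the global config file is unchanged;
roughly (the repository path is hypothetical):

    git maintenance register
    git config --global --get-all maintenance.repo   # lists /path/to/repo

    git maintenance unregister
    git config --global --get-all maintenance.repo   # entry is gone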
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'scalar unregister' command removes a repository from the list of
registered Scalar repositories and removes it from the list of
repositories registered for background maintenance. If the repository
was not already registered for background maintenance, then the command
fails, even if the repository was still registered as a Scalar
repository.
After using 'scalar clone' or 'scalar register', the repository would be
enrolled in background maintenance since those commands run 'git
maintenance start'. If the user runs 'git maintenance unregister' on
that repository, then it is still in the list of repositories which get
new config updates from 'scalar reconfigure'. The 'scalar unregister'
command would fail since 'git maintenance unregister' would fail.
Further, the add_or_remove_enlistment() method in scalar.c already has
this idempotent behavior built in as an expectation, since it returns
zero when the scalar.repo list already reflects the requested state
(the repository is already present, or already absent).
The previous change added the 'git maintenance unregister --force'
option, so use it within 'scalar unregister' to make it idempotent.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git maintenance unregister' subcommand has a step that removes the
current repository from the multi-valued maintenance.repo config key.
This fails if the repository is not listed in that key. This makes
running 'git maintenance unregister' twice result in a failure in the
second instance.
This failure exit code is helpful, but its message is not. Add a new
die() message that explicitly calls out the failure due to the
repository not being registered.
In some cases, users may want to run 'git maintenance unregister' just
to make sure that background jobs will not start on this repository, but
they do not want to check to see if it is registered first. Add a new
'--force' option that will silently succeed if the repository is not
already registered.
Also add an extra test of 'git maintenance unregister' at a point where
there are no registered repositories. This should fail without --force.
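A sketch of the new behavior, in a repository that was never
registered (commands are illustrative, not the added test itself):

    # fails: the repository is not listed in maintenance.repo
    git maintenance unregister

    # succeeds silently, whether or not it was registered
    git maintenance unregister --force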
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The trace2 region around the call to lazy_bitmap_for_commit() in
bitmap_for_commit() was added in 28cd730680 (pack-bitmap: prepare to
read lookup table extension, 2022-08-14). While adding trace2 regions is
typically helpful for tracking performance, this method is called
possibly thousands of times as a commit walk explores commit history
looking for a matching bitmap. When trace2 output is enabled, this
region is emitted many times and performance is throttled by that
output.
For now, remove these regions entirely.
This is a critical path, and it would be valuable to measure that the
time spent in bitmap_for_commit() does not increase when using the
commit lookup table. The best way to do that would be to use a mechanism
that sums the time spent in a region and reports a single value at the
end of the process. This technique was introduced but not merged by [1]
so maybe this example presents some justification to revisit that
approach.
[1] https://lore.kernel.org/git/pull.1099.v2.git.1640720202.gitgitgadget@gmail.com/
To help with the 'git blame' output in this region, add a comment that
warns against adding a trace2 region. Delete a test from t5310 that used
that trace output to check that this lookup optimization was activated.
To create this kind of test again in the future, the stopwatch traces
mentioned earlier could be used as a signal that we activated this code
path.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2708ce62d2 (branch: sort detached HEAD based on a flag, 2021-01-07) a
call to wt_status_state_free_buffers(), responsible for freeing the
resources that could be allocated in the local struct wt_status_state
state, was eliminated.
The call to wt_status_state_free_buffers was introduced in 962dd7ebc3
(wt-status: introduce wt_status_state_free_buffers(), 2020-09-27). This
commit brings back that call in get_head_description.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Reviewed-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 68d5d03bc4 (rebase: teach --autosquash to match on sha1 in
addition to message, 2010-11-04) taught autosquash to recognize
subjects like "fixup! 7a235b" where 7a235b is an OID-prefix. It
actually did more than advertised: 7a235b can be an arbitrary
commit-ish (as long as it's not trailed by spaces).
Accidental(?) use of this secret feature revealed a bug where we
would silently drop a fixup commit. The bug can also be triggered
when using an OID-prefix but that's unlikely in practice.
Let the commit with subject "fixup! main" be the tip of the "main"
branch. When computing the fixup target for this commit, we find
the commit itself. This is wrong because, by definition, a fixup
target must be an earlier commit in the todo list. We wrongly find
the current commit because we added it to the todo list prematurely.
Avoid these fixup-cycles by only adding the current commit to the
todo list after we have finished looking for the fixup target.
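A sketch of the kind of history that triggered the bug (commands are
illustrative, not the literal test from this patch):

    git checkout -b main
    git commit --allow-empty -m 'initial'
    git commit --allow-empty -m 'fixup! main'   # the tip names its own branch
    git rebase -i --autosquash --root           # used to silently drop the fixup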
Reported-by: Erik Cervin Edin <erik@cervined.in>
Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The text of this message was changed in commit
71076d0edd to avoid making any
suggestion about which strategy is better for the situation at hand.
Update the French translation to match.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
We attribute each documentation text file to a man section by finding a
line in the file that looks like "gitfoo(<digit>)". Commit cc75e556a9
("scalar: add to 'git help -a' command list", 2022-09-02) updated this
logic to look not only for "gitfoo" but also "scalarfoo". In doing so,
it forgot to account for the fact that after the updated regex has found
a match, the man section is no longer to be found in `$1` but now lives
in `$2`.
This makes our git(1) manpage look as follows:
Main porcelain commands
git-add(git)
Add file contents to the index.
[...]
gitk(git)
The Git repository browser.
scalar(scalar)
A tool for managing large Git repositories.
Restore the man sections by not capturing the (git|scalar) part of the
match into `$1`.
As noted by Ævar [1], we could even match any "foo" rather than just
"gitfoo" and "scalarfoo", but that's a larger change. For now, just fix
the regression in cc75e556a9.
[1] https://lore.kernel.org/git/220923.86wn9u4joo.gmgdl@evledraar.gmail.com/#t
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Turn on sparse index and remove ensure_full_index().
Before this patch, `git-grep` utilizes the ensure_full_index() method to
expand the index and search all the entries. Because this method
requires walking all the trees and constructing the index, it is the
slowest part of the whole command.
To achieve better performance, this patch uses grep_tree() to search the
sparse directory entries and get rid of the ensure_full_index() method.
Why is grep_tree() a better choice than ensure_full_index()?
1) grep_tree() is as correct as ensure_full_index(). grep_tree() looks
into every sparse-directory entry (represented by a tree) recursively
when looping over the index, and the result of doing so matches the
result of expanding the index.
2) grep_tree() utilizes pathspecs to limit the scope of searching.
ensure_full_index() always expands the index, which means it will
always walk all the trees and blobs in the repo without caring if
the user only wants a subset of the content, i.e. using a pathspec.
On the other hand, grep_tree() will only search the contents that
match the pathspec, and thus possibly walking fewer trees.
3) grep_tree() does not construct and copy back a new index, while
ensure_full_index() does. This also saves some time.
----------------
Performance test
- Summary:
p2000 tests demonstrate a ~71% execution time reduction for
`git grep --cached bogus -- "f2/f1/f1/*"` using tree-walking logic.
However, notice that this result varies depending on the pathspec
given. See below "Command used for testing" for more details.
Test HEAD~ HEAD
-------------------------------------------------------
2000.78: git grep ... (full-v3) 0.35 0.39 (≈)
2000.79: git grep ... (full-v4) 0.36 0.30 (≈)
2000.80: git grep ... (sparse-v3) 0.88 0.23 (-73.8%)
2000.81: git grep ... (sparse-v4) 0.83 0.26 (-68.6%)
- Command used for testing:
git grep --cached bogus -- "f2/f1/f1/*"
The reason for specifying a pathspec is that, if we don't specify a
pathspec, then grep_tree() will walk all the trees and blobs to find the
pattern, and the time consumed doing so is not too different from using
the original ensure_full_index() method, which also spends most of the
time walking trees. However, when a pathspec is specified, this latest
logic will only walk the area of trees enclosed by the pathspec, and the
time consumed is reasonably a lot less.
Generally speaking, because the performance gain is achieved by walking
fewer trees, which are specified by the pathspec, the HEAD vs. HEAD~
times in sparse-v[3|4] should be proportional to "pathspec enclosed
area" vs. "all area", respectively. Namely, the wider the <pathspec>
is, the smaller the performance difference between HEAD~ and HEAD, and
vice versa.
That is, if we don't specify a pathspec, the performance difference [1]
is indistinguishable: both methods walk all the trees and take generally
same amount of time (even with the index construction time included for
ensure_full_index()).
[1] Performance test result without pathspec (hence walking all trees):
Command used:
git grep --cached bogus
Test HEAD~ HEAD
---------------------------------------------------
2000.78: git grep ... (full-v3) 6.17 5.19 (≈)
2000.79: git grep ... (full-v4) 6.19 5.46 (≈)
2000.80: git grep ... (sparse-v3) 6.57 6.44 (≈)
2000.81: git grep ... (sparse-v4) 6.65 6.28 (≈)
--------------------------
NEEDSWORK about submodules
There are a few NEEDSWORKs that belong to improvements beyond this
topic. See the NEEDSWORK in builtin/grep.c::grep_submodule() for
more context. The other two NEEDSWORKs in t1092 are also related.
Suggested-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GNU grep deprecated `egrep` and `fgrep` with release 2.5.3 in 2007.
As of release 3.8 in 2022, those commands warn[1] that they are
obsolescent. Now that all the Git test scripts have been scrubbed of
uses of `egrep` and `fgrep`, make `check-non-portable-shell` complain
about them to prevent new instances from creeping back into the project.
[1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html
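The mechanical replacements the check expects look something like this
(patterns and file names are made up):

    egrep 'foo|bar' file     # becomes: grep -E 'foo|bar' file
    fgrep 'a.literal' file   # becomes: grep -F 'a.literal' file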
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'main' of github.com:git/git:
list-objects-filter: initialize sub-filter structs
Git 2.38-rc1
Final batch before -rc1
builtin/diagnose.c: don't translate the two mode values
t/Makefile: remove 'test-results' on 'make clean'
gc: don't translate literal commands
Documentation: clean up various typos in technical docs
Documentation: clean up a few misspelled word typos
version: fix builtin linking & documentation
diagnose: add to command-list.txt
Documentation: add ReviewingGuidelines
commit-graph: Fix missing closedir in expire_commit_graphs
diagnose.c: refactor to safely use 'd_type'
help: fix doubled words in explanation for developer interfaces
api docs: link to html version of api-trace2
docs: fix a few recently broken links
reftable: use a pointer for pq_entry param
Fix uninitialized memory access in a recent fix-up that is already
in -rc1.
* jk/list-objects-filter-cleanup:
list-objects-filter: initialize sub-filter structs
Like in all the other credential helpers, the osxkeychain helper
ignores unknown credential lines.
Add a comment (a la the other helpers) to make it clear and explicit
that this is the desired behaviour.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Contrary to the documentation on credential helpers, as well as the help
text for git-credential-netrc itself, this helper will `die` when
presented with an unknown property/attribute/token.
Correct the behaviour here by skipping and ignoring any tokens that are
unknown. This means all helpers in the tree are consistent and ignore
any unknown credential properties/attributes.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is the expectation that credential helpers be liberal in what they
accept and conservative in what they return, to allow for future growth
and evolution of the protocol/interaction.
All of the other helpers (store, cache, osxkeychain, libsecret,
gnome-keyring) except `netrc` currently ignore any credential lines
that are not recognised, whereas the Windows helper (wincred) instead
dies.
Fix the discrepancy and ignore unknown lines in the wincred helper.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We return an error when trying to rename a remote that has no fetch
refspec:
$ git config --unset-all remote.origin.fetch
$ git remote rename origin foo
fatal: could not unset 'remote.foo.fetch'
To make things even more confusing, we actually _do_ complete the config
modification, via git_config_rename_section(). After that we try to
rewrite the fetch refspec (to say refs/remotes/foo instead of origin).
But our call to git_config_set_multivar() to remove the existing entries
fails, since there aren't any, and it calls die().
We could fix this by using the "gently" form of the config call, and
checking the error code. But there is an even simpler fix: if we know
that there are no refspecs to rewrite, then we can skip that part
entirely.
Reported-by: John A. Leuenhagen <john@zlima12.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We explicitly forbid the combination of "--bare" with "-o", but there
doesn't seem to be any good reason to do so. The original logic came as
part of e6489a1bdf (clone: do not accept more than one -o option.,
2006-01-22), but that commit does not give any reason.
Furthermore, the equivalent combination via config is allowed:
git -c clone.defaultRemoteName=foo clone ...
and works as expected. It may be that this combination was considered
useless, because a bare clone does not set remote.origin.fetch (and
hence there is no refs/remotes/origin hierarchy). But it does set
remote.origin.url, and that name is visible to the user via "git fetch
origin", etc.
Let's allow the options to be used together, and switch the "forbid"
test in t5606 to check that we use the requested name. That test came
much later in 349cff76de (clone: add tests for --template and some
disallowed option pairs, 2020-09-29), and does not offer any logic
beyond "let's test what the code currently does".
Reported-by: John A. Leuenhagen <john@zlima12.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since commit c54980ab83 (list-objects-filter: convert filter_spec to a
strbuf, 2022-09-11), building with SANITIZE=undefined triggers an error
in t5616.
The problem is that we end up with a strbuf that has been
zero-initialized instead of via STRBUF_INIT. Feeding that strbuf to
strbuf_addbuf() in list_objects_filter_copy() means we will call memcpy
like:
memcpy(some_actual_buffer, NULL, 0);
This works on most systems because we're copying zero bytes, but it is
technically undefined behavior to ever pass NULL to memcpy.
Even though c54980ab83 is where the bug manifests, that is only because
we switched away from a string_list, which is OK with being
zero-initialized (though it may cause other problems by not duplicating
the strings, it happened to be OK in this instance).
The actual bug is caused by the commit before that, 2a01bdedf8
(list-objects-filter: add and use initializers, 2022-09-11). There we
consistently initialize the top-level filter structs, but we forgot the
dynamically allocated ones we stick in filter_options->sub when creating
combined filters.
Note that we need to fix two spots here: where we parse a "combine:"
filter, but also where we transform from a single-filter into a combined
one after seeing multiple "--filter" options. In the second spot, we'll
do some minor refactoring to avoid repeating our very-long array index.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the global variable "save_commit_buffer" is set to 0, then
parse_commit() will throw away the commit object data after parsing it,
rather than sticking it into a commit slab. This goes all the way back
to 60ab26de99 ([PATCH] Avoid wasting memory in git-rev-list,
2005-09-15).
But there's another code path which may similarly stash the buffer:
parse_object_buffer(). This is where we end up if we parse a commit via
parse_object(), and it's used directly in a few other code paths like
git-fsck.
The original goal of 60ab26de99 was avoiding extra memory usage for
rev-list. And there it's not all that important to catch parse_object().
We use that function only for looking at the tips of the traversal, and
the majority of the commits are parsed by following parent links, where
we use parse_commit() directly. So we were wasting some memory, but only
a small portion.
It's much easier to see the effect with fsck. Since we now turn off
save_commit_buffer by default there, we _should_ be able to drop the
freeing of the commit buffer in fsck_obj(). But if we do so (taking the
first hunk of this patch without the rest), then the peak heap of "git
fsck" in a clone of git.git goes from 136MB to 194MB. Teaching
parse_object_buffer() to respect save_commit_buffer brings that down to
134.5MB (it's hard to tell from massif's output, but I suspect the
savings comes from avoiding the overhead of the mostly-empty commit
slab).
Other programs should see a small improvement. Both "rev-list --all" and
"fsck --connectivity-only" improve by a few hundred kilobytes, as they'd
avoid loading the tip objects of their traversals.
Most importantly, no code should be hurt by doing this. Any program that
turns off save_commit_buffer is already making the assumption that any
commit it sees may need to have its object data loaded on demand, as it
doesn't know which ones were parsed by parse_commit() versus
parse_object(). Not to mention that anything parsed by the commit graph
may be in the same boat, even if save_commit_buffer was not disabled.
This should be the only spot that needs to be fixed. Grepping for
set_commit_buffer() shows that this and parse_commit() are the only
relevant calls.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing a commit, the default behavior is to stuff the original
buffer into a commit_slab (which takes ownership of it). But for a tool
like fsck, this isn't useful. While we may look at the buffer further as
part of fsck_commit(), we'll always do so through a separate pointer;
attaching the buffer to the slab doesn't help.
Worse, it means we have to remember to free the commit buffer in all
call paths. We do so in fsck_obj(), which covers a regular "git fsck".
But with "--connectivity-only", we forget to do so in both
traverse_one_object(), which covers reachable objects, and
mark_unreachable_referents(), which covers unreachable ones. As a
result, that mode ends up storing an uncompressed copy of every commit
on the heap at once.
We could teach the code paths for --connectivity-only to also free
commit buffers. But there's an even easier fix: we can just turn off the
save_commit_buffer flag, and then we won't attach them to the commits in
the first place.
This reduces the peak heap of running "git fsck --connectivity-only" in
a clone of linux.git from ~2GB to ~1GB. According to massif, the
remaining memory goes where you'd expect: the object structs themselves,
the obj_hash containing them, and the delta base cache.
Note that we'll leave the call to free commit buffers in fsck_obj() for
now; it's not quite redundant because of a related bug that we'll fix in
a subsequent commit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After calling fsck_walk(), a tree object struct may be left in the
parsed state, with the full tree contents available via tree->buffer.
It's the responsibility of the caller to free these when it's done with
the object to avoid having many trees allocated at once.
In a regular "git fsck", we hit fsck_walk() only from fsck_obj(), which
does call free_tree_buffer(). Likewise for "--connectivity-only", we see
most objects via traverse_one_object(), which makes a similar call.
The exception is in mark_unreachable_referents(). When using both
"--connectivity-only" and "--dangling" (the latter of which is the
default), we walk all of the unreachable objects, and there we forget to
free. Most cases would not notice this, because they don't have a lot of
unreachable objects, but you can make a pathological case like this:
git clone --bare /path/to/linux.git repo.git
cd repo.git
rm packed-refs ;# now everything is unreachable!
git fsck --connectivity-only
That ends up with peak heap usage ~18GB, which is (not coincidentally)
close to the size of all uncompressed trees in the repository. After
this patch, the peak heap is only ~2GB.
A few things to note:
- it might seem like fsck_walk(), if it is parsing the trees, should
be responsible for freeing them. But the situation is quite tricky.
In the non-connectivity mode, after we call fsck_walk() we then
proceed with fsck_object() which actually does the type-specific
sanity checks on the object contents. We do pass our own separate
buffer to fsck_object(), but there's a catch: our earlier call to
parse_object_buffer() may have attached that buffer to the object
struct! So by freeing it, we leave the rest of the code with a
dangling pointer.
Likewise, the call to fsck_walk() in index-pack is subtle. It
attaches a buffer to the tree object that must not be freed! And
so rather than calling free_tree_buffer(), it actually detaches it
by setting tree->buffer to NULL.
These cases would _probably_ be fixable by having fsck_walk() free
the tree buffer only when it was the one who allocated it via
parse_tree(). But that would still leave the callers responsible for
freeing other cases, so they wouldn't be simplified. While the
current semantics for fsck_walk() make it easy to accidentally leak
in new callers, at least they are simple to explain, and it's not a
function that's likely to get a lot of new call-sites.
And in any case, it's probably sensible to fix the leak first with
this simple patch, and try any more complicated refactoring
separately.
- a careful reader may notice that fsck_obj() also frees commit
buffers, but neither the call in traverse_one_object() nor the one
touched in this patch does so. And indeed, this is another problem
for --connectivity-only (and accounts for most of the 2GB heap after
this patch), but it's one we'll fix in a separate commit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although POSIX states that:
> The old egrep and fgrep commands are likely to be supported for many
> years to come as implementation extensions, allowing historical
> applications to operate unmodified.
GNU grep 3.8 started to warn[1]:
> The egrep and fgrep commands, which have been deprecated since
> release 2.5.3 (2007), now warn that they are obsolescent and should
> be replaced by grep -E and grep -F.
Prepare for their removal in the future.
[1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although POSIX states that:
> The old egrep and fgrep commands are likely to be supported for many
> years to come as implementation extensions, allowing historical
> applications to operate unmodified.
GNU grep 3.8 started to warn[1]:
> The egrep and fgrep commands, which have been deprecated since
> release 2.5.3 (2007), now warn that they are obsolescent and should
> be replaced by grep -E and grep -F.
Prepare for their removal in the future.
[1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The CodingGuidelines says we should avoid \{m,n\} in BRE usage.
Such usages in our code base are limited, and subjectively hard to
read.
Replace them with ERE.
Except for "0\{40\}" which would be changed to "$ZERO_OID",
which is a better value for testing with:
GIT_TEST_DEFAULT_HASH=sha256
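Illustrative before/after (the patterns are examples, not taken from
any particular test):

    # BRE interval expression, now avoided
    grep "^.\{7\} " actual

    # equivalent ERE
    grep -E "^.{7} " actual

    # literal "0\{40\}" replaced with the hash-agnostic variable
    grep "$ZERO_OID" actual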
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Despite being forbidden by CodingGuidelines, our usage of 'grep -E' has
increased over the years, and no one has complained.
Let's lift the restriction.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply similar treatment with respect to cruft packs as a few commits
ago, this time to `repack` with a non-zero `--batch-size`.
Since the case of a non-zero `--batch-size` is handled separately (in
`fill_included_packs_batch()` instead of `fill_included_packs_all()`), a
separate fix must be applied for this case.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fill_included_packs_batch() routine is responsible for aggregating
objects in packs with a non-zero value for the `--batch-size` option of
the `git multi-pack-index repack` sub-command.
Since this routine is explicitly called only when `--batch-size` is
non-zero, there is no point in checking that this is the case in our
loop condition.
Remove the unnecessary part of this condition to avoid confusion.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace a direct invocation of Git's `xcalloc()` wrapper with the
`CALLOC_ARRAY()` macro instead.
The latter is preferred since it is more conventional in Git's codebase,
but also because it automatically picks the correct value for the record
size.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `repack` sub-command of the `git multi-pack-index` builtin creates a
new pack aggregating smaller packs contained in the MIDX up to some
given `--batch-size`.
When `--batch-size=0`, this instructs the MIDX builtin to repack
everything contained in the MIDX into a single pack.
In similar spirit as a previous commit, it is undesirable to repack the
contents of a cruft pack in this step. Teach `repack` to ignore any
cruft pack(s) when `--batch-size=0` for the same reason(s).
(The case of a non-zero `--batch-size` will be handled in a subsequent
commit).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `expire` sub-command unlinks any packs that are (a) contained in the
MIDX, but (b) have no objects referenced by the MIDX.
This sub-command ignores `.keep` packs, which remain on-disk even if
they have no objects referenced by the MIDX. Cruft packs, however,
aren't given the same treatment: if none of the objects contained in the
cruft pack are selected from the cruft pack by the MIDX, then the cruft
pack is eligible to be expired.
This is less than desirable, since the cruft pack has important
metadata about the individual object mtimes, which is useful to
determine how quickly an object should age out of the repository when
pruning.
Ordinarily, we wouldn't expect the contents of a cruft pack to
be duplicated across non-cruft packs (and we'd expect to see the MIDX
select all cruft objects from other sources even less often). But
nonetheless, it is still possible to trick the `expire` sub-command into
removing the `.mtimes` file in this circumstance.
Teach the `expire` sub-command to ignore cruft packs in the same manner
as it does `.keep` packs, in order to keep their metadata around, even
when they are unreferenced by the MIDX.
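A sketch of the scenario being protected against (pack names are
hypothetical, and the repository is assumed to contain unreachable
objects so that a cruft pack exists):

    git gc --cruft                    # writes pack-XXXX.mtimes alongside the cruft pack
    git multi-pack-index write
    git multi-pack-index expire       # must not delete the cruft pack
    ls .git/objects/pack/*.mtimes     # the per-object mtimes survive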
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `expire` sub-command of `git multi-pack-index` will never expire
`.keep` packs, regardless of whether or not any of their objects were
selected in the MIDX.
This has always been the case since 19575c7c8e (multi-pack-index:
implement 'expire' subcommand, 2019-06-10), which came after cff9711616
(multi-pack-index: prepare for 'expire' subcommand, 2019-06-10), when
this documentation was originally written.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the extra space character between "tracked" and "by", which dates
back to when this paragraph was originally written in cff9711616
(multi-pack-index: prepare for 'expire' subcommand, 2019-06-10).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'main' of github.com:git/git: (45 commits)
A bit more of remaining topics before -rc1
t1800: correct test to handle Cygwin
chainlint: colorize problem annotations and test delimiters
ls-files: fix black space in error message
list-objects-filter: convert filter_spec to a strbuf
list-objects-filter: add and use initializers
list-objects-filter: handle null default filter spec
list-objects-filter: don't memset after releasing filter struct
builtin/mv.c: fix possible segfault in add_slash()
Documentation/technical: include Scalar technical doc
t/perf: add 'GIT_PERF_USE_SCALAR' run option
t/perf: add Scalar performance tests
scalar-clone: add test coverage
scalar: add to 'git help -a' command list
scalar: implement the `help` subcommand
git help: special-case `scalar`
scalar: include in standard Git build & installation
scalar: fix command documentation section header
t: retire unused chainlint.sed
t/Makefile: teach `make test` and `make prove` to run chainlint.pl
...
The logic to handle worktree refs (worktrees/NAME/REF and
main-worktree/REF) existed in two places:
* ref_type() in refs.c
* parse_worktree_ref() in worktree.c
Collapse this logic together in one function parse_worktree_ref():
this avoids having to cross-check the result of parse_worktree_ref()
and ref_type().
Introduce enum ref_worktree_type, which is slightly different from
enum ref_type. The latter is a misleading name (one would think that
'ref_type' would have the symref option).
Instead, enum ref_worktree_type only makes explicit how a refname
relates to a worktree. From this point of view, HEAD and
refs/bisect/abc are the same: they specify the current worktree
implicitly.
The files-backend must avoid packing refs/bisect/* and friends into
packed-refs, so expose is_per_worktree_ref() separately.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to provide a better organisation for oss-fuzz fuzzers and
to avoid top-level clutter in the git repository when more fuzzers
are introduced, move the existing fuzzer-related sources to their
own oss-fuzz/ hierarchy. Grouping the fuzzers in their own directory
separates their fuzz-testing purpose from the core functionality of
the git code, and provides a better and tidier structure for the
oss-fuzz fuzzing framework to manage, locate, build and execute those
fuzzers in future development.
Signed-off-by: Arthur Chan <arthur.chan@adalogics.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of special-casing of 'suppress' in set_diff_merges(). Instead
set 'merges_need_diff' flag correctly in every option handling
function.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of unneeded "else" statements in func_by_opt().
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This uses atoi() and checks if the result is not zero to decide what
to do. Turning it into the usual Boolean environment variable to
use git_env_bool() would not break those who have been using "set to
0, or set to non-zero, that can be parsed with atoi()" values, but
will match the expectation of those who expected "true" to mean
"yes".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many environment variables use the git_env_bool() API to parse their
values, and allow the usual "true/yes/on are true, false/no/off are
false. In addition non-zero numbers are true and zero is false. An
empty string is also false." set of values.
Mark them as such, and consistently say "true" or "false", instead
of random mixes of '1', '0', 'yes', 'true', etc. in their
description.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though the name of the environment variable is mentioned in
"git config --help" under http.sslVerify, there is no description for
it. Add one.
Note that this is not a usual Boolean environment variable whose
value can be yes/true/on vs no/false/off; the existence of it is
enough to trigger the feature named by the variable.
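For example, as noted above, even an empty value disables
verification, because only the presence of the variable is checked:

    GIT_SSL_NO_VERIFY= git fetch    # certificate verification still disabled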
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When unicode filenames (encoded in UTF-8) are used, the visible width
on the screen is not the same as strlen().
For example, `git log --stat` may produce an output like this:
[snip the header]
Arger.txt | 1 +
Ärger.txt | 1 +
2 files changed, 2 insertions(+)
A side note: the original report was about cyrillic filenames.
After some investigations it turned out that
a) This is not a problem with "ambiguous characters" in unicode
b) The same problem exists for all unicode code points (so we
can use Latin based Umlauts for demonstrations below)
The 'Ä' takes the same space on the screen as the 'A'.
But it needs one more byte in memory, so the `git log --stat` output
for "Arger.txt" (!) gets mis-aligned:
The maximum length is derived from "Ärger.txt", 10 bytes in memory,
9 positions on the screen. That is why "Arger.txt" gets one extra ' '
for alignment, since it needs only 9 bytes in memory.
If there were a file "Ö", it would be correctly aligned by chance,
but "Öhö" would not.
The solution is, of course, to use utf8_strwidth() instead of strlen()
when dealing with the width on screen.
And then there is another problem, code like this:
strbuf_addf(&out, "%-*s", len, name);
(or using the underlying snprintf() function) does not align the
buffer to a minimum of len measured in screen-width, but uses the
memory count.
One could be tempted to wish that snprintf() was UTF-8 aware.
That doesn't seem to be the case anywhere (tested on Linux and Mac),
probably snprintf() uses the "bytes in memory"/strlen() approach to be
compatible with older versions and this will never change.
The basic idea is to change code in diff.c like this
strbuf_addf(&out, "%-*s", len, name);
into something like this:
int padding = len - utf8_strwidth(name);
if (padding < 0)
padding = 0;
strbuf_addf(&out, " %s%*s", name, padding, "");
The real change is slightly bigger, as it also integrates two calls
to strbuf_addf() into one.
Tests:
Two things need to be tested:
- The calculation of the maximum width
- The calculation of padding
The name "textfile" is changed into "tëxtfilë", both have a width of 8.
If strlen() was used, to get the maximum width, the shorter "binfile" would
have been mis-aligned:
binfile | [snip]
tëxtfilë | [snip]
If only "binfile" would be renamed into "binfilë":
binfilë | [snip]
textfile | [snip]
In order to verify that the width is calculated correctly everywhere,
"binfile" is renamed to "binfilë", one byte more in strlen(), and
"tëxtfile" is renamed to "tëxtfilë", two bytes more in strlen().
The updated t4012-diff-binary.sh checks the correct alignment:
binfilë | [snip]
tëxtfilë | [snip]
Reported-by: Alexander Meshcheryakov <alexander.s.m@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit 29de20504e (Makefile: fix default regex settings on
Darwin, 2013-05-11) fixed t0070-fundamental.sh under Darwin (macOS) by
adopting Git's regex library. However, this library is compiled with
NO_MBSUPPORT, which causes git-grep to work incorrectly on multibyte
(e.g. UTF-8) files. Current macOS versions pass t0070-fundamental.sh
with the native macOS regex library, which also supports multibyte
characters.
Adjust the Makefile to use the native regex library, and call
setlocale(3) to set CTYPE according to the user's preference.
The setlocale call is required on all platforms, but on platforms
supporting gettext(3), setlocale was called as a side-effect of
initializing gettext. Therefore, move the CTYPE setlocale call from
gettext.c to common-main.c and the corresponding locale.h include
into git-compat-util.h.
Thanks to the global initialization of CTYPE setlocale, the test-tool
regex command now works correctly with supported multibyte regexes, and
is used to set the MB_REGEX test prerequisite by assessing a platform's
support for them.
Signed-off-by: Diomidis Spinellis <dds@aueb.gr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ds/bundle-uri-clone:
clone: warn on failure to repo_init()
clone: --bundle-uri cannot be combined with --depth
bundle-uri: add support for http(s):// and file://
clone: add --bundle-uri option
bundle-uri: create basic file-copy logic
remote-curl: add 'get' capability