Clean-up code that handles combinations of incompatible options.
* rs/i18n-cannot-be-used-together:
i18n: factorize even more 'incompatible options' messages
Stale URLs have been updated to their current counterparts (or
archive.org) and HTTP links are replaced with working HTTPS links.
* js/update-urls-in-doc-and-comment:
doc: refer to internet archive
doc: update links for andre-simon.de
doc: switch links to https
doc: update links to current pages
Earlier we stopped relying on commit-graph that (still) records
information about commits that are lost from the object store,
which has negative performance implications. The default has been
flipped to disable this pessimization.
* ps/commit-graph-less-paranoid:
commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
Introduce "git replay", a tool meant on the server side without
working tree to recreate a history.
* cc/git-replay:
replay: stop assuming replayed branches do not diverge
replay: add --contained to rebase contained branches
replay: add --advance or 'cherry-pick' mode
replay: use standard revision ranges
replay: make it a minimal server side command
replay: remove HEAD related sanity check
replay: remove progress and info output
replay: add an important FIXME comment about gpg signing
replay: change rev walking options
replay: introduce pick_regular_commit()
replay: die() instead of failing assert()
replay: start using parse_options API
replay: introduce new builtin
t6429: remove switching aspects of fast-rebase
Simplify API implementation to delete references by eliminating
duplication.
* ps/ref-deletion-updates:
refs: remove `delete_refs` callback from backends
refs: deduplicate code to delete references
refs/files: use transactions to delete references
t5510: ensure that the packed-refs file needs locking
Newer versions of Getopt::Long started giving warnings against our
(ab)use of it in "git send-email". Bump the minimum version
requirement for Perl to 5.8.1 (from September 2002) to allow
simplifying our implementation.
* tz/send-email-negatable-options:
send-email: avoid duplicate specification warnings
perl: bump the required Perl version to 5.8.1 from 5.8.0
"git rebase --autosquash" is now enabled for non-interactive rebase,
but it is still incompatible with the apply backend.
* ak/rebase-autosquash:
rebase: rewrite --(no-)autosquash documentation
rebase: support --autosquash without -i
rebase: fully ignore rebase.autoSquash without -i
"git for-each-ref --no-sort" still sorted the refs alphabetically
which paid non-trivial cost. It has been redefined to show output
in an unspecified order, to allow certain optimizations to take
advantage of.
* vd/for-each-ref-unsorted-optimization:
t/perf: add perf tests for for-each-ref
ref-filter.c: use peeled tag for '*' format fields
for-each-ref: clean up documentation of --format
ref-filter.c: filter & format refs in the same callback
ref-filter.c: refactor to create common helper functions
ref-filter.c: rename 'ref_filter_handler()' to 'filter_one()'
ref-filter.h: add functions for filter/format & format-only
ref-filter.h: move contains caches into filter
ref-filter.h: add max_count and omit_empty to ref_format
ref-filter.c: really don't sort when using --no-sort
Test and shell scripts clean-up.
* ps/ban-a-or-o-operator-with-test:
Makefile: stop using `test -o` when unlinking duplicate executables
contrib/subtree: convert subtree type check to use case statement
contrib/subtree: stop using `-o` to test for number of args
global: convert trivial usages of `test <expr> -a/-o <expr>`
"git format-patch --encode-email-headers" ignored the option when
preparing the cover letter, which has been corrected.
* ss/format-patch-use-encode-headers-for-cover-letter:
format-patch: fix ignored encode_email_headers for cover letter
Update ref-related tests.
* ps/ref-tests-update:
t: mark several tests that assume the files backend with REFFILES
t7900: assert the absence of refs via git-for-each-ref(1)
t7300: assert exact states of repo
t4207: delete replace references via git-update-ref(1)
t1450: convert tests to remove worktrees via git-worktree(1)
t: convert tests to not access reflog via the filesystem
t: convert tests to not access symrefs via the filesystem
t: convert tests to not write references via the filesystem
t: allow skipping expected object ID in `ref-store update-ref`
"git add" and "git stash" learned to support the ":(attr:...)"
magic pathspec.
* jw/git-add-attr-pathspec:
attr: enable attr pathspec magic for git-add and git-stash
Code clean-up for jk/chunk-bounds topic.
* jk/chunk-bounds-more:
commit-graph: mark chunk error messages for translation
commit-graph: drop verify_commit_graph_lite()
commit-graph: check order while reading fanout chunk
commit-graph: use fanout value for graph size
commit-graph: abort as soon as we see a bogus chunk
commit-graph: clarify missing-chunk error messages
commit-graph: drop redundant call to "lite" verification
midx: check consistency of fanout table
commit-graph: handle overflow in chunk_size checks
The way CI testing used "prove" could lead to running the test
suite twice needlessly, which has been corrected.
* js/ci-discard-prove-state:
ci: avoid running the test suite _twice_
Add support for GitLab CI.
* ps/ci-gitlab:
ci: add support for GitLab CI
ci: install test dependencies for linux-musl
ci: squelch warnings when testing with unusable Git repo
ci: unify setup of some environment variables
ci: split out logic to set up failed test artifacts
ci: group installation of Docker dependencies
ci: make grouping setup more generic
ci: reorder definitions for grouping functions
Update the base topic to work with CMake builds.
* js/doc-unit-tests-with-cmake:
cmake: handle also unit tests
cmake: use test names instead of full paths
cmake: fix typo in variable name
artifacts-tar: when including `.dll` files, don't forget the unit-tests
unit-tests: do show relative file paths
unit-tests: do not mistake `.pdb` files for being executable
cmake: also build unit tests
Process to add some form of low-level unit tests has started.
* js/doc-unit-tests:
ci: run unit tests in CI
unit tests: add TAP unit test framework
unit tests: add a project plan document
Continue the work of 12909b6b8a (i18n: turn "options are incompatible"
into "cannot be used together", 2022-01-05) and a699367bb8 (i18n:
factorize more 'incompatible options' messages, 2022-01-31) to use the
same parameterized error message for reporting incompatible command line
options. This reduces the number of strings to translate and makes the
UI slightly more consistent.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Releasing strbuf and string_list just before exiting is not strictly
necessary, but it gets rid of false positives reported by leak checkers,
which can then be more easily used to show that the column-printing
machinery behind print_columns() are free of leaks.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The replay command is able to replay multiple branches but when some of
them are based on other replayed branches, their commit should be
replayed onto already replayed commits.
For this purpose, let's store the replayed commit and its original
commit in a key value store, so that we can easily find and reuse a
replayed commit instead of the original one.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's add a `--contained` option that can be used along with
`--onto` to rebase all the branches contained in the <revision-range>
argument.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is already a 'rebase' mode with `--onto`. Let's add an 'advance' or
'cherry-pick' mode with `--advance`. This new mode will make the target
branch advance as we replay commits onto it.
The replayed commits should have a single tip, so that it's clear where
the target branch should be advanced. If they have more than one tip,
this new mode will error out.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of the fixed "<oldbase> <branch>" arguments, the replay
command now accepts "<revision-range>..." arguments in a similar
way as many other Git commands. This makes its interface more
standard and more flexible.
This also enables many revision related options accepted and
eaten by setup_revisions(). If the replay command was a high level
one or had a high level mode, it would make sense to restrict some
of the possible options, like those generating non-contiguous
history, as they could be confusing for most users.
Also as the interface of the command is now mostly finalized,
we can add more documentation and more testcases to make sure
the command will continue to work as designed in the future.
We only document the rev-list related options among all the
revision related options that are now accepted, as the rev-list
related ones are probably the most useful for now.
Helped-by: Dragan Simic <dsimic@manjaro.org>
Helped-by: Linus Arver <linusa@google.com>
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want this command to be a minimal command that just does server side
picking of commits, displaying the results on stdout for higher level
scripts to consume.
So let's simplify it:
* remove the worktree and index reading/writing,
* remove the ref (and reflog) updating,
* remove the assumptions tying us to HEAD, since (a) this is not a
rebase and (b) we want to be able to pick commits in a bare repo,
i.e. to/from branches that are not checked out and not the main
branch,
* remove unneeded includes,
* handle rebasing multiple branches by printing on stdout the update
ref commands that should be performed.
The output can be piped into `git update-ref --stdin` for the ref
updates to happen.
In the future to make it easier for users to use this command
directly maybe an option can be added to automatically pipe its output
into `git update-ref`.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want replay to be a command that can be used on the server side on
any branch, not just the current one, so we are going to stop updating
HEAD in a future commit.
A "sanity check" that makes sure we are replaying the current branch
doesn't make sense anymore. Let's remove it.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The replay command will be changed in a follow up commit, so that it
will not update refs directly, but instead it will print on stdout a
list of commands that can be consumed by `git update-ref --stdin`.
We don't want this output to be polluted by its current low value
output, so let's just remove the latter.
In the future, when the command gets an option to update refs by
itself, it will make a lot of sense to display a progress meter, but
we are not there yet.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to be able to handle signed commits in some way in the future,
but we are not ready to do it now. So for the time being let's just add
a FIXME comment to remind us about it.
These are different ways we could handle them:
- in case of a cli user and if there was an interactive mode, we could
perhaps ask if the user wants to sign again
- we could add an option to just fail if there are signed commits
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's force the rev walking options we need after calling
setup_revisions() instead of before.
This might override some user supplied rev walking command line options
though. So let's detect that and warn users by:
a) setting the desired values, before setup_revisions(),
b) checking after setup_revisions() whether these values differ from
the desired values,
c) if so throwing a warning and setting the desired values again.
We want the command to work from older commits to newer ones by default.
Also we don't want history simplification, as we want to deal with all
the commits in the affected range.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's refactor the code to handle a regular commit (a commit that is
neither a root commit nor a merge commit) into a single function instead
of keeping it inside cmd_replay().
This is good for separation of concerns, and this will help further work
in the future to replay merge commits.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's not a good idea for regular Git commands to use an assert() to
check for things that could happen but are not supported.
Let's die() with an explanation of the issue instead.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of manually parsing arguments, let's start using the parse_options
API. This way this new builtin will look more standard, and in some
upcoming commits will more easily be able to handle more command line
options.
Note that we plan to later use standard revision ranges instead of
hardcoded "<oldbase> <branch>" arguments. When we will use standard
revision ranges, it will be easier to check if there are no spurious
arguments if we keep ARGV[0], so let's call parse_options() with
PARSE_OPT_KEEP_ARGV0 even if we don't need ARGV[0] right now to avoid
some useless code churn.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For now, this is just a rename from `t/helper/test-fast-rebase.c` into
`builtin/replay.c` with minimal changes to make it build appropriately.
Let's add a stub documentation and a stub test script though.
Subsequent commits will flesh out the capabilities of the new command
and make it a more standard regular builtin.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the time t6429 was written, merge-ort was still under development,
did not have quite as many tests, and certainly was not widely deployed.
Since t6429 was exercising some codepaths just a little differently, we
thought having them also test the "merge_switch_to_result()" bits of
merge-ort was useful even though they weren't intrinsic to the real
point of these tests.
However, the value provided by doing extra testing of the
"merge_switch_to_result()" bits has decreased a bit over time, and it's
actively making it harder to refactor `test-tool fast-rebase` into `git
replay`, which we are going to do in following commits. Dispense with
these bits.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7a5d604443 (commit: detect commits that exist in commit-graph but not
in the ODB, 2023-10-31), we have introduced a new object existence check
into `repo_parse_commit_internal()` so that we do not parse commits via
the commit-graph that don't have a corresponding object in the object
database. This new check of course comes with a performance penalty,
which the commit put at around 30% for `git rev-list --topo-order`. But
there are in fact scenarios where the performance regression is even
higher. The following benchmark against linux.git with a fully-build
commit-graph:
Benchmark 1: git.v2.42.1 rev-list --count HEAD
Time (mean ± σ): 658.0 ms ± 5.2 ms [User: 613.5 ms, System: 44.4 ms]
Range (min … max): 650.2 ms … 666.0 ms 10 runs
Benchmark 2: git.v2.43.0-rc1 rev-list --count HEAD
Time (mean ± σ): 1.333 s ± 0.019 s [User: 1.263 s, System: 0.069 s]
Range (min … max): 1.302 s … 1.361 s 10 runs
Summary
git.v2.42.1 rev-list --count HEAD ran
2.03 ± 0.03 times faster than git.v2.43.0-rc1 rev-list --count HEAD
While it's a noble goal to ensure that results are the same regardless
of whether or not we have a potentially stale commit-graph, taking twice
as much time is a tough sell. Furthermore, we can generally assume that
the commit-graph will be updated by git-gc(1) or git-maintenance(1) as
required so that the case where the commit-graph is stale should not at
all be common.
With that in mind, default-disable GIT_COMMIT_GRAPH_PARANOIA and restore
the behaviour and thus performance previous to the mentioned commit. In
order to not be inconsistent, also disable this behaviour by default in
`lookup_commit_in_graph()`, where the object existence check has been
introduced right at its inception via f559d6d45e (revision: avoid
hitting packfiles when commits are in commit-graph, 2021-08-09).
This results in another speedup in commands that end up calling this
function, even though it's less pronounced compared to the above
benchmark. The following has been executed in linux.git with ~1.2
million references:
Benchmark 1: GIT_COMMIT_GRAPH_PARANOIA=true git rev-list --all --no-walk=unsorted
Time (mean ± σ): 2.947 s ± 0.003 s [User: 2.412 s, System: 0.534 s]
Range (min … max): 2.943 s … 2.949 s 3 runs
Benchmark 2: GIT_COMMIT_GRAPH_PARANOIA=false git rev-list --all --no-walk=unsorted
Time (mean ± σ): 2.724 s ± 0.030 s [User: 2.207 s, System: 0.514 s]
Range (min … max): 2.704 s … 2.759 s 3 runs
Summary
GIT_COMMIT_GRAPH_PARANOIA=false git rev-list --all --no-walk=unsorted ran
1.08 ± 0.01 times faster than GIT_COMMIT_GRAPH_PARANOIA=true git rev-list --all --no-walk=unsorted
So whereas 7a5d604443 initially introduced the logic to start doing an
object existence check in `repo_parse_commit_internal()` by default, the
updated logic will now instead cause `lookup_commit_in_graph()` to stop
doing the check by default. This behaviour continues to be tweakable by
the user via the GIT_COMMIT_GRAPH_PARANOIA environment variable.
Note that this requires us to amend some tests to manually turn on the
paranoid checks again. This is because we cause repository corruption by
manually deleting objects which are part of the commit graph already.
These circumstances shouldn't usually happen in repositories.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These pages are no longer reachable from their original locations,
which makes things difficult for readers. Instead, switch to linking to
the Internet Archive for the content.
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Beyond the fact that it's somewhat traditional to respect sites'
self-identification, it's helpful for links to point to the things
that people expect them to reference. Here that means linking to
specific pages instead of a domain.
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These sites offer https versions of their content.
Using the https versions provides some protection for users.
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's somewhat traditional to respect sites' self-identification.
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for reflog states that the --dry-run option of the
expire and delete subcommands has a corresponding short name, -n.
However, 33d7bdd645 (builtin/reflog.c: use parse-options api for expire,
delete subcommands, 2022-01-06) did not include this short name in the
new options parsing.
Re-add the short name in the new dry-run option definitions.
Signed-off-by: Josh Brobst <josh@brob.st>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One parameter is called `file_pach`. On the face of it, this looks as if
it was supposed to talk about a `path` instead of a `pach`.
However, looking at the way this callback is called, it gets fed the
`d_name` from a directory entry, which provides just the file name, not
the full path. Therefore, let's fix this by calling the parameter
`file_name` instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Newer versions of Getopt::Long started giving warnings against our
(ab)use of it in "git send-email". Bump the minimum version
requirement for Perl to 5.8.1 (from September 2002) to allow
simplifying our implementation.
* tz/send-email-negatable-options:
send-email: avoid duplicate specification warnings
perl: bump the required Perl version to 5.8.1 from 5.8.0
"git rebase --autosquash" is now enabled for non-interactive rebase,
but it is still incompatible with the apply backend.
* ak/rebase-autosquash:
rebase: rewrite --(no-)autosquash documentation
rebase: support --autosquash without -i
rebase: fully ignore rebase.autoSquash without -i
"git for-each-ref --no-sort" still sorted the refs alphabetically
which paid non-trivial cost. It has been redefined to show output
in an unspecified order, to allow certain optimizations to take
advantage of.
* vd/for-each-ref-unsorted-optimization:
t/perf: add perf tests for for-each-ref
ref-filter.c: use peeled tag for '*' format fields
for-each-ref: clean up documentation of --format
ref-filter.c: filter & format refs in the same callback
ref-filter.c: refactor to create common helper functions
ref-filter.c: rename 'ref_filter_handler()' to 'filter_one()'
ref-filter.h: add functions for filter/format & format-only
ref-filter.h: move contains caches into filter
ref-filter.h: add max_count and omit_empty to ref_format
ref-filter.c: really don't sort when using --no-sort
"To dereference" and "to peel" were sometimes used in in-code
comments and documentation but without description in the glossary.
* vd/glossary-dereference-peel:
glossary: add definitions for dereference & peel
Now that `refs_delete_refs` is implemented in a generic way via the ref
transaction interfaces there are no callers left that invoke the
`delete_refs` callback anymore. Remove it from all of our backends.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both the files and the packed-refs reference backends now use the same
generic transactions-based code to delete references. Let's pull these
implementations up into `refs_delete_refs()` to deduplicate the code.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the `files_delete_refs()` callback function of the files backend we
implement deletion of references. This is done in two steps:
1. We lock the packed-refs file and delete all references from it in
a single transaction.
2. We delete all loose references via separate calls to
`refs_delete_ref()`.
These steps essentially duplicate the logic around locking and deletion
order that we already have in the transactional interfaces, where we do
know to lock and evict references from the packed-refs file. Despite the
fact that we duplicate the logic, it's also less efficient than if we
used a single generic transaction:
- The transactional interface knows to skip locking of the packed
refs in case they don't contain any of the refs which are about to
be deleted.
- We end up creating N+1 separate reference transactions, one for
the packed-refs file and N for the individual loose references.
Refactor the code to instead delete references via a single transaction.
As we don't assert the expected old object ID this is equivalent to the
previous behaviour, and we already do the same in the packed-refs
backend.
Despite the fact that the result is simpler to reason about, this change
also results in improved performance. The following benchmarks have been
executed in linux.git:
```
$ hyperfine -n '{rev}, packed={packed} refcount={refcount}' \
-L packed true,false -L refcount 1,1000 -L rev master,pks-ref-store-generic-delete-refs \
--setup 'git -C /home/pks/Development/git switch --detach {rev} && make -C /home/pks/Development/git -j17' \
--prepare 'printf "create refs/heads/new-branch-%d HEAD\n" $(seq {refcount}) | git -C /home/pks/Reproduction/linux.git update-ref --stdin && if test {packed} = true; then git pack-refs --all; fi' \
--warmup=10 \
'/home/pks/Development/git/bin-wrappers/git -C /home/pks/Reproduction/linux.git branch -d new-branch-{1..{refcount}}'
Benchmark 1: master packed=true refcount=1
Time (mean ± σ): 7.8 ms ± 1.6 ms [User: 3.4 ms, System: 4.4 ms]
Range (min … max): 5.5 ms … 11.0 ms 120 runs
Benchmark 2: master packed=false refcount=1
Time (mean ± σ): 7.0 ms ± 1.1 ms [User: 3.2 ms, System: 3.8 ms]
Range (min … max): 5.7 ms … 9.8 ms 180 runs
Benchmark 3: master packed=true refcount=1000
Time (mean ± σ): 283.8 ms ± 5.2 ms [User: 45.7 ms, System: 231.5 ms]
Range (min … max): 276.7 ms … 291.6 ms 10 runs
Benchmark 4: master packed=false refcount=1000
Time (mean ± σ): 284.4 ms ± 5.3 ms [User: 44.2 ms, System: 233.6 ms]
Range (min … max): 277.1 ms … 293.3 ms 10 runs
Benchmark 5: generic-delete-refs packed=true refcount=1
Time (mean ± σ): 6.2 ms ± 1.8 ms [User: 2.3 ms, System: 3.9 ms]
Range (min … max): 4.1 ms … 12.2 ms 142 runs
Benchmark 6: generic-delete-refs packed=false refcount=1
Time (mean ± σ): 7.1 ms ± 1.4 ms [User: 2.8 ms, System: 4.3 ms]
Range (min … max): 4.2 ms … 10.8 ms 157 runs
Benchmark 7: generic-delete-refs packed=true refcount=1000
Time (mean ± σ): 198.9 ms ± 1.7 ms [User: 29.5 ms, System: 165.7 ms]
Range (min … max): 196.1 ms … 201.4 ms 10 runs
Benchmark 8: generic-delete-refs packed=false refcount=1000
Time (mean ± σ): 199.7 ms ± 7.8 ms [User: 32.2 ms, System: 163.2 ms]
Range (min … max): 193.8 ms … 220.7 ms 10 runs
Summary
generic-delete-refs packed=true refcount=1 ran
1.14 ± 0.37 times faster than master packed=false refcount=1
1.15 ± 0.40 times faster than generic-delete-refs packed=false refcount=1
1.26 ± 0.44 times faster than master packed=true refcount=1
32.24 ± 9.17 times faster than generic-delete-refs packed=true refcount=1000
32.36 ± 9.29 times faster than generic-delete-refs packed=false refcount=1000
46.00 ± 13.10 times faster than master packed=true refcount=1000
46.10 ± 13.13 times faster than master packed=false refcount=1000
```
Especially in the case where we have many references we can see a clear
performance speedup of nearly 30%.
This is in contrast to the stated objecive in a27dcf89b6 (refs: make
delete_refs() virtual, 2016-09-04), where the virtual `delete_refs()`
function was introduced with the intent to speed things up rather than
making things slower. So it seems like we have outlived the need for a
virtual function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the tests in t5510 asserts that a `git fetch --prune` detects
failures to prune branches. This is done by locking the packed-refs
file, which would then later lead to a locking issue when Git tries to
rewrite the file to prune the branches from it.
Interestingly though, we do not pack the about-to-be-pruned branch into
the packed-refs file, so it never even contained that branch in the
first place. While this is good enough right now because the pruning
will always lock the file regardless of whether it contains the branch
or not, this is a mere implementation detail. In fact, we're about to
rewrite branch deletions to make use of the ref transaction interface,
which knows to skip rewrites of the packed-refs file in the case where
it does not contain the branches in the first place, and this will break
the test.
Prepare the test for that change by packing the refs before trying to
prune them.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A warning is issued for options which are specified more than once
beginning with perl-Getopt-Long >= 2.55. In addition to causing users
to see warnings, this results in test failures which compare the output.
An example, from t9001-send-email.37:
| +++ diff -u expect actual
| --- expect 2023-11-14 10:38:23.854346488 +0000
| +++ actual 2023-11-14 10:38:23.848346466 +0000
| @@ -1,2 +1,7 @@
| +Duplicate specification "no-chain-reply-to" for option "no-chain-reply-to"
| +Duplicate specification "to-cover|to-cover!" for option "to-cover"
| +Duplicate specification "cc-cover|cc-cover!" for option "cc-cover"
| +Duplicate specification "no-thread" for option "no-thread"
| +Duplicate specification "no-to-cover" for option "no-to-cover"
| fatal: longline.patch:35 is longer than 998 characters
| warning: no patches were sent
| error: last command exited with $?=1
| not ok 37 - reject long lines
Remove the duplicate option specs. These are primarily the explicit
'--no-' prefix opts which were added in f471494303 (git-send-email.perl:
support no- prefix with older GetOptions, 2015-01-30). This was done
specifically to support perl-5.8.0 which includes Getopt::Long 2.32[1].
Getopt::Long 2.33 added support for the '--no-' prefix natively by
appending '!' to the option specification string, which was included in
perl-5.8.1 and is not present in perl-5.8.0. The previous commit bumped
the minimum supported Perl version to 5.8.1 so we no longer need to
provide the '--no-' variants for negatable options manually.
Teach `--git-completion-helper` to output the '--no-' options. They are
not included in the options hash and would otherwise be lost.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following commit will make use of a Getopt::Long feature which is
only present in Perl >= 5.8.1. Document that as the minimum version we
support.
Many of our Perl scripts will continue to run with 5.8.0 but this change
allows us to adjust them as needed without breaking any promises to our
users.
The Perl requirement was last changed in d48b284183 (perl: bump the
required Perl version to 5.8 from 5.6.[21], 2010-09-24). At that time,
5.8.0 was 8 years old. It is now over 21 years old.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add performance tests for 'for-each-ref'. The tests exercise different
combinations of filters/formats/options, as well as the overall performance
of 'git for-each-ref | git cat-file --batch-check' to demonstrate the
performance difference vs. 'git for-each-ref' with "%(*fieldname)" format
specifiers.
All tests are run against a repository with 40k loose refs - 10k commits,
each having a unique:
- branch
- custom ref (refs/custom/special_*)
- annotated tag pointing at the commit
- annotated tag pointing at the other annotated tag (i.e., a nested tag)
After those tests are finished, the refs are packed with 'pack-refs --all'
and the same tests are rerun.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most builtins ('rev-parse <revision>^{}', 'show-ref --dereference'),
"dereferencing" a tag refers to a recursive peel of the tag object. Unlike
these cases, the dereferencing prefix ('*') in 'for-each-ref' format
specifiers triggers only a single, non-recursive dereference of a given tag
object. For most annotated tags, a single dereference is all that is needed
to access the tag's associated commit or tree; "recursive" and
"non-recursive" dereferencing are functionally equivalent in these cases.
However, nested tags (annotated tags whose target is another annotated tag)
dereferenced once return another tag, where a recursive dereference would
return the commit or tree.
Currently, if a user wants to filter & format refs and include information
about a recursively-dereferenced tag, they can do so with something like
'cat-file --batch-check':
git for-each-ref --format="%(objectname)^{} %(refname)" <pattern> |
git cat-file --batch-check="%(objectname) %(rest)"
But the combination of commands is inefficient. So, to improve the
performance of this use case and align the defererencing behavior of
'for-each-ref' with that of other commands, update the ref formatting code
to use the peeled tag (from 'peel_iterated_oid()') to populate '*' fields
rather than the tag's immediate target object (from 'get_tagged_oid()').
Additionally, add a test to 't6300-for-each-ref' to verify new nested tag
behavior and update 't6302-for-each-ref-filter.sh' to print the correct
value for nested dereferenced fields.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the description of the `*` prefix from the --format option
documentation to the part of the command documentation that deals with other
object type-specific modifiers. Also reorganize and reword the remaining
--format documentation so that the explanation of the default format doesn't
interrupt the details on format string interpolation.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'filter_and_format_refs()' to try to perform ref filtering &
formatting in a single ref iteration, without an intermediate 'struct
ref_array'. This can only be done if no operations need to be performed on a
pre-filtered array; specifically, if the refs are
- filtered on reachability,
- sorted, or
- formatted with ahead-behind information
they cannot be filtered & formatted in the same iteration. In that case,
fall back on the current filter-then-sort-then-format flow.
This optimization substantially improves memory usage due to no longer
storing a ref array in memory. In some cases, it also dramatically reduces
runtime (e.g. 'git for-each-ref --no-sort --count=1', which no longer loads
all refs into a 'struct ref_array' to printing only the first ref).
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Factor out parts of 'ref_array_push()', 'ref_filter_handler()', and
'filter_refs()' into new helper functions:
* Extract the code to grow a 'struct ref_array' and append a given 'struct
ref_array_item *' to it from 'ref_array_push()' into 'ref_array_append()'.
* Extract the code to filter a given ref by refname & object ID then create
a new 'struct ref_array_item *' from 'filter_one()' into
'apply_ref_filter()'.
* Extract the code for filter pre-processing, contains cache creation, and
ref iteration from 'filter_refs()' into 'do_filter_refs()'.
In later patches, these helpers will be used by new ref-filter API
functions. This patch does not result in any user-facing behavior changes or
changes to callers outside of 'ref-filter.c'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename 'ref_filter_handler()' to 'filter_one()' to more clearly distinguish
it from other ref filtering callbacks that will be added in later patches.
The "*_one()" naming convention is common throughout the codebase for
iteration callbacks.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add two new public methods to 'ref-filter.h':
* 'print_formatted_ref_array()' which, given a format specification & array
of ref items, formats and prints the items to stdout.
* 'filter_and_format_refs()' which combines 'filter_refs()',
'ref_array_sort()', and 'print_formatted_ref_array()' into a single
function.
This consolidates much of the code used to filter and format refs in
'builtin/for-each-ref.c', 'builtin/tag.c', and 'builtin/branch.c', reducing
duplication and simplifying the future changes needed to optimize the filter
& format process.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the 'contains_cache' and 'no_contains_cache' used in filter_refs into
an 'internal' struct of the 'struct ref_filter'. In later patches, the
'struct ref_filter *' will be a common data structure across multiple
filtering functions. These caches are part of the common functionality the
filter struct will support, so they are updated to be internally accessible
wherever the filter is used.
The design used here mirrors what was introduced in 576de3d956
(unpack_trees: start splitting internal fields from public API, 2023-02-27)
for 'unpack_trees_options'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an internal 'array_opts' struct to 'struct ref_format' containing
formatting options that pertain to the formatting of an entire ref array:
'max_count' and 'omit_empty'. These values are specified by the '--count'
and '--omit-empty' options, respectively, to 'for-each-ref'/'tag'/'branch'.
Storing these values in the 'ref_format' will simplify the consolidation of
ref array formatting logic across builtins in later patches.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When '--no-sort' is passed to 'for-each-ref', 'tag', and 'branch', the
printed refs are still sorted by ascending refname. Change the handling of
sort options in these commands so that '--no-sort' to truly disables
sorting.
'--no-sort' does not disable sorting in these commands is because their
option parsing does not distinguish between "the absence of '--sort'"
(and/or values for tag.sort & branch.sort) and '--no-sort'. Both result in
an empty 'sorting_options' string list, which is parsed by
'ref_sorting_options()' to create the 'struct ref_sorting *' for the
command. If the string list is empty, 'ref_sorting_options()' interprets
that as "the absence of '--sort'" and returns the default ref sorting
structure (equivalent to "refname" sort).
To handle '--no-sort' properly while preserving the "refname" sort in the
"absence of --sort'" case, first explicitly add "refname" to the string list
*before* parsing options. This alone doesn't actually change any behavior,
since 'compare_refs()' already falls back on comparing refnames if two refs
are equal w.r.t all other sort keys.
Now that the string list is populated by default, '--no-sort' is the only
way to empty the 'sorting_options' string list. Update
'ref_sorting_options()' to return a NULL 'struct ref_sorting *' if the
string list is empty, and add a condition to 'ref_array_sort()' to skip the
sort altogether if the sort structure is NULL. Note that other functions
using 'struct ref_sorting *' do not need any changes because they already
ignore NULL values.
Finally, remove the condition around sorting in 'ls-remote', since it's no
longer necessary. Unlike 'for-each-ref' et. al., it does *not* do any
sorting by default. This default is preserved by simply leaving its sort key
string list empty before parsing options; if no additional sort keys are
set, 'struct ref_sorting *' is NULL and sorting is skipped.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite the description of the rebase --(no-)autosquash options to try
to make it a bit clearer. Don't use "the '...'" to refer to part of a
commit message, mention how --interactive can be used to review the
todo list, and add a bit more detail on commit --squash/amend.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The rebase --autosquash option is quietly ignored when used without
--interactive (apart from preventing preemptive fast-forwarding and
triggering conflicts with apply backend options).
Change that to support --autosquash without --interactive, by dropping
its restriction to REBASE_INTERACTIVE_EXCPLICIT mode. When used this
way, auto-squashing is done without opening the todo list editor.
Drop the -i requirement from the --autosquash description, and amend
t3415-rebase-autosquash.sh to test the option and the rebase.autoSquash
config variable with and without -i.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Setting the rebase.autoSquash config variable to true implies a couple
of restrictions: it prevents preemptive fast-forwarding and it triggers
conflicts with apply backend options. However, it only actually results
in auto-squashing when combined with the --interactive (or -i) option,
due to code in run_specific_rebase() that disables auto-squashing unless
the REBASE_INTERACTIVE_EXPLICIT flag is set.
Doing autosquashing for rebase.autoSquash without --interactive would be
problematic in terms of backward compatibility, but conversely, there is
no need for the aforementioned restrictions without --interactive.
So drop the options.config_autosquash check from the conditions for
clearing allow_preemptive_ff, as the case where it is combined with
--interactive is already covered by the REBASE_INTERACTIVE_EXPLICIT
flag check above it.
Also drop the "apply options are incompatible with rebase.autoSquash"
error, because it is unreachable if it is restricted to --interactive,
as apply options already cause an error when used with --interactive.
Drop the tests for the error from t3422-rebase-incompatible-options.sh,
which has separate tests for the conflicts of --interactive with apply
options.
When neither --autosquash nor --no-autosquash are given, only set
options.autosquash to true if rebase.autosquash is combined with
--interactive.
Don't initialize options.config_autosquash to -1, as there is no need to
distinguish between rebase.autoSquash being unset or explicitly set to
false.
Finally, amend the rebase.autoSquash documentation to say it only
affects interactive mode.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The way CI testing used "prove" could lead to running the test
suite twice needlessly, which has been corrected.
* js/ci-discard-prove-state:
ci: avoid running the test suite _twice_
Test and shell scripts clean-up.
* ps/ban-a-or-o-operator-with-test:
Makefile: stop using `test -o` when unlinking duplicate executables
contrib/subtree: convert subtree type check to use case statement
contrib/subtree: stop using `-o` to test for number of args
global: convert trivial usages of `test <expr> -a/-o <expr>`
"git format-patch --encode-email-headers" ignored the option when
preparing the cover letter, which has been corrected.
* ss/format-patch-use-encode-headers-for-cover-letter:
format-patch: fix ignored encode_email_headers for cover letter
This is a late amendment of 4a6e4b9602 (CI: remove Travis CI support,
2021-11-23), whereby the `.prove` file (being written by the `prove`
command that is used to run the test suite) is no longer retained
between CI builds: This feature was only ever used in the Travis CI
builds, we tried for a while to do the same in Azure Pipelines CI runs
(but I gave up on it after a while), and we never used that feature in
GitHub Actions (nor does the new GitLab CI code use it).
Retaining the Prove cache has been fragile from the start, even though
the idea seemed good at the time, the idea being that the `.prove` file
caches information about previous `prove` runs (`save`) and uses them
(`slow`) to run the tests in the order from longer-running to shorter
ones, making optimal use of the parallelism implied by `--jobs=<N>`.
However, using a Prove cache can cause some surprising behavior: When
the `prove` caches information about a test script it has run,
subsequent `prove` runs (with `--state=slow`) will run the same test
script again even if said script is not specified on the `prove`
command-line!
So far, this bug did not matter. Right until d8f416bbb8 (ci: run unit
tests in CI, 2023-11-09) did it not matter.
But starting with that commit, we invoke `prove` _twice_ in CI, once to
run the regular test suite of regression test scripts, and once to run
the unit tests. Due to the bug, the second invocation re-runs all of the
tests that were already run as part of the first invocation. This not
only wastes build minutes, it also frequently causes the `osx-*` jobs to
fail because they already take a long time and now are likely to run
into a timeout.
The worst part about it is that there is actually no benefit to keep
running with `--state=slow,save`, ever since we decided no longer to
try to reuse the Prove cache between CI runs.
So let's just drop that Prove option and live happily ever after.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update ref-related tests.
* ps/ref-tests-update:
t: mark several tests that assume the files backend with REFFILES
t7900: assert the absence of refs via git-for-each-ref(1)
t7300: assert exact states of repo
t4207: delete replace references via git-update-ref(1)
t1450: convert tests to remove worktrees via git-worktree(1)
t: convert tests to not access reflog via the filesystem
t: convert tests to not access symrefs via the filesystem
t: convert tests to not write references via the filesystem
t: allow skipping expected object ID in `ref-store update-ref`
"git add" and "git stash" learned to support the ":(attr:...)"
magic pathspec.
* jw/git-add-attr-pathspec:
attr: enable attr pathspec magic for git-add and git-stash
Code clean-up for jk/chunk-bounds topic.
* jk/chunk-bounds-more:
commit-graph: mark chunk error messages for translation
commit-graph: drop verify_commit_graph_lite()
commit-graph: check order while reading fanout chunk
commit-graph: use fanout value for graph size
commit-graph: abort as soon as we see a bogus chunk
commit-graph: clarify missing-chunk error messages
commit-graph: drop redundant call to "lite" verification
midx: check consistency of fanout table
commit-graph: handle overflow in chunk_size checks
When building executables we may end up with both `foo` and `foo.exe` in
the project's root directory. This can cause issues on Cygwin, which is
why we unlink the `foo` binary (see 6fc301bbf6 (Makefile: remove $foo
when $foo.exe is built/installed., 2007-01-10)). This step is skipped if
either:
- `foo` is a directory, which can happen when building Git on
Windows via MSVC (see ade2ca0ca9 (Do not try to remove
directories when removing old links, 2009-10-27)).
- `foo` is a hardlink to `foo.exe`, which can happen on Cygwin (see
0d768f7c8f (Makefile: building git in cygwin 1.7.0, 2008-08-15)).
These two conditions are currently chained together via `test -o`, which
is discouraged by our code style guide. Convert the recipe to instead
use an `if` statement with `&&`'d conditions, which both matches our
style guide and is easier to ready.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `subtree_for_commit ()` helper function asserts that the subtree
identified by its parameters are either a commit or tree. This is done
via the `-o` parameter of test, which is discouraged.
Refactor the code to instead use a switch statement over the type.
Despite being aligned with our coding guidelines, the resulting code is
arguably also easier to read.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Functions in git-subtree.sh all assert that they are being passed the
correct number of arguments. In cases where we accept a variable number
of arguments we assert this via a single call to `test` with `-o`, which
is discouraged by our coding guidelines.
Convert these cases to stop doing so. This requires us to decompose
assertions of the style `assert test $# = 2 -o $# = 3` into two calls
because we have no easy way to logically chain statements passed to the
assert function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our coding guidelines say to not use `test` with `-a` and `-o` because
it can easily lead to bugs. Convert trivial cases where we still use
these to instead instead concatenate multiple invocations of `test` via
`&&` and `||`, respectively.
While not all of the converted instances can cause ambiguity, it is
worth getting rid of all of them regardless:
- It becomes easier to reason about the code as we do not have to
argue why one use of `-a`/`-o` is okay while another one isn't.
- We don't encourage people to use these expressions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hooks executed by Subversion are spawned with an empty environment. By
default, not even variables like PATH will be propagated to them. In
order to ensure that we're still able to find required executables, we
thus write the current PATH variable into the hook script itself and
then re-export it in t9164.
This happens too late in the script though, as we already tried to
execute the basename(1) utility before exporting the PATH variable. This
tends to work on most platforms as the fallback value of PATH for Bash
(see `getconf PATH`) is likely to contain this binary. But on more
exotic platforms like NixOS this is not the case, and thus the test
fails.
While we could work around this issue by simply setting PATH earlier, it
feels fragile to inject a user-controlled value into the script and have
the shell interpret it. Instead, we can refactor the hook setup to write
a `hooks-env` file that configures PATH for us. Like this, Subversion
will know to set up the environment as expected for all hooks.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When setting up httpd for our tests, we also install a passwd and
proxy-passwd file that contain the test user's credentials. These
credentials currently use crypt(3) as the password encryption schema.
This schema can be considered deprecated nowadays as it is not safe
anymore. Quoting Apache httpd's documentation [1]:
> Unix only. Uses the traditional Unix crypt(3) function with a
> randomly-generated 32-bit salt (only 12 bits used) and the first 8
> characters of the password. Insecure.
This is starting to cause issues in modern Linux distributions. glibc
has deprecated its libcrypt library that used to provide crypt(3) in
favor of the libxcrypt library. This newer replacement provides a
compile time switch to disable insecure password encryption schemata,
which causes crypt(3) to always return `EINVAL`. The end result is that
httpd tests that exercise authentication will fail on distros that use
libxcrypt without these insecure encryption schematas.
Regenerate the passwd files to instead use the default password
encryption schema, which is md5. While it feels kind of funny that an
MD5-based encryption schema should be more secure than anything else, it
is the current default and supported by all platforms. Furthermore, it
really doesn't matter all that much given that these files are only used
for testing purposes anyway.
[1]: https://httpd.apache.org/docs/2.4/misc/password_encryptions.html
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to set up the Apache httpd server, we need to locate both the
httpd binary and its default module path. This is done with a hardcoded
list of locations that we scan. While this works okayish with distros
that more-or-less follow the Filesystem Hierarchy Standard, it falls
apart on others like NixOS that don't.
While it is possible to specify these paths via `LIB_HTTPD_PATH` and
`LIB_HTTPD_MODULE_PATH`, it is not a nice experience for the developer
to figure out how to set those up. And in fact we can do better by
dynamically detecting both httpd and its module path at runtime:
- The httpd binary can be located via PATH.
- The module directory can (in many cases) be derived via the
`HTTPD_ROOT` compile-time variable.
Amend the code to do so.
Note that the new runtime-detected paths will only be used as a fallback
in case none of the hardcoded paths are usable. For the PATH lookup this
is because httpd is typically installed into "/usr/sbin", which is often
not included in the user's PATH variable. And the module path detection
relies on a configured httpd installation and may thus not work in all
cases, either.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing the cover letter, the encode_email_headers option was
ignored. That is, UTF-8 subject lines and email addresses were
written out as-is, without any Q-encoding, even if
--encode-email-headers was passed on the command line.
This is due to encode_email_headers not being copied over from
struct rev_info to struct pretty_print_context. Fix that and add
a test.
Signed-off-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add support for GitLab CI.
* ps/ci-gitlab:
ci: add support for GitLab CI
ci: install test dependencies for linux-musl
ci: squelch warnings when testing with unusable Git repo
ci: unify setup of some environment variables
ci: split out logic to set up failed test artifacts
ci: group installation of Docker dependencies
ci: make grouping setup more generic
ci: reorder definitions for grouping functions
Update the base topic to work with CMake builds.
* js/doc-unit-tests-with-cmake:
cmake: handle also unit tests
cmake: use test names instead of full paths
cmake: fix typo in variable name
artifacts-tar: when including `.dll` files, don't forget the unit-tests
unit-tests: do show relative file paths
unit-tests: do not mistake `.pdb` files for being executable
cmake: also build unit tests
Process to add some form of low-level unit tests has started.
* js/doc-unit-tests:
ci: run unit tests in CI
unit tests: add TAP unit test framework
unit tests: add a project plan document
The unit tests should also be available e.g. in Visual Studio's Test
Explorer when configuring Git's source code via CMake.
Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The primary purpose of Git's CMake definition is to allow developing Git
in Visual Studio. As part of that, the CTest feature allows running
individual test scripts conveniently in Visual Studio's Test Explorer.
However, this Test Explorer's design targets object-oriented languages
and therefore expects the test names in the form
`<namespace>.<class>.<testname>`. And since we specify the full path
of the test scripts instead, including the ugly `/.././t/` part, these
dots confuse the Test Explorer and it uses a large part of the path as
"namespace".
Let's just use `t.suite.<name>` instead. This presents the tests in
Visual Studio's Test Explorer in the following form by default (i.e.
unless the user changes the view via the "Group by" menu):
◢ ◈ git
◢ ◈ t
◢ ◈ suite
◈ t0000-basic
◈ t0001-init
◈ t0002-gitfile
[...]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As of recent, Git also builds executables in `t/unit-tests/`. For
technical reasons, when building with CMake and Visual C, the
dependencies (".dll files") need to be copied there, too, otherwise
running the executable will fail "due to missing dependencies".
The CMake definition already contains the directives to copy those
`.dll` files, but we also need to adjust the `artifacts-tar` rule in
the `Makefile` accordingly to let the `vs-test` job in the CI runs
pass successfully.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Visual C interpolates `__FILE__` with the absolute _Windows_ path of
the source file. GCC interpolates it with the relative path, and the
tests even verify that.
So let's make sure that the unit tests only emit such paths.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When building the unit tests via CMake, the `.pdb` files are built.
Those are, essentially, files containing the debug information
separately from the executables.
Let's not confuse them with the executables we actually want to run.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new, better way to run unit tests was just added to Git. This adds
support for building those unit tests via CMake.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Run unit tests in both Cirrus and GitHub CI. For sharded CI instances
(currently just Windows on GitHub), run only on the first shard. This is
OK while we have only a single unit test executable, but we may wish to
distribute tests more evenly when we add new unit tests in the future.
We may also want to add more status output in our unit test framework,
so that we can do similar post-processing as in
ci/lib.sh:handle_failed_tests().
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch contains an implementation for writing unit tests with TAP
output. Each test is a function that contains one or more checks. The
test is run with the TEST() macro and if any of the checks fail then the
test will fail. A complete program that tests STRBUF_INIT would look
like
#include "test-lib.h"
#include "strbuf.h"
static void t_static_init(void)
{
struct strbuf buf = STRBUF_INIT;
check_uint(buf.len, ==, 0);
check_uint(buf.alloc, ==, 0);
check_char(buf.buf[0], ==, '\0');
}
int main(void)
{
TEST(t_static_init(), "static initialization works);
return test_done();
}
The output of this program would be
ok 1 - static initialization works
1..1
If any of the checks in a test fail then they print a diagnostic message
to aid debugging and the test will be reported as failing. For example a
failing integer check would look like
# check "x >= 3" failed at my-test.c:102
# left: 2
# right: 3
not ok 1 - x is greater than or equal to three
There are a number of check functions implemented so far. check() checks
a boolean condition, check_int(), check_uint() and check_char() take two
values to compare and a comparison operator. check_str() will check if
two strings are equal. Custom checks are simple to implement as shown in
the comments above test_assert() in test-lib.h.
Tests can be skipped with test_skip() which can be supplied with a
reason for skipping which it will print. Tests can print diagnostic
messages with test_msg(). Checks that are known to fail can be wrapped
in TEST_TODO().
There are a couple of example test programs included in this
patch. t-basic.c implements some self-tests and demonstrates the
diagnostic output for failing test. The output of this program is
checked by t0080-unit-test-output.sh. t-strbuf.c shows some example
unit tests for strbuf.c
The unit tests will be built as part of the default "make all" target,
to avoid bitrot. If you wish to build just the unit tests, you can run
"make build-unit-tests". To run the tests, you can use "make unit-tests"
or run the test binaries directly, as in "./t/unit-tests/bin/t-strbuf".
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In our current testing environment, we spend a significant amount of
effort crafting end-to-end tests for error conditions that could easily
be captured by unit tests (or we simply forgo some hard-to-setup and
rare error conditions). Describe what we hope to accomplish by
implementing unit tests, and explain some open questions and milestones.
Discuss desired features for test frameworks/harnesses, and provide a
comparison of several different frameworks. Finally, document our
rationale for implementing a custom framework.
Co-authored-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The patches from f32af12cee (Merge branch 'jk/chunk-bounds', 2023-10-23)
added many new untranslated error messages. While it's unlikely for most
users to see these messages at all, most of the other commit-graph error
messages are translated (and likewise for the matching midx messages).
Let's mark them all for consistency (and to help any poor unfortunate
user who does manage to find a broken graph file).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we've moved all of the checks from this function directly into the
chunk-reading code used by the caller (and there is only one caller), we
can just drop it entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We read the fanout chunk, storing a pointer to it, but only confirm that
the entries are monotonic in a final "lite" verification step. Let's
move that into the actual OIDF chunk callback, so that we can report
problems immediately (for all the reasons given in the previous
"commit-graph: abort as soon as we see a bogus chunk" commit).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit-graph, midx, and pack idx files all have both a lookup table of
oids and an oid fanout table. In midx and pack idx files, we take the
final entry of the fanout table as the source of truth for the number of
entries, and then verify that the size of the lookup table matches that.
But for commit-graph files, we do the opposite: we use the size of the
lookup table as the source of truth, and then check the final fanout
entry against it.
As noted in 4169d89645 (commit-graph: check consistency of fanout
table, 2023-10-09), either is correct. But there are a few reasons to
prefer the fanout table as the source of truth:
1. The fanout entries are 32-bits on disk, and that defines the
maximum number of entries we can store. But since the size of the
lookup table is only bounded by the filesystem, it can be much
larger. And hence computing it as the commit-graph does means that
we may truncate the result when storing it in a uint32_t.
2. We read the fanout first, then the lookup table. If we're verifying
the chunks as we read them, then we'd want to take the fanout as
truth (we have nothing yet to check it against) and then we can
check that the lookup table matches what we already know.
3. It is pointlessly inconsistent with the midx and pack idx code.
Since the three have to do similar size and bounds checks, it is
easier to reason about all three if they use the same approach.
So this patch moves the assignment of g->num_commits to the fanout
parser, and then we can check the size of the lookup chunk as soon as we
try to load it.
There's already a test covering this situation, which munges the final
fanout entry to 2^32-1. In the current code we complain that it does not
agree with the table size. But now that we treat the munged value as the
source of truth, we'll complain that the lookup table is the wrong size
(again, either is correct). So we'll have to update the message we
expect (and likewise for an earlier test which does similar munging).
There's a similar test for this situation on the midx side, but rather
than making a very-large fanout value, it just truncates the lookup
table. We could do that here, too, but the very-large fanout value
actually shows an interesting corner case. On a 32-bit system,
multiplying to find the expected table size would cause an integer
overflow. Using st_mult() would detect that, but cause us to die()
rather than falling back to the non-graph code path. Checking the size
using division (as we do with existing chunk-size checks) avoids the
overflow entirely, and the test demonstrates this when run on a 32-bit
system.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to read commit-graph files tries to read all of the required
chunks, but doesn't abort if we can't find one (or if it's corrupted).
It's only at the end of reading the file that we then do some sanity
checks for NULL entries. But it's preferable to detect the errors and
bail immediately, for a few reasons:
1. It's less error-prone. It's easy in the reader functions to flag an
error but still end up setting some struct fields (an error I in
fact made while working on this patch series).
2. It's safer. Since verifying some chunks depends on the values of
other chunks, we may be depending on not-yet-verified data. I don't
know offhand of any case where this can cause problems, but it's
one less subtle thing to worry about in the reader code.
3. It prevents the user from seeing nonsense errors. If we're missing
an OIDL chunk, then g->num_commits will be zero. And so we may
complain that the size of our CDAT chunk (which should have a
fixed-size record for each commit) is wrong unless it's also zero.
But that's misleading; the problem is the missing OIDL chunk; the
CDAT one might be fine!
So let's just check the return value from read_chunk(). This is exactly
how the midx chunk-reading code does it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a required commit-graph chunk cannot be loaded, we leave its entry
in the struct NULL, and then later complain that it is missing. But
that's just one reason we might not have loaded it, as we also do some
data quality checks.
Let's switch these messages to say "missing or corrupted", which is
exactly what the midx code says for the same cases. Likewise, we'll use
the same phrasing and capitalization as those for consistency. And while
we're here, we can mark them for translation (just like the midx ones).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The idea of verify_commit_graph_lite() is to have cheap verification
checks both for everyday use of the graph files (to avoid out of bounds
reads, etc) as well as for doing a full check via "commit-graph verify"
(which will also check the hash, etc).
But the expensive verification checks operate on a commit_graph struct,
which we get by using the normal everyday-reader code! So any problem
we'd find by calling it would have been found before we even got to the
verify_one_commit_graph() function.
Removing it simplifies the code a bit, but also frees us up to move the
"lite" verification steps around within that everyday-reader code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph, midx, and pack idx on-disk formats all have oid fanout
tables which are fed to bsearch_hash(). If these tables do not increase
monotonically, then the binary search may not only produce bogus values,
it may cause out of bounds reads.
We fixed this for commit graphs in 4169d89645 (commit-graph: check
consistency of fanout table, 2023-10-09). That commit argued that we did
not need to do the same for midx and pack idx files, because they
already did this check. However, that is wrong. We _do_ check the fanout
table for pack idx files when we load them, but we only do so for midx
files when running "git multi-pack-index verify". So it is possible to
get an out-of-bounds read by running a normal command with a specially
crafted midx file.
Let's fix this using the same solution (and roughly the same test) we
did for the commit-graph in 4169d89645. This replaces the same check
from "multi-pack-index verify", because verify uses the same read
routines, we'd bail on reading the midx much sooner now. So let's make
sure to copy its verbose error message.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check the size of chunks with fixed records by multiplying the width
of each record by the number of commits in the file. Like:
if (chunk_size != g->num_commits * GRAPH_DATA_WIDTH)
If this multiplication overflows, we may not notice a chunk is too small
(which could later lead to out-of-bound reads).
In the current code this is only possible for the CDAT chunk, but the
reasons are quite subtle. We compute g->num_commits by dividing the size
of the OIDL chunk by the hash length (since it consists of a bunch of
hashes). So we know that any size_t multiplication that uses a value
smaller than the hash length cannot overflow. And the CDAT records are
the only ones that are larger (the others are just 4-byte records). So
it's worth fixing all of these, to make it clear that they're not
subject to overflow (without having to reason about seemingly unrelated
code).
The obvious thing to do is add an st_mult(), like:
if (chunk_size != st_mult(g->num_commits, GRAPH_DATA_WIDTH))
And that certainly works, but it has one downside: if we detect an
overflow, we'll immediately die(). But the commit graph is an optional
file; if we run into other problems loading it, we'll generally return
an error and fall back to accessing the full objects. Using st_mult()
means a malformed file will abort the whole process.
So instead, we can do a division like this:
if (chunk_size / GRAPH_DATA_WIDTH != g->num_commits)
where there's no possibility of overflow. We do lose a little bit of
precision; due to integer division truncation we'd allow up to an extra
GRAPH_DATA_WIDTH-1 bytes of data in the chunk. That's OK. Our main goal
here is making sure we don't have too _few_ bytes, which would cause an
out-of-bounds read (we could actually replace our "!=" with "<", but I
think it's worth being a little pedantic, as a large mismatch could be a
sign of other problems).
I didn't add a test here. We'd need to generate a very large graph file
in order to get g->num_commits large enough to cause an overflow. And a
later patch in this series will use this same division technique in a
way that is much easier to trigger in the tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already support Azure Pipelines and GitHub Workflows in the Git
project, but until now we do not have support for GitLab CI. While it is
arguably not in the interest of the Git project to maintain a ton of
different CI platforms, GitLab has recently ramped up its efforts and
tries to contribute to the Git project more regularly.
Part of a problem we hit at GitLab rather frequently is that our own,
custom CI setup we have is so different to the setup that the Git
project has. More esoteric jobs like "linux-TEST-vars" that also set a
couple of environment variables do not exist in GitLab's custom CI
setup, and maintaining them to keep up with what Git does feels like
wasted time. The result is that we regularly send patch series upstream
that fail to compile or pass tests in GitHub Workflows. We would thus
like to integrate the GitLab CI configuration into the Git project to
help us send better patch series upstream and thus reduce overhead for
the maintainer. Results of these pipeline runs will be made available
(at least) in GitLab's mirror of the Git project at [1].
This commit introduces the integration into our regular CI scripts so
that most of the setup continues to be shared across all of the CI
solutions. Note that as the builds on GitLab CI run as unprivileged
user, we need to pull in both sudo and shadow packages to our Alpine
based job to set this up.
[1]: https://gitlab.com/gitlab-org/git
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The linux-musl CI job executes tests on Alpine Linux, which is based on
musl libc instead of glibc. We're missing some test dependencies though,
which causes us to skip a subset of tests.
Install these test dependencies to increase our test coverage on this
platform. There are still some missing test dependecies, but these do
not have a corresponding package in the Alpine repositories:
- p4 and p4d, both parts of the Perforce version control system.
- cvsps, which generates patch sets for CVS.
- Subversion and the SVN::Core Perl library, the latter of which is
not available in the Alpine repositories. While the tool itself is
available, all Subversion-related tests are skipped without the
SVN::Core Perl library anyway.
The Apache2-based tests require a bit more care though. For one, the
module path is different on Alpine Linux, which requires us to add it to
the list of known module paths to detect it. But second, the WebDAV
module on Alpine Linux is broken because it does not bundle the default
database backend [1]. We thus need to skip the WebDAV-based tests on
Alpine Linux for now.
[1]: https://gitlab.alpinelinux.org/alpine/aports/-/issues/13112
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our CI jobs that run on Docker also use mostly the same architecture to
build and test Git via the "ci/run-build-and-tests.sh" script. These
scripts also provide some functionality to massage the Git repository
we're supposedly operating in.
In our Docker-based infrastructure we may not even have a Git repository
available though, which leads to warnings when those functions execute.
Make the helpers exit gracefully in case either there is no Git in our
PATH, or when not running in a Git repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both GitHub Actions and Azure Pipelines set up the environment variables
GIT_TEST_OPTS, GIT_PROVE_OPTS and MAKEFLAGS. And while most values are
actually the same, the setup is completely duplicate. With the upcoming
support for GitLab CI this duplication would only extend even further.
Unify the setup of those environment variables so that only the uncommon
parts are separated. While at it, we also perform some additional small
improvements:
- We now always pass `--state=failed,slow,save` via GIT_PROVE_OPTS.
It doesn't hurt on platforms where we don't persist the state, so
this further reduces boilerplate.
- When running on Windows systems we set `--no-chain-lint` and
`--no-bin-wrappers`. Interestingly though, we did so _after_
already having exported the respective environment variables.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have some logic in place to create a directory with the output from
failed tests, which will then subsequently be uploaded as CI artifacts.
We're about to add support for GitLab CI, which will want to reuse the
logic.
Split the logic into a separate function so that it is reusable.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The output of CI jobs tends to be quite long-winded and hard to digest.
To help with this, many CI systems provide the ability to group output
into collapsible sections, and we're also doing this in some of our
scripts.
One notable omission is the script to install Docker dependencies.
Address it to bring more structure to the output for Docker-based jobs.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the grouping setup more generic by always calling `begin_group ()`
and `end_group ()` regardless of whether we have stubbed those functions
or not. This ensures we can more readily add support for additional CI
platforms.
Furthermore, the `group ()` function is made generic so that it is the
same for both GitHub Actions and for other platforms. There is a
semantic conflict here though: GitHub Actions used to call `set +x` in
`group ()` whereas the non-GitHub case unconditionally uses `set -x`.
The latter would get overriden if we kept the `set +x` in the generic
version of `group ()`. To resolve this conflict, we simply drop the `set
+x` in the generic variant of this function. As `begin_group ()` calls
`set -x` anyway this is not much of a change though, as the only
commands that aren't printed anymore now are the ones between the
beginning of `group ()` and the end of `begin_group ()`.
Last, this commit changes `end_group ()` to also accept a parameter that
indicates _which_ group should end. This will be required by a later
commit that introduces support for GitLab CI.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We define a set of grouping functions that are used to group together
output in our CI, where these groups then end up as collapsible sections
in the respective pipeline platform. The way these functions are defined
is not easily extensible though as we have an up front check for the CI
_not_ being GitHub Actions, where we define the non-stub logic in the
else branch.
Reorder the conditional branches such that we explicitly handle GitHub
Actions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rev-list --unpacked --objects" failed to exclude packed
non-commit objects, which has been corrected.
* tb/rev-list-unpacked-fix:
pack-bitmap: drop --unpacked non-commit objects from results
list-objects: drop --unpacked non-commit objects from results
Leakfix.
* ps/leakfixes:
setup: fix leaking repository format
setup: refactor `upgrade_repository_format()` to have common exit
shallow: fix memory leak when registering shallow roots
test-bloom: stop setting up Git directory twice
Further limit tree depth max to avoid Windows build running out of
the stack space.
* jk/tree-name-and-depth-limit:
max_tree_depth: lower it for MSVC to avoid stack overflows
Allow users to limit or exclude files based on file attributes
during git-add and git-stash.
For example, the chromium project would like to use
$ git add . ':(exclude,attr:submodule)'
as submodules are managed by an external tool, forbidding end users
to record changes with "git add". Allowing "git add" to often
records changes that users do not want in their commits.
This commit does not change any attr magic implementation. It is
only adding attr as an allowed pathspec in git-add and git-stash,
which was previously blocked by GUARD_PATHSPEC and a pathspec mask
in parse_pathspec()).
However, we fix a bug in prefix_magic() where attr values were
unintentionally removed. This was triggerable when parse_pathspec()
is called with PATHSPEC_PREFIX_ORIGIN as a flag, which was the case
for git-stash (Bug originally filed here [*])
Furthermore, while other commands hit this code path it did not
result in unexpected behavior because this bug only impacts the
pathspec->items->original field which is NOT used to filter
paths. However, git-stash does use pathspec->items->original when
building args used to call other git commands. (See add_pathspecs()
usage and implementation in stash.c)
It is possible that when the attr pathspec feature was first added
in b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few
GUARD_PATHSPEC() invocations.
Later, to get a more user-friendly error message when attr was used
with git-add, PATHSPEC_ATTR was added as a mask to git-add's
invocation of parse_pathspec() 84d938b732 (add: do not accept
pathspec magic 'attr', 2018-09-18). However, this user-friendly
error message was never added for git-stash.
[Reference]
* https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/)
Signed-off-by: Joanna Wang <jojwang@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Another step to deprecate test_i18ngrep.
* jc/test-i18ngrep:
tests: teach callers of test_i18ngrep to use test_grep
test framework: further deprecate test_i18ngrep
Add the REFFILES prerequisite to several tests that assume we're using
the files backend. There are various reasons why we cannot easily
convert those tests to be backend-independent, where the most common
one is that we have no way to write corrupt references into the refdb
via our tooling. We may at a later point in time grow the tooling to
make this possible, but for now we just mark these tests as requiring
the files backend.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're asserting that a prefetch of remotes via git-maintenance(1)
doesn't write any references in refs/remotes by validating that the
directory ".git/refs/remotes" is missing. This is quite roundabout: we
don't care about the directory existing, we care about the references
not existing, and the way these are stored is on the behest of the
reference database.
Convert the test to instead check via git-for-each-ref(1) whether any
remote reference exist.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the tests in t7300 verify that git-clean(1) doesn't touch
repositories that are embedded into the main repository. This is done by
asserting a small set of substructures that are assumed to always exist,
like the "refs/", "objects/" or "HEAD". This has the downside that we
need to assume a specific repository structure that may be subject to
change when new backends for the refdb land. At the same time, we don't
thoroughly assert that git-clean(1) really didn't end up cleaning any
files in the repository either.
Convert the tests to instead assert that all files continue to exist
after git-clean(1) by comparing a file listing via find(1) before and
after executing clean. This makes our actual assertions stricter while
having to care less about the repository's actual on-disk format.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t4207 we set up a set of replace objects via git-replace(1). Because
these references should not be impacting subsequent tests we also set up
some cleanup logic that deletes the replacement references via a call to
`rm -rf`. This reaches into the internal implementation details of the
reference backend and will thus break when we grow an alternative refdb
implementation.
Refactor the tests to delete the replacement refs via Git commands so
that we become independent of the actual refdb that's in use. As we
don't have a nice way to delete all replacements or all references in a
certain namespace, we opt for a combination of git-for-each-ref(1) and
git-update-ref(1)'s `--stdin` mode.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of our tests in t1450 create worktrees and then corrupt them.
As it is impossible to delete such worktrees via a normal call to `git
worktree remove`, we instead opt to remove them manually by calling
rm(1) instead.
This is ultimately unnecessary though as we can use the `-f` switch to
remove the worktree. Let's convert the tests to do so such that we don't
have to reach into internal implementation details of worktrees.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of our tests reach directly into the filesystem in order to both
read or modify the reflog, which will break once we have a second
reference backend in our codebase that stores reflogs differently.
Refactor these tests to either use git-reflog(1) or the ref-store test
helper. Note that the refactoring to use git-reflog(1) also requires us
to adapt our expectations in some cases where we previously verified the
exact on-disk log entries. This seems like an acceptable tradeoff though
to ensure that different backends have the same user-visible behaviour
as any user would typically use git-reflog(1) anyway to access the logs.
Any backend-specific verification of the written on-disk format should
be implemented in a separate, backend-specific test.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of our tests access symbolic references via the filesystem
directly. While this works with the current files reference backend, it
this will break once we have a second reference backend in our codebase.
Refactor these tests to instead use git-symbolic-ref(1) or our
`ref-store` test tool. The latter is required in some cases where safety
checks of git-symbolic-ref(1) would otherwise reject writing a symbolic
reference.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of our tests manually create, update or delete references by
writing the respective new values into the filesystem directly. While
this works with the current files reference backend, this will break
once we have a second reference backend implementation in our codebase.
Refactor these tests to instead use git-update-ref(1) or our `ref-store`
test tool. The latter is required in some cases where safety checks of
git-update-ref(1) would otherwise reject a reference update.
While at it, refactor some of the tests to schedule the cleanup command
via `test_when_finished` before modifying the repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We require the caller to pass both the old and new expected object ID to
our `test-tool ref-store update-ref` helper. When trying to update a
symbolic reference though it's impossible to specify the expected object
ID, which means that the test would instead have to force-update the
reference. This is currently impossible though.
Update the helper to optionally skip verification of the old object ID
in case the test passes in an empty old object ID as input.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/show-ref:
t: use git-show-ref(1) to check for ref existence
builtin/show-ref: add new mode to check for reference existence
builtin/show-ref: explicitly spell out different modes in synopsis
builtin/show-ref: ensure mutual exclusiveness of subcommands
builtin/show-ref: refactor options for patterns subcommand
builtin/show-ref: stop using global vars for `show_one()`
builtin/show-ref: stop using global variable to count matches
builtin/show-ref: refactor `--exclude-existing` options
builtin/show-ref: fix dead code when passing patterns
builtin/show-ref: fix leaking string buffer
builtin/show-ref: split up different subcommands
builtin/show-ref: convert pattern to a local variable
"git merge-file" learns a mode to read three contents to be merged
from blob objects.
* bc/merge-file-object-input:
merge-file: add an option to process object IDs
git-merge-file doc: drop "-file" from argument placeholders
"git rev-list --missing" did not work for missing commit objects,
which has been corrected.
* kn/rev-list-missing-fix:
rev-list: add commit object support in `--missing` option
rev-list: move `show_commit()` to the bottom
revision: rename bit to `do_not_die_on_missing_objects`
Teach "git show-ref" a mode to check the existence of a ref.
* ps/show-ref:
t: use git-show-ref(1) to check for ref existence
builtin/show-ref: add new mode to check for reference existence
builtin/show-ref: explicitly spell out different modes in synopsis
builtin/show-ref: ensure mutual exclusiveness of subcommands
builtin/show-ref: refactor options for patterns subcommand
builtin/show-ref: stop using global vars for `show_one()`
builtin/show-ref: stop using global variable to count matches
builtin/show-ref: refactor `--exclude-existing` options
builtin/show-ref: fix dead code when passing patterns
builtin/show-ref: fix leaking string buffer
builtin/show-ref: split up different subcommands
builtin/show-ref: convert pattern to a local variable
"git bugreport" learned to complain when it received a command line
argument that it will not use.
* es/bugreport-no-extra-arg:
bugreport: reject positional arguments
t0091-bugreport: stop using i18ngrep
"git send-email" did not have certain pieces of data computed yet
when it tried to validate the outging messages and its recipient
addresses, which has been sorted out.
* ms/send-email-validate-fix:
send-email: move validation code below process_address_list
"git reflog expire --single-worktree" has been broken for the past
20 months or so, which has been corrected.
* rs/reflog-expire-single-worktree-fix:
reflog: fix expire --single-worktree
The codepath to traverse the commit-graph learned to notice that a
commit is missing (e.g., corrupt repository lost an object), even
though it knows something about the commit (like its parents) from
what is in commit-graph.
Comments?
* ps/do-not-trust-commit-graph-blindly-for-existence:
commit: detect commits that exist in commit-graph but not in the ODB
commit-graph: introduce envvar to disable commit existence checks
"cd sub && git grep -f patterns" tried to read "patterns" file at
the top level of the working tree; it has been corrected to read
"sub/patterns" instead.
* jc/grep-f-relative-to-cwd:
grep: -f <path> is relative to $cwd
"git rev-list --missing" did not work for missing commit objects,
which has been corrected.
* kn/rev-list-missing-fix:
rev-list: add commit object support in `--missing` option
rev-list: move `show_commit()` to the bottom
revision: rename bit to `do_not_die_on_missing_objects`
commit: detect commits that exist in commit-graph but not in the ODB
commit-graph: introduce envvar to disable commit existence checks
The `--missing` object option in rev-list currently works only with
missing blobs/trees. For missing commits the revision walker fails with
a fatal error.
Let's extend the functionality of `--missing` option to also support
commit objects. This is done by adding a `missing_objects` field to
`rev_info`. This field is an `oidset` to which we'll add the missing
commits as we encounter them. The revision walker will now continue the
traversal and call `show_commit()` even for missing commits. In rev-list
we can then check if the commit is a missing commit and call the
existing code for parsing `--missing` objects.
A scenario where this option would be used is to find the boundary
objects between different object directories. Consider a repository with
a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
repository, using the `--missing=print` option while disabling the
alternate object directory allows us to find the boundary objects
between the main and alternate object directory.
Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `show_commit()` function already depends on `finish_commit()`, and
in the upcoming commit, we'll also add a dependency on
`finish_object__ma()`. Since in C symbols must be declared before
they're used, let's move `show_commit()` below both `finish_commit()`
and `finish_object__ma()`, so the code is cleaner as a whole without the
need for declarations.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bit `do_not_die_on_missing_tree` is used in revision.h to ensure the
revision walker does not die when encountering a missing tree. This is
currently exclusively set within `builtin/rev-list.c` to ensure the
`--missing` option works with missing trees.
In the upcoming commits, we will extend `--missing` to also support
missing commits. So let's rename the bit to
`do_not_die_on_missing_objects`, which is object type agnostic and can
be used for both trees/commits.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/do-not-trust-commit-graph-blindly-for-existence:
commit: detect commits that exist in commit-graph but not in the ODB
commit-graph: introduce envvar to disable commit existence checks
Commit graphs can become stale and contain references to commits that do
not exist in the object database anymore. Theoretically, this can lead
to a scenario where we are able to successfully look up any such commit
via the commit graph even though such a lookup would fail if done via
the object database directly.
As the commit graph is mostly intended as a sort of cache to speed up
parsing of commits we do not want to have diverging behaviour in a
repository with and a repository without commit graphs, no matter
whether they are stale or not. As commits are otherwise immutable, the
only thing that we really need to care about is thus the presence or
absence of a commit.
To address potentially stale commit data that may exist in the graph,
our `lookup_commit_in_graph()` function will check for the commit's
existence in both the commit graph, but also in the object database. So
even if we were able to look up the commit's data in the graph, we would
still pretend as if the commit didn't exist if it is missing in the
object database.
We don't have the same safety net in `parse_commit_in_graph_one()`
though. This function is mostly used internally in "commit-graph.c"
itself to validate the commit graph, and this usage is fine. We do
expose its functionality via `parse_commit_in_graph()` though, which
gets called by `repo_parse_commit_internal()`, and that function is in
turn used in many places in our codebase.
For all I can see this function is never used to directly turn an object
ID into a commit object without additional safety checks before or after
this lookup. What it is being used for though is to walk history via the
parent chain of commits. So when commits in the parent chain of a graph
walk are missing it is possible that we wouldn't notice if that missing
commit was part of the commit graph. Thus, a query like `git rev-parse
HEAD~2` can succeed even if the intermittent commit is missing.
It's unclear whether there are additional ways in which such stale
commit graphs can lead to problems. In any case, it feels like this is a
bigger bug waiting to happen when we gain additional direct or indirect
callers of `repo_parse_commit_internal()`. So let's fix the inconsistent
behaviour by checking for object existence via the object database, as
well.
This check of course comes with a performance penalty. The following
benchmarks have been executed in a clone of linux.git with stable tags
added:
Benchmark 1: git -c core.commitGraph=true rev-list --topo-order --all (git = master)
Time (mean ± σ): 2.913 s ± 0.018 s [User: 2.363 s, System: 0.548 s]
Range (min … max): 2.894 s … 2.950 s 10 runs
Benchmark 2: git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
Time (mean ± σ): 3.834 s ± 0.052 s [User: 3.276 s, System: 0.556 s]
Range (min … max): 3.780 s … 3.961 s 10 runs
Benchmark 3: git -c core.commitGraph=false rev-list --topo-order --all (git = master)
Time (mean ± σ): 13.841 s ± 0.084 s [User: 13.152 s, System: 0.687 s]
Range (min … max): 13.714 s … 13.995 s 10 runs
Benchmark 4: git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
Time (mean ± σ): 13.762 s ± 0.116 s [User: 13.094 s, System: 0.667 s]
Range (min … max): 13.645 s … 14.038 s 10 runs
Summary
git -c core.commitGraph=true rev-list --topo-order --all (git = master) ran
1.32 ± 0.02 times faster than git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
4.72 ± 0.05 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
4.75 ± 0.04 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = master)
We look at a ~30% regression in general, but in general we're still a
whole lot faster than without the commit graph. To counteract this, the
new check can be turned off with the `GIT_COMMIT_GRAPH_PARANOIA` envvar.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our `lookup_commit_in_graph()` helper tries to look up commits from the
commit graph and, if it doesn't exist there, falls back to parsing it
from the object database instead. This is intended to speed up the
lookup of any such commit that exists in the database. There is an edge
case though where the commit exists in the graph, but not in the object
database. To avoid returning such stale commits the helper function thus
double checks that any such commit parsed from the graph also exists in
the object database. This makes the function safe to use even when
commit graphs aren't updated regularly.
We're about to introduce the same pattern into other parts of our code
base though, namely `repo_parse_commit_internal()`. Here the extra
sanity check is a bit of a tougher sell: `lookup_commit_in_graph()` was
a newly introduced helper, and as such there was no performance hit by
adding this sanity check. If we added `repo_parse_commit_internal()`
with that sanity check right from the beginning as well, this would
probably never have been an issue to begin with. But by retrofitting it
with this sanity check now we do add a performance regression to
preexisting code, and thus there is a desire to avoid this or at least
give an escape hatch.
In practice, there is no inherent reason why either of those functions
should have the sanity check whereas the other one does not: either both
of them are able to detect this issue or none of them should be. This
also means that the default of whether we do the check should likely be
the same for both. To err on the side of caution, we thus rather want to
make `repo_parse_commit_internal()` stricter than to loosen the checks
that we already have in `lookup_commit_in_graph()`.
The escape hatch is added in the form of a new GIT_COMMIT_GRAPH_PARANOIA
environment variable that mirrors GIT_REF_PARANOIA. If enabled, which is
the default, we will double check that commits looked up in the commit
graph via `lookup_commit_in_graph()` also exist in the object database.
This same check will also be added in `repo_parse_commit_internal()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codepath to handle recipient addresses `git send-email
--compose` learns from the user was completely broken, which has
been corrected.
* jk/send-email-fix-addresses-from-composed-messages:
send-email: handle to/cc/bcc from --compose message
Revert "send-email: extract email-parsing code into a subroutine"
doc/send-email: mention handling of "reply-to" with --compose
Code clean-up.
* ob/rebase-cleanup:
rebase: move parse_opt_keep_empty() down
rebase: handle --strategy via imply_merge() as well
rebase: simplify code related to imply_merge()
The pathspec code carelessly dereferenced NULL while emitting an
error message, which has been corrected.
* kh/pathspec-error-wo-repository-fix:
grep: die gracefully when outside repository
"git p4" tried to store symlinks to LFS when told, but has been
fixed not to do so, because it does not make sense.
* mm/p4-symlink-with-lfs:
git-p4 shouldn't attempt to store symlinks in LFS
The attribute subsystem learned to honor `attr.tree` configuration
that specifies which tree to read the .gitattributes files from.
* jc/attr-tree-config:
attr: add attr.tree for setting the treeish to read attributes from
attr: read attributes from HEAD when bare repo
Many typos, ungrammatical sentences and wrong phrasing have been
fixed.
* sn/typo-grammo-phraso-fixes:
t/README: fix multi-prerequisite example
doc/gitk: s/sticked/stuck/
git-jump: admit to passing merge mode args to ls-files
doc/diff-options: improve wording of the log.diffMerges mention
doc: fix some typos, grammar and wording issues
The index file has room only for lower 32-bit of the file size in
the cached stat information, which means cached stat information
will have 0 in its sd_size member for a file whose size is multiple
of 4GiB. This is mistaken for a racily clean path. Avoid it by
storing a bogus sd_size value instead for such files.
* bc/racy-4gb-files:
Prevent git from rehashing 4GiB files
t: add a test helper to truncate files
Feeding "git stash store" with a random commit that was not created
by "git stash create" now errors out.
* jc/fail-stash-to-store-non-stash:
stash: be careful what we store
The codepaths that read "chunk" formatted files have been corrected
to pay attention to the chunk size and notice broken files.
* jk/chunk-bounds:
t5319: make corrupted large-offset test more robust
Unlike "git log --pretty=%D", "git log --pretty="%(decorate)" did
not auto-initialize the decoration subsystem, which has been
corrected.
* ak/pretty-decorate-more-fix:
pretty: fix ref filtering for %(decorate) formats
The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.
* vd/loose-ref-iteration-optimization:
files-backend.c: avoid stat in 'loose_fill_ref_dir'
dir.[ch]: add 'follow_symlink' arg to 'get_dtype'
dir.[ch]: expose 'get_dtype'
ref-cache.c: fix prefix matching in ref iteration
"git merge-tree" learned to take strategy backend specific options
via the "-X" option, like "git merge" does.
* ty/merge-tree-strategy-options:
merge: introduce {copy|clear}_merge_options()
The codepaths that read "chunk" formatted files have been corrected
to pay attention to the chunk size and notice broken files.
* jk/chunk-bounds:
chunk-format: drop pair_chunk_unsafe()
commit-graph: detect out-of-order BIDX offsets
commit-graph: check bounds when accessing BIDX chunk
commit-graph: check bounds when accessing BDAT chunk
commit-graph: bounds-check generation overflow chunk
commit-graph: check size of generations chunk
commit-graph: bounds-check base graphs chunk
commit-graph: detect out-of-bounds extra-edges pointers
commit-graph: check size of commit data chunk
midx: check size of revindex chunk
midx: bounds-check large offset chunk
midx: check size of object offset chunk
midx: enforce chunk alignment on reading
midx: check size of pack names chunk
commit-graph: check consistency of fanout table
midx: check size of oid lookup chunk
commit-graph: check size of oid fanout chunk
midx: stop ignoring malformed oid fanout chunk
t: add library for munging chunk-format files
chunk-format: note that pair_chunk() is unsafe
Fix "git merge-tree" to stop segfaulting when the --attr-source
option is used.
* jc/merge-ort-attr-index-fix:
merge-ort: initialize repo in index state
"git repack" learned "--max-cruft-size" to prevent cruft packs from
growing without bounds.
* tb/repack-max-cruft-size:
repack: free existing_cruft array after use
"git diff --merge-base X other args..." insisted that X must be a
commit and errored out when given an annotated tag that peels to a
commit, but we only need it to be a committish. This has been
corrected.
* ar/diff-index-merge-base-fix:
diff: fix --merge-base with annotated tags
In .gitmodules files, submodules are keyed by their names, and the
path to the submodule whose name is $name is specified by the
submodule.$name.path variable. There were a few codepaths that
mixed the name and path up when consulting the submodule database,
which have been corrected. It took long for these bugs to be found
as the name of a submodule initially is the same as its path, and
the problem does not surface until it is moved to a different path,
which apparently happens very rarely.
* js/submodule-fix-misuse-of-path-and-name:
t7420: test that we correctly handle renamed submodules
t7419: test that we correctly handle renamed submodules
t7419, t7420: use test_cmp_config instead of grepping .gitmodules
t7419: actually test the branch switching
submodule--helper: return error from set-url when modifying failed
submodule--helper: use submodule_from_path in set-{url,branch}
"git repack" learned "--max-cruft-size" to prevent cruft packs from
growing without bounds.
* tb/repack-max-cruft-size:
builtin/repack.c: avoid making cruft packs preferred
builtin/repack.c: implement support for `--max-cruft-size`
builtin/repack.c: parse `--max-pack-size` with OPT_MAGNITUDE
t7700: split cruft-related tests to t7704
Leakfix.
* jk/commit-graph-leak-fixes:
commit-graph: clear oidset after finishing write
commit-graph: free write-context base_graph_name during cleanup
commit-graph: free write-context entries before overwriting
commit-graph: free graph struct that was not added to chain
commit-graph: delay base_graph assignment in add_graph_to_chain()
commit-graph: free all elements of graph chain
commit-graph: move slab-clearing to close_commit_graph()
merge: free result of repo_get_merge_bases()
commit-reach: free temporary list in get_octopus_merge_bases()
t6700: mark test as leak-free
Test coverage for trailers has been improved.
* la/trailer-test-and-doc-updates:
trailer doc: <token> is a <key> or <keyAlias>, not both
trailer doc: separator within key suppresses default separator
trailer doc: emphasize the effect of configuration variables
trailer --unfold help: prefer "reformat" over "join"
trailer --parse docs: add explanation for its usefulness
trailer --only-input: prefer "configuration variables" over "rules"
trailer --parse help: expose aliased options
trailer --no-divider help: describe usual "---" meaning
trailer: trailer location is a place, not an action
trailer doc: narrow down scope of --where and related flags
trailer: add tests to check defaulting behavior with --no-* flags
trailer test description: this tests --where=after, not --where=before
trailer tests: make test cases self-contained
GitHub CI workflow has learned to trigger Coverity check.
* js/ci-coverity:
coverity: detect and report when the token or project is incorrect
coverity: allow running on macOS
coverity: support building on Windows
coverity: allow overriding the Coverity project
coverity: cache the Coverity Build Tool
ci: add a GitHub workflow to submit Coverity scans
Shuffle some bits across headers and sources to prepare for
libification effort.
* cw/prelim-cleanup:
parse: separate out parsing functions from config.h
config: correct bad boolean env value error message
wrapper: reduce scope of remove_or_warn()
hex-ll: separate out non-hash-algo functions
The "streaming" interface used for bulk-checkin codepath has been
narrowed to take only blob objects for now, with no real loss of
functionality.
* eb/limit-bulk-checkin-to-blobs:
bulk-checkin: only support blobs in index_bulk_checkin
"git merge-tree" learned to take strategy backend specific options
via the "-X" option, like "git merge" does.
* ty/merge-tree-strategy-options:
merge-tree: add -X strategy option
The display width table for unicode characters has been updated for
Unicode 15.1
* bb/unicode-width-table-15:
unicode: update the width tables to Unicode 15.1
"git for-each-ref" and friends learn to apply mailmap to authorname
and other fields.
* ks/ref-filter-mailmap:
ref-filter: add mailmap support
t/t6300: introduce test_bad_atom
t/t6300: cleanup test_atom
"git rev-list --stdin" learned to take non-revisions (like "--not")
recently from the standard input, but the way such a "--not" was
handled was quite confusing, which has been rethought. This is
potentially a change that breaks backward compatibility.
* ps/revision-cmdline-stdin-not:
revision: make pseudo-opt flags read via stdin behave consistently
"checkout --merge -- path" and "update-index --unresolve path" did
not resurrect conflicted state that was resolved to remove path,
but now they do.
* jc/unresolve-removal:
checkout: allow "checkout -m path" to unmerge removed paths
checkout/restore: add basic tests for --merge
checkout/restore: refuse unmerging paths unless checking out of the index
update-index: remove stale fallback code for "--unresolve"
update-index: use unmerge_index_entry() to support removal
resolve-undo: allow resurrecting conflicted state that resolved to deletion
update-index: do not read HEAD and MERGE_HEAD unconditionally
UBSAN options were not propagated through the test framework to git
run via the httpd, unlike ASAN options, which has been corrected.
* jk/test-pass-ubsan-options-to-http-test:
test-lib: set UBSAN_OPTIONS to match ASan
The command line completion script (in contrib/) can be told to
complete aliases by including ": git <cmd> ;" in the alias to tell
it that the alias should be completed similar to how "git <cmd>" is
completed. The parsing code for the alias as been loosened to
allow ';' without an extra space before it.
* jc/alias-completion:
completion: loosen and document the requirement around completing alias
"git range-diff --notes=foo" compared "log --notes=foo --notes" of
the two ranges, instead of using just the specified notes tree.
* kh/range-diff-notes:
range-diff: treat notes like `log`
"git diff" learned diff.statNameWidth configuration variable, to
give the default width for the name part in the "--stat" output.
* ds/stat-name-width-configuration:
diff --stat: add config option to limit filename width
Unused parameters in fsmonitor related code paths have been marked
as such.
* jk/fsmonitor-unused-parameter:
run-command: mark unused parameters in start_bg_wait callbacks
fsmonitor: mark unused hashmap callback parameters
fsmonitor/darwin: mark unused parameters in system callback
fsmonitor: mark unused parameters in stub functions
fsmonitor/win32: mark unused parameter in fsm_os__incompatible()
fsmonitor: mark some maybe-unused parameters
fsmonitor/win32: drop unused parameters
fsmonitor: prefer repo_git_path() to git_pathdup()
Fix recent regression in Git-GUI that fails to run hook scripts at
all.
* ml/git-gui-exec-path-fix:
git-gui - use git-hook, honor core.hooksPath
git-gui - re-enable use of hook scripts
An error message given by "git send-email" when given a malformed
address did not give correct information, which has been corrected.
* tb/send-email-extract-valid-address-error-message-fix:
git-send-email.perl: avoid printing undef when validating addresses
HTTP Header redaction code has been adjusted for a newer version of
cURL library that shows its traces differently from earlier
versions.
* jk/redact-h2h3-headers-fix:
http: update curl http/2 info matching for curl 8.3.0
http: factor out matching of curl http/2 trace lines
Code clean-up.
* jk/ort-unused-parameter-cleanups:
merge-ort: lowercase a few error messages
merge-ort: drop unused "opt" parameter from merge_check_renames_reusable()
merge-ort: drop unused parameters from detect_and_process_renames()
merge-ort: stop passing "opt" to read_oid_strbuf()
merge-ort: drop custom err() function
The code to keep track of existing packs in the repository while
repacking has been refactored.
* tb/repack-existing-packs-cleanup:
builtin/repack.c: extract common cruft pack loop
builtin/repack.c: avoid directly inspecting "util"
builtin/repack.c: store existing cruft packs separately
builtin/repack.c: extract `has_existing_non_kept_packs()`
builtin/repack.c: extract redundant pack cleanup for existing packs
builtin/repack.c: extract redundant pack cleanup for --geometric
builtin/repack.c: extract marking packs for deletion
builtin/repack.c: extract structure to store existing packs
* jc/update-index-show-index-version:
test-tool: retire "index-version"
update-index: add --show-index-version
update-index doc: v4 is OK with JGit and libgit2
Clarify how "alias.foo = : git cmd ; aliased-command-string" should
be spelled with necessary whitespaces around punctuation marks to
work.
* pb/completion-aliases-doc:
completion: improve doc for complex aliases
The command-line complation support (in contrib/) learned to
complete "git commit --trailer=" for possible trailer keys.
* pb/complete-commit-trailers:
completion: commit: complete trailers tokens more robustly
"git diff --cached" codepath did not fill the necessary stat
information for a file when fsmonitor knows it is clean and ended
up behaving as if it is not clean, which has been corrected.
* js/diff-cached-fsmonitor-fix:
diff-lib: fix check_removed when fsmonitor is on
Code clean-up.
* la/trailer-cleanups:
trailer: use offsets for trailer_start/trailer_end
trailer: rename *_DEFAULT enums to *_UNSPECIFIED
trailer: teach find_patch_start about --no-divider
trailer: split process_command_line_args into separate functions
trailer: split process_input_file into separate pieces
trailer: separate public from internal portion of trailer_iterator
Update "git maintainance" timers' implementation based on systemd
timers to work with WSL.
* js/systemd-timers-wsl-fix:
maintenance(systemd): support the Windows Subsystem for Linux
"git diff --no-index -R <(one) <(two)" did not work correctly,
which has been corrected.
* pw/diff-no-index-from-named-pipes:
diff --no-index: fix -R with stdin
Previously these fields in the trailer_info struct were of type "const
char *" and pointed to positions in the input string directly (to the
start and end positions of the trailer block).
Use offsets to make the intended usage less ambiguous. We only need to
reference the input string in format_trailer_info(), so update that
function to take a pointer to the input.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do not use *_DEFAULT as a suffix to the enums, because the word
"default" is overloaded. The following are two examples of the ambiguity
of the word "default":
(1) "Default" can mean using the "default" values that are hardcoded
in trailer.c as
default_conf_info.where = WHERE_END;
default_conf_info.if_exists = EXISTS_ADD_IF_DIFFERENT_NEIGHBOR;
default_conf_info.if_missing = MISSING_ADD;
in ensure_configured(). These values are referred to as "the
default" in the docs for interpret-trailers. These defaults are used
if no "trailer.*" configurations are defined.
(2) "Default" can also mean the "trailer.*" configurations themselves,
because these configurations are used by "default" (ahead of the
hardcoded defaults in (1)) if no command line arguments are
provided. This concept of defaulting back to the configurations was
introduced in 0ea5292e6b (interpret-trailers: add options for
actions, 2017-08-01).
In addition, the corresponding *_DEFAULT values are chosen when the user
provides the "--no-where", "--no-if-exists", or "--no-if-missing" flags
on the command line. These "--no-*" flags are used to clear previously
provided flags of the form "--where", "--if-exists", and "--if-missing".
Using these "--no-*" flags undoes the specifying of these flags (if
any), so using the word "UNSPECIFIED" is more natural here.
So instead of using "*_DEFAULT", use "*_UNSPECIFIED" because this
signals to the reader that the *_UNSPECIFIED value by itself carries no
meaning (it's a zero value and by itself does not "default" to anything,
necessitating the need to have some other way of getting to a useful
value).
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, find_patch_start only finds the start of the patch part of
the input (by looking at the "---" divider) for cases where the
"--no-divider" flag has not been provided. If the user provides this
flag, we do not rely on find_patch_start at all and just call strlen()
directly on the input.
Instead, make find_patch_start aware of "--no-divider" and make it
handle that case as well. This means we no longer need to call strlen at
all and can just rely on the existing code in find_patch_start. By
forcing callers to consider this important option, we avoid the kind of
mistake described in be3d654343 (commit: pass --no-divider to
interpret-trailers, 2023-06-17).
This patch will make unit testing a bit more pleasant in this area in
the future when we adopt a unit testing framework, because we would not
have to test multiple functions to check how finding the start of a
patch part works (we would only need to test find_patch_start).
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The completion script (in contrib/) has been taught to treat the
"-t" option to "git checkout" and "git switch" just like the
"--track" option, to complete remote-tracking branches.
* js/complete-checkout-t:
completion(switch/checkout): treat --track and -t the same
The command-line complation support (in contrib/) learned to
complete "git commit --trailer=" for possible trailer keys.
* pb/complete-commit-trailers:
completion: commit: complete configured trailer tokens
"git grep -e A --no-or -e B" is accepted, even though the negation
of "or" did not mean anything, which has been tightened.
* rs/grep-no-no-or:
grep: reject --no-or
References from description of the `--patch` option in various
manual pages have been simplified and improved.
* so/diff-doc-for-patch-update:
doc/diff-options: fix link to generating patch section
Various fixes to the behaviour of "rebase -i" when the command got
interrupted by conflicting changes.
* pw/rebase-i-after-failure:
rebase -i: fix adding failed command to the todo list
rebase --continue: refuse to commit after failed command
rebase: fix rewritten list for failed pick
sequencer: factor out part of pick_commits()
sequencer: use rebase_path_message()
rebase -i: remove patch file after conflict resolution
rebase -i: move unlink() calls
The default log message created by "git revert", when reverting a
commit that records a revert, has been tweaked.
* ob/revert-of-revert-is-reapply:
git-revert.txt: add discussion
sequencer: beautify subject of reverts of reverts
"git log --format" has been taught the %(decorate) placeholder.
* ak/pretty-decorate-more:
decorate: use commit color for HEAD arrow
pretty: add pointer and tag options to %(decorate)
pretty: add %(decorate[:<options>]) format
decorate: color each token separately
decorate: avoid some unnecessary color overhead
decorate: refactor format_decorations()
pretty-formats: enclose options in angle brackets
pretty-formats: define "literal formatting code"
We now limit depth of the tree objects and maximum length of
pathnames recorded in tree objects.
* jk/tree-name-and-depth-limit:
lower core.maxTreeDepth default to 2048
tree-diff: respect max_allowed_tree_depth
list-objects: respect max_allowed_tree_depth
read_tree(): respect max_allowed_tree_depth
traverse_trees(): respect max_allowed_tree_depth
add core.maxTreeDepth config
fsck: detect very large tree pathnames
tree-walk: rename "error" variable
tree-walk: drop MAX_TRAVERSE_TREES macro
tree-walk: reduce stack size for recursive functions
"git for-each-ref --sort='contents:size'" sorts the refs according
to size numerically, giving a ref that points at a blob twelve-byte
(12) long before showing a blob hundred-byte (100) long.
* ks/ref-filter-sort-numerically:
ref-filter: sort numerically when ":size" is used
Update an error message (which would probably never been seen).
* ob/sequencer-reword-error-message:
sequencer: fix error message on failure to copy SQUASH_MSG
Unused parameters to functions are marked as such, and/or removed,
in order to bring us closer to -Wunused-parameter clean.
* jk/unused-post-2.42-part2:
parse-options: mark unused parameters in noop callback
interpret-trailers: mark unused "unset" parameters in option callbacks
parse-options: add more BUG_ON() annotations
merge: do not pass unused opt->value parameter
parse-options: mark unused "opt" parameter in callbacks
parse-options: prefer opt->value to globals in callbacks
checkout-index: delay automatic setting of to_tempfile
format-patch: use OPT_STRING_LIST for to/cc options
merge: simplify parsing of "-n" option
merge: make xopts a strvec
"git format-patch --rfc --subject-prefix=<foo>" used to ignore the
"--subject-prefix" option and used "[RFC PATCH]"; now we will add
"RFC" prefix to whatever subject prefix is specified.
This is a backward compatible change that may deserve a note.
* dd/format-patch-rfc-updates:
format-patch: --rfc honors what --subject-prefix sets
Mark unused parameters to functions as such, and/or remove unused
parameters, to bring us closer to -Wunused-parameter clean.
* jk/unused-post-2.42: (22 commits)
update-ref: mark unused parameter in parser callbacks
gc: mark unused descriptors in scheduler callbacks
bundle-uri: mark unused parameters in callbacks
fetch: mark unused parameter in ref_transaction callback
credential: mark unused parameter in urlmatch callback
grep: mark unused parmaeters in pcre fallbacks
imap-send: mark unused parameters with NO_OPENSSL
worktree: mark unused parameters in noop repair callback
negotiator/noop: mark unused callback parameters
add-interactive: mark unused callback parameters
grep: mark unused parameter in output function
test-trace2: mark unused argv/argc parameters
trace2: mark unused config callback parameter
trace2: mark unused us_elapsed_absolute parameters
stash: mark unused parameter in diff callback
ls-tree: mark unused parameter in callback
commit-graph: mark unused data parameters in generation callbacks
worktree: mark unused parameters in each_ref_fn callback
pack-bitmap: mark unused parameters in show_object callback
ref-filter: mark unused parameters in parser callbacks
...
Use of --max-pack-size to allow multiple packfiles to be created is
now supported even when we are sending unreachable objects to cruft
packs.
* tb/multi-cruft-pack:
Documentation/gitformat-pack.txt: drop mixed version section
Documentation/gitformat-pack.txt: remove multi-cruft packs alternative
builtin/pack-objects.c: support `--max-pack-size` with `--cruft`
builtin/pack-objects.c: remove unnecessary strbuf_reset()
Tests with LSan from time to time seem to emit harmless message
that makes our tests unnecessarily flakey; we work it around by
filtering the uninteresting output.
* jk/test-lsan-denoise-output:
test-lib: ignore uninteresting LSan output
Skip flakey "git p4" tests in sanitizer CI job, as well as "git
svn" tests.
* js/ci-san-skip-p4-and-svn-tests:
ci(linux-asan-ubsan): let's save some time
Mark tests that are known to pass with LSan as such.
* tb/mark-more-tests-as-leak-free:
leak tests: mark t5583-push-branches.sh as leak-free
leak tests: mark t3321-notes-stripspace.sh as leak-free
leak tests: mark a handful of tests as leak-free
It may be tempting to leave the help text NULL for a command line
option that is either hidden or too obvious, but "git subcmd -h"
and "git subcmd --help-all" would have segfaulted if done so. Now
the help text is optional.
* rs/parse-options-help-text-is-optional:
parse-options: allow omitting option help text
Git GUI updates.
* py/git-gui-updates:
git-gui - use mkshortcut on Cygwin
git-gui - use cygstart to browse on Cygwin
git-gui - remove obsolete Cygwin specific code
git gui Makefile - remove Cygwin modifications
Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
Work around Tcl's default `PATH` lookup
Move the `_which` function (almost) to the top
Move is_<platform> functions to the beginning
is_Cygwin: avoid `exec`ing anything
windows: ignore empty `PATH` elements
git-gui: Fix a typo in README
Tweak GitHub Actions CI so that pushing the same commit to multiple
branch tips at the same time will not waste building and testing
the same thing twice.
* jc/ci-skip-same-commit:
ci: avoid building from the same commit in parallel
"git format-patch" learns a way to feed cover letter description,
that (1) can be used on detached HEAD where there is no branch
description available, and (2) also can override the branch
description if there is one.
* ob/format-patch-description-file:
format-patch: add --description-file option
"git diff --no-such-option" and other corner cases around the exit
status of the "diff" command has been corrected.
* jk/diff-result-code-cleanup:
diff: drop useless "status" parameter from diff_result_code()
diff: drop useless return values in git-diff helpers
diff: drop useless return from run_diff_{files,index} functions
diff: die when failing to read index in git-diff builtin
diff: show usage for unknown builtin_diff_files() options
diff-files: avoid negative exit value
diff: spell DIFF_INDEX_CACHED out when calling run_diff_index()
Update the use of API for consistency between two calls to
require_clean_work_tree() from the sequencer code.
* ob/sequencer-empty-hint-fix:
sequencer: rectify empty hint in call of require_clean_work_tree()
transfer.unpackLimit ought to be used as a fallback, but overrode
fetch.unpackLimit and receive.unpackLimit instead.
* ts/unpacklimit-config-fix:
transfer.unpackLimit: fetch/receive.unpackLimit takes precedence
"git diff -w --exit-code" with various options did not work
correctly, which is being addressed.
* jc/diff-exit-code-with-w-fixes:
diff: the -w option breaks --exit-code for --raw and other output modes
t4040: remove test that succeeded for a wrong reason
diff: teach "--stat -w --exit-code" to notice differences
diff: mode-only change should be noticed by "--patch -w --exit-code"
diff: move the fallback "--exit-code" code down
Update commit-graph verification code that detects mixture of zero
and non-zero generation numbers.
* tb/commit-graph-verify-fix:
commit-graph: avoid repeated mixed generation number warnings
t/t5318-commit-graph.sh: test generation zero transitions during fsck
commit-graph: verify swapped zero/non-zero generation cases
commit-graph: introduce `commit_graph_generation_from_graph()`
Teach "git check-attr" work better with sparse-index.
* sl/sparse-check-attr:
check-attr: integrate with sparse-index
attr.c: read attributes in a sparse directory
t1092: add tests for 'git check-attr'
Allow "git maintenance" schedule to be randomly distributed.
* ds/maintenance-schedule-fuzz:
maintenance: update schedule before config
maintenance: fix systemd schedule overlaps
maintenance: use random minute in systemd scheduler
maintenance: swap method locations
maintenance: use random minute in cron scheduler
maintenance: use random minute in Windows scheduler
maintenance: use random minute in launchctl scheduler
maintenance: add get_random_minute()
"git cmd -h" learned to signal which options can be negated by
listing such options like "--[no-]opt".
* rs/parse-options-negation-help:
parse-options: simplify usage_padding()
parse-options: no --[no-]no-...
parse-options: factor out usage_indent() and usage_padding()
parse-options: show negatability of options in short help
t1502: test option negation
t1502: move optionspec help output to a file
t1502, docs: disallow --no-help
subtree: disallow --no-{help,quiet,debug,branch,message}
Typo/grammofix to documentation added during this cycle.
* tl/notes-separator:
notes doc: tidy up `--no-stripspace` paragraph
notes doc: split up run-on sentences
Update commit-graph verification code that detects mixture of zero
and non-zero generation numbers.
* tb/commit-graph-verify-fix:
commit-graph: invert negated conditional
t/t5318-commit-graph.sh: test generation zero transitions during fsck
commit-graph: verify swapped zero/non-zero generation cases
commit-graph: introduce `commit_graph_generation_from_graph()`
Test updates.
* ob/test-lib-rebase-fake-editor-updates:
t/lib-rebase: improve documentation of set_fake_editor()
t/lib-rebase: set_fake_editor(): handle FAKE_LINES more consistently
t/lib-rebase: set_fake_editor(): fix recognition of reset's short command
Chomp overly long label names used in the sequencer machinery.
* mp/rebase-label-length-limit:
rebase: allow overriding the maximal length of the generated labels
sequencer: truncate labels to accommodate loose refs
Now that we're using `commit_graph_generation_from_graph()` (which can
return a zero value) and we have a pure if/else instead of an else-if,
let's swap the arms of the if-statement.
This avoids an "if (!x) { foo(); } else { bar(); }" and replaces it with
the more readable "if (x) { bar(); } else { foo(); }".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The second test called "detect incorrect generation number" asserts that
we correctly warn during an fsck when we see a non-zero generation
number after seeing a zero beforehand.
The other transition (going from non-zero to zero) was previously
untested. Test both directions, and rename the existing test to make
clear which direction it is exercising.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In verify_one_commit_graph(), we have code that complains when a commit
is found with a generation number of zero, and then later with a
non-zero number. It works like this:
1. When we see an entry with generation zero, we set the
generation_zero flag to GENERATION_ZERO_EXISTS.
2. When we later see an entry with a non-zero generation, we complain
if the flag is GENERATION_ZERO_EXISTS.
There's a matching GENERATION_NUMBER_EXISTS value, which in theory would
be used to find the case that we see the entries in the opposite order:
1. When we see an entry with a non-zero generation, we set the
generation_zero flag to GENERATION_NUMBER_EXISTS.
2. When we later see an entry with a zero generation, we complain if
the flag is GENERATION_NUMBER_EXISTS.
But that doesn't work; step 2 is implemented, but there is no step 1. We
never use NUMBER_EXISTS at all, and Coverity rightly complains that step
2 is dead code.
We can fix that by implementing that step 1.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2ee11f7261 (commit-graph: return generation from memory, 2023-03-20),
the `commit_graph_generation()` function stopped returning zeros when
asked to locate the generation number of a given commit.
This was done at the time to prepare for a later change which set
generation values in memory, meaning that we could no longer rely on
`graph_pos` alone to tell us whether or not to trust the generation
number returned by this function.
In 2ee11f7261, it was noted that this change only impacted very old
commit-graphs, which were written with all commits having generation
number 0. Indeed, zero is not a valid generation number, so we should
never expect to see that value outside of the aforementioned case.
The test fallout in 2ee11f7261 indicated that we were no longer able to
fsck a specific old case of commit-graph corruption, where we see a
non-zero generation number after having seen a generation number of 0
earlier.
Introduce a variant of `commit_graph_generation()` which behaves like
that function did prior to 2ee11f7261, known as
`commit_graph_generation_from_graph()`. Then use this function in the
context of `verify_one_commit_graph()`, where we only want to trust the
values from the graph.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Windows updates.
* ds/maintenance-on-windows-fix:
git maintenance: avoid console window in scheduled tasks on Windows
win32: add a helper to run `git.exe` without a foreground window
A message written in olden time prevented a branch from getting
checked out saying it is already checked out elsewhere, but these
days, we treat a branch that is being bisected or rebased just like
a branch that is checked out and protect it. Rephrase the message
to say that the branch is in use.
* rj/branch-in-use-error-message:
branch: error message checking out a branch in use
branch: error message deleting a branch in use
Adjust to newer Term::ReadLine to prevent it from breaking
the interactive prompt code in send-email.
* jk/send-email-with-new-readline:
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Developer support to detect meaningless combination of options.
* rs/parse-opt-forbid-set-int-0-without-noneg:
parse-options: disallow negating OPTION_SET_INT 0
The Bloom filter used for path limited history traversal was broken
on systems whose "char" is unsigned; update the implementation and
bump the format version to 2.
* jt/path-filter-fix:
commit-graph: fix small leak with invalid changedPathsVersion
When writing a commit graph, we sanity-check the value of
commitgraph.changedPathsVersion and return early if it's invalid. But we
only do so after allocating the write_commit_graph_context, meaing we
leak it. We could "goto cleanup" to fix this, but instead let's push
this check to the top of the function with the other early checks which
return before doing any work.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update two credential helpers to correctly match which credential
to erase; they dropped not the ones with stale password.
* mh/credential-erase-improvements-more:
credential/wincred: erase matching creds only
credential/libsecret: erase matching creds only
The way authentication related data other than passwords (e.g.
oath token and password expiration data) are stored in libsecret
keyrings has been rethought.
* mh/credential-libsecret-attrs:
credential/libsecret: store new attributes
"git rebase -i" with a series of squash/fixup, when one of the
steps stopped in conflicts and ended up getting skipped, did not
handle the accumulated commit log messages, which has been
corrected.
* pw/rebase-skip-commit-message-fix:
rebase --skip: fix commit message clean up when skipping squash
"git bisect visualize" stopped running "gitk" on Git for Windows
when the command was reimplemented in C around Git 2.34 timeframe.
This has been corrected.
* ma/locate-in-path-for-windows:
docs: update when `git bisect visualize` uses `gitk`
compat/mingw: implement a native locate_in_PATH()
run-command: conditionally define locate_in_PATH()
Exclude "." from the set of characters to be removed from the
beginning and the end of the human-readable name.
* bc/ident-dot-is-no-longer-crud-letter:
ident: don't consider '.' a crud
The Bloom filter used for path limited history traversal was broken
on systems whose "char" is unsigned; update the implementation and
bump the format version to 2.
* jt/path-filter-fix:
commit-graph: new filter ver. that fixes murmur3
repo-settings: introduce commitgraph.changedPathsVersion
t4216: test changed path filters with high bit paths
t/helper/test-read-graph: implement `bloom-filters` mode
bloom.h: make `load_bloom_filter_from_graph()` public
t/helper/test-read-graph.c: extract `dump_graph_info()`
gitformat-commit-graph: describe version 2 of BDAT
Adjust to OpenSSL 3+, which deprecates its SHA-1 functions based on
its traditional API, by using its EVP API instead.
* ew/hash-with-openssl-evp:
avoid SHA-1 functions deprecated in OpenSSL 3+
sha256: avoid functions deprecated in OpenSSL 3+
The murmur3 implementation in bloom.c has a bug when converting series
of 4 bytes into network-order integers when char is signed (which is
controllable by a compiler option, and the default signedness of char is
platform-specific). When a string contains characters with the high bit
set, this bug causes results that, although internally consistent within
Git, does not accord with other implementations of murmur3 (thus,
the changed path filters wouldn't be readable by other off-the-shelf
implementatios of murmur3) and even with Git binaries that were compiled
with different signedness of char. This bug affects both how Git writes
changed path filters to disk and how Git interprets changed path filters
on disk.
Therefore, introduce a new version (2) of changed path filters that
corrects this problem. The existing version (1) is still supported and
is still the default, but users should migrate away from it as soon
as possible.
Because this bug only manifests with characters that have the high bit
set, it may be possible that some (or all) commits in a given repo would
have the same changed path filter both before and after this fix is
applied. However, in order to determine whether this is the case, the
changed paths would first have to be computed, at which point it is not
much more expensive to just compute a new changed path filter.
So this patch does not include any mechanism to "salvage" changed path
filters from repositories. There is also no "mixed" mode - for each
invocation of Git, reading and writing changed path filters are done
with the same version number; this version number may be explicitly
stated (typically if the user knows which version they need) or
automatically determined from the version of the existing changed path
filters in the repository.
There is a change in write_commit_graph(). graph_read_bloom_data()
makes it possible for chunk_bloom_data to be non-NULL but
bloom_filter_settings to be NULL, which causes a segfault later on. I
produced such a segfault while developing this patch, but couldn't find
a way to reproduce it neither after this complete patch (or before),
but in any case it seemed like a good thing to include that might help
future patch authors.
The value in t0095 was obtained from another murmur3 implementation
using the following Go source code:
package main
import "fmt"
import "github.com/spaolacci/murmur3"
func main() {
fmt.Printf("%x\n", murmur3.Sum32([]byte("Hello world!")))
fmt.Printf("%x\n", murmur3.Sum32([]byte{0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}))
}
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subsequent commit will introduce another version of the changed-path
filter in the commit graph file. In order to control which version to
write (and read), a config variable is needed.
Therefore, introduce this config variable. For forwards compatibility,
teach Git to not read commit graphs when the config variable
is set to an unsupported version. Because we teach Git this,
commitgraph.readChangedPaths is now redundant, so deprecate it and
define its behavior in terms of the config variable we introduce.
This commit does not change the behavior of writing (Git writes changed
path filters when explicitly instructed regardless of any config
variable), but a subsequent commit will restrict Git such that it will
only write when commitgraph.changedPathsVersion is a recognized value.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subsequent commits will teach Git another version of changed path
filter that has different behavior with paths that contain at least
one character with its high bit set, so test the existing behavior as
a baseline.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement a mode of the "read-graph" test helper to dump out the
hexadecimal contents of the Bloom filter(s) contained in a commit-graph.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare for a future commit to use the load_bloom_filter_from_graph()
function directly to load specific Bloom filters out of the commit-graph
for manual inspection (to be used during tests).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare for the 'read-graph' test helper to perform other tasks besides
dumping high-level information about the commit-graph by extracting its
main routine into a separate function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code change to Git to support version 2 will be done in subsequent
commits.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leakfixes.
* ew/sha256-gcrypt-leak-fixes:
sha256/gcrypt: die on gcry_md_open failures
sha256/gcrypt: fix memory leak with SHA-256 repos
sha256/gcrypt: fix build with SANITIZE=leak
Tone down the warning on SHA-256 repositories being an experimental
curiosity. We do not have support for them to interoperate with
traditional SHA-1 repositories, but at this point, we do not plan
to make breaking changes to SHA-256 repositories and there is no
longer need for such a strongly phrased warning.
* am/doc-sha256:
doc: sha256 is no longer experimental
"git branch -f X" to repoint the branch X seid that X was "checked
out" in another worktree, even when branch X was not and instead
being bisected or rebased. The message was reworded to say the
branch was "in use".
* jc/branch-in-use-error-message:
branch: update the message to refuse touching a branch in-use
"git blame --contents=file" has been taught to work in a bare
repository.
* hy/blame-in-bare-with-contents:
blame: allow --contents to work with bare repo
Command line parser fix, and a small parse-options API update.
* jc/parse-options-short-help:
short help: allow a gap smaller than USAGE_GAP
remote: simplify "remote add --tags" help text
short help: allow multi-line opthelp
Clarify how to pick a starting point for a new topic in the
SubmittingPatches document.
* la/doc-choose-starting-point-fixup:
SubmittingPatches: use of older maintenance tracks is an exception
SubmittingPatches: explain why 'next' and above are inappropriate base
SubmittingPatches: choice of base for fixing an older maintenance track
Rewrite the description of giving a custom command to the
submodule.<name>.update configuraiton variable.
* pv/doc-submodule-update-settings:
doc: highlight that .gitmodules does not support !command
Fix tests with unportable regex patterns.
* ja/worktree-orphan-fix:
t2400: rewrite regex to avoid unintentional PCRE
builtin/worktree.c: convert tab in advice to space
t2400: drop no-op `--sq` from rev-parse call
The implementation of "get_sha1_hex()" that reads a hexadecimal
string that spells a full object name has been extended to cope
with any hash function used in the repository, but the "sha1" in
its name survived. Rename it to get_hash_hex(), a name that is
more consistent within its friends like get_hash_hex_algop().
* jc/retire-get-sha1-hex:
hex: retire get_sha1_hex()
"git branch --list --format=<format>" and friends are taught
a new "%(describe)" placeholder.
* ks/ref-filter-describe:
ref-filter: add new "describe" atom
ref-filter: add multiple-option parsing functions
When the user edits "rebase -i" todo file so that it starts with a
"fixup", which would make it invalid, the command truncated the
rest of the file before giving an error and returning the control
back to the user. Stop truncating to make it easier to correct
such a malformed todo file.
* ah/sequencer-rewrite-todo-fix:
sequencer: finish parsing the todo list despite an invalid first line
Instead of inventing a custom counter variables for debugging,
use existing trace2 facility in the fsync customization codepath.
* bb/use-trace2-counters-for-fsync-stats:
wrapper: use trace2 counters to collect fsync stats
"./configure --with-expat=no" did not work as a way to refuse use
of the expat library on a system with the library installed, which
has been corrected.
* ah/autoconf-fixes:
configure.ac: always save NO_ICONV to config.status
configure.ac: don't overwrite NO_CURL option
configure.ac: don't overwrite NO_EXPAT option
Code simplification.
* jc/tree-walk-drop-base-offset:
tree-walk: drop unused base_offset from do_match()
tree-walk: lose base_offset that is never used in tree_entry_interesting
Command line parser fixes.
* jc/parse-options-show-branch:
show-branch: reject --[no-](topo|date)-order
show-branch: --no-sparse should give dense output
Clarify how to choose the starting point for a new topic in
developer guidance document.
* la/doc-choose-starting-point:
SubmittingPatches: simplify guidance for choosing a starting point
SubmittingPatches: emphasize need to communicate non-default starting points
SubmittingPatches: de-emphasize branches as starting points
SubmittingPatches: discuss subsystems separately from git.git
SubmittingPatches: reword awkward phrasing
"git tag --list --points-at X" showed tags that directly refers to
object X, but did not list a tag that points at such a tag, which
has been corrected.
* jk/nested-points-at:
ref-filter: simplify return type of match_points_at
ref-filter: avoid parsing non-tags in match_points_at()
ref-filter: avoid parsing tagged objects in match_points_at()
ref-filter: handle nested tags in --points-at option
Mark-up unused parameters in the code so that we can eventually
enable -Wunused-parameter by default.
* jk/unused-parameter:
t/helper: mark unused callback void data parameters
tag: mark unused parameters in each_tag_name_fn callbacks
rev-parse: mark unused parameter in for_each_abbrev callback
replace: mark unused parameter in each_mergetag_fn callback
replace: mark unused parameter in ref callback
merge-tree: mark unused parameter in traverse callback
fsck: mark unused parameters in various fsck callbacks
revisions: drop unused "opt" parameter in "tweak" callbacks
count-objects: mark unused parameter in alternates callback
am: mark unused keep_cr parameters
http-push: mark unused parameter in xml callback
http: mark unused parameters in curl callbacks
do_for_each_ref_helper(): mark unused repository parameter
test-ref-store: drop unimplemented reflog-expire command
Names of MinGW header files are spelled in mixed case in some
source files, but the build host can be using case sensitive
filesystem with header files with their name spelled in all
lowercase.
* mh/mingw-case-sensitive-build:
mingw: use lowercase includes for some Windows headers
Various offset computation in the code that accesses the packfiles
and other data in the object layer has been hardened against
arithmetic overflow, especially on 32-bit systems.
* tb/object-access-overflow-protection:
commit-graph.c: prevent overflow in `verify_commit_graph()`
commit-graph.c: prevent overflow in `write_commit_graph()`
commit-graph.c: prevent overflow in `merge_commit_graph()`
commit-graph.c: prevent overflow in `split_graph_merge_strategy()`
commit-graph.c: prevent overflow in `load_tree_for_commit()`
commit-graph.c: prevent overflow in `fill_commit_in_graph()`
commit-graph.c: prevent overflow in `fill_commit_graph_info()`
commit-graph.c: prevent overflow in `load_oid_from_graph()`
commit-graph.c: prevent overflow in add_graph_to_chain()
commit-graph.c: prevent overflow in `write_commit_graph_file()`
pack-bitmap.c: ensure that eindex lookups don't overflow
midx.c: prevent overflow in `fill_included_packs_batch()`
midx.c: prevent overflow in `write_midx_internal()`
midx.c: store `nr`, `alloc` variables as `size_t`'s
midx.c: prevent overflow in `nth_midxed_offset()`
midx.c: prevent overflow in `nth_midxed_object_oid()`
midx.c: use `size_t`'s for fanout nr and alloc
packfile.c: use checked arithmetic in `nth_packed_object_offset()`
packfile.c: prevent overflow in `load_idx()`
packfile.c: prevent overflow in `nth_packed_object_id()`
Help newbies by suggesting that there are cases where force-pushing
is a valid and sensible thing to update a branch at a remote
repository, rather than reconciling with merge/rebase.
* ah/advise-force-pushing:
push: don't imply that integration is always required before pushing
remote: don't imply that integration is always required before pushing
wt-status: don't show divergence advice when committing
Enumerating refs in the packed-refs file, while excluding refs that
match certain patterns, has been optimized.
* tb/refs-exclusion-and-packed-refs:
ls-refs.c: avoid enumerating hidden refs where possible
upload-pack.c: avoid enumerating hidden refs where possible
builtin/receive-pack.c: avoid enumerating hidden references
refs.h: implement `hidden_refs_to_excludes()`
refs.h: let `for_each_namespaced_ref()` take excluded patterns
revision.h: store hidden refs in a `strvec`
refs/packed-backend.c: add trace2 counters for jump list
refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)
refs/packed-backend.c: refactor `find_reference_location()`
refs: plumb `exclude_patterns` argument throughout
builtin/for-each-ref.c: add `--exclude` option
ref-filter.c: parameterize match functions over patterns
ref-filter: add `ref_filter_clear()`
ref-filter: clear reachable list pointers after freeing
ref-filter.h: provide `REF_FILTER_INIT`
refs.c: rename `ref_filter`
The recent change to "git repack" made it react less nicely when a
leftover .idx file that no longer has the corresponding .pack file
in the repository, which has been corrected.
* tb/repack-cleanup:
builtin/repack.c: avoid dir traversal in `collect_pack_filenames()`
builtin/repack.c: only repack `.pack`s that exist
"git fsck --no-progress" still spewed noise from the commit-graph
subsystem, which has been corrected.
* tb/fsck-no-progress:
commit-graph.c: avoid duplicated progress output during `verify`
commit-graph.c: pass progress to `verify_one_commit_graph()`
commit-graph.c: iteratively verify commit-graph chains
commit-graph.c: extract `verify_one_commit_graph()`
fsck: suppress MIDX output with `--no-progress`
fsck: suppress commit-graph output with `--no-progress`
"git ls-files '(attr:X)D/'" that triggers the common prefix
optimization codepath failed to read from "D/.gitattributes",
which has been corrected.
* jc/pathspec-match-with-common-prefix:
dir: match "attr" pathspec magic with correct paths
t6135: attr magic with path pattern
Further shuffling of declarations across header files to streamline
file dependencies.
* cw/compat-util-header-cleanup:
git-compat-util: move alloc macros to git-compat-util.h
treewide: remove unnecessary includes for wrapper.h
kwset: move translation table from ctype
sane-ctype.h: create header for sane-ctype macros
git-compat-util: move wrapper.c funcs to its header
git-compat-util: move strbuf.c funcs to its header
Code snippets in a tutorial document no longer compiled after
recent header shuffling, which have been corrected.
* vd/adjust-mfow-doc-to-updated-headers:
docs: add necessary headers to Documentation/MFOW.txt
"git diff --no-index" learned to read from named pipes as if they
were regular files, to allow "git diff <(process) <(substitution)"
some shells support.
* pw/diff-no-index-from-named-pipes:
diff --no-index: support reading from named pipes
t4054: test diff --no-index with stdin
diff --no-index: die on error reading stdin
diff --no-index: refuse to compare stdin to a directory
"imap-send" codepaths got cleaned up to get rid of unused
parameters.
* jk/imap-send-unused-variable-cleanup:
imap-send: drop unused fields from imap_cmd_cb
imap-send: drop unused parameter from imap_cmd_cb callback
imap-send: use server conf argument in setup_curl()
"git bugreport" tests did not test what it wanted to test, which
has been corrected.
* ma/t0091-fixup:
t0091-bugreport.sh: actually verify some content of report
The "git for-each-ref" family of commands learned placeholders
related to GPG signature verification.
* ks/ref-filter-signature:
ref-filter: add new "signature" atom
t/lib-gpg: introduce new prereq GPG2
A few places failed to differenciate the case where the index is
truly empty (nothing added) and we haven't yet read from the
on-disk index file, which have been corrected.
* js/empty-index-fixes:
commit -a -m: allow the top-level tree to become empty again
split-index: accept that a base index can be empty
do_read_index(): always mark index as initialized unless erroring out
Reduce reliance on a global state in the config reading API.
* gc/config-context:
config: pass source to config_parser_event_fn_t
config: add kvi.path, use it to evaluate includes
config.c: remove config_reader from configsets
config: pass kvi to die_bad_number()
trace2: plumb config kvi
config.c: pass ctx with CLI config
config: pass ctx with config files
config.c: pass ctx in configsets
config: add ctx arg to config_fn_t
urlmatch.h: use config_fn_t type
config: inline git_color_default_config
During a cherry-pick or revert session that works on multiple
commits, "git status" did not give correct information, which has
been corrected.
* jk/cherry-pick-revert-status:
fix cherry-pick/revert status when doing multiple commits
"git apply" punts when it is fed too large a patch input; the error
message it gives when it happens has been clarified.
* pw/apply-too-large:
apply: improve error messages when reading patch
'git notes append' was taught '--separator' to specify string to insert
between paragraphs.
* tl/notes-separator:
notes: introduce "--no-separator" option
notes.c: introduce "--[no-]stripspace" option
notes.c: append separator instead of insert by pos
notes.c: introduce '--separator=<paragraph-break>' option
t3321: add test cases about the notes stripspace behavior
notes.c: use designated initializers for clarity
notes.c: cleanup 'strbuf_grow' call in 'append_edit'
Partially revert a sanity check that the rest of the config code
was not ready, to avoid triggering it in a corner case.
* gc/config-partial-submodule-kvi-fix:
config: don't BUG when both kvi and source are set
Move functions that are not about pure string manipulation out of
strbuf.[ch]
* cw/strbuf-cleanup:
strbuf: remove global variable
path: move related function to path
object-name: move related functions to object-name
credential-store: move related functions to credential-store file
abspath: move related functions to abspath
strbuf: clarify dependency
strbuf: clarify API boundary
Add more "git var" for toolsmiths to learn various locations Git is
configured with either via the configuration or hardcoded defaults.
* bc/more-git-var:
var: add config file locations
var: add attributes files locations
attr: expose and rename accessor functions
var: adjust memory allocation for strings
var: format variable structure with C99 initializers
var: add support for listing the shell
t: add a function to check executable bit
var: mark unused parameters in git_var callbacks
The set-up code for the get_revision() API now allows feeding
options like --all and --not in the --stdin mode.
* ps/revision-stdin-with-options:
revision: handle pseudo-opts in `--stdin` mode
revision: small readability improvement for reading from stdin
revision: reorder `read_revisions_from_stdin()`
When the external merge driver is killed by a signal, its output
should not be trusted as a resolution with conflicts that is
proposed by the driver, but the code did.
* jc/abort-ll-merge-with-a-signal:
t6406: skip "external merge driver getting killed by a signal" test on Windows
Header files cleanup.
Kicked out of 'next', applied build fixes including js/cmake-wo-cache-h
* en/header-split-cache-h-part-3: (28 commits)
fsmonitor-ll.h: split this header out of fsmonitor.h
hash-ll, hashmap: move oidhash() to hash-ll
object-store-ll.h: split this header out of object-store.h
khash: name the structs that khash declares
merge-ll: rename from ll-merge
git-compat-util.h: remove unneccessary include of wildmatch.h
builtin.h: remove unneccessary includes
list-objects-filter-options.h: remove unneccessary include
diff.h: remove unnecessary include of oidset.h
repository: remove unnecessary include of path.h
log-tree: replace include of revision.h with simple forward declaration
cache.h: remove this no-longer-used header
read-cache*.h: move declarations for read-cache.c functions from cache.h
repository.h: move declaration of the_index from cache.h
merge.h: move declarations for merge.c from cache.h
diff.h: move declaration for global in diff.c from cache.h
preload-index.h: move declarations for preload-index.c from elsewhere
sparse-index.h: move declarations for sparse-index.c from cache.h
name-hash.h: move declarations for name-hash.c from cache.h
run-command.h: move declarations for run-command.c from cache.h
...
We create .pack and then .idx, we consider only packfiles that have
.idx usable (those with only .pack are not ready yet), so we should
remove .idx before removing .pack for consistency.
* ds/remove-idx-before-pack:
packfile: delete .idx files before .pack files
Avoid breakage of "git pack-objects --cruft" due to inconsistency
between the way the code enumerates packfiles in the repository.
* tb/collect-pack-filenames-fix:
builtin/repack.c: only collect fully-formed packs
When "git commit --trailer=..." invokes the interpret-trailers
machinery, it knows what it feeds to interpret-trailers is a full
log message without any patch, but failed to express that by
passing the "--no-divider" option, which has been corrected.
* jk/commit-use-no-divider-with-interpret-trailers:
commit: pass --no-divider to interpret-trailers
Curl library recently changed how http2 traces are shown and broke
the code to redact sensitive info header, which has been fixed.
* jk/redact-h2h3-headers-fix:
http: handle both "h2" and "h2h3" in curl info lines
Leakfixes
* rj/leakfixes:
tests: mark as passing with SANITIZE=leak
config: fix a leak in git_config_copy_or_rename_section_in_file
branch: fix a leak in cmd_branch
branch: fix a leak in setup_tracking
rev-parse: fix a leak with --abbrev-ref
Even when diff.ignoreSubmodules tells us to ignore submodule
changes, "git commit" with an index that already records changes to
submodules should include the submodule changes in the resulting
commit, but it did not.
* js/defeat-ignore-submodules-config-with-explicit-addition:
diff-lib: honor override_submodule_config flag bit
Suggest to refrain from using hex literals that are non-portable
when writing printf(1) format strings.
* jt/doc-use-octal-with-printf:
CodingGuidelines: use octal escapes, not hex
Leakfixes (subset)
* rj/leakfixes:
branch: fix a leak in setup_tracking
branch: fix a leak in check_tracking_branch
branch: fix a leak in inherit_tracking
branch: fix a leak in dwim_and_setup_tracking
remote: fix a leak in query_matches_negative_refspec
config: fix a leak in git_config_copy_or_rename_section_in_file
Gracefully deal with a stale MIDX file that lists a packfile that
no longer exists.
* tb/open-midx-bitmap-fallback:
pack-bitmap.c: gracefully degrade on failure to load MIDX'd pack
"git pack-objects" learned to invoke a new hook program that
enumerates extra objects to be used as anchoring points to keep
otherwise unreachable objects in cruft packs.
* tb/gc-recent-object-hook:
gc: introduce `gc.recentObjectsHook`
reachable.c: extract `obj_is_recent()`
Simplify error message when run-command fails to start a command.
* rs/run-command-exec-error-on-noent:
run-command: report exec error even on ENOENT
t1800: loosen matching of error message for bad shebang
The reimplemented "git add -i" did not honor color.ui configuration.
* ds/add-i-color-configuration-fix:
add: test use of brackets when color is disabled
add: check color.ui for interactive add
"git cat-file --batch" and friends learned "-Z" that uses NUL
delimiter for both input and output.
* ps/cat-file-null-output:
cat-file: add option '-Z' that delimits input and output with NUL
cat-file: simplify reading from standard input
strbuf: provide CRLF-aware helper to read until a specified delimiter
t1006: modernize test style to use `test_cmp`
t1006: don't strip timestamps from expected results
'git worktree add' learned how to create a worktree based on an
orphaned branch with `--orphan`.
* ja/worktree-orphan:
worktree add: emit warn when there is a bad HEAD
worktree add: extend DWIM to infer --orphan
worktree add: introduce "try --orphan" hint
worktree add: add --orphan flag
t2400: add tests to verify --quiet
t2400: refactor "worktree add" opt exclusion tests
t2400: cleanup created worktree in test
worktree add: include -B in usage docs
"git [-c log.follow=true] log [--follow] ':(glob)f**'" used to barf.
* jk/log-follow-with-non-literal-pathspec:
diff: detect pathspec magic not supported by --follow
diff: factor out --follow pathspec check
pathspec: factor out magic-to-name function
The value of config.worktree is per-repository, but has been kept
in a singleton global variable per process. This has been OK as
most Git operations interacted with a single repository at a time,
but not right for operations like recursive "grep" that want to
access multiple repositories from a single process without forking.
The global variable has been eliminated and made into a member in
the per-repository data structure.
* vd/worktree-config-is-per-repository:
repository: move 'repository_format_worktree_config' to repo scope
config: pass 'repo' directly to 'config_with_options()'
config: use gitdir to get worktree config
"git submodule" code trusted the data coming from the config (and
the in-tree .gitmodules file) too much without validating, leading
to NULL dereference if the user mucks with a repository (e.g.
submodule.<name>.url is removed). This has been corrected.
* tb/submodule-null-deref-fix:
builtin/submodule--helper.c: handle missing submodule URLs
Test style updates.
* jc/test-modernization-2:
t9400-git-cvsserver-server: modernize test format
t9200-git-cvsexportcommit: modernize test format
t9104-git-svn-follow-parent: modernize test format
t9100-git-svn-basic: modernize test format
t7700-repack: modernize test format
t7600-merge: modernize test format
t7508-status: modernize test format
t7201-co: modernize test format
t7111-reset-table: modernize test format
t7110-reset-merge: modernize test format
* jc/test-modernization:
t7101-reset-empty-subdirs: modernize test format
t6050-replace: modernize test format
t5306-pack-nobase: modernize test format
t5303-pack-corruption-resilience: modernize test format
t5301-sliding-window: modernize test format
t5300-pack-object: modernize test format
t4206-log-follow-harder-copies: modernize test format
t4202-log: modernize test format
t4004-diff-rename-symlink: modernize test format
t4003-diff-rename-1: modernize test format
t4002-diff-basic: modernize test format
t3903-stash: modernize test format
t3700-add: modernize test format
t3500-cherry: modernize test format
t1006-cat-file: modernize test format
t1002-read-tree-m-u-2way: modernize test format
t1001-read-tree-m-2way: modernize test format
t3210-pack-refs: modernize test format
t0030-stripspace: modernize test format
t0000-basic: modernize test format
Document more pseudo-refs and teach the command line completion
machinery to complete AUTO_MERGE.
* pb/complete-and-document-auto-merge-and-friends:
completion: complete AUTO_MERGE
Documentation: document AUTO_MERGE
git-merge.txt: modernize word choice in "True merge" section
completion: complete REVERT_HEAD and BISECT_HEAD
revisions.txt: document more special refs
revisions.txt: use description list for special refs
Header files cleanup.
* en/header-split-cache-h-part-3: (28 commits)
fsmonitor-ll.h: split this header out of fsmonitor.h
hash-ll, hashmap: move oidhash() to hash-ll
object-store-ll.h: split this header out of object-store.h
khash: name the structs that khash declares
merge-ll: rename from ll-merge
git-compat-util.h: remove unneccessary include of wildmatch.h
builtin.h: remove unneccessary includes
list-objects-filter-options.h: remove unneccessary include
diff.h: remove unnecessary include of oidset.h
repository: remove unnecessary include of path.h
log-tree: replace include of revision.h with simple forward declaration
cache.h: remove this no-longer-used header
read-cache*.h: move declarations for read-cache.c functions from cache.h
repository.h: move declaration of the_index from cache.h
merge.h: move declarations for merge.c from cache.h
diff.h: move declaration for global in diff.c from cache.h
preload-index.h: move declarations for preload-index.c from elsewhere
sparse-index.h: move declarations for sparse-index.c from cache.h
name-hash.h: move declarations for name-hash.c from cache.h
run-command.h: move declarations for run-command.c from cache.h
...
Clang's sanitizer implementation seems to work better than GCC's.
* jk/ci-use-clang-for-sanitizer-jobs:
ci: drop linux-clang job
ci: run ASan/UBSan in a single job
ci: use clang for ASan/UBSan checks
Code clean-up.
* ps/fetch-cleanups:
fetch: use `fetch_config` to store "submodule.fetchJobs" value
fetch: use `fetch_config` to store "fetch.parallel" value
fetch: use `fetch_config` to store "fetch.recurseSubmodules" value
fetch: use `fetch_config` to store "fetch.showForcedUpdates" value
fetch: use `fetch_config` to store "fetch.pruneTags" value
fetch: use `fetch_config` to store "fetch.prune" value
fetch: pass through `fetch_config` directly
fetch: drop unneeded NULL-check for `remote_ref`
fetch: drop unused DISPLAY_FORMAT_UNKNOWN enum value
This creates a new fsmonitor-ll.h with most of the functions from
fsmonitor.h, though it leaves three inline functions where they were.
Two-thirds of the files that previously included fsmonitor.h did not
need those three inline functions or the six extra includes those inline
functions required, so this allows them to only include the lower level
header.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
oidhash() was used by both hashmap and khash, which makes sense.
However, the location of this function in hashmap.[ch] meant that
khash.h had to depend upon hashmap.h, making people unfamiliar with
khash think that it was built upon hashmap. (Or at least, I personally
was confused for a while about this in the past.)
Move this function to hash-ll, so that khash.h can stop depending upon
hashmap.h.
This has another benefit as well: it allows us to remove hashmap.h's
dependency on hash-ll.h. While some callers of hashmap.h were making
use of oidhash, most were not, so this change provides another way to
reduce the number of includes.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The vast majority of files including object-store.h did not need dir.h
nor khash.h. Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.
After this patch:
$ git grep -h include..object-store | sort | uniq -c
2 #include "object-store.h"
129 #include "object-store-ll.h"
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
khash.h lets you instantiate custom hash types that map between two
types. These are defined as a struct, as you might expect, and khash
typedef's that to kh_foo_t. But it declares the struct anonymously,
which doesn't give a name to the struct type itself; there is no
"struct kh_foo". This has two small downsides:
- when using khash, we declare "kh_foo_t *the_foo". This is
unlike our usual naming style, which is "struct kh_foo *the_foo".
- you can't forward-declare a typedef of an unnamed struct type in
C. So we might do something like this in a header file:
struct kh_foo;
struct bar {
struct kh_foo *the_foo;
};
to avoid having to include the header that defines the real
kh_foo. But that doesn't work with the typedef'd name. Without the
"struct" keyword, the compiler doesn't know we mean that kh_foo is
a type.
So let's always give khash structs the name that matches our
conventions ("struct kh_foo" to match "kh_foo_t"). We'll keep doing
the typedef to retain compatibility with existing callers.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A long term (but rather minor) pet-peeve of mine was the name
ll-merge.[ch]. I thought it made it harder to realize what stuff was
related to merging when I was working on the merge machinery and trying
to improve it.
Further, back in d1cbe1e6d8 ("hash-ll.h: split out of hash.h to remove
dependency on repository.h", 2023-04-22), we have split the portions of
hash.h that do not depend upon repository.h into a "hash-ll.h" (due to
the recommendation to use "ll" for "low-level" in its name[1], but which
I used as a suffix precisely because of my distaste for "ll-merge").
When we discussed adding additional "*-ll.h" files, a request was made
that we use "ll" consistently as either a prefix or a suffix. Since it
is already in use as both a prefix and a suffix, the only way to do so
is to rename some files.
Besides my distaste for the ll-merge.[ch] name, let me also note that
the files
ll-fsmonitor.h, ll-hash.h, ll-merge.h, ll-object-store.h, ll-read-cache.h
would have essentially nothing to do with each other and make no sense
to group. But giving them the common "ll-" prefix would group them. Using
"-ll" as a suffix thus seems just much more logical to me. Rename
ll-merge.[ch] to merge-ll.[ch] to achieve this consistency, and to
ensure we get a more logical grouping of files.
[1] https://lore.kernel.org/git/kl6lsfcu1g8w.fsf@chooglen-macbookpro.roam.corp.google.com/
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The include of wildmatch.h in git-compat-util.h was added in cebcab189a
(Makefile: add USE_WILDMATCH to use wildmatch as fnmatch, 2013-01-01) as
a way to be able to compile-time force any calls to fnmatch() to instead
invoke wildmatch(). The defines and inline function were removed in
70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15),
and this include in git-compat-util.h has been unnecessary ever since.
Remove the include from git-compat-util.h, but add it to the .c files
that had omitted the direct #include they needed.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also made it clear that a few .c files under builtin/ were
depending upon some headers but had forgotten to #include them. Add the
missing direct includes while at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also made it clear that several .c files depended upon various
things that oidset included, but had omitted the direct #include for
those headers. Add those now.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this header showed up in some places besides just #include
statements, update/clean-up/remove those other places as well.
Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got
away with violating the rule that all files must start with an include
of git-compat-util.h (or a short-list of alternate headers that happen
to include it first). This change exposed the violation and caused it
to stop building correctly; fix it by having it include
git-compat-util.h first, as per policy.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For the functions defined in read-cache.c, move their declarations from
cache.h to a new header, read-cache-ll.h. Also move some related inline
functions from cache.h to read-cache.h. The purpose of the
read-cache-ll.h/read-cache.h split is that about 70% of the sites don't
need the inline functions and the extra headers they include.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
the_index is a global variable defined in repository.c; as such, its
declaration feels better suited living in repository.h rather than
cache.h. Move it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have a preload-index.c file; move the declarations for the
functions in that file into a new preload-index.h. These were
previously split between cache.h and repository.h.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Note in particular that this reverses the decision made in 118a2e8bde
("cache: move ensure_full_index() to cache.h", 2021-04-01).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions do not depend upon struct cache_entry or struct
index_state in any way, and it seems more logical to break them out into
this file, especially since statinfo.h already has the struct stat_data
declaration.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function add_files_to_cache(), plus associated helper functions,
were defined in builtin/add.c, but also shared with builtin/checkout.c
and builtin/commit.c. Move these shared functions to read-cache.c.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function add_files_to_cache() is used by all three of builtin/{add,
checkout, commit}.c. That suggests this is common library code, and
should be moved somewhere else, like read-cache.c. However, the
function and its helpers made use of two global variables that made
straight code movement difficult:
* the_index
* include_sparse
The latter was perhaps more problematic since it was only accessible in
builtin/add.c but was still affecting builtin/checkout.c and
builtin/commit.c without this fact being very clear from the code. I'm
not sure if the other two callers would want to add a `--sparse` flag
similar to add.c to get non-default behavior, but exposing this
dependence will help if we ever decide we do want to add such a flag.
Modify add_files_to_cache() and its helpers to accept the necessary
arguments instead of relying on globals.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function overlay_tree_on_index(), plus associated helper functions,
were defined in builtin/ls-files.c, but also shared with
builtin/commit.c. Move these shared functions to read-cache.c.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functions init_db() and initialize_repository_version() were shared
by builtin/init-db.c and builtin/clone.c, and declared in cache.h.
Move these functions, plus their several helpers only used by these
functions, to setup.[ch].
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much like the parent commit, this commit was prompted by a desire to
move the functions which builtin/init-db.c and builtin/clone.c share out
of the former file and into setup.c. A secondary issue that made it
difficult was the init_shared_repository global variable; replace it
with a simple parameter that is passed to the relevant functions.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit was prompted by a desire to move the functions which
builtin/init-db.c and builtin/clone.c share out of the former file and
into setup.c. One issue that made it difficult was the
init_is_bare_repository global variable.
init_is_bare_repository's sole use in life it to cache a value in
init_db(), and then be used in create_default_files(). This is a bit
odd since init_db() directly calls create_default_files(), and is the
only caller of that function. Convert the global to a simple function
parameter instead.
(Of course, this doesn't fix the fact that this value is then ignored by
create_default_files(), as noted in a big TODO comment in that function,
but it at least includes no behavioral change other than getting rid of
a very questionable global variable.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comments in create_default_files() talks about reading config from
the config file in the specified `--templates` directory, which leads to
the question of whether core.bare could be set in such a config file and
thus whether the code is doing the right thing. It turns out, that it
doesn't; it unconditionally ignores core.bare in the config file in any
--templates directory. It is not clear to me that fixing it can be done
within this function; it seems to occur too late:
* create_default_files() is called by init_db()
* init_db() is called by both builtin/{clone.c,init-db.c}
* both callers of init_db() call set_git_work_tree() before init_db()
and in order to actual affect whether a repository is bear, we'd need to
somewhere reset these values, not just the is_bare_repository_cfg
setting.
I do not want to open this can of worms at this time; I'm trying to
clean up some headers, for which I need to move some functions, for
which I need to clean up some globals, and that's far enough down the
rabbit hole. So, simply document the issue with a careful TODO comment
and a few testcases.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* en/header-split-cache-h-part-2: (22 commits)
reftable: ensure git-compat-util.h is the first (indirect) include
diff.h: reduce unnecessary includes
object-store.h: reduce unnecessary includes
commit.h: reduce unnecessary includes
fsmonitor: reduce includes of cache.h
cache.h: remove unnecessary headers
treewide: remove cache.h inclusion due to previous changes
cache,tree: move basic name compare functions from read-cache to tree
cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
hash-ll.h: split out of hash.h to remove dependency on repository.h
tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
dir.h: move DTYPE defines from cache.h
versioncmp.h: move declarations for versioncmp.c functions from cache.h
ws.h: move declarations for ws.c functions from cache.h
match-trees.h: move declarations for match-trees.c functions from cache.h
pkt-line.h: move declarations for pkt-line.c functions from cache.h
base85.h: move declarations for base85.c functions from cache.h
copy.h: move declarations for copy.c functions from cache.h
server-info.h: move declarations for server-info.c functions from cache.h
packfile.h: move pack_window and pack_entry from cache.h
...
2023-05-08 08:21:21 -07:00
179 changed files with 3443 additions and 1052 deletions
* Print a single ref, outside of any ref-filter. Note that the
* name must be a fully qualified refname.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.