"git pull" with any strategy when the other side is behind us
should succeed as it is a no-op, but doesn't.
* ev/pull-already-up-to-date-is-noop:
pull: should be noop when already-up-to-date
"git grep" looking in a blob that has non-UTF8 payload was
completely broken when linked with versions of PCREv2 library older
than 10.34 in the latest release.
* hm/paint-hits-in-log-grep:
Revert "grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data"
This reverts commit f6526728f9.
The change in f652672 (dir: select directories correctly, 2021-09-24)
caused a regression in directory-based matches with non-cone-mode
patterns, especially for .gitignore patterns. A test is included to
prevent this regression in the future.
The commit ed495847 (dir: fix pattern matching on dirs, 2021-09-24) was
reverted in 5ceb663 (dir: fix directory-matching bug, 2021-11-02) for
similar reasons. Neither commit changed tests, and tests added later in
the series continue to pass when these commits are reverted.
Reported-by: Danial Alihosseini <danial.alihosseini@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit ae39ba431a, as it
breaks "grep" when looking for a string in non UTF-8 haystack, when
linked with certain versions of PCREv2 library.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The already-up-to-date pull bug was fixed for --ff-only but it did not
include the case where --ff or --ff-only are not specified. This updates
the --ff-only fix to include the case where --ff or --ff-only are not
specified in command line flags or config.
Signed-off-by: Erwin Villejo <erwin.villejo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A superfluous ']' was added to the title of the GitHub CI section in
f003a91f5c (SubmittingPatches: replace discussion of Travis with GitHub
Actions, 2021-07-22). Remove it.
While at it, format the URL for a GitHub user's workflow runs of Git
between backticks, since if not Asciidoc formats only the first part,
"https://github.com/<Your", as a link, which is not very useful.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When checking typos in file "po/ko.po", "git-po-helper" reports lots of
false positives because there are no spaces between ASCII and Korean
characters. After applied commit adee197 "(dict: add smudge table for
Korean language, 2021-11-11)" of "git-l10n/git-po-helper" to suppress
these false positives, some easy-to-fix typos are found and fixed.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
When we added a new event type to trace2 event stream, we forgot to
raise the format version number, which has been corrected.
* js/trace2-raise-format-version:
trace2: increment event format version
Regression fix.
* ab/fsck-unexpected-type:
object-file: free(*contents) only in read_loose_object() caller
object-file: fix SEGV on free() regression in v2.34.0-rc2
In 64bc752 (trace2: add trace2_child_ready() to report on background
children, 2021-09-20), we added a new "child_ready" event. In
Documentation/technical/api-trace2.txt, we promise that adding a new
event type will result in incrementing the trace2 event format version
number, but this was not done. Correct this in code & docs.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commit a free() of uninitialized memory regression in
96e41f58fe (fsck: report invalid object type-path combinations,
2021-10-01) was fixed, but we'd still have an issue with leaking
memory from fsck_loose(). Let's fix that issue too.
That issue was introduced in my 31deb28f5e (fsck: don't hard die on
invalid object types, 2021-10-01). It can be reproduced under
SANITIZE=leak with the test I added in 093fffdfbe (fsck tests: add
test for fsck-ing an unknown type, 2021-10-01):
./t1450-fsck.sh --run=84 -vixd
In some sense it's not a problem, we lost the same amount of memory in
terms of things malloc'd and not free'd. It just moved from the "still
reachable" to "definitely lost" column in valgrind(1) nomenclature[1],
since we'd have die()'d before.
But now that we don't hard die() anymore in the library let's properly
free() it. Doing so makes this code much easier to follow, since we'll
now have one function owning the freeing of the "contents" variable,
not two.
For context on that memory management pattern the read_loose_object()
function was added in f6371f9210 (sha1_file: add read_loose_object()
function, 2017-01-13) and subsequently used in c68b489e56 (fsck:
parse loose object paths directly, 2017-01-13). The pattern of it
being the task of both sides to free() the memory has been there in
this form since its inception.
1. https://valgrind.org/docs/manual/mc-manual.html#mc-manual.leaks
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit f45022dc2f,
as this is like breakage in the traversal more likely. In a
history with 10 single strand of pearls,
1-->2-->3--...->7-->8-->9-->10
asking "rev-list --unsorted-input 1 10 --not 9 8 7 6 5 4" fails to
paint the bottom 1 uninteresting as the traversal stops, without
completing the propagation of uninteresting bit starting at 4 down
through 3 and 2 to 1.
Fix a regression introduced in my 96e41f58fe (fsck: report invalid
object type-path combinations, 2021-10-01). When fsck-ing blobs larger
than core.bigFileThreshold, we'd free() a pointer to uninitialized
memory.
This issue would have been caught by SANITIZE=address, but since it
involves core.bigFileThreshold, none of the existing tests in our test
suite covered it.
Running them with the "big_file_threshold" in "environment.c" changed
to say "6" would have shown this failure, but let's add a dedicated
test for this scenario based on Han Xin's report[1].
The bug was introduced between v9 and v10[2] of the fsck series merged
in 061a21d36d (Merge branch 'ab/fsck-unexpected-type', 2021-10-25).
1. https://lore.kernel.org/git/20211111030302.75694-1-hanxin.hx@alibaba-inc.com/
2. https://lore.kernel.org/git/cover-v10-00.17-00000000000-20211001T091051Z-avarab@gmail.com/
Reported-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The way Cygwin emulates a unix-domain socket, on top of which the
simple-ipc mechanism is implemented, can race with the program on
the other side that wants to use the socket, and briefly make it
appear as a regular file before lstat(2) starts reporting it as a
socket. We now have a workaround on the side that connects to a
unix domain socket.
* js/simple-ipc-cygwin-socket-fix:
simple-ipc: work around issues with Cygwin's Unix socket emulation
"git maintenance run" learned to use system supplied scheduler
backend, but cron on macOS turns out to be unusable for this
purpose.
* ds/no-usable-cron-on-macos:
maintenance: disable cron on macOS
"git pull --ff-only" and "git pull --rebase --ff-only" should make
it a no-op to attempt pulling from a remote that is behind us, but
instead the command errored out by saying it was impossible to
fast-forward, which may technically be true, but not a useful thing
to diagnose as an error. This has been corrected.
* jc/fix-pull-ff-only-when-already-up-to-date:
pull: --ff-only should make it a noop when already-up-to-date
The "-Y find-principals" option of ssh-keygen seems to be broken in
Debian's openssh-client 1:8.7p1-1, whereas it works fine in 1:8.4p1-5.
This causes several failures for GPGSSH tests. We fulfill the
prerequisite because generating the keys works fine, but actually
verifying a signature causes results ranging from bogus results to
ssh-keygen segfaulting.
We can find the broken version during the prereq check by feeding it
empty input. This should result in it complaining to stderr, but in the
broken version it triggers the segfault, causing the GPGSSH tests to be
skipped.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In eba1ba9 (maintenance: `git maintenance run` learned
`--scheduler=<scheduler>`, 2021-09-04), we introduced the ability to
specify a scheduler explicitly. This led to some extra checks around
whether an alternative scheduler was available. This added the
functionality of removing background maintenance from schedulers other
than the one selected.
On macOS, cron is technically available, but running 'crontab' triggers
a UI prompt asking for special permissions. This is the major reason why
launchctl is used as the default scheduler. The is_crontab_available()
method triggers this UI prompt, causing user disruption.
Remove this disruption by using an #ifdef to prevent running crontab
this way on macOS. This has the unfortunate downside that if a user
manually selects cron via the '--scheduler' option, then adjusting the
scheduler later will not remove the schedule from cron. The
'--scheduler' option ignores the is_available checks, which is how we
can get into this situation.
Extract the new check_crontab_process() method to avoid making the
'child' variable unused on macOS. The method is marked MAYBE_UNUSED
because it has no callers on macOS.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cygwin emulates Unix sockets by writing files with custom contents and
then marking them as system files.
The tricky problem is that while the file is written and its `system`
bit is set, it is still identified as a file. This caused test failures
when Git is too fast looking for the Unix sockets and then complains
that there is a plain file in the way.
Let's work around this by adding a delayed retry loop, specifically for
Cygwin.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Tested-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of github.com:git/git:
Git 2.34-rc2
parse-options.[ch]: revert use of "enum" for parse_options()
t/lib-git.sh: fix ACL-related permissions failure
A few fixes before -rc2
async_die_is_recursing: work around GCC v11.x issue on Fedora
Document positive variant of commit and merge option "--no-verify"
pull: honor --no-verify and do not call the commit-msg hook
http-backend: remove a duplicated code branch
Fix ssh-signing test to work on a platform where the default ACL is
overly loose to upset OpenSSH (reported on an installation of Cygwin).
* ad/ssh-signing-testfix:
t/lib-git.sh: fix ACL-related permissions failure
Revert the parse_options() prototype change in my recent
352e761388 (parse-options.[ch]: consistently use "enum
parse_opt_result", 2021-10-08) was incorrect. The parse_options()
function returns the number of argc elements that haven't been
processed, not "enum parse_opt_result".
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As well as checking that the relevant functionality is available, the
GPGSSH prerequisite check creates the SSH keys that are used by the test
functions it gates. If these keys are created in a directory that
has a default Access Control List, the key files can inherit those
permissions.
This can result in a scenario where the private keys are created
successfully, so the prerequisite check passes and the tests are run,
but the key files have permissions that are too permissive, meaning
OpenSSH will refuse to load them and the tests will fail.
To avoid this happening, before creating the keys, clear any default ACL
set on the directory that will contain them. This step allowed to fail;
if setfacl isn't present, that's a very likely indicator that the
filesystem in question simply doesn't support default ACLs.
Helped-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One CI task based on Fedora image noticed a not-quite-kosher
consturct recently, which has been corrected.
* vd/pthread-setspecific-g11-fix:
async_die_is_recursing: work around GCC v11.x issue on Fedora
One CI task based on Fedora image noticed a not-quite-kosher
consturct recently, which has been corrected.
* vd/pthread-setspecific-g11-fix:
async_die_is_recursing: work around GCC v11.x issue on Fedora
"git pull --no-verify" did not affect the underlying "git merge".
* ar/fix-git-pull-no-verify:
pull: honor --no-verify and do not call the commit-msg hook
This fix corrects an issue found in the `dockerized(pedantic, fedora)` CI
build, first appearing after the introduction of a new version of the Fedora
docker image version. This image includes a version of `glibc` with the
attribute `__attr_access_none` added to `pthread_setspecific` [1], the
implementation of which only exists for GCC 11.X - the version included in
the Fedora image. The attribute requires that the pointer provided in the
second argument of `pthread_getspecific` must, if not NULL, be a pointer to
a valid object. In the usage in `async_die_is_recursing`, `(void *)1` is not
valid, causing the error.
This fix imitates a workaround added in SELinux [2] by using the pointer to
the static `async_die_counter` itself as the second argument to
`pthread_setspecific`. This guaranteed non-NULL, valid pointer matches the
intent of the current usage while not triggering the build error.
[1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=a1561c3bbe8
[2] https://lore.kernel.org/all/20211021140519.6593-1-cgzones@googlemail.com/
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of github.com:git/git:
Git 2.34-rc1
rebase -i: fix rewording with --committer-date-is-author-date
dir: fix directory-matching bug
gpg-interface: avoid buffer overrun in parse_ssh_output()
gpg-interface: handle missing " with " gracefully in parse_ssh_output()
A few more topics before -rc1
i18n: fix typos found during l10n for git 2.34.0
t5310: drop lib-bundle.sh include
format-patch (doc): clarify --base=auto
gc: perform incremental repack when implictly enabled
fsck: verify multi-pack-index when implictly enabled
fsck: verify commit graph when implicitly enabled
grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data
commit-graph: don't consider "replace" objects with "verify"
commit-graph tests: fix another graph_git_two_modes() helper
commit-graph tests: fix error-hiding graph_git_two_modes() helper
pretty: colorize pattern matches in commit messages
grep: refactor next_match() and match_one_pattern() for external use
baf8ec8d3a (rebase -r: don't write .git/MERGE_MSG when
fast-forwarding, 2021-08-20) stopped reading the author script in
run_git_commit() when rewording a commit. This is normally safe
because "git commit --amend" preserves the authorship. However if the
user passes "--committer-date-is-author-date" then we need to read the
author date from the author script when rewording. Fix this regression
by tightening the check for when it is safe to skip reading the author
script.
Reported-by: Jonas Kittner <jonas.kittner@ruhr-uni-bochum.de>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts the change from ed49584 (dir: fix pattern matching on dirs,
2021-09-24), which claimed to fix a directory-matching problem without a
test case. It turns out to _create_ a bug, but it is a bit subtle.
The bug would have been revealed by the first of two tests being added to
t0008-ignores.sh. The first uses a pattern "/git/" inside the a/.gitignores
file, which matches against 'a/git/foo' but not 'a/git-foo/bar'. This test
would fail before the revert.
The second test shows what happens if the test instead uses a pattern "git/"
and this test passes both before and after the revert.
The difference in these two cases are due to how
last_matching_pattern_from_list() checks patterns both if they have the
PATTERN_FLAG_MUSTBEDIR and PATTERN_FLAG_NODIR flags. In the case of "git/",
the PATTERN_FLAG_NODIR is also provided, making the change in behavior in
match_pathname() not affect the end result of
last_matching_pattern_from_list().
Reported-by: Glen Choo <chooglen@google.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the string "key" we found in the output of ssh-keygen happens to be
located at the very end of the line, then going four characters further
leaves us beyond the end of the string. Explicitly search for the
space after "key" to handle a missing one gracefully.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the output of ssh-keygen starts with "Good \"git\" signature for ",
but is not followed by " with " for some reason, then parse_ssh_output()
uses -1 as the len parameter of xmemdupz(), which in turn will end the
program. Reject the signature and carry on instead in that case.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is wrong to read some settings directly from the config
subsystem, as things like feature.experimental can affect their
default values.
* gc/use-repo-settings:
gc: perform incremental repack when implictly enabled
fsck: verify multi-pack-index when implictly enabled
fsck: verify commit graph when implicitly enabled
Teach "git commit-graph" command not to allow using replace objects
at all, as we do not use the commit-graph at runtime when we see
object replacement.
* ab/ignore-replace-while-working-on-commit-graph:
commit-graph: don't consider "replace" objects with "verify"
commit-graph tests: fix another graph_git_two_modes() helper
commit-graph tests: fix error-hiding graph_git_two_modes() helper
"git log --grep=string --author=name" learns to highlight hits just
like "git grep string" does.
* hm/paint-hits-in-log-grep:
grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data
pretty: colorize pattern matches in commit messages
grep: refactor next_match() and match_one_pattern() for external use
* cleaning duplicate incorrect translations part 2
* update translation table
* unfuzzy all entries
* typos check in git commands and git flags
* random line translations
* updating all phrases #1
Signed-off-by: Daniel Santos <daniel@brilhante.top>
Emir and Jean-Noël reported typos in some i18n messages when preparing
l10n for git 2.34.0.
* Fix unstable spelling of config variable "gpg.ssh.defaultKeyCommand"
which was introduced in commit fd9e226776 (ssh signing: retrieve a
default key from ssh-agent, 2021-09-10).
* Add missing space between "with" and "--python" which was introduced
in commit bd0708c7eb (ref-filter: add %(raw) atom, 2021-07-26).
* Fix unmatched single quote in 'builtin/index-pack.c' which was
introduced in commit 8737dab346 (index-pack: refactor renaming in
final(), 2021-09-09)
[1] https://github.com/git-l10n/git-po/pull/567
Reported-by: Emir Sarı <bitigchi@me.com>
Reported-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of github.com:git/git: (762 commits)
Git 2.34-rc0
wrapper: remove xunsetenv()
log: document --encoding behavior on iconv() failure
Revert "logmsg_reencode(): warn when iconv() fails"
completion: fix incorrect bash/zsh string equality check
add, rm, mv: fix bug that prevents the update of non-sparse dirs
git-bundle.txt: add missing words and punctuation
Documentation/Makefile: fix lint-docs mkdir dependency
submodule: drop unused sm_name parameter from append_fetch_remotes()
The fifteenth batch
gitweb.txt: change "folder" to "directory"
gitignore.txt: change "folder" to "directory"
git-multi-pack-index.txt: change "folder" to "directory"
git.txt: fix typo
archive: describe compression level option
config.txt: fix typo
command-list.txt: remove 'sparse-index' from main help
userdiff-cpp: back out the digit-separators in numbers
submodule--helper: fix incorrect newlines in an error message
branch (doc): -m/-c copies config and reflog
...
"git branch -c/-m new old" was not described to copy config, which
has been corrected.
* jc/branch-copy-doc:
branch (doc): -m/-c copies config and reflog
Consistently use 'directory', not 'folder', to call the filesystem
entity that collects a group of files and, eh, directories.
* ma/doc-folder-to-directory:
gitweb.txt: change "folder" to "directory"
gitignore.txt: change "folder" to "directory"
git-multi-pack-index.txt: change "folder" to "directory"
Drop "git sparse-index" from the list of common commands.
* sg/sparse-index-not-that-common-a-command:
command-list.txt: remove 'sparse-index' from main help
Update "git archive" documentation and give explicit mention on the
compression level for both zip and tar.gz format.
* bs/archive-doc-compression-level:
archive: describe compression level option
Message regression fix.
* ks/submodule-add-message-fix:
submodule: drop unused sm_name parameter from append_fetch_remotes()
submodule--helper: fix incorrect newlines in an error message
Leakfix.
* ab/plug-random-leaks:
reflog: free() ref given to us by dwim_log()
submodule--helper: fix small memory leaks
clone: fix a memory leak of the "git_dir" variable
grep: fix a "path_list" memory leak
grep: use object_array_clear() in cmd_grep()
grep: prefer "struct grep_opt" over its "void *" equivalent
"git for-each-ref" family of commands were leaking the ref_sorting
instances that hold sorting keys specified by the user; this has
been corrected.
* ab/ref-filter-leakfix:
branch: use ref_sorting_release()
ref-filter API user: add and use a ref_sorting_release()
tag: use a "goto cleanup" pattern, leak less memory
"git push" client talking to an HTTP server did not diagnose the
lack of the final status report from the other side correctly,
which has been corrected.
* jk/http-push-status-fix:
transport-helper: recognize "expecting report" error from send-pack
send-pack: complain about "expecting report" with --helper-status
A new feature has been added to abort early in the test framework.
* ab/test-bail:
test-lib.sh: use "Bail out!" syntax on bad SANITIZE=leak use
test-lib.sh: de-duplicate error() teardown code
We already note that we may produce invalid output when we skip calling
iconv() altogether. But we may also do so if iconv() fails, and we have
no good alternative. Let's document this to avoid surprising users.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit fd680bc5 (logmsg_reencode(): warn when iconv()
fails, 2021-08-27). Throwing a warning for each and every commit
that gets reencoded, without allowing a way to squelch, would make
it unpleasant for folks who have to deal with an ancient part of the
history in an old project that used wrong encoding in the commits.
This documents "--verify" option of the commands. It can be used to re-enable
the hooks disabled by an earlier "--no-verify" in command-line.
Signed-off-by: Alexander Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier, we made sure that "git pull --ff-only" (and "git -c
pull.ff=only pull") errors out when our current HEAD is not an
ancestor of the tip of the history we are merging, but the condition
to trigger the error was implemented incorrectly.
Imagine you forked from a remote branch, built your history on top
of it, and then attempted to pull from them again. If they have not
made any update in the meantime, our current HEAD is obviously not
their ancestor, and this new error triggers.
Without the --ff-only option, we just report that there is no need
to pull; we did the same historically with --ff-only, too.
Make sure we do not fail with the recently added check to restore
the historical behaviour.
Reported-by: Kenneth Arnold <ka37@calvin.edu>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit ddfe900612 (test-lib-functions: move function to lib-bitmap.sh,
2021-02-09) meant to include lib-bitmap.sh in t5310, but also includes
lib-bundle.sh. Yet we don't use any of its functions, nor have anything
to do with bundles. This is probably just a typo/copy-paste error, as
lib-bundle.sh was added (correctly) to other scripts in the same series.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option was incorrectly auto-translated to "--no-verify-signatures",
which causes the unexpected effect of the hook being called.
And an even more unexpected effect of disabling verification of signatures.
The manual page describes the option to behave same as the similarly
named option of "git merge", which seems to be the original intention
of this option in the "pull" command.
Signed-off-by: Alexander Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the basic `[`/`test` command, the string equality operator is a
single `=`. The `==` operator is only available in `[[`, which is a
bash-ism also supported by zsh.
This mix-up was causing the following completion error in zsh:
> __git_ls_files_helper:7: = not found
(That message refers to the extraneous symbol in `==` ← `=`).
This updates that comparison to use a single `=` inside the
basic `[ … ]` test conditional.
Although this fix is inconsistent with the other comparisons in this
file, which use `[[ … == … ]]`, and the two expressions are functionally
identical in this context, that approach was rejected due to a
preference for `[`.
Signed-off-by: Robert Estelle <robertestelle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These three commands recently learned to avoid updating paths outside
the sparse checkout even if they are missing the SKIP_WORKTREE bit. This
is done using path_in_sparse_checkout(), which checks whether a given
path matches the current list of sparsity rules, similar to what
clear_ce_flags() does when we run "git sparse checkout init" or "git
sparse-checkout reapply". However, clear_ce_flags() uses a recursive
approach, applying the match results from parent directories on paths
that get the UNDECIDED result, whereas path_in_sparse_checkout() only
attempts to match the full path and immediately considers UNDECIDED as
NOT_MATCHED. This makes the function miss matches with leading
directories. For example, if the user has the sparsity patterns "!/a"
and "b/", add, rm, and mv will fail to update the path "a/b/c" and end
up displaying a warning about it being outside the sparse checkout even
though it isn't. This problem only occurs in full pattern mode as the
pattern matching functions never return UNDECIDED for cone mode.
To fix this, replicate the recursive behavior of clear_ce_flags() in
path_in_sparse_checkout(), falling back to the parent directory match
when a path gets the UNDECIDED result. (If this turns out to be too
expensive in some cases, we may want to later add some form of caching
to accelerate multiple queries within the same directory. This is not
implemented in this patch, though.) Also add two tests for each affected
command (add, rm, and mv) to check that they behave correctly with the
recursive pattern matching. The first test would previously fail without
this patch while the second already succeeded. It is added mostly to
make sure that we are not breaking the existing pattern matching for
directories that are really sparse, and also as a protection against any
future regressions.
Two other existing tests had to be changed as well: one test in t3602
checks that "git rm -r <dir>" won't remove sparse entries, but it didn't
allow the non-sparse entries inside <dir> to be removed. The other one,
in t7002, tested that "git mv" would correctly display a warning message
for sparse paths, but it accidentally expected the message to include
two non-sparse paths as well.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an "and" to separate the two halves of the first sentence of the
paragraph more. Add a comma to similarly separate the two halves of the
second sentence a bit better. Add a period at the end of the paragraph.
Further down in the file, add the missing "be" in "must be accompanied".
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 8650c6298c (doc lint: make "lint-docs" non-.PHONY, 2021-10-15), we
put the output for gitlink linter into .build/lint-docs/gitlink. There
are order-only dependencies to create the sequence of subdirs like:
.build/lint-docs: | .build
$(QUIET)mkdir $@
.build/lint-docs/gitlink: | .build/lint-docs
$(QUIET)mkdir $@
where each level has to depend on the prior one (since the parent
directory must exist for us to create something inside it). But the
"howto" and "config" subdirectories of gitlink have the wrong
dependency; they depend on "lint-docs", not "lint-docs/gitlink".
This usually works out, because the LINT_DOCS_GITLINK targets which
depend on "gitlink/howto" also depend on just "gitlink", so the
directory gets created anyway. But since we haven't given make an
explicit ordering, things can racily happen out of order.
If you stick a "sleep 1" in the rule to build "gitlink" like this:
## Lint: gitlink
.build/lint-docs/gitlink: | .build/lint-docs
- $(QUIET)mkdir $@
+ $(QUIET)sleep 1 && mkdir $@
then "make clean; make lint-docs" will fail reliably. Or you can see it
as-is just by building the directory in isolation:
$ make clean
[...]
$ make .build/lint-docs/gitlink/howto
GEN mergetools-list.made
GEN cmd-list.made
GEN doc.dep
SUBDIR ../
make[1]: 'GIT-VERSION-FILE' is up to date.
SUBDIR ../
make[1]: 'GIT-VERSION-FILE' is up to date.
mkdir: cannot create directory ‘.build/lint-docs/gitlink/howto’: No such file or directory
make: *** [Makefile:476: .build/lint-docs/gitlink/howto] Error 1
The fix is easy: we just need to depend on the correct parent directory.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit c21fb4676f (submodule--helper: fix incorrect newlines in an error
message, 2021-10-23) accidentally added a new, unused parameter while
changing the name and signature of show_fetch_remotes() to
append_fetch_remotes(). We can drop this to keep things simpler (and
satisfy -Wunused-parameter).
The error is likely because c21fb4676f is fixing a problem from
8c8195e9c3 (submodule--helper: introduce add-clone subcommand,
2021-07-10). An earlier iteration of that second commit introduced the
same unused parameter (though it was dropped before it finally made it
to 'next'), and the fix on top accidentally carried forward the extra
parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test clean-up.
* ab/test-lib-diff-cleanup:
tests: stop using top-level "README" and "COPYING" files
"lib-diff" tests: make "README" and "COPYING" test data smaller
The codepath to write a new version of .midx multi-pack index files
has learned to release the mmaped memory holding the current
version of .midx before removing them from the disk, as some
platforms do not allow removal of a file that still has mapping.
* tb/fix-midx-rename-while-mapped:
midx.c: guard against commit_lock_file() failures
midx.c: lookup MIDX by object directory during repack
midx.c: lookup MIDX by object directory during expire
midx.c: extract MIDX lookup by object_dir
Improve test framework around unwritable directories.
* ab/test-cleanly-recreate-trash-directory:
test-lib.sh: try to re-chmod & retry on failed trash removal
Bunch of tests are marked as "passing leak check".
* ab/mark-leak-free-tests-more:
merge: add missing strbuf_release()
ls-files: add missing string_list_clear()
ls-files: fix a trivial dir_clear() leak
tests: fix test-oid-array leak, test in SANITIZE=leak
tests: fix a memory leak in test-oidtree.c
tests: fix a memory leak in test-parse-options.c
tests: fix a memory leak in test-prio-queue.c
Bunch of tests are marked as "passing leak check".
* ab/mark-leak-free-tests:
leak tests: mark some misc tests as passing with SANITIZE=leak
leak tests: mark various "generic" tests as passing with SANITIZE=leak
leak tests: mark some read-tree tests as passing with SANITIZE=leak
leak tests: mark some ls-files tests as passing with SANITIZE=leak
leak tests: mark all checkout-index tests as passing with SANITIZE=leak
leak tests: mark all trace2 tests as passing with SANITIZE=leak
leak tests: mark all ls-tree tests as passing with SANITIZE=leak
leak tests: run various "test-tool" tests in t00*.sh SANITIZE=leak
leak tests: run various built-in tests in t00*.sh SANITIZE=leak
Random changes to parse-options implementation.
* ab/parse-options-cleanup:
parse-options: change OPT_{SHORT,UNSET} to an enum
parse-options tests: test optname() output
parse-options.[ch]: make opt{bug,name}() "static"
commit-graph: stop using optname()
parse-options.c: move optname() earlier in the file
parse-options.h: make the "flags" in "struct option" an enum
parse-options.c: use exhaustive "case" arms for "enum parse_opt_result"
parse-options.[ch]: consistently use "enum parse_opt_result"
parse-options.[ch]: consistently use "enum parse_opt_flags"
parse-options.h: move PARSE_OPT_SHELL_EVAL between enums
Userdiff patterns for the C++ language has been updated.
* js/userdiff-cpp:
userdiff-cpp: back out the digit-separators in numbers
userdiff-cpp: learn the C++ spaceship operator
userdiff-cpp: permit the digit-separating single-quote in numbers
userdiff-cpp: prepare test cases with yet unsupported features
userdiff-cpp: tighten word regex
t4034: add tests showing problematic cpp tokenizations
t4034/cpp: actually test that operator tokens are not split
The xxdiff difftool backend can exit with status 128, which the
difftool-helper that launches the backend takes as a significant
failure, when it is not significant at all. Work it around.
* da/mergetools-special-case-xxdiff-exit-128:
mergetools/xxdiff: prevent segfaults from stopping difftool
Fix-up for the other topic already in 'next'.
* fs/ssh-signing-fix:
gpg-interface: fix leak of strbufs in get_ssh_key_fingerprint()
gpg-interface: fix leak of "line" in parse_ssh_output()
ssh signing: clarify trustlevel usage in docs
ssh signing: fmt-merge-msg tests & config parse
Use ssh public crypto for object and push-cert signing.
* fs/ssh-signing:
ssh signing: test that gpg fails for unknown keys
ssh signing: tests for logs, tags & push certs
ssh signing: duplicate t7510 tests for commits
ssh signing: verify signatures using ssh-keygen
ssh signing: provide a textual signing_key_id
ssh signing: retrieve a default key from ssh-agent
ssh signing: add ssh key format and signing code
ssh signing: add test prereqs
ssh signing: preliminary refactoring and clean-up
Recent sparse-index addition, namely any use of index_name_pos(),
can expand sparse index entries and breaks any code that walks
cache-tree or existing index entries. One such instance of such a
breakage has been corrected.
* pw/sparse-cache-tree-verify-fix:
t1092: run "rebase --apply" without "-q" in testing
sparse index: fix use-after-free bug in cache_tree_verify()
"git commit" gave duplicated error message when the object store
was unwritable, which has been corrected.
* ab/fix-commit-error-message-upon-unwritable-object-store:
commit: fix duplication regression in permission error output
unwritable tests: assert exact error output
Avoid performance measurements from getting ruined by gc and other
housekeeping pauses interfering in the middle.
* rs/disable-gc-during-perf-tests:
perf: disable automatic housekeeping
Follow through the work to use the repo interface to access
submodule objects in-process, instead of abusing the alternate
object database interface.
* jt/no-abuse-alternate-odb-for-submodules:
submodule: trace adding submodule ODB as alternate
submodule: pass repo to check_has_commit()
object-file: only register submodule ODB if needed
merge-{ort,recursive}: remove add_submodule_odb()
refs: peeling non-the_repository iterators is BUG
refs: teach arbitrary repo support to iterators
refs: plumb repo into ref stores
Leakfix.
* ab/unpack-trees-leakfix:
sequencer: fix a memory leak in do_reset()
sequencer: add a "goto cleanup" to do_reset()
unpack-trees: don't leak memory in verify_clean_subdirectory()
"git fsck" has been taught to report mismatch between expected and
actual types of an object better.
* ab/fsck-unexpected-type:
fsck: report invalid object type-path combinations
fsck: don't hard die on invalid object types
object-file.c: stop dying in parse_loose_header()
object-file.c: return ULHR_TOO_LONG on "header too long"
object-file.c: use "enum" return type for unpack_loose_header()
object-file.c: simplify unpack_loose_short_header()
object-file.c: make parse_loose_header_extended() public
object-file.c: return -1, not "status" from unpack_loose_header()
object-file.c: don't set "typep" when returning non-zero
cat-file tests: test for current --allow-unknown-type behavior
cat-file tests: add corrupt loose object test
cat-file tests: test for missing/bogus object with -t, -s and -p
cat-file tests: move bogus_* variable declarations earlier
fsck tests: test for garbage appended to a loose object
fsck tests: test current hash/type mismatch behavior
fsck tests: refactor one test to use a sub-repo
fsck tests: add test for fsck-ing an unknown type
We prefer "directory" over "folder" when discussing the file system
concept. Change this instance for consistency.
After this, the only hits for '\<folder\>' in Documentation/ relate to
IMAP folders.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We prefer "directory" over "folder" when discussing the file system
concept. Change this instance for consistency -- indeed, even within
this paragraph, we already use "directory".
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We prefer "directory" over "folder" when discussing the file system
concept. In all of our documentation, these are the only spots where we
refer to the `.git` directory as a folder. Switch to "directory", and
while doing so, add backticks to the ".git" filename to set it in
monospace.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Describe the only <extra> option in `git archive`, that is the compression
level option. Previously this option is only described for zip backend;
add description also for tar backend.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since 'git sparse-checkout' was introduced [1] it is included in
'git --help' in the section "work on the current change" along with
the commands 'add', 'mv', 'restore', and 'rm'. It clearly doesn't
belong to that group, moreover it can't be considered such a common
command to belong to 'git --help' in the first place, so remove it
from there.
[1] 94c0956b60 (sparse-checkout: create builtin with 'list'
subcommand, 2019-11-21)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Try to make reading the computation of the gzipped flag a bit more
natural.
Signed-off-by: Robin Dupret <robin.dupret@hey.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The implementation of digit-separating single-quotes introduced a
note-worthy regression: the change of a character literal with a
digit would splice the digit and the closing single-quote. For
example, the change from 'a' to '2' is now tokenized as
'[-a'-]{+2'+} instead of '[-a-]{+2+}'.
The options to fix the regression are:
- Tighten the regular expression such that the single-quote can only
occur between digits (that would match the official syntax).
- Remove support for digit separators.
I chose to remove support, because
- I have not seen a lot of code make use of digit separators.
- If code does use digit separators, then the numbers are typically
long. If a change in one of the segments occurs, it is actually
better visible if only that segment is highlighted as the word
that changed instead of the whole long number.
This choice does introduce another minor regression, though, which
is highlighted in the test case: when a change occurs in the second
or later segment of a hexadecimal number where the segment begins
with a digit, but also has letters, the segment is mistaken as
consisting of a number and an identifier. I can live with that.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A refactoring[1] done as part of the recent conversion of
'git submodule add' to builtin, changed the error message
shown when a Git directory already exists locally for a submodule
name. Before the refactoring, the error used to appear like so:
--- START OF OUTPUT ---
$ git submodule add ../sub/ subm
A git directory for 'subm' is found locally with remote(s):
origin /me/git-repos-for-test/sub
If you want to reuse this local git directory instead of cloning again from
/me/git-repos-for-test/sub
use the '--force' option. If the local git directory is not the correct repo
or you are unsure what this means choose another name with the '--name' option.
--- END OF OUTPUT ---
After the refactoring the error started appearing like so:
--- START OF OUTPUT ---
$ git submodule add ../sub/ subm
A git directory for 'subm' is found locally with remote(s): origin /me/git-repos-for-test/sub
fatal: If you want to reuse this local git directory instead of cloning again from
/me/git-repos-for-test/sub
use the '--force' option. If the local git directory is not the correct repo
or if you are unsure what this means, choose another name with the '--name' option.
--- END OF OUTPUT ---
As one could observe the remote information is printed along with the
first line rather than on its own line. Also, there's an additional
newline following output.
Make the error message consistent with the error message that used to be
printed before the refactoring.
This also moves the 'fatal:' prefix that appears in the middle of the
error message to the first line as it would more appropriate to have
it in the first line. The output after the change would look like:
--- START OF OUTPUT ---
$ git submodule add ../sub/ subm
fatal: A git directory for 'subm' is found locally with remote(s):
origin /me/git-repos-for-test/sub
If you want to reuse this local git directory instead of cloning again from
/me/git-repos-for-test/sub
use the '--force' option. If the local git directory is not the correct repo
or you are unsure what this means choose another name with the '--name' option.
--- END OF OUTPUT ---
[1]: https://lore.kernel.org/git/20210710074801.19917-5-raykar.ath@gmail.com/#t
Signed-off-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description section for the command mentions config and reflog
are moved or copied by these options, but the description for these
options did not. Make them match.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
What --base=auto tells format-patch is to compute the base commit
itself, using the tracking information. It does not make anything
track anything.
Tighten the phrasing so that it won't be copied and pasted to other
places.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When dwim_log() returns the "ref" is always ether NULL or an
xstrdup()'d string.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a missing strbuf_release() and a clear_pathspec() to the
submodule--helper.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At this point in cmd_clone the "git_dir" is always either an
xstrdup()'d string, or something we got from mkpathdup(). Let's free()
it before we clobber it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Free the "path_list" used in builtin/grep.c, it was declared as
STRING_LIST_INIT_NODUP, let's change it to a STRING_LIST_INIT_DUP
since an early user in cmd_grep() appends a string passed via
parse-options.c to it, which needs to be duplicated.
Let's then convert the remaining callers to use
string_list_append_nodup() instead, allowing us to free the list.
This makes all the tests in t7811-grep-open.sh pass, 6/10 would fail
before this change. The only remaining failure would have been due to
a stray "git checkout" (which still leaks memory). In this case we can
use a "git reset --hard" instead, so let's do that, and move the
test_when_finished() above the code that would modify the relevant
file.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Free the "struct object_array" before exiting. This makes grep tests
(e.g. "t7815-grep-binary.sh") a bit happer under SANITIZE=leak.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stylistically fix up code added in bfac23d953 (grep: Fix two memory
leaks, 2010-01-30). We usually don't use the "arg" at all once we've
casted it to the struct we want, let's not do that here when we're
freeing it. Perhaps it was thought that a cast to "void *" would
otherwise be needed?
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in the error() path in handle_path_include(), this
allows us to run t1305-config-include.sh under SANITIZE=leak,
previously 4 tests there would fail. This fixes up a leak in
9b25a0b52e (config: add include directive, 2012-02-06).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "sparse" target needed the GIT-CFLAGS dependency before my
c234e8a0ec (Makefile: make the "sparse" target non-.PHONY,
2021-09-23), but since then it depends on the corresponding *.o files,
which in turn depend on the correct header files, as well as on
GIT-CFLAGS. There's no need to re-state this dependency here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove messages that were last used by the code removed in
a74b35081c (rebase: drop support for `--preserve-merges`,
2021-09-07).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "eval_ngettext()" function has been orphaned since its last user
was removed in a74b35081c (rebase: drop support for
`--preserve-merges`, 2021-09-07).
See b8fc9e43a7 (i18n: rebase-interactive: mark here-doc strings for
translation, 2016-06-17) for the commit that added these
eval_ngettext() wrappers.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use a ref_sorting_release() in branch.c to free the memory from the
ref_sorting_options(). This plugs the final in-tree memory leak of
that API.
In the preceding commit the "sorting" variable was left in the
cmd_branch() scope, even though that wasn't needed anymore. Move it to
the "else if (list)" scope instead. We can also move the "struct
string_list" only used for that branch to be declared in that block
That "struct ref_sorting" does not need to be "static" (and isn't
re-used). The "ref_sorting_options()" will return a valid one, we
don't need to make it "static" to have it zero'd out. That it was
static was another artifact of the pre-image of the preceding commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a ref_sorting_release() and use it for some of the current API
users, the ref_sorting_default() function and its siblings will do a
malloc() which wasn't being free'd previously.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change cmd_tag() to free its "struct strbuf"'s instead of using an
UNLEAK() pattern. This changes code added in 886e1084d7 (builtin/:
add UNLEAKs, 2017-10-01).
As shown in the context of the declaration of the "struct
msg_arg" (which I'm changing to use a designated initializer while at
it, and to show the context in this change), that struct is just a
thin wrapper around an int and "struct strbuf".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in 8c32856133 (blame: document --color-* options,
2021-10-08), which added an extra newline before the "+" syntax.
The "Documentation/doc-diff HEAD~ HEAD" output with this applied is:
[...]
@@ -1815,13 +1815,13 @@ CONFIGURATION FILE
specified colors if the line was introduced before the given
timestamp, overwriting older timestamped colors.
- + Instead of an absolute timestamp relative timestamps work as well,
- e.g. 2.weeks.ago is valid to address anything older than 2 weeks.
+ Instead of an absolute timestamp relative timestamps work as well,
+ e.g. 2.weeks.ago is valid to address anything older than 2 weeks.
- + It defaults to blue,12 month ago,white,1 month ago,red, which colors
- everything older than one year blue, recent changes between one month
- and one year old are kept white, and lines introduced within the last
- month are colored red.
+ It defaults to blue,12 month ago,white,1 month ago,red, which
+ colors everything older than one year blue, recent changes between
+ one month and one year old are kept white, and lines introduced
+ within the last month are colored red.
color.blame.repeatedLines
Use the specified color to colorize line annotations for git blame
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The PATH used in CI job may be too wide and let incompatible dlls
to be grabbed, which can cause the build&test to fail. Tighten it.
* js/windows-ci-path-fix:
ci(windows): ensure that we do not pick up random executables
The "--color-lines" and "--color-by-age" options of "git blame"
have been missing, which are now documented.
* bs/doc-blame-color-lines:
blame: document --color-* options
blame: describe default output format
Recent sparse-index work broke safety against attempts to add paths
with trailing slashes to the index, which has been corrected.
* rs/make-verify-path-really-verify-again:
read-cache: let verify_path() reject trailing dir separators again
read-cache: add verify_path_internal()
t3905: show failure to ignore sub-repo
"git cat-file --batch" with the "--batch-all-objects" option is
supposed to iterate over all the objects found in a repository, but
it used to translate these object names using the replace mechanism,
which defeats the point of enumerating all objects in the repository.
This has been corrected.
* jk/cat-file-batch-all-wo-replace:
cat-file: use packed_object_info() for --batch-all-objects
cat-file: split ordered/unordered batch-all-objects callbacks
cat-file: disable refs/replace with --batch-all-objects
cat-file: mention --unordered along with --batch-all-objects
t1006: clean up broken objects
An editor session launched during a Git operation (e.g. during 'git
commit') can leave the terminal in a funny state. The code path
has updated to save the terminal state before, and restore it
after, it spawns an editor.
* cm/save-restore-terminal:
editor: save and reset terminal after calling EDITOR
terminal: teach git how to save/restore its terminal settings
Code clean-up.
* ab/designated-initializers-more:
builtin/remote.c: add and use SHOW_INFO_INIT
builtin/remote.c: add and use a REF_STATES_INIT
urlmatch.[ch]: add and use URLMATCH_CONFIG_INIT
builtin/blame.c: refactor commit_info_init() to COMMIT_INFO_INIT macro
daemon.c: refactor hostinfo_init() to HOSTINFO_INIT macro
"git repack" has been taught to generate multi-pack reachability
bitmaps.
* tb/repack-write-midx:
test-read-midx: fix leak of bitmap_index struct
builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
builtin/repack.c: make largest pack preferred
builtin/repack.c: support writing a MIDX while repacking
builtin/repack.c: extract showing progress to a variable
builtin/repack.c: rename variables that deal with non-kept packs
builtin/repack.c: keep track of existing packs unconditionally
midx: preliminary support for `--refs-snapshot`
builtin/multi-pack-index.c: support `--stdin-packs` mode
midx: expose `write_midx_file_only()` publicly
The "--preserve-merges" option of "git rebase" has been removed.
* js/retire-preserve-merges:
sequencer: restrict scope of a formerly public function
rebase: remove a no-longer-used function
rebase: stop mentioning the -p option in comments
rebase: remove obsolete code comment
rebase: drop the internal `rebase--interactive` command
git-svn: drop support for `--preserve-merges`
rebase: drop support for `--preserve-merges`
pull: remove support for `--rebase=preserve`
tests: stop testing `git rebase --preserve-merges`
remote: warn about unhandled branch.<name>.rebase values
t5520: do not use `pull.rebase=preserve`
The mergesort implementation used to sort linked list has been
optimized.
* rs/mergesort:
test-mergesort: use repeatable random numbers
mergesort: use ranks stack
p0071: test performance of llist_mergesort()
p0071: measure sorting of already sorted and reversed files
test-mergesort: add unriffle_skewed mode
test-mergesort: add unriffle mode
test-mergesort: add generate subcommand
test-mergesort: add test subcommand
test-mergesort: add sort subcommand
test-mergesort: use strbuf_getline()
When a transport helper pushes via send-pack, it passes --helper-status
to get a machine-readable status back for each ref. The previous commit
taught the send-pack code to hand back "error expecting report" if the
server did not send us the proper ref-status. And that's enough to cause
us to recognize that an error occurred for the ref and print something
sensible in our final status table.
But we do interpret these messages on the remote-helper side to turn
them back into REF_STATUS_* enum values. Recognizing this token to turn
it back into REF_STATUS_EXPECTING_REPORT has two advantages:
1. We now print exactly the same message in the human-readable (and
machine-readable --porcelain) output for this situation whether the
transport went through a helper (e.g., http) or not (e.g., ssh).
2. If any code in the helper really cares about distinguishing
EXPECT_REPORT from more generic error conditions, it could now do
so. I didn't find any, so this is mostly future-proofing.
So this is mostly cosmetic for now, but it seems like the
least-surprising thing for the transport-helper code to be doing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing to a server which erroneously omits the final ref-status
report, the client side should complain about the refs for which we
didn't receive the status (because we can't just assume they were
updated). This works over most transports like ssh, but for http we'll
print a very misleading "Everything up-to-date".
It works for ssh because send-pack internally sets the status of each
ref to REF_STATUS_EXPECTING_REPORT, and then if the server doesn't tell
us about a particular ref, it will stay at that value. When we print the
final status table, we'll see that we're still on EXPECTING_REPORT and
complain then.
But for http, we go through remote-curl, which invokes send-pack with
"--stateless-rpc --helper-status". The latter option causes send-pack to
return a machine-readable list of ref statuses to the remote helper. But
ever since its inception in de1a2fdd38 (Smart push over HTTP: client
side, 2009-10-30), the send-pack code has simply omitted mention of any
ref which ended up in EXPECTING_REPORT.
In the remote helper, we then take the absence of any status report
from send-pack to mean that the ref was not even something we tried to
send, and thus it prints "Everything up-to-date". Fortunately it does
detect the eventual non-zero exit from send-pack, and propagates that in
its own non-zero exit code. So at least a careful script invoking "git
push" would notice the failure. But sending the misleading message on
stderr is certainly confusing for humans (not to mention the
machine-readable "push --porcelain" output, though again, any careful
script should be checking the exit code from push, too).
Nobody seems to have noticed because the server in this instance has to
be misbehaving: it has promised to support the ref-status capability
(otherwise the client will not set EXPECTING_REPORT at all), but didn't
send us any. If the connection were simply cut, then send-pack would
complain about getting EOF while trying to read the status. But if the
server actually sends a flush packet (i.e., saying "now you have all of
the ref statuses" without actually sending any), then the client ends up
in this confused situation.
The fix is simple: we should return an error message from "send-pack
--helper-status", just like we would for any other error per-ref error
condition (in the test I included, the server simply omits all ref
status responses, but a more insidious version of this would skip only
some of them).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We read stdout from gpg into a strbuf, then split it into a list of
strbufs, pull out one element, and return it. But we don't free either
the original stdout buffer, nor the list returned from strbuf_split().
This patch fixes both. Note that we have to detach the returned string
from its strbuf before calling strbuf_list_free(), as that would
otherwise throw it away.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We xmemdupz() this buffer, but never free it. Let's do so. We'll use a
cleanup label, since there are multiple exits from the function.
Note that it was also declared a "const char *". We could switch that to
"char *" to indicate that it's allocated, but that make it awkward to
use with skip_prefix(). So instead, we'll introduce an extra non-const
pointer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We run a few operations and make sure they produce identical results
with and without sparse-index; the version we merged to the "next"
branch used the "-q" option to work around a breakage caused by a
version used at Microsoft with some unreleased changes, but since
we would want to make sure the commands produce identical results,
including reports given to the output that lists which commits were
picked, use of "-q" loses too much interesting information.
Let's drop "-q" from the command invocation and revisit the issue
when the problematic changes are upstreamed.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/gc.c has two ways of checking if multi-pack-index is enabled:
- git_config_get_bool() in incremental_repack_auto_condition()
- the_repository->settings.core_multi_pack_index in
maintenance_task_incremental_repack()
The two implementations have existed since the incremental-repack task
was introduced in e841a79a13 (maintenance: add incremental-repack auto
condition, 2020-09-25). These two values can diverge because
prepare_repo_settings() enables the feature in the_repository->settings
by default.
In the case where core.multiPackIndex is not set in the config, the auto
condition would fail, causing the incremental-repack task to not be
run. Because we always want to consider the default values, we should
always use the_repository->settings.
Standardize on using the_repository->settings.core_multi_pack_index to
check if multi-pack-index is enabled.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Like the previous commit, change fsck to check the
"core_multi_pack_index" variable set in "repo-settings.c" instead of
reading the "core.multiPackIndex" config variable. This fixes a bug
where we wouldn't verify midx if the config key was missing. This bug
was introduced in 18e449f86b (midx: enable core.multiPackIndex by
default, 2020-09-25) where core.multiPackIndex was turned on by default.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change fsck to check the "core_commit_graph" variable set in
"repo-settings.c" instead of reading the "core.commitGraph" variable.
This fixes a bug where we wouldn't verify the commit-graph if the
config key was missing. This bug was introduced in
31b1de6a09 (commit-graph: turn on commit-graph by default, 2019-08-13),
where core.commitGraph was turned on by default.
Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
commit-graph as expected for the 3 values of core.commitGraph. Also,
disable GIT_TEST_COMMIT_GRAPH in t/t0410-partial-clone.sh because some
test cases use fsck in ways that assume that commit-graph checking is
disabled.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function was added in 4981fe750b (pkt-line: share
buffer/descriptor reading implementation, 2013-02-23), but in
01f9ec64c8 (Use packet_reader instead of packet_read_line,
2018-12-29) the code that was using it was removed.
Since it's being removed we can in turn remove the "src" and "src_len"
arguments to packet_read(), all the remaining users just passed a
NULL/NULL pair to it.
That function is only a thin wrapper for packet_read_with_status()
which still needs those arguments, but for the thin packet_read()
convenience wrapper we can do away with it for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function was added in f1f4d8acf4 (pkt-line: add
packet_buf_write_len function, 2018-03-15) for use in
0f1dc53f45 (remote-curl: implement stateless-connect command,
2018-03-15).
In a97d00799a (remote-curl: use post_rpc() for protocol v2 also,
2019-02-21) that only user of it went away, let's remove it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a MIDX, we atomically move the new MIDX into place via
commit_lock_file(), but do not check to see if that call was successful.
Make sure that we do check in order to prevent us from incorrectly
reporting that we wrote a new MIDX if we actually encountered an error.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply similar treatment as in the last commit to the MIDX `repack`
operation.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before a new MIDX can be written, expire_midx_packs() first loads the
existing MIDX, figures out which packs can be expired, and then writes a
new MIDX based on that information.
In order to load the existing MIDX, it uses load_multi_pack_index(),
which mmaps the multi-pack-index file, but does not store the resulting
`struct multi_pack_index *` in the object store.
write_midx_internal() also needs to open the existing MIDX, and it does
so by iterating the results of get_multi_pack_index(), so that it reuses
the same pointer held by the object store. But before it can move the
new MIDX into place, it close_object_store() to munmap() the
multi-pack-index file to accommodate platforms like Windows which don't
allow overwriting files which are memory mapped.
That's where things get weird. Since expire_midx_packs has its own
*separate* memory mapped copy of the MIDX, the MIDX file is still memory
mapped! Interestingly, this doesn't seem to cause a problem in our
tests. (I believe that this has much more to do with my own lack of
familiarity with Windows than it does a lack of coverage in our tests).
In any case, we can side-step the whole issue by teaching
expire_midx_packs() to use the `struct multi_pack_index` pointer it
found via the object store instead of maintain its own copy. That way,
when write_midx_internal() calls `close_object_store()`, we know that
there are no memory mapped copies of the MIDX laying around.
A couple of other small notes about this patch:
- As far as I can tell, passing `local == 1` to the call to
load_multi_pack_index() was an error, since object_dir could be an
alternate. But it doesn't matter, since even though we write
`m->local = 1`, we never read that field back later on.
- Setting `m = NULL` after write_midx_internal() was likely to prevent
a double-free back from when that function took a `struct
multi_pack_index *` that it called close_midx() on itself. We can
rely on write_midx_internal() to call that for us now.
Finally, this enforces the same "the value of --object-dir must be the
local object store, or an alternate" rule from f57a739691 (midx: avoid
opening multiple MIDXs when writing, 2021-09-01) to the `expire`
sub-command, too.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first thing that write_midx_internal() does is load the MIDX
corresponding to the given object directory, if one is present.
Prepare for other functions in midx.c to do the same thing by extracting
that operation out to a small helper function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tb/repack-write-midx:
test-read-midx: fix leak of bitmap_index struct
builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
builtin/repack.c: make largest pack preferred
builtin/repack.c: support writing a MIDX while repacking
builtin/repack.c: extract showing progress to a variable
builtin/repack.c: rename variables that deal with non-kept packs
builtin/repack.c: keep track of existing packs unconditionally
midx: preliminary support for `--refs-snapshot`
builtin/multi-pack-index.c: support `--stdin-packs` mode
midx: expose `write_midx_file_only()` publicly
If we attempt to grep non-ascii log message text with an ascii pattern, we
run into the following issue:
$ git log --color --author='.var.*Bjar' -1 origin/master | grep ^Author
grep: (standard input): binary file matches
So, to fix this teach the grep code to use PCRE2_UTF, as long as the log
output is encoded in UTF-8.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Speed up the "lint-docs" target by making it non-.PHONY. Similar to my
c234e8a0ec (Makefile: make the "sparse" target non-.PHONY,
2021-09-23). We'll now create empty files corresponding to a
dependency graph for each of these lint scripts.
This speeds things up a bit[1], and makes the output correspond to any
in-tree changes we have:
$ touch git-add.txt; make lint-docs; make lint-docs
GEN cmd-list.made
GEN doc.dep
LINT GITLINK git-add.txt
LINT MAN END git-add.txt
LINT MAN SEC git-add.txt
make: Nothing to be done for 'lint-docs'.
As with the "sparse" target changes this has a hard dependency on the
use of ".DELETE_ON_ERROR" in the Makefile, added here in
db10fc6c09 (doc: simplify Makefile using .DELETE_ON_ERROR,
2021-05-21). This method also depends on the output for us emitting
any errors on STDERR (fixed in a preceding commit), as well us these
scripts exiting with non-zero on any errors (which they were already
doing).
1.
$ git show HEAD~:Documentation/Makefile >Makefile.old
$ hyperfine --warmup 2 -L f ",.old" 'make -j1 -f Makefile{f} lint-docs'
Benchmark #1: make -j1 -f Makefile lint-docs
Time (mean ± σ): 60.8 ms ± 1.4 ms [User: 58.7 ms, System: 2.5 ms]
Range (min … max): 58.9 ms … 64.0 ms 48 runs
Benchmark #2: make -j1 -f Makefile.old lint-docs
Time (mean ± σ): 84.0 ms ± 1.5 ms [User: 78.6 ms, System: 5.7 ms]
Range (min … max): 81.8 ms … 87.8 ms 35 runs
Summary
'make -j1 -f Makefile lint-docs' ran
1.38 ± 0.04 times faster than 'make -j1 -f Makefile.old lint-docs'
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the trick we use to speed up the "clean" target to also extend
to the "lint-docs" target. See 54df87555b (Documentation/Makefile:
conditionally include doc.dep, 2020-12-08) for the "clean"
implementation.
The "doc-lint" target only depends on *.txt files, so we don't need to
generate GIT-VERSION-FILE etc. if that's all we're doing. This makes
the "make lint-docs" target more than 2x as fast:
$ git show HEAD~:Documentation/Makefile >Makefile.old
$ hyperfine -L f ",.old" 'make -f Makefile{f} lint-docs'
Benchmark #1: make -f Makefile lint-docs
Time (mean ± σ): 100.2 ms ± 1.3 ms [User: 93.7 ms, System: 6.7 ms]
Range (min … max): 98.4 ms … 103.1 ms 29 runs
Benchmark #2: make -f Makefile.old lint-docs
Time (mean ± σ): 220.0 ms ± 20.0 ms [User: 206.0 ms, System: 18.0 ms]
Range (min … max): 206.6 ms … 267.5 ms 11 runs
Summary
'make -f Makefile lint-docs' ran
2.19 ± 0.20 times faster than 'make -f Makefile.old lint-docs'
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Have all of the scripts invoked by "make check-docs" emit their output
on STDERR. This does not currently matter due to the way we're
invoking them, but will in a subsequent change. It's a good idea to do
this in any case for consistency with other tools we invoke.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the broken "make lint-docs" (or "make check-docs" at the
top-level) target, which has been broken since my cafd9828e8 (doc
lint: lint and fix missing "GIT" end sections, 2021-04-09).
The CI for "seen" is emitting an error about a broken gitlink, but due
to there being 3x scripts chained via ";" instead of "&&" we're not
carrying forward the non-zero exit code.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 459b8d22e5 (tests: do not borrow from COPYING and README from the
real source, 2015-02-15) tests that used "lib-diff.sh" (called
"diff-lib.sh" then) were made to stop relying on the top-level COPYING
file, but we still had other tests that referenced it.
Let's move them over to use the "COPYING_test_data" utility function
introduced in the preceding commit, and in the case of the one test
that needed the "README" file use a ROT 13 version of that "COPYING"
test data. That test added in afd222967c (Extend testing git-mv for
renaming of subdirectories, 2006-07-26) just needs more test data that's not the same as the "COPYING" test data, so a ROT 13 version will do.
This change removes the last references to ../{README,COPYING} in the
test suite.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow-up the change in 459b8d22e5 (tests: do not borrow from COPYING
and README from the real source, 2015-02-15) by not shipping a full
copy of older versions of the top-level "COPYING" and "README" files.
The tests that use them just need the small blurb at the top of
"COPYING" as test data, or mock data that's dissimilar. Let's provide
that with a "COPYING_test_data" function instead.
We're not replacing this with some other generic test
data (e.g. "lorum ipsum") because these tests require test file header
to be the old "COPYING" file. See e.g. "t4003-diff-rename-1.sh" which
changes the file, and then does full "test_cmp" comparisons on the
resulting "git diff" output.
This change only changes tests that used the "lib-diff.sh" library,
but splits up what they need into a new "lib-diff-data.sh". A
subsequent commit will change related tests that were missed in
459b8d22e5.
For the test in "t4008-diff-break-rewrite.sh" the "README" file can go
away in favor of echoing the line "some dissimilar content" to a file
in the one test that needed it.
The point of that test is to start with files "A" and "B", and then
have A be more similar to the state of "B" than to its old version (by
copying over the content from the "COPYING" file). Just comparing the
pre-image of "some dissimilar content" and later a munged version of
the "COPYING" output serves that purpose.
While we're at it get rid of a stray "echo $tree" debugging line added
in 15d061b435 ([PATCH] Fix the way diffcore-rename records unremoved
source., 2005-05-27), and stop calling "hash-object" to get the hash
of an object we've just added to the index. We can instead extract
that information from the index itself with "rev-parse".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the code added in d6538246d3 (commit-graph: not compatible
with replace objects, 2018-08-20) which ignored replace objects in the
"write" command to ignore it in the "verify" command too.
We can just move this assignment to the cmd_commit_graph(), it
dispatches to "write" and "verify", and we're unlikely to ever get a
sub-command that would like to consider replace refs.
This will make tests added in eddc1f556c (mktag tests: test
update-ref and reachable fsck, 2021-06-17) pass in combination with
the "GIT_TEST_COMMIT_GRAPH" mode added in 859fdc0c3c (commit-graph:
define GIT_TEST_COMMIT_GRAPH, 2018-08-29), except that mode is
currently broken (but is being fixed concurrently). See the discussion
starting at [1].
1. https://lore.kernel.org/git/87wnmihswp.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 135a712375 (commit-graph: add --split option to builtin,
2019-06-18) this function was copy/pasted to the split commit-graph
tests, as in the preceding commit we need to fix this to use
&&-chaining, so it won't be hiding errors.
Unlike its sister function in "t5318-commit-graph.sh", which we got
lucky with, this one was hiding a real test failure. A tests added in
c523035cbd (commit-graph: allow cross-alternate chains, 2019-06-18)
has never worked as intended. Unlike most other graph_git_behavior
uses in this file it clones the repository into a sub-directory, so
we'll need to refer to "commits/6" as "origin/commits/6".
It's not easy to simply move the "graph_git_behavior" to the test
above it, since it itself spawns a "test_expect_success". Let's
instead add support to "graph_git_behavior()" and
"graph_git_two_modes()" to pass a "-C" argument to git.
We also need to add a "test -d fork" here, because otherwise we'll
fail on e.g.:
GIT_SKIP_TESTS=t5324.13 ./t5324-split-commit-graph.sh
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The graph_git_two_modes() helper added in 177722b344 (commit:
integrate commit graph with commit parsing, 2018-04-10) didn't
&&-chain its "git commit-graph" invocations, which as can be seen with
SANITIZE=leak will happily mark tests as passing if both of these
commands die, since test_cmp() will be comparing two empty files.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Try to re-chmod the trash directory on startup if we fail to "rm -rf"
it. This fixes problems where the test leaves the trash directory
behind in a bad permission state for whatever reason.
This fixes an interaction between [1] where t0004-unwritable.sh was
made to use "test_when_finished" for cleanup, and [2] which added the
"--immediate" mode. If a test in this file failed when running with
"--immediate" we wouldn't run the "test_when_finished" block, which
re-chmods the ".git/objects" directory (see [1]).
This can be demonstrated as e.g. (output snipped for less verbosity):
$ ./t0004-unwritable.sh --run=3 --immediate
ok 1 # skip setup (--run)
ok 2 # skip write-tree should notice unwritable repository (--run)
not ok 3 - commit should notice unwritable repository
[...]
$ ./t0004-unwritable.sh --run=3 --immediate
rm: cannot remove '[...]/trash directory.t0004-unwritable/.git/objects/info': Permission denied
FATAL: Cannot prepare test area
[...]
Instead of some version of reverting [1] let's make the test-lib.sh
resilient to this edge-case, it will happen due to [1], but also
e.g. if the relevant "test-lib.sh" process is kill -9'd during the
test run. We should try harder to recover in this case. If we fail to
remove the test directory let's retry after (re-)chmod-ing it.
This doesn't need to be guarded by something that's equivalent to
"POSIXPERM" since if we don't support "chmod" we were about to fail
anyway.
Let's also discard any error output from (a possibly nonexisting)
"chmod", we'll fail on the subsequent "rm -rf" anyway, likewise for
the first "rm -rf" invocation, we don't want to get the "cannot
remove" output if we can get around it with the "chmod", but we do
want any error output from the second "rm -rf", in case that doesn't
fix the issue.
The lack of &&-chaining between the "chmod" and "rm -rf" is
intentional, if we fail the first "rm -rf", can't chmod, but then
succeed the second time around that's what we were hoping for. We just
want to nuke the directory, not carry forward every possible error
code or error message.
1. dbda967684 (t0004 (unwritable files): simplify error handling,
2010-09-06)
2. b586744a86 (test: skip clean-up when running under --immediate
mode, 2011-06-27)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode added in
956d2e4639 (tests: add a test mode for SANITIZE=leak, run it in CI,
2021-09-23) to use a TAP "Bail out!" message when exiting. This will
cause the test run to exit immediately under a TAP consumer like
"prove(1)".
See 614fe01521 (test-lib: bail out when "-v" used under "prove",
2016-10-22) for the initial introduction of "Bail out!" to the
--verbose being amended here.
Before this compiling with "SANITIZE=" and running the tests with
"prove(1)" would cause all the tests to be run to the end (output
trimmed for fewer columns):
$ GIT_TEST_PASSING_SANITIZE_LEAK=true make
rm -f -r 'test-results'
*** prove ***
t0000-basic.sh ......... Dubious, test returned 1 (wstat 256, 0x100)
No subtests run
t0001-init.sh .......... Dubious, test returned 1 (wstat 256, 0x100)
No subtests run
[...output where we list every single t[0-9]*.sh file as failing snipped]
Whereas now we'll fail early, like this ("->" line wrapping added):
$ GIT_TEST_PASSING_SANITIZE_LEAK=true make
[...]
t0000-basic.sh ..................................... Bailout called. Further testing stopped:
-> GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak
FAILED--Further testing stopped: GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except
-> when compiled with SANITIZE=leak
make: *** [Makefile:53: prove] Error 1
This change also adds a red color to the "Bailout called" line, as
we're now using "say_color error". That improves existing output in
the case of e.g.:
$ prove -j8 t[0-9]*.sh :: -v
Bailout called. Further testing stopped: verbose mode forbidden under TAP harness; try --verbose-log
FAILED--Further testing stopped: verbose mode forbidden under TAP harness; try --verbose-log
We don't need to have a "Bail out! " prefix when we're not running
under a TAP consumer (i.e. if test -n "$HARNESS_ACTIVE"), but let's
not make the output conditional on that. Showing it under e.g.:
$ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0095-bloom.sh
Bail out! GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak
Doesn't harm anything, and I don't think the (small) complexity of
only adding this if we're under "$HARNESS_ACTIVE" is worth it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
De-duplicate the "finalize_junit_xml; GIT_EXIT_OK=t; exit 1" code
shared between the "error()" and "--immediate on failure" code paths,
in preparation for adding a third user in a subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few kinds of changes "git status" can show were not documented.
* ja/doc-status-types-and-copies:
Documentation/git-status: mention how to detect copies
Documentation/git-status: document porcelain status T (typechange)
Documentation/diff-format: state in which cases porcelain status is T
Documentation/git-status: remove impossible porcelain status DR and DC
Prevent "make sparse" from running for the source files that
haven't been modified.
* ab/make-sparse-for-real:
Makefile: make the "sparse" target non-.PHONY
When "git cmd -h" shows more than one line of usage text (e.g.
the cmd subcommand may take sub-sub-command), parse-options API
learned to align these lines, even across i18n/l10n.
* ab/align-parse-options-help:
parse-options: properly align continued usage output
git rev-parse --parseopt tests: add more usagestr tests
send-pack: properly use parse_options() API for usage string
parse-options API users: align usage output in C-strings
Teach "git help -c" into helping the command line completion of
configuration variables.
* ab/help-config-vars:
help: move column config discovery to help.c library
help / completion: make "git help" do the hard work
help tests: test --config-for-completion option & output
help: simplify by moving to OPT_CMDMODE()
help: correct logic error in combining --all and --guides
help: correct logic error in combining --all and --config
help tests: add test for --config output
help: correct usage & behavior of "git help --guides"
help: correct the usage string in -h and documentation
Built-in fsmonitor (part 1).
* jh/builtin-fsmonitor-part1:
t/helper/simple-ipc: convert test-simple-ipc to use start_bg_command
run-command: create start_bg_command
simple-ipc/ipc-win32: add Windows ACL to named pipe
simple-ipc/ipc-win32: add trace2 debugging
simple-ipc: move definition of ipc_active_state outside of ifdef
simple-ipc: preparations for supporting binary messages.
trace2: add trace2_child_ready() to report on background children
Mostly preliminary clean-up in the hook API.
* ab/config-based-hooks-1:
hook-list.h: add a generated list of hooks, like config-list.h
hook.c users: use "hook_exists()" instead of "find_hook()"
hook.c: add a hook_exists() wrapper and use it in bugreport.c
hook.[ch]: move find_hook() from run-command.c to hook.c
Makefile: remove an out-of-date comment
Makefile: don't perform "mv $@+ $@" dance for $(GENERATED_H)
Makefile: stop hardcoding {command,config}-list.h
Makefile: mark "check" target as .PHONY
Updates to the tests in t0000 to test the test framework.
* ab/lib-subtest:
test-lib tests: get rid of copy/pasted mock test code
test-lib tests: assert 1 exit code, not non-zero
test-lib tests: refactor common part of check_sub_test_lib_test*()
test-lib tests: avoid subshell for "test_cmp" for readability
test-lib tests: don't provide a description for the sub-tests
test-lib tests: split up "write and run" into two functions
test-lib tests: move "run_sub_test" to a new lib-subtest.sh
Various fixes in code paths that move untracked files away to make room.
* en/removing-untracked-fixes:
Documentation: call out commands that nuke untracked files/directories
Comment important codepaths regarding nuking untracked files/dirs
unpack-trees: avoid nuking untracked dir in way of locally deleted file
unpack-trees: avoid nuking untracked dir in way of unmerged file
Change unpack_trees' 'reset' flag into an enum
Remove ignored files by default when they are in the way
unpack-trees: make dir an internal-only struct
unpack-trees: introduce preserve_ignored to unpack_trees_options
read-tree, merge-recursive: overwrite ignored files by default
checkout, read-tree: fix leak of unpack_trees_options.dir
t2500: add various tests for nuking untracked files
"git grep --recurse-submodules" takes trees and blobs from the
submodule repository, but the textconv settings when processing a
blob from the submodule is not taken from the submodule repository.
A test is added to demonstrate the issue, without fixing it.
* mt/grep-submodule-textconv:
grep: demonstrate bug with textconv attributes and submodules
"git add", "git mv", and "git rm" have been adjusted to avoid
updating paths outside of the sparse-checkout definition unless
the user specifies a "--sparse" option.
* ds/add-rm-with-sparse-index:
advice: update message to suggest '--sparse'
mv: refuse to move sparse paths
rm: skip sparse paths with missing SKIP_WORKTREE
rm: add --sparse option
add: update --renormalize to skip sparse paths
add: update --chmod to skip sparse paths
add: implement the --sparse option
add: skip tracked paths outside sparse-checkout cone
add: fail when adding an untracked sparse file
dir: fix pattern matching on dirs
dir: select directories correctly
t1092: behavior for adding sparse files
t3705: test that 'sparse_entry' is unstaged
A link to the bundle-format was added in 5c8273d57c (bundle doc: rewrite
the "DESCRIPTION" section, 2021-07-31).
Ensure `technical/bundle-format.html` is created to avoid a broken link
in `git-bundle.html`.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users often use "git difftool HEAD^" to review their work, and have
"mergetool.prompt" set to false so that difftool does not prompt them
before diffing each file.
This is very convenient because users can see all their diffs by
reviewing the xxdiff windows one at a time.
A problem occurs when xxdiff encounters some binary files.
It can segfault and return exit code 128, which is special-cased
by git-difftool-helper as being an extraordinary situation that
aborts the process.
Suppress the exit code from xxdiff in its diff_cmd() implementation
when we see exit code 128 so that the GIT_EXTERNAL_DIFF loop continues
on uninterrupted to the next file rather than aborting when it
encounters the first binary file.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak introduced in 9055e401dd (sequencer: introduce new
commands to reset the revision, 2018-04-25), which called
setup_unpack_trees_porcelain() without a corresponding call to
clear_unpack_trees_porcelain().
This introduces a change in behavior in that we now start calling
clear_unpack_trees_porcelain() even without having called the
setup_unpack_trees_porcelain(). That's OK, that clear function, like
most others, will accept a zero'd out struct.
This inches us closer to passing various tests in
"t34*.sh" (e.g. "t3434-rebase-i18n.sh"), but because they have so many
other memory leaks in revisions.c this doesn't make any test file or
even a single test pass.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Restructure code that's mostly added in 9055e401dd (sequencer:
introduce new commands to reset the revision, 2018-04-25) to avoid
code duplication, and to make freeing other resources easier in a
subsequent commit.
It's safe to initialize "tree_desc" to be zero'd out in order to
unconditionally free desc.buffer, it won't be initialized on the first
couple of "goto"'s.
There are three earlier "return"'s in this function which should
probably be made to use this new "cleanup" too, per [1] it looks like
they're leaving behind stale locks. But let's not try to fix every
potential bug here now, I'm just trying to narrowly plug a memory
leak.
1. https://lore.kernel.org/git/CABPp-BH=3DP-dXRCphY53-3eZd1TU8h5GY_M12nnbEGm-UYB9Q@mail.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On the Windows build agents, a lot of programs are installed, and added
to the PATH automatically.
One such program is Git for Windows, and due to the way it is set up,
unfortunately its copy of `gpg.exe` is also reachable via the PATH.
This usually does not pose any problems. To the contrary, it even allows
us to test the GPG parts of Git's test suite even if `gpg.exe` is not
delivered as part of `git-sdk-64-minimal`, the minimal subset of Git for
Windows' SDK that we use in the CI builds to compile Git.
However, every once in a while we build a new MSYS2 runtime, which means
that there is a mismatch between the copy in `git-sdk-64-minimal` and
the copy in C:\Program Files\Git\usr\bin. When that happens we hit the
dreaded problem where only one `msys-2.0.dll` is expected to be in the
PATH, and things start to fail.
Let's avoid all of this by restricting the PATH to the minimal set. This
is actually done by `git-sdk-64-minimal`'s `/etc/profile`, and we just
have to source this file manually (one would expect that it is sourced
automatically, but the Bash steps in Azure Pipelines/GitHub workflows
are explicitly run using `--noprofile`, hence the need for doing this
explicitly).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
facca53ac added verification for ssh signatures but incorrectly
described the usage of gpg.minTrustLevel. While the verifications
trustlevel is stil set to fully or undefined depending on if the key is
known or not it has no effect on the verification result. Unknown keys
will always fail verification. This commit updates the docs to match
this behaviour.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A signature attached to a signed commit, and the contents of the
commit that merged a signed tag, are both recorded as a value of an
object header field as a multi-line value, and are subject to the
formatting convention for multi-line values in the headers, with a
leading SP signaling that the rest of the line is a continuation of
the previous line. Most notably, an empty line in such a multi-line
value would result in a line with a sole SP on it.
Examples in the signature-format technical documentation include a
few of these cases but we did not show these otherwise invisible SPs
in the example. These trailing spaces cannot be seen on display or
on paper, and forces the readers to look for them in their editors
or pagers, even if we added them to the document.
Extend the overview section to explain the multi-line value
formatting and highlight these otherwise invisible SPs by inventing
the "a dollar-sign at the end of line that appears after SP merely
signals that there is a SP there, and the dollar-sign itself does
not appear in the real file" notation, inspired by "cat -e" output,
to help readers to learn exactly where such "a single SP that is
originally an empty line" appears in the examples.
Reported-by: Rob Browning <rlb@defaultvalue.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*{mktree,commit,diff,grep,rm,merge,hunk}*"
as passing when git is compiled with SANITIZE=leak. They'll now be
listed as running under the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test
mode (the "linux-leaks" CI target).
These were picked because we still have a lot of failures in adjacent
areas, and we didn't have much if any coverage of e.g. grep and diff
before this change, we could still whitelist a lot more tests, but
let's stop for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark various "generic" tests as passing when git is compiled with
SANITIZE=leak. These tests were subjectively picked from the lists of
passing tests since they're all small, and test some generic feature
such as wildmatch(), commonly used environment variables, ident
parsing etc.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*read-tree*" as passing when git is
compiled with SANITIZE=leak. They'll now be listed as running under
the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks"
CI target). We still have around half the tests that match
"*read-tree*" failing, but let's whitelist those that don't.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*ls-files*" as passing when git is
compiled with SANITIZE=leak. They'll now be listed as running under
the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks"
CI target). We still have others that match '*ls-files*" that fail
under SANITIZE=leak.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*{checkout,switch}*" as passing when git
is compiled with SANITIZE=leak. They'll now be listed as running under
the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks"
CI target).
Unfortunately almost all of those tests fail when compiled with
SANITIZE=leak, these only pass because they run "checkout-index", not
the main "checkout" command.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark all tests that match "*trace2*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark those tests that match "*ls-tree*" as passing when git is
compiled with SANITIZE=leak. They'll now be listed as running under
the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks"
CI target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark various existing tests in t00*.sh that invoke a "test-tool" with
as passing when git is compiled with SANITIZE=leak.
They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark various existing tests in t00*.sh that invoke git built-ins with
TEST_PASSES_SANITIZE_LEAK=true as passing when git is compiled with
SANITIZE=leak.
They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Protocol v0 clients can get stuck parsing a malformed feature line.
* ah/connect-parse-feature-v0-fix:
connect: also update offset for features without values
"make clean" has been updated to remove leftover .depend/
directories, even when it is not told to use them to compute header
dependencies.
* ab/make-clean-depend-dirs:
Makefile: clean .depend dirs under COMPUTE_HEADER_DEPENDENCIES != yes
Sensitive data in the HTTP trace were supposed to be redacted, but
we failed to do so in HTTP/2 requests.
* jk/http-redact-fix:
http: match headers case-insensitively when redacting
"git cvsserver" had a long-standing bug in its authentication code,
which has finally been corrected (it is unclear and is a separate
question if anybody is seriously using it, though).
* cb/cvsserver:
Documentation: cleanup git-cvsserver
git-cvsserver: protect against NULL in crypt(3)
git-cvsserver: use crypt correctly to compare password hashes
"git clone" from a repository whose HEAD is unborn into a bare
repository didn't follow the branch name the other side used, which
is corrected.
* jk/clone-unborn-head-in-bare:
clone: handle unborn branch in bare repos
"git stash", where the tentative change involves changing a
directory to a file (or vice versa), was confused, which has been
corrected.
* en/stash-df-fix:
stash: restore untracked files AFTER restoring tracked files
stash: avoid feeding directories to update-index
t3903: document a pair of directory/file bugs
When "git am --abort" fails to abort correctly, it still exited
with exit status of 0, which has been corrected.
* en/am-abort-fix:
am: fix incorrect exit status on am fail to abort
t4151: add a few am --abort tests
git-am.txt: clarify --abort behavior
"git update-ref --stdin" failed to flush its output as needed,
which potentially led the conversation to a deadlock.
* ps/update-ref-batch-flush:
t1400: avoid SIGPIPE race condition on fifo
update-ref: fix streaming of status updates
The "mode" word is useless in a call to open(2) that does not
create a new file. Such a call in the files backend of the ref
subsystem has been cleaned up.
* rs/no-mode-to-open-when-appending:
refs/files-backend: remove unused open mode parameter
The order in which various files that make up a single (conceptual)
packfile has been reevaluated and straightened up. This matters in
correctness, as an incomplete set of files must not be shown to a
running Git.
* tb/pack-finalize-ordering:
pack-objects: rename .idx files into place after .bitmap files
pack-write: split up finish_tmp_packfile() function
builtin/index-pack.c: move `.idx` files into place last
index-pack: refactor renaming in final()
builtin/repack.c: move `.idx` files into place last
pack-write.c: rename `.idx` files after `*.rev`
pack-write: refactor renaming in finish_tmp_packfile()
bulk-checkin.c: store checksum directly
pack.h: line-wrap the definition of finish_tmp_packfile()
The code that optionally creates the *.rev reverse index file has
been optimized to avoid needless computation when it is not writing
the file out.
* ab/reverse-midx-optim:
pack-write: skip *.rev work when not writing *.rev
The "git apply -3" code path learned not to bother the lower level
merge machinery when the three-way merge can be trivially resolved
without the content level merge.
* jc/trivial-threeway-binary-merge:
apply: resolve trivial merge without hitting ll-merge with "--3way"
Doc update plus improved error reporting.
* jk/log-warn-on-bogus-encoding:
docs: use "character encoding" to refer to commit-object encoding
logmsg_reencode(): warn when iconv() fails
The output from "git fast-export", when its anonymization feature
is in use, showed an annotated tag incorrectly.
* tk/fast-export-anonymized-tag-fix:
fast-export: fix anonymized tag using original length
Even when running "git send-email" without its own threaded
discussion support, a threading related header in one message is
carried over to the subsequent message to result in an unwanted
threading, which has been corrected.
* mh/send-email-reset-in-reply-to:
send-email: avoid incorrect header propagation
Buggy tests could damage repositories outside the throw-away test
area we created. We now by default export GIT_CEILING_DIRECTORIES
to limit the damage from such a stray test.
* sg/set-ceiling-during-tests:
test-lib: set GIT_CEILING_DIRECTORIES to protect the surrounding repository
The sparse-index support can corrupt the index structure by storing
a stale and/or uninitialized data, which has been corrected.
* jh/sparse-index-resize-fix:
sparse-index: copy dir_hash in ensure_full_index()
"git upload-pack" which runs on the other side of "git fetch"
forgot to take the ref namespaces into account when handling
want-ref requests.
* ka/want-ref-in-namespace:
docs: clarify the interaction of transfer.hideRefs and namespaces
upload-pack.c: treat want-ref relative to namespace
t5730: introduce fetch command helper
Build update for Apple clang.
* cb/makefile-apple-clang:
build: catch clang that identifies itself as "$VENDOR clang"
build: clang version may not be followed by extra words
build: update detect-compiler for newer Xcode version
"git branch -D <branch>" used to refuse to remove a broken branch
ref that points at a missing commit, which has been corrected.
* rs/branch-allow-deleting-dangling:
branch: allow deleting dangling branches with --force
The delayed checkout code path in "git checkout" etc. were chatty
even when --quiet and/or --no-progress options were given.
* mt/quiet-with-delayed-checkout:
checkout: make delayed checkout respect --quiet and --no-progress
"git diff --relative" segfaulted and/or produced incorrect result
when there are unmerged paths.
* dd/diff-files-unmerged-fix:
diff-lib: ignore paths that are outside $cwd if --relative asked
mmap() imitation used to call xmalloc() that dies upon malloc()
failure, which has been corrected to just return an error to the
caller to be handled.
* rs/git-mmap-uses-malloc:
compat: let git_mmap use malloc(3) directly
Various bugs in "git rebase -r" have been fixed.
* pw/rebase-r-fixes:
rebase -r: fix merge -c with a merge strategy
rebase -r: don't write .git/MERGE_MSG when fast-forwarding
rebase -i: add another reword test
rebase -r: make 'merge -c' behave like reword
Checking out all the paths from HEAD during the last conflicted
step in "git rebase" and continuing would cause the step to be
skipped (which is expected), but leaves MERGE_MSG file behind in
$GIT_DIR and confuses the next "git commit", which has been
corrected.
* pw/rebase-skip-final-fix:
rebase --continue: remove .git/MERGE_MSG
rebase --apply: restore some tests
t3403: fix commit authorship
Use upload-artifacts v1 (instead of v2) for 32-bit linux, as the
new version has a blocker bug for that architecture.
* cb/ci-use-upload-artifacts-v1:
ci: use upload-artifacts v1 for dockerized jobs
"git commit --fixup" now works with "--edit" again, after it was
broken in v2.32.
* jk/commit-edit-fixup-fix:
commit: restore --edit when combined with --fixup
"git range-diff" code clean-up.
* jk/range-diff-fixes:
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
"git apply" miscounted the bytes and failed to read to the end of
binary hunks.
* jk/apply-binary-hunk-parsing-fix:
apply: keep buffer/size pair in sync when parsing binary hunks
"git pull" had various corner cases that were not well thought out
around its --rebase backend, e.g. "git pull --ff-only" did not stop
but went ahead and rebased when the history on other side is not a
descendant of our history. The series tries to fix them up.
* en/pull-conflicting-options:
pull: fix handling of multiple heads
pull: update docs & code for option compatibility with rebasing
pull: abort by default when fast-forwarding is not possible
pull: make --rebase and --no-rebase override pull.ff=only
pull: since --ff-only overrides, handle it first
pull: abort if --ff-only is given and fast-forwarding is impossible
t7601: add tests of interactions with multiple merge heads and config
t7601: test interaction of merge/rebase/fast-forward flags and options
Bugfix for common ancestor negotiation recently introduced in "git
push" codepath.
* jt/push-negotiation-fixes:
fetch: die on invalid --negotiation-tip hash
send-pack: fix push nego. when remote has refs
send-pack: fix push.negotiate with remote helper
Input validation of "git pack-objects --stdin-packs" has been
corrected.
* ab/pack-stdin-packs-fix:
pack-objects: fix segfault in --stdin-packs option
pack-objects tests: cover blindspots in stdin handling
"git maintenance" scheduler fix for macOS.
* js/maintenance-launchctl-fix:
maintenance: skip bootout/bootstrap when plist is registered
maintenance: create `launchctl` configuration using a lock file
Update to the command line completion (in contrib/) for tcsh.
* ti/tcsh-completion-regression-fix:
completion: tcsh: Fix regression by drop of wrapper functions
Command line completion updates.
* fc/completion-updates:
completion: bash: add correct suffix in variables
completion: bash: fix for multiple dash commands
completion: bash: fix for suboptions with value
completion: bash: fix prefix detection in branch.*
Documentation updates.
* en/merge-strategy-docs:
Update error message and code comment
merge-strategies.txt: add coverage of the `ort` merge strategy
git-rebase.txt: correct out-of-date and misleading text about renames
merge-strategies.txt: fix simple capitalization error
merge-strategies.txt: avoid giving special preference to patience algorithm
merge-strategies.txt: do not imply using copy detection is desired
merge-strategies.txt: update wording for the resolve strategy
Documentation: edit awkward references to `git merge-recursive`
directory-rename-detection.txt: small updates due to merge-ort optimizations
git-rebase.txt: correct antiquated claims about --rebase-merges
When the option --dry-run/-n is given, "git add" doesn't change the
index, but still writes out new object files. Only hash the latter
without writing instead to make the run as dry as possible.
Use this opportunity to also make the hash_flags variable unsigned,
to match the index_path() parameter it is used as.
Reported-by: git.mexon@spamgourmet.com
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in the error output emitted when .git/objects can't
be written to. Before 9c4d6c0297 (cache-tree: Write updated
cache-tree after commit, 2014-07-13) we'd emit only one "insufficient
permission" error, now we'll do so again.
The cause is rather straightforward, we've got WRITE_TREE_SILENT for
the use-case of wanting to prepare an index silently, quieting any
permission etc. error output. Then when we attempt to update to
that (possibly broken) index we'll run into the same errors again.
But with 9c4d6c0297 the gap between the cache-tree API and the object
store wasn't closed in terms of asking write_object_file() to be
silent. I.e. post-9c4d6c0297b the first call is to prepare_index(),
and after that we'll call prepare_to_commit(). We only want verbose
error output from the latter.
So let's add and use that facility with a corresponding HASH_SILENT
flag, its only user is cache-tree.c's update_one(), which will set it
if its "WRITE_TREE_SILENT" flag is set.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for fixing a regression where we started emitting some
of these error messages twice, let's assert what the output from "git
commit" and friends is now in the case of permission errors.
As noted in [1] using test_expect_failure to mark up a TODO test has
some unexpected edge cases, e.g. we don't want to break --run=3 by
skipping the "test_lazy_prereq" here. This pattern allows us to test
just the test_cmp (and the "cat", which shouldn't fail) with the added
"test_expect_failure", we'll flip that to a "test_expect_success" in
the next commit.
1. https://lore.kernel.org/git/87tuhmk19c.fsf@evledraar.gmail.com/T/#u
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When merging a signed tag fmt-merge-msg was unable to verify its
validity missing the necessary ssh allowedSignersFile config.
Adds gpg config parsing to fmt-merge-msg.
Adds tests for ssh signed tags to fmt-merge-msg tests.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fs/ssh-signing:
ssh signing: test that gpg fails for unknown keys
ssh signing: tests for logs, tags & push certs
ssh signing: duplicate t7510 tests for commits
ssh signing: verify signatures using ssh-keygen
ssh signing: provide a textual signing_key_id
ssh signing: retrieve a default key from ssh-agent
ssh signing: add ssh key format and signing code
ssh signing: add test prereqs
ssh signing: preliminary refactoring and clean-up
Turn off automatic background maintenance for perf tests by default to
avoid interference with performance measurements. Do that by using the
new file t/perf/config and using it as the system config file for perf
tests. Future tests intended to measure gc performance can override
the setting locally or call "git gc" explicitly.
This fixes a breakage in p2000 caused by gc automatically emptying the
reflog due its fake dates from 2005 being older than 90 days.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up in "git difftool".
* da/difftool:
difftool: add a missing space to the run_dir_diff() comments
difftool: remove an unnecessary call to strbuf_release()
difftool: refactor dir-diff to write files using helper functions
difftool: create a tmpdir path without repeated slashes
CI learns to run the leak sanitizer builds.
* ab/sanitize-leak-ci:
tests: add a test mode for SANITIZE=leak, run it in CI
Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
The ref iteration code used to optionally allow dangling refs to be
shown, which has been tightened up.
* jk/ref-paranoia:
refs: drop "broken" flag from for_each_fullref_in()
ref-filter: drop broken-ref code entirely
ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN
repack, prune: drop GIT_REF_PARANOIA settings
refs: turn on GIT_REF_PARANOIA by default
refs: omit dangling symrefs when using GIT_REF_PARANOIA
refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag
refs-internal.h: reorganize DO_FOR_EACH_* flag documentation
refs-internal.h: move DO_FOR_EACH_* flags next to each other
t5312: be more assertive about command failure
t5312: test non-destructive repack
t5312: create bogus ref as necessary
t5312: drop "verbose" helper
t5600: provide detached HEAD for corruption failures
t5516: don't use HEAD ref for invalid ref-deletion tests
t7900: clean up some more broken refs
Compilation fix.
* js/win-lazyload-buildfix:
Makefile: restrict -Wpedantic and -Wno-pedantic-ms-format better
lazyload.h: use an even more generic function pointer than FARPROC
lazyload.h: fix warnings about mismatching function pointer types
Test updates.
* sg/test-split-index-fix:
read-cache: fix GIT_TEST_SPLIT_INDEX
tests: disable GIT_TEST_SPLIT_INDEX for sparse index tests
read-cache: look for shared index files next to the index, too
t1600-index: disable GIT_TEST_SPLIT_INDEX
t1600-index: don't run git commands upstream of a pipe
t1600-index: remove unnecessary redirection
"git multi-pack-index write --bitmap" learns to propagate the
hashcache from original bitmap to resulting bitmap.
* tb/midx-write-propagate-namehash:
t5326: test propagating hashcache values
p5326: generate pack bitmaps before writing the MIDX bitmap
p5326: don't set core.multiPackIndex unnecessarily
p5326: create missing 'perf-tag' tag
midx.c: respect 'pack.writeBitmapHashcache' when writing bitmaps
pack-bitmap.c: propagate namehash values from existing bitmaps
t/helper/test-bitmap.c: add 'dump-hashes' mode
Since C++20, the language has a generalized comparison operator <=>.
Teach the cpp driver not to separate it into <= and > tokens.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since C++17, the single-quote can be used as digit separator:
3.141'592'654
1'000'000
0xdead'beaf
Make it known to the word regex of the cpp driver, so that numbers are
not split into separate tokens at the single-quotes.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are going to add support for C++'s digit-separating single-quote and
the spaceship operator. By adding the test cases in this separate
commit, the effect on the word highlighting will become more obvious
as the features are implemented and the file cpp/expect is updated.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "cat-file --batch-all-objects" iterates over each object, it knows
where to find each one. But when we look up details of the object, we
don't use that information at all.
This patch teaches it to use the pack/offset pair when we're iterating
over objects in a pack. This yields a measurable speed improvement
(timings on a fully packed clone of linux.git):
Benchmark #1: ./git.old cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)"
Time (mean ± σ): 8.128 s ± 0.118 s [User: 7.968 s, System: 0.156 s]
Range (min … max): 8.007 s … 8.301 s 10 runs
Benchmark #2: ./git.new cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)"
Time (mean ± σ): 4.294 s ± 0.064 s [User: 4.167 s, System: 0.125 s]
Range (min … max): 4.227 s … 4.457 s 10 runs
Summary
'./git.new cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)"' ran
1.89 ± 0.04 times faster than './git.old cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)"
The implementation is pretty simple: we just call packed_object_info()
instead of oid_object_info_extended() when we can. Most of the changes
are just plumbing the pack/offset pair through the callstack. There is
one subtlety: replace lookups are not handled by packed_object_info().
But since those are disabled for --batch-all-objects, and since we'll
only have pack info when that option is in effect, we don't have to
worry about that.
There are a few limitations to this optimization which we could address
with further work:
- I didn't bother recording when we found an object loose. Technically
this could save us doing a fruitless lookup in the pack index. But
opening and mmap-ing a loose object is so expensive in the first
place that this doesn't matter much. And if your repository is large
enough to care about per-object performance, most objects are going
to be packed anyway.
- This works only in --unordered mode. For the sorted mode, we'd have
to record the pack/offset pair as part of our oid-collection. That's
more code, plus at least 16 extra bytes of heap per object. It would
probably still be a net win in runtime, but we'd need to measure.
- For --batch, this still helps us with getting the object metadata,
but we still do a from-scratch lookup for the object contents. This
probably doesn't matter that much, because the lookup cost will be
much smaller relative to the cost of actually unpacking and printing
the objects.
For small objects, we could probably swap out read_object_file() for
using packed_object_info() with a "object_info.contentp" to get the
contents. But we'd still need to deal with streaming for larger
objects. A better path forward here is to teach the initial
oid_object_info_extended() / packed_object_info() calls to retrieve
the contents of smaller objects while they are already being
accessed. That would save the extra lookup entirely. But it's a
non-trivial feature to add to the object_info code, so I left it for
now.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we originally added --batch-all-objects, it stuffed everything into
an oid_array(), and then iterated over that array with a callback to
write the actual output.
When we later added --unordered, that code path writes immediately as we
discover each object, but just calls the same batch_object_cb() as our
entry point to the writing code. That callback has a narrow interface;
it only receives the oid, but we know much more about each object in the
unordered write (which we'll make use of in the next patch). So let's
just call batch_object_write() directly. The callback wasn't saving us
much effort.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're enumerating all objects in the object database, it doesn't
make sense to respect refs/replace. The point of this option is to
enumerate all of the objects in the database at a low level. By
definition we'd already show the replacement object's contents (under
its real oid), and showing those contents under another oid is almost
certainly working against what the user is trying to do.
Note that you could make the same argument for something like:
git show-index <foo.idx |
awk '{print $2}' |
git cat-file --batch
but there we can't know in cat-file exactly what the user intended,
because we don't know the source of the input. They could be trying to
do low-level debugging, or they could be doing something more high-level
(e.g., imagine a porcelain built around cat-file for its object
accesses). So in those cases, we'll have to rely on the user specifying
"git --no-replace-objects" to tell us what to do.
One _could_ make an argument that "cat-file --batch" is sufficiently
low-level plumbing that it should not respect replace-objects at all
(and the caller should do any replacement if they want it). But we have
been doing so for some time. The history is a little tangled:
- looking back as far as v1.6.6, we would not respect replace refs for
--batch-check, but would for --batch (because the former used
sha1_object_info(), and the replace mechanism only affected actual
object reads)
- this discrepancy was made even weirder by 98e2092b50 (cat-file:
teach --batch to stream blob objects, 2013-07-10), where we always
output the header using the --batch-check code, and then printed the
object separately. This could lead to "cat-file --batch" dying (when
it notices the size or type changed for a non-blob object) or even
producing bogus output (in streaming mode, we didn't notice that we
wrote the wrong number of bytes).
- that persisted until 1f7117ef7a (sha1_file: perform object
replacement in sha1_object_info_extended(), 2013-12-11), which then
respected replace refs for both forms.
So it has worked reliably this way for over 7 years, and we should make
sure it continues to do so. That could also be an argument that
--batch-all-objects should not change behavior (which this patch is
doing), but I really consider the current behavior to be an unintended
bug. It's a side effect of how the code is implemented (feeding the oids
back into oid_object_info() rather than looking at what we found while
reading the loose and packed object storage).
The implementation is straight-forward: we just disable the global
read_replace_refs flag when we're in --batch-all-objects mode. It would
perhaps be a little cleaner to change the flag we pass to
oid_object_info_extended(), but that's not enough. We also read objects
via read_object_file() and stream_blob_to_fd(). The former could switch
to its _extended() form, but the streaming code has no mechanism for
disabling replace refs. Setting the global flag works, and as a bonus,
it's impossible to have any "oops, we're sometimes replacing the object
and sometimes not" bugs in the output (like the ones caused by
98e2092b50 above).
The tests here cover the regular-input and --batch-all-objects cases,
for both --batch-check and --batch. There is a test in t6050 that covers
the regular-input case with --batch already, but this new one goes much
further in actually verifying the output (plus covering --batch-check
explicitly). This is perhaps a little overkill and the tests would be
simpler just covering --batch-check, but I wanted to make sure we're
checking that --batch output is consistent between the header and the
content. The global-flag technique used here makes that easy to get
right, but this is future-proofing us against regressions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The note on ordering for --batch-all-objects was written when that was
the only possible ordering. These days we have --unordered, too, so
let's point to it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few of the tests create intentionally broken objects with broken
types. Let's clean them up after we're done with them, so that later
tests don't get confused (we hadn't noticed because this only affects
tests which use --batch-all-objects, but I'm about to add more).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Submodule ODBs are never added as alternates during the execution of the
test suite, but there may be a rare interaction that the test suite does
not have coverage of. Add a trace message when this happens, so that
users who trace their commands can notice such occurrences.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass the repo explicitly when calling check_has_commit() to avoid
relying on add_submodule_odb(). With this commit and the parent commit,
the last remaining tests no longer rely on add_submodule_odb(), so mark
these tests accordingly.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a35e03dee0 ("submodule: lazily add submodule ODBs as alternates",
2021-09-08), Git was taught to add all known submodule ODBs as
alternates when attempting to read an object that doesn't exist, as a
fallback for when a submodule object is read as if it were in
the_repository. However, this behavior wasn't restricted to happen only
when reading from the_repository. Fix this.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the parent commit and some of its ancestors, the only place
commits are being accessed through alternates is in the user-facing
message formatting code. Fix those, and remove the add_submodule_odb()
calls.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is currently no support for peeling the current ref of an iterator
iterating over a non-the_repository ref store, and none is needed. Thus,
for now, BUG() if that happens.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Note that should_pack_ref() is called when writing refs, which is only
supported for the_repository, hence the_repository is hardcoded there.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for the next 2 patches that adds (partial) support for
arbitrary repositories to ref iterators, plumb a repository into all ref
stores. There are no changes to program logic.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git log" command limits its output to the commits that contain strings
matched by a pattern when the "--grep=<pattern>" option is used, but unlike
output from "git grep -e <pattern>", the matches are not highlighted,
making them harder to spot.
Teach the pretty-printer code to highlight matches from the
"--grep=<pattern>", "--author=<pattern>" and "--committer=<pattern>"
options (to view the last one, you may have to ask for --pretty=fuller).
Also, it must be noted that we are effectively greping the content twice
(because it would be a hassle to rework the existing matching code to do
a /g match and then pass it all down to the coloring code), however it only
slows down "git log --author=^H" on this repository by around 1-2%
(compared to v2.33.0), so it should be a small enough slow down to justify
the addition of the feature.
Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the comparisons against OPT_SHORT and OPT_UNSET to an enum
which keeps track of how a given option got parsed. The case of "0"
was an implicit OPT_LONG, so let's add an explicit label for it.
Due to the xor in 0f1930c587 (parse-options: allow positivation of
options starting, with no-, 2012-02-25) the code already relied on
this being set back to 0. To avoid refactoring the logic involved in
that let's just start the enum at "0" instead of the usual "1<<0" (1),
but BUG() out if we don't have one of our expected flags.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were no tests for checking the specific output that we'll
generate in optname(), let's add some. That output was added back in
4a59fd1312 (Add a simple option parser., 2007-10-15).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change these two functions to "static", the last user of "optname()"
outside of parse-options.c itself went away in the preceding commit,
for the reasons noted in 9440b831ad (parse-options: replace
opterror() with optname(), 2018-11-10) we shouldn't be adding any more
users of it.
The "optbug()" function was never used outside of parse-options.c, but
was made non-static in 1f275b7c4c (parse-options: export opterr,
optbug, 2011-08-11). I think the only external user of optname() was
the commit-graph.c caller added in 09e0327f57 (builtin/commit-graph.c:
introduce '--max-new-filters=<n>', 2020-09-18).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop using optname() in builtin/commit-graph.c to emit an error with
the --max-new-filters option. This changes code added in 809e0327f5
(builtin/commit-graph.c: introduce '--max-new-filters=<n>',
2020-09-18).
See 9440b831ad (parse-options: replace opterror() with optname(),
2018-11-10) for why using optname() like this is considered bad,
i.e. it's assembling human-readable output piecemeal, and the "option
`X'" at the start can't be translated.
It didn't matter in this case, but this code was also buggy in its use
of "opt->flags" to optname(), that function expects flags, but not
*those* flags.
Let's pass "max-new-filters" to the new error because the option name
isn't translatable, and because we can re-use a translation added in
f7e68a0878 (parse-options: check empty value in OPT_INTEGER and
OPT_ABBREV, 2019-05-29).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for making "optname" a static function move it above
its first user in parse-options.c.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "flags" members of "struct option" to refer to their
corresponding "enum" defined earlier in the file.
The benefit of changing this to an enum isn't as great as with some
"enum parse_opt_type" as we'll always check this as a bitfield, so we
can't rely on the compiler checking "case" arms for us. But let's do
it for consistency with the rest of the file.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "default" case in parse_options() that handles the return
value of parse_options_step() to simply have a "case" arm for
PARSE_OPT_UNKNOWN, instead of leaving it to a comment. This means the
compiler can warn us about any missing case arms.
This adjusts code added in ff43ec3e2d (parse-opt: create
parse_options_step., 2008-06-23), given its age it may pre-date the
existence (or widespread use) of this coding style, which we've since
adopted more widely.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the "enum parse_opt_result" instead of an "int flags" as the
return value of the applicable functions in parse-options.c.
This will help catch future bugs, such as the missing "case" arms in
the two existing users of the API in "blame.c" and "shortlog.c". A
third caller in 309be813c9 (update-index: migrate to parse-options
API, 2010-12-01) was already checking for these.
As can be seen when trying to sort through the deluge of warnings
produced when compiling this with CC=g++ (mostly unrelated to this
change) we're not consistently using "enum parse_opt_result" even now,
i.e. we'll return error() and "return 0;". See f41179f16b
(parse-options: avoid magic return codes, 2019-01-27) for a commit
which started changing some of that.
I'm not doing any more of that exhaustive migration here, and it's
probably not worthwhile past the point of being able to check "enum
parse_opt_result" in switch().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the "enum parse_opt_flags" instead of an "int flags" as arguments
to the various functions in parse-options.c.
Even though this is an enum bitfield there's there's a benefit to
doing this when it comes to the wider C ecosystem. E.g. the GNU
debugger (gdb) will helpfully detect and print out meaningful enum
labels in this case. Here's the output before and after when breaking
in "parse_options()" after invoking "git stash show":
Before:
(gdb) p flags
$1 = 9
After:
(gdb) p flags
$1 = (PARSE_OPT_KEEP_DASHDASH | PARSE_OPT_KEEP_UNKNOWN)
Of course as noted in[1] there's a limit to this smartness,
i.e. manually setting it with unrelated enum labels won't be
caught. There are some third-party extensions to do more exhaustive
checking[2], perhaps we'll be able to make use of them sooner than
later.
We've also got prior art using this pattern in the codebase. See
e.g. "enum bloom_filter_computed" added in 312cff5207 (bloom: split
'get_bloom_filter()' in two, 2020-09-16) and the "permitted" enum
added in ce910287e7 (add -p: fix checking of user input, 2020-08-17).
1. https://lore.kernel.org/git/87mtnvvj3c.fsf@evledraar.gmail.com/
2. https://github.com/sinelaw/elfs-clang-plugins/blob/master/enums_conversion/README.md
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit cdc2d5f11f (builtin/blame: dim uninteresting metadata lines,
2018-04-23) and 25d5f52901 (builtin/blame: highlight recently changed
lines, 2018-04-23) introduce --color-lines and --color-by-age options to
git blame, respectively. While both options are mentioned in usage help,
they aren't documented in git-blame(1). Document them.
Co-authored-by: Dr. Matthias St. Pierre <m.st.pierre@ncp-e.com>
Signed-off-by: Dr. Matthias St. Pierre <m.st.pierre@ncp-e.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Generally, word regex can be written such that they match tokens
liberally and need not model the actual syntax because it can be assumed
that the regex will only be applied to syntactically correct text.
The regex for cpp (C/C++) is too liberal, though. It regards these
sequences as single tokens:
1+2
1.5-e+2+f
and the following amalgams as one token:
.l as in str.length
.f as in str.find
.e as in str.erase
Tighten the regex in the following way:
- Accept + and - only in one position in the exponent. + and - are no
longer regarded as the sign of a number and are treated by the
catcher-all that is not visible in the driver's regex.
- Accept a leading decimal point only when it is followed by a digit.
For readability, factor hex- and binary numbers into an own term.
As a drive-by, this fixes that floating point numbers such as 12E5
(with upper-case E) were split into two tokens.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The word regex is too loose and matches long streaks of characters
that should actually be separate tokens. Add these problematic test
cases. Separate the lines with text that will remain identical in the
pre- and post-image so that the diff algorithm will not lump removals
and additions of consecutive lines together. This makes the expected
output easier to read.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
8d96e7288f (t4034: bulk verify builtin word regex sanity, 2010-12-18)
added many tests with the intent to verify that operators consisting of
more than one symbol are kept together. These are tested by probing a
transition from, e.g., a!=b to x!=y, which results in the word-diff
[-a-]{+x+}!=[-b-]{+y+}
But that proves only that the letters and operators are separate tokens.
To prove that != is an unseparable token, we have to probe a transition
from, e.g., a=b to a!=b having a word-diff
a[-=-]{+!=+}b
that proves that the ! is not separate from the =.
In the post-image, add to or remove from operators a character that
turns it into another valid operator.
Change the identifiers used around operators such that the diff
algorithm does not have an incentive to match, e.g., a<b in one spot
in the pre-image with a<b elsewhere in the post-image.
Adjust the expected output to match the new differences. Notice that
there are some undesirable tokenizations around e, ., and -. This will
be addressed in a later change.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use MINSTD to generate pseudo-random numbers consistently instead of
using rand(3), whose output can vary from system to system, and reset
its seed before filling in the test values. This gives repeatable
results across versions and systems, which simplifies sharing and
comparing of results between developers.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
6e773527b6 (sparse-index: convert from full to sparse, 2021-03-30) made
verify_path() accept trailing directory separators for directories,
which is necessary for sparse directory entries. This clemency causes
"git stash" to stumble over sub-repositories, though, and there may be
more unintended side-effects.
Avoid them by restoring the old verify_path() behavior and accepting
trailing directory separators only in places that are supposed to handle
sparse directory entries.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Turn verify_path() into an internal function that distinguishes between
valid paths and those with trailing directory separators and rename it
to verify_path_internal(). Provide a wrapper with the old behavior
under the old name. No functional change intended. The new function
will be used in the next patch.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git stash" used to ignore sub-repositories until 6e773527b6
(sparse-index: convert from full to sparse, 2021-03-30). Add a test
that demonstrates this regression.
Reported-by: Robert Leftwich <robert@gitpod.io>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We strbuf_reset() this "struct strbuf" in a loop earlier, but never
freed it. Plugs a memory leak that's been here ever since this code
got introduced in 1c7b76be7d (Build in merge, 2008-07-07).
This takes us from 68 failed tests in "t7600-merge.sh" to 59 under
SANITIZE=leak, and makes "t7604-merge-custom-message.sh" pass!
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak that's been here ever since 72aeb18772 (clean.c,
ls-files.c: respect encapsulation of exclude_list_groups, 2013-01-16),
we dup'd the argument in option_parse_exclude(), but never freed the
string_list.
This makes almost all of t3001-ls-files-others-exclude.sh pass (it had
a lot of failures before). Let's mark it as passing with
TEST_PASSES_SANITIZE_LEAK=true, and then exclude the tests that still
failed with a !SANITIZE_LEAK prerequisite check until we fix those
leaks. We can still see the failed tests under
GIT_TEST_FAIL_PREREQS=true.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix an edge case that was missed when the dir_clear() call was added
in eceba53214 (dir: fix problematic API to avoid memory leaks,
2020-08-18), we need to also clean up when we're about to exit with
non-zero.
That commit says, on the topic of the dir_clear() API and UNLEAK():
[...]two of them clearly thought about leaks since they had an
UNLEAK(dir) directive, which to me suggests that the method to
free the data was too unclear.
I think that 0e5bba53af (add UNLEAK annotation for reducing leak
false positives, 2017-09-08) which added the UNLEAK() makes it clear
that that wasn't the case, rather it was the desire to avoid the
complexity of freeing the memory at the end of the program.
This does add a bit of complexity, but I think it's worth it to just
fix these leaks when it's easy in built-ins. It allows them to serve
as canaries for underlying APIs that shouldn't be leaking, it
encourages us to make those freeing APIs nicer for all their users,
and it prevents other leaking regressions by being able to mark the
entire test as TEST_PASSES_SANITIZE_LEAK=true.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a trivial memory leak present ever since 38d905bf58 (sha1-array:
add test-sha1-array and basic tests, 2014-10-01), now that that's
fixed we can test this under GIT_TEST_PASSING_SANITIZE_LEAK=true.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in t/helper/test-oidtree.c, we were not freeing the
"struct strbuf" we used for the stdin input we parsed. This leak has
been here ever since 92d8ed8ac1 (oidtree: a crit-bit tree for
odb_loose_cache, 2021-07-07).
Now that it's fixed we can declare that t0069-oidtree.sh will pass
under GIT_TEST_PASSING_SANITIZE_LEAK=true.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in t/helper/test-parse-options.c, we were not
freeing the allocated "struct string_list" or its items. Let's move
the declaration of the "list" variable into the cmd__parse_options()
and release it at the end.
In c8ba163916 (parse-options: add OPT_STRING_LIST helper, 2011-06-09)
the "list" variable was added, and later on in
c8ba163916 (parse-options: add OPT_STRING_LIST helper, 2011-06-09)
the "expect" was added.
The "list" variable was last touched in 2721ce21e4 (use string_list
initializer consistently, 2016-06-13), but it was still left at the
static scope, it's better to move it to the function for consistency.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in t/helper/test-prio-queue.c, the lack of freeing
the memory with clear_prio_queue() has been there ever since this code
was originally added in b4b594a315 (prio-queue: priority queue of
pointers to structs, 2013-06-06).
By fixing this leak we can cleanly run t0009-prio-queue.sh under
SANITIZE=leak, so annotate it as such with
TEST_PASSES_SANITIZE_LEAK=true.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix two different but related memory leaks in
verify_clean_subdirectory(). We leaked both the "pathbuf" if
read_directory() returned non-zero, and we never cleaned up our own
"struct dir_struct" either.
* "pathbuf": When the read_directory() call followed by the
free(pathbuf) was added in c81935348b (Fix switching to a branch
with D/F when current branch has file D., 2007-03-15) we didn't
bother to free() before we called die().
But when this code was later libified in 203a2fe117 (Allow callers
of unpack_trees() to handle failure, 2008-02-07) we started to leak
as we returned data to the caller. This fixes that memory leak,
which can be observed under SANITIZE=leak with e.g. the
"t1001-read-tree-m-2way.sh" test.
* "struct dir_struct": We've leaked the dir_struct ever since this
code was added back in c81935348b.
When that commit was written there wasn't an equivalent of
dir_clear(). Since it was added in 270be81604 (dir.c: provide
clear_directory() for reclaiming dir_struct memory, 2013-01-06)
we've omitted freeing the memory allocated here.
This memory leak could also be observed under SANITIZE=leak and the
"t1001-read-tree-m-2way.sh" test.
This makes all the test in "t1001-read-tree-m-2way.sh" pass under
"GIT_TEST_PASSING_SANITIZE_LEAK=true", we'd previously die in tests
25, 26 & 28.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a sparse index it is possible for the tree that is being verified
to be freed while it is being verified. This happens when the index is
sparse but the cache tree is not and index_name_pos() looks up a path
from the cache tree that is a descendant of a sparse index entry. That
triggers a call to ensure_full_index() which frees the cache tree that
is being verified. Carrying on trying to verify the tree after this
results in a use-after-free bug. Instead restart the verification if a
sparse index is converted to a full index. This bug is triggered by a
call to reset_head() in "git rebase --apply". Thanks to René Scharfe
and Derrick Stolee for their help analyzing the problem.
==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080
READ of size 4 at 0x606000001b20 thread T0
#0 0x557cbe82d3a1 in verify_one /home/phil/src/git/cache-tree.c:863
#1 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
#2 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
#3 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
#4 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
#5 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
#6 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
#7 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
#8 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
#9 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
#10 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
#11 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
#12 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
#13 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
#14 0x557cbe5bcb8d in _start (/home/phil/src/git/git+0x1b9b8d)
0x606000001b20 is located 0 bytes inside of 56-byte region [0x606000001b20,0x606000001b58)
freed by thread T0 here:
#0 0x7fdd4bacff19 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:127
#1 0x557cbe82af60 in cache_tree_free /home/phil/src/git/cache-tree.c:35
#2 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
#3 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
#4 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
#5 0x557cbeb2557a in ensure_full_index /home/phil/src/git/sparse-index.c:310
#6 0x557cbea45c4a in index_name_stage_pos /home/phil/src/git/read-cache.c:588
#7 0x557cbe82ce37 in verify_one /home/phil/src/git/cache-tree.c:850
#8 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
#9 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
#10 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
#11 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
#12 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
#13 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
#14 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
#15 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
#16 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
#17 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
#18 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
#19 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
#20 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
previously allocated by thread T0 here:
#0 0x7fdd4bad0459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x557cbebc1807 in xcalloc /home/phil/src/git/wrapper.c:140
#2 0x557cbe82b7d8 in cache_tree /home/phil/src/git/cache-tree.c:17
#3 0x557cbe82b7d8 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:763
#4 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
#5 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
#6 0x557cbe8304e1 in prime_cache_tree /home/phil/src/git/cache-tree.c:779
#7 0x557cbeab7fa7 in reset_head /home/phil/src/git/reset.c:85
#8 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
#9 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
#10 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
#11 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
#12 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
#13 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
#14 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In read_midx_preferred_pack(), we open the bitmap index but never free
it. This isn't a big deal since this is just a test helper, and we exit
immediately after, but since we're trying to keep our leak-checking tidy
now, it's worth fixing.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to re-read the edited todo list in "git rebase -i" was
made more robust.
* pw/rebase-reread-todo-after-editing:
rebase: fix todo-list rereading
sequencer.c: factor out a function
Code cleanup.
* ab/repo-settings-cleanup:
repository.h: don't use a mix of int and bitfields
repo-settings.c: simplify the setup
read-cache & fetch-negotiator: check "enum" values in switch()
environment.c: remove test-specific "ignore_untracked..." variable
wrapper.c: add x{un,}setenv(), and use xsetenv() in environment.c
Code clean-up in the "grep" machinery.
* jk/grep-haystack-is-read-only:
grep: store grep_source buffer as const
grep: mark "haystack" buffers as const
grep: stop modifying buffer in grep_source_1()
grep: stop modifying buffer in show_line()
grep: stop modifying buffer in strip_timestamp
Regression in "git commit-graph" command line parsing has been
corrected.
* tb/commit-graph-usage-fix:
builtin/multi-pack-index.c: disable top-level --[no-]progress
builtin/commit-graph.c: don't accept common --[no-]progress
"git rebase <upstream> <tag>" failed when aborted in the middle, as
it mistakenly tried to write the tag object instead of peeling it
to HEAD.
* pw/rebase-of-a-tag-fix:
rebase: dereference tags
rebase: use lookup_commit_reference_by_name()
rebase: use our standard error return value
t3407: rework rebase --quit tests
t3407: strengthen rebase --abort tests
t3407: use test_path_is_missing
t3407: rename a variable
t3407: use test_cmp_rev
t3407: use test_commit
t3407: run tests in $TEST_DIRECTORY
More code paths that use the hack to add submodule's object
database to the set of alternate object store have been cleaned up.
* jt/add-submodule-odb-clean-up:
revision: remove "submodule" from opt struct
repository: support unabsorbed in repo_submodule_init
submodule: remove unnecessary unabsorbed fallback
When EDITOR is invoked to modify a commit message, it will likely
change the terminal settings, and if it misbehaves will leave the
terminal output damaged as shown in a recent report from Windows
Terminal[1]
Instead use the functions provided by compat/terminal to save the
settings and recover safely.
[1] https://github.com/microsoft/terminal/issues/9359
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, git will share its console with all its children (unless
they create their own), and is therefore possible that any of them
that might change the settings for it could affect its operations once
completed.
Refactor the platform specific functionality to save the terminal
settings and expand it to also do so for the output handler.
This will allow for the state of the terminal to be saved and
restored around a child that might misbehave (ex vi) which will
be implemented next.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach test_perf_() to remove the temporary test_times.* files
at the end of each test.
test_perf_() runs a particular GIT_PERF_REPEAT_COUNT times and creates
./test_times.[123...]. It then uses a perl script to find the minimum
over "./test_times.*" (note the wildcard) and writes that time to
"test-results/<testname>.<testnumber>.result".
If the repeat count is changed during the pXXXX test script, stale
test_times.* files (from previous steps) may be included in the min()
computation. For example:
...
GIT_PERF_REPEAT_COUNT=3 \
test_perf "status" "
git status
"
GIT_PERF_REPEAT_COUNT=1 \
test_perf "checkout other" "
git checkout other
"
...
The time reported in the summary for "XXXX.2 checkout other" would
be "min( checkout[1], status[2], status[3] )".
We prevent that error by removing the test_times.* files at the end of
each test.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using `test_size` with `wc -c`, users on certain platforms can run
into issues when `wc` emits leading space characters in its output,
which confuses get_times.
Callers could switch to use test_file_size instead of `wc -c` (the
former never prints leading space characters, so will always work with
test_size regardless of platform), but this is an easy enough spot to
miss that we should teach get_times to be more tolerant of the input it
accepts.
Teach get_times to do just that by stripping any leading space
characters.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The man page documents that git-status can find copies, but does not
mention how. Whereas git-diff has command line options -C, there is
no such option for git-status - it will only detect copies when the
"status.renames" config option is "copies" or "copy". Document that
in git-status.txt because this has confused me and others[1].
[1]: https://www.reddit.com/r/git/comments/ppc2l9/how_to_get_a_file_with_copied_status/
Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As reported in [1], T is missing from the description of porcelain
status letters in git-status(1) (whereas T *is* documented in
git-diff-files(1) and friends). Document T right after M (modified)
because the two are very similar.
[1] https://github.com/fish-shell/fish-shell/issues/8311
Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Porcelain status letter T is documented as "type of the file", which
is technically correct but not enough information for users that are
not so familiar with this term from systems programming. In particular,
given that the only supported file types are "regular file", "symbolic
link" and "submodule", the term "file type" is surely opaque to the
many(?) users who are not aware that symbolic links can be tracked -
I thought that a "chmod +x" could cause the T status (wrong, it's M).
Explicitly document the three file types so users know if/how they
want to handle this.
Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 176ea74793 ("wt-status.c: handle worktree renames", 2017-12-27)
made a porcelain status like .R or .C possible. They occur only when
the source file is added to the index and the destination file is
added with --intent-to-add.
They also documented DR, but that status is impossible. The index
change D means that the source file does not exist in the index.
The worktree change R/C states that the file has been renamed/copied
since the index, but that's impossible if it did not exist there.
Reported-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
b3dfeebb92 (rebase: avoid computing unnecessary patch IDs, 2016-07-29)
added a perf test that calls tac(1) from GNU core utilities. Support
systems without it by reversing the generated list using sort -nr
instead. sort(1) with options -n and -r is already used in other tests.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Protocol v0 clients can get stuck parsing a malformed feature line.
* ah/connect-parse-feature-v0-fix:
connect: also update offset for features without values
"make clean" has been updated to remove leftover .depend/
directories, even when it is not told to use them to compute header
dependencies.
* ab/make-clean-depend-dirs:
Makefile: clean .depend dirs under COMPUTE_HEADER_DEPENDENCIES != yes
Sensitive data in the HTTP trace were supposed to be redacted, but
we failed to do so in HTTP/2 requests.
* jk/http-redact-fix:
http: match headers case-insensitively when redacting
Futz with the way 'errno' is relied on in the refs API to carry the
failure modes up the call chain.
* hn/refs-errno-cleanup:
refs: make errno output explicit for read_raw_ref_fn
refs/files-backend: stop setting errno from lock_ref_oid_basic
refs: remove EINVAL errno output from specification of read_raw_ref_fn
refs file backend: move raceproof_create_file() here
Continued work on top of the hn/refs-errno-cleanup topic.
* ab/refs-files-cleanup:
refs/files: remove unused "errno != ENOTDIR" condition
refs/files: remove unused "errno == EISDIR" code
refs/files: remove unused "oid" in lock_ref_oid_basic()
refs API: remove OID argument to reflog_expire()
reflog expire: don't lock reflogs using previously seen OID
refs/files: add a comment about refs_reflog_exists() call
refs: make repo_dwim_log() accept a NULL oid
refs/debug: re-indent argument list for "prepare"
refs/files: remove unused "skip" in lock_raw_ref() too
refs/files: remove unused "extras/skip" in lock_ref_oid_basic()
refs: drop unused "flags" parameter to lock_ref_oid_basic()
refs/files: remove unused REF_DELETING in lock_ref_oid_basic()
refs/packet: add missing BUG() invocations to reflog callbacks
"git cvsserver" had a long-standing bug in its authentication code,
which has finally been corrected (it is unclear and is a separate
question if anybody is seriously using it, though).
* cb/cvsserver:
Documentation: cleanup git-cvsserver
git-cvsserver: protect against NULL in crypt(3)
git-cvsserver: use crypt correctly to compare password hashes
"git clone" from a repository whose HEAD is unborn into a bare
repository didn't follow the branch name the other side used, which
is corrected.
* jk/clone-unborn-head-in-bare:
clone: handle unborn branch in bare repos
"git stash", where the tentative change involves changing a
directory to a file (or vice versa), was confused, which has been
corrected.
* en/stash-df-fix:
stash: restore untracked files AFTER restoring tracked files
stash: avoid feeding directories to update-index
t3903: document a pair of directory/file bugs
To prevent the race described in an earlier patch, generate and pass a
reference snapshot to the multi-pack bitmap code, if we are writing one
from `git repack`.
This patch is mostly limited to creating a temporary file, and then
calling for_each_ref(). Except we try to minimize duplicates, since
doing so can drastically reduce the size in network-of-forks style
repositories. In the kernel's fork network (the repository containing
all objects from the kernel and all its forks), deduplicating the
references drops the snapshot size from 934 MB to just 12 MB.
But since we're handling duplicates in this way, we have to make sure
that we preferred references (those listed in pack.preferBitmapTips)
before non-preferred ones (to avoid recording an object which is pointed
at by a preferred tip as non-preferred).
We accomplish this by doing separate passes over the references: first
visiting each prefix in pack.preferBitmapTips, and then over the rest of
the references.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add missing space between colon sentence (`bisect-run failed:`) and the
following sentence (`git bisect--helper --bisect-state`).
Fixes: d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13)
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While there is descriptions for porcelain and incremental output
formats, the default format isn't described. Describe that format for
the sake of completeness.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the error that's emitted in cases where we find a loose object
we parse, but which isn't at the location we expect it to be.
Before this change we'd prefix the error with a not-a-OID derived from
the path at which the object was found, due to an emergent behavior in
how we'd end up with an "OID" in these codepaths.
Now we'll instead say what object we hashed, and what path it was
found at. Before this patch series e.g.:
$ git hash-object --stdin -w -t blob </dev/null
e69de29bb2
$ mv objects/e6/ objects/e7
Would emit ("[...]" used to abbreviate the OIDs):
git fsck
error: hash mismatch for ./objects/e7/9d[...] (expected e79d[...])
error: e79d[...]: object corrupt or missing: ./objects/e7/9d[...]
Now we'll instead emit:
error: e69d[...]: hash-path mismatch, found at: ./objects/e7/9d[...]
Furthermore, we'll do the right thing when the object type and its
location are bad. I.e. this case:
$ git hash-object --stdin -w -t garbage --literally </dev/null
8315a83d2acc4c174aed59430f9a9c4ed926440f
$ mv objects/83 objects/84
As noted in an earlier commits we'd simply die early in those cases,
until preceding commits fixed the hard die on invalid object type:
$ git fsck
fatal: invalid object type
Now we'll instead emit sensible error messages:
$ git fsck
error: 8315[...]: hash-path mismatch, found at: ./objects/84/15[...]
error: 8315[...]: object is of unknown type 'garbage': ./objects/84/15[...]
In both fsck.c and object-file.c we're using null_oid as a sentinel
value for checking whether we got far enough to be certain that the
issue was indeed this OID mismatch.
We need to add the "object corrupt or missing" special-case to deal
with cases where read_loose_object() will return an error before
completing check_object_signature(), e.g. if we have an error in
unpack_loose_rest() because we find garbage after the valid gzip
content:
$ git hash-object --stdin -w -t blob </dev/null
e69de29bb2
$ chmod 755 objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
$ echo garbage >>objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
$ git fsck
error: garbage at end of loose object 'e69d[...]'
error: unable to unpack contents of ./objects/e6/9d[...]
error: e69d[...]: object corrupt or missing: ./objects/e6/9d[...]
There is currently some weird messaging in the edge case when the two
are combined, i.e. because we're not explicitly passing along an error
state about this specific scenario from check_stream_oid() via
read_loose_object() we'll end up printing the null OID if an object is
of an unknown type *and* it can't be unpacked by zlib, e.g.:
$ git hash-object --stdin -w -t garbage --literally </dev/null
8315a83d2acc4c174aed59430f9a9c4ed926440f
$ chmod 755 objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
$ echo garbage >>objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
$ /usr/bin/git fsck
fatal: invalid object type
$ ~/g/git/git fsck
error: garbage at end of loose object '8315a83d2acc4c174aed59430f9a9c4ed926440f'
error: unable to unpack contents of ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
error: 8315a83d2acc4c174aed59430f9a9c4ed926440f: object corrupt or missing: ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
error: 0000000000000000000000000000000000000000: object is of unknown type 'garbage': ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
[...]
I think it's OK to leave that for future improvements, which would
involve enum-ifying more error state as we've done with "enum
unpack_loose_header_result" in preceding commits. In these
increasingly more obscure cases the worst that can happen is that
we'll get slightly nonsensical or inapplicable error messages.
There's other such potential edge cases, all of which might produce
some confusing messaging, but still be handled correctly as far as
passing along errors goes. E.g. if check_object_signature() returns
and oideq(real_oid, null_oid()) is true, which could happen if it
returns -1 due to the read_istream() call having failed.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the error fsck emits on invalid object types, such as:
$ git hash-object --stdin -w -t garbage --literally </dev/null
<OID>
From the very ungraceful error of:
$ git fsck
fatal: invalid object type
$
To:
$ git fsck
error: <OID>: object is of unknown type 'garbage': <OID_PATH>
[ other fsck output ]
We'll still exit with non-zero, but now we'll finish the rest of the
traversal. The tests that's being added here asserts that we'll still
complain about other fsck issues (e.g. an unrelated dangling blob).
To do this we need to pass down the "OBJECT_INFO_ALLOW_UNKNOWN_TYPE"
flag from read_loose_object() through to parse_loose_header(). Since
the read_loose_object() function is only used in builtin/fsck.c we can
simply change it to accept a "struct object_info" (which contains the
OBJECT_INFO_ALLOW_UNKNOWN_TYPE in its flags). See
f6371f9210 (sha1_file: add read_loose_object() function, 2017-01-13)
for the introduction of read_loose_object().
Since we'll need a "struct strbuf" to hold the "type_name" let's pass
it to the for_each_loose_file_in_objdir() callback to avoid allocating
a new one for each loose object in the iteration. It also makes the
memory management simpler than sticking it in fsck_loose() itself, as
we'll only need to strbuf_reset() it, with no need to do a
strbuf_release() before each "return".
Before this commit we'd never check the "type" if read_loose_object()
failed, but now we do. We therefore need to initialize it to OBJ_NONE
to be able to tell the difference between e.g. its
unpack_loose_header() having failed, and us getting past that and into
parse_loose_header().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make parse_loose_header() return error codes and data instead of
invoking die() by itself.
For now we'll move the relevant die() call to loose_object_info() and
read_loose_object() to keep this change smaller. In a subsequent
commit we'll make read_loose_object() return an error code instead of
dying. We should also address the "allow_unknown" case (should be
moved to builtin/cat-file.c), but for now I'll be leaving it.
For making parse_loose_header() not die() change its prototype to
accept a "struct object_info *" instead of the "unsigned long *sizep"
it accepted before. Its callers can now check the populated populated
"oi->typep".
Because of this we don't need to pass in the "unsigned int flags"
which we used for OBJECT_INFO_ALLOW_UNKNOWN_TYPE, we can instead do
that check in loose_object_info().
This also refactors some confusing control flow around the "status"
variable. In some cases we set it to the return value of "error()",
i.e. -1, and later checked if "status < 0" was true.
Since 93cff9a978 (sha1_loose_object_info: return error for corrupted
objects, 2017-04-01) the return value of loose_object_info() (then
named sha1_loose_object_info()) had been a "status" variable that be
any negative value, as we were expecting to return the "enum
object_type".
The only negative type happens to be OBJ_BAD, but the code still
assumed that more might be added. This was then used later in
e.g. c84a1f3ed4 (sha1_file: refactor read_object, 2017-06-21). Now
that parse_loose_header() will return 0 on success instead of the
type (which it'll stick into the "struct object_info") we don't need
to conflate these two cases in its callers.
Since parse_loose_header() doesn't need to return an arbitrary
"status" we only need to treat its "ret < 0" specially, but can
idiomatically overwrite it with our own error() return. This along
with having made unpack_loose_header() return an "enum
unpack_loose_header_result" in an earlier commit means that we can
move the previously nested if/else cases mostly into the "ULHR_OK"
branch of the "switch" statement.
We should be less silent if we reach that "status = -1" branch, which
happens if we've got trailing garbage in loose objects, see
f6371f9210 (sha1_file: add read_loose_object() function, 2017-01-13)
for a better way to handle it. For now let's punt on it, a subsequent
commit will address that edge case.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split up the return code for "header too long" from the generic
negative return value unpack_loose_header() returns, and report via
error() if we exceed MAX_HEADER_LEN.
As a test added earlier in this series in t1006-cat-file.sh shows
we'll correctly emit zlib errors from zlib.c already in this case, so
we have no need to carry those return codes further down the
stack. Let's instead just return ULHR_TOO_LONG saying we ran into the
MAX_HEADER_LEN limit, or other negative values for "unable to unpack
<OID> header".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a preceding commit we changed and documented unpack_loose_header()
from its previous behavior of returning any negative value or zero, to
only -1 or 0.
Let's add an "enum unpack_loose_header_result" type and use it for
these return values, and have the compiler assert that we're
exhaustively covering all of them.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Combine the unpack_loose_short_header(),
unpack_loose_header_to_strbuf() and unpack_loose_header() functions
into one.
The unpack_loose_header_to_strbuf() function was added in
46f034483e (sha1_file: support reading from a loose object of unknown
type, 2015-05-03).
Its code was mostly copy/pasted between it and both of
unpack_loose_header() and unpack_loose_short_header(). We now have a
single unpack_loose_header() function which accepts an optional
"struct strbuf *" instead.
I think the remaining unpack_loose_header() function could be further
simplified, we're carrying some complexity just to be able to emit a
garbage type longer than MAX_HEADER_LEN, we could alternatively just
say "we found a garbage type <first 32 bytes>..." instead. But let's
leave the current behavior in place for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the parse_loose_header_extended() function public and remove the
parse_loose_header() wrapper. The only direct user of it outside of
object-file.c itself was in streaming.c, that caller can simply pass
the required "struct object-info *" instead.
This change is being done in preparation for teaching
read_loose_object() to accept a flag to pass to
parse_loose_header(). It isn't strictly necessary for that change, we
could simply use parse_loose_header_extended() there, but will leave
the API in a better end state.
It would be a better end-state to have already moved the declaration
of these functions to object-store.h to avoid the forward declaration
of "struct object_info" in cache.h, but let's leave that cleanup for
some other time.
1. https://lore.kernel.org/git/patch-v6-09.22-5b9278e7bb4-20210907T104559Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Return a -1 when git_inflate() fails instead of whatever Z_* status
we'd get from zlib.c. This makes no difference to any error we report,
but makes it more obvious that we don't care about the specific zlib
error codes here.
See d21f842690 (unpack_sha1_header(): detect malformed object header,
2016-09-25) for the commit that added the "return status" code. As far
as I can tell there was never a real reason (e.g. different reporting)
for carrying down the "status" as opposed to "-1".
At the time that d21f842690 was written there was a corresponding
"ret < Z_OK" check right after the unpack_sha1_header() call (the
"unpack_sha1_header()" function was later rename to our current
"unpack_loose_header()").
However, that check was removed in c84a1f3ed4 (sha1_file: refactor
read_object, 2017-06-21) without changing the corresponding return
code.
So let's do the minor cleanup of also changing this function to return
a -1.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the loose_object_info() function returns an error stop faking up
the "oi->typep" to OBJ_BAD. Let the return value of the function
itself suffice. This code cleanup simplifies subsequent changes.
That we set this at all is a relic from the past. Before
052fe5eaca (sha1_loose_object_info: make type lookup optional,
2013-07-12) we would always return the type_from_string(type) via the
parse_sha1_header() function, or -1 (i.e. OBJ_BAD) if we couldn't
parse it.
Then in a combination of 46f034483e (sha1_file: support reading from
a loose object of unknown type, 2015-05-03) and
b3ea7dd32d (sha1_loose_object_info: handle errors from
unpack_sha1_rest, 2017-10-05) our API drifted even further towards
conflating the two again.
Having read the code paths involved carefully I think this is OK. We
are just about to return -1, and we have only one caller:
do_oid_object_info_extended(). That function will in turn go on to
return -1 when we return -1 here.
This might be introducing a subtle bug where a caller of
oid_object_info_extended() would inspect its "typep" and expect a
meaningful value if the function returned -1.
Such a problem would not occur for its simpler oid_object_info()
sister function. That one always returns the "enum object_type", which
in the case of -1 would be the OBJ_BAD.
Having read the code for all the callers of these functions I don't
believe any such bug is being introduced here, and in any case we'd
likely already have such a bug for the "sizep" member (although
blindly checking "typep" first would be a more common case).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a blindspot in the tests for "cat-file" (and by proxy, the guts of
object-file.c) by testing that when we can't decode a loose object
with zlib we'll emit an error from zlib.c.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we look up a missing object with cat_one_file() what error we
print out currently depends on whether we'll error out early in
get_oid_with_context(), or if we'll get an error later from
oid_object_info_extended().
The --allow-unknown-type flag then changes whether we pass the
"OBJECT_INFO_ALLOW_UNKNOWN_TYPE" flag to get_oid_with_context() or
not.
The "-p" flag is yet another special-case in printing the same output
on the deadbeef OID as we'd emit on the deadbeef_short OID for the
"-s" and "-t" options, it also doesn't support the
"--allow-unknown-type" flag at all.
Let's test the combination of the two sets of [-t, -s, -p] and
[--{no-}allow-unknown-type] (the --no-allow-unknown-type is implicit
in not supplying it), as well as a [missing,bogus] object pair.
This extends tests added in 3e370f9faf (t1006: add tests for git
cat-file --allow-unknown-type, 2015-05-03).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the short/long bogus bogus object type variables into a form
where the two sets can be used concurrently. This'll be used by
subsequently added tests.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There wasn't any output tests for this scenario, let's ensure that we
don't regress on it in the changes that come after this.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If fsck we move an object around between .git/objects/?? directories
to simulate a hash mismatch "git fsck" will currently hard die() in
object-file.c. This behavior will be fixed in subsequent commits, but
let's test for it as-is for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor one of the fsck tests to use a throwaway repository. It's a
pervasive pattern in t1450-fsck.sh to spend a lot of effort on the
teardown of a tests so we're not leaving corrupt content for the next
test.
We can instead use the pattern of creating a named sub-repository,
then we don't have to worry about cleaning up after ourselves, nobody
will care what state the broken "hash-mismatch" repository is after
this test runs.
See [1] for related discussion on various "modern" test patterns that
can be used to avoid verbosity and increase reliability.
1. https://lore.kernel.org/git/87y27veeyj.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a blindspot in the fsck tests by checking what we do when we
encounter an unknown "garbage" type produced with hash-object's
--literally option.
This behavior needs to be improved, which'll be done in subsequent
patches, but for now let's test for the current behavior.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function was removed in ad0fb65999 (repo-settings: parse
core.untrackedCache, 2019-08-13), but not its corresponding *.h entry.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The init_log_tree_opt() and log_tree_opt_parse() functions were
removed in cd2bdc5309 (Common option parsing for "git log --diff" and
friends, 2006-04-14), but not their corresponding *.h declaration.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function was removed in 0579f91dd7 (grep: enable threading with
-p and -W using lazy attribute lookup, 2011-12-12), but not its
corresponding *.h declaration.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cmd_tar_tree() function itself was removed in
925ceccf05 (tar-tree: remove deprecated command, 2013-11-10).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commit we introduced REF_STATES_INIT, but did not
change the "struct show_info" to have a corresponding
initializer. Let's do that, and make it use "REF_STATES_INIT" and
"STRING_LIST_INIT_DUP", doing that requires changing "list" and
"states" away from being pointers.
The resulting end-state is simpler since we omit the local "info_list"
and "states" variables in show() as well as the memset().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use a new REF_STATES_INIT designated initializer instead of assigning
to the "strdup_strings" member of the previously memzero()'d version
of this struct.
The pattern of assigning to "strdup_strings" dates back to
211c89682e (Make git-remote a builtin, 2008-02-29) (when it was
"strdup_paths"), i.e. long before we used anything like our current
established *_INIT patterns consistently.
Then in e61e0cc6b7 (builtin-remote: teach show to display remote
HEAD, 2009-02-25) and e5dcbfd9ab (builtin-remote: new show output
style for push refspecs, 2009-02-25) we added some more of these.
As it turns out we only initialized this struct three times, all the
other uses were of pointers to those initialized structs. So let's
initialize it in those three places, skip the memset(), and pass those
structs down appropriately.
This would be a behavior change if we had codepaths that relied say on
implicitly having had "new_refs" initialized to STRING_LIST_INIT_NODUP
with the memset(), but only set the "strdup_strings" on some other
struct, but then called string_list_append() on "new_refs". There
isn't any such codepath, all of the late assignments to
"strdup_strings" assigned to those structs that we'd use for those
codepaths.
So just initializing them all up-front makes for easier to understand
code, i.e. in the pre-image it looked as though we had that tricky
edge case, but we didn't.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the initialization pattern of "struct urlmatch_config" to use
an *_INIT macro and designated initializers. Right now there's no
other "struct" member of "struct urlmatch_config" which would require
its own *_INIT, but it's good practice not to assume that. Let's also
change this to a designated initializer while we're at it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bottom-up mergesort implementation needs to skip through sublists a
lot. A recursive version could avoid that, but would require log2(n)
stack frames. Explicitly manage a stack of sorted sublists of various
lengths instead to avoid fast-forwarding while also keeping a lid on
memory usage.
While this patch was developed independently, a ranks stack is also used
in https://github.com/mono/mono/blob/master/mono/eglib/sort.frag.h by
the Mono project.
The idea is to keep slots for log2(n_max) sorted sublists, one for each
power of 2. Such a construct can accommodate lists of any length up to
n_max. Since there is a known maximum number of items (effectively
SIZE_MAX), we can preallocate the whole rank stack.
We add items one by one, which is akin to incrementing a binary number.
Make use of that by keeping track of the number of items and check bits
in it instead of checking for NULL in the rank stack when checking if a
sublist of a certain rank exists, in order to avoid memory accesses.
The first item can go into the empty first slot as a sublist of length
2^0. The second one needs to be merged with the previous sublist and
the result goes into the empty second slot as a sublist of length 2^1.
The third one goes into vacated first slot and so on. At the end we
merge all the sublists to get the result.
The new version still performs a stable sort by making sure to put items
seen earlier first when the compare function indicates equality. That's
done by preferring items from sublists with a higher rank.
The new merge function also tries to minimize the number of operations.
Like blame.c::blame_merge(), the function doesn't set the next pointer
if it already points to the right item, and it exits when it reaches the
end of one of the two sublists that it's given. The old code couldn't
do the latter because it kept all items in a single list.
The number of comparisons stays the same, though. Here's example output
of "test-tool mergesort test" for the rand distributions with the most
number of comparisons with the ranks stack:
$ t/helper/test-tool mergesort test | awk '
NR > 1 && $1 != "rand" {next}
$7 > max[$3] {max[$3] = $7; line[$3] = $0}
END {for (n in line) print line[n]}
'
distribut mode n m get_next set_next compare verdict
rand copy 100 32 669 420 569 OK
rand dither 1023 64 9997 5396 8974 OK
rand dither 1024 512 10007 6159 8983 OK
rand dither 1025 256 10993 5988 9968 OK
Here are the differences to the results without this patch:
distribut mode n m get_next set_next compare
rand copy 100 32 -515 -280 0
rand dither 1023 64 -6376 -4834 0
rand dither 1024 512 -6377 -4081 0
rand dither 1025 256 -7461 -5287 0
The numbers of get_next and set_next calls are reduced significantly.
NB: These winners are different than the ones shown in the patch that
introduced the unriffle mode because the addition of the unriffle_skewed
mode in between changed the consumption of rand() values.
Here are the distributions with the most comparisons overall with the
ranks stack:
$ t/helper/test-tool mergesort test | awk '
$7 > max[$3] {max[$3] = $7; line[$3] = $0}
END {for (n in line) print line[n]}
'
distribut mode n m get_next set_next compare verdict
sawtooth unriffle_skewed 100 128 689 632 589 OK
sawtooth unriffle_skewed 1023 1024 10230 10220 9207 OK
sawtooth unriffle 1024 1024 10241 10240 9217 OK
sawtooth unriffle_skewed 1025 2048 11266 10242 10241 OK
And here the differences to before:
distribut mode n m get_next set_next compare
sawtooth unriffle_skewed 100 128 -495 -68 0
sawtooth unriffle_skewed 1023 1024 -6143 -10 0
sawtooth unriffle 1024 1024 -6143 0 0
sawtooth unriffle_skewed 1025 2048 -7188 -1033 0
We get a similar reduction of get_next calls here, but only a slight
reduction of set_next calls, if at all.
And here are the results of p0071-sort.sh before:
0071.12: llist_mergesort() unsorted 0.36(0.33+0.01)
0071.14: llist_mergesort() sorted 0.15(0.13+0.01)
0071.16: llist_mergesort() reversed 0.16(0.14+0.01)
... and here the ones with this patch:
0071.12: llist_mergesort() unsorted 0.24(0.22+0.01)
0071.14: llist_mergesort() sorted 0.12(0.10+0.01)
0071.16: llist_mergesort() reversed 0.12(0.10+0.01)
NB: We can't use t/perf/run to compare revisions in one run because it
uses the test-tool from the worktree, not from the revisions being
tested.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check if sorting takes advantage of already sorted or reversed content,
or if that corner case actually decreases performance, like it would for
a simplistic quicksort implementation.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a mode that turns a sorted list into adversarial input for a
bottom-up mergesort implementation that doubles the length of sorted
sublists at each level -- like our llist_mergesort().
While unriffle mode splits the list in half at each recursion step,
unriffle_skewed splits it into 2^l items and the rest, with 2^l being
the highest power of two smaller than the number of items and thus
2^l >= rest. The rest is unriffled with the tail of the first half to
require a merge to compare the maximum number of elements.
It complements the unriffle mode, which targets balanced merges. If
the number of elements is a power of two then both actually produce the
same result, as 2^l == rest == n/2 at each recursion step in that case.
Here are the results:
$ t/helper/test-tool mergesort test | awk '
$7 > max[$3] {max[$3] = $7; line[$3] = $0}
END {for (n in line) print line[n]}
'
distribut mode n m get_next set_next compare verdict
sawtooth unriffle_skewed 100 128 1184 700 589 OK
sawtooth unriffle_skewed 1023 1024 16373 10230 9207 OK
sawtooth unriffle 1024 1024 16384 10240 9217 OK
sawtooth unriffle_skewed 1025 2048 18454 11275 10241 OK
The sawtooth distribution with m>=n produces a sorted list and
unriffle_skewed mode turns it into adversarial input for unbalanced
merges, which it wins in all cases except for n=1024 -- the resulting
list is the same, but unriffle is tested before unriffle_skewed, so its
result is selected by the AWK script.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a mode that turns sorted items into adversarial input for mergesort.
Do that by running mergesort in reverse and rearranging the items in
such a way that each merge needs the maximum number of operations to
undo it.
To riffle is a card shuffling technique and involves splitting a deck
into two and then to interleave them. A perfect riffle takes one card
from each half in turn. That's similar to the most expensive merge,
which has to take one item from each sublist in turn, which requires the
maximum number of comparisons (n-1).
So unriffle does that in reverse, i.e. it generates the first sublist
out of the items at even indexes and the second sublist out of the items
at odd indexes, without changing their order in any other way. Done
recursively until we reach the trivial sublist length of one, this
twists the list into an order that requires the maximum effort for
mergesort to untangle.
As a baseline, here are the rand distributions with the highest number
of comparisons from "test-tool mergesort test":
$ t/helper/test-tool mergesort test | awk '
NR > 1 && $1 != "rand" {next}
$7 > max[$3] {max[$3] = $7; line[$3] = $0}
END {for (n in line) print line[n]}
'
distribut mode n m get_next set_next compare verdict
rand copy 100 32 1184 700 569 OK
rand reverse_1st_half 1023 256 16373 10230 8976 OK
rand reverse_1st_half 1024 512 16384 10240 8993 OK
rand dither 1025 64 18454 11275 9970 OK
And here are the most expensive ones overall:
$ t/helper/test-tool mergesort test | awk '
$7 > max[$3] {max[$3] = $7; line[$3] = $0}
END {for (n in line) print line[n]}
'
distribut mode n m get_next set_next compare verdict
stagger reverse 100 64 1184 700 580 OK
sawtooth unriffle 1023 1024 16373 10230 9179 OK
sawtooth unriffle 1024 1024 16384 10240 9217 OK
stagger unriffle 1025 2048 18454 11275 10241 OK
The sawtooth distribution with m>=n generates a sorted list. The
unriffle mode is designed to turn that into adversarial input for
mergesort, and that checks out for n=1023 and n=1024, where it produces
the list that requires the most comparisons.
Item counts that are not powers of two have other winners, and that's
because unriffle recursively splits lists into equal-sized halves, while
llist_mergesort() splits them into the biggest power of two smaller than
n and the rest, e.g. for n=1025 it sorts the first 1024 separately and
finally merges them to the last item.
So unriffle mode works as designed for the intended use case, but to
consistently generate adversarial input for unbalanced merges we need
something else.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a subcommand for printing test data. It can be used to generate
special test cases and feed them into the sort subcommand or sort(1) for
performance measurements. It may also be useful to illustrate the
effect of distributions, modes and their parameters.
It generates n integers with the specified distribution and its
distribution-specific parameter m. E.g. m is the maximum value for
the plateau distribution and the length and height of individual teeth
of the sawtooth distribution.
The generated values are printed as zero-padded eight-digit hexadecimal
numbers to make sure alphabetic and numeric order are the same.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adapt the qsort certification program from "Engineering a Sort Function"
by Bentley and McIlroy for testing our linked list sort function. It
generates several lists with various distribution patterns and counts
the number of operations llist_mergesort() needs to order them. It
compares the result to the output of a trusted sort function (qsort(1))
and also checks if the sort is stable.
Also add a test script that makes use of the new subcommand.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Give the code for sorting a text file its own sub-command. This allows
extending the helper, which we'll do in the following patches.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Strip line ending characters to make sure empty lines are sorted like
sort(1) does.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `buf` strbuf is reused again later in the same function, so there
is no benefit to calling strbuf_release(). The subsequent usage is
already using strbuf_reset() to reset the buffer, so releasing it
early is only going to lead to a wasteful reallocation.
Remove the early call to strbuf_release(). The same strbuf is already
cleaned up in the "finish:" section so nothing is leaked, either.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a helpers function to handle the unlinking and writing
of the dir-diff submodule and symlink stand-in files.
Use the helpers to implement the guts of the hashmap loops.
This eliminate duplicate code and safeguards the submodules
hashmap loop against the symlink-chasing behavior that 5bafb3576a
(difftool: fix symlink-file writing in dir-diff mode, 2021-09-22)
addressed.
The submodules loop should not strictly require the unlink() call that
this is introducing to them, but it does not necessarily hurt them
either beyond the cost of the extra unlink().
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The paths generated by difftool are passed to user-facing diff tools.
Using paths with repeated slashes in them is a cosmetic blemish that
is exposed to users and can be avoided.
Use a strbuf to create the buffer used for the dir-diff tmpdir.
Strip trailing slashes from the value read from TMPDIR to avoid
repeated slashes in the generated paths.
Adjust the error handling to avoid leaking strbufs and to avoid
returning -1 to cmd_main().
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These changes are made in preparation of, the colorization support for the
"git log" subcommands that, rely on regex functionality (i.e. "--author",
"--committer" and "--grep"). These changes are necessary primarily because
match_one_pattern() expects header lines to be prefixed, however, in
pretty, the prefixes are stripped from the lines because the name-email
pairs need to go through additional parsing, before they can be printed and
because next_match() doesn't handle the case of
"ctx == GREP_CONTEXT_HEAD" at all. So, teach next_match() how to handle the
new case and move match_one_pattern()'s core logic to
headerless_match_one_pattern() while preserving match_one_pattern()'s uses
that depend on the additional processing.
Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some circumstances, "git grep --textconv --recurse-submodules"
ignores the textconv attributes from the submodules and erroneously
applies the attributes defined in the superproject on the submodules'
files. The textconv cache is also saved on the superproject, even for
submodule objects.
A fix for these problems will probably require at least three changes:
- Some textconv and attributes functions (as well as their callees) will
have to be adjusted to work with arbitrary repositories. Note that
"fill_textconv()", for example, already receives a "struct repository"
but it writes the textconv cache using "write_loose_object()", which
implicitly works on "the_repository".
- grep.c functions will have to call textconv/userdiff routines passing
the "repo" field from "struct grep_source" instead of the one from
"struct grep_opt". The latter always points to "the_repository" on
"git grep" executions (see its initialization in builtin/grep.c), but
the former points to the correct repository that each source (an
object, file, or buffer) comes from.
- "userdiff_find_by_path()" might need to use a different attributes
stack for each repository it works on or reset its internal static
stack when the repository is changed throughout the calls.
For now, let's add some tests to demonstrate these problems, and also
update a NEEDSWORK comment in grep.h that mentions this bug to reference
the added tests.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When repacking into a geometric series and writing a multi-pack bitmap,
it is beneficial to have the largest resulting pack be the preferred
object source in the bitmap's MIDX, since selecting the large packs can
lead to fewer broken delta chains and better compression.
Teach 'git repack' to identify this pack and pass it to the MIDX write
machinery in order to mark it as preferred.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach `git repack` a new `--write-midx` option for callers that wish to
persist a multi-pack index in their repository while repacking.
There are two existing alternatives to this new flag, but they don't
cover our particular use-case. These alternatives are:
- Call 'git multi-pack-index write' after running 'git repack', or
- Set 'GIT_TEST_MULTI_PACK_INDEX=1' in your environment when running
'git repack'.
The former works, but introduces a gap in bitmap coverage between
repacking and writing a new MIDX (since the repack may have deleted a
pack included in the existing MIDX, invalidating it altogether).
Setting the 'GIT_TEST_' environment variable is obviously unsupported.
In fact, even if it were supported officially, it still wouldn't work,
because it generates the MIDX *after* redundant packs have been dropped,
leading to the same issue as above.
Introduce a new option which eliminates this race by teaching `git
repack` to generate the MIDX at the critical point: after the new packs
have been written and moved into place, but before the redundant packs
have been removed.
This option is compatible with `git repack`'s '--bitmap' option (it
changes the interpretation to be: "write a bitmap corresponding to the
MIDX after one has been generated").
There is a little bit of additional noise in the patch below to avoid
repeating ourselves when selecting which packs to delete. Instead of a
single loop as before (where we iterate over 'existing_packs', decide if
a pack is worth deleting, and if so, delete it), we have two loops (the
first where we decide which ones are worth deleting, and the second
where we actually do the deleting). This makes it so we have a single
check we can make consistently when (1) telling the MIDX which packs we
want to exclude, and (2) actually unlinking the redundant packs.
There is also a tiny change to short-circuit the body of
write_midx_included_packs() when no packs remain in the case of an empty
repository. The MIDX code does not handle this, so avoid trying to
generate a MIDX covering zero packs in the first place.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We only ask whether stderr is a tty before calling
'prune_packed_objects()', but the subsequent patch will add another use.
Extract this check into a variable so that both can use it without
having to call 'isatty()' twice.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new variable `existing_kept_packs` (and corresponding parameter
`fname_kept_list`) added by the previous patch make it seem like
`existing_packs` and `fname_list` are each subsets of the other two
respectively.
In reality, each pair is disjoint: one stores the packs without .keep
files, and the other stores the packs with .keep files. Rename each to
more clearly reflect this.
Suggested-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to be able to write a multi-pack index during repacking, `git
repack` must keep track of which packs it wants to write into the MIDX.
This set is the union of existing packs which will not be deleted,
new pack(s) generated as a result of the repack, and .keep packs.
Prior to this patch, `git repack` populated the list of existing packs
only when repacking all-into-one (i.e., with `-A` or `-a`), but we will
soon need to know this list when repacking when writing a MIDX without
a-i-o.
Populate the list of existing packs unconditionally, and guard removing
packs from that list only when repacking a-i-o.
Additionally, keep track of filenames of kept packs separately, since
this, too, will be used in an upcoming patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To figure out which commits we can write a bitmap for, the multi-pack
index/bitmap code does a reachability traversal, marking any commit
which can be found in the MIDX as eligible to receive a bitmap.
This approach will cause a problem when multi-pack bitmaps are able to
be generated from `git repack`, since the reference tips can change
during the repack. Even though we ignore commits that don't exist in
the MIDX (when doing a scan of the ref tips), it's possible that a
commit in the MIDX reaches something that isn't.
This can happen when a multi-pack index contains some pack which refers
to loose objects (e.g., if a pack was pushed after starting the repack
but before generating the MIDX which depends on an object which is
stored as loose in the repository, and by definition isn't included in
the multi-pack index).
By taking a snapshot of the references before we start repacking, we can
close that race window. In the above scenario (where we have a packed
object pointing at a loose one), we'll either (a) take a snapshot of the
references before seeing the packed one, or (b) take it after, at which
point we can guarantee that the loose object will be packed and included
in the MIDX.
This patch does just that. It writes a temporary "reference snapshot",
which is a list of OIDs that are at the ref tips before writing a
multi-pack bitmap. References that are "preferred" (i.e,. are a suffix
of at least one value of the 'pack.preferBitmapTips' configuration) are
marked with a special '+'.
The format is simple: one line per commit at each tip, with an optional
'+' at the beginning (for preferred references, as described above).
When provided, the reference snapshot is used to drive bitmap selection
instead of the MIDX code doing its own traversal. When it isn't
provided, the usual traversal takes place instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To power a new `--write-midx` mode, `git repack` will want to write a
multi-pack index containing a certain set of packs in the repository.
This new option will be used by `git repack` to write a MIDX which
contains only the packs which will survive after the repack (that is, it
will exclude any packs which are about to be deleted).
This patch effectively exposes the function implemented in the previous
commit via the `git multi-pack-index` builtin. An alternative approach
would have been to call that function from the `git repack` builtin
directly, but this introduces awkward problems around closing and
reopening the object store, so the MIDX will be written out-of-process.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose a variant of the write_midx_file() function which ignores packs
that aren't included in an explicit "allow" list.
This will be used in an upcoming patch to power a new `--stdin-packs`
mode of `git multi-pack-index write` for callers that only want to
include certain packs in a MIDX (and ignore any packs which may have
happened to enter the repository independently, e.g., from pushes).
Those patches will provide test coverage for this new function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
6a8cbc41ba (developer: enable pedantic by default, 2021-09-03)
enables pedantic mode in as many compilers as possible to help gather
feedback on future tightening, so lets do so.
-Wpedantic is missing in some really old gcc 4 versions so lets restrict
it to gcc5 and clang4 (it does work in clang3 AFAIK, but it will be
unlikely that a developer will use such an old compiler anyway).
MinGW gcc is the only one which has -Wno-pedantic-ms-format, and while
that is available also in older compilers, the Windows SDK provides gcc10
so lets aim for that.
Note that in order to target the flag to only Windows, additional changes
were needed in config.mak.uname to propagate the OS detection which also
did some minor refactoring, but which is functionaly equivalent.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bad landmine of a bug which has been with us ever since
PARSE_OPT_SHELL_EVAL was added in 47e9cd28f8 (parseopt: wrap
rev-parse --parseopt usage for eval consumption, 2010-06-12).
It's an argument to parse_options() and should therefore be in "enum
parse_opt_flags", but it was added to the per-option "enum
parse_opt_option_flags" by mistake.
Therefore as soon as we'd have an enum member in the former that
reached its value of "1 << 8" we'd run into a seemingly bizarre bug
where that new option would turn on the unrelated PARSE_OPT_SHELL_EVAL
in "git rev-parse --parseopt" by proxy.
I manually checked that no other enum members suffered from such
overlap, by setting the values to non-overlapping values, and making
the relevant codepaths BUG() out if the given value was above/below
the expected (excluding flags=0 in the case of "enum
parse_opt_flags").
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The summary line had xy, while the description (and other sub-sections)
has XY.
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/ref-paranoia: (71 commits)
refs: drop "broken" flag from for_each_fullref_in()
ref-filter: drop broken-ref code entirely
ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN
repack, prune: drop GIT_REF_PARANOIA settings
refs: turn on GIT_REF_PARANOIA by default
refs: omit dangling symrefs when using GIT_REF_PARANOIA
refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag
refs-internal.h: reorganize DO_FOR_EACH_* flag documentation
refs-internal.h: move DO_FOR_EACH_* flags next to each other
t5312: be more assertive about command failure
t5312: test non-destructive repack
t5312: create bogus ref as necessary
t5312: drop "verbose" helper
t5600: provide detached HEAD for corruption failures
t5516: don't use HEAD ref for invalid ref-deletion tests
t7900: clean up some more broken refs
The eighth batch
t0000: avoid masking git exit value through pipes
tree-diff: fix leak when not HAVE_ALLOCA_H
pack-revindex.h: correct the time complexity descriptions
...
Remove the now-unused "incomplete" parameter from create_dir_entry(),
all its callers specify it as "1", so let's drop the "incomplete=0"
case. The last caller to use it was search_for_subdir(), but that code
was removed in the preceding commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "mkdir" parameter from the find_containing_dir() function,
the add_ref_entry() function removed in the preceding commit was its
last user.
Since "mkdir" is always "0" we can also remove the parameter from
search_for_subdir(), which in turn means that we can delete most of
that function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function has not been used since 9dd389f3d8 (packed_ref_store:
get rid of the `ref_cache` entirely, 2017-09-25).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function was missed in 9939b33d6a (packed-backend: rip out some
now-unused code, 2017-09-08), and has been orphaned since then. Let's
delete it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function was added in 3dce444f17 (refs: add a backend method
structure, 2016-09-04), but has never been used by anything. The only
caller that might care uses find_ref_storage_backend() directly.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git_config_key_is_valid() function got left behind in a
refactoring in a9bcf6586d (alias: use the early config machinery to
expand aliases, 2017-06-14),
It previously had two users when it was added in 9e9de18f1a (config:
silence warnings for command names with invalid keys, 2015-08-24), and
after 6a1e1bc0a1 (pager: use callbacks instead of configset,
2016-09-12) only one remained.
By removing it we can get rid of the "quiet" branches in this
function, as well as cases where "store_key" is NULL, for which there
are no other users.
Out of the 5 callers of git_config_parse_key() only one needs to pass
a non-NULL "size_t *baselen_", so we could remove the third parameter
from the public interface. I did not find that potential
simplification to be worthwhile.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove this function left over to accommodate in-flight changes, see
770fedaf9f (string-list.[ch]: add a string_list_init_{nodup,dup}(),
2021-07-01) for the recent change to add
"string_list_init_{nodup,dup}()" initializers.
There was only one user of the API left in remote-curl.c. I don't know
why I didn't include this change to remote-curl.c in
bc40dfb10a (string-list.h users: change to use *_{nodup,dup}(),
2021-07-01), perhaps I just missed it.
In any case, let's change that one user to use the new API, as of
writing this there are no in-flight changes that use, so this seems
like a good time to drop this before we get any new users of this
compatibility API.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanup to limit memory consumption and tighten protocol
message parsing.
* jk/reduce-malloc-in-v2-servers:
ls-refs: reject unknown arguments
serve: reject commands used as capabilities
serve: reject bogus v2 "command=ls-refs=foo"
docs/protocol-v2: clarify some ls-refs ref-prefix details
ls-refs: ignore very long ref-prefix counts
serve: drop "keys" strvec
serve: provide "receive" function for session-id capability
serve: provide "receive" function for object-format capability
serve: add "receive" method for v2 capabilities table
serve: return capability "value" from get_capability()
serve: rename is_command() to parse_command()
The previous changes modified the behavior of 'git add', 'git rm', and
'git mv' to not adjust paths outside the sparse-checkout cone, even if
they exist in the working tree and their cache entries lack the
SKIP_WORKTREE bit. The intention is to warn users that they are doing
something potentially dangerous. The '--sparse' option was added to each
command to allow careful users the same ability they had before.
To improve the discoverability of this new functionality, add a message
to advice.updateSparsePath that mentions the existence of the option.
The previous set of changes also modified the purpose of this message to
include possibly a list of paths instead of only a list of pathspecs.
Make the warning message more clear about this new behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since cmd_mv() does not operate on cache entries and instead directly
checks the filesystem, we can only use path_in_sparse_checkout() as a
mechanism for seeing if a path is sparse or not. Be sure to skip
returning a failure if '-k' is specified.
To ensure that the advice around sparse paths is the only reason a move
failed, be sure to check this as the very last thing before inserting
into the src_for_dst list.
The tests cover a variety of cases such as whether the target is tracked
or untracked, and whether the source or destination are in or outside of
the sparse-checkout definition.
Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a path does not match the sparse-checkout cone but is somehow missing
the SKIP_WORKTREE bit, then 'git rm' currently succeeds in removing the
file. One reason a user might be in this situation is a merge conflict
outside of the sparse-checkout cone. Removing such a file might be
problematic for users who are not sure what they are doing.
Add a check to path_in_sparse_checkout() when 'git rm' is checking if a
path should be considered for deletion. Of course, this check is ignored
if the '--sparse' option is specified, allowing users who accept the
risks to continue with the removal.
This also removes a confusing behavior where a user asks for a directory
to be removed, but only the entries that are within the sparse-checkout
definition are removed. Now, 'git rm <dir>' will fail without '--sparse'
and will succeed in removing all contained paths with '--sparse'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we did previously in 'git add', add a '--sparse' option to 'git rm'
that allows modifying paths outside of the sparse-checkout definition.
The existing checks in 'git rm' are restricted to tracked files that
have the SKIP_WORKTREE bit in the current index. Future changes will
cause 'git rm' to reject removing paths outside of the sparse-checkout
definition, even if they are untracked or do not have the SKIP_WORKTREE
bit.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We added checks for path_in_sparse_checkout() to portions of 'git add'
that add warnings and prevent stagins a modification, but we skipped the
--renormalize mode. Update renormalize_tracked_files() to ignore cache
entries whose path is outside of the sparse-checkout cone (unless
--sparse is provided). Add a test in t3705.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We added checks for path_in_sparse_checkout() to portions of 'git add'
that add warnings and prevent staging a modification, but we skipped the
--chmod mode. Update chmod_pathspec() to ignore cache entries whose path
is outside of the sparse-checkout cone (unless --sparse is provided).
Add a test in t3705.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We previously modified 'git add' to refuse updating index entries
outside of the sparse-checkout cone. This is justified to prevent users
from accidentally getting into a confusing state when Git removes those
files from the working tree at some later point.
Unfortunately, this caused some workflows that were previously possible
to become impossible, especially around merge conflicts outside of the
sparse-checkout cone. These were documented in tests within t1092.
We now re-enable these workflows using a new '--sparse' option to 'git
add'. This allows users to signal "Yes, I do know what I'm doing with
these files," and accept the consequences of the files leaving the
worktree later.
We delay updating the advice message until implementing a similar option
in 'git rm' and 'git mv'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git add' adds a tracked file that is outside of the
sparse-checkout cone, it checks the SKIP_WORKTREE bit to see if the file
exists outside of the sparse-checkout cone. This is usually correct,
except in the case of a merge conflict outside of the cone.
Modify add_pathspec_matched_against_index() to be more careful about
paths by checking the sparse-checkout patterns in addition to the
SKIP_WORKTREE bit. This causes 'git add' to no longer allow files
outside of the cone that removed the SKIP_WORKTREE bit due to a merge
conflict.
With only this change, users will only be able to add the file after
adding the file to the sparse-checkout cone. A later change will allow
users to force adding even though the file is outside of the
sparse-checkout cone.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The add_files() method in builtin/add.c takes a set of untracked files
that are being added by the input pathspec and inserts them into the
index. If these files are outside of the sparse-checkout cone, then they
gain the SKIP_WORKTREE bit at some point. However, this was not checked
before inserting into the index, so these files are added even though we
want to avoid modifying the index outside of the sparse-checkout cone.
Add a check within add_files() for these files and write the advice
about files outside of the sparse-checkout cone.
This behavior change modifies some existing tests within t1092. These
tests intended to document how a user could interact with the existing
behavior in place. Many of these tests need to be marked as expecting
failure. A future change will allow these tests to pass by adding a flag
to 'git add' that allows users to modify index entries outside of the
sparse-checkout cone.
The 'submodule handling' test is intended to document what happens to
directories that contain a submodule when the sparse index is enabled.
It is not trying to say that users should be able to add submodules
outside of the sparse-checkout cone, so that test can be modified to
avoid that operation.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Within match_pathname(), one successful matching category happens when
the pattern is equal to its non-wildcard prefix. At this point, we have
checked that the input 'pathname' matches the pattern up to the prefix
length, and then we subtraced that length from both 'patternlen' and
'namelen'.
In the case of a directory match, this prefix match should be
sufficient. However, the success condition only cared about _exact_
equality here. Instead, we should allow any path that agrees on this
prefix in the case of PATTERN_FLAG_MUSTBEDIR.
This case was not tested before because of the way unpack_trees() would
match a parent directory before visiting the contained paths. This
approach is changing, so we must change this comparison.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When matching a path against a list of patterns, the ones that require a
directory match previously did not work when a filename is specified.
This was fine when all pattern-matching was done within methods such as
unpack_trees() that check a directory before recursing into the
contained files. However, other commands will start matching individual
files against pattern lists without that recursive approach.
The last_matching_pattern_from_list() logic performs some checks on the
filetype of a path within the index when the PATTERN_FLAG_MUSTBEDIR flag
is set. This works great when setting SKIP_WORKTREE bits within
unpack_trees(), but doesn't work well when passing an arbitrary path
such as a file within a matching directory.
We extract the logic around determining the file type, but attempt to
avoid checking the filesystem if the parent directory already matches
the sparse-checkout patterns. The new path_matches_dir_pattern() method
includes a 'path_parent' parameter that is used to store the parent
directory of 'pathname' between multiple pattern matching tests. This is
loaded lazily, only on the first pattern it finds that has the
PATTERN_FLAG_MUSTBEDIR flag.
If we find that a path has a parent directory, we start by checking to
see if that parent directory matches the pattern. If so, then we do not
need to query the index for the type (which can be expensive). If we
find that the parent does not match, then we still must check the type
from the index for the given pathname.
Note that this does not affect cone mode pattern matching, but instead
the more general -- and slower -- full pattern set. Thus, this does not
affect the sparse index.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add some tests to demonstrate the current behavior around adding files
outside of the sparse-checkout cone. Currently, untracked files are
handled differently from tracked files. A future change will make these
cases be handled the same way.
Further expand checking that a failed 'git add' does not stage changes
to the index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b681b191 introduced the support of systemd timers for git
maintenance.
A test is leveraging the `systemd-analyze verify` utility to verify the
correctness of the systemd unit files generated by git.
But on some systems, although the `systemd-analyze` tool is installed
and supports the `verify` subcommand, it fails with some permission
errors.
So, instead of only checking if the `verify` subcommand exists, a more
reliable way of detecting whether `systemd-analyze verify` can be used
is to try to use it.
The SYSTEMD_ANALYZE prerequisite is now trying to run `systemd-analyze
verify` on a systemd unit file which is shipped by systemd itself.
We can reasonably think that, on systemd hosts, this file is present and
valid.
Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the commit_info_init() function addded in ea02ffa385 (mailmap:
simplify map_user() interface, 2013-01-05) and instead initialize the
"struct commit_info" with a macro.
This is the more idiomatic pattern in the codebase, and doesn't leave
us wondering when we see the *_init() function if this struct needs
more complex initialization than a macro can provide.
The get_commit_info() function is only called by the three callers
being changed here immediately after initializing the struct with the
macros, so by moving the initialization to the callers we don't need
to do it in get_commit_info() anymore.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the hostinfo_init() function added in 01cec54e13 (daemon:
deglobalize hostname information, 2015-03-07) and instead initialize
the "struct hostinfo" with a macro.
This is the more idiomatic pattern in the codebase, and doesn't leave
us wondering when we see the *_init() function if this struct needs
more complex initialization than a macro can provide.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the same pattern for cb_init() as the one established in the
recent refactoring of other such patterns in
5726a6b401 (*.c *_init(): define in terms of corresponding *_INIT
macro, 2021-07-01).
It has been pointed out[1] that we could perhaps use this C99
replacement of using a compound literal for all of these:
*t = (struct cb_tree){ 0 };
But let's just stick to the existing pattern established in
5726a6b401 for now, we can leave another weather balloon for some
other time.
1. http://lore.kernel.org/git/ef724a3a-a4b8-65d3-c928-13a7d78f189a@gmail.com
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move various *_INIT macros to use designated initializers. This helps
readability. I've only picked those leftover macros that were not
touched by another in-flight series of mine which changed others, but
also how initialization was done.
In the case of SUBMODULE_ALTERNATE_SETUP_INIT I've left an explicit
initialization of "error_mode", even though
SUBMODULE_ALTERNATE_ERROR_IGNORE itself is defined as "0". Let's not
peek under the hood and assume that enum fields we know the value of
will stay at "0".
The change to "TESTSUITE_INIT" in "t/helper/test-run-command.c" was
part of an earlier on-list version[1] of c90be786da (test-tool
run-command: fix flip-flop init pattern, 2021-09-11).
1. https://lore.kernel.org/git/patch-1.1-0aa4523ab6e-20210909T130849Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the initialization of "struct strbuf" changed in
cbc0f81d96 (strbuf: use designated initializers in STRBUF_INIT,
2017-07-10) to omit specifying "alloc" and "len", as we do with other
"alloc" and "len" (or "nr") in similar structs.
Let's likewise omit the explicit initialization of all fields in the
"struct ipc_client_connect_option" struct added in
59c7b88198 (simple-ipc: add win32 implementation, 2021-03-15).
Do the same for a few other initializers, e.g. STRVEC_INIT and
CACHE_DEF_INIT.
Finally, start incrementally changing the same pattern in
"t/helper/test-run-command.c". This change was part of an earlier
on-list version[1] of c90be786da (test-tool run-command: fix
flip-flop init pattern, 2021-09-11).
1. https://lore.kernel.org/git/patch-1.1-0aa4523ab6e-20210909T130849Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In C it isn't required to specify that all members of a struct are
zero'd out to 0, NULL or '\0', just providing a "{ 0 }" will
accomplish that.
Let's also change code that provided N zero'd fields to just
provide one, and change e.g. "{ NULL }" to "{ 0 }" for
consistency. I.e. even if the first member is a pointer let's use "0"
instead of "NULL". The point of using "0" consistently is to pick one,
and to not have the reader wonder why we're not using the same pattern
everywhere.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This macro was added and used in c68f837576 (implement fetching of
moved submodules, 2017-10-16) but its last user went away in
be76c21282 (fetch: ensure submodule objects fetched, 2018-12-06).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The same bug fixed in the "COMPUTE_HEADER_DEPENDENCIES=auto" mode in
the preceding commit was also present with
"GENERATE_COMPILATION_DATABASE=yes". Let's fix it so it works again
with "DEVOPTS=1".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some commands have traditionally also removed untracked files (or
directories) that were in the way of a tracked file we needed. Document
these cases.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the last few commits we focused on code in unpack-trees.c that
mistakenly removed untracked files or directories. There may be more of
those, but in this commit we change our focus: callers of toplevel
commands that are expected to remove untracked files or directories.
As noted previously, we have toplevel commands that are expected to
delete untracked files such as 'read-tree --reset', 'reset --hard', and
'checkout --force'. However, that does not mean that other highlevel
commands that happen to call these other commands thought about or
conveyed to users the possibility that untracked files could be removed.
Audit the code for such callsites, and add comments near existing
callsites to mention whether these are safe or not.
My auditing is somewhat incomplete, though; it skipped several cases:
* git-rebase--preserve-merges.sh: is in the process of being
deprecated/removed, so I won't leave a note that there are
likely more bugs in that script.
* contrib/git-new-workdir: why is the -f flag being used in a new
empty directory?? It shouldn't hurt, but it seems useless.
* git-p4.py: Don't see why -f is needed for a new dir (maybe it's
not and is just superfluous), but I'm not at all familiar with
the p4 stuff
* git-archimport.perl: Don't care; arch is long since dead
* git-cvs*.perl: Don't care; cvs is long since dead
Also, the reset --hard in builtin/worktree.c looks safe, due to only
running in an empty directory.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally, unpack_trees_options->reset was used to signal that it
was okay to delete any untracked files in the way. This was used by
`git read-tree --reset`, but then started appearing in other places as
well. However, many of the other uses should not be deleting untracked
files in the way. Change this value to an enum so that a value of 1
(i.e. "true") can be split into two:
UNPACK_RESET_PROTECT_UNTRACKED,
UNPACK_RESET_OVERWRITE_UNTRACKED
In order to catch accidental misuses (i.e. where folks call it the way
they traditionally used to), define the special enum value of
UNPACK_RESET_INVALID = 1
which will trigger a BUG().
Modify existing callers so that
read-tree --reset
reset --hard
checkout --force
continue using the UNPACK_RESET_OVERWRITE_UNTRACKED logic, while other
callers, including
am
checkout without --force
stash (though currently dead code; reset always had a value of 0)
numerous callers from rebase/sequencer to reset_head()
will use the new UNPACK_RESET_PROTECT_UNTRACKED value.
Also, note that it has been reported that 'git checkout <treeish>
<pathspec>' currently also allows overwriting untracked files[1]. That
case should also be fixed, but it does not use unpack_trees() and thus
is outside the scope of the current changes.
[1] https://lore.kernel.org/git/15dad590-087e-5a48-9238-5d2826950506@gmail.com/
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change several commands to remove ignored files by default when they are
in the way. Since some commands (checkout, merge) take a
--no-overwrite-ignore option to allow the user to configure this, and it
may make sense to add that option to more commands (and in the case of
merge, actually plumb that configuration option through to more of the
backends than just the fast-forwarding special case), add little
comments about where such flags would be used.
Incidentally, this fixes a test failure in t7112.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid accidental misuse or confusion over ownership by clearly making
unpack_trees_options.dir an internal-only variable.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, every caller of unpack_trees() that wants to ensure ignored
files are overwritten by default needs to:
* allocate unpack_trees_options.dir
* flip the DIR_SHOW_IGNORED flag in unpack_trees_options.dir->flags
* call setup_standard_excludes
AND then after the call to unpack_trees() needs to
* call dir_clear()
* deallocate unpack_trees_options.dir
That's a fair amount of boilerplate, and every caller uses identical
code. Make this easier by instead introducing a new boolean value where
the default value (0) does what we want so that new callers of
unpack_trees() automatically get the appropriate behavior. And move all
the handling of unpack_trees_options.dir into unpack_trees() itself.
While preserve_ignored = 0 is the behavior we feel is the appropriate
default, we defer fixing commands to use the appropriate default until a
later commit. So, this commit introduces several locations where we
manually set preserve_ignored=1. This makes it clear where code paths
were previously preserving ignored files when they should not have been;
a future commit will flip these to instead use a value of 0 to get the
behavior we want.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes a long-standing patchwork of ignored files handling in
read-tree and merge-recursive, called out and suggested by Junio long
ago. Quoting from commit dcf0c16ef1 ("core.excludesfile clean-up"
2007-11-16):
git-read-tree takes --exclude-per-directory=<gitignore>,
not because the flexibility was needed. Again, this was
because the option predates the standardization of the ignore
files.
...
On the other hand, I think it makes perfect sense to fix
git-read-tree, git-merge-recursive and git-clean to follow the
same rule as other commands. I do not think of a valid use case
to give an exclude-per-directory that is nonstandard to
read-tree command, outside a "negative" test in the t1004 test
script.
This patch is the first step to untangle this mess.
The next step would be to teach read-tree, merge-recursive and
clean (in C) to use setup_standard_excludes().
History shows each of these were partially or fully fixed:
* clean was taught the new trick in 1617adc7a0 ("Teach git clean to
use setup_standard_excludes()", 2007-11-14).
* read-tree was primarily used by checkout & merge scripts. checkout
and merge later became builtins and were both fixed to use the new
setup_standard_excludes() handling in fc001b526c ("checkout,merge:
loosen overwriting untracked file check based on info/exclude",
2011-11-27). So the primary users were fixed, though read-tree
itself was not.
* merge-recursive has now been replaced as the default merge backend
by merge-ort. merge-ort fixed this by using
setup_standard_excludes() starting early in its implementation; see
commit 6681ce5cf6 ("merge-ort: add implementation of checkout()",
2020-12-13), largely due to its design depending on checkout() and
thus being influenced by the checkout code. However,
merge-recursive itself was not fixed here, in part because its
design meant it had difficulty differentiating between untracked
files, ignored files, leftover tracked files that haven't been
removed yet due to order of processing files, and files written by
itself due to collisions).
Make the conversion more complete by now handling read-tree and
handling at least the unpack_trees() portion of merge-recursive. While
merge-recursive is on its way out, fixing the unpack_trees() portion is
easy and facilitates some of the later changes in this series. Note
that fixing read-tree makes the --exclude-per-directory option to
read-tree useless, so we remove it from the documentation (though we
continue to accept it if passed).
The read-tree changes happen to fix a bug in t1013.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gcc will helpfully raise a -Wcast-function-type warning when casting
between functions that might have incompatible return types
(ex: GetUserNameExW returns bool which is only half the size of the
return type from FARPROC which is long long), so create a new type that
could be used as a completely generic function pointer and cast through
it instead.
Additionaly remove the -Wno-incompatible-pointer-types temporary
flag added in 27e0c3c (win32: allow building with pedantic mode
enabled, 2021-09-03), as it will be no longer needed.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No callers pass in anything but "0" here. Likewise to our sibling
functions. Note that some of them ferry along the flag, but none of
their callers pass anything but "0" either.
Nor is anybody likely to change that. Callers which really want to see
all of the raw refs use for_each_rawref(). And anybody interested in
iterating a subset of the refs will likely be happy to use the
now-default behavior of showing broken refs, but omitting dangling
symlinks.
So we can get rid of this whole feature.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that none of our callers passes the INCLUDE_BROKEN flag, we can drop
it entirely, along with the code to plumb it through to the
for_each_fullref_in() functions.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Of the ref-filter callers, for-each-ref and git-branch both set the
INCLUDE_BROKEN flag (but git-tag does not, which is a weird
inconsistency). But now that GIT_REF_PARANOIA is on by default, that
produces almost the same outcome for all three.
The one exception is that GIT_REF_PARANOIA will omit dangling symrefs.
That's a better behavior for these tools, as they would never include
such a symref in the main output anyway (they can't, as it doesn't point
to an object). Instead they issue a warning to stderr. But that warning
is somewhat useless; a dangling symref is a perfectly reasonable thing
to have in your repository, and is not a sign of corruption. It's much
friendlier to just quietly ignore it.
And in terms of robustness, the warning gains us little. It does not
impact the exit code of either tool. So while the warning _might_ clue
in a user that they have an unexpected broken symref, it would not help
any kind of scripted use.
This patch converts for-each-ref and git-branch to stop using the
INCLUDE_BROKEN flag. That gives them more reasonable behavior, and
harmonizes them with git-tag.
We have to change one test to adapt to the situation. t1430 tries to
trigger all of the REF_ISBROKEN behaviors from the underlying ref code.
It uses for-each-ref to do so (because there isn't any other mechanism).
That will no longer issue a warning about the symref which points to an
invalid name, as it's considered dangling (and we can instead be sure
that it's _not_ mentioned on stderr). Note that we do still complain
about the illegally named "broken..symref"; its problem is not that it's
dangling, but the name of the symref itself is illegal.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that GIT_REF_PARANOIA is the default, we don't need to selectively
enable it for destructive operations. In fact, it's harmful to do so,
because it overrides any GIT_REF_PARANOIA=0 setting that the user may
have provided (because they're trying to work around some corruption).
With these uses gone, we can further clean up the ref_paranoia global,
and make it a static variable inside the refs code.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original point of the GIT_REF_PARANOIA flag was to include broken
refs in iterations, so that possibly-destructive operations would not
silently ignore them (and would generally instead try to operate on the
oids and fail when the objects could not be accessed).
We already turned this on by default for some dangerous operations, like
"repack -ad" (where missing a reachability tip would mean dropping the
associated history). But it was not on for general use, even though it
could easily result in the spreading of corruption (e.g., imagine
cloning a repository which simply omits some of its refs because
their objects are missing; the result quietly succeeds even though you
did not clone everything!).
This patch turns on GIT_REF_PARANOIA by default. So a clone as mentioned
above would actually fail (upload-pack tells us about the broken ref,
and when we ask for the objects, pack-objects fails to deliver them).
This may be inconvenient when working with a corrupted repository, but:
- we are better off to err on the side of complaining about
corruption, and then provide mechanisms for explicitly loosening
safety.
- this is only one type of corruption anyway. If we are missing any
other objects in the history that _aren't_ ref tips, then we'd
behave similarly (happily show the ref, but then barf when we
started traversing).
We retain the GIT_REF_PARANOIA variable, but simply default it to "1"
instead of "0". That gives the user an escape hatch for loosening this
when working with a corrupt repository. It won't work across a remote
connection to upload-pack (because we can't necessarily set environment
variables on the remote), but there the client has other options (e.g.,
choosing which refs to fetch).
As a bonus, this also makes ref iteration faster in general (because we
don't have to call has_object_file() for each ref), though probably not
noticeably so in the general case. In a repo with a million refs, it
shaved a few hundred milliseconds off of upload-pack's advertisement;
that's noticeable, but most repos are not nearly that large.
The possible downside here is that any operation which iterates refs but
doesn't ever open their objects may now quietly claim to have X when the
object is corrupted (e.g., "git rev-list new-branch --not --all" will
treat a broken ref as uninteresting). But again, that's not really any
different than corruption below the ref level. We might have
refs/heads/old-branch as non-corrupt, but we are not actively checking
that we have the entire reachable history. Or the pointed-to object
could even be corrupted on-disk (but our "do we have it" check would
still succeed). In that sense, this is merely bringing ref-corruption in
line with general object corruption.
One alternative implementation would be to actually check for broken
refs, and then _immediately die_ if we see any. That would cause the
"rev-list --not --all" case above to abort immediately. But in many ways
that's the worst of all worlds:
- it still spends time looking up the objects an extra time
- it still doesn't catch corruption below the ref level
- it's even more inconvenient; with the current implementation of
GIT_REF_PARANOIA for something like upload-pack, we can make
the advertisement and let the client choose a non-broken piece of
history. If we bail as soon as we see a broken ref, they cannot even
see the advertisement.
The test changes here show some of the fallout. A non-destructive "git
repack -adk" now fails by default (but we can override it). Deleting a
broken ref now actually tells the hooks the correct "before" state,
rather than a confusing null oid.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dangling symrefs aren't actually a corruption problem. It's perfectly
fine for refs/remotes/origin/HEAD to point to an unborn branch. And in
particular, if you are trying to establish reachability, a symref that
points nowhere doesn't matter either way. Any ref it could point to will
be examined during the rest of the traversal.
It's possible that a symref pointing nowhere _could_ be a sign that the
ref it was meant to point to was deleted accidentally (e.g., via
corruption). But there is no particular reason to think that is true for
any given case, and in the meantime, GIT_REF_PARANOIA kicking in
automatically for some operations means they'll fail unnecessarily.
So let's loosen it just a bit. The new test in t5312 shows off an
example that is safe, but currently fails (and no longer does after this
patch).
Note that we don't do anything if the caller explicitly asked for
DO_FOR_EACH_INCLUDE_BROKEN. In that case they may be looking for
dangling symrefs themselves, and setting GIT_REF_PARANOIA should not
_loosen_ things from what the caller asked for.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the DO_FOR_EACH_INCLUDE_BROKEN flag is used, we include both actual
corrupt refs (illegal names, missing objects), but also symrefs that
point to nothing. This latter is not really a corruption, but just
something that may happen normally. For example, the symref at
refs/remotes/origin/HEAD may point to a tracking branch which is later
deleted. (The local HEAD may also be unborn, of course, but we do not
access it through ref iteration).
Most callers of for_each_ref() etc, do not care. They don't pass
INCLUDE_BROKEN, so don't see it at all. But for those which do pass it,
this somewhat-normal state causes extra warnings (e.g., from
for-each-ref) or even aborts operations (destructive repacks with
GIT_REF_PARANOIA set).
This patch just introduces the flag and the mechanism; there are no
callers yet (and hence no tests). Two things to note on the
implementation:
- we actually skip any symref that does not resolve to a ref. This
includes ones which point to an invalidly-named ref. You could argue
this is a more serious breakage than simple dangling. But the
overall effect is the same (we could not follow the symref), as well
as the impact on things like REF_PARANOIA (either way, a symref we
can't follow won't impact reachability, because we'll see the ref
itself during iteration). The underlying resolution function doesn't
distinguish these two cases (they both get REF_ISBROKEN).
- we change the iterator in refs/files-backend.c where we check
INCLUDE_BROKEN. There's a matching spot in refs/packed-backend.c,
but we don't know need to do anything there. The packed backend does
not support symrefs at all.
The resulting set of flags might be a bit easier to follow if we broke
this down into "INCLUDE_CORRUPT_REFS" and "INCLUDE_DANGLING_SYMREFS".
But there are a few reasons not do so:
- adding a new OMIT_DANGLING_SYMREFS flag lets us leave existing
callers intact, without changing their behavior (and some of them
really do want to see the dangling symrefs; e.g., t5505 has a test
which expects us to report when a symref becomes dangling)
- they're not actually independent. You cannot say "include dangling
symrefs" without also including refs whose objects are not
reachable, because dangling symrefs by definition do not have an
object. We could tweak the implementation to distinguish this, but
in practice nobody wants to ask for that. Adding the OMIT flag keeps
the implementation simple and makes sure we don't regress the
current behavior.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for the DO_FOR_EACH_* flags is sprinkled over the
refs-internal.h file. We define the two flags in one spot, and then
describe them in more detail far away from there, in the definitions of
refs_ref_iterator_begin() and ref_iterator_advance_fn().
Let's try to organize this a bit better:
- convert the #defines to an enum. This makes it clear that they are
related, and that the enum shows the complete set of flags.
- combine all descriptions for each flag in a single spot, next to the
flag's definition
- use the enum rather than a bare int for functions which take the
flags. This helps readers realize which flags can be used.
- clarify the mention of flags for ref_iterator_advance_fn(). It does
not take flags itself, but is meant to depend on ones set up
earlier.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are currently two DO_FOR_EACH_* flags, which must not have their
bits overlap. Yet they're defined hundreds of lines apart. Let's move
them next to each other to make it clear that they are related and are a
complete set (which matters if you are adding a new flag and would like
to know what the next available bit is).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When repacking or pruning in a corrupted repository, our tests in t5312
argue that it is OK to complete the operation or bail, as long as we
don't actually delete the objects pointed to by the corruption.
This isn't a wrong line of reasoning, but the tests are a bit permissive
by using test_might_fail. The fact is that we _do_ bail currently, and
if we ever stopped doing so, that would be worthy of a human
investigating. So let's switch these to test_must_fail.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t5312, we create a state with a broken ref, and then make sure that
destructive repacks don't silently ignore the breakage (where a
destructive repack is one that might drop objects). But we don't check
the behavior of non-destructive repacks at all (i.e., ones where we'd
keep unreachable objects).
So let's add a test to confirm the current behavior, which is that
they are allowed (i.e., ignoring the breakage and considering any
objects it points to as unreachable). This may change in the future, but
we'd like for the test suite to alert us to that fact.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests in t5312 create an illegally-named ref, and then see how
various operations handle it. But between those operations, we also do
some more setup (e.g., repacking), and we are subtly depending on how
those setup steps react to the illegal ref.
To future-proof us against those behaviors changing, let's instead
create and clean up our bogus ref on demand in the tests that need it.
This has two small extra advantages:
- the tests are more stand-alone; we do not need an extra test to clean
up the ref before moving on to other parts of the script
- the creation and cleanup is together in one helper function. Because
these depend on touching the refs in the filesystem directly, they
may need to be tweaked for a world with alternate backends (they have
not been noticed so far in the reftable work because with a non-file
backend the tests don't fail; they simply become uninteresting noops
because the broken ref isn't read at all).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t5312 has several uses of the "verbose" helper, as described in
8ad1652418 (t5304: use helper to report failure of "test foo = bar",
2014-10-10). Back then the "-x" trace option for tests was new, and was
not as pleasant to use (e.g., some tests failed under "-x", we did not
support BASH_XTRACEFD, etc).
These days it is clear that "-x" is the preferred way to get extra
output, and we don't need to mark up individual tests. Let's get rid of
the uses of "verbose" here, as one step toward eradicating it totally.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When checking how git-clone behaves when it fails, we stimulate some
failures by trying to do a clone from a local repository whose objects
have been removed. Because these clones use local optimizations, there's
a subtle dependency in how the corruption is handled on the sending
side.
If upload-pack does not show us the broken refs (which it does not
currently), then we see only HEAD (which is itself broken), and clone
that as a detached HEAD. When we try to write the ref, we notice that we
never got the object and bail.
But if upload-pack _does_ show us the broken refs (which it may in a
future patch), then we'll realize that HEAD is a symref and just write
that. You'd think we'd fail when writing out the refs themselves, but we
don't; we do a bulk write and skip the connectivity check because of our
--local optimizations. For the non-bare case, we do notice the problem
when we try to checkout. But for a bare repository, we unexpectedly
complete the clone successfully!
At first glance this may seem like a bug. But the whole point of those
local optimizations is to give up some safety for speed. If you want to
be careful, you should be using "--no-local", which would notice that
the pack did not transfer sufficient objects. We could do that in these
tests, but part of the point is for them to fail at specific moments
(and indeed, we have a later test that checks for transport failure).
However, we can make this less subtle and future-proof it against
changes on the upload-pack side by just having an explicit detached
HEAD in the corrupted repo. Now we'll fail as expected during the ref
write if any ref _or_ HEAD is corrupt, whether we're --bare or not.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few tests in t5516 want to assert that we can delete a corrupted ref
whose pointed-to object is missing. They do so by using the "main"
branch, which is also pointed to by HEAD.
This does work, but only because of a subtle assumption about the
implementation. We do not block the deletion because of the invalid ref,
but we _also_ do not notice that the deleted branch is pointed to by
HEAD. And so the safety rule of "do not allow HEAD to be deleted in a
non-bare repository" does not kick in, and the test passes.
Let's instead use a non-HEAD branch. That still tests what we care about
here (deleting a corrupt ref), but without implicitly depending on our
failure to notice that we're deleting HEAD. That will future proof the
test against that behavior changing.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "incremental-repack task" test replaces the object directory with a
known state. As a result, some of our refs point to objects that are not
included in that state.
Commit 3cf5f221be (t7900: clean up some broken refs, 2021-01-19) cleaned
up some of those (that were causing warnings to stderr from the
maintenance process). But there are a few more that were missed. These
aren't hurting anything for now, but it's certainly an unexpected state
to leave the test repository in, and it will become a problem if repack
ever gets more picky about broken refs.
Let's clean up those additional refs (which are all in refs/remotes,
with nothing there that isn't broken), and add an extra "for-each-ref"
call to assert that we've got everything.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the error shown when a http.pinnedPubKey doesn't match to point
the http.pinnedPubKey variable added in aeff8a6121 (http: implement
public key pinning, 2016-02-15), e.g.:
git -c http.pinnedPubKey=sha256/someNonMatchingKey ls-remote https://github.com/git/git.git
fatal: unable to access 'https://github.com/git/git.git/' with http.pinnedPubkey configuration: SSL: public key does not match pinned public key!
Before this we'd emit the exact same thing without the " with
http.pinnedPubkey configuration". The advantage of doing this is that
we're going to get a translated message (everything after the ":" is
hardcoded in English in libcurl), and we've got a reference to the
git-specific configuration variable that's causing the error.
Unfortunately we can't test this easily, as there are no tests that
require https:// in the test suite, and t/lib-httpd.sh doesn't know
how to set up such tests. See [1] for the start of a discussion about
what it would take to have divergent "t/lib-httpd/apache.conf" test
setups. #leftoverbits
1. https://lore.kernel.org/git/YUonS1uoZlZEt+Yd@coredump.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_feature_value() takes an offset, and uses it to seek past the
point in features_list that we've already seen. However if the feature
being searched for does not specify a value, the offset is not
updated. Therefore if we call parse_feature_value() in a loop on a
value-less feature, we'll keep on parsing the same feature over and over
again. This usually isn't an issue: there's no point in using
next_server_feature_value() to search for repeated instances of the same
capability unless that capability typically specifies a value - but a
broken server could send a response that omits the value for a feature
even when we are expecting a value.
Therefore we add an offset update calculation for the no-value case,
which helps ensure that loops using next_server_feature_value() will
always terminate.
next_server_feature_value(), and the offset calculation, were first
added in 2.28 in 2c6a403d96 (connect: add function to parse multiple
v1 capability values, 2020-05-25).
Thanks to Peff for authoring the test.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make githooks(5) the source of truth for what hooks git supports, and
punt out early on hooks we don't know about in find_hook(). This
ensures that the documentation and the C code's idea about existing
hooks doesn't diverge.
We still have Perl and Python code running its own hooks, but that'll
be addressed by Emily Shaffer's upcoming "git hook run" command.
This resolves a long-standing TODO item in bugreport.c of there being
no centralized listing of hooks, and fixes a bug with the bugreport
listing only knowing about 1/4 of the p4 hooks. It didn't know about
the recent "reference-transaction" hook either.
We could make the find_hook() function die() or BUG() out if the new
known_hook() returned 0, but let's make it return NULL just as it does
when it can't find a hook of a known type. Making it die() is overly
anal, and unlikely to be what we need in catching stupid typos in the
name of some new hook hardcoded in git.git's sources. By making this
be tolerant of unknown hook names, changes in a later series to make
"git hook run" run arbitrary user-configured hook names will be easier
to implement.
I have not been able to directly test the CMake change being made
here. Since 4c2c38e800 (ci: modification of main.yml to use cmake for
vs-build job, 2020-06-26) some of the Windows CI has a hard dependency
on CMake, this change works there, and is to my eyes an obviously
correct use of a pattern established in previous CMake changes,
namely:
- 061c2240b1 (Introduce CMake support for configuring Git,
2020-06-12)
- 709df95b78 (help: move list_config_help to builtin/help,
2020-04-16)
- 976aaedca0 (msvc: add a Makefile target to pre-generate the Visual
Studio solution, 2019-07-29)
The LC_ALL=C is needed because at least in my locale the dash ("-") is
ignored for the purposes of sorting, which results in a different
order. I'm not aware of anything in git that has a hard dependency on
the order, but e.g. the bugreport output would end up using whatever
locale was in effect when git was compiled.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the new hook_exists() function instead of find_hook() where the
latter was called in boolean contexts. This make subsequent changes in
a series where we further refactor the hook API clearer, as we won't
conflate wanting to get the path of the hook with checking for its
existence.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a boolean version of the find_hook() function for those callers
who are only interested in checking whether the hook exists, not what
the path to it is.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the find_hook() function from run-command.c to a new hook.c
library. This change establishes a stub library that's pretty
pointless right now, but will see much wider use with Emily Shaffer's
upcoming "configuration-based hooks" series.
Eventually all the hook related code will live in hook.[ch]. Let's
start that process by moving the simple find_hook() function over
as-is.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Here, GCC warns about every use of the INIT_PROC_ADDR macro, for example:
In file included from compat/mingw.c:8:
compat/mingw.c: In function 'mingw_strftime':
compat/win32/lazyload.h:38:12: warning: assignment to
'size_t (*)(char *, size_t, const char *, const struct tm *)'
{aka 'long long unsigned int (*)(char *, long long unsigned int,
const char *, const struct tm *)'} from incompatible pointer type
'FARPROC' {aka 'long long int (*)()'} [-Wincompatible-pointer-types]
38 | (function = get_proc_addr(&proc_addr_##function))
| ^
compat/mingw.c:1014:6: note: in expansion of macro 'INIT_PROC_ADDR'
1014 | if (INIT_PROC_ADDR(strftime))
| ^~~~~~~~~~~~~~
(message wrapped for convenience). Insert a cast to keep the compiler
happy. A cast is fine in these cases because they are generic function
pointer values that have been looked up in a DLL.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests in t3705-add-sparse-checkout.sh check to see how 'git add'
behaves with paths outside the sparse-checkout definition. These
currently check to see if a given warning is present but not that the
index is not updated with the sparse entries. Add a new
'test_sparse_entry_unstaged' helper to be sure 'git add' is behaving
correctly.
We need to modify setup_sparse_entry to actually commit the sparse_entry
file so it exists at HEAD and as an entry in the index, but its exact
contents are not staged in the index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Noting that unpack_trees treats reset=1 & update=1 as license to nuke
untracked files, I looked for code paths that use this combination and
tried to generate testcases which demonstrated unintentional loss of
untracked files and directories. I found several.
I also include testcases for `git reset --{hard,merge,keep}`. A hard
reset is perhaps the most direct test of unpack_tree's reset=1 behavior,
but we cannot make `git reset --hard` preserve untracked files without
some migration work.
Also, the two commands `checkout --force` (because of the --force) and
`read-tree --reset` (because it's plumbing and we need to keep it
backward compatible) were left out as we expect those to continue
removing untracked files and directories.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unusable entries of a damaged pack file are recorded in the oidset
bad_objects. Release it when we're done with the pack.
This doesn't affect intact packs because an empty oidset requires
no allocation.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
54fd3243da ("rebase -i: reread the todo list if `exec` touched it",
2017-04-26) sought to reread the todo list after running an exec
command only if it had been changed. To accomplish this it checks the
stat data of the todo list after running an exec command to see if it
has changed. Unfortunately there are two problems, firstly the
implementation is buggy we actually reread the list after each exec
which is quadratic in the number of commit lookups and secondly the
design is predicated on using nanosecond time stamps which are not the
default.
The implementation bug stems from the fact that we write a new todo
list to disk before running each command but do not update the stat
data to reflect this[1].
The design problem is that it is possible for the user to edit the
todo list without changing its size or inode which means we have to
rely on the mtime to tell us if it has changed. Unfortunately unless
git is built with USE_NSEC it is possible for the original and edited
list to share the same mtime.
Ideally "git rebase --edit-todo" would set a flag that we would then
check in sequencer.c. Unfortunately this is approach will not work as
there are scripts in the wild that write to the todo list directly
without running "git rebase --edit-todo". Instead of relying on stat
data this patch simply reads the possibly edited todo list and
compares it to the original with memcmp(). This is much faster than
reparsing the todo list each time. This patch reduces the time to run
git rebase -r -xtrue v2.32.0~100 v2.32.0
which runs 419 exec commands by 6.6%. For comparison fixing the
implementation bug in stat based approach reduces the time by a
further 1.4% and is indistinguishable from never rereading the todo
list.
[1] https://lore.kernel.org/git/20191125131833.GD23183@szeder.dev/
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code is heavily indented and obscures the high level logic within
the loop. Let's move it to its own function before modifying it in the
next commit. Note that there is a subtle change in behavior if the
todo list cannot be reread. Previously todo_list->current was
incremented before returning, now it returns immediately.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
MIDX files are used by default since commit 18e449f86b
(midx: enable core.multiPackIndex by default, 2020-09-25)
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This comment added in dfea575017 (Makefile: lazily compute header
dependencies, 2010-01-26) has been out of date since
92b88eba9f (Makefile: use `git ls-files` to list header files, if
possible, 2019-03-04), when we did exactly what it tells us not to do
and added $(GENERATED_H) to $(OBJECTS) dependencies.
The rest of it was also somewhere between inaccurate and outdated,
since as of b8ba629264 (Makefile: fold MISC_H into LIB_H, 2012-06-20)
it's not followed by a list of header files, that got moved earlier in
the file into LIB_H in 60d24dd255 (Makefile: fold XDIFF_H and VCSSVN_H
into LIB_H, 2012-07-06).
Let's just remove it entirely, to the extent that we have anything
useful to say here the comment on the
"USE_COMPUTED_HEADER_DEPENDENCIES" variable a few lines above this
change does the job for us.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "cmd.sh > $@+ && mv $@+ $@" pattern used for generating the
config-list.h and command-list.h to just "cmd.sh >$@". This was needed
as a guard to ensure that we don't have an empty file if the script
failed, but since 7b76d6bf22 (Makefile: add and use the
".DELETE_ON_ERROR" flag, 2021-06-29) GNU make ensures that doesn't
happen.
There's still a lot of other places in the Makefile where we
needlessly use this pattern, but I'm just changing these because I'm
about to add a new $(GENERATED_H) target, let's have them all look and
act the same way.
Even with ".DELETE_ON_ERROR" there is still a point to using the "mv
$@+ $@" pattern in some cases, e.g. to ensure that you have a working
binary during recompilation (see [1] for the start of a long
discussion about that), but that doesn't apply here. Nothing external
uses $(GENERATED_H) directly, it's only ever used in the context of
the Makefile's own dependency (re-)generation.
1. https://lore.kernel.org/git/8735t93h0u.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change various places that hardcode the names of these two files to
refer to either $(GENERATED_H), or to a new generated-hdrs
target. That target is consistent with the *-objs targets I recently
added in 029bac01a8 (Makefile: add {program,xdiff,test,git,fuzz}-objs
& objects targets, 2021-02-23).
A subsequent commit will add a new generated hook-list.h. By doing
this refactoring we'll only need to add the new file to the
GENERATED_H variable, not EXCEPT_HDRS, the vcbuild/README etc.
Hardcoding command-list.h there seems to have been a case of
copy/paste programming in 976aaedca0 (msvc: add a Makefile target to
pre-generate the Visual Studio solution, 2019-07-29). The
config-list.h was added later in 709df95b78 (help: move
list_config_help to builtin/help, 2020-04-16).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug in 44c9e8594e (Fix up header file dependencies and add
sparse checking rules, 2005-07-03), we never marked the phony "check"
target as such.
Perhaps we should just remove it, since as of a combination of
912f9980d2 (Makefile: help people who run 'make check' by mistake,
2008-11-11) 0bcd9ae85d (sparse: Fix errors due to missing
target-specific variables, 2011-04-21) we've been suggesting the user
run "make sparse" directly.
But under that mode it still does something, as well as directing the
user to run "make test" under non-sparse. So let's punt that and
narrowly fix the PHONY bug.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 73c3253d75 (bundle: framework for options before bundle file,
2019-11-10) the "git bundle" command was refactored to use
parse_options(). In that refactoring it started understanding the
"--verbose" flag before the subcommand, e.g.:
git bundle --verbose verify --quiet
However, nothing ever did anything with this "verbose" variable, and
the change wasn't documented. It appears to have been something that
escaped the lab, and wasn't flagged by reviewers at the time. Let's
just remove it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error in "git help no-such-git-command" is handled better.
* ma/help-w-check-for-requested-page:
help: make sure local html page exists before calling external processes
Adjust credential-cache helper to Windows.
* cb/unix-sockets-with-windows:
git-compat-util: include declaration for unix sockets in windows
credential-cache: check for windows specific errors
t0301: fixes for windows compatibility
An oddball OPTION_ARGUMENT feature has been removed from the
parse-options API.
* ab/retire-option-argument:
parse-options API: remove OPTION_ARGUMENT feature
difftool: use run_command() API in run_file_diff()
difftool: prepare "diff" cmdline in cmd_difftool()
difftool: prepare "struct child_process" in cmd_difftool()
Rewrite of "git bisect" in C continues.
* mr/bisect-in-c-4:
bisect--helper: retire `--bisect-next-check` subcommand
bisect--helper: reimplement `bisect_run` shell function in C
bisect--helper: reimplement `bisect_visualize()` shell function in C
run-command: make `exists_in_PATH()` non-static
t6030-bisect-porcelain: add test for bisect visualize
t6030-bisect-porcelain: add tests to control bisect run exit cases
Conditional compilation around versions of libcURL has been
straightened out.
* ab/http-drop-old-curl-plus:
http: don't hardcode the value of CURL_SOCKOPT_OK
http: centralize the accounting of libcurl dependencies
http: correct curl version check for CURLOPT_PINNEDPUBLICKEY
http: correct version check for CURL_HTTP_VERSION_2
http: drop support for curl < 7.18.0 (again)
Makefile: drop support for curl < 7.9.8 (again)
INSTALL: mention that we need libcurl 7.19.4 or newer to build
INSTALL: reword and copy-edit the "libcurl" section
INSTALL: don't mention the "curl" executable at all
Taking advantage of the CGI interface, http-backend has been
updated to enable protocol v2 automatically when the other side
asks for it.
* jk/http-server-protocol-versions:
docs/protocol-v2: point readers transport config discussion
docs/git: discuss server-side config for GIT_PROTOCOL
docs/http-backend: mention v2 protocol
http-backend: handle HTTP_GIT_PROTOCOL CGI variable
t5551: test v2-to-v0 http protocol fallback
Replace a handcrafted data structure used to keep track of bad
objects in the packfile API by an oidset.
* rs/packfile-bad-object-list-in-oidset:
packfile: use oidset for bad objects
packfile: convert has_packed_and_bad() to object_id
packfile: convert mark_bad_packed_object() to object_id
midx: inline nth_midxed_pack_entry()
oidset: make oidset_size() an inline function
When "git am --abort" fails to abort correctly, it still exited
with exit status of 0, which has been corrected.
* en/am-abort-fix:
am: fix incorrect exit status on am fail to abort
t4151: add a few am --abort tests
git-am.txt: clarify --abort behavior
"git update-ref --stdin" failed to flush its output as needed,
which potentially led the conversation to a deadlock.
* ps/update-ref-batch-flush:
t1400: avoid SIGPIPE race condition on fifo
update-ref: fix streaming of status updates
While git can be compiled with SANITIZE=leak, we have not run
regression tests under that mode. Memory leaks have only been fixed as
one-offs without structured regression testing.
This change adds CI testing for it. We'll now build and small set of
whitelisted t00*.sh tests under Linux with a new job called
"linux-leaks".
The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
mode. When running in that mode, we'll assert that we were compiled
with SANITIZE=leak. We'll then skip all tests, except those that we've
opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".
A test setting "TEST_PASSES_SANITIZE_LEAK=true" setting can in turn
make use of the "SANITIZE_LEAK" prerequisite, should they wish to
selectively skip tests even under
"GIT_TEST_PASSING_SANITIZE_LEAK=true". In the preceding commit we
started doing this in "t0004-unwritable.sh" under SANITIZE=leak, now
it'll combine nicely with "GIT_TEST_PASSING_SANITIZE_LEAK=true".
This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will
be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:
$ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set
The intent is to add more TEST_PASSES_SANITIZE_LEAK=true annotations
as follow-up change, but let's start small to begin with.
In ci/run-build-and-tests.sh we make use of the default "*" case to
run "make test" without any GIT_TEST_* modes. SANITIZE=leak is known
to fail in combination with GIT_TEST_SPLIT_INDEX=true in
t0016-oidmap.sh, and we're likely to have other such failures in
various GIT_TEST_* modes. Let's focus on getting the base tests
passing, we can expand coverage to GIT_TEST_* modes later.
It would also be possible to implement a more lightweight version of
this by only relying on setting "LSAN_OPTIONS". See
<YS9OT/pn5rRK9cGB@coredump.intra.peff.net>[1] and
<YS9ZIDpANfsh7N+S@coredump.intra.peff.net>[2] for a discussion of
that. I've opted for this approach of adding a GIT_TEST_* mode instead
because it's consistent with how we handle other special test modes.
Being able to add a "!SANITIZE_LEAK" prerequisite and calling
"test_done" early if it isn't satisfied also means that we can more
incrementally add regression tests without being forced to fix
widespread and hard-to-fix leaks at the same time.
We have tests that do simple checking of some tool we're interested
in, but later on in the script might be stressing trace2, or common
sources of leaks like "git log" in combination with the tool (e.g. the
commit-graph tests). To be clear having a prerequisite could also be
accomplished by using "LSAN_OPTIONS" directly.
On the topic of "LSAN_OPTIONS": It would be nice to have a mode to
aggregate all failures in our various scripts, see [2] for a start at
doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
that for now, it can be added later.
As of writing this we've got major regressions between master..seen,
i.e. the t000*.sh tests and more fixed since 31f9acf9ce (Merge branch
'ah/plugleaks', 2021-08-04) have regressed recently.
See the discussion at <87czsv2idy.fsf@evledraar.gmail.com>[3] about
the lack of this sort of test mode, and 0e5bba53af (add UNLEAK
annotation for reducing leak false positives, 2017-09-08) for the
initial addition of SANITIZE=leak.
See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
past history of "one-off" SANITIZE=leak (and more) fixes.
As noted in [5] we can't support this on OSX yet until Clang 14 is
released, at that point we'll probably want to resurrect that
"osx-leaks" job.
1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
2. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/
3. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
4. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/
5. https://lore.kernel.org/git/20210916035603.76369-1-carenas@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When SANITIZE=leak is specified we'll now add a SANITIZE_LEAK flag to
GIT-BUILD-OPTIONS, this can then be picked up by the test-lib.sh,
which sets a SANITIZE_LEAK prerequisite.
We can then skip specific tests that are known to fail under
SANITIZE=leak, add one such annotation to t0004-unwritable.sh, which
now passes under SANITIZE=leak.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The difftool dir-diff mode handles symlinks by replacing them with their
readlink(2) values. This allows diff tools to see changes to symlinks
as if they were regular text diffs with the old and new path values.
This is analogous to what "git diff" displays when symlinks change.
The temporary diff directories that are created initially contain
symlinks because they get checked-out using a temporary index that
retains the original symlinks as checked-in to the repository.
A bug was introduced when difftool was rewritten in C that made
difftool write the readlink(2) contents into the pointed-to file rather
than the symlink itself. The write was going through the symlink and
writing to its target rather than writing to the symlink path itself.
Replace symlinks with raw text files by unlinking the symlink path
before writing the readlink(2) content into them.
When 18ec800512 (difftool: handle modified symlinks in dir-diff mode,
2017-03-15) added handling for modified symlinks this bug got recorded
in the test suite. The tests included the pointed-to symlink target
paths. These paths were being reported because difftool was erroneously
writing to them, but they should have never been reported nor written.
Correct the modified-symlinks test cases by removing the target files
from the expected output.
Add a test to ensure that symlinks are written with the readlink(2)
values and that the target files contain their original content.
Reported-by: Alan Blotz <work@blotz.org>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a git_config() call was added in dbfae68969 (help: reuse
print_columns() for help -a, 2012-04-13) to read the column config
we'd always use the resulting "colopts" variable.
Then in 63eae83f8f (help: add "-a --verbose" to list all commands
with synopsis, 2018-05-20) we started only using the "colopts" config
under "--all" if "--no-verbose" was also given, but the "git_config()"
call was not moved inside the "verbose" branch of the code.
This change effectively does that, we'll only call list_commands()
under "--all --no-verbose", so let's have it look up the config it
needs. See 26c7d06783 (help -a: improve and make --verbose default, 2018-09-29) for another case in help.c where we look up config.
The get_colopts() function is named for consistency with the existing
get_alias() function added in 26c7d06783.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "help" builtin has been able to emit configuration variables since
e17ca92637 (completion: drop the hard coded list of config vars,
2018-05-26), but it hasn't produced exactly the format the completion
script wanted. Let's do that.
We got partway there in 2675ea1cc0 (completion: use 'sort -u' to
deduplicate config variable names, 2019-08-13) and
d9438873c4 (completion: deduplicate configuration sections,
2019-08-13), but after both we still needed some sorting,
de-duplicating and awk post-processing of the list.
We can instead simply do the relevant parsing ourselves (we were doing
most of it already), and call string_list_remove_duplicates() after
already sorting the list, so the caller doesn't need to invoke "sort
-u". The "--config-for-completion" output is the same as before after
being passed through "sort -u".
Then add a new "--config-sections-for-completion" option. Under that
output we'll emit config sections like "alias" (instead of "alias." in
the --config-for-completion output).
We need to be careful to leave the "--config-for-completion" option
compatible with users git, but are still running a shell with an older
git-completion.bash. If we e.g. changed the option name they'd see
messages about git-completion.bash being unable to find the
"--config-for-completion" option.
Such backwards compatibility isn't something we should bend over
backwards for, it's only helping users who:
* Upgrade git
* Are in an old shell
* The git-completion.bash in that shell hasn't cached the old
"--config-for-completion" output already.
But since it's easy in this case to retain compatibility, let's do it,
the older versions of git-completion.bash won't care that the input
they get doesn't change after a "sort -u".
While we're at it let's make "--config-for-completion" die if there's
anything left over in "argc", and do the same in the new
"--config-sections-for-completion" option.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a regression test for the --config-for-completion option, this was
tested for indirectly with the test added in 7a09a8f093 (completion:
add tests for 'git config' completion, 2019-08-13), but let's do it
directly here as well.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As preceding commits have incrementally established all of the --all,
--guides, --config and hidden --config-for-completion options are
mutually exclusive. So let's use OPT_CMDMODE() to parse the
command-line instead, and take advantage of its conflicting options
detection.
This is the first command with a hidden CMDMODE, so let's introduce a
OPT_CMDMODE_F() macro to go along with OPT_CMDMODE().
I think this makes the usage information that we emit slightly worse,
e.g. before we'd emit:
$ git help --all --config
fatal: --config and --all cannot be combined
usage: git help [-a|--all] [--[no-]verbose]]
[[-i|--info] [-m|--man] [-w|--web]] [<command>]
or: git help [-g|--guides]
or: git help [-c|--config]
[...]
$
And now:
$ git help --all --config
error: option `config' is incompatible with --all
$
But improving that is a general topic for parse-options.c improvement,
i.e. we should probably emit the full usage in that case.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --all and --guides commands could be combined, which wouldn't have
any impact on the output except for:
git help --all --guides --no-verbose
Listing the guide alongside that output was clearly not intended, so
let's error out here. See 002b726a40 (builtin/help.c: add
list_common_guides_help() function, 2013-04-02) for the initial
implementation.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug in the --config option that's been there ever since its
introduction in 3ac68a93fd (help: add --config to list all available
config, 2018-05-26). Die when --all and --config are combined,
combining them doesn't make sense.
The code for the --config option when combined with an earlier
refactoring done to support the --guide option in
65f98358c0 (builtin/help.c: add --guide option, 2013-04-02) would
cause us to take the "--all" branch early and ignore the --config
option.
Let's instead list these as incompatible, both in the synopsis and
help output, and enforce it in the code itself.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a missing test for checking what the --config output added in
ac68a93fd2 (help: add --config to list all available config,
2018-05-26) looks like. We should not be emitting anything except
config variables and the brief usage information at the end here.
The second test regexp here might not match three-level variables in
general, as their second level could contain ".", but in this case
we're always emitting what we extract from the documentation, so it's
all strings like:
foo.<name>.bar
If we did introduce something like variable example content here we'd
like this to break, since we'd then be likely to break the
git-completion.bash.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As noted in 65f98358c0 (builtin/help.c: add --guide option,
2013-04-02) and a133737b80 (doc: include --guide option description
for "git help", 2013-04-02) which introduced the --guide option, it
cannot be combined with e.g. <command>.
Change the command and the "SYNOPSIS" section to reflect that desired
behavior. Now that we assert this in code we don't need to
exhaustively describe the previous confusing behavior in the
documentation either, instead of silently ignoring the provided
argument we'll now error out.
The "We're done. Ignore any remaining args" comment added in
15f7d49438 (builtin/help.c: split "-a" processing into two,
2013-04-02) can now be removed, it's obvious that we're asserting the
behavior with the check of "argc".
The "--config" option is still missing from the synopsis, it will be
added in a subsequent commit where we'll fix bugs in its
implementation.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/grep-haystack-is-read-only:
grep: store grep_source buffer as const
grep: mark "haystack" buffers as const
grep: stop modifying buffer in grep_source_1()
grep: stop modifying buffer in show_line()
grep: stop modifying buffer in strip_timestamp
The "COMPUTE_HEADER_DEPENDENCIES" feature added in [1] was extended to
use auto-detection in [2], that "auto" detection has always piped
STDERR to /dev/null, so any failures on compilers that didn't support
these GCC flags would silently fall back to
"COMPUTE_HEADER_DEPENDENCIES=no".
Later when -Wpedantic support was added to DEVOPTS in [3] we started
passing -Wpedantic in combination with -Werror to the compiler
here. Note (to the pedantic): [3] actually passed "-pedantic", but it
and "-Wpedantic" are synonyms.
Turning on -Wpedantic in [3] broke the auto-detection, since this
relies on compiling an empty program. GCC would loudly complain on
STDERR:
/dev/null:1: error: ISO C forbids an empty translation unit
[-Werror=pedantic]
cc1: note: unrecognized command-line option
‘-Wno-pedantic-ms-format’ may have been intended to silence
earlier diagnostics
cc1: all warnings being treated as errors
But as that ended up in the "$(dep_check)" variable due to the "2>&1"
in [2] we didn't see it.
Then when [4] made DEVOPTS=pedantic the default specifying
"DEVELOPER=1" would effectively set "COMPUTE_HEADER_DEPENDENCIES=no".
To fix these issues let's unconditionally pass -Wno-pedantic after
$(ALL_CFLAGS), we might get a -Wpedantic via config.mak.dev after, or
the builder might specify it via CFLAGS. In either case this will undo
current and future problems with -Wpedantic.
I think it would make sense to simply remove the "2>&1", it would mean
that anyone using a non-GCC-like compiler would get warnings under
COMPUTE_HEADER_DEPENDENCIES=auto, e.g on AIX's xlc would emit:
/opt/IBM/xlc/13.1.3/bin/.orig/xlc: 1501-208 (S) command option D is missing a subargument
Non-zero 40 exit with COMPUTE_HEADER_DEPENDENCIES=auto, set it to "yes" or "no" to quiet auto-detect
And on Solaris with SunCC:
cc: Warning: Option -x passed to ld, if ld is invoked, ignored otherwise
cc: refused to overwrite input file by output file: /dev/null
cc: Warning: Option -x passed to ld, if ld is invoked, ignored otherwise
cc: refused to overwrite input file by output file: /dev/null
Non-zero 1 exit with COMPUTE_HEADER_DEPENDENCIES=auto, set it to "yes" or "no" to quiet auto-detect
Both could be quieted by setting COMPUTE_HEADER_DEPENDENCIES=no
explicitly, as suggested, but let's see if this'll fix it without
emitting too much noise at those that aren't using "gcc" or "clang".
1. f2fabbf76e (Teach Makefile to check header dependencies,
2010-01-26)
2. 111ee18c31 (Makefile: Use computed header dependencies if the
compiler supports it, 2011-08-18)
3. 729b3925ed (Makefile: add a DEVOPTS flag to get pedantic
compilation, 2018-07-24)
4. 6a8cbc41ba (developer: enable pedantic by default, 2021-09-03)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "sparse" target and its *.sp dependencies to be
non-.PHONY. Before this change "make sparse" would take ~5s to re-run
all the *.c files through "cgcc", after it it'll create an empty *.sp
file sitting alongside the *.c file, only if the *.c file or its
dependencies are newer than the *.sp is the *.sp re-made.
We ensure that the recursive dependencies are correct by depending on
the *.o file, which in turn will have correct dependencies by either
depending on all header files, or under
"COMPUTE_HEADER_DEPENDENCIES=yes" the headers it needs.
This means that a plain "make sparse" is much slower, as we'll now
need to make the *.o files just to create the *.sp files, but
incrementally creating the *.sp files is *much* faster and less
verbose, it thus becomes viable to run "sparse" along with "all" as
e.g. "git rebase --exec 'make all sparse'".
On my box with -j8 "make sparse" was fast before, or around 5 seconds,
now it only takes that long the first time, and the common case is
<100ms, or however long it takes GNU make to stat the *.sp file and
see that all the corresponding *.c file and its dependencies are
older.
See 0bcd9ae85d (sparse: Fix errors due to missing target-specific
variables, 2011-04-21) for the modern implementation of the sparse
target being changed here.
It is critical that we use -Wsparse-error here, otherwise the error
would only show up once, but we'd successfully create the empty *.sp
file, and running a second time wouldn't show the error. I'm therefore
not putting it into SPARSE_FLAGS or SP_EXTRA_FLAGS, it's not optional,
the Makefile logic won't behave properly without it.
Appending to $@ without a move is OK here because we're using the
.DELETE_ON_ERROR Makefile feature. See 7b76d6bf22 (Makefile: add and
use the ".DELETE_ON_ERROR" flag, 2021-06-29). GNU make ensures that on
error this file will be removed.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When HTTP/2 is in use, we fail to correctly redact "Authorization" (and
other) headers in our GIT_TRACE_CURL output.
We get the headers in our CURLOPT_DEBUGFUNCTION callback, curl_trace().
It passes them along to curl_dump_header(), which in turn checks
redact_sensitive_header(). We see the headers as a text buffer like:
Host: ...
Authorization: Basic ...
After breaking it into lines, we match each header using skip_prefix().
This is case-sensitive, even though HTTP headers are case-insensitive.
This has worked reliably in the past because these headers are generated
by curl itself, which is predictable in what it sends.
But when HTTP/2 is in use, instead we get a lower-case "authorization:"
header, and we fail to match it. The fix is simple: we should match with
skip_iprefix().
Testing is more complicated, though. We do have a test for the redacting
feature, but we don't hit the problem case because our test Apache setup
does not understand HTTP/2. You can reproduce the issue by applying this
on top of the test change in this patch:
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index afa91e38b0..19267c7107 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -29,6 +29,9 @@ ErrorLog error.log
LoadModule setenvif_module modules/mod_setenvif.so
</IfModule>
+LoadModule http2_module modules/mod_http2.so
+Protocols h2c
+
<IfVersion < 2.4>
LockFile accept.lock
</IfVersion>
@@ -64,8 +67,8 @@ LockFile accept.lock
<IfModule !mod_access_compat.c>
LoadModule access_compat_module modules/mod_access_compat.so
</IfModule>
-<IfModule !mod_mpm_prefork.c>
- LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
+<IfModule !mod_mpm_event.c>
+ LoadModule mpm_event_module modules/mod_mpm_event.so
</IfModule>
<IfModule !mod_unixd.c>
LoadModule unixd_module modules/mod_unixd.so
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 1c2a444ae7..ff74f0ae8a 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -24,6 +24,10 @@ test_expect_success 'create http-accessible bare repository' '
git push public main:main
'
+test_expect_success 'prefer http/2' '
+ git config --global http.version HTTP/2
+'
+
setup_askpass_helper
test_expect_success 'clone http repository' '
but this has a few issues:
- it's not necessarily portable. The http2 apache module might not be
available on all systems. Further, the http2 module isn't compatible
with the prefork mpm, so we have to switch to something else. But we
don't necessarily know what's available. It would be nice if we
could have conditional config, but IfModule only tells us if a
module is already loaded, not whether it is available at all.
This might be a non-issue. The http tests are already optional, and
modern-enough systems may just have both of these. But...
- if we do this, then we'd no longer be testing HTTP/1.1 at all. I'm
not sure how much that matters since it's all handled by curl under
the hood, but I'd worry that some detail leaks through. We'd
probably want two scripts running similar tests, one with HTTP/2 and
one with HTTP/1.1.
- speaking of which, a later test fails with the patch above! The
problem is that it is making sure we used a chunked
transfer-encoding by looking for that header in the trace. But
HTTP/2 doesn't support that, as it has its own streaming mechanisms
(the overall operation works fine; we just don't see the header in
the trace).
Furthermore, even with the changes above, this test still does not
detect the current failure, because we see _both_ HTTP/1.1 and HTTP/2
requests, which confuse it. Quoting only the interesting bits from the
resulting trace file, we first see:
=> Send header: GET /auth/smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
=> Send header: Connection: Upgrade, HTTP2-Settings
=> Send header: Upgrade: h2c
=> Send header: HTTP2-Settings: AAMAAABkAAQCAAAAAAIAAAAA
<= Recv header: HTTP/1.1 401 Unauthorized
<= Recv header: Date: Wed, 22 Sep 2021 20:03:32 GMT
<= Recv header: Server: Apache/2.4.49 (Debian)
<= Recv header: WWW-Authenticate: Basic realm="git-auth"
So the client asks for HTTP/2, but Apache does not do the upgrade for
the 401 response. Then the client repeats with credentials:
=> Send header: GET /auth/smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
=> Send header: Authorization: Basic <redacted>
=> Send header: Connection: Upgrade, HTTP2-Settings
=> Send header: Upgrade: h2c
=> Send header: HTTP2-Settings: AAMAAABkAAQCAAAAAAIAAAAA
<= Recv header: HTTP/1.1 101 Switching Protocols
<= Recv header: Upgrade: h2c
<= Recv header: Connection: Upgrade
<= Recv header: HTTP/2 200
<= Recv header: content-type: application/x-git-upload-pack-advertisement
So the client does properly redact there, because we're speaking
HTTP/1.1, and the server indicates it can do the upgrade. And then the
client will make further requests using HTTP/2:
=> Send header: POST /auth/smart/repo.git/git-upload-pack HTTP/2
=> Send header: authorization: Basic dXNlckBob3N0OnBhc3NAaG9zdA==
=> Send header: content-type: application/x-git-upload-pack-request
And there we can see that the credential is _not_ redacted. This part of
the test is what gets confused:
# Ensure that there is no "Basic" followed by a base64 string, but that
# the auth details are redacted
! grep "Authorization: Basic [0-9a-zA-Z+/]" trace &&
grep "Authorization: Basic <redacted>" trace
The first grep does not match the un-redacted HTTP/2 header, because
it insists on an uppercase "A". And the second one does find the
HTTP/1.1 header. So as far as the test is concerned, everything is OK,
but it failed to notice the un-redacted lines.
We can make this test (and the other related ones) more robust by adding
"-i" to grep case-insensitively. This isn't really doing anything for
now, since we're not actually speaking HTTP/2, but it future-proofs the
tests for a day when we do (either we add explicit HTTP/2 test support,
or it's eventually enabled by default by our Apache+curl test setup).
And it doesn't hurt in the meantime for the tests to be more careful.
The change to use "grep -i", coupled with the changes to use HTTP/2
shown above, causes the test to fail with the current code, and pass
after this patch is applied.
And finally, there's one other way to demonstrate the issue (and how I
actually found it originally). Looking at GIT_TRACE_CURL output against
github.com, you'll see the unredacted output, even if you didn't set
http.version. That's because setting it is only necessary for curl to
send the extra headers in its HTTP/1.1 request that say "Hey, I speak
HTTP/2; upgrade if you do, too". But for a production site speaking
https, the server advertises via ALPN, a TLS extension, that it supports
HTTP/2, and the client can immediately start using it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove spaces in `non - zero` and add a space between the diff
format/mode and option parentheses in difftool's usage strings.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we've split up the write_sub_test_lib_test*() and
run_sub_test_lib_test*() functions let's fix those tests in
t0000-basic.sh that were verbosely copy/pasting earlier tests.
That we caught all of them was asserted with a follow-up change that's
not part of this series[1], we might add such a duplication check at
some later time, but for now let's just one-off remove the duplicate
boilerplate.
1. https://lore.kernel.org/git/patch-v3-6.9-bc79b29f3c-20210805T103237Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the testing for test-lib.sh itself to assert that we have a
exit code of 1, not any non-zero. Improves code added in
0445e6f0a1 (test-lib: '--run' to run only specific tests,
2014-04-30).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the two check_sub_test_lib_test*() functions to avoid
duplicating the same comparison they did of stdout. This duplication
was initially added when check_sub_test_lib_test_err() was added in
0445e6f0a1 (test-lib: '--run' to run only specific tests,
2014-04-30).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of a sub-shell for running the test_cmp of stdout/stderr for
the test author was introduced in this form in 565b6fa87b (tests:
refactor mechanics of testing in a sub test-lib, 2012-12-16), but from
looking at the history that seemed to have diligently copied my
original ad-hoc implementation in 7b90511970 (t/t0000-basic.sh: Run
the passing TODO test inside its own test-lib, 2010-08-19).
There's no reason to use a subshell here, we try to avoid it in
general. It also improves readability, if the test fails we print out
the relative path in the trash directory that needs to be looked
at.
Before that was mostly obscured, since the "write_sub_test_lib_test"
will pick the directory for you from the test name.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the $test_description provided for the generated subtests to be
constant, since the only purpose of having it is that test-lib.sh will
barf if it isn't supplied.
The other purpose of having it was to effectively split up the test
description between the argument to test_expect_success and the
argument to "write_and_run_sub_test_lib_test". Let's only use one of
the two.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the function to write and run tests of the test-lib.sh output
into two functions.
When this was added back in 565b6fa87b (tests: refactor mechanics of
testing in a sub test-lib, 2012-12-16) there was no reason to do this,
but since we started supporting test arguments in
517cd55fd5 (test-lib: self-test that --verbose works, 2013-06-23)
we've started to write out duplicate tests simply to test different
arguments, now we'll be able to re-use them.
This change doesn't consolidate any of those tests yet, it just makes
it possible to do so. All the changes in t0000-basic.sh are a simple
search-replacement.
Since the _run_sub_test_lib_test_common() function doesn't handle
running the test anymore we can do away with the sub-shell, which was
used to scope an "unset" and "export" shell variables.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some commands such as "git stash" emit continued options output with
e.g. "git stash -h", because usage_with_options_internal() prefixes
with its own whitespace the resulting output wasn't properly
aligned. Let's account for the added whitespace, which properly aligns
the output.
The "git stash" command has usage output with a N_() translation that
legitimately stretches across multiple lines;
N_("git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]\n"
" [-u|--include-untracked] [-a|--all] [-m|--message <message>]\n"
[...]
We'd like to have that output aligned with the length of the initial
"git stash " output, but since usage_with_options_internal() adds its
own whitespace prefixing we fell short, before this change we'd emit:
$ git stash -h
usage: git stash list [<options>]
or: git stash show [<options>] [<stash>]
[...]
or: git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
[-u|--include-untracked] [-a|--all] [-m|--message <message>]
[...]
Now we'll properly emit aligned output. I.e. the last four lines
above will instead be (a whitespace-only change to the above):
[...]
or: git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
[-u|--include-untracked] [-a|--all] [-m|--message <message>]
[...]
We could also go for an approach where we have the caller support no
padding of their own, i.e. (same as the first example, except for the
padding on the second line):
N_("git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]\n"
"[-u|--include-untracked] [-a|--all] [-m|--message <message>]\n"
[...]
But to do that we'll need to find the length of "git stash". We can
discover that from the "cmd" in the "struct cmd_struct", but there
might be cases with sub-commands or "git" itself taking arguments that
would make that non-trivial.
Even if it were I still think this approach is better, because this way
we'll get the same legible alignment in the C code. The fact that
usage_with_options_internal() is adding its own prefix padding is an
implementation detail that callers shouldn't need to worry about.
Implementation notes:
We could skip the string_list_split() with a strchr(str, '\n') check,
but we'd then need to duplicate our state machine for strings that do
and don't contain a "\n". It's simpler to just always split into a
"struct string_list", even though the common case is that that "struct
string_list" will contain only one element. This is not
performance-sensitive code.
This change is relatively more complex since I've accounted for making
it future-proof for RTL translation support. Later in
usage_with_options_internal() we have some existing padding code
dating back to d7a38c54a6 (parse-options: be able to generate usages
automatically, 2007-10-15) which isn't RTL-safe, but that code would
be easy to fix. Let's not introduce new RTL translation problems here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the "Sending v2" section, readers are directed to create v2 patches
without using --range-diff. However, it is customary to include a
range-diff against the v1 patches as a reviewer aid.
Update the "Sending v2" section to suggest a simple workflow that uses
the --range-diff option. Also include some explanation for -v2 and
--range-diff to help the reader understand the importance.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a logic error in dfea575017 (Makefile: lazily compute header
dependencies, 2010-01-26) where we'd make whether we cleaned the
.depend dirs contingent on the currently configured
COMPUTE_HEADER_DEPENDENCIES value. Before this running e.g.:
make COMPUTE_HEADER_DEPENDENCIES=yes grep.o
make COMPUTE_HEADER_DEPENDENCIES=no clean
Would leave behind the .depend directory, now it'll be removed.
Normally we'd need to use another variable, but in this case there's
no other uses of $(dep_dirs), as opposed to $(dep_args) which is used
as an argument to $(CC). So just deleting this line makes everything
work correctly.
See http://lore.kernel.org/git/xmqqmto48ufz.fsf@gitster.g for a report
about this issue.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_TEST_INSTALLED was moved from perf-lib.sh to run in df0f5021
(perf-lib.sh: remove GIT_TEST_INSTALLED from perf-lib.sh, 2019-05-07)
and that included a change to how it inspected the existence of a
bin-wrappers directory. However, that included a typo that made the
match of bin-wrappers never work. Specifically, the assignment was
mydir_abs_wrappers="$mydir_abs_wrappers/bin-wrappers"
which uses the same variable before it is initialized. By changing it to
mydir_abs_wrappers="$mydir_abs/bin-wrappers"
We can correctly use the bin-wrappers directory.
This is critical to successfully computing performance of commands that
execute subcommands. The bin-wrappers ensure that the --exec-path is set
correctly.
Reported-by: Victoria Dye <vdye@github.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the bitfield added in 58300f4743 (sparse-index: add
index.sparse config option, 2021-03-30) and 3964fc2aae (sparse-index:
add guard to ensure full index, 2021-03-30) to just use an "int"
boolean instead.
It might be smart to optimize the space here in the future, but by
consistently using an "int" we can take its address and pass it to
repo_cfg_bool(), and therefore don't need to handle "sparse_index" as
a special-case when reading the "index.sparse" setting.
There's no corresponding config for "command_requires_full_index", but
let's change it too for consistency and to prevent future bugs
creeping in due to one of these being "unsigned".
Using "int" consistently also prevents subtle bugs or undesired
control flow creeping in here. Before the preceding commit the
initialization of "command_requires_full_index" in
prepare_repo_settings() did nothing, i.e. this:
r->settings.command_requires_full_index = 1
Was redundant to the earlier memset() to -1. Likewise for
"sparse_index" added in 58300f4743 (sparse-index: add index.sparse
config option, 2021-03-30) the code and comment added there was
misleading, we weren't initializing it to off, but re-initializing it
from "1" to "0", and then finally checking the config, and perhaps
setting it to "1" again. I.e. we could have applied this patch before
the preceding commit:
+ assert(r->settings.command_requires_full_index == 1);
r->settings.command_requires_full_index = 1;
/*
* Initialize this as off.
*/
+ assert(r->settings.sparse_index == 1);
r->settings.sparse_index = 0;
if (!repo_config_get_bool(r, "index.sparse", &value) && value)
r->settings.sparse_index = 1;
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify the setup code in repo-settings.c in various ways, making the
code shorter, easier to read, and requiring fewer hacks to do the same
thing as it did before:
Since 7211b9e753 (repo-settings: consolidate some config settings,
2019-08-13) we have memset() the whole "settings" structure to -1 in
prepare_repo_settings(), and subsequently relied on the -1 value.
Most of the fields did not need to be initialized to -1, and because
we were doing that we had the enum labels "UNTRACKED_CACHE_UNSET" and
"FETCH_NEGOTIATION_UNSET" purely to reflect the resulting state
created this memset() in prepare_repo_settings(). No other code used
or relied on them, more on that below.
For the rest most of the subsequent "are we -1, then read xyz" can
simply be removed by re-arranging what we read first. E.g. when
setting the "index.version" setting we should have first read
"feature.experimental", so that it (and "feature.manyfiles") can
provide a default for our "index.version".
Instead the code setting it, added when "feature.manyFiles"[1] was
created, was using the UPDATE_DEFAULT_BOOL() macro added in an earlier
commit[2]. That macro is now gone, since it was only needed for this
pattern of reading things in the wrong order.
This also fixes an (admittedly obscure) logic error where we'd
conflate an explicit "-1" value in the config with our own earlier
memset() -1.
We can also remove the UPDATE_DEFAULT_BOOL() wrapper added in
[3]. Using it is redundant to simply using the return value from
repo_config_get_bool(), which is non-zero if the provided key exists
in the config.
Details on edge cases relating to the memset() to -1, continued from
"more on that below" above:
* UNTRACKED_CACHE_KEEP:
In [4] the "unset" and "keep" handling for core.untrackedCache was
consolidated. But it while we understand the "keep" value, we don't
handle it differently than the case of any other unknown value.
So let's retain UNTRACKED_CACHE_KEEP and remove the
UNTRACKED_CACHE_UNSET setting (which was always implicitly
UNTRACKED_CACHE_KEEP before). We don't need to inform any code
after prepare_repo_settings() that the setting was "unset", as far
as anyone else is concerned it's core.untrackedCache=keep. if
"core.untrackedcache" isn't present in the config.
* FETCH_NEGOTIATION_UNSET & FETCH_NEGOTIATION_NONE:
Since these two two enum fields added in [5] don't rely on the
memzero() setting them to "-1" anymore we don't have to provide
them with explicit values.
1. c6cc4c5afd (repo-settings: create feature.manyFiles setting,
2019-08-13)
2. 31b1de6a09 (commit-graph: turn on commit-graph by default,
2019-08-13)
3. 31b1de6a09 (commit-graph: turn on commit-graph by default,
2019-08-13)
4. ad0fb65999 (repo-settings: parse core.untrackedCache,
2019-08-13)
5. aaf633c2ad (repo-settings: create feature.experimental setting,
2019-08-13)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change tweak_untracked_cache() in "read-cache.c" to use a switch() to
have the compiler assert that we checked all possible values in the
"enum untracked_cache_setting" type, and likewise remove the "default"
case in fetch_negotiator_init() in favor of checking for
"FETCH_NEGOTIATION_UNSET" and "FETCH_NEGOTIATION_NONE".
As will be discussed in a subsequent we'll only ever have either of
these set to FETCH_NEGOTIATION_NONE, FETCH_NEGOTIATION_UNSET and
UNTRACKED_CACHE_UNSET within the prepare_repo_settings() function
itself. In preparation for fixing that code let's add a BUG() here to
mark this as unreachable code.
See ad0fb65999 (repo-settings: parse core.untrackedCache, 2019-08-13)
for when the "unset" and "keep" handling for core.untrackedCache was
consolidated, and aaf633c2ad (repo-settings: create
feature.experimental setting, 2019-08-13) for the addition of the
"default" pattern in "fetch-negotiator.c".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of the global ignore_untracked_cache_config variable added in
dae6c322fa (test-dump-untracked-cache: don't modify the untracked
cache, 2016-01-27) we can make use of the new facility to set config
via environment variables added in d8d77153ea (config: allow
specifying config entries via envvar pairs, 2021-01-12).
It's arguably a bit hacky to use setenv() and getenv() to pass
messages between the same program, but since the test helpers are not
the main intended audience of repo-settings.c I think it's better than
hardcoding the test-only special-case in prepare_repo_settings().
This uses the xsetenv() wrapper added in the preceding commit, if we
don't set these in the environment we'll fail in
t7063-status-untracked-cache.sh, but let's fail earlier anyway if that
were to happen.
This breaks any parent process that's potentially using the
GIT_CONFIG_* and GIT_CONFIG_PARAMETERS mechanism to pass one-shot
config setting down to a git subprocess, but in this case we don't
care about the general case of such potential parents. This process
neither spawns other "git" processes, nor is it interested in other
configuration. We might want to pick up other test modes here, but
those will be passed via GIT_TEST_* environment variables.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add fatal wrappers for setenv() and unsetenv(). In d7ac12b25d (Add
set_git_dir() function, 2007-08-01) we started checking its return
value, and since 48988c4d0c (set_git_dir: die when setenv() fails,
2018-03-30) we've had set_git_dir_1() die if we couldn't set it.
Let's provide a wrapper for both, this will be useful in many other
places, a subsequent patch will make another use of xsetenv().
The checking of the return value here is over-eager according to
setenv(3) and POSIX. It's documented as returning just -1 or 0, so
perhaps we should be checking -1 explicitly.
Let's just instead die on any non-zero, if our C library is so broken
as to return something else than -1 on error (and perhaps not set
errno?) the worst we'll do is die with a nonsensical errno value, but
we'll want to die in either case.
Let's make these return "void" instead of "int". As far as I can tell
there's no other x*() wrappers that needed to make the decision of
deviating from the signature in the C library, but since their return
value is only used to indicate errors (so we'd die here), we can catch
unreachable code such as
if (xsetenv(...) < 0)
[...];
I think it would be OK skip the NULL check of the "name" here for the
calls to die_errno(). Almost all of our setenv() callers are taking a
constant string hardcoded in the source as the first argument, and for
the rest we can probably assume they've done the NULL check
themselves. Even if they didn't, modern C libraries are forgiving
about it (e.g. glibc formatting it as "(null)"), on those that aren't,
well, we were about to die anyway. But let's include the check anyway
for good measure.
1. https://pubs.opengroup.org/onlinepubs/009604499/functions/setenv.html
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Usage description for -X and -z options use descriptive instead of
imperative mood. Edit it for consistency with other options.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A rebase started with 'git rebase <A> <B>' is conceptually to first
checkout <B> and run 'git rebase <A>' starting from that state. 'git
rebase --abort' in the middle of such a rebase should take us back to
the state we checked out <B>.
This used to work, even when <B> is a tag that points at a commit,
until Git 2.20.0 when the command was reimplemented in C. The command
now complains that the tag object itself cannot be checked out, which
may be technically correct but is not what the user asked to do.
Fix this old regression by using lookup_commit_reference_by_name()
when parsing <B>. The scripted version did not need to peel the tag
because the commands it passed the tag to (e.g 'git reset') peeled the
tag themselves.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
peel_committish() appears to have been copied from the scripted rebase
but it duplicates the functionality of
lookup_commit_reference_by_name() so lets use that instead.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git uses −1 to signal an error. The builtin rebase converts these to
+1 all over the place using !! (presumably because the in the scripted
version an error was signalled by +1). This is confusing and clutters
the code, we only need to convert the value when the function returns.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our grep_buffer() function takes a non-const buffer, which is confusing:
we don't take ownership of nor write to the buffer.
This mostly comes from the fact that the underlying grep_source struct
in which we store the buffer uses non-const pointer. The memory pointed
to by the struct is sometimes owned by us (for FILE or OID sources), and
sometimes not (for BUF sources).
Let's store it as const, which lets us err on the side of caution (i.e.,
the compiler will warn us if any of our code writes to or tries to free
it).
As a result, we must annotate the one place where we do free it by
casting away the constness. But that's a small price to pay for the
extra safety and clarity elsewhere (and indeed, it already had a comment
explaining why GREP_SOURCE_BUF _didn't_ free it).
And then we can mark grep_buffer() as taking a const buffer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're grepping in a buffer, we don't need to modify it. So we can
take "const char *" buffers, rather than "char *". This can avoid some
awkward casts in our callers, and make our expectations more clear (we
will not take ownership of the memory, nor will we ever write to it).
These spots don't all necessarily have to be converted at the same time,
but some of them are dependent on others, because we pass
pointers-to-pointers in a few cases. And none of this should change any
behavior, since we're just adding "const" qualifiers (and likewise, the
compiler will let us know if we missed any spots). So it's relatively
low-risk to just do this all at once.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We find the end of each matching line of a buffer, and then temporarily
write a NUL to turn it into a regular C string. But we don't need to do
so, because the only thing we do in the interim is pass the line and its
length (via an "eol" pointer) to match_line(). And that function should
only look at the bytes we passed it, whether it has a terminating NUL or
not.
We can drop this temporary write in order to simplify the code and make
it easier to use const buffers in more of grep.c.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When showing lines via grep (or looking for funcnames), we call
show_line() on a multi-line buffer. It finds the end of line and marks
it with a NUL. However, we don't need to do so, as the resulting line is
only used along with its "eol" marker:
- we pass both to next_match(), which takes care to look at only the
bytes we specified
- we pass the line to output_color() without its matching eol marker.
However, we do use the "match" struct we got from next_match() to
tell it how many bytes to look at (which can never exceed the string
we passed it).
So we can stop setting and restoring this NUL marker. That makes the
code simpler, and will allow us to take a const buffer in a future
patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When grepping for headers in commit objects, we receive individual
lines (e.g., "author Name <email> 1234 -0000"), and then strip off the
timestamp to do our match. We do so by writing a NUL byte over the
whitespace separator, and then remembering to restore it later.
We had to do it this way when this was added back in a4d7d2c6db (log
--author/--committer: really match only with name part, 2008-09-04),
because we fed the result directly to regexec(), which expects a
NUL-terminated string. But since b7d36ffca0 (regex: use regexec_buf(),
2016-09-21), we have a function which can match part of a buffer.
So instead of modifying the string, we can instead just move the "eol"
pointer, and the rest of the code will do the right thing. This will let
further patches mark more buffers as "const".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as the previous patch, let sub-commands which
support showing or hiding a progress meter handle parsing the
`--progress` or `--no-progress` option, but do not expose it as an
option to the top-level `multi-pack-index` builtin.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "mode" word is useless in a call to open(2) that does not
create a new file. Such a call in the files backend of the ref
subsystem has been cleaned up.
* rs/no-mode-to-open-when-appending:
refs/files-backend: remove unused open mode parameter
The run-command API has been updated so that the callers can easily
ask the file descriptors open for packfiles to be closed immediately
before spawning commands that may trigger auto-gc.
* js/run-command-close-packs:
Close object store closer to spawning child processes
run_auto_maintenance(): implicitly close the object store
run-command: offer to close the object store before running
run-command: prettify the `RUN_COMMAND_*` flags
pull: release packs before fetching
commit-graph: when closing the graph, also release the slab
Various mergy operations have been prepared to work efficiently
with the sparse index.
* ds/mergies-with-sparse-index:
sparse-index: integrate with cherry-pick and rebase
sequencer: ensure full index if not ORT strategy
t1092: add cherry-pick, rebase tests
merge-ort: expand only for out-of-cone conflicts
merge: make sparse-aware with ORT
diff: ignore sparse paths in diffstat
In cone mode, the sparse-index code path learned to remove ignored
files (like build artifacts) outside the sparse cone, allowing the
entire directory outside the sparse cone to be removed, which is
especially useful when the sparse patterns change.
* ds/sparse-index-ignored-files:
sparse-checkout: clear tracked sparse dirs
sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
attr: be careful about sparse directories
sparse-checkout: create helper methods
sparse-index: use WRITE_TREE_MISSING_OK
sparse-index: silently return when cache tree fails
unpack-trees: fix nested sparse-dir search
sparse-index: silently return when not using cone-mode patterns
t7519: rewrite sparse index test
Build clean-up for "make tags" and friends.
* ab/make-tags-cleanup:
Makefile: normalize clobbering & xargs for tags targets
Makefile: remove "cscope.out", not "cscope*" in cscope.out target
Makefile: don't use "FORCE" for tags targets
Makefile: add QUIET_GEN to "cscope" target
Makefile: move ".PHONY: cscope" near its target
Code clean-up around "git serve".
* ab/serve-cleanup:
upload-pack: document and rename --advertise-refs
serve.[ch]: remove "serve_options", split up --advertise-refs code
{upload,receive}-pack tests: add --advertise-refs tests
serve.c: move version line to advertise_capabilities()
serve: move transfer.advertiseSID check into session_id_advertise()
serve.[ch]: don't pass "struct strvec *keys" to commands
serve: use designated initializers
transport: use designated initializers
transport: rename "fetch" in transport_vtable to "fetch_refs"
serve: mark has_capability() as static
More parts of "git submodule add" has been rewritten in C.
* ar/submodule-add-more:
submodule--helper: rename compute_submodule_clone_url()
submodule--helper: remove resolve-relative-url subcommand
submodule--helper: remove add-config subcommand
submodule--helper: remove add-clone subcommand
submodule--helper: convert the bulk of cmd_add() to C
dir: libify and export helper functions from clone.c
submodule--helper: remove repeated code in sync_submodule()
submodule--helper: refactor resolve_relative_url() helper
submodule--helper: add options for compute_submodule_clone_url()
The order in which various files that make up a single (conceptual)
packfile has been reevaluated and straightened up. This matters in
correctness, as an incomplete set of files must not be shown to a
running Git.
* tb/pack-finalize-ordering:
pack-objects: rename .idx files into place after .bitmap files
pack-write: split up finish_tmp_packfile() function
builtin/index-pack.c: move `.idx` files into place last
index-pack: refactor renaming in final()
builtin/repack.c: move `.idx` files into place last
pack-write.c: rename `.idx` files after `*.rev`
pack-write: refactor renaming in finish_tmp_packfile()
bulk-checkin.c: store checksum directly
pack.h: line-wrap the definition of finish_tmp_packfile()
Update the build procedure to use the "-pedantic" build when
DEVELOPER makefile macro is in effect.
* cb/pedantic-build-for-developers:
developer: enable pedantic by default
win32: allow building with pedantic mode enabled
gettext: remove optional non-standard parens in N_() definition
The code to show progress indicator in a few code paths did not
cover between 0-100%, which has been corrected.
* ab/progress-users-adjust-counters:
entry: show finer-grained counter in "Filtering content" progress line
commit-graph: fix bogus counter in "Scanning merged commits" progress line
"git diff --submodule=diff" showed failure from run_command() when
trying to run diff inside a submodule, when the user manually
removes the submodule directory.
* dt/submodule-diff-fixes:
diff --submodule=diff: don't print failure message twice
diff --submodule=diff: do not fail on ever-initialied deleted submodules
t4060: remove unused variable
Reduce number of write(2) system calls while sending the
ref advertisement.
* jv/pkt-line-batch:
upload-pack: use stdio in send_ref callbacks
pkt-line: add stdio packet write functions
"git maintenance" scheduler learned to use systemd timers as a
possible backend.
* lh/systemd-timers:
maintenance: add support for systemd timers on Linux
maintenance: `git maintenance run` learned `--scheduler=<scheduler>`
cache.h: Introduce a generic "xdg_config_home_for(…)" function
The tracing of process ancestry information has been enhanced.
* ab/tr2-leaks-and-fixes:
tr2: log N parent process names on Linux
tr2: do compiler enum check in trace2_collect_process_info()
tr2: leave the parent list empty upon failure & don't leak memory
tr2: stop leaking "thread_name" memory
tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
tr2: remove NEEDSWORK comment for "non-procfs" implementations
The code to make "git grep" recurse into submodules has been
updated to migrate away from the "add submodule's object store as
an alternate object store" mechanism (which is suboptimal).
* jt/grep-wo-submodule-odb-as-alternate:
t7814: show lack of alternate ODB-adding
submodule-config: pass repo upon blob config read
grep: add repository to OID grep sources
grep: allocate subrepos on heap
grep: read submodule entry with explicit repo
grep: typesafe versions of grep_source_init
grep: use submodule-ODB-as-alternate lazy-addition
submodule: lazily add submodule ODBs as alternates
The reachability bitmap file used to be generated only for a single
pack, but now we've learned to generate bitmaps for history that
span across multiple packfiles.
* tb/multi-pack-bitmaps: (29 commits)
pack-bitmap: drop bitmap_index argument from try_partial_reuse()
pack-bitmap: drop repository argument from prepare_midx_bitmap_git()
p5326: perf tests for MIDX bitmaps
p5310: extract full and partial bitmap tests
midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
t7700: update to work with MIDX bitmap test knob
t5319: don't write MIDX bitmaps in t5319
t5310: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
t5326: test multi-pack bitmap behavior
t/helper/test-read-midx.c: add --checksum mode
t5310: move some tests to lib-bitmap.sh
pack-bitmap: write multi-pack bitmaps
pack-bitmap: read multi-pack bitmaps
pack-bitmap.c: avoid redundant calls to try_partial_reuse
pack-bitmap.c: introduce 'bitmap_is_preferred_refname()'
pack-bitmap.c: introduce 'nth_bitmap_object_oid()'
pack-bitmap.c: introduce 'bitmap_num_objects()'
midx: avoid opening multiple MIDXs when writing
midx: close linked MIDXs, avoid leaking memory
...
Optimize code that handles large number of refs in the "git fetch"
code path.
* ps/fetch-optim:
fetch: avoid second connectivity check if we already have all objects
fetch: merge fetching and consuming refs
fetch: refactor fetch refs to be more extendable
fetch-pack: optimize loading of refs via commit graph
connected: refactor iterator to return next object ID directly
fetch: avoid unpacking headers in object existence check
fetch: speed up lookup of want refs via commit-graph
When cloning a repository with an unborn HEAD, we'll set the local HEAD
to match it only if the local repository is non-bare. This is
inconsistent with all other combinations:
remote HEAD | local repo | local HEAD
-----------------------------------------------
points to commit | non-bare | same as remote
points to commit | bare | same as remote
unborn | non-bare | same as remote
unborn | bare | local default
So I don't think this is some clever or subtle behavior, but just a bug
in 4f37d45706 (clone: respect remote unborn HEAD, 2021-02-05). And it's
easy to see how we ended up there. Before that commit, the code to set
up the HEAD for an empty repo was guarded by "if (!option_bare)". That's
because the only thing it did was call install_branch_config(), and we
don't want to do so for a bare repository (unborn HEAD or not).
That commit put the handling of unborn HEADs into the same block, since
those also need to call install_branch_config(). But the unborn case has
an additional side effect of calling create_symref(), and we want that
to happen whether we are bare or not.
This patch just pulls all of the "figure out the default branch" code
out of the "!option_bare" block. Only the actual config installation is
kept there.
Note that this does mean we might allocate "ref" and not use it (if the
remote is empty but did not advertise an unborn HEAD). But that's not
really a big deal since this isn't a hot code path, and it keeps the
code simple. The alternative would be handling unborn_head_target
separately, but that gets confusing since its memory ownership is
tangled up with the "ref" variable.
There's just one new test, for the case we're fixing. The other ones in
the table are handled elsewhere (the unborn non-bare case just above,
and the actually-born cases in t5601, t5606, and t5609, as they do not
require v2's "unborn" protocol extension).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ab/retire-option-argument:
parse-options API: remove OPTION_ARGUMENT feature
difftool: use run_command() API in run_file_diff()
difftool: prepare "diff" cmdline in cmd_difftool()
difftool: prepare "struct child_process" in cmd_difftool()
Not sure what happened, but the comment is describing code elsewhere in
the file. Fix the comment to actually discuss the code that follows.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 84e4484f12 (commit-graph: use parse_options_concat(), 2021-08-23) we
unified common options of commit-graph's subcommands into a single
"common_opts" array.
But 84e4484f12 introduced a behavior change which is to accept the
"--[no-]progress" option before any sub-commands, e.g.,
git commit-graph --progress write ...
Prior to that commit, the above would error out with "unknown option".
There are two issues with this behavior change. First is that the
top-level --[no-]progress is not always respected. This is because
isatty(2) is performed in the sub-commands, which unconditionally
overwrites any --[no-]progress that was given at the top-level.
But the second issue is that the existing sub-commands of commit-graph
only happen to both have a sensible interpretation of what `--progress`
or `--no-progress` means. If we ever added a sub-command which didn't
have a notion of progress, we would be forced to ignore the top-level
`--[no-]progress` altogether.
Since we haven't released a version of Git that supports --[no-]progress
as a top-level option for `git commit-graph`, let's remove it.
Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The configuration option `rebase.forkpoint' is only mentioned in the man
page of git-config(1). Since it is a configuration for rebase, mention
it in the documentation of rebase at the --fork-point/--no-fork-point
section. This will help users set a preferred default for their
workflow.
Signed-off-by: Wesley Schwengle <wesley@opperschaap.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert test helper to use `start_bg_command()` when spawning a server
daemon in the background rather than blocks of platform-specific code.
Also, while here, remove _() translation around error messages since
this is a test helper and not Git code.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a variation of `run_command()` and `start_command()` to launch a command
into the background and optionally wait for it to become "ready" before returning.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set an ACL on the named pipe to allow the well-known group EVERYONE
to read and write to the IPC server's named pipe. In the event that
the daemon was started with elevation, allow non-elevated clients
to communicate with the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create "ipc-debug" category events to log unexpected errors
when creating Simple-IPC connections.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the declartion of the `enum ipc_active_state` type outside of
the SUPPORTS_SIMPLE_IPC ifdef.
A later commit will introduce the `fsmonitor_ipc__*()` API and stub in
a "mock" implementation that requires this enum in some function
signatures.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add `command_len` argument to the Simple IPC API.
In my original Simple IPC API, I assumed that the request would always
be a null-terminated string of text characters. The `command`
argument was just a `const char *`.
I found a caller that would like to pass a binary command to the
daemon, so I am amending the Simple IPC API to receive `const char
*command, size_t command_len` arguments.
I considered changing the `command` argument to be a `void *`, but the
IPC layer simply passes it to the pkt-line layer which takes a `const
char *`, so to avoid confusion I left it as is.
Note, the response side has always been a `struct strbuf` which
includes the buffer and length, so we already support returning a
binary answer. (Yes, it feels a little weird returning a binary
buffer in a `strbuf`, but it works.)
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create "child_ready" event to capture the state of a child process
created in the background.
When a child command is started a "child_start" event is generated in
the Trace2 log. For normal synchronous children, a "child_exit" event
is later generated when the child exits or is terminated. The two events
include information, such as the "child_id" and "pid", to allow post
analysis to match-up the command line and exit status.
When a child is started in the background (and may outlive the parent
process), it is not possible for the parent to emit a "child_exit"
event. Create a new "child_ready" event to indicate whether the
child was successfully started. Also include the "child_id" and "pid"
to allow similar post processing.
This will be used in a later commit with the new "start_bg_command()".
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we both can propagate values from the hashcache, and respect
the configuration to enable the hashcache at all, test that both of
these function correctly by hardening their behavior with a test.
Like the hash-cache in classic single-pack bitmaps, this helps more
proportionally the more up-to-date your bitmap coverage is. When our
bitmap coverage is out-of-date with the ref tips, we spend more time
proportionally traversing, and all of that traversal gets the name-hash
filled in.
But for the up-to-date bitmaps, this helps quite a bit. These numbers
are on git.git, with `pack.threads=1` to help see the difference
reflected in the overall runtime.
Test origin/tb/multi-pack-bitmaps HEAD
-------------------------------------------------------------------------------------
5326.4: simulated clone 1.87(1.80+0.07) 1.46(1.42+0.03) -21.9%
5326.5: simulated fetch 2.66(2.61+0.04) 1.47(1.43+0.04) -44.7%
5326.6: pack to file (bitmap) 2.74(2.62+0.12) 1.89(1.82+0.07) -31.0%
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To help test the performance of permuting the contents of the hash-cache
when generating a MIDX bitmap, we need a bitmap which has its hash-cache
populated.
And since multi-pack bitmaps don't add *new* values to the hash-cache,
we have to rely on a single-pack bitmap to generate those values for us.
Therefore, generate a pack bitmap before the MIDX one in order to ensure
that the MIDX bitmap has entries in its hash-cache. Since we don't want
to time generating the pack bitmap, move that to a non-perf test run
before we try to generate the MIDX bitmap.
Likewise, get rid of the pack bitmap afterwords, to make certain that we
are not accidentally using it in the performance tests run later on.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a few typos and alignment issues, and while at it update the
example hashes to show most of the ones available in recent crypt(3).
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some versions of crypt(3) will return NULL when passed an unsupported
hash type (ex: OpenBSD with DES), so check for undef instead of using
it directly.
Also use this to probe the system and select a better hash function in
the tests, so it can pass successfully.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
[jc: <CAPUEspjqD5zy8TLuFA96usU7FYi=0wF84y7NgOVFqegtxL9zbw@mail.gmail.com>]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
c057bad370 (git-cvsserver: use a password file cvsserver pserver,
2010-05-15) adds a way for `git cvsserver` to provide authenticated
pserver accounts without having clear text passwords, but uses the
username instead of the password to the call for crypt(3).
Correct that, and make sure the documentation correctly indicates how
to obtain hashed passwords that could be used to populate this
configuration, as well as correcting the hash that was used for the
tests.
This change will require that any user of this feature updates the
hashes in their configuration, but has the advantage of using a more
similar format than cvs uses, probably also easying any migration.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9af0b8dbe2 (t0000-basic: more commit-tree tests., 2006-04-26) adds
tests for commit-tree that mask the return exit from git as described
in a378fee5b0 (Documentation: add shell guidelines, 2018-10-05).
Fix the tests, to avoid pipes by using a temporary file instead.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
b8ba412bf7 (tree-diff: avoid alloca for large allocations, 2016-06-07)
adds a way to route some bigger allocations out of the stack and free
them through the addition of two conveniently named macros, but leaves
the calls to free the xalloca part, which could be also in the heap,
if the system doesn't HAVE_ALLOCA_H (ex: macOS and other BSD).
Add the missing free call, xalloca_free(), which is a noop if we
allocated memory in the stack frame, but a real free() if we
allocated in the heap instead, and while at it, change the expression
to match in both macros for ease of readability.
This avoids a leak reported by LSAN while running t0000 but that
wouldn't fail the test (which is fixed in the next patch):
SUMMARY: LeakSanitizer: 1034 byte(s) leaked in 15 allocation(s).
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Time complexities for pack_pos_to_midx and midx_to_pack_pos are swapped,
correct it.
Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code that optionally creates the *.rev reverse index file has
been optimized to avoid needless computation when it is not writing
the file out.
* ab/reverse-midx-optim:
pack-write: skip *.rev work when not writing *.rev
"make INSTALL_STRIP=-s install" allows the installation step to use
"install -s" to strip the binaries as they get installed.
* bs/install-strip:
make: add INSTALL_STRIP option variable
Teach "test_pause" and "debug" helpers to allow using the HOME and
TERM environment variables the user usually uses.
* pb/test-use-user-env:
test-lib-functions: keep user's debugger config files and TERM in 'debug'
test-lib-functions: optionally keep HOME, TERM and SHELL in 'test_pause'
test-lib-functions: use 'TEST_SHELL_PATH' in 'test_pause'
The "git apply -3" code path learned not to bother the lower level
merge machinery when the three-way merge can be trivially resolved
without the content level merge.
* jc/trivial-threeway-binary-merge:
apply: resolve trivial merge without hitting ll-merge with "--3way"
t1400.190 sometimes fails or even hangs because of the way it uses
fifos. Our goal is to interactively read and write lines from
update-ref, so we have two fifos, in and out. We open a descriptor
connected to "in" and redirect output to that, so that update-ref does
not see EOF as it would if we opened and closed it for each "echo" call.
But we don't do the same for the output. This leads to a race where our
"read response <out" has not yet opened the fifo, but update-ref tries
to write to it and gets SIGPIPE. This can result in the test failing, or
worse, hanging as we wait forever for somebody to write to the pipe.
This is the same proble we fixed in 4783e7ea83 (t0008: avoid SIGPIPE
race condition on fifo, 2013-07-12), and we can fix it the same way, by
opening a second long-running descriptor.
Before this patch, running:
./t1400-update-ref.sh --run=1,190 --stress
failed or hung within a few dozen iterations. After it, I ran it for
several hundred without problems.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently store each submodule gitdir in ".git/modules/<name>", but
this has problems with some submodule naming schemes, as described in a
comment in submodule_name_to_gitdir() in this patch.
Extract the determination of the location of a submodule's gitdir into
its own function submodule_name_to_gitdir(). For now, the problem
remains unsolved, but this puts us in a better position for finding a
solution.
This was motivated, at $DAYJOB, by a part of Android's repo hierarchy
[1]. In particular, there is a repo "build", and several repos of the
form "build/<name>".
This is based on earlier work by Brandon Williams [2].
[1] https://android.googlesource.com/platform/
[2] https://lore.kernel.org/git/20180808223323.79989-2-bmwill@google.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The v2 ls-refs command may receive extra arguments from the client, one
per pkt-line. The spec is pretty clear that the arguments must come from
a specified set, but we silently ignore any unknown entries. For a
well-behaved client this doesn't matter, but it makes testing and
debugging more confusing. Let's tighten this up to match the spec.
In theory this liberal behavior _could_ be useful for extending the
protocol. But:
- every other part of the protocol requires that the server first
indicate that it supports the argument; this includes the fetch and
object-info commands, plus the "unborn" capability added to ls-refs
itself
- it's not a very good extension mechanism anyway; without the server
advertising support, clients would have no idea if the argument was
silently ignored, or accepted and simply had no effect
So we're not really losing anything by tightening this.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our table of v2 "capabilities" contains everything we might tell the
client we support. But there are differences in how we expect the client
to respond. Some of the entries are true capabilities (i.e., we expect
the client to say "yes, I support this"), and some are ones we expect
them to send as commands (with "command=ls-refs" or similar).
When we receive a capability used as a command, we complain about that.
But when we receive a command used as a capability (e.g., just "ls-refs"
in a pkt-line by itself), we silently ignore it.
This isn't really hurting anything (clients shouldn't send it, and we'll
ignore it), but we can tighten up the protocol to match what we expect
to happen.
There are two new tests here. The first one checks a capability used as
a command, which already passes. The second tests a command as a
capability, which this patch fixes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we see a line from the client like "command=ls-refs", we parse
everything after the equals sign as a capability, which we check against
our capabilities table. If we don't recognize the command (e.g.,
"command=foo"), we'll reject it.
But in parse_command(), we use the same get_capability() parser for
parsing non-command lines. So if we see "command=ls-refs=foo", we will
feed "ls-refs=foo" to get_capability(), which will say "OK, that's
ls-refs, with value 'foo'". But then we simply ignore the value
entirely.
The client is violating the spec here, which says:
command = PKT-LINE("command=" key LF)
key = 1*(ALPHA | DIGIT | "-_")
I.e., the key is not even allowed to have an equals sign in it. Whereas
a real non-command capability does allow a value:
capability = PKT-LINE(key[=value] LF)
So by reusing the same get_capability() parser, we are mixing up the
"key" and "capability" tokens. However, since that parser tells us
whether it saw an "=", we can still use it; we just need to reject any
input that produces a non-NULL value field.
The current behavior isn't really hurting anything (the client should
never send such a request, and if it does, we just ignore the "value"
part). But since it does violate the spec, let's tighten it up to
prevent any surprising behavior.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We've never documented the fact that a client can provide multiple
ref-prefix capabilities. Let's describe the behavior.
We also never discussed the "best effort" nature of the prefixes. The
client side of git.git has always treated them this way, filtering the
result with local patterns. And indeed any client must do this, because
the prefix patterns are not sufficient to express the usual refspecs
(and so for "foo" we ask for "refs/heads/foo", "refs/tags/foo", and so
on).
So this may be considered a change in the spec with respect to client
expectations / requirements, but it's mostly codifying existing
behavior.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because each "ref-prefix" capability from the client comes in its own
pkt-line, there's no limit to the number of them that a misbehaving
client may send. We read them all into a strvec, which means the client
can waste arbitrary amounts of our memory by just sending us "ref-prefix
foo" over and over.
One possible solution is to just drop the connection when the limit is
reached. If we set it high enough, then only misbehaving or malicious
clients would hit it. But "high enough" is vague, and it's unfriendly if
we guess wrong and a legitimate client hits this.
But we can do better. Since supporting the ref-prefix capability is
optional anyway, the client has to further cull the response based on
their own patterns. So we can simply ignore the patterns once we cross a
certain threshold. Note that we have to ignore _all_ patterns, not just
the ones past our limit (since otherwise we'd send too little data).
The limit here is fairly arbitrary, and probably much higher than anyone
would need in practice. It might be worth limiting it further, if only
because we check it linearly (so with "m" local refs and "n" patterns,
we do "m * n" string comparisons). But if we care about optimizing this,
an even better solution may be a more advanced data structure anyway.
I didn't bother making the limit configurable, since it's so high and
since Git should behave correctly in either case. It wouldn't be too
hard to do, but it makes both the code and documentation more complex.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We collect the set of capabilities the client sends us in a strvec.
While this is usually small, there's no limit to the number of
capabilities the client can send us (e.g., they could just send us
"agent" pkt-lines over and over, and we'd keep adding them to the list).
Since all code has been converted away from using this list, let's get
rid of it. This avoids a potential attack where clients waste our
memory.
Note that we do have to replace it with a flag, because some of the
flush-packet logic checks whether we've seen any valid commands or keys.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than pulling the session-id string from the list of collected
capabilities, we can handle it as soon as we receive it. This gets us
closer to dropping the collected list entirely.
The behavior should be the same, with one exception. Previously if the
client sent us multiple session-id lines, we'd report only the first.
Now we'll pass each one along to trace2. This shouldn't matter in
practice, since clients shouldn't do that (and if they do, it's probably
sensible to log them all).
As this removes the last caller of the static has_capability(), we can
remove it, as well (and in fact we must to avoid -Wunused-function
complaining).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We get any "object-format" specified by the client by searching for it
in the collected list of capabilities the client sent. We can instead
just handle it as soon as they send it. This is slightly more efficient,
and gets us one step closer to dropping that collected list.
Note that we do still have to do our final hash check after receiving
all capabilities (because they might not have sent an object-format line
at all, and we still have to check that the default matches our
repository algorithm). Since the check_algorithm() function would now be
down to a single if() statement, I've just inlined it in its only
caller.
There should be no change of behavior here, except for two
broken-protocol cases:
- if the client sends multiple conflicting object-format capabilities
(which they should not), we'll now choose the last one rather than
the first. We could also detect and complain about the duplicates
quite easily now, which we could not before, but I didn't do so
here.
- if the client sends a bogus "object-format" with no equals sign,
we'll now say so, rather than "unknown object format: ''"
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When this performance test was originally written, `core.multiPackIndex`
was not the default and thus had to be enabled. But now that we have
18e449f86b (midx: enable core.multiPackIndex by default, 2020-09-25), we
no longer need this.
Drop the unnecessary setup (even though it's not hurting anything, it is
unnecessary at best and confusing at worst).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the tests in test_full_bitmap rely on having a tag named
perf-tag in place. We could create it in test_full_bitmap(), but we want
to have it in place before the repack starts.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, the bitmap writing code learned to propagate
values from an existing hash-cache extension into the bitmap that it is
writing.
Now that that functionality exists, let's expose it by teaching the 'git
multi-pack-index' builtin to respect the `pack.writeBitmapHashCache`
option so that the hash-cache may be written at all.
Two minor points worth noting here:
- The 'git multi-pack-index write' sub-command didn't previously read
any configuration (instead this is handled in the base command). A
separate handler is added here to respect this write-specific
config option.
- I briefly considered adding a 'bitmap_flags' field to the static
options struct, but decided against it since it would require
plumbing through a new parameter to the write_midx_file() function.
Instead, a new MIDX-specific flag is added, which is translated to
the corresponding bitmap one.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When an old bitmap exists while writing a new one, we load it and build
a "reposition" table which maps bit positions of objects from the old
bitmap to their respective positions in the new bitmap. This can help
when we encounter a commit which was selected in both the old and new
bitmap, since we only need to permute its bit (not recompute it from
scratch).
We do not, however, repurpose existing namehash values in the case of
the hash-cache extension. There has been thus far no good reason to do
so, since all of the namehash values for objects in the new bitmap would
be populated during the traversal that was just performed by
pack-objects when generating single-pack reachability bitmaps.
But this isn't the case for multi-pack bitmaps, which are written via
`git multi-pack-index write --bitmap` and do not perform any traversal.
In this case all namehash values are set to zero, but we don't even
bother to check the `pack.writeBitmapHashcache` option anyway, so it
fails to matter.
There are two approaches we could take to fill in non-zero hash-cache
values:
- have either the multi-pack-index builtin run its own
traversal to attempt to fill in some values, or let a hypothetical
caller (like `pack-objects` when `repack` eventually drives the
`multi-pack-index` builtin) fill in the values they found during
their traversal
- or copy any existing namehash values that were stored in an
existing bitmap to their corresponding positions in the new bitmap
In a system where a repository is generally repacked with `git repack
--geometric=<d>` and occasionally repacked with `git repack -a`, the
hash-cache coverage will tend towards all objects.
Since populating the hash-cache is additive (i.e., doing so only helps
our delta search), any intermediate lack of full coverage is just fine.
So let's start by just propagating any values from the existing
hash-cache if we see one.
The next patch will respect the `pack.writeBitmapHashcache` option while
writing MIDX bitmaps, and then test this new behavior.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pack-bitmap writer code is about to learn how to propagate values
from an existing hash-cache. To prepare, teach the test-bitmap helper to
dump the values from a bitmap's hash-cache extension in order to test
those changes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a capabilities table that tells us what we should tell the
client we are capable of, and what to do when a client gives us a
particular command (e.g., "command=ls-refs"). But it doesn't tell us
what to do when the client sends us back a capability (e.g.,
"object-format=sha256"). We just collect them all in a strvec and hope
somebody can use them later.
Instead, let's provide a function pointer in the table to act on these.
This will eventually help us avoid collecting the strings, which will be
more efficient and less prone to mischief.
Using the new method is optional, which helps in two ways:
- we can move existing capabilities over to this new system gradually
in individual commits
- some capabilities we don't actually do anything with anyway. For
example, the client is free to say "agent=git/1.2.3" to us, but we
do not act on the information in any way.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the client sends v2 capabilities, they may be simple, like:
foo
or have a value like:
foo=bar
(all of the current capabilities actually expect a value, but the
protocol allows for boolean ones).
We use get_capability() to make sure the client's pktline matches a
capability. In doing so, we parse enough to see the "=" and the value
(if any), but we immediately forget it. Nobody cares for now, because they end
up parsing the values out later using has_capability(). But in
preparation for changing that, let's pass back a pointer so the callers
know what we found.
Note that unlike has_capability(), we'll return NULL for a "simple"
capability. Distinguishing these will be useful for some future patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The is_command() function not only tells us whether the pktline is a
valid command string, but it also parses out the command (and complains
if we see a duplicate). Let's rename it to make those extra functions
a bit more obvious.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ab/serve-cleanup:
upload-pack: document and rename --advertise-refs
serve.[ch]: remove "serve_options", split up --advertise-refs code
{upload,receive}-pack tests: add --advertise-refs tests
serve.c: move version line to advertise_capabilities()
serve: move transfer.advertiseSID check into session_id_advertise()
serve.[ch]: don't pass "struct strvec *keys" to commands
serve: use designated initializers
transport: use designated initializers
transport: rename "fetch" in transport_vtable to "fetch_refs"
serve: mark has_capability() as static
While 'git version' is probably the least complex git command,
it is a non-experimental user-facing builtin command. As such
it should have a help page.
Both `git help` and `git version` can be called as options
(`--help`/`--version`) that internally get converted to the
corresponding command. Add a small paragraph to
Documentation/git.txt describing how these two options
interact with each other and link to this help page for the
sub-options that `--version` can take. Well, currently there
is only one sub-option, but that could potentially increase
in future versions of Git.
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check that git.html exists, regardless of the page the user wants to open.
Checking whether the requested page exists instead gives us a smoother user
experience in two use cases:
1) The requested page doesn't exist
When calling a git command and there is an error, most users reasonably expect
git to produce an error message on the standard error stream, but in this case
we pass the filepath to git web--browse which passes it on to a browser (or a
helper program like xdg-open or start that should in turn open a browser)
without any error and many GUI based browsers or helpers won't output such a
message onto the standard error stream.
Especially the helper programs tend to show the corresponding error message in
a message box and wait for user input before exiting. This leaves users in
interactive console sessions without an error message in their console,
without a console prompt and without the help page they expected.
2) git.html is missing for some reason, but the user asked for some other page
We currently refuse to show any local html help page when we can't find
git.html. Even if the requested help page exists. If we check for the requested
page instead, we can show the user all available pages and only error out on
those that don't exist.
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Available since Windows 10 release 1803 and Windows Server 2019.
NO_UNIX_SOCKETS is still the default for Windows builds, as they need
to keep backward compatibility with releases up to Windows 7, but allow
including the header otherwise.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Connect and reset errors aren't what will be expected by POSIX but
are instead compatible with the ones used by WinSock.
To avoid any possibility of confusion with other systems, checks
for disconnection and availability had been abstracted into helper
functions that are platform specific.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for a future patch that will allow building with
Unix Sockets in Windows, workaround a couple of issues from the
Mingw-W64 compatibility layer.
test -S is not able to detect that a file is a socket, so use
test -e instead (through a library function).
`mkdir -m` can't represent a valid ACL directly and fails with
permission problems, so instead call mkdir followed by chmod, which
has been enhanced to do so.
The last invocation of mkdir would likely need the same treatment
but SYMLINK is unlikely to be enabled on Windows so it has been
punted for now.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `git help` command gained the ability to list config variables in
3ac68a93fd (help: add --config to list all available config, 2018-05-26)
but failed to tell readers of the config documenation itself.
Provide that cross reference.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After reimplementation of `git bisect run` in C,
`--bisect-next-check` subcommand is not needed anymore.
Let's remove it from options list and code.
Mentored by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reimplement the `bisect_run()` shell function
in C and also add `--bisect-run` subcommand to
`git bisect--helper` to call it from git-bisect.sh.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reimplement the `bisect_visualize()` shell function
in C and also add `--bisect-visualize` subcommand to
`git bisect--helper` to call it from git-bisect.sh.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `static` keyword from `exists_in_PATH()` function
and declare the function in `run-command.h` file.
The function will be used in bisect_visualize() in a later
commit.
Mentored by: Christian Couder <chriscool@tuxfamily.org>
Mentored by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test to control breakages in bisect visualize command.
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a gap on bisect run test coverage related with error exits.
Add two tests to control these error cases.
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9512177b68 ("rebase: add --quit to cleanup rebase, leave everything
else untouched", 2016-11-12) seems to have copied the --abort tests
but added two separate tests for the two rebase backends rather than
adding a single test into the existing testrebase() function.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing tests only check that HEAD points to the correct
commit after aborting, they do not check that the original branch
is checked out.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the end of the test we expect the state directory to be missing,
but the tests only check it is not a directory.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
$dotest holds the name of the rebase state directory and was named
after the state directory used at the time the test was added. As we
no longer use that name rename the variable to reflect its purpose.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify the setup by using test_commit. Note that this replaces the
branch "pre-rebase" with a tag of the same name. "pre-rebase" is only
used as a fixed reference point so it makes sense to use a tag rather
than a branch.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 97b88dd58c ("git-rebase.sh: Fix --merge --abort failures when
path contains whitespace", 2008-05-04) started running these tests in
a subdirectory with a space in its name. At that time $TEST_DIRECTORY
did not contain a space but shortly after in 4a7aaccd83 ("Rename the
test trash directory to contain spaces.", 2008-05-04) $TEST_DIRECTORY
was changed to contain a space so we no longer need to run these tests
in a subdirectory.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the new git-curl-compat.h header to define CURL_SOCKOPT_OK to its
known value if we're on an older curl version that doesn't have it. It
was hardcoded in http.c in a15d069a19 (http: enable keepalive on TCP
sockets, 2013-10-12).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As discussed in 644de29e22 (http: drop support for curl < 7.19.4,
2021-07-30) checking against LIBCURL_VERSION_NUM isn't as reliable as
checking specific symbols present in curl, as some distros have been
known to backport features.
However, while some of the curl_easy_setopt() arguments we rely on are
macros, others are enum, and we can't assume that those that are
macros won't change into enums in the future.
So we're still going to have to check LIBCURL_VERSION_NUM, but by
doing that in one central place and using a macro definition of our
own, anyone who's backporting features can define it themselves, and
thus have access to more modern curl features that they backported,
even if they didn't bump the LIBCURL_VERSION_NUM.
More importantly, as shown in a preceding commit doing these version
checks makes for hard to read and possibly buggy code, as shown by the
bug fixed there where we were conflating base 10 for base 16 when
comparing the version.
By doing them all in one place we'll hopefully reduce the chances of
such future mistakes, furthermore it now becomes easier to see at a
glance what the oldest supported version is, which makes it easier to
reason about any future deprecation similar to the recent
e48a623dea (Merge branch 'ab/http-drop-old-curl', 2021-08-24).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In aeff8a6121 (http: implement public key pinning, 2016-02-15) a
dependency and warning() was added if curl older than 7.44.0 was used,
but the relevant code depended on CURLOPT_PINNEDPUBLICKEY, introduced
in 7.39.0.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d73019feb4 (http: add support selecting http version, 2018-11-08)
a dependency was added on CURL_HTTP_VERSION_2, but this feature was
introduced in curl version 7.43.0, not 7.47.0, as the incorrect
version check led us to believe.
As looking through the history of that commit on the mailing list will
reveal[1], the reason for this is that an earlier version of it
depended on CURL_HTTP_VERSION_2TLS, which was introduced in libcurl
7.47.0.
But the version that made it in in d73019feb4 had dropped the
dependency on CURL_HTTP_VERSION_2TLS, but the corresponding version
check was not corrected.
The newest symbol we depend on is CURL_HTTP_VERSION_2. It was added in
7.33.0, but the CURL_HTTP_VERSION_2 alias we used was added in
7.47.0. So we could support an even older version here, but let's just
correct the checked version.
1. https://lore.kernel.org/git/pull.69.git.gitgitgadget@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 644de29e22 (http: drop support for curl < 7.19.4, 2021-07-30) we
dropped support for curl < 7.19.4, so we can drop support for this
non-obvious dependency on curl < 7.18.0.
It's non-obvious because in curl's hex version notation 0x071800 is
version 7.24.0, *not* 7.18.0, so at a glance this patch looks
incorrect.
But it's correct, because the existing version check being removed
here is wrong. The check guards use of the following curl defines:
CURLPROXY_SOCKS4 7.10
CURLPROXY_SOCKS4A 7.18.0
CURLPROXY_SOCKS5 7.10
CURLPROXY_SOCKS5_HOSTNAME 7.18.0
I.e. the oldest version that has these is in fact 7.18.0, not
7.24.0. That we were checking 7.24.0 is just an mistake in
6d7afe07f2 (remote-http(s): support SOCKS proxies, 2015-10-26),
i.e. its author confusing base 10 and base 16.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 1119a15b5c (http: drop support for curl < 7.11.1, 2021-07-30)
support for curl versions older than 7.11.1 was removed, and we
currently require at least version 7.19.4, see 644de29e22 (http: drop
support for curl < 7.19.4, 2021-07-30).
In those changes this Makefile-specific check added in
0890098780 (Decide whether to build http-push in the Makefile,
2005-11-18) was missed, now that we're never going to use such an
ancient curl version we don't need to check that we have at least
7.9.8 here. I have no idea what in http-push.c broke on versions older
than that.
This does not impact "NO_CURL" setups, as this is in the "else" branch
after that check.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without NO_CURL=Y we require at least version "7.19.4" of libcurl, see
644de29e22 (http: drop support for curl < 7.19.4, 2021-07-30). Let's
document this in the "INSTALL" document.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As was noted in 1a85b49b87 (parse-options: make OPT_ARGUMENT() more
useful, 2019-03-14) there's only ever been one user of the
OPT_ARGUMENT(), that user was added in 20de316e33 (difftool: allow
running outside Git worktrees with --no-index, 2019-03-14).
The OPT_ARGUMENT() feature itself was added way back in
580d5bffde (parse-options: new option type to treat an option-like
parameter as an argument., 2008-03-02), but as discussed in
1a85b49b87 wasn't used until 20de316e33 in 2019.
Now that the preceding commit has migrated this code over to using
"struct strvec" to manage the "args" member of a "struct
child_process", we can just use that directly instead of relying on
OPT_ARGUMENT.
This has a minor change in behavior in that if we'll pass --no-index
we'll now always pass it as the first argument, before we'd pass it in
whatever position the caller did. Preserving this was the real value
of OPT_ARGUMENT(), but as it turns out we didn't need that either. We
can always inject it as the first argument, the other end will parse
it just the same.
Note that we cannot remove the "out" and "cpidx" members of "struct
parse_opt_ctx_t" added in 580d5bffde, while they were introduced with
OPT_ARGUMENT() we since used them for other things.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the run_file_diff() function to use the run_command() API
directly, instead of invoking the run_command_v_opt_cd_env() wrapper.
This allows it, like run_dir_diff(), to use the "args" from "struct
strvec", instead of the "const char **argv" passed into
cmd_difftool(). This will be used in the subsequent commit to get rid
of OPT_ARGUMENT() from cmd_difftool().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We call into either run_dir_diff() or run_file_diff(), each of which
sets up a child argv starting with "diff" and some hard-coded options
(depending on which mode we're using). Let's extract that logic into the
caller, which will make it easier to modify the options for cases which
affect both functions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the preparation of the "struct child_process" from run_dir_diff()
to its only caller, cmd_difftool(). This is in preparation for
migrating run_file_diff() to using the run_command() API directly, and
to move more of the shared setup of the two to cmd_difftool().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests for the "usagestr" passed to parse-options.c
usage_with_options_internal() through cmd_parseopt().
These test for edge cases in the existing behavior related to the
"--parseopt" interface doing its own line-splitting with
strbuf_getline(), and the native C interface expecting and potentially
needing to handle newlines within the strings in the array it
accepts. The results are probably something that wasn't anticipated,
but let's make sure we stay backwards compatible with it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "send-pack" was changed to use the parse_options() API in
068c77a518 (builtin/send-pack.c: use parse_options API, 2015-08-19)
it was made to use one very long line, instead it should split them up
with newlines.
Furthermore we were including an inline explanation that you couldn't
combine "--all" and "<ref>", but unlike in the "blame" case this was
not preceded by an empty string.
Let's instead show that --all and <ref> can't be combined in the the
usual language of the usage syntax instead. We can make it clear that
one of the two options "--foo" and "--bar" is mandatory, but that the
two are mutually exclusive by referring to them as "( --foo | --bar
)".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for having continued usage lines properly aligned in
"git <cmd> -h" output, let's have the "[" on the second such lines
align with the "[" on the first line.
In some cases this makes the output worse, because e.g. the "git
ls-remote -h" output had been aligned to account for the extra
whitespace that the usage_with_options_internal() function in
parse-options.c would add.
In other cases such as builtin/stash.c (not changed here), we were
aligned in the C strings, but since that didn't account for the extra
padding in usage_with_options_internal() it would come out looking
misaligned, e.g. code like this:
N_("git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]\n"
" [-u|--include-untracked] [-a|--all] [-m|--message <message>]\n"
Would emit:
or: git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
[-u|--include-untracked] [-a|--all] [-m|--message <message>]
Let's change all the usage arrays which use such continued usage
output via "\n"-embedding to be like builtin/stash.c.
This makes the output worse temporarily, but in a subsequent change
I'll improve the usage_with_options_internal() to take this into
account, at which point all of the strings being changed here will
emit prettier output.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the launchctl_boot_plist() function was added in
a16eb6b1ff (maintenance: skip bootout/bootstrap when plist is
registered, 2021-08-24), an unused call to launchctl_get_uid() was
added along with it. That call appears to have been copy/pasted from
launchctl_boot_plist().
Since we can remove that, we can also get rid of the "result"
variable, whose only purpose was allow for the free() between its
assignment and the return. That pattern also appears to have been
copy/pasted from launchctl_boot_plist().
As the patch shows the returned value from launchctl_get_uid() wasn't
used at all in this function. The launchctl_get_uid() function itself
just calls xstrfmt() and getuid(), neither of which have any subtle
global side-effects, so this removal is safe.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In be5d88e112 (test-tool run-command: learn to run (parts of) the
testsuite, 2019-10-04) an init pattern was added that would use
TESTSUITE_INIT, but then promptly memset() everything back to 0. We'd
then set the "dup" on the two string lists.
Our setting of "next" to "-1" thus did nothing, we'd reset it to "0"
before using it. Let's set it to "0" instead, and trust the
"STRING_LIST_INIT_DUP" to set "strdup_strings" appropriately for us.
Note that while we compile this code, there's no in-tree user for the
"testsuite" target being modified here anymore, see the discussion at
and around <nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet>[1].
1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove untracked files that are unwanted after they are done being used.
While the set of cases in this patch is certainly far from
comprehensive, it was motivated by some work to see what the fallout
would be if we were to make the removal of untracked files as a side
effect of other commands into an error. Some cases were a bit more
involved than the testcase changes included in this patch, but the ones
included here represent the simple cases. While this patch is not that
important since we are not changing the behavior of those other commands
into an error in the near term, I thought these changes were useful
anyway as an explicit documentation of the intent that these untracked
files are no longer useful.
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We converted argv_array (which later became strvec) to use size_t in
819f0e76b1 (argv-array: use size_t for count and alloc, 2020-07-28) in
order to avoid the possibility of integer overflow. But later, commit
d70a9eb611 (strvec: rename struct fields, 2020-07-28) accidentally
converted these back to ints!
Those two commits were part of the same patch series. I'm pretty sure
what happened is that they were originally written in the opposite order
and then cleaned up and re-ordered during an interactive rebase. And
when resolving the inevitable conflict, I mistakenly took the "rename"
patch completely, accidentally dropping the type change.
We can correct it now; better late than never.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 8de7eeb54b (compression: unify pack.compression configuration
parsing, 2016-11-15) the variables core_compression_level and
core_compression_seen are only set, but never read. Remove them.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two have fallen out of use with the SHA-256 migration.
The last use of $_x40 was removed in fc7e73d7ef (t4013: improve
diff-post-processor logic, 2020-08-21) and
The last use of $_z40 was removed in 7a868c51c2 (t5562: use $ZERO_OID,
2019-12-21), but it was then needlessly refactored to be hash-agnostic
in 192b517589 (t: use hash-specific lookup tables to define test
constants, 2020-02-22). We can just remove it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This variable was last used in code removed in
06f5608c14 (bisect--helper: `bisect_start` shell function partially in
C, 2019-01-02).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "pull with rebase" message previously used by the
git-pull.sh script, which was removed in 49eb8d39c7 (Remove
contrib/examples/*, 2018-03-25).
Even if some out-of-tree user copy/pasted the old git-pull.sh code,
and relied on passing it a "pull with rebase" argument, we'll fall
back on the "*" case here, they just won't get the "pull with rebase"
part of their message translated.
I don't think it's likely that anyone out-of-tree relied on that, but
I'm being conservative here per the discussion that can be found
upthread of [1].
1. https://lore.kernel.org/git/87tuiwjfvi.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The is_zero_oid() function in git-submodule.sh has not been used since
e83e3333b5 (submodule: port submodule subcommand 'summary' from shell
to C, 2020-08-13), so we can remove it.
This was the last user of the sane_egrep() function in
git-sh-setup.sh. I'm not removing it in case some out-of-tree user
relied on it. Per the discussion that can be found upthread of [1].
1. https://lore.kernel.org/git/87tuiwjfvi.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Store the object ID of broken pack entries in an oidset instead of
keeping only their hashes in an unsorted array. The resulting code is
shorter and easier to read. It also handles the (hopefully) very rare
case of having a high number of bad objects better.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The single caller has a full object ID, so pass it on instead of just
its hash member.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All callers have full object IDs, so pass them on instead of just their
hash member.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fill_midx_entry() finds the position of an object ID and passes it to
nth_midxed_pack_entry(), which uses the position to look up the object
ID for its own purposes. Inline the latter into the former to avoid
that lookup.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
oidset_size() just reads a single word from memory and returns it.
Avoid the function call overhead for this trivial operation by turning
it into an inline function.
While we're at it, declare its parameter const to allow it to be used
on read-only oidsets.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 1d53f90ed9 (The "curl" executable is no longer required,
2008-06-15) the wording for requiring curl(1) was changed to the
current "you might also want...".
Mentioning the "curl" executable at all is just confusing, someone
building git might want to use it to debug things, but they might also
just use wget(1) or some other http client. The "curl" executable has
the advantage that you might be able to e.g. reproduce a bug in git's
usage of libcurl with it, but anyone going to those extents is
unlikely to be aided by this note in INSTALL.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify the usage string in the documentation so we group e.g. -i and
--info, and add the missing short options to the "-h" output.
The alignment of the second line is off now, but will be fixed with
another series of mine[1]. In the meantime let's just assume that fix
will make it in eventually for the purposes of this patch, if it's
misaligned for a bit it doesn't matter much.
1. https://lore.kernel.org/git/cover-0.2-00000000000-20210901T110917Z-avarab@gmail.com
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both Johannes and I assumed (perhaps due to familiarity with rebase)
that am --abort would return the user to a clean state. However, since
am, unlike rebase, is intended to be used within a dirty working tree,
--abort will only clean the files involved in the am operation.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a user deletes a file and places a directory of untracked files
there, then stashes all these changes, the untracked directory of files
cannot be restored until after the corresponding file in the way is
removed. So, restore changes to tracked files before restoring
untracked files.
There is no counterpart problem to worry about with the user deleting an
untracked file and then add a tracked one in its place. Git does not
track untracked files, and so will not know the untracked file was
deleted, and thus won't be able to stash the removal of that file.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a file is removed from the cache, but there is a file of the same
name present in the working directory, we would normally treat that file
in the working directory as untracked. However, in the case of stash,
doing that would prevent a simple 'git stash push', because the untracked
file would be in the way of restoring the deleted file.
git stash, however, blindly assumes that whatever is in the working
directory for a deleted file is wanted and passes that path along to
update-index. That causes problems when the working directory contains
a directory with the same name as the deleted file. Add some code for
this special case that will avoid passing directory names to
update-index.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are three tests here, because the second bug is documented with
two tests: a file -> directory change and a directory -> file change.
The reason for the two tests is just to verify that both are indeed
broken but that both will be fixed by the same simple change (which will
be provided in a subsequent patch).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We recently added tips for server admins to configure various transports
to support v2's GIT_PROTOCOL variable. While the protocol-v2 document is
pretty technical and not of interest to most admins, it may be a
starting point for them to figure out how to turn on v2. Let's put some
pointers from there to the other documentation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The v2 protocol requires that the GIT_PROTOCOL environment variable gets
passed around, but we don't have any documentation describing how this
is supposed to work. In particular, we need to note what server admins
might need to configure to make things work.
The definition of the GIT_PROTOCOL variable is probably the best place
for this, since:
- we deal with multiple transports (ssh, http, etc).
Transport-specific documentation (like the git-http-backend bits
added in the previous commit) are helpful for those transports, but
this gives a broader overview. Plus we do not have a specific
transport endpoint program for ssh, so this is a reasonable place to
mention it.
- the server side of the protocol involves multiple programs. For now,
upload-pack is the only endpoint which uses GIT_PROTOCOL, but that
will likely expand in the future. We're better off with a central
discussion of what the server admin might need to do. However, for
discoverability, this patch adds a pointer from upload-pack's
documentation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Historically there was a little bit of configuration needed at the
webserver level in order to get the client's v2 protocol probes to Git.
But when we introduced the v2 protocol, we never documented these.
As of the previous commit, this should mostly work out of the box
without any explicit configuration. But it's worth documenting this to
make it clear how we expect it to work, especially in the face of
webservers which don't provide all headers over the CGI interface. Or
anybody who runs across this documentation but has an older version of
Git (or _used_ to have an older version, and wonders why they still have
a SetEnvIf line in their Apache config and whether it's still
necessary).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a client requests the v2 protocol over HTTP, they set the
Git-Protocol header. Webservers will generally make that available to
our CGI as HTTP_GIT_PROTOCOL in the environment. However, that's not
sufficient for upload-pack, etc, to respect it; they look in
GIT_PROTOCOL (without the HTTP_ prefix).
Either the webserver or the CGI is responsible for relaying that HTTP
header into the GIT_PROTOCOL variable. Traditionally, our tests have
configured the webserver to do so, but that's a burden on the server
admin. We can make this work out of the box by having the http-backend
CGI copy the contents of HTTP_GIT_PROTOCOL to GIT_PROTOCOL.
There are no new tests here. By removing the SetEnvIf line from our
test Apache config, we're now relying on this behavior of http-backend
to trigger the v2 protocol there (and there are numerous tests that fail
if this doesn't work).
There is one subtlety here: we copy HTTP_GIT_PROTOCOL only if there is
no existing GIT_PROTOCOL variable. That leaves the webserver admin free
to override the client's decision if they choose. This is unlikely to be
useful in practice, but is more flexible. And indeed, it allows the
v2-to-v0 fallback test added in the previous commit to continue working.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we use the v2 protocol by default, the connection of a v2 client
to a v2 server is well covered by the test suite. And with the
GIT_TEST_PROTOCOL_VERSION knob, we can easily test a v0 client
connecting to a v2-aware server (which will then just speak v0). But we
have no regular tests that a v2 client, when encountering a non-v2-aware
server, will correctly fall back to using v0.
In theory this is a job for the cross-version tests in t/interop, but:
- they cover only git:// and file:// clones
- they are not part of the usual test suite, so nobody ever runs them
anyway
Since using v2 over http requires configuring the web server to pass
along the Git-Protocol header, we can easily create a situation where
the server does not respect the v2 probe, and the conversation falls
back to v0.
This works just fine. This new test is not about fixing any particular
bug, but just making sure that the system works (and continues to work)
as expected.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Support an arbitrary file descriptor expression in the semantic patch
for replacing open+die_errno with xopen, not just an identifier, and
apply it. This makes the error message at the single affected place
more consistent and reduces code duplication.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test that verify-commit/tag will fail when a gpg key is completely
unknown. To do this we have to generate a key, use it for a signature
and delete it from our keyring aferwards completely.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To verify a ssh signature we first call ssh-keygen -Y find-principal to
look up the signing principal by their public key from the
allowedSignersFile. If the key is found then we do a verify. Otherwise
we only validate the signature but can not verify the signers identity.
Verification uses the gpg.ssh.allowedSignersFile (see ssh-keygen(1) "ALLOWED
SIGNERS") which contains valid public keys and a principal (usually
user@domain). Depending on the environment this file can be managed by
the individual developer or for example generated by the central
repository server from known ssh keys with push access. This file is usually
stored outside the repository, but if the repository only allows signed
commits/pushes, the user might choose to store it in the repository.
To revoke a key put the public key without the principal prefix into
gpg.ssh.revocationKeyring or generate a KRL (see ssh-keygen(1)
"KEY REVOCATION LISTS"). The same considerations about who to trust for
verification as with the allowedSignersFile apply.
Using SSH CA Keys with these files is also possible. Add
"cert-authority" as key option between the principal and the key to mark
it as a CA and all keys signed by it as valid for this CA.
See "CERTIFICATES" in ssh-keygen(1).
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For ssh the user.signingkey can be a filename/path or even a literal ssh pubkey.
In push certs and textual output we prefer the ssh fingerprint instead.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If user.signingkey is not set and a ssh signature is requested we call
gpg.ssh.defaultKeyCommand (typically "ssh-add -L") and use the first key we get
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implements the actual sign_buffer_ssh operation and move some shared
cleanup code into a strbuf function
Set gpg.format = ssh and user.signingkey to either a ssh public key
string (like from an authorized_keys file), or a ssh key file.
If the key file or the config value itself contains only a public key
then the private key needs to be available via ssh-agent.
gpg.ssh.program can be set to an alternative location of ssh-keygen.
A somewhat recent openssh version (8.2p1+) of ssh-keygen is needed for
this feature. Since only ssh-keygen is needed it can this way be
installed seperately without upgrading your system openssh packages.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Generate some ssh keys and a allowedSignersFile for testing
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Openssh v8.2p1 added some new options to ssh-keygen for signature
creation and verification. These allow us to use ssh keys for git
signatures easily.
In our corporate environment we use PIV x509 Certs on Yubikeys for email
signing/encryption and ssh keys which I think is quite common
(at least for the email part). This way we can establish the correct
trust for the SSH Keys without setting up a separate GPG Infrastructure
(which is still quite painful for users) or implementing x509 signing
support for git (which lacks good forwarding mechanisms).
Using ssh agent forwarding makes this feature easily usable in todays
development environments where code is often checked out in remote VMs / containers.
In such a setup the keyring & revocationKeyring can be centrally
generated from the x509 CA information and distributed to the users.
To be able to implement new signing formats this commit:
- makes the sigc structure more generic by renaming "gpg_output" to
"output"
- introduces function pointers in the gpg_format structure to call
format specific signing and verification functions
- moves format detection from verify_signed_buffer into the check_signature
api function and calls the format specific verify
- renames and wraps sign_buffer to handle format specific signing logic
as well
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The logic for auto-correction of misspelt subcommands learned to go
interactive when the help.autocorrect configuration variable is set
to 'prompt'.
* ab/help-autocorrect-prompt:
help.c: help.autocorrect=prompt waits for user action
Tie-break branches that point at the same object in the list of
branches on GitWeb to show the one pointed at by HEAD early.
* gh/gitweb-branch-sort:
gitweb: use HEAD as secondary sort key in git_get_heads_list()
Doc update plus improved error reporting.
* jk/log-warn-on-bogus-encoding:
docs: use "character encoding" to refer to commit-object encoding
logmsg_reencode(): warn when iconv() fails
Code clean up to migrate callers from older advice_config[] based
API to newer advice_if_enabled() and advice_enabled() API.
* ab/retire-advice-config:
advice: move advice.graftFileDeprecated squashing to commit.[ch]
advice: remove use of global advice_add_embedded_repo
advice: remove read uses of most global `advice_` variables
advice: add enum variants for missing advice variables
After "git clone --recurse-submodules", all submodules are cloned
but they are not by default recursed into by other commands. With
submodule.stickyRecursiveClone configuration set, submodule.recurse
configuration is set to true in a repository created by "clone"
with "--recurse-submodules" option.
* mk/clone-recurse-submodules:
clone: set submodule.recurse=true if submodule.stickyRecursiveClone enabled
The output from "git fast-export", when its anonymization feature
is in use, showed an annotated tag incorrectly.
* tk/fast-export-anonymized-tag-fix:
fast-export: fix anonymized tag using original length
Fixes on usage message from "git commit-graph".
* ab/commit-graph-usage:
commit-graph: show "unexpected subcommand" error
commit-graph: show usage on "commit-graph [write|verify] garbage"
commit-graph: early exit to "usage" on !argc
multi-pack-index: refactor "goto usage" pattern
commit-graph: use parse_options_concat()
commit-graph: remove redundant handling of -h
commit-graph: define common usage with a macro
Even when running "git send-email" without its own threaded
discussion support, a threading related header in one message is
carried over to the subsequent message to result in an unwanted
threading, which has been corrected.
* mh/send-email-reset-in-reply-to:
send-email: avoid incorrect header propagation
Buggy tests could damage repositories outside the throw-away test
area we created. We now by default export GIT_CEILING_DIRECTORIES
to limit the damage from such a stray test.
* sg/set-ceiling-during-tests:
test-lib: set GIT_CEILING_DIRECTORIES to protect the surrounding repository
The sparse-index support can corrupt the index structure by storing
a stale and/or uninitialized data, which has been corrected.
* jh/sparse-index-resize-fix:
sparse-index: copy dir_hash in ensure_full_index()
"git fetch --quiet" optimization to avoid useless computation of
info that will never be displayed.
* ps/fetch-omit-formatting-under-quiet:
fetch: skip formatting updated refs with `--quiet`
"git upload-pack" which runs on the other side of "git fetch"
forgot to take the ref namespaces into account when handling
want-ref requests.
* ka/want-ref-in-namespace:
docs: clarify the interaction of transfer.hideRefs and namespaces
upload-pack.c: treat want-ref relative to namespace
t5730: introduce fetch command helper
The advice message that "git cherry-pick" gives when it asks
conflicted replay of a commit to be resolved by the end user has
been updated.
* zh/cherry-pick-advice:
cherry-pick: use better advice message
"git rebase" by default skips changes that are equivalent to
commits that are already in the history the branch is rebased onto;
give messages when this happens to let the users be aware of
skipped commits, and also teach them how to tell "rebase" to keep
duplicated changes.
* js/advise-when-skipping-cherry-picked:
sequencer: advise if skipping cherry-picked commit
In preceding commits the race of renaming .idx files in place before
.rev files and other auxiliary files was fixed in pack-write.c's
finish_tmp_packfile(), builtin/repack.c's "struct exts", and
builtin/index-pack.c's final(). As noted in the change to pack-write.c
we left in place the issue of writing *.bitmap files after the *.idx,
let's fix that issue.
See 7cc8f97108 (pack-objects: implement bitmap writing, 2013-12-21)
for commentary at the time when *.bitmap was implemented about how
those files are written out, nothing in that commit contradicts what's
being done here.
Note that this commit and preceding ones only close any race condition
with *.idx files being written before their auxiliary files if we're
optimistic about our lack of fsync()-ing in this are not tripping us
over. See the thread at [1] for a rabbit hole of various discussions
about filesystem races in the face of doing and not doing fsync() (and
if doing fsync(), not doing it properly).
We may want to fsync the containing directory once after renaming the
*.idx file into place, but that is outside the scope of this series.
1. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split up the finish_tmp_packfile() function and use the split-up version
in pack-objects.c in preparation for moving the step of renaming the
*.idx file later as part of a function change.
Since the only other caller of finish_tmp_packfile() was in
bulk-checkin.c, and it won't be needing a change to its *.idx renaming,
provide a thin wrapper for the old function as a static function in that
file. If other callers end up needing the simpler version it could be
moved back to "pack-write.c" and "pack.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as preceding patches to `git repack` and `git
pack-objects`, fix the identical problem in `git index-pack`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the renaming in final() into a helper function, this is
similar in spirit to a preceding refactoring of finish_tmp_packfile()
in pack-write.c.
Before e37d0b8730 (builtin/index-pack.c: write reverse indexes,
2021-01-25) it probably wasn't worth it to have this sort of helper,
due to the differing "else if" case for "pack" files v.s. "idx" files.
But since we've got "rev" as well now, let's do the renaming via a
helper, this is both a net decrease in lines, and improves the
readability, since we can easily see at a glance that the logic for
writing these three types of files is exactly the same, aside from the
obviously differing cases of "*final_name" being NULL, and
"make_read_only_if_same" being different.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as the previous patch, fix the identical problem
from `git repack` (which invokes `pack-objects` with a temporary
location for output, and then moves the files into their final locations
itself).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We treat the presence of an `.idx` file as the indicator of whether or
not it's safe to use a packfile. But `finish_tmp_packfile()` (which is
responsible for writing the pack and moving the temporary versions of
all of its auxiliary files into place) is inconsistent about the write
order.
Specifically, it moves the `.rev` file into place after the `.idx`,
leaving open the possibility to open a pack which looks "ready" (because
the `.idx` file exists and is readable) but appears momentarily to not
have a `.rev` file. This causes Git to fall back to generating the
pack's reverse index in memory.
Though racy, this amounts to an unnecessary slow-down at worst, and
doesn't affect the correctness of the resulting reverse index.
Close this race by moving the .rev file into place before moving the
.idx file into place.
This still leaves the issue of `.idx` files being renamed into place
before the auxiliary `.bitmap` file is renamed when in pack-object.c's
write_pack_file() "write_bitmap_index" is true. That race will be
addressed in subsequent commits.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the renaming in finish_tmp_packfile() into a helper function.
The callers are now expected to pass a "name_buffer" ending in
"pack-OID." instead of the previous "pack-", we then append "pack",
"idx" or "rev" to it.
By doing the strbuf_setlen() in rename_tmp_packfile() we reuse the
buffer and avoid the repeated allocations we'd get if that function had
its own temporary "struct strbuf".
This approach of reusing the buffer does make the last user in
pack-object.c's write_pack_file() slightly awkward, since we needlessly
do a strbuf_setlen() before calling strbuf_release() for consistency. In
subsequent changes we'll move that bitmap writing code around, so let's
not skip the strbuf_setlen() now.
The previous strbuf_reset() idiom originated with 5889271114
(finish_tmp_packfile():use strbuf for pathname construction,
2014-03-03), which in turn was a minimal adjustment of pre-strbuf code
added in 0e990530ae (finish_tmp_packfile(): a helper function,
2011-10-28).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
finish_bulk_checkin() stores the checksum from finalize_hashfile() by
writing to the `hash` member of `struct object_id`, but that hash has
nothing to do with an object id (it's just a convenient location that
happens to be sized correctly).
Store the hash directly in an unsigned char array. This behaves the same
as writing to the `hash` member, but makes the intent clearer (and
avoids allocating an extra four bytes for the `algo` member of `struct
object_id`). It likewise prevents the possibility of a segfault when
reading `algo` (e.g., by calling `oid_to_hex()`) if it is uninitialized.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The t5562 script occasionally takes 60 extra seconds to complete due to
a race condition in the invoke-with-content-length.pl helper.
The way it's supposed to work is this:
- we set up a SIGCLD handler
- we kick off http-backend and write to it with a set content-length,
but _don't_ close the pipe
- we sleep for 60 seconds, assuming that SIGCLD from http-backend
finishing will interrupt us
- after the sleep finishes (whetherby 60 seconds or because it was
interrupted by the signal), we check a flag to see if our SIGCLD
handler was called. If not, then we complain.
This usually completes immediately, because the signal interrupts our
sleep. But very occasionally the child process dies _before_ we hit the
sleep, so we don't realize it. The test still completes successfully
(because our $exited flag is set), but it takes an extra 60 seconds.
There's no way to check the flag and sleep atomically. So the best we
can do with this approach is to sleep in smaller chunks (say, 1 second)
and check the flag incrementally. Then we waste a maximum of 1 second if
we lose the race. This was proposed in:
https://lore.kernel.org/git/20190218205028.32486-1-max@max630.net/
and it does work. But we can do better.
Instead of blocking on sleep and waiting for the child signal to
interrupt us, we can block on the child exiting and set an alarm signal
to trigger the timeout.
This lets us exit the script immediately when the child behaves (with no
race possible), and wait a maximum of 60 seconds when it doesn't.
Note one small subtlety: perl is very willing to restart the waitpid()
call after the alarm is delivered, even if we've thrown an exception via
die. "perldoc -f alarm" claims you can get around this with an eval/die
combo (and even has some example code), but it doesn't seem to work for
me with waitpid(); instead, we continue waiting until the child exits.
So instead, we'll instruct the child process to exit in the alarm
handler itself. In the original code this was done by calling
close($out). That would continue to work, since our child is always
http-backend, which should exit when its stdin closes. But we can be
even more robust against a hung or confused child by sending a KILL
signal, which should terminate it immediately.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We only need to provide a mode if we are willing to let open(2) create
the file, which is not the case here, so drop the unnecessary parameter.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the catch-all error message with specific ones for opening and
duplicating by calling the wrappers xopen and xdup. The code becomes
easier to follow when error handling is reduced to two letters.
Remove the unnecessary mode parameter while at it -- we expect /dev/null
to already exist.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Starting in commit 0f533c7284 (pack-bitmap: read multi-pack bitmaps,
2021-08-31), we no longer look at the "struct bitmap_index" passed to
try_partial_reuse(). This is because we only handle verbatim reuse from
a single pack: either the pack whose bitmap we're looking at, or the
"preferred" pack of a midx bitmap. And thus the primary item we look at
is the "pack" parameter added by that same commit, and not the
bitmap_git->pack parameter (which would be NULL for a midx bitmap). It's
our caller, reuse_partial_packfile_from_bitmap(), which decides which
pack to use and passes it in to us.
Drop the unused parameter to prevent confusion.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We never look at the repository argument which is passed. This makes
sense, since the multi_pack_index struct already tells us everything we
need to access the files in its associated object directory.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hard work was already done with 'git merge' and the ORT strategy.
Just add extra tests to see that we get the expected results in the
non-conflict cases.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sequencer is used by 'cherry-pick' and 'rebase' to sequence a list
of operations that modify the index. Since we intend to remove the need
for 'command_requires_full_index', we need to ensure the sparse index is
expanded every time it is written to disk between these steps. That is,
unless the merge strategy is 'ort' where the index can remain sparse
throughout.
There are two main places to be extra careful about a full index:
1. Right before calling merge_trees(), ensure the index is full. This
happens within an 'else' where the 'if' block checks if the 'ort'
strategy is selected.
2. During read_and_refresh_cache(), the index might be written to disk
and converted to sparse in the process. Ensure it expands back to
full afterwards by checking if the strategy is _not_ 'ort'. This
'if' statement is the logical negation of the 'if' in item (1).
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests to check that cherry-pick and rebase behave the same in the
sparse-index case as in the full index cases. These tests are agnostic
to GIT_TEST_MERGE_ALGORITHM, so a full CI test suite will check both the
'ort' and 'recursive' strategies on this test.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge conflicts happen often enough to want to avoid expanding a sparse
index when they happen, as long as those conflicts are within the
sparse-checkout cone. If a conflict exists outside of the
sparse-checkout cone, then we still need to expand before iterating over
the index entries. This is critical to do in advance because of how the
original_cache_nr is tracked to allow inserting and replacing cache
entries.
Iterate over the conflicted files and check if any paths are outside of
the sparse-checkout cone. If so, then expand the full index.
Add a test that demonstrates that we do not expand the index, even when
we hit a conflict within the sparse-checkout cone.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow 'git merge' to operate without expanding a sparse index, at least
not immediately. The index still will be expanded in a few cases:
1. If the merge strategy is 'recursive', then we enable
command_requires_full_index at the start of the merge_recursive()
method. We expect sparse-index users to also have the 'ort' strategy
enabled.
2. With the 'ort' strategy, if the merge results in a conflicted file,
then we expand the index before updating the working tree. The loop
that iterates over the worktree replaces index entries and tracks
'origintal_cache_nr' which can become completely wrong if the index
expands in the middle of the operation. This safety valve is
important before that loop starts. A later change will focus this
to only expand if we indeed have a conflict outside of the
sparse-checkout cone.
3. Other merge strategies are executed as a 'git merge-X' subcommand,
and those strategies are currently protected with the
'command_requires_full_index' guard.
Some test updates are required, including a mistaken 'git checkout -b'
that did not specify the base branch, causing merges to be fast-forward
merges.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diff_populate_filespec() method is used to describe the diff after a
merge operation is complete. In order to avoid expanding a sparse index,
the reuse_worktree_file() needs to be adapted to ignore files that are
outside of the sparse-checkout cone. The file names and OIDs used for
this check come from the merged tree in the case of the ORT strategy,
not the index, hence the ability to look into these paths without having
already expanded the index.
The work done by reuse_worktree_file() is only an optimization, and
requires the file being on disk for it to be of any value. Thus, it is
safe to exit the method early if we do not expect the file on disk.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clean up a TODO in revision.h by removing the "submodule" field from
struct setup_revision_opt. This field is only used to specify the ref
store to use, so use rev_info->repo to determine the ref store instead.
The only users of this field are merge-ort.c and merge-recursive.c.
However, both these files specify the superproject as rev_info->repo and
the submodule as setup_revision_opt->submodule. In order to be able to
pass the submodule as rev_info->repo, all commits must be parsed with
the submodule explicitly specified; this patch does that as well. (An
incremental solution in which only some commits are parsed with explicit
submodule will not work, because if the same commit is parsed twice in
different repositories, there will be 2 heap-allocated object structs
corresponding to that commit, and any flag set by the revision walking
mechanism on one of them will not be reflected onto the other.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for a subsequent commit that migrates code using
add_submodule_odb() to repo_submodule_init(), teach
repo_submodule_init() to support submodules with unabsorbed gitdirs.
(See the documentation for "git submodule absorbgitdirs" for more
information about absorbed and unabsorbed gitdirs.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In get_submodule_repo_for(), there is a fallback code path for the case
in which a submodule has an unabsorbed gitdir. (See the documentation
for "git submodule absorbgitdirs" for more information about absorbed
and unabsorbed gitdirs.) However, this code path is unnecessary, because
such submodules are already handled: when the fetch_task is created in
fetch_task_create(), it will create its own struct submodule with a path
and name, and repo_submodule_init() can handle such a struct.
This fallback was introduced in 26f80ccfc1 ("submodule: migrate
get_next_submodule to use repository structs", 2018-12-05). It was
unnecessary even then, but perhaps it escaped notice because its parent
commit d5498e0871 ("repository: repo_submodule_init to take a submodule
struct", 2018-12-05) was the one that taught repo_submodule_init() to
handle such created structs. Before, it took a path and always checked
.gitmodules, so it truly would have failed if there were no entry in
.gitmodules.
(Note to reviewers: in 26f80ccfc1, the "own struct submodule" I
mentioned is in get_next_submodule(), not fetch_task_create().)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In many cases where we spawned child processes that _may_ trigger a
repack, we explicitly closed the object store first (so that the
`repack` process can delete the `.pack` files, which would otherwise not
be possible on Windows since files cannot be deleted as long as they as
still in use).
Wherever possible, we now use the new `close_object_store` bit of the
`run_command()` API, to delay closing the object store even further.
This makes the code easier to maintain because it is now more obvious
that we only release those file handles because of those child
processes.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before spawning the auto maintenance, we need to make sure that we
release all open file handles to all the `.pack` files (and MIDX files
and commit-graph files and...) so that the maintenance process has the
freedom to delete those files.
So far, we did this manually every time before calling
`run_auto_maintenance()`. With the new `close_object_store` flag, we can
do that implicitly in that function, which is more robust because future
callers won't be able to forget to close the object store.
Note: this changes behavior slightly, as we previously _always_ closed
the object store, but now we only close the object store when actually
running the auto maintenance. In practice, this should not matter (if
anything, it might speed up operations where auto maintenance is
disabled).
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Especially on Windows, where files cannot be deleted if _any_ process
holds an open file handle to them, it is important to close the object
store (releasing all handles to all `.pack` files) before running a
command that might spawn a garbage collection.
This scenario is so common that we frequently see the pattern of closing
the object store before running auto maintenance or another Git command.
Let's make this much more convenient by teaching the `run_command()`
machinery a new flag to release the object store before spawning the
process.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The values were listed unaligned, and with powers of two spelled out in
decimal. The list is easier to parse for human readers if the numbers
are aligned and spelled out as powers of two (using the bit-shift
operator `<<`).
While at it, remove a code comment that was unclear at best, and
confusing at worst.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repository of git-l10n is a fork of "git/git" on GitHub, and uses
GitHub pull request for code review. A helper program "git-po-helper"
can be used to check typos in ".po" files, validate syntax, and check
commit messages. It would be convenient to integrate this helper program
to CI and add comments in pull request.
The new github-action workflow will be enabled for l10n related
operations, such as:
* Operations on a repository named as "git-po", such as a repository
forked from "git-l10n/git-po".
* Push to a branch that contains "l10n" in the name.
* Pull request from a remote branch which has "l10n" in the name, such
as: "l10n/fix-fuzzy-translations".
The new l10n workflow listens to two types of github events:
on: [push, pull_request_target]
The reason we use "pull_request_target" instead of "pull_request" is
that pull requests from forks receive a read-only GITHUB_TOKEN and
workflows cannot write comments back to pull requests for security
reasons. GitHub provides a "pull_request_target" event to resolve
security risks by checking out the base commit from the target
repository, and provide write permissions for the workflow.
By default, administrators can set strict permissions for workflows. The
following code is used to modify the permissions for the GITHUB_TOKEN
and grant write permission in order to create comments in pull-requests.
permissions:
pull-requests: write
This workflow will scan commits one by one. If a commit does not look
like a l10n commit (no file in "po/" has been changed), the scan process
will stop immediately. For a "push" event, no error will be reported
because it is normal to push non-l10n commits merged from upstream. But
for the "pull_request_target" event, errors will be reported. For this
reason, additional option is provided for "git-po-helper".
git-po-helper check-commits \
--github-action-event="${{ github.event_name }}" -- \
<base>..<head>
The output messages of "git-po-helper" contain color codes not only for
console, but also for logfile. This is because "git-po-helper" uses a
package named "logrus" for logging, and I use an additional option
"ForceColor" to initialize "logrus" to print messages in a user-friendly
format in logfile output. These color codes help produce beautiful
output for the log of workflow, but they must be stripped off when
creating comments for pull requests. E.g.:
perl -pe 's/\e\[[0-9;]*m//g' git-po-helper.out
"git-po-helper" may generate two kinds of suggestions, errors and
warnings. All the errors and warnings will be reported in the log of the
l10n workflow. However, warnings in the log of the workflow for a
successfully running "git-po-helper" can easily be ignored by users.
For the "pull_request_target" event, this issue is resolved by creating
an additional comment in the pull request. A l10n contributor should try
to fix all the errors, and should pay attention to the warnings.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Line-wrap the definition of finish_tmp_packfile() to make subsequent
changes easier to read. See 0e990530ae (finish_tmp_packfile(): a
helper function, 2011-10-28) for the commit that introduced this
overly long line.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop. It works basically like this:
start_delayed_progress(p, nr_of_paths_to_filter)
for_each_filter {
display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
for_each_path_handled_by_the_current_filter {
checkout_entry()
}
}
stop_progress(p)
There are two issues with this approach:
- The work done by the last filter (or the only filter if there is
only one) is never counted, so if the last filter still has some
paths to process, then the counter shown in the "done" progress
line will not match the expected total.
The partially-RFC series to add a GIT_TEST_CHECK_PROGRESS=1
mode[1] helps spot this issue. Under it the 'missing file in
delayed checkout' and 'invalid file in delayed checkout' tests in
't0021-conversion.sh' fail, because both use only one
filter. (The test 'delayed checkout in process filter' uses two
filters but the first one does all the work, so that test already
happens to succeed even with GIT_TEST_CHECK_PROGRESS=1.)
- The progress counter is updated only once per filter, not once per
processed path, so if a filter has a lot of paths to process, then
the counter might stay unchanged for a long while and then make a
big jump (though the user still gets a sense of progress, because
we call display_throughput() after each processed path to show the
amount of processed data).
Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.
After this change the 'invalid file in delayed checkout' in
't0021-conversion.sh' would succeed with the GIT_TEST_CHECK_PROGRESS=1
assertion discussed above, but the 'missing file in delayed checkout'
test would still fail.
It'll fail because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all, see [2] for how to
spot that issue without GIT_TEST_CHECK_PROGRESS=1. It's not
straightforward to fix it with the current progress.c library (see [3]
for an attempt), so let's leave it for now.
Let's also initialize the *progress to "NULL" while we're at it. Since
7a132c628e (checkout: make delayed checkout respect --quiet and
--no-progress, 2021-08-26) we have had progress conditional on
"show_progress", usually we use the idiom of a "NULL" initialization
of the "*progress", rather than the more verbose ternary added in
7a132c628e.
1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
2. http://lore.kernel.org/git/20210802214827.GE23408@szeder.dev
3. https://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:
Scanning merged commits: 83% (5/6), done.
This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N. Fix this by passing 'i + 1' to display_progress(), like
most other callsites do.
There's an RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] which
catches this issue in the 'fetch.writeCommitGraph' and
'fetch.writeCommitGraph with submodules' tests in
't5510-fetch.sh'. The GIT_TEST_CHECK_PROGRESS=1 mode is not part of
this series, but future changes to progress.c may add it or similar
assertions to catch this and similar bugs elsewhere.
1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Build update for Apple clang.
* cb/makefile-apple-clang:
build: catch clang that identifies itself as "$VENDOR clang"
build: clang version may not be followed by extra words
build: update detect-compiler for newer Xcode version
"git branch -D <branch>" used to refuse to remove a broken branch
ref that points at a missing commit, which has been corrected.
* rs/branch-allow-deleting-dangling:
branch: allow deleting dangling branches with --force
The delayed checkout code path in "git checkout" etc. were chatty
even when --quiet and/or --no-progress options were given.
* mt/quiet-with-delayed-checkout:
checkout: make delayed checkout respect --quiet and --no-progress
"git diff --relative" segfaulted and/or produced incorrect result
when there are unmerged paths.
* dd/diff-files-unmerged-fix:
diff-lib: ignore paths that are outside $cwd if --relative asked
"git maintenance" scheduler fix for macOS.
* js/maintenance-launchctl-fix:
maintenance: skip bootout/bootstrap when plist is registered
maintenance: create `launchctl` configuration using a lock file
mmap() imitation used to call xmalloc() that dies upon malloc()
failure, which has been corrected to just return an error to the
caller to be handled.
* rs/git-mmap-uses-malloc:
compat: let git_mmap use malloc(3) directly
On Windows, files cannot be removed nor renamed if there are still
handles held by a process. To remedy that, we try to release all open
handles to any `.pack` file before e.g. repacking (which would want to
remove the original `.pack` file(s) after it is done).
Since the `read_cache_unmerged()` and/or the `get_oid()` call in `git
pull` can cause `.pack` files to be opened, we need to release the open
handles before calling `git fetch`: the latter process might want to
spawn an auto-gc, which in turn might want to repack the objects.
This commit is similar in spirit to 5bdece0d70 (gc/repack: release
packs when needed, 2018-12-15).
This fixes https://github.com/git-for-windows/git/issues/3336.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The slab has information about the commit graph. That means that it is
meaningless (and even misleading) when the commit graph was closed.
This seems not to matter currently, but we're about to fix a
Windows-specific bug where `git pull` does not close the object store
before fetching (risking that an implicit auto-gc fails to remove the
now-obsolete pack file(s)), and once we have that bug fix in place, it
does matter: after that bug fix, we will open the object store, do some
stuff with it, then close it, fetch, and then open it again, and do more
stuff. If we close the commit graph without releasing the corresponding
slab, we're hit by a symptom like this in t5520.19:
BUG: commit-reach.c:85: bad generation skip 9223372036854775807
> 3 at 5cd378271655d43a3b4477520014f02213ad1546
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous patches have made "git grep" no longer need to add
submodule ODBs as alternates, at least for the code paths tested in
t7814. Demonstrate this by making adding a submodule ODB as an alternate
fatal in this test.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading the config of a submodule, if reading from a blob, read
using an explicitly specified repository instead of by adding the
submodule's ODB as an alternate and then reading an object from
the_repository.
This makes the "grep --recurse-submodules with submodules without
.gitmodules in the working tree" test in t7814 work when
GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB is true.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Record the repository whenever an OID grep source is created, and teach
the worker threads to explicitly provide the repository when accessing
objects.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, struct repository objects corresponding to submodules are
allocated on the stack in grep_submodule(). This currently works because
they will not be used once grep_submodule() exits, but a subsequent
patch will require these structs to be accessible for longer (perhaps
even in another thread). Allocate them on the heap and clear them only
at the very end.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace an existing parse_object_or_die() call (which implicitly works
on the_repository) with a function call that allows a repository to be
passed in. There is no such direct equivalent to parse_object_or_die(),
but we only need the type of the object, so replace with
oid_object_info().
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
grep_source_init() can create "struct grep_source" objects and,
depending on the value of the type passed, some void-pointer parameters have
different meanings. Because one of these types (GREP_SOURCE_OID) will
require an additional parameter in a subsequent patch, take the
opportunity to increase clarity and type safety by replacing this
function with individual functions for each type.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the parent commit, Git was taught to add submodule ODBs as alternates
lazily, but grep does not use this because it computes the path to add
directly, not going through add_submodule_odb(). Add an equivalent to
add_submodule_odb() that takes the exact ODB path and teach grep to use
it.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach Git to add submodule ODBs as alternates to the object store of
the_repository only upon the first access of an object not in
the_repository, and not when add_submodule_odb() is called.
This provides a means of gradually migrating from accessing a
submodule's object through alternates to accessing a submodule's object
by explicitly passing its repository object. Any Git command can declare
that it might access submodule objects by calling add_submodule_odb()
(as they do now), but the submodule ODBs themselves will not be added
until needed, so individual commands and/or combinations of arguments
can be migrated one by one.
[The advantage of explicit repository-object passing is code clarity (it
is clear which repository an object read is from), performance (there is
no need to linearly search through all submodule ODBs whenever an object
is accessed from any repository, whether superproject or submodule), and
the possibility of future features like partial clone submodules (which
right now is not possible because if an object is missing, we do not
know which repository to lazy-fetch into).]
This commit also introduces an environment variable that a test may set
to make the actual registration of alternates fatal, in order to
demonstrate that its codepaths do not need this registration.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running tests with GIT_TEST_SPLIT_INDEX=1 is supposed to turn on the
split index feature and trigger index splitting (mostly) randomly.
Alas, this has been broken since 6e37c8ed3c (read-cache.c: fix writing
"link" index ext with null base oid, 2019-02-13), and
GIT_TEST_SPLIT_INDEX=1 hasn't triggered any index splitting since
then.
This patch makes GIT_TEST_SPLIT_INDEX work again, though it doesn't
restore the pre-6e37c8ed3c behavior. To understand the bug, the fix,
and the behavior change we first have to look at how
GIT_TEST_SPLIT_INDEX used to work before 6e37c8ed3c:
There are two places where we check the value of
GIT_TEST_SPLIT_INDEX, and before 6e37c8ed3c they worked like this:
1) In the lower-level do_write_index(), where, if
GIT_TEST_SPLIT_INDEX is enabled, we call init_split_index().
This call merely allocates and zero-initializes
'istate->split_index', but does nothing else (i.e. doesn't fill
the base/shared index with cache entries, doesn't actually
write a shared index file, etc.). Pertinent to this issue, the
hash of the base index remains all zeroed out.
2) In the higher-level write_locked_index(), but only when
'istate->split_index' has already been initialized. Then, if
GIT_TEST_SPLIT_INDEX is enabled, it randomly sets the flag that
triggers index splitting later in this function. This
randomness comes from the first byte of the hash of the base
index via an 'if ((first_byte & 15) < 6)' condition.
However, if 'istate->split_index' hasn't been initialized (i.e.
it is still NULL), then write_locked_index() just calls
do_write_locked_index(), which internally calls the above
mentioned do_write_index().
This means that while GIT_TEST_SPLIT_INDEX=1 usually triggered index
splitting randomly, the first two index writes were always
deterministic (though I suspect this was unintentional):
- The initial index write never splits the index.
During the first index write write_locked_index() is called with
'istate->split_index' still uninitialized, so the check in 2) is
not executed. It still calls do_write_index(), though, which
then executes the check in 1). The resulting all zero base
index hash then leads to the 'link' extension being written to
'.git/index', though a shared index file is not written:
$ rm .git/index
$ GIT_TEST_SPLIT_INDEX=1 git update-index --add file
$ test-tool dump-split-index .git/index
own c6ef71168597caec8553c83d9d0048f1ef416170
base 0000000000000000000000000000000000000000
100644 d00491fd7e5bb6fa28c517a0bb32b8b506539d4d 0 file
replacements:
deletions:
$ ls -l .git/sharedindex.*
ls: cannot access '.git/sharedindex.*': No such file or directory
- The second index write always splits the index.
When the index written in the previous point is read,
'istate->split_index' is initialized because of the presence of
the 'link' extension. So during the second write
write_locked_index() does run the check in 2), and the first
byte of the all zero base index hash always fulfills the
randomness condition, which in turn always triggers the index
splitting.
- Subsequent index writes will find the 'link' extension with a
real non-zero base index hash, so from then on the check in 2)
is executed and the first byte of the base index hash is as
random as it gets (coming from the SHA-1 of index data including
timestamps and inodes...).
All this worked until 6e37c8ed3c came along, and stopped writing the
'link' extension if the hash of the base index was all zero:
$ rm .git/index
$ GIT_TEST_SPLIT_INDEX=1 git update-index --add file
$ test-tool dump-split-index .git/index
own abbd6f6458d5dee73ae8e210ca15a68a390c6fd7
not a split index
$ ls -l .git/sharedindex.*
ls: cannot access '.git/sharedindex.*': No such file or directory
So, since the first index write with GIT_TEST_SPLIT_INDEX=1 doesn't
write a 'link' extension, in the second index write
'istate->split_index' remains uninitialized, and the check in 2) is
not executed, and ultimately the index is never split.
Fix this by modifying write_locked_index() to make sure to check
GIT_TEST_SPLIT_INDEX even if 'istate->split_index' is still
uninitialized, and initialize it if necessary. The check for
GIT_TEST_SPLIT_INDEX and separate init_split_index() call in
do_write_index() thus becomes unnecessary, so remove it. Furthermore,
add a test to 't1700-split-index.sh' to make sure that
GIT_TEST_SPLIT_INDEX=1 will keep working (though only check the
index splitting on the first index write, because after that it will
be random).
Note that this change does not restore the pre-6e37c8ed3c behaviour,
as it will deterministically split the index already on the first
index write. Since GIT_TEST_SPLIT_INDEX is purely a developer aid,
there is no backwards compatibility issue here. The new behaviour did
trigger test failures in 't0003-attributes.sh' and 't1600-index.sh',
though, which have been fixed in preparatory patches in this series.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse index and split index features are said to be currently
incompatible [1], and consequently GIT_TEST_SPLIT_INDEX=1 might
interfere with the test cases exercising the sparse index feature.
Therefore GIT_TEST_SPLIT_INDEX is already explicitly disabled for the
whole of 't1092-sparse-checkout-compatibility.sh'. There are,
however, two other test cases exercising sparse index, namely
'sparse-index enabled and disabled' in
't1091-sparse-checkout-builtin.sh' and 'status succeeds with sparse
index' in 't7519-status-fsmonitor.sh', and these two could fail with
GIT_TEST_SPLIT_INDEX=1 as well [2].
Unset GIT_TEST_SPLIT_INDEX and disable the split index in these two
test cases to avoid such interference.
Note that this is the minimal change to merely avoid failures when
these test cases are run with GIT_TEST_SPLIT_INDEX=1. Interestingly,
though, without these changes the 'git sparse-checkout init --cone
--sparse-index' commands still succeed even with split index, and set
all the necessary configuration variables and create the initial
'$GIT_DIR/info/sparse-checkout' file, but the test failures are caused
by later sanity checks finding that the index is not in fact a sparse
index. This indicates that 'git sparse-checkout init --sparse-index'
lacks some error checking and its tests lack coverage related to split
index, but fixing those issues is beyond the scope of this patch
series.
[1] https://public-inbox.org/git/48e9c3d6-407a-1843-2d91-22112410e3f8@gmail.com/
[2] Neither of these test cases fail at the moment, because
GIT_TEST_SPLIT_INDEX=1 is broken and never splits the index, and
it broke long before the sparse index feature was added.
This patch series is about to fix GIT_TEST_SPLIT_INDEX, and then
both test cases mentioned above would fail.
(The diff is best viewed with '--ignore-all-space')
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading a split index git always looks for its referenced shared
base index in the gitdir of the current repository, even when reading
an alternate index specified via GIT_INDEX_FILE, and even when that
alternate index file is the "main" '.git/index' file of an other
repository. However, if that split index and its referenced shared
index files were written by a git command running entirely in that
other repository, then, naturally, the shared index file is written to
that other repository's gitdir. Consequently, a git command
attempting to read that shared index file while running in a different
repository won't be able find it and will error out.
I'm not sure in what use case it is necessary to read the index of one
repository by a git command running in a different repository, but it
is certainly possible to do so, and in fact the test 'bare repository:
check that --cached honors index' in 't0003-attributes.sh' does
exactly that. If GIT_TEST_SPLIT_INDEX=1 were to split the index in
just the right moment [1], then this test would indeed fail, because
the referenced shared index file could not be found.
Let's look for the referenced shared index file not only in the gitdir
of the current directory, but, if the shared index is not there, right
next to the split index as well.
[1] We haven't seen this issue trigger a failure in t0003 yet,
because:
- While GIT_TEST_SPLIT_INDEX=1 is supposed to trigger index
splitting randomly, the first index write has always been
deterministic and it has never split the index.
- That alternate index file in the other repository is written
only once in the entire test script, so it's never split.
However, the next patch will fix GIT_TEST_SPLIT_INDEX, and while
doing so it will slightly change its behavior to always split the
index already on the first index write, and t0003 would always
fail without this patch.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests in 't1600-index.sh' check that various bogus index version
values are recognized and an appropriate warning message is issued.
GIT_TEST_SPLIT_INDEX=1 is supposed to trigger index splitting
randomly, and thus might interfere [1] with these tests: splitting the
index means that two index files are written (the shared base index
and the split '.git/index'), and the same warning message is then
issued twice, failing these tests.
Unset GIT_TEST_SPLIT_INDEX in this test script to avoid such
interference.
[1] There is no such interference at the moment, because, alas,
GIT_TEST_SPLIT_INDEX=1 is broken and never split the index. There
was no such interference in the past (before it broke) either,
because the first index write with GIT_TEST_SPLIT_INDEX=1 never
split the index, only the second write did. A subsequent commit
fixing GIT_TEST_SPLIT_INDEX will have all the details on this.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a helper function in the 't1600-index.sh' test script the stderr
of a 'git add' command is redirected to its stdout, but its stdout is
not redirected anywhere. So apparently this redirection is
unnecessary, remove it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ds/sparse-index-ignored-files:
sparse-checkout: clear tracked sparse dirs
sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
attr: be careful about sparse directories
sparse-checkout: create helper methods
sparse-index: use WRITE_TREE_MISSING_OK
sparse-index: silently return when cache tree fails
unpack-trees: fix nested sparse-dir search
sparse-index: silently return when not using cone-mode patterns
t7519: rewrite sparse index test
When changing the scope of a sparse-checkout using cone mode, we might
have some tracked directories go out of scope. The current logic removes
the tracked files from within those directories, but leaves the ignored
files within those directories. This is a bit unexpected to users who
have given input to Git saying they don't need those directories
anymore.
This is something that is new to the cone mode pattern type: the user
has explicitly said "I want these directories and _not_ those
directories." The typical sparse-checkout patterns more generally apply
to "I want files with with these patterns" so it is natural to leave
ignored files as they are. This focus on directories in cone mode
provides us an opportunity to change the behavior.
Leaving these ignored files in the sparse directories makes it
impossible to gain performance benefits in the sparse index. When we
track into these directories, we need to know if the files are ignored
or not, which might depend on the _tracked_ .gitignore file(s) within
the sparse directory. This depends on the indexed version of the file,
so the sparse directory must be expanded.
We must take special care to look for untracked, non-ignored files in
these directories before deleting them. We do not want to delete any
meaningful work that the users were doing in those directories and
perhaps forgot to add and commit before switching sparse-checkout
definitions. Since those untracked files might be code files that
generated ignored build output, also do not delete any ignored files
from these directories in that case. The users can recover their state
by resetting their sparse-checkout definition to include that directory
and continue. Alternatively, they can see the warning that is presented
and delete the directory themselves to regain the performance they
expect.
By deleting the sparse directories when changing scope (or running 'git
sparse-checkout reapply') we regain these performance benefits as if the
repository was in a clean state.
Since these ignored files are frequently build output or helper files
from IDEs, the users should not need the files now that the tracked
files are removed. If the tracked files reappear, then they will have
newer timestamps than the build artifacts, so the artifacts will need to
be regenerated anyway.
Use the sparse-index as a data structure in order to find the sparse
directories that can be safely deleted. Re-expand the index to a full
one if it was full before.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The convert_to_sparse() method checks for the GIT_TEST_SPARSE_INDEX
environment variable or the "index.sparse" config setting before
converting the index to a sparse one. This is for ease of use since all
current consumers are preparing to compress the index before writing it
to disk. If these settings are not enabled, then convert_to_sparse()
silently returns without doing anything.
We will add a consumer in the next change that wants to use the sparse
index as an in-memory data structure, regardless of whether the on-disk
format should be sparse.
To that end, create the SPARSE_INDEX_MEMORY_ONLY flag that will skip
these config checks when enabled. All current consumers are modified to
pass '0' in the new 'flags' parameter.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we integrate the sparse index into more builtins, we occasionally
need to check the sparse-checkout patterns to see if a path is within
the sparse-checkout cone. Create some helper methods that help
initialize the patterns and check for pattern matching to make this
easier.
The existing callers of commands like get_sparse_checkout_patterns() use
a custom 'struct pattern_list' that is not necessarily the one in the
'struct index_state', so there are not many previous uses that could
adopt these helpers. There are just two in builtin/add.c and
sparse-index.c that can use path_in_sparse_checkout().
We add a path_in_cone_mode_sparse_checkout() as well that will only
return false if the path is outside of the sparse-checkout definition
_and_ the sparse-checkout patterns are in cone mode.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When updating the cache tree in convert_to_sparse(), the
WRITE_TREE_MISSING_OK flag indicates that trees might be computed that
do not already exist within the object database. This happens in cases
such as 'git add' creating new trees that it wants to store in
anticipation of a following 'git commit'. If this flag is not specified,
then it might trigger a promisor fetch or a failure due to the object
not existing locally.
Use WRITE_TREE_MISSING_OK during convert_to_sparse() to avoid these
possible reasons for the cache_tree_update() to fail.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If cache_tree_update() returns a non-zero value, then it could not
create the cache tree. This is likely due to a path having a merge
conflict. Since we are already returning early, let's return silently to
avoid making it seem like we failed to write the index at all.
If we remove our dependence on the cache tree within
convert_to_sparse(), then we could still recover from this scenario and
have a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The iterated search in find_cache_entry() was recently modified to
include a loop that searches backwards for a sparse directory entry that
matches the given traverse_info and name_entry. However, the string
comparison failed to actually concatenate those two strings, so this
failed to find a sparse directory when it was not a top-level directory.
This caused some errors in rare cases where a 'git checkout' spanned a
diff that modified files within the sparse directory entry, but we could
not correctly find the entry.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the sparse-index is only enabled when core.sparseCheckoutCone is
also enabled, it is possible for the user to modify the sparse-checkout
file manually in a way that does not match cone-mode patterns. In this
case, we should refuse to convert an index into a sparse index, since
the sparse_checkout_patterns will not be initialized with recursive and
parent path hashsets.
Also silently return if there are no cache entries, which is a simple
case: there are no paths to make sparse!
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse index is tested with the FS Monitor hook and extension since
f8fe49e (fsmonitor: integrate with sparse index, 2021-07-14). This test
was very fragile because it shared an index across sparse and non-sparse
behavior. Since that expansion and contraction could cause the index to
lose its FS Monitor bitmap and token, behavior is fragile to changes in
'git sparse-checkout set'.
Rewrite the test to use two clones of the original repo: full and
sparse. This allows us to also keep the test files (actual, expect,
trace2.txt) out of the repos we are testing with 'git status'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a performance regression introduced in a587b5a786 (pack-write.c:
extract 'write_rev_file_order', 2021-03-30) and stop needlessly
allocating the "pack_order" array and sorting it with
"pack_order_cmp()", only to throw that work away when we discover that
we're not writing *.rev files after all.
This redundant work was not present in the original version of this
code added in 8ef50d9958 (pack-write.c: prepare to write 'pack-*.rev'
files, 2021-01-25). There we'd call write_rev_file() from
e.g. finish_tmp_packfile(), but we'd "return NULL" early in
write_rev_file() if not doing a "WRITE_REV" or "WRITE_REV_VERIFY".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function to add the `exec` commands to the todo list only needed to
be public API because it was not only used internally by the sequencer,
but also by `git rebase --preserve-merges`.
Now that that mode has been removed, we no longer need that function to
be scoped publicly.
Helped-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We no longer support `--preserve-merges`, therefore it does not make
sense to keep mentioning that option, even in code comments.
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we no longer have a `--preserve-merges` backend, this comment
needs to be adjusted.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already passed the `--rebase-merges` option to `git rebase` instead,
now we make this move permanent.
As pointed out by Ævar Arnfjörð Bjarmason, in contrast to the
deprecation of `git rebase`'s `--preserve-merges` backend, `git svn`
only deprecated this option in v2.25.0 (because this developer missed
`git svn`'s use of that backend when deprecating the rebase backend
running up to Git v2.22).
Still, v2.25.0 has been released on January 13th, 2020. In other words,
`git svn` deprecated this option over one and a half years ago, _and_
has been redirecting to the `--rebase-merges` option during all that
time (read: `git svn rebase --preserve-merges` didn't do _precisely_
what the user asked, since v2.25.0, anyway, it fell back to pretending
that the user asked for `git svn rebase --rebase-merges` instead).
It is time to act on that deprecation and remove that option after all.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option was deprecated in favor of `--rebase-merges` some time ago,
and now we retire it.
To assist users to transition away, we do not _actually_ remove the
option, but now we no longer implement the functionality. Instead, we
offer a helpful error message suggesting which option to use.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for `git-rebase--preserve-merges.sh` entering its after
life, we remove this (deprecated) option that would still rely on it.
To help users transition who still did not receive the memo about the
deprecation, we offer a helpful error message instead of throwing our
hands in the air and saying that we don't know that option, never heard
of it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This backend has been deprecated in favor of `git rebase
--rebase-merges`.
In preparation for dropping it, let's remove all the regression tests
that would need it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We ignore them silently, but it actually makes sense to warn the users
that their config setting has no effect.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even if those tests try to override that setting, let's not use it
because it is deprecated: let's use `merges` instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back when a1be47e4 (hash-object: fix buffer reuse with --path in a
subdirectory, 2017-03-20) was written, the prefix_filename() helper
used a static piece of memory to the caller, making the caller
responsible for copying it, if it wants to keep it across another
call to the same function. Two callers of the prefix_filename() in
hash-object were made to xstrdup() the value obtained from it.
But in the same series, when e4da43b1 (prefix_filename: return newly
allocated string, 2017-03-20) changed the rule to gave the caller
possession of the memory, we forgot to revert one of the xstrdup()
changes, allowing the returned value to leak.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git bugreport writes bug report to the current directory by default,
instead of repository root.
Fix the documentation.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This script was added in f28ac70f48 (Move all dashed-form commands to
libexecdir, 2007-11-28) when commands such as "git-add" lived in the
bin directory, instead of the git exec directory.
This notice helped someone incorrectly installing version v1.6.0 and
later into a directory built for a pre-v1.6.0 git version.
We're now long past the point where anyone who'd be helped by this
warning is likely to be doing that, so let's just remove this check
and warning to simplify the Makefile.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in my c95e3a3f0b (send-email: move trivial config
handling to Perl, 2021-05-28) where we'd pick the first config key out
of multiple defined ones, instead of using the normal "last key wins"
semantics of "git config --get".
This broke e.g. cases where a .git/config would have a different
sendemail.smtpServer than ~/.gitconfig. We'd pick the ~/.gitconfig
over .git/config, instead of preferring the repository-local
version. The same would go for /etc/gitconfig etc.
The full list of impacted config keys (the %config_settings values
which are references to scalars, not arrays) is:
sendemail.smtpencryption
sendemail.smtpserver
sendemail.smtpserverport
sendemail.smtpuser
sendemail.smtppass
sendemail.smtpdomain
sendemail.smtpauth
sendemail.smtpbatchsize
sendemail.smtprelogindelay
sendemail.tocmd
sendemail.cccmd
sendemail.aliasfiletype
sendemail.envelopesender
sendemail.confirm
sendemail.from
sendemail.assume8bitencoding
sendemail.composeencoding
sendemail.transferencoding
sendemail.sendmailcmd
I.e. having any of these set in say ~/.gitconfig and in-repo
.git/config regressed in v2.33.0 to prefer the --global one over the
--local.
To test this add a test of config priority to one of these config
variables, most don't have tests at all, but there was an existing one
for sendemail.8bitEncoding.
The "git config" (instead of "test_config") is somewhat of an
anti-pattern, but follows established conventions in
t9001-send-email.sh, likewise with any other pattern or idiom in this
test.
The populating of home/.gitconfig and setting of HOME= is copied from
a test in t0017-env-helper.sh added in 1ff750b128 (tests: make
GIT_TEST_GETTEXT_POISON a boolean, 2019-06-21). This test fails
without this bugfix, but now it works.
Reported-by: Eli Schwartz <eschwartz@archlinux.org>
Tested-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
output() reuses the same struct diff_options for multiple calls of
diff_flush(). Set the option no_free to instruct it to keep the
ignore regexes between calls and release them explicitly at the end.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes 19b2517f (diff-merges: move specific diff-index "-m"
handling to diff-index, 2021-05-21).
That commit disabled handling of all diff for merges options in
diff-index on an assumption that they are unused. However, it later
appeared that -c and --cc, even though undocumented and not being
covered by tests, happen to have had particular effect on diff-index
output.
Restore original -c/--cc options handling by diff-index.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2f732bf15e (tr2: log parent process name, 2021-07-21) we started
logging parent process names, but only logged all parents on Windows.
on Linux only the name of the immediate parent process was logged.
Extend the functionality added there to also log full parent chain on
Linux.
This requires us to lookup "/proc/<getppid()>/stat" instead of
"/proc/<getppid()>/comm". The "comm" file just contains the name of the
process, but the "stat" file has both that information, and the parent
PID of that process, see procfs(5). We parse out the parent PID of our
own parent, and recursively walk the chain of "/proc/*/stat" files all
the way up the chain. A parent PID of 0 indicates the end of the
chain.
It's possible given the semantics of Linux's PID files that we end up
getting an entirely nonsensical chain of processes. It could happen if
e.g. we have a chain of processes like:
1 (init) => 321 (bash) => 123 (git)
Let's assume that "bash" was started a while ago, and that as shown
the OS has already cycled back to using a lower PID for us than our
parent process. In the time it takes us to start up and get to
trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent
process might exit, and be replaced by an entirely different process!
We'd racily look up our own getppid(), but in the meantime our parent
would exit, and Linux would have cycled all the way back to starting
an entirely unrelated process as PID 321.
If that happens we'll just silently log incorrect data in our ancestry
chain. Luckily we don't need to worry about this except in this
specific cycling scenario, as Linux does not have PID
randomization. It appears it once did through a third-party feature,
but that it was removed around 2006[1]. For anyone worried about this
edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate
it, but not eliminate it.
One thing we don't need to worry about is getting into an infinite
loop when walking "/proc/*/stat". See 353d3d77f4 (trace2: collect
Windows-specific process information, 2019-02-22) for the related
Windows code that needs to deal with that, and [2] for an explanation
of that edge case.
Aside from potential race conditions it's also a bit painful to
correctly parse the process name out of "/proc/*/stat". A simpler
approach is to use fscanf(), see [3] for an implementation of that,
but as noted in the comment being added here it would fail in the face
of some weird process names, so we need our own parse_proc_stat() to
parse it out.
With this patch the "ancestry" chain for a trace2 event might look
like this:
$ GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version | grep ancestry | jq -r .ancestry
[
"bash",
"screen",
"systemd"
]
And in the case of naughty process names like the following. This uses
perl's ability to use prctl(PR_SET_NAME, ...). See
Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on
assignment to $0 on Linux, 2010-04-15)[4]:
$ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry
[
"sh",
"(naughty\nname)",
"bash",
"screen",
"systemd"
]
1. https://grsecurity.net/news#grsec2110
2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/
3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/
4. 7636ea95c5
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code added in 2f732bf15e (tr2: log parent process name,
2021-07-21) to use a switch statement without a "default" branch to
have the compiler error if this code ever drifts out of sync with the
members of the "enum trace2_process_info_reason".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a subsequent commit I'll be replacing most of this code to log N
parents, but let's first fix bugs introduced in the recent
2f732bf15e (tr2: log parent process name, 2021-07-21).
It was using the strbuf_read_file() in the wrong way, its return value
is either a length or a negative value on error. If we didn't have a
procfs, or otherwise couldn't access it we'd end up pushing an empty
string to the trace2 ancestry array.
It was also using the strvec_push() API the wrong way. That API always
does an xstrdup(), so by detaching the strbuf here we'd leak
memory. Let's instead pass in our pointer for strvec_push() to
xstrdup(), and then free our own strbuf. I do have some WIP changes to
make strvec_push_nodup() non-static, which makes this and some other
callsites nicer, but let's just follow the prevailing pattern of using
strvec_push() for now.
We'll also need to free that "procfs_path" strbuf whether or not
strbuf_read_file() succeeds, which was another source of memory leaks
in 2f732bf15e, i.e. we'd leak that memory as well if we weren't on a
system where we could read the file from procfs.
Let's move all the freeing of the memory to the end of the
function. If we're still at STRBUF_INIT with "name" due to not having
taken the branch where the strbuf_read_file() succeeds freeing it is
redundant. So we could move it into the body of the "if", but just
handling freeing the same way for all branches of the function makes
it more readable.
In combination with the preceding commit this makes all of
t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak introduced in ee4512ed48 (trace2: create new
combined trace facility, 2019-02-22), we were doing a free() of other
memory allocated in tr2tls_create_self(), but not the "thread_name"
"struct strbuf".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite a comment added in 2f732bf15e (tr2: log parent process name,
2021-07-21) to describe what we might do under
TRACE2_PROCESS_INFO_EXIT in the future, instead of vaguely referring
to "something extra".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I'm fairly sure that there is no way on Linux to inspect the process
tree without using procfs, any tool such as ps(1), top(1) etc. that
shows this sort of information ultimately looks the information up in
procfs.
So let's remove this comment added in 2f732bf15e (tr2: log parent
process name, 2021-07-21), it's setting us up for an impossible task.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "unbundle" command added in 2e0afafebd (Add git-bundle: move
objects and references by archive, 2007-02-22) did not show progress
output, even though the underlying API learned how to show progress in
be042aff24 (Teach progress eye-candy to fetch_refs_from_bundle(),
2011-09-18).
Now we'll show "Unbundling objects" using the new --progress-title
option to "git index-pack", to go with its existing "Receiving
objects" and "Indexing objects" (which it shows when invoked with
"--stdin", and with a pack file, respectively).
Unlike "git bundle create" we don't handle "--quiet" here, nor
"--all-progress" and "--all-progress-implied". Those are all specific
to "create" (and "verify", in the case of "--quiet").
The structure of the existing documentation is a bit unclear, e.g. the
documentation for the "--quiet" option added in
79862b6b77 (bundle-create: progress output control, 2019-11-10) only
describes how it works for "create", and not for "verify". That and
other issues in it should be fixed, but I'd like to avoid untangling
that mess right now. Let's just support the standard "--no-progress"
implicitly here, and leave cleaning up the general behavior of "git
bundle" for a later change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a --progress-title option to index-pack, when data is piped into
index-pack its progress is a proxy for whatever's feeding it data.
This option will allow us to set a more relevant progress bar title in
"git bundle unbundle", and is also used in my "bundle-uri" RFC
patches[1] by a new caller in fetch-pack.c.
The code change in cmd_index_pack() won't handle
"--progress-title=xyz", only "--progress-title xyz", and the "(i+1)"
style (as opposed to "i + 1") is a bit odd.
Not using the "--long-option=value" style is inconsistent with
existing long options handled by cmd_index_pack(), but makes the code
that needs to call it better (two strvec_push(), instead of needing a
strvec_pushf()). Since the option is internal-only the inconsistency
shouldn't matter.
I'm copying the pattern to handle it as-is from the handling of the
existing "-o" option in the same function, see 9cf6d3357a (Add
git-index-pack utility, 2005-10-12) for its addition. That's a short
option, but the code to implement the two is the same in functionality
and style. Eventually we'd like to migrate all of this this to
parse_options(), which would make these differences in behavior go
away.
1. https://lore.kernel.org/git/RFC-cover-00.13-0000000000-20210805T150534Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the "flags" parameter was added in be042aff24 (Teach progress
eye-candy to fetch_refs_from_bundle(), 2011-09-18) there's never been
more than the one flag: BUNDLE_VERBOSE.
Let's have the only caller who cares about that pass "-v" itself
instead through new "extra_index_pack_args" parameter. The flexibility
of being able to pass arbitrary arguments to "unbundle" will be used
in a subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing mechanism for scheduling background maintenance is done
through cron. On Linux systems managed by systemd, systemd provides an
alternative to schedule recurring tasks: systemd timers.
The main motivations to implement systemd timers in addition to cron
are:
* cron is optional and Linux systems running systemd might not have it
installed.
* The execution of `crontab -l` can tell us if cron is installed but not
if the daemon is actually running.
* With systemd, each service is run in its own cgroup and its logs are
tagged by the service inside journald. With cron, all scheduled tasks
are running in the cron daemon cgroup and all the logs of the
user-scheduled tasks are pretended to belong to the system cron
service.
Concretely, a user that doesn’t have access to the system logs won’t
have access to the log of their own tasks scheduled by cron whereas
they will have access to the log of their own tasks scheduled by
systemd timer.
Although `cron` attempts to send email, that email may go unseen by
the user because these days, local mailboxes are not heavily used
anymore.
In order to schedule git maintenance, we need two unit template files:
* ~/.config/systemd/user/git-maintenance@.service
to define the command to be started by systemd and
* ~/.config/systemd/user/git-maintenance@.timer
to define the schedule at which the command should be run.
Those units are templates that are parameterized by the frequency.
Based on those templates, 3 timers are started:
* git-maintenance@hourly.timer
* git-maintenance@daily.timer
* git-maintenance@weekly.timer
The command launched by those three timers are the same as with the
other scheduling methods:
/path/to/git for-each-repo --exec-path=/path/to
--config=maintenance.repo maintenance run --schedule=%i
with the full path for git to ensure that the version of git launched
for the scheduled maintenance is the same as the one used to run
`maintenance start`.
The timer unit contains `Persistent=true` so that, if the computer is
powered down when a maintenance task should run, the task will be run
when the computer is back powered on.
Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Depending on the system, different schedulers can be used to schedule
the hourly, daily and weekly executions of `git maintenance run`:
* `launchctl` for MacOS,
* `schtasks` for Windows and
* `crontab` for everything else.
`git maintenance run` now has an option to let the end-user explicitly
choose which scheduler he wants to use:
`--scheduler=auto|crontab|launchctl|schtasks`.
When `git maintenance start --scheduler=XXX` is run, it not only
registers `git maintenance run` tasks in the scheduler XXX, it also
removes the `git maintenance run` tasks from all the other schedulers to
ensure we cannot have two schedulers launching concurrent identical
tasks.
The default value is `auto` which chooses a suitable scheduler for the
system.
`git maintenance stop` doesn't have any `--scheduler` parameter because
this command will try to remove the `git maintenance run` tasks from all
the available schedulers.
Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Current implementation of `xdg_config_home(filename)` returns
`$XDG_CONFIG_HOME/git/$filename`, with the `git` subdirectory inserted
between the `XDG_CONFIG_HOME` environment variable and the parameter.
This patch introduces a `xdg_config_home_for(subdir, filename)` function
which is more generic. It only concatenates "$XDG_CONFIG_HOME", or
"$HOME/.config" if the former isn’t defined, with the parameters,
without adding `git` in between.
`xdg_config_home(filename)` is now implemented by calling
`xdg_config_home_for("git", filename)` but this new generic function can
be used to compute the configuration directory of other programs.
Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'debug' function in test-lib-functions.sh is used to invoke a
debugger at a specific line in a test. It inherits the value of HOME and
TERM set by 'test-lib.sh': HOME="$TRASH_DIRECTORY" and TERM=dumb.
Changing the value of HOME means that any customization configured in a
developers' debugger configuration file (like $HOME/.gdbinit or
$HOME/.lldbinit) are not available in the debugger invoked by
'test_pause'.
Changing the value of TERM to 'dumb' means that colored output
is disabled in the debugger.
To make the debugging experience with 'debug' more pleasant, leverage
the variable USER_HOME, added in the previous commit, to copy a
developer's ~/.gdbinit and ~/.lldbinit to the test HOME. We do not set
HOME to USER_HOME as in 'test_pause' to avoid user configuration in
$USER_HOME/.gitconfig from interfering with the command being debugged.
Also, add a flag to launch the debugger with the original value of
TERM, and add the same warning as for 'test_pause'.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'test_pause' function, which is designed to help interactive
debugging and exploration of tests, currently inherits the value of HOME
and TERM set by 'test-lib.sh': HOME="$TRASH_DIRECTORY" and TERM=dumb. It
also invokes the shell defined by TEST_SHELL_PATH, which defaults to
/bin/sh (through SHELL_PATH).
Changing the value of HOME means that any customization configured in a
developers' shell startup files and any Git aliases defined in their
global Git configuration file are not available in the shell invoked by
'test_pause'.
Changing the value of TERM to 'dumb' means that colored output
is disabled for all commands in that shell.
Using /bin/sh as the shell invoked by 'test_pause' is not ideal since
some platforms (i.e. Debian and derivatives) use Dash as /bin/sh, and
this shell is usually compiled without readline support, which makes for
a poor interactive command line experience.
To make the interactive command line experience in the shell invoked by
'test_pause' more pleasant, save the values of HOME and TERM in
USER_HOME and USER_TERM before changing them in test-lib.sh, and add
options to 'test_pause' to optionally use these variables to invoke the
shell. Also add an option to invoke SHELL instead of TEST_SHELL_PATH, so
that developer's interactive shell is used.
We use options instead of changing the behaviour unconditionally since
these three variables can slightly change command behaviour. Moreover,
using the original HOME means commands could overwrite files in a user's
home directory. Be explicit about these caveats in the new 'Usage'
section in test-lib-functions.sh.
Finally, add '[options]' to the test_pause synopsys in t/README, and
mention that the full list of helper functions and their options can be
found in test-lib-functions.sh.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
3f824e91c8 (t/Makefile: introduce TEST_SHELL_PATH, 2017-12-08)
made it easy to use a different shell for the tests than 'SHELL_PATH'
used at compile time. But 'test_pause' still invokes 'SHELL_PATH'.
If TEST_SHELL_PATH is set, invoke that shell in 'test_pause' for
consistency.
Suggested-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add $(INSTALL_STRIP), which allows passing stripping options to
$(INSTALL).
For this to work, installing executables must be split to installing
compiled binaries and scripts portions, since $(INSTALL_STRIP) is only
meaningful to the former.
Users can set this variable depending on their system. For example,
Linux users can use `-s --strip-program=strip`, while FreeBSD users can
simply set to `-s` and choose strip program with $STRIPBIN.
[original outline by Đoàn Trần Công Danh]
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ll_binary_merge() function assumes that the ancestor blob is
different from either side of the new versions, and always fails
the merge in conflict, unless -Xours or -Xtheirs is in effect.
The normal "merge" machineries all resolve the trivial cases
(e.g. if our side changed while their side did not, the result
is ours) without triggering the file-level merge drivers, so the
assumption is warranted.
The code path in "git apply --3way", however, does not check for
the trivial three-way merge situation and always calls the
file-level merge drivers. This used to be perfectly OK back
when we always first attempted a straight patch application and
used the three-way code path only as a fallback. Any binary
patch that can be applied as a trivial three-way merge (e.g. the
patch is based exactly on the version we happen to have) would
always cleanly apply, so the ll_binary_merge() that is not
prepared to see the trivial case would not have to handle such a
case.
This no longer is true after we made "--3way" to mean "first try
three-way and then fall back to straight application", and made
"git apply -3" on a binary patch that is based on the current
version no longer apply.
Teach "git apply -3" to first check for the trivial merge cases
and resolve them without hitting the file-level merge drivers.
Signed-off-by: Jerry Zhang <jerry@skydio.com>
[jc: stolen tests from Jerry's patch]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update to the command line completion (in contrib/) for tcsh.
* ti/tcsh-completion-regression-fix:
completion: tcsh: Fix regression by drop of wrapper functions
Command line completion updates.
* fc/completion-updates:
completion: bash: add correct suffix in variables
completion: bash: fix for multiple dash commands
completion: bash: fix for suboptions with value
completion: bash: fix prefix detection in branch.*
Various bugs in "git rebase -r" have been fixed.
* pw/rebase-r-fixes:
rebase -r: fix merge -c with a merge strategy
rebase -r: don't write .git/MERGE_MSG when fast-forwarding
rebase -i: add another reword test
rebase -r: make 'merge -c' behave like reword
Checking out all the paths from HEAD during the last conflicted
step in "git rebase" and continuing would cause the step to be
skipped (which is expected), but leaves MERGE_MSG file behind in
$GIT_DIR and confuses the next "git commit", which has been
corrected.
* pw/rebase-skip-final-fix:
rebase --continue: remove .git/MERGE_MSG
rebase --apply: restore some tests
t3403: fix commit authorship
Use upload-artifacts v1 (instead of v2) for 32-bit linux, as the
new version has a blocker bug for that architecture.
* cb/ci-use-upload-artifacts-v1:
ci: use upload-artifacts v1 for dockerized jobs
"git commit --fixup" now works with "--edit" again, after it was
broken in v2.32.
* jk/commit-edit-fixup-fix:
commit: restore --edit when combined with --fixup
The revision traversal API has been optimized by taking advantage
of the commit-graph, when available, to determine if a commit is
reachable from any of the existing refs.
* ps/connectivity-optim:
revision: avoid hitting packfiles when commits are in commit-graph
commit-graph: split out function to search commit position
revision: stop retrieving reference twice
connected: do not sort input revisions
revision: separate walk and unsorted flags
With the codebase firmly C99 compatible and most compilers supporting
newer versions by default, it could help bring visibility to problems.
Reverse the DEVOPTS=pedantic flag to provide a fallback for people stuck
with gcc < 5 or some other compiler that either doesn't support this flag
or has issues with it, and while at it also enable -Wpedantic which used
to be controversial[1] when Apple compilers and clang had widely divergent
version numbers.
Ideally any compiler found to have issues with these flags will be added
to an exception, and indeed, one was added to safely process windows
headers that would use non standard print identifiers, but it is expected
that more will be needed, so it could be considered a weather balloon.
[1] https://lore.kernel.org/git/20181127100557.53891-1-carenas@gmail.com/
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation to building with pedantic mode enabled, change a couple
of places where the current mingw gcc compiler provided with the SDK
reports issues.
A full fix for the incompatible use of (void *) to store function
pointers has been punted, with the minimal change to instead use a
generic function pointer (FARPROC), and therefore the (hopefully)
temporary need to disable incompatible pointer warnings.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the USE_PARENS_AROUND_GETTEXT_N compile-time option which was
meant to catch an inadvertent mistake which is too obscure to
maintain this facility.
The backstory of how USE_PARENS_AROUND_GETTEXT_N came about is: When I
added the N_() macro in 6578483036 (i18n: add no-op _() and N_()
wrappers, 2011-02-22) it was defined as:
#define N_(msgid) (msgid)
This is non-standard C, as was noticed and fixed in 642f85faab (i18n:
avoid parenthesized string as array initializer, 2011-04-07).
I.e. this needed to be defined as:
#define N_(msgid) msgid
Then in e62cd35a3e (i18n: log: mark parseopt strings for translation,
2012-08-20) when "builtin_log_usage" was marked for translation the
string concatenation for passing to usage() added in 1c370ea4e5
(Show usage string for 'git log -h', 'git show -h' and 'git diff -h',
2009-08-06) was faithfully preserved:
- "git log [<options>] [<since>..<until>] [[--] <path>...]\n"
- " or: git show [options] <object>...",
+ N_("git log [<options>] [<since>..<until>] [[--] <path>...]\n")
+ N_(" or: git show [options] <object>..."),
This was then fixed to be the expected array of usage strings in
e66dc0cc4b (log.c: fix translation markings, 2015-01-06) rather than
a string with multiple "\n"-delimited usage strings, and finally in
290c8e7a3f (gettext.h: add parentheses around N_ expansion if
supported, 2015-01-11) USE_PARENS_AROUND_GETTEXT_N was added to ensure
this mistake didn't happen again.
I think that even if this was a N_()-specific issue this
USE_PARENS_AROUND_GETTEXT_N facility wouldn't be worth it, the issue
would be too rare to worry about.
But I also think that 290c8e7a3f which introduced
USE_PARENS_AROUND_GETTEXT_N misattributed the problem. The issue
wasn't with the N_() macro added in e62cd35a3e, but that before the
N_() macro existed in the codebase the initial migration to
parse_options() in 1c370ea4e5 continued passsing in a "\n"-delimited
string, when the new API it was migrating to supported and expected
the passing of an array.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When executing git-update-ref(1) with the `--stdin` flag, then the user
can queue updates and, since e48cf33b61 (update-ref: implement
interactive transaction handling, 2020-04-02), interactively drive the
transaction's state via a set of transactional verbs. This interactivity
is somewhat broken though: while the caller can use these verbs to drive
the transaction's state, the status messages which confirm that a verb
has been processed is not flushed. The caller may thus be left hanging
waiting for the acknowledgement.
Fix the bug by flushing stdout after writing the status update. Add a
test which catches this bug.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In make_remote(), we store the return value of hashmap_put() and check
it using assert(), but don't otherwise use it. If Git is compiled with
NDEBUG, then the assert() becomes a noop, and nobody looks at the
variable at all. This causes some compilers to produce warnings.
Let's switch it instead to a BUG(). This accomplishes the same thing,
but is always compiled in (and we don't have to worry about the cost;
the check is cheap, and this is not a hot code path).
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the trailing dot from the warning we emit about gc.log. It's
common for various terminal UX's to allow the user to select "words",
and by including the trailing dot a user wanting to select the path to
gc.log will need to manually remove the trailing dot.
Such a user would also probably need to adjust the path if it e.g. had
spaces in it, but this should address this very common case.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Suggested-by: Jan Judas <snugar.i@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These new performance tests demonstrate effectively the same behavior as
p5310, but use a multi-pack bitmap instead of a single-pack one.
Notably, p5326 does not create a MIDX bitmap with multiple packs. This
is so we can measure a direct comparison between it and p5310. Any
difference between the two is measuring just the overhead of using MIDX
bitmaps.
Here are the results of p5310 and p5326 together, measured at the same
time and on the same machine (using a Xenon W-2255 CPU):
Test HEAD
------------------------------------------------------------------------
5310.2: repack to disk 96.78(93.39+11.33)
5310.3: simulated clone 9.98(9.79+0.19)
5310.4: simulated fetch 1.75(4.26+0.19)
5310.5: pack to file (bitmap) 28.20(27.87+8.70)
5310.6: rev-list (commits) 0.41(0.36+0.05)
5310.7: rev-list (objects) 1.61(1.54+0.07)
5310.8: rev-list count with blob:none 0.25(0.21+0.04)
5310.9: rev-list count with blob:limit=1k 2.65(2.54+0.10)
5310.10: rev-list count with tree:0 0.23(0.19+0.04)
5310.11: simulated partial clone 4.34(4.21+0.12)
5310.13: clone (partial bitmap) 11.05(12.21+0.48)
5310.14: pack to file (partial bitmap) 31.25(34.22+3.70)
5310.15: rev-list with tree filter (partial bitmap) 0.26(0.22+0.04)
versus the same tests (this time using a multi-pack index):
Test HEAD
------------------------------------------------------------------------
5326.2: setup multi-pack index 78.99(75.29+11.58)
5326.3: simulated clone 11.78(11.56+0.22)
5326.4: simulated fetch 1.70(4.49+0.13)
5326.5: pack to file (bitmap) 28.02(27.72+8.76)
5326.6: rev-list (commits) 0.42(0.36+0.06)
5326.7: rev-list (objects) 1.65(1.58+0.06)
5326.8: rev-list count with blob:none 0.26(0.21+0.05)
5326.9: rev-list count with blob:limit=1k 2.97(2.86+0.10)
5326.10: rev-list count with tree:0 0.25(0.20+0.04)
5326.11: simulated partial clone 5.65(5.49+0.16)
5326.13: clone (partial bitmap) 12.22(13.43+0.38)
5326.14: pack to file (partial bitmap) 30.05(31.57+7.25)
5326.15: rev-list with tree filter (partial bitmap) 0.24(0.20+0.04)
There is slight overhead in "simulated clone", "simulated partial
clone", and "clone (partial bitmap)". Unsurprisingly, that overhead is
due to using the MIDX's reverse index to map between bit positions and
MIDX positions.
This can be reproduced by running "git repack -adb" along with "git
multi-pack-index write --bitmap" in a large-ish repository. Then run:
$ perf record -o pack.perf git -c core.multiPackIndex=false \
pack-objects --all --stdout >/dev/null </dev/null
$ perf record -o midx.perf git -c core.multiPackIndex=true \
pack-objects --all --stdout >/dev/null </dev/null
and compare the two with "perf diff -c delta -o 1 pack.perf midx.perf".
The most notable results are below (the next largest positive delta is
+0.14%):
# Event 'cycles'
#
# Baseline Delta Shared Object Symbol
# ........ ....... .................. ..........................
#
+5.86% git [.] nth_midxed_offset
+5.24% git [.] nth_midxed_pack_int_id
3.45% +0.97% git [.] offset_to_pack_pos
3.30% +0.57% git [.] pack_pos_to_offset
+0.30% git [.] pack_pos_to_midx
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new p5326 introduced by the next patch will want these same tests,
interjecting its own setup in between. Move them out so that both perf
tests can reuse them.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment
variable to also write a multi-pack bitmap when
'GIT_TEST_MULTI_PACK_INDEX' is set.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A number of these tests are focused only on pack-based bitmaps and need
to be updated to disable 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' where
necessary.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test is specifically about generating a midx still respecting a
pack-based bitmap file. Generating a MIDX bitmap would confuse the test.
Let's override the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' variable to
make sure we don't do so.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Generating a MIDX bitmap confuses many of the tests in t5310, which
expect to control whether and how bitmaps are written. Since the
relevant MIDX-bitmap tests here are covered already in t5326, let's just
disable the flag for the whole t5310 script.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Generating a MIDX bitmap causes tests which repack in a partial clone to
fail because they are missing objects. Missing objects is an expected
component of tests in t0410, so disable this knob altogether. Graceful
degradation when writing a bitmap with missing objects is tested in
t5326.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch introduces a new test, t5326, which tests the basic
functionality of multi-pack bitmaps.
Some trivial behavior is tested, such as:
- Whether bitmaps can be generated with more than one pack.
- Whether clones can be served with all objects in the bitmap.
- Whether follow-up fetches can be served with some objects outside of
the server's bitmap
These use lib-bitmap's tests (which in turn were pulled from t5310), and
we cover cases where the MIDX represents both a single pack and multiple
packs.
In addition, some non-trivial and MIDX-specific behavior is tested, too,
including:
- Whether multi-pack bitmaps behave correctly with respect to the
pack-reuse machinery when the base for some object is selected from
a different pack than the delta.
- Whether multi-pack bitmaps correctly respect the
pack.preferBitmapTips configuration.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subsequent tests will want to check for the existence of a multi-pack
bitmap which matches the multi-pack-index stored in the pack directory.
The multi-pack bitmap includes the hex checksum of the MIDX it
corresponds to in its filename (for example,
'$packdir/multi-pack-index-<checksum>.bitmap'). As a result, some tests
want a way to learn what '<checksum>' is.
This helper addresses that need by printing the checksum of the
repository's multi-pack-index.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We'll soon be adding a test script that will cover many of the same
bitmap concepts as t5310, but for MIDX bitmaps. Let's pull out as many
of the applicable tests as we can so we don't have to rewrite them.
There should be no functional change to t5310; we still run the same
operations in the same order.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Write multi-pack bitmaps in the format described by
Documentation/technical/bitmap-format.txt, inferring their presence with
the absence of '--bitmap'.
To write a multi-pack bitmap, this patch attempts to reuse as much of
the existing machinery from pack-objects as possible. Specifically, the
MIDX code prepares a packing_data struct that pretends as if a single
packfile has been generated containing all of the objects contained
within the MIDX.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This prepares the code in pack-bitmap to interpret the new multi-pack
bitmaps described in Documentation/technical/bitmap-format.txt, which
mostly involves converting bit positions to accommodate looking them up
in a MIDX.
Note that there are currently no writers who write multi-pack bitmaps,
and that this will be implemented in the subsequent commit. Note also
that get_midx_checksum() and get_midx_filename() are made non-static so
they can be called from pack-bitmap.c.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
try_partial_reuse() is used to mark any bits in the beginning of a
bitmap whose objects can be reused verbatim from the pack they came
from.
Currently this function returns void, and signals nothing to the caller
when bits could not be reused. But multi-pack bitmaps would benefit from
having such a signal, because they may try to pass objects which are in
bounds, but from a pack other than the preferred one.
Any extra calls are noops because of a conditional in
reuse_partial_packfile_from_bitmap(), but those loop iterations can be
avoided by letting try_partial_reuse() indicate when it can't accept any
more bits for reuse, and then listening to that signal.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a recent commit, pack-objects learned support for the
'pack.preferBitmapTips' configuration. This patch prepares the
multi-pack bitmap code to respect this configuration, too.
The yet-to-be implemented code will find that it is more efficient to
check whether each reference contains a prefix found in the configured
set of values rather than doing an additional traversal.
Implement a function 'bitmap_is_preferred_refname()' which will perform
that check. Its caller will be added in a subsequent patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subsequent patch to support reading MIDX bitmaps will be less noisy
after extracting a generic function to fetch the nth OID contained in
the bitmap.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subsequent patch to support reading MIDX bitmaps will be less noisy
after extracting a generic function to return how many objects are
contained in a bitmap.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Opening multiple instance of the same MIDX can lead to problems like two
separate packed_git structures which represent the same pack being added
to the repository's object store.
The above scenario can happen because prepare_midx_pack() checks if
`m->packs[pack_int_id]` is NULL in order to determine if a pack has been
opened and installed in the repository before. But a caller can
construct two copies of the same MIDX by calling get_multi_pack_index()
and load_multi_pack_index() since the former manipulates the
object store directly but the latter is a lower-level routine which
allocates a new MIDX for each call.
So if prepare_midx_pack() is called on multiple MIDXs with the same
pack_int_id, then that pack will be installed twice in the object
store's packed_git pointer.
This can lead to problems in, for e.g., the pack-bitmap code, which does
something like the following (in pack-bitmap.c:open_pack_bitmap()):
struct bitmap_index *bitmap_git = ...;
for (p = get_all_packs(r); p; p = p->next) {
if (open_pack_bitmap_1(bitmap_git, p) == 0)
ret = 0;
}
which is a problem if two copies of the same pack exist in the
packed_git list because pack-bitmap.c:open_pack_bitmap_1() contains a
conditional like the following:
if (bitmap_git->pack || bitmap_git->midx) {
/* ignore extra bitmap file; we can only handle one */
warning("ignoring extra bitmap file: %s", packfile->pack_name);
close(fd);
return -1;
}
Avoid this scenario by not letting write_midx_internal() open a MIDX
that isn't also pointed at by the object store. So long as this is the
case, other routines should prefer to open MIDXs with
get_multi_pack_index() or reprepare_packed_git() instead of creating
instances on their own. Because get_multi_pack_index() returns
`r->object_store->multi_pack_index` if it is non-NULL, we'll only have
one instance of a MIDX open at one time, avoiding these problems.
To encourage this, drop the `struct multi_pack_index *` parameter from
`write_midx_internal()`, and rely instead on the `object_dir` to find
(or initialize) the correct MIDX instance.
Likewise, replace the call to `close_midx()` with
`close_object_store()`, since we're about to replace the MIDX with a new
one and should invalidate the object store's memory of any MIDX that
might have existed beforehand.
Note that this now forbids passing object directories that don't belong
to alternate repositories over `--object-dir`, since before we would
have happily opened a MIDX in any directory, but now restrict ourselves
to only those reachable by `r->objects->multi_pack_index` (and alternate
MIDXs that we can see by walking the `next` pointer).
As far as I can tell, supporting arbitrary directories with
`--object-dir` was a historical accident, since even the documentation
says `<alt>` when referring to the value passed to this option.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching refs, we are doing two connectivity checks:
- The first one is done such that we can skip fetching refs in the
case where we already have all objects referenced by the updated
set of refs.
- The second one verifies that we have all objects after we have
fetched objects.
We always execute both connectivity checks, but this is wasteful in case
the first connectivity check already notices that we have all objects
locally available.
Skip the second connectivity check in case we already had all objects
available. This gives us a nice speedup when doing a mirror-fetch in a
repository with about 2.3M refs where the fetching repo already has all
objects:
Benchmark #1: HEAD~: git-fetch
Time (mean ± σ): 30.025 s ± 0.081 s [User: 27.070 s, System: 4.933 s]
Range (min … max): 29.900 s … 30.111 s 5 runs
Benchmark #2: HEAD: git-fetch
Time (mean ± σ): 25.574 s ± 0.177 s [User: 22.855 s, System: 4.683 s]
Range (min … max): 25.399 s … 25.765 s 5 runs
Summary
'HEAD: git-fetch' ran
1.17 ± 0.01 times faster than 'HEAD~: git-fetch'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functions `fetch_refs()` and `consume_refs()` must always be called
together such that we first obtain all missing objects and then update
our local refs to match the remote refs. In a subsequent patch, we'll
further require that `fetch_refs()` must always be called before
`consume_refs()` such that it can correctly assert that we have all
objects after the fetch given that we're about to move the connectivity
check.
Make this requirement explicit by merging both functions into a single
`fetch_and_consume_refs()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor `fetch_refs()` code to make it more extendable by explicitly
handling error cases. The refactored code should behave the same.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to negotiate a packfile, we need to dereference refs to see
which commits we have in common with the remote. To do so, we first look
up the object's type -- if it's a tag, we peel until we hit a non-tag
object. If we hit a commit eventually, then we return that commit.
In case the object ID points to a commit directly, we can avoid the
initial lookup of the object type by opportunistically looking up the
commit via the commit-graph, if available, which gives us a slight speed
bump of about 2% in a huge repository with about 2.3M refs:
Benchmark #1: HEAD~: git-fetch
Time (mean ± σ): 31.634 s ± 0.258 s [User: 28.400 s, System: 5.090 s]
Range (min … max): 31.280 s … 31.896 s 5 runs
Benchmark #2: HEAD: git-fetch
Time (mean ± σ): 31.129 s ± 0.543 s [User: 27.976 s, System: 5.056 s]
Range (min … max): 30.172 s … 31.479 s 5 runs
Summary
'HEAD: git-fetch' ran
1.02 ± 0.02 times faster than 'HEAD~: git-fetch'
In case this fails, we fall back to the old code which peels the
objects to a commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The object ID iterator used by the connectivity checks returns the next
object ID via an out-parameter and then uses a return code to indicate
whether an item was found. This is a bit roundabout: instead of a
separate error code, we can just return the next object ID directly and
use `NULL` pointers as indicator that the iterator got no items left.
Furthermore, this avoids a copy of the object ID.
Refactor the iterator and all its implementations to return object IDs
directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs:
Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch
Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s]
Range (min … max): 29.934 s … 30.406 s 10 runs
Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch
Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s]
Range (min … max): 29.696 s … 29.996 s 10 runs
Summary
'328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran
1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch'
While this 1% speedup could be labelled as statistically insignificant,
the speedup is consistent on my machine. Furthermore, this is an end to
end test, so it is expected that the improvement in the connectivity
check itself is more significant.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When updating local refs after the fetch has transferred all objects, we
do an object existence test as a safety guard to avoid updating a ref to
an object which we don't have. We do so via `oid_object_info()`: if it
returns an error, then we know the object does not exist.
One side effect of `oid_object_info()` is that it parses the object's
type, and to do so it must unpack the object header. This is completely
pointless: we don't care for the type, but only want to assert that the
object exists.
Refactor the code to use `repo_has_object_file()`, which both makes the
code's intent clearer and is also faster because it does not unpack
object headers. In a real-world repo with 2.3M refs, this results in a
small speedup when doing a mirror-fetch:
Benchmark #1: HEAD~: git-fetch
Time (mean ± σ): 33.686 s ± 0.176 s [User: 30.119 s, System: 5.262 s]
Range (min … max): 33.512 s … 33.944 s 5 runs
Benchmark #2: HEAD: git-fetch
Time (mean ± σ): 31.247 s ± 0.195 s [User: 28.135 s, System: 5.066 s]
Range (min … max): 30.948 s … 31.472 s 5 runs
Summary
'HEAD: git-fetch' ran
1.08 ± 0.01 times faster than 'HEAD~: git-fetch'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When updating our local refs based on the refs fetched from the remote,
we need to iterate through all requested refs and load their respective
commits such that we can determine whether they need to be appended to
FETCH_HEAD or not. In cases where we're fetching from a remote with
exceedingly many refs, resolving these refs can be quite expensive given
that we repeatedly need to unpack object headers for each of the
referenced objects.
Speed this up by opportunistically trying to resolve object IDs via the
commit graph. We only do so for any refs which are not in "refs/tags":
more likely than not, these are going to be a commit anyway, and this
lets us avoid having to unpack object headers completely in case the
object is a commit that is part of the commit-graph. This significantly
speeds up mirror-fetches in a real-world repository with
2.3M refs:
Benchmark #1: HEAD~: git-fetch
Time (mean ± σ): 56.482 s ± 0.384 s [User: 53.340 s, System: 5.365 s]
Range (min … max): 56.050 s … 57.045 s 5 runs
Benchmark #2: HEAD: git-fetch
Time (mean ± σ): 33.727 s ± 0.170 s [User: 30.252 s, System: 5.194 s]
Range (min … max): 33.452 s … 33.871 s 5 runs
Summary
'HEAD: git-fetch' ran
1.67 ± 0.01 times faster than 'HEAD~: git-fetch'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a repository has at least one alternate, the MIDX belonging to each
alternate is accessed through the `next` pointer on the main object
store's copy of the MIDX. close_midx() didn't bother to close any
of the linked MIDXs. It likewise didn't free the memory pointed to by
`m`, leaving uninitialized bytes with live pointers to them left around
in the heap.
Clean this up by closing linked MIDXs, and freeing up the memory pointed
to by each of them. When callers call close_midx(), then they can
discard the entire linked list of MIDXs and set their pointer to the
head of that list to NULL.
This isn't strictly required for the upcoming patches, but it makes it
much more difficult (though still possible, for e.g., by calling
`close_midx(m->next)` which leaves `m->next` pointing at uninitialized
bytes) to have pointers to uninitialized memory.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 9218c6a40c (midx: allow marking a pack as preferred, 2021-03-30), the
multi-pack index code learned how to select a pack which all duplicate
objects are selected from. That is, if an object appears in multiple
packs, select the copy in the preferred pack before breaking ties
according to the other rules like pack mtime and readdir() order.
Not specifying a preferred pack can cause serious problems with
multi-pack reachability bitmaps, because these bitmaps rely on having at
least one pack from which all duplicates are selected. Not having such a
pack causes problems with the code in pack-objects to reuse packs
verbatim (e.g., that code assumes that a delta object in a chunk of pack
sent verbatim will have its base object sent from the same pack).
So why does not marking a pack preferred cause problems here? The reason
is roughly as follows:
- Ties are broken (when handling duplicate objects) by sorting
according to midx_oid_compare(), which sorts objects by OID,
preferred-ness, pack mtime, and finally pack ID (more on that
later).
- The psuedo pack-order (described in
Documentation/technical/pack-format.txt under the section
"multi-pack-index reverse indexes") is computed by
midx_pack_order(), and sorts by pack ID and pack offset, with
preferred packs sorting first.
- But! Pack IDs come from incrementing the pack count in
add_pack_to_midx(), which is a callback to
for_each_file_in_pack_dir(), meaning that pack IDs are assigned in
readdir() order.
When specifying a preferred pack, all of that works fine, because
duplicate objects are correctly resolved in favor of the copy in the
preferred pack, and the preferred pack sorts first in the object order.
"Sorting first" is critical, because the bitmap code relies on finding
out which pack holds the first object in the MIDX's pseudo pack-order to
determine which pack is preferred.
But if we didn't specify a preferred pack, and the pack which comes
first in readdir() order does not also have the lowest timestamp, then
it's possible that that pack (the one that sorts first in pseudo-pack
order, which the bitmap code will treat as the preferred one) did *not*
have all duplicate objects resolved in its favor, resulting in breakage.
The fix is simple: pick a (semi-arbitrary, non-empty) preferred pack
when none was specified. This forces that pack to have duplicates
resolved in its favor, and (critically) to sort first in pseudo-pack
order. Unfortunately, testing this behavior portably isn't possible,
since it depends on readdir() order which isn't guaranteed by POSIX.
(Note that multi-pack reachability bitmaps have yet to be implemented;
so in that sense this patch is fixing a bug which does not yet exist.
But by having this patch beforehand, we can prevent the bug from ever
materializing.)
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The soon-to-be-implemented multi-pack bitmap treats object in the first
bit position specially by assuming that all objects in the pack it was
selected from are also represented from that pack in the MIDX. In other
words, the pack from which the first object was selected must also have
all of its other objects selected from that same pack in the MIDX in
case of any duplicates.
But this assumption relies on the fact that there is at least one object
in that pack to begin with; otherwise the object in the first bit
position isn't from a preferred pack, in which case we can no longer
assume that all objects in that pack were also selected from the same
pack.
Guard this assumption by checking the number of objects in the given
preferred pack, and failing if the given pack is empty.
To make sure we can safely perform this check, open any packs which are
contained in an existing MIDX via prepare_midx_pack(). The same is done
for new packs via the add_pack_to_midx() callback, but packs picked up
from a previous MIDX will not yet have these opened.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a new multi-pack index, write_midx_internal() attempts to
clean up any auxiliary files (currently just the MIDX's `.rev` file, but
soon to include a `.bitmap`, too) corresponding to the MIDX it's
replacing.
This step should happen after the new MIDX is written into place, since
doing so beforehand means that the old MIDX could be read without its
corresponding .rev file.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If using --object-dir to point into an object directory which belongs to
a different repository than the one in the current working directory,
such as:
git init repo
git -C repo ... # add some objects
cd alternate
git multi-pack-index --object-dir ../repo/.git/objects write
the binary will segfault trying to access the object-dir via the repo it
found, but that's not fully initialized. Worse, if we later call
clear_midx_files_ext(), we will use `the_repository` and remove files
out of the wrong object directory.
Fix this by using the given object_dir (or the object directory of
`the_repository` if `--object-dir` wasn't given) to properly to clean up
the *.rev files, avoiding the crash.
Original-patch-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index command supports working with arbitrary object
directories via the `--object-dir` flag. Though this has historically
worked in arbitrary repositories (including when the command itself was
run outside of a Git repository), this has been somewhat of an accident.
For example, running:
git multi-pack-index write --object-dir=/path/to/repo/objects
outside of a Git repository causes a BUG(). This is because the
top-level `cmd_multi_pack_index()` function stops parsing when it sees
"write", and then fills in the default object directory (the result of
calling `get_object_directory()`) before handing off to
`cmd_multi_pack_index_write()`. But there is no repository to
initialize, and so calling `get_object_directory()` results in a BUG()
(indicating that the current repository is not initialized).
Another case where this doesn't quite work as expected is when operating
in a SHA-256 repository. To see the failure, try this in your shell:
git init --object-format=sha256 repo
git -C repo commit --allow-empty base
git -C repo repack -d
git multi-pack-index --object-dir=$(pwd)/repo/.git/objects write
and observe that we cannot open the `.idx` file in "repo", because the
outermost process assumes that any repository that it works in also uses
the default value of `the_hash_algo` (at the time of writing, SHA-1).
There may be compelling reasons for trying to work around these bugs,
but working in arbitrary `--object-dir`'s is non-standard enough (and
likewise, these bugs prevalent enough) that I don't think any workflows
would be broken by abandoning this behavior.
Accordingly, restrict the `multi-pack-index` builtin to only work when
inside of a Git repository (i.e., its main utility becomes selecting
which alternate to operate in), which avoids both of the bugs above.
(Note that you can still trigger a bug when writing a MIDX in an
alternate which does not use the same object format as the repository
which it is an alternate of, but that is an unrelated bug to this one).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In both protocol v0 and v2, upload-pack writes one pktline packet per
advertised ref to stdout. That means one or two write(2) syscalls per
ref. This is problematic if these writes become network sends with
high overhead.
This commit changes both send_ref callbacks to use buffered IO using
stdio.
To give an example of the impact: I set up a single-threaded loop that
calls ls-remote (with HTTP and protocol v2) on a local GitLab
instance, on a repository with 11K refs. When I switch from Git
v2.32.0 to this patch, I see a 40% reduction in CPU time for Git, and
65% for Gitaly (GitLab's Git RPC service).
So using buffered IO not only saves syscalls in upload-pack, it also
saves time in things that consume upload-pack's output.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds three new functions to pkt-line.c: packet_fwrite,
packet_fwrite_fmt and packet_fflush. Besides writing a pktline flush
packet, packet_fflush also flushes the stdio buffer of the stream.
Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expand the section about namespaces in the documentation of
`transfer.hideRefs` to point out the subtle differences between
`upload-pack` and `receive-pack`.
ffcfb68176 (upload-pack.c: treat want-ref relative to namespace,
2021-07-30) taught `upload-pack` to reject `want-ref`s for hidden refs,
which is now mentioned. It is clarified that at no point the name of a
hidden ref is revealed, but the object id it points to may.
Signed-off-by: Kim Altintop <kim@eagain.st>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'upload-pack' runs within the context of a git namespace, treat any
'want-ref' lines the client sends as relative to that namespace.
Also check if the wanted ref is hidden via 'hideRefs'. If it is hidden,
respond with an error as if the ref didn't exist.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Kim Altintop <kim@eagain.st>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Assembling a "raw" fetch command to be fed directly to "test-tool serve-v2"
is extracted into a test helper.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kim Altintop <kim@eagain.st>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
clear the "me" structure, but while we freed parts of the
mailmap_entry structure, we didn't free the structure itself. The same
goes for the "mailmap_info" structure.
This brings the number of SANITIZE=leak failures in t4203-mailmap.sh
down from 50 to 49. Not really progress as far as the number of
failures is concerned, but as far as I can tell this fixes all leaks
in mailmap.c itself. There's still users of it such as builtin/log.c
that call read_mailmap() without a clear_mailmap(), but that's on
them.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 7f40759496 (fast-export: tighten anonymize_mem() interface to
handle only strings, 2020-06-23) changed the interface used in anonymizing
strings, but failed to update the size of annotated tag messages to match
the new anonymized string.
As a result, exporting tags having messages longer than 13 characters
would create output that couldn't be parsed by fast-import,
as the data length indicated was larger than the data output.
Reset the message size when anonymizing, and add a tag with a "long"
message to the test.
Signed-off-by: Tal Kelrich <hasturkun@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in a2ba162cda (object-info: support for retrieving
object info, 2021-04-20) which appears to have been based on a
misunderstanding of how the pkt-line.c API works. There is no need to
strdup() input to packet_writer_write(), it's just a printf()-like
format function.
This fixes a potentially large memory leak, since the number of OID
lines the "object-info" call can be arbitrarily large (or a small one
if the request is small).
This makes t5701-git-serve.sh pass again under SANITIZE=leak, as it
did before a2ba162cda.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Bruno Albuquerque <bga@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we fail to start a diff command inside a submodule, immediately
exit the routine rather than trying to finish the command and printing
a second message.
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you have ever initialized a submodule, open_submodule will open it.
If you then delete the submodule's worktree directory (but don't
remove it from .gitmodules), git diff --submodule=diff would error out
as it attempted to chdir into the now-deleted working tree directory.
This only matters if the submodules git dir is absorbed. If not, then
we no longer have anywhere to run the diff. But that case does not
trigger this error, because in that case, open_submodule fails, so we
don't resolve a left commit, so we exit early, which is the only thing
we could do.
If absorbed, then we can run the diff from the submodule's absorbed
git dir (.git/modules/sm2). In practice, that's a bit more
complicated, because `git diff` expects to be run from inside a
working directory, not a git dir. So it looks in the config for
core.worktree, and does chdir("../../../sm2"), which is the very dir
that we're trying to avoid visiting because it's been deleted. We
work around this by setting GIT_WORK_TREE (and GIT_DIR) to ".". It is
little weird to set GIT_WORK_TREE to something that is not a working
tree just to avoid an unnecessary chdir, but it works.
Signed-off-by: David Turner <dturner@twosigma.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bring the "commit-graph" command in line with the error output and
general pattern in cmd_multi_pack_index().
Let's test for that output, and also cover the same potential bug as
was fixed in the multi-pack-index command in
88617d11f9 (multi-pack-index: fix potential segfault without
sub-command, 2021-07-19).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the parse_options() invocation in the commit-graph code to
error on unknown leftover argv elements, in addition to the existing
and implicit erroring via parse_options() on unknown options.
We'd already error in cmd_commit_graph() on e.g.:
git commit-graph unknown verify
git commit-graph --unknown verify
But here we're calling parse_options() twice more for the "write" and
"verify" subcommands. We did not do the same checking for leftover
argv elements there. As a result we'd silently accept garbage in these
subcommands, let's not do that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than guarding all of the !argc with an additional "if" arm
let's do an early goto to "usage". This also makes it clear that
"save_commit_buffer" is not needed in this case.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the "goto usage" pattern added in
cd57bc41bb (builtin/multi-pack-index.c: display usage on unrecognized
command, 2021-03-30) and 88617d11f9 (multi-pack-index: fix potential
segfault without sub-command, 2021-07-19) to maintain the same
brevity, but in a form that doesn't run afoul of the recommendation in
CodingGuidelines about braces:
When there are multiple arms to a conditional and some of them
require braces, enclose even a single line block in braces for
consistency[...]
Let's also change "argv == 0" to juts "!argv", per:
Do not explicitly compare an integral value with constant 0 or
'\0', or a pointer value with constant NULL[...]
I'm changing this because in a subsequent commit I'll make
builtin/commit-graph.c use the same pattern, having the two similarly
structured commands match aids readability.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make use of the parse_options_concat() so we don't need to copy/paste
common options like --object-dir.
This is inspired by a similar change to "checkout" in 2087182272
(checkout: split options[] array in three pieces, 2019-03-29), and the
same pattern in the multi-pack-index command, see
60ca94769c (builtin/multi-pack-index.c: split sub-commands,
2021-03-30).
A minor behavior change here is that now we're going to list both
--object-dir and --progress first, before we'd list --progress along
with other options.
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we don't handle the -h option here like most parse_options() users
we'll fall through and it'll do the right thing for us.
I think this code added in 4ce58ee38d (commit-graph: create
git-commit-graph builtin, 2018-04-02) was always redundant,
parse_options() did this at the time, and the commit-graph code never
used PARSE_OPT_NO_INTERNAL_HELP.
We don't need a test for this, it's tested by the t0012-help.sh test
added in d691551192 (t0012: test "-h" with builtins, 2017-05-30).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Share the usage message between these three variables by using a
macro. Before this new options needed to copy/paste the usage
information, see e.g. 809e0327f5 (builtin/commit-graph.c: introduce
'--max-new-filters=<n>', 2020-09-18).
See b25b727494 (builtin/multi-pack-index.c: define common usage with
a macro, 2021-03-30) for another use of this pattern (but on-list this
one came first).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Silently skipping commits when rebasing with --no-reapply-cherry-picks
(currently the default behavior) can cause user confusion. Issue
warnings when this happens, as well as advice on how to preserve the
skipped commits.
These warnings and advice are displayed only when using the (default)
"merge" rebase backend.
Update the git-rebase docs to mention the warnings and advice.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The most significant of this batch is of course "merge -sort".
Thanks, Elijah and everybody who helped the topic.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current implementation of GIT_TEST_FAIL_PREREQS is broken in
that checking for the lack of a prerequisite would not work. Avoid
the use of "if ! test_have_prereq X" in a test script.
* bc/t5607-avoid-broken-test-fail-prereqs:
t5607: avoid using prerequisites to select algorithm
"git range-diff" code clean-up.
* jk/range-diff-fixes:
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
"git apply" miscounted the bytes and failed to read to the end of
binary hunks.
* jk/apply-binary-hunk-parsing-fix:
apply: keep buffer/size pair in sync when parsing binary hunks
Remind developers that the userdiff patterns should be kept simple
and permissive, assuming that the contents they apply are always
syntactically correct.
* jc/userdiff-pattern-hint:
userdiff: comment on the builtin patterns
Use `ort` instead of `recursive` as the default merge strategy.
* en/ort-becomes-the-default:
Update docs for change of default merge backend
Change default merge backend from recursive to ort
Documentation updates.
* en/merge-strategy-docs:
Update error message and code comment
merge-strategies.txt: add coverage of the `ort` merge strategy
git-rebase.txt: correct out-of-date and misleading text about renames
merge-strategies.txt: fix simple capitalization error
merge-strategies.txt: avoid giving special preference to patience algorithm
merge-strategies.txt: do not imply using copy detection is desired
merge-strategies.txt: update wording for the resolve strategy
Documentation: edit awkward references to `git merge-recursive`
directory-rename-detection.txt: small updates due to merge-ort optimizations
git-rebase.txt: correct antiquated claims about --rebase-merges
"git pull" had various corner cases that were not well thought out
around its --rebase backend, e.g. "git pull --ff-only" did not stop
but went ahead and rebased when the history on other side is not a
descendant of our history. The series tries to fix them up.
* en/pull-conflicting-options:
pull: fix handling of multiple heads
pull: update docs & code for option compatibility with rebasing
pull: abort by default when fast-forwarding is not possible
pull: make --rebase and --no-rebase override pull.ff=only
pull: since --ff-only overrides, handle it first
pull: abort if --ff-only is given and fast-forwarding is impossible
t7601: add tests of interactions with multiple merge heads and config
t7601: test interaction of merge/rebase/fast-forward flags and options
Based on current experience, when running git clone --recurse-submodules,
developers do not expect other commands such as pull or checkout to run
recursively into active submodules. However, setting submodule.recurse=true
at this step could make for a simpler workflow by eliminating the need for
the --recurse-submodules option in subsequent commands. To collect more
data on developers' preference in regards to making submodule.recurse=true
a default config value in the future, deploy this feature under the opt in
submodule.stickyRecursiveClone flag.
Signed-off-by: Mahi Kolla <mkolla2@illinois.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If multiple independent patches are sent with send-email, even if the
"In-Reply-To" and "References" headers are not managed by --thread or
--in-reply-to, their values may be propagated from prior patches to
subsequent patches with no such headers defined.
To mitigate this and potential future issues, make sure all global
patch-specific variables are always either handled by
command-specific code (e.g. threading), or are reset to their default
values for every iteration.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Marvin Häuser <mhaeuser@posteo.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching, Git will by default print a list of all updated refs in a
nicely formatted table. In order to come up with this table, Git needs
to iterate refs twice: first to determine the maximum column width, and
a second time to actually format these changed refs.
While this table will not be printed in case the user passes `--quiet`,
we still go out of our way and do all these steps. In fact, we even do
more work compared to not passing `--quiet`: without the flag, we will
skip all references in the column width computation which have not been
updated, but if it is set we will now compute widths for all refs.
Fix this issue by completely skipping both preparation of the format and
formatting data for display in case the user passes `--quiet`, improving
performance especially with many refs. The following benchmark shows a
nice speedup for a quiet mirror-fetch in a repository with 2.3M refs:
Benchmark #1: HEAD~: git-fetch
Time (mean ± σ): 26.929 s ± 0.145 s [User: 24.194 s, System: 4.656 s]
Range (min … max): 26.692 s … 27.068 s 5 runs
Benchmark #2: HEAD: git-fetch
Time (mean ± σ): 25.189 s ± 0.094 s [User: 22.556 s, System: 4.606 s]
Range (min … max): 25.070 s … 25.314 s 5 runs
Summary
'HEAD: git-fetch' ran
1.07 ± 0.01 times faster than 'HEAD~: git-fetch'
While at it, this patch also fixes `adjust_refcol_width()` such that it
skips unchanged refs in case the user passed `--quiet`, where verbosity
will be negative. While this function won't be called anymore if so,
this brings the comment in line with actual code. Furthermore, needless
`verbosity >= 0` checks are now removed in `store_updated_refs()`: we
never print to the `note` buffer anymore in case `verbosity < 0`, so we
won't end up in that code block anyway.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call fspathcmp() instead of open-coding it. This shortens the code and
makes it less repetitive.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Every once in a while a test somehow manages to escape from its trash
directory and modifies the surrounding repository, whether because of
a bug in git itself, a bug in a test [1], or e.g. when trying to run
tests with a shell that is, in general, unable to run our tests [2].
Set GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/.." as an additional
safety measure to protect the surrounding repository at least from
modifications by git commands executed in the tests (assuming that
handling of ceiling directories during repository discovery is not
broken, and, of course, it won't save us from regular shell commands,
e.g. 'cd .. && rm -f ...').
[1] e.g. https://public-inbox.org/git/20210423051255.GD2947267@szeder.dev
[2] $ git symbolic-ref HEAD
refs/heads/master
$ ksh ./t2011-checkout-invalid-head.sh
[... a lot of "not ok" ...]
$ git symbolic-ref HEAD
refs/heads/other
(In short: 'ksh' doesn't support the 'local' builtin command,
which is used by 'test_oid', causing it to return with error
whenever it's called, leaving ZERO_OID set to empty, so when the
test 'checkout main from invalid HEAD' runs 'echo $ZERO_OID
>.git/HEAD' it writes a corrupt (not invalid) HEAD, and subsequent
git commands don't recognize the repository in the trash directory
anymore, but operate on the surrounding repo.)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix syntax and correct the format of printf in MyFirstObjectWalk.txt
Signed-off-by: Zoker <kaixuanguiqu@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Copy the 'index_state->dir_hash' back to the real istate after expanding
a sparse index.
A crash was observed in 'git status' during some hashmap lookups with
corrupted hashmap entries. During an index expansion, new cache-entries
are added to the 'index_state->name_hash' and the 'dir_hash' in a
temporary 'index_state' variable 'full'. However, only the 'name_hash'
hashmap from this temp variable was copied back into the real 'istate'
variable. The original copy of the 'dir_hash' was incorrectly
preserved. If the table in the 'full->dir_hash' hashmap were realloced,
the stale version (in 'istate') would be corrupted.
The test suite does not operate on index sizes sufficiently large to
trigger this reallocation, so they do not cover this behavior.
Increasing the test suite to cover such scale is fragile and likely
wasteful.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the original code from 08cdfb1337 (pack-objects --keep-unreachable,
2007-09-16), we add each object to the packing list with type
`obj->type`, where `obj` comes from `lookup_unknown_object()`. Unless we
had already looked up and parsed the object, this will be `OBJ_NONE`.
That's fine, since oe_set_type() sets the type_valid bit to '0', and we
determine the real type later on.
So the only thing we need from the object lookup is access to the
`flags` field so that we can mark that we've added the object with
`OBJECT_ADDED` to avoid adding it again (we can just pass `OBJ_NONE`
directly instead of grabbing it from the object).
But add_object_entry() already rejects duplicates! This has been the
behavior since 7a979d99ba (Thin pack - create packfile with missing
delta base., 2006-02-19), but 08cdfb1337 didn't take advantage of it.
Moreover, to do the OBJECT_ADDED check, we have to do a hash lookup in
`obj_hash`.
So we can drop the lookup_unknown_object() call completely, *and* the
OBJECT_ADDED flag, too, since the spot we're touching here is the only
location that checks it.
In the end, we perform the same number of hash lookups, but with the
added bonus that we don't waste memory allocating an OBJ_NONE object (if
we were traversing, we'd need it eventually, but the whole point of this
code path is not to traverse).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is used to implement `pack-objects`'s `--keep-unreachable`
option, but can be simplified in a couple of ways:
- add_objects_in_unpacked_packs() iterates over all packs (and then
all packed objects) itself, but could use for_each_packed_object()
instead since the missing flags necessary were added in the previous
commit
- objects are added to an in_pack array which store (off_t, object)
tuples, and then sorted in offset order when we could iterate
objects in offset order.
There is a slight behavior change here: before we would have added
objects in sorted offset order among _all_ packs. Handing objects to
create_object_entry() in pack order for each pack (instead of
feeding objects from all packs simultaneously their offset relative
to different packs) is much more reasonable, if different than how
the code currently works.
- objects in a single pack are iterated in index order and searched
for in order to discover their offsets, which is much less efficient
than using the on-disk reverse index
Simplify the function by addressing each of the above and moving the
core of the loop into a callback function that we then pass to
for_each_packed_object() instead of open-coding the latter function
ourselves.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next patch will reimplement a function that wants to iterate over
packed objects while ignoring packs which are marked as kept (either
in-core or on-disk).
Teach for_each_packed_object() to ignore all objects from those packs by
adding a new flag for each of the "kept" states that a pack can be in.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no other API docs in bundle.h, but this is at least a
start. We'll add a parameter to this function in a subsequent commit,
but let's start by documenting it.
The "/**" comment (as opposed to "/*") signifies the start of API
documentation. See [1] and bdfdaa4978 (strbuf.h: integrate
api-strbuf.txt documentation, 2015-01-16) and 6afbbdda33 (strbuf.h:
unify documentation comments beginnings, 2015-01-16) for a discussion
of that convention.
1. https://lore.kernel.org/git/874kbeecfu.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git branch only allows deleting branches that point to valid commits.
Skip that check if --force is given, as the caller is indicating with
it that they know what they are doing and accept the consequences.
This allows deleting dangling branches, which previously had to be
reset to a valid start-point using --force first.
Reported-by: Ulrich Windl <Ulrich.Windl@rz.uni-regensburg.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass the struct object_id on instead of just its hash member.
This is simpler and avoids the need to guess the algorithm.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only one of the callers of rev_is_head() provides two hashes to compare.
Move that check there and convert it to struct object_id.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The word "encoding" can mean a lot of things (e.g., base64 or
quoted-printable encoding in emails, HTML entities, URL encoding, and so
on). The documentation for i18n.commitEncoding and i18n.logOutputEncoding
uses the phrase "character encoding" to make this more clear.
Let's use that phrase in other places to make it clear what kind of
encoding we are talking about. This patch covers the gui.encoding
option, as well as the --encoding option for git-log, etc (in this
latter case, I word-smithed the sentence a little at the same time).
That, coupled with the mention of iconv in the --encoding description,
should make this more clear.
The other spot I looked at is the working-tree-encoding section of
gitattributes(5). But it gives specific examples of encodings that I
think make the meaning pretty clear already.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user asks for a pretty-printed commit to be converted (either
explicitly with --encoding=foo, or implicitly because the commit is
non-utf8 and we want to convert it), we pass it through iconv(). If that
fails, we fall back to showing the input verbatim, but don't tell the
user that the output may be bogus.
Let's add a warning to do so, along with a mention in the documentation
for --encoding. Two things to note about the implementation:
- we could produce the warning closer to the call to iconv() in
reencode_string_len(), which would let us relay the value of errno.
But this is not actually very helpful. reencode_string_len() does
not know we are operating on a commit, and indeed does not know that
the caller won't produce an error of its own. And the errno values
from iconv() are seldom helpful (iconv_open() only ever produces
EINVAL; perhaps EILSEQ from iconv() might be illuminating, but it
can also return EINVAL for incomplete sequences).
- if the reason for the failure is that the output charset is not
supported, then the user will see this warning for every commit we
try to display. That might be ugly and overwhelming, but on the
other hand it is making it clear that every one of them has not been
converted (and the likely outcome anyway is to re-try the command
with a supported output encoding).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'Filtering contents...' progress report from delayed checkout is
displayed even when checkout and clone are invoked with --quiet or
--no-progress. Furthermore, it is displayed unconditionally, without
first checking whether stdout is a tty. Let's fix these issues and also
add some regression tests for the two code paths that currently use
delayed checkout: unpack_trees.c:check_updates() and
builtin/checkout.c:checkout_worktree().
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git column's '--nl' option can be used to specify a "string to be
printed at the end of each line" (quoting the man page), but this
option and its mandatory argument has been parsed as OPT_INTEGER since
the introduction of the command in 7e29b8254f (Add column layout
skeleton and git-column, 2012-04-21). Consequently, any non-number
argument is rejected by parse-options, and any number other than 0
leads to segfault:
$ printf "%s\n" one two |git column --mode=plain --nl=foo
error: option `nl' expects a numerical value
$ printf "%s\n" one two |git column --mode=plain --nl=42
Segmentation fault (core dumped)
$ printf "%s\n" one two |git column --mode=plain --nl=0
one
two
Parse this option as OPT_STRING.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the initial reference advertisement, the Git server will first
announce all of its references to the client. The logic is handled in
`send_ref()`, which will allocate a new buffer for each refline it is
about to send. This is quite wasteful: instead of allocating a new
buffer each time, we can just reuse a buffer.
Improve this by passing in a buffer via the `ls_refs_data` struct which
is then reused on each reference.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add and apply a semantic patch for using xopen() instead of calling
open(2) and die() or die_errno() explicitly. This makes the error
messages more consistent and shortens the code.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the flags O_CREAT and O_EXCL are both given then open(2) is supposed
to create the file and error out if it already exists. The error
message in that case looks like this:
fatal: could not open 'foo' for writing: File exists
Without further context this is confusing: Why should the existence of
the file pose a problem? Isn't that a requirement for writing to it?
Add a more specific error message for that case to tell the user that we
actually don't expect the file to preexist, so the example becomes:
fatal: unable to create 'foo': File exists
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes it explicit how alternative ref backends should report errors in
read_raw_ref_fn.
read_raw_ref_fn needs to supply a credible errno for a number of cases. These
are primarily:
1) The files backend calls read_raw_ref from lock_raw_ref, and uses the
resulting error codes to create/remove directories as needed.
2) ENOENT should be translated in a zero OID, optionally with REF_ISBROKEN set,
returning the last successfully resolved symref. This is necessary so
read_raw_ref("HEAD") on an empty repo returns refs/heads/main (or the default branch
du-jour), and we know on which branch to create the first commit.
Make this information flow explicit by adding a failure_errno to the signature
of read_raw_ref. All errnos from the files backend are still propagated
unchanged, even though inspection suggests only ENOTDIR, EISDIR and ENOENT are
relevant.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/files-backend.c::lock_ref_oid_basic() tries to signal how it failed
to its callers using errno.
It is safe to stop setting errno here, because the callers of this
file-scope static function are
* files_copy_or_rename_ref()
* files_create_symref()
* files_reflog_expire()
None of them looks at errno after seeing a negative return from
lock_ref_oid_basic() to make any decision, and no caller of these three
functions looks at errno after they signal a failure by returning a
negative value. In particular,
* files_copy_or_rename_ref() - here, calls are followed by error()
(which performs I/O) or write_ref_to_lockfile() (which calls
parse_object() which may perform I/O)
* files_create_symref() - here, calls are followed by error() or
create_symref_locked() (which performs I/O and does not inspect
errno)
* files_reflog_expire() - here, calls are followed by error() or
refs_reflog_exists() (which calls a function in a vtable that is not
documented to use and/or preserve errno)
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit does not change code; it documents the fact that an alternate ref
backend does not need to return EINVAL from read_raw_ref_fn to function
properly.
This is correct, because refs_read_raw_ref is only called from;
* resolve_ref_unsafe(), which does not care for the EINVAL errno result.
* refs_verify_refname_available(), which does not inspect errno.
* files-backend.c, where errno is overwritten on failure.
* packed-backend.c (is_packed_transaction_needed), which calls it for the
packed ref backend, which never emits EINVAL.
A grep for EINVAL */*c reveals that no code checks errno against EINVAL after
reading references. In addition, the refs.h file does not mention errno at all.
A grep over resolve_ref_unsafe() turned up the following callers that inspect
errno:
* sequencer.c::print_commit_summary, which uses it for die_errno
* lock_ref_oid_basic(), which only treats EISDIR and ENOTDIR specially.
The files ref backend does use EINVAL. The files backend does not call into
the generic API (refs_read_raw), but into the files-specific function
(files_read_raw_ref), which we are not changing in this commit.
As the errno sideband is unintuitive and error-prone, remove EINVAL
value, as a step towards getting rid of the errno sideband altogether.
Spotted by Ævar Arnfjörð Bjarmason <avarab@gmail.com>.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the raceproof_create_file() API added to cache.h and
object-file.c in 177978f56a (raceproof_create_file(): new function,
2017-01-06) to its only user, refs/files-backend.c.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a follow-up to the preceding commit where we removed the adjacent
"errno == EISDIR" condition in the same function, remove the
"last_errno != ENOTDIR" condition here.
It's not possible for us to hit this condition added in
5b2d8d6f21 (lock_ref_sha1_basic(): improve diagnostics for ref D/F
conflicts, 2015-05-11). Since a1c1d8170d (refs_resolve_ref_unsafe:
handle d/f conflicts for writes, 2017-10-06) we've explicitly caught
these in refs_resolve_ref_unsafe() before returning NULL:
if (errno != ENOENT &&
errno != EISDIR &&
errno != ENOTDIR)
return NULL;
We'd then always return the refname from refs_resolve_ref_unsafe()
even if we were in a broken state as explained in the preceding
commit. The elided context here is a call to refs_resolve_ref_unsafe().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we lock a reference like "foo" we need to handle the case where
"foo" exists, but is an empty directory. That's what this code added
in bc7127ef0f (ref locking: allow 'foo' when 'foo/bar' used to exist
but not anymore., 2006-09-30) seems like it should be dealing with.
Except it doesn't, and we never take this branch. The reason is that
when bc7127ef0f was written this looked like:
ref = resolve_ref([...]);
if (!ref && errno == EISDIR) {
[...]
And in resolve_ref() we had this code:
fd = open(path, O_RDONLY);
if (fd < 0)
return NULL;
I.e. we would attempt to read "foo" with open(), which would fail with
EISDIR and we'd return NULL. We'd then take this branch, call
remove_empty_directories() and continue.
Since a1c1d8170d (refs_resolve_ref_unsafe: handle d/f conflicts for
writes, 2017-10-06) we don't. E.g. in the case of
files_copy_or_rename_ref() our callstack will look something like:
[...] ->
files_copy_or_rename_ref() ->
lock_ref_oid_basic() ->
refs_resolve_ref_unsafe()
At that point the first (now only) refs_resolve_ref_unsafe() call in
lock_ref_oid_basic() would do the equivalent of this in the resulting
call to refs_read_raw_ref() in refs_resolve_ref_unsafe():
/* Via refs_read_raw_ref() */
fd = open(path, O_RDONLY);
if (fd < 0)
/* get errno == EISDIR */
/* later, in refs_resolve_ref_unsafe() */
if ([...] && errno != EISDIR)
return NULL;
[...]
/* returns the refs/heads/foo to the caller, even though it's a directory */
return refname;
I.e. even though we got an "errno == EISDIR" we won't take this
branch, since in cases of EISDIR "resolved" is always
non-NULL. I.e. we pretend at this point as though everything's OK and
there is no "foo" directory.
We then proceed with the entire ref update and don't call
remove_empty_directories() until we call commit_ref_update(). See
5387c0d883 (commit_ref(): if there is an empty dir in the way, delete
it, 2016-05-05) for the addition of that code, and
a1c1d8170d (refs_resolve_ref_unsafe: handle d/f conflicts for writes,
2017-10-06) for the commit that changed the original codepath added in
bc7127ef0f to use this "EISDIR" handling.
Further historical commentary:
Before the two preceding commits the caller in files_reflog_expire()
was the only one out of our 4 callers that would pass non-NULL as an
oid. We would then set a (now gone) "resolve_flags" to
"RESOLVE_REF_READING" and just before that "errno != EISDIR" check do:
if (resolve_flags & RESOLVE_REF_READING)
return NULL;
There may have been some case where this ended up mattering and we
couldn't safely make this change before we removed the "oid"
parameter, but I don't think there was, see [1] for some discussion on
that.
In any case, now that we've removed the "oid" parameter in a preceding
commit we can be sure that this code is redundant, so let's remove it.
1. http://lore.kernel.org/git/871r801yp6.fsf@evledraar.gmail.com
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commit the last caller that passed a non-NULL OID was
changed to pass NULL to lock_ref_oid_basic(). As noted in preceding
commits use of this API has been going away (we should use ref
transactions, or lock_raw_ref()), so we're unlikely to gain new
callers that want to pass the "oid".
So let's remove it, doing so means we can remove the "mustexist"
condition, and therefore anything except the "flags =
RESOLVE_REF_NO_RECURSE" case.
Furthermore, since the verify_lock() function we called did most of
its work when the "oid" was passed (as "old_oid") we can inline the
trivial part of it that remains in its only remaining caller. Without
a NULL "oid" passed it was equivalent to calling refs_read_ref_full()
followed by oidclr().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the the preceding commit the "oid" parameter to reflog_expire()
is always NULL, but it was not cleaned up to reduce the size of the
diff. Let's do that subsequent API and documentation cleanup now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During reflog expiry, the cmd_reflog_expire() function first iterates
over all reflogs in logs/*, and then one-by-one acquires the lock for
each one and expires it. This behavior has been with us since this
command was implemented in 4264dc15e1 ("git reflog expire",
2006-12-19).
Change this to stop calling lock_ref_oid_basic() with the OID we saw
when we looped over the logs, instead have it pass the OID it managed
to lock.
This mostly mitigates a race condition where e.g. "git gc" will fail
in a concurrently updated repository because the branch moved since
"git reflog expire --all" was started. I.e. with:
error: cannot lock ref '<refname>': ref '<refname>' is at <OID-A> but expected <OID-B>
This behavior of passing in an "oid" was needed for an edge-case that
I've untangled in this and preceding commits though, namely that we
needed this OID because we'd:
1. Lookup the reflog name/OID via dwim_log()
2. With that OID, lock the reflog
3. Later in builtin/reflog.c we use the OID we looked as input to
lookup_commit_reference_gently(), assured that it's equal to the
OID we got from dwim_log().
We can be sure that this change is safe to make because between
dwim_log (step #1) and lock_ref_oid_basic (step #2) there was no other
logic relevant to the OID or expiry run in the cmd_reflog_expire()
caller.
We can thus treat that code as a black box, before and after this
change it would get an OID that's been locked, the only difference is
that now we mostly won't be failing to get the lock due to the TOCTOU
race[0]. That failure was purely an implementation detail in how the
"current OID" was looked up, it was divorced from the locking
mechanism.
What do we mean with "mostly"? It mostly mitigates it because we'll
still run into cases where the ref is locked and being updated as we
want to expire it, and other git processes wanting to update the refs
will in turn race with us as we expire the reflog.
That remaining race can in turn be mitigated with the
core.filesRefLockTimeout setting, see 4ff0f01cb7 ("refs: retry
acquiring reference locks for 100ms", 2017-08-21). In practice if that
value is high enough we'll probably never have ref updates or reflog
expiry failing, since the clients involved will retry for far longer
than the time any of those operations could take.
See [1] for an initial report of how this impacted "git gc" and a
large discussion about this change in early 2019. In particular patch
looked good to Michael Haggerty, see his[2]. That message seems to not
have made it to the ML archive, its content is quoted in full in my
[3].
I'm leaving behind now-unused code the refs API etc. that takes the
now-NULL "unused_oid" argument, and other code that can be simplified now
that we never have on OID in that context, that'll be cleaned up in
subsequent commits, but for now let's narrowly focus on fixing the
"git gc" issue. As the modified assert() shows we always pass a NULL
oid to reflog_expire() now.
Unfortunately this sort of probabilistic contention is hard to turn
into a test. I've tested this by running the following three subshells
in concurrent terminals:
(
rm -rf /tmp/git &&
git init /tmp/git &&
while true
do
head -c 10 /dev/urandom | hexdump >/tmp/git/out &&
git -C /tmp/git add out &&
git -C /tmp/git commit -m"out"
done
)
(
rm -rf /tmp/git-clone &&
git clone file:///tmp/git /tmp/git-clone &&
while git -C /tmp/git-clone pull
do
date
done
)
(
while git -C /tmp/git-clone reflog expire --all
do
date
done
)
Before this change the "reflog expire" would fail really quickly with
the "but expected" error noted above.
After this change both the "pull" and "reflog expire" will run for a
while, but eventually fail because I get unlucky with
core.filesRefLockTimeout (the "reflog expire" is in a really tight
loop). As noted above that can in turn be mitigated with higher values
of core.filesRefLockTimeout than the 100ms default.
As noted in the commentary added in the preceding commit there's also
the case of branches being racily deleted, that can be tested by
adding this to the above:
(
while git -C /tmp/git-clone branch topic master &&
git -C /tmp/git-clone branch -D topic
do
date
done
)
With core.filesRefLockTimeout set to 10 seconds (it can probably be a
lot lower) I managed to run all four of these concurrently for about
an hour, and accumulated ~125k commits, auto-gc's and all, and didn't
have a single failure. The loops visibly stall while waiting for the
lock, but that's expected and desired behavior.
0. https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
1. https://lore.kernel.org/git/87tvg7brlm.fsf@evledraar.gmail.com/
2. http://lore.kernel.org/git/b870a17d-2103-41b8-3cbc-7389d5fff33a@alum.mit.edu
3. https://lore.kernel.org/git/87pnqkco8v.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a comment about why it is that we need to check for the the
existence of a reflog we're deleting after we've successfully acquired
the lock in files_reflog_expire(). As noted in [1] the lock protocol
for reflogs is somewhat intuitive.
This early exit code the comment applies to dates all the way back to
4264dc15e1 (git reflog expire, 2006-12-19).
1. https://lore.kernel.org/git/54DCDA42.2060800@alum.mit.edu/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the repo_dwim_log() function initially added as dwim_log() in
eb3a48221f (log --reflog: use dwim_log, 2007-02-09) to accept a NULL
oid parameter. The refs_resolve_ref_unsafe() function it invokes
already deals with it, but it didn't.
This allows for a bit more clarity in a reflog-walk.c codepath added
in f2eba66d4d (Enable HEAD@{...} and make it independent from the
current branch, 2007-02-03). We'll shortly use this in
builtin/reflog.c as well.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Re-indent this argument list that's been mis-indented since it was
added in 34c319970d (refs/debug: trace into reflog expiry too,
2021-04-23). This makes a subsequent change smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the unused "skip" parameter to lock_raw_ref(), it was never
used. We do use it when passing "skip" to the
refs_rename_ref_available() function in files_copy_or_rename_ref(),
but not here.
This is part of a larger series that modifies lock_ref_oid_basic()
extensively, there will be no more modifications of this function in
this series, but since the preceding commit removed this unused
parameter from lock_ref_oid_basic(), let's do it here too for
consistency.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The lock_ref_oid_basic() function has gradually been replaced by use
of the file transaction API, there are only 4 remaining callers of
it.
None of those callers pass non-NULL "extras" and "skip" parameters,
the last such caller went away in 92b1551b1d (refs: resolve symbolic
refs first, 2016-04-25), so let's remove the parameters.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the last commit we removed the REF_DELETING flag from
lock_ref_oid_basic(). Since then all of the remaining callers do pass
REF_NO_DEREF, but that has been ignored completely since
7a418f3a17 (lock_ref_sha1_basic(): only handle REF_NODEREF mode,
2016-04-22).
So we can simply get rid of the parameter entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the squashing of the advice.graftFileDeprecated advice over to an
external variable in commit.[ch], allowing advice() to purely use the
new-style API of invoking advice() with an enum.
See 8821e90a09 (advice: don't pointlessly suggest
--convert-graft-file, 2018-11-27) for why quieting this advice was
needed. It's more straightforward to move this code to commit.[ch] and
use it builtin/replace.c, than to go through the indirection of
advice.[ch].
Because this was the last advice_config variable we can remove that
old facility from advice.c.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The external use of this variable was added in 532139940c (add: warn
when adding an embedded repository, 2017-06-14). For the use-case it's
more straightforward to track whether we've shown advice in
check_embedded_repo() than setting the global variable.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c4a09cc9cc (Merge branch 'hw/advise-ng', 2020-03-25), a new API for
accessing advice variables was introduced and deprecated `advice_config`
in favor of a new array, `advice_setting`.
This patch ports all but two uses which read the status of the global
`advice_` variables over to the new `advice_enabled` API. We'll deal
with advice_add_embedded_repo and advice_graft_file_deprecated
separately.
Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In daef1b300b (Merge branch 'hw/advice-add-nothing', 2020-02-14), two
advice settings were introduced into the `advice_config` array.
Subsequently, c4a09cc9cc (Merge branch 'hw/advise-ng', 2020-03-25)
started to deprecate `advice_config` in favor of a new array,
`advice_setting`.
However, the latter branch did not include the former branch, and
therefore `advice_setting` is missing the two entries added by the
`hw/advice-add-nothing` branch.
These are currently the only entries in `advice_config` missing from
`advice_setting`.
Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For diff family commands, we can tell them to exclude changes outside
of some directories if --relative is requested.
In diff_unmerge(), NULL will be returned if the requested path is
outside of the interesting directories, thus we'll run into NULL
pointer dereference in run_diff_files when trying to dereference
its return value.
Checking for return value of diff_unmerge before dereferencing
is not sufficient, though. Since, diff engine will try to work on such
pathspec later.
Let's not run diff on those unintesting entries, instead.
As a side effect, by skipping like that, we can save some CPU cycles.
Reported-by: Thomas De Zeeuw <thomas@slight.dev>
Tested-by: Carlo Arenas <carenas@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In test_atom(), we're piping the output of cat-file to tail(1),
thus, losing its exit status.
Let's use a temporary file to preserve git exit status code.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t6300, some tests are guarded behind some prerequisites.
Thus, objects created by those tests ain't available if those
prerequisites are unsatistified. Attempting to run "cat-file"
on those objects will run into failure.
In fact, running t6300 in an environment without gpg(1),
we'll see those warnings:
fatal: Not a valid object name refs/tags/signed-empty
fatal: Not a valid object name refs/tags/signed-short
fatal: Not a valid object name refs/tags/signed-long
Let's put those commands into the real tests, in order to:
* skip their execution if prerequisites aren't satistified.
* check their exit status code
The expected value for objects with type: commit needs to be
computed outside the test because we can't rely on "$3" there.
Furthermore, to prevent the accidental usage of that computed
expected value, BUG out on unknown object's type.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 0696232390 (pack-redundant: fix crash when one packfile in repo,
2020-12-16) added one some new tests to t5323. At the time, the sub-repo
we used was called "master". But in a parallel branch, this was switched
to "main".
When the latter branch was merged in 27d7c8599b (Merge branch
'js/default-branch-name-tests-final-stretch', 2021-01-25), some of those
spots caused textual conflicts, but some (for tests that were far enough
away from other changed code) were just semantic. The merge resolution
fixed up most spots, but missed this one.
Even though this did impact actual code, it turned out not to fail the
tests. Running 'cd "$master_repo"' ended up staying in the same
directory, running the test in the main trash repo instead of the
sub-repo. But because the point of the test is checking behavior when
there are no packfiles, it worked in either repo (since both are empty
at this point in the script).
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Loading of ref tips to prepare for common ancestry negotiation in
"git fetch-pack" has been optimized by taking advantage of the
commit graph when available.
* ps/fetch-pack-load-refs-optim:
fetch-pack: speed up loading of refs via commit graph
Bugfix for common ancestor negotiation recently introduced in "git
push" code path.
* jt/push-negotiation-fixes:
fetch: die on invalid --negotiation-tip hash
send-pack: fix push nego. when remote has refs
send-pack: fix push.negotiate with remote helper
trace2 logs learned to show parent process name to see in what
context Git was invoked.
* es/trace2-log-parent-process-name:
tr2: log parent process name
tr2: make process info collection platform-generic
A handful of tests that assumed implementation details of files
backend for refs have been cleaned up.
* hn/refs-test-cleanup:
t6001: avoid direct file system access
t6500: use "ls -1" to snapshot ref database state
t7064: use update-ref -d to remove upstream branch
t1410: mark test as REFFILES
t1405: mark test for 'git pack-refs' as REFFILES
t1405: use 'git reflog exists' to check reflog existence
t2402: use ref-store test helper to create broken symlink
t3320: use git-symbolic-ref rather than filesystem access
t6120: use git-update-ref rather than filesystem access
t1503: mark symlink test as REFFILES
t6050: use git-update-ref rather than filesystem access
Final batch for "merge -sort" optimization.
* en/ort-perf-batch-15:
merge-ort: remove compile-time ability to turn off usage of memory pools
merge-ort: reuse path strings in pool_alloc_filespec
merge-ort: store filepairs and filespecs in our mem_pool
diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc
merge-ort: switch our strmaps over to using memory pools
merge-ort: set up a memory pool
merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers
diffcore-rename: use a mem_pool for exact rename detection's hashmap
merge-ort: rename str{map,intmap,set}_func()
Pathname expansion (like "~username/") learned a way to specify a
location relative to Git installation (e.g. its $sharedir which is
$(prefix)/share), with "%(prefix)".
* js/expand-runtime-prefix:
expand_user_path: allow in-flight topics to keep using the old name
interpolate_path(): allow specifying paths relative to the runtime prefix
Use a better name for the function interpolating paths
expand_user_path(): clarify the role of the `real_home` parameter
expand_user_path(): remove stale part of the comment
tests: exercise the RUNTIME_PREFIX feature
Prepare the "ref-filter" machinery that drives the "--format"
option of "git for-each-ref" and its friends to be used in "git
cat-file --batch".
* zh/ref-filter-raw-data:
ref-filter: add %(rest) atom
ref-filter: use non-const ref_format in *_atom_parser()
ref-filter: --format=%(raw) support --perl
ref-filter: add %(raw) atom
ref-filter: add obj-type check in grab contents
Input validation of "git pack-objects --stdin-packs" has been
corrected.
* ab/pack-stdin-packs-fix:
pack-objects: fix segfault in --stdin-packs option
pack-objects tests: cover blindspots in stdin handling
Support for ancient versions of cURL library (pre 7.19.4) has been
dropped.
* ab/http-drop-old-curl:
http: rename CURLOPT_FILE to CURLOPT_WRITEDATA
http: drop support for curl < 7.19.3 and < 7.17.0 (again)
http: drop support for curl < 7.19.4
http: drop support for curl < 7.16.0
http: drop support for curl < 7.11.1
"git add" can work better with the sparse index.
* ds/add-with-sparse-index:
add: remove ensure_full_index() with --renormalize
add: ignore outside the sparse-checkout in refresh()
pathspec: stop calling ensure_full_index
add: allow operating on a sparse-only index
t1092: test merge conflicts outside cone
"git bisect" spawned "git show-branch" only to pretty-print the
title of the commit after checking out the next version to be
tested; this has been rewritten in C.
* jc/bisect-sans-show-branch:
bisect: simplify return code from bisect_checkout()
bisect: do not run show-branch just to show the current commit
The die() routine adds a "fatal: " prefix, there is no reason to add
another one. Fixes code added in e65123a71d (builtin rebase: support
`git rebase <upstream> <switch-to>`, 2018-09-04).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set packet_trace_identity() for ls-remote. This replaces the generic
"git" identity in GIT_TRACE_PACKET=<file> traces to "ls-remote", e.g.:
[...] packet: upload-pack> version 2
[...] packet: upload-pack> agent=git/2.32.0-dev
[...] packet: ls-remote< version 2
[...] packet: ls-remote< agent=git/2.32.0-dev
Where in an "git ls-remote file://<path>" dialog ">" is the sender (or
"to the server") and "<" is the recipient (or "received by the
client").
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
xmalloc() dies on error, allows zero-sized allocations and enforces
GIT_ALLOC_LIMIT for testing. Our mmap replacement doesn't need any of
that. Let's cut out the wrapper, reject zero-sized requests as required
by POSIX and use malloc(3) directly. Allocation errors were needlessly
handled by git_mmap() before; this code becomes reachable now.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On macOS, we use launchctl to manage the background maintenance
schedule. This uses a set of .plist files to describe the schedule, but
these files are also registered with 'launchctl bootstrap'. If multiple
'git maintenance start' commands run concurrently, then they can collide
replacing these schedule files and registering them with launchctl.
To avoid extra launchctl commands, do a check for the .plist files on
disk and check if they are registered using 'launchctl list <name>'.
This command will return with exit code 0 if it exists, or exit code 113
if it does not.
We can test this behavior using the GIT_TEST_MAINT_SCHEDULER environment
variable.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When two `git maintenance` processes try to write the `.plist` file, we
need to help them with serializing their efforts.
The 150ms time-out value was determined from thin air.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, $(pwd) returns a drive-letter style path C:/foo, while $PWD
contains a POSIX style /c/foo path. When we want to interpolate the
current directory in the PATH variable, we must not use the C:/foo style,
because the meaning of the colon is ambiguous. Use the POSIX style.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The variable D is never defined in test t5582, more severely the test
fails if D is defined by something outside the test suite, so remove
this spurious line.
Signed-off-by: Mickey Endito <mickey.endito.2323@protonmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new submodule--helper subcommand `run-update-procedure` that runs
the update procedure if the SHA1 of the submodule does not match what
the superproject expects.
This is an intermediate change that works towards total conversion of
`submodule update` from shell to C.
Specific error codes are returned so that the shell script calling the
subcommand can take a decision on the control flow, and preserve the
error messages across subsequent recursive calls of `cmd_update`.
This change is more focused on doing a faithful conversion, so for now we
are not too concerned with trying to reduce subprocess spawns.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the technical documentation to describe the multi-pack bitmap
format. This patch merely introduces the new format, and describes its
high-level ideas. Git does not yet know how to read nor write these
multi-pack variants, and so the subsequent patches will:
- Introduce code to interpret multi-pack bitmaps, according to this
document.
- Then, introduce code to write multi-pack bitmaps from the 'git
multi-pack-index write' sub-command.
Finally, the implementation will gain tests in subsequent patches (as
opposed to inline with the patch teaching Git how to write multi-pack
bitmaps) to avoid a cyclic dependency.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a new bitmap, the bitmap writer code attempts to read the
existing bitmap (if one is present). This is done in order to quickly
permute the bits of any bitmaps for commits which appear in the existing
bitmap, and were also selected for the new bitmap.
But since this code was added in 341fa34887 (pack-bitmap-write: use
existing bitmaps, 2020-12-08), the resources associated with opening an
existing bitmap were never released.
It's fine to ignore this, but it's bad hygiene. It will also cause a
problem for the multi-pack-index builtin, which will be responsible not
only for writing bitmaps, but also for expiring any old multi-pack
bitmaps.
If an existing bitmap was reused here, it will also be expired. That
will cause a problem on platforms which require file resources to be
closed before unlinking them, like Windows. Avoid this by ensuring we
close reused bitmaps with free_bitmap_index() before removing them.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The set of objects covered by a bitmap must be closed under
reachability, since it must be the case that there is a valid bit
position assigned for every possible reachable object (otherwise the
bitmaps would be incomplete).
Pack bitmaps are never written from 'git repack' unless repacking
all-into-one, and so we never write non-closed bitmaps (except in the
case of partial clones where we aren't guaranteed to have all objects).
But multi-pack bitmaps change this, since it isn't known whether the
set of objects in the MIDX is closed under reachability until walking
them. Plumb through a bit that is set when a reachable object isn't
found.
As soon as a reachable object isn't found in the set of objects to
include in the bitmap, bitmap_writer_build() knows that the set is not
closed, and so it now fails gracefully.
A test is added in t0410 to trigger a bitmap write without full
reachability closure by removing local copies of some reachable objects
from a promisor remote.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The special `--test-bitmap` mode of `git rev-list` is used to compare
the result of an object traversal with a bitmap to check its integrity.
This mode does not, however, assert that the types of reachable objects
are stored correctly.
Harden this mode by teaching it to also check that each time an object's
bit is marked, the corresponding bit should be set in exactly one of the
type bitmaps (whose type matches the object's true type).
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git cherry-pick", upon seeing a conflict, says:
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
as if running "git commit" to conclude the resolution of
this single step were the end of the story. This stems from
the fact that the command originally was to pick a single
commit and not a range of commits, and the message was
written back then and has not been adjusted.
When picking a range of commits and the command stops with a
conflict in the middle of the range, however, after
resolving the conflict and (optionally) recording the result
with "git commit", the user has to run "git cherry-pick
--continue" to have the rest of the range dealt with,
"--skip" to drop the current commit, or "--abort" to discard
the series.
Suggest use of "git cherry-pick --continue/--skip/--abort"
so that the message also covers the case where a range of
commits are being picked.
Similarly, this optimization can be applied to git revert,
suggest use of "git revert --continue/--skip/--abort" so
that the message also covers the case where a range of
commits are being reverted.
It is worth mentioning that now we use advice() to print
the content of GIT_CHERRY_PICK_HELP in print_advice(), each
line of output will start with "hint: ".
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a rebase is started with a --strategy option other than "ort" or
"recursive" then "merge -c" does not allow the user to reword the
commit message.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fast-forwarding we do not create a new commit so .git/MERGE_MSG
is not removed and can end up seeding the message of a commit made
after the rebase has finished. Avoid writing .git/MERGE_MSG when we
are fast-forwarding by writing the file after the fast-forward
checks. Note that there are no changes to the fast-forward code, it is
simply moved.
Note that the way this change is implemented means we no longer write
the author script when fast-forwarding either. I believe this is safe
for the reasons below but it is a departure from what we do when
fast-forwarding a non-merge commit. If we reword the merge then 'git
commit --amend' will keep the authorship of the commit we're rewording
as it ignores GIT_AUTHOR_* unless --reset-author is passed. It will
also export the correct GIT_AUTHOR_* variables to any hooks and we
already test the authorship of the reworded commit. If we are not
rewording then we no longer call spilt_ident() which means we are no
longer checking the commit author header looks sane. However this is
what we already do when fast-forwarding non-merge commits in
skip_unnecessary_picks() so I don't think we're breaking any promises
by not checking the author here.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
None of the existing reword tests check that there are no uncommitted
changes when the editor is opened. Reuse the editor script from the
last commit to fix this omission.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce warnings reported by git-po-helper to 23 which look like false
positives. Since Bulgarian uses Cyrillic we mark substitutable parts
in capitals rather than quoting with <> which saves two chars per term
(2.5% real estate on a 80 char line) plus it is safer and simpler as
the <> chars have special (sometimes destructive) meaning for the
shells. It is also easier for the user to parse. See for example
the message:
git mailinfo [<options>] <msg> <patch> < mail >info
where both <> characters serve multiple functions.
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
If the user runs git log while rewording a commit it is confusing if
sometimes we're amending the commit that's being reworded and at other
times we're creating a new commit depending on whether we could
fast-forward or not[1]. For this reason the reword command ensures
that there are no uncommitted changes when rewording. The reword
command also allows the user to edit the todo list while the rebase is
paused. As 'merge -c' also rewords commits make it behave like reword
and add a test.
[1] https://lore.kernel.org/git/xmqqlfvu4be3.fsf@gitster-ct.c.googlers.com/T/#m133009cb91cf0917bcf667300f061178be56680a
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The lock_ref_oid_basic() function has gradually been replaced by
most callers no longer performing a low-level "acquire lock,
update and release", and instead using the ref transaction API.
So there are only 4 remaining callers of lock_ref_oid_basic().
None of those callers pass REF_DELETING anymore, the last caller went
away in 92b1551b1d (refs: resolve symbolic refs first,
2016-04-25).
Before that we'd refactored and moved this code in:
- 8df4e51138 (struct ref_update: move "have_old" into "flags",
2015-02-17)
- 7bd9bcf372 (refs: split filesystem-based refs code into a new
file, 2015-11-09)
- 165056b2fc (lock_ref_for_update(): new function, 2016-04-24)
We then finally stopped using it in 92b1551b1d (noted above). So let's
remove the handling of this parameter.
By itself this change doesn't benefit us much, but it's the start of
even more removal of unused code in and around this function in
subsequent commits.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In e0cc8ac820 (packed_ref_store: make class into a subclass of
`ref_store`, 2017-06-23) a die() was added to packed_create_reflog(),
but not to any of the other reflog callbacks, let's do that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The rules creating the $(LIB_FILE) and $(XDIFF_LIB) archives used to
be:
$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
until commit 7b76d6bf22 (Makefile: add and use the ".DELETE_ON_ERROR"
flag, 2021-06-29) removed the '$(RM) $@' part, claiming that "we can
rely on the "c" (create) being present in ARFLAGS", and (I presume)
assuming that it means that the named archive is created from scratch.
Unfortunately, that's not what the 'c' flag does, it merely "Suppress
the diagnostic message that is written to standard error by default
when the archive is created" [1]. Consequently, all object files that
are already present in an existing archive and are not replaced will
remain there. This leads to linker errors in back-to-back builds of
different revisions without a 'make clean' between them if source
files going into these archives are renamed in between:
# The last commit renaming files that go into 'libgit.a':
# bc62692757 (hash-lookup: rename from sha1-lookup, 2020-12-31)
# sha1-lookup.c => hash-lookup.c | 14 +++++++-------
# sha1-lookup.h => hash-lookup.h | 12 ++++++------
$ git checkout bc62692757^
HEAD is now at 7a7d992d0d sha1-lookup: rename `sha1_pos()` as `hash_pos()`
$ make
[...]
$ git checkout 7b76d6bf22
HEAD is now at 7b76d6bf22 Makefile: add and use the ".DELETE_ON_ERROR" flag
$ make
[...]
AR libgit.a
LINK git
/usr/bin/ld: libgit.a(hash-lookup.o): in function `bsearch_hash':
/home/szeder/src/git/hash-lookup.c:105: multiple definition of `bsearch_hash'; libgit.a(sha1-lookup.o):/home/szeder/src/git/sha1-lookup.c:105: first defined here
collect2: error: ld returned 1 exit status
make: *** [Makefile:2213: git] Error 1
Restore the original make rules to first remove $(LIB_FILE) and
$(XDIFF_LIB) and then create them from scratch to avoid these build
errors.
[1] Quoting POSIX at:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ar.html
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cleanup of old compat wrappers in bash completion caused a
regression on tcsh completion that still uses them.
Let's update the tcsh call site as well for addressing it.
Fixes: 441ecdab37 ("completion: bash: remove old compat wrappers")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
__gitcomp automatically adds a suffix, but __gitcomp_nl and others
don't, we need to specify a space by default.
Can be tested with:
git config branch.autoSetupMe<tab>
This fix only works for versions of bash greater than 4.0, before that
"local sfx" creates an empty string, therefore the unset expansion
doesn't work. The same happens in zsh.
Therefore we don't add the test for that for now.
The correct fix for all shells requires semantic changes in __gitcomp,
but that can be done later.
Cc: SZEDER Gábor <szeder.dev@gmail.com>
Tested-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We need to ignore options that don't start with -- as well.
Depending on the value of COMP_WORDBREAKS the last word could be
duplicated otherwise.
Can be tested with:
git merge -X diff-algorithm=<tab>
Tested-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Otherwise we are completely ignoring the --cur argument.
The issue can be tested with:
git clone --config=branch.<tab>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Tested-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
e9f79acb28 (ci: upgrade to using actions/{up,down}load-artifacts v2,
2021-06-23) changed all calls to that action from v1 to v2, but there
is still an open bug[1] that affects all nodejs actions and prevents
its use in 32-bit linux (as used by the Linux32 container)
move all dockerized jobs to use v1 that was built in C# and therefore
doesn't have this problem, which will otherwise manifest with confusing
messages like:
/usr/bin/docker exec 0285adacc4536b7cd962079c46f85fa05a71e66d7905b5e4b9b1a0e8b305722a sh -c "cat /etc/*release | grep ^ID"
OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: no such file or directory: unknown
[1] https://github.com/actions/runner/issues/1011
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent changes to --fixup, adding amend suboption, caused the
--edit flag to be ignored as use_editor was always set to zero.
Restore edit_flag having higher priority than fixup_message when
deciding the value of use_editor by moving the edit flag condition
later in the method.
Signed-off-by: Joel Klinghed <the_jk@spawned.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user skips the final commit by removing all the changes from
the index and worktree with 'git restore' (or read-tree) and then runs
'git rebase --continue' .git/MERGE_MSG is left behind. This will seed
the commit message the next time the user commits which is not what we
want to happen.
Reported-by: Victor Gambier <vgambier@excilys.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
980b482d28 ("rebase tests: mark tests specific to the am-backend with
--am", 2020-02-15) sought to prepare tests testing the "apply" backend
in preparation for 2ac0d6273f ("rebase: change the default backend
from "am" to "merge"", 2020-02-15). However some tests seem to have
been missed leading to us testing the "merge" backend twice. This
patch fixes some cases that I noticed while adding tests to these
files, I have not audited all the other rebase test files. I've
reworded a couple of the test descriptions to make it clear which
backend they are testing.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Setting GIT_AUTHOR_* when committing with --amend will only change the
author if we also pass --reset-author. This commit is used in some
tests that ensure the author ident does not change when rebasing.
Creating this commit without changing the authorship meant that the
test would not catch regressions that caused rebase to discard the
original authorship information.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
make sure it uses a supported OS branch and uses all the resources
that can be allocated efficiently.
while only 1GB of memory is needed, 2GB is the minimum for a 2 CPU
machine (the default), but by increasing parallelism wall time has
been reduced by 35%.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Function `traverse_trees_and_blobs` not only works on trees and blobs,
but also on tags, the function name is somewhat misleading. This commit
rename it to `traverse_non_commits`.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this test, we currently use the SHA1 prerequisite to specify the
algorithm we're using to test, since SHA-256 bundles are always v3,
whereas SHA-1 bundles default to v2, and as a result the default output
differs.
However, this causes a problem if we run with GIT_TEST_FAIL_PREREQS set,
since that means that we'll unexpectedly fail the SHA1 prerequisite,
resulting in incorrect expected output. Let's fix this by checking
against the built-in data called "algo", which tells us which algorithm
is in use. This should work in any situation, making our test a little
more robust.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
similar to the recently added sparse task, it is nice to know as early
as possible.
add a dockerized build using fedora (that usually has the latest gcc)
to be ahead of the curve and avoid older ISO C issues at the same time.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, the git diff hunk headers show the wrong method signature if the
method has a qualified return type, an array return type, or a generic return
type because the regex doesn't allow dots (.), [], or < and > in the return
type. Also, type parameter declarations couldn't be matched.
Add several t4018 tests asserting the right hunk headers for different cases:
- enum constant change
- change in generic method with bounded type parameters
- change in generic method with wildcard
- field change in a nested class
Signed-off-by: Tassilo Horn <tsdh@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remind developers that they do not need to go overboard to implement
patterns to prepare for invalid constructs. They only have to be
sufficiently permissive, assuming that the payload is syntactically
correct, and that may allow them to be simpler.
Text stolen mostly from, and further improved by, Johannes Sixt.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ds/add-with-sparse-index:
add: remove ensure_full_index() with --renormalize
add: ignore outside the sparse-checkout in refresh()
pathspec: stop calling ensure_full_index
add: allow operating on a sparse-only index
t1092: test merge conflicts outside cone
It is useful for performance monitoring and debugging purposes to know
the wire protocol used for remote operations. This may differ from the
version set in local configuration due to differences in version and/or
configuration between the server and the client. Therefore, log the
negotiated wire protocol version via trace2, for both clients and
servers.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's rename 'compute_submodule_clone_url()' to 'resolve_relative_url()'
to make it clear that this internal helper need not be used exclusively
for computing submodule clone URLs.
Since the original 'resolve-relative-url' subcommand and its C entry
point has been removed in c461095ae3 (submodule--helper: remove
resolve-relative-url subcommand, 2021-07-02), this rename can be done
without causing any confusion about which function it actually binds to.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The shell subcommand `resolve-relative-url` is no longer required, as
its last caller has been removed when it was converted to C.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also no longer needed is this subcommand, as all of its functionality is
being called by the newly-introduced `module_add()` directly within C.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We no longer need this subcommand, as all of its functionality is being
called by the newly-introduced `module_add()` directly within C.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the 'add' subcommand to `submodule--helper.c` that does all
the work 'submodule add' past the parsing of flags.
We also remove the constness of the sm_path field of the `add_data`
struct. This is needed so that it can be modified by
normalize_path_copy().
As with the previous conversions, this is meant to be a faithful
conversion with no modification to the behaviour of `submodule add`.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions can be useful to other parts of Git. Let's move them to
dir.c, while renaming them to be make their functionality more explicit.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This part of `sync_submodule()` is doing the same thing that
`compute_submodule_clone_url()` is doing. Let's reuse that helper here.
Note that this change adds a small overhead where we allocate and free
the 'remote' twice, but that is a small price to pay for the higher
level of abstraction we get.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the helper function to resolve a relative url, by reusing the
existing `compute_submodule_clone_url()` function.
`compute_submodule_clone_url()` performs the same work that
`resolve_relative_url()` is doing, so we eliminate this code repetition
by moving the former function's definition up, and calling it inside
`resolve_relative_url()`.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's modify the interface to `compute_submodule_clone_url()` function
by adding two more arguments, so that we can reuse this in various parts
of `submodule--helper.c` that follow a common pattern, which is--read
the remote url configuration of the superproject and then call
`relative_url()`.
This function is nearly identical to `resolve_relative_url()`, the only
difference being the extra warning message. We can add a quiet flag to
it, to suppress that warning when not needed, and then refactor
`resolve_relative_url()` by using this function, something we will do in
the next patch.
We also rename the local variable 'relurl' to avoid potential confusion
with the 'rel_url' parameter while we are at it.
Having this functionality factored out will be useful for converting the
rest of `submodule add` in subsequent patches.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We parse through binary hunks by looping through the buffer with code
like:
llen = linelen(buffer, size);
...do something with the line...
buffer += llen;
size -= llen;
However, before we enter the loop, there is one call that increments
"buffer" but forgets to decrement "size". As a result, our "size" is off
by the length of that line, and subsequent calls to linelen() may look
past the end of the buffer for a newline.
The fix is easy: we just need to decrement size as we do elsewhere.
This bug goes all the way back to 0660626caf (binary diff: further
updates., 2006-05-05). Presumably nobody noticed because it only
triggers if the patch is corrupted, and even then we are often "saved"
by luck. We use a strbuf to store the incoming patch, so we overallocate
there, plus we add a 16-byte run of NULs as slop for memory comparisons.
So if this happened accidentally, the common case is that we'd just read
a few uninitialized bytes from the end of the strbuf before producing
the expected "this patch is corrupted" error complaint.
However, it is possible to carefully construct a case which reads off
the end of the buffer. The included test does so. It will pass both
before and after this patch when run normally, but using a tool like
ASan shows that we get an out-of-bounds read before this patch, but not
after.
Reported-by: Xingman Chen <xichixingman@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we iterate through the buffer containing git-log output, parsing
lines, we use an "int" to store the size of an individual line. This
should be a size_t, as we have no guarantee that there is not a
malicious 2GB+ commit-message line in the output.
Overflowing this integer probably doesn't do anything _too_ terrible. We
are not using the value to size a buffer, so the worst case is probably
an out-of-bounds read from before the array. But it's easy enough to
fix.
Note that we have to use ssize_t here, since we also store the length
result from parse_git_diff_header(), which may return a negative value
for error. That function actually returns an int itself, which has a
similar overflow problem, but I'll leave that for another day. Much
of the apply.c code uses ints and should be converted as a whole; in the
meantime, a negative return from parse_git_diff_header() will be
interpreted as an error, and we'll bail (so we can't handle such a case,
but given that it's likely to be malicious anyway, the important thing
is we don't have any memory errors).
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing our buffer of output from git-log, we have a
find_end_of_line() helper that finds the next newline, and gives us the
number of bytes to move past it, or the size of the whole remaining
buffer if there is no newline.
But trying to handle both those cases leads to some oddities:
- we try to overwrite the newline with NUL in the caller, by writing
over line[len-1]. This is at best redundant, since the helper will
already have done so if it saw a newline. But if it didn't see a
newline, it's actively wrong; we'll overwrite the byte at the end of
the (unterminated) line.
We could solve this just dropping the extra NUL assignment in the
caller and just letting the helper do the right thing. But...
- if we see a "diff --git" line, we'll restore the newline on top of
the NUL byte, so we can pass the string to parse_git_diff_header().
But if there was no newline in the first place, we can't do this.
There's no place to put it (the current code writes a newline
over whatever byte we obliterated earlier). The best we can do is
feed the complete remainder of the buffer to the function (which is,
in fact, a string, by virtue of being a strbuf).
To solve this, the caller needs to know whether we actually found a
newline or not. We could modify find_end_of_line() to return that
information, but we can further observe that it has only one caller.
So let's just inline it in that caller.
Nobody seems to have noticed this case, probably because git-log would
never produce input that doesn't end with a newline. Arguably we could
just return an error as soon as we see that the output does not end in a
newline. But the code to do so actually ends up _longer_, mostly because
of the cleanup we have to do in handling the error.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "offset" variable was was introduced in 44b67cb62b (range-diff:
split lines manually, 2019-07-11), but it has never done anything
useful. We use it to count up the number of bytes we've consumed, but we
never look at the result. It was probably copied accidentally from an
almost-identical loop in apply.c:find_header() (and the point of that
commit was to make use of the parse_git_diff_header() function which
underlies both).
Because the variable was set but not used, most compilers didn't seem to
notice, but the upcoming clang-14 does complain about it, via its
-Wunused-but-set-variable warning.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new "add-config" subcommand to `git submodule--helper` with the
goal of converting part of the shell code in git-submodule.sh related to
`git submodule add` into C code. This new subcommand sets the
configuration variables of a newly added submodule, by registering the
url in local git config, as well as the submodule name and path in the
.gitmodules file. It also sets 'submodule.<name>.active' to "true" if
the submodule path has not already been covered by any pathspec
specified in 'submodule.active'.
This is meant to be a faithful conversion from shell to C, although we
add comments to areas that could be improved in future patches, after
the conversion has settled.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When queueing references in git-rev-list(1), we try to optimize parsing
of commits via the commit-graph. To do so, we first look up the object's
type, and if it is a commit we call `repo_parse_commit()` instead of
`parse_object()`. This is quite inefficient though given that we're
always uncompressing the object header in order to determine the type.
Instead, we can opportunistically search the commit-graph for the object
ID: in case it's found, we know it's a commit and can directly fill in
the commit object without having to uncompress the object header.
Expose a new function `lookup_commit_in_graph()`, which tries to find a
commit in the commit-graph by ID, and convert `get_reference()` to use
this function. This provides a big performance win in cases where we
load references in a repository with lots of references pointing to
commits. The following has been executed in a real-world repository with
about 2.2 million refs:
Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
Time (mean ± σ): 4.458 s ± 0.044 s [User: 4.115 s, System: 0.342 s]
Range (min … max): 4.409 s … 4.534 s 10 runs
Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
Time (mean ± σ): 3.089 s ± 0.015 s [User: 2.768 s, System: 0.321 s]
Range (min … max): 3.061 s … 3.105 s 10 runs
Summary
'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran
1.44 ± 0.02 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `find_commit_in_graph()` assumes that the caller has passed
an object which was already determined to be a commit given that it will
access the commit's graph position, which is stored in a commit slab. In
a subsequent patch, we want to search for an object ID though without
knowing whether it is a commit or not, which is not currently possible.
Split out the logic to search the commit graph for a given object ID to
prepare for this change. This commit also renames the function to
`find_commit_pos_in_graph()`, which more accurately reflects what this
function does. Furthermore, in order to allow for the searched object ID
to be const, we need to adjust `bsearch_graph()`'s signature to accept a
constant object ID as input, too.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When queueing up references for the revision walk, `handle_one_ref()`
will resolve the reference's object ID via `get_reference()` and then
queue the ID as pending object via `add_pending_oid()`. But given that
`add_pending_oid()` is only a thin wrapper around `add_pending_object()`
which fist calls `get_reference()`, we effectively resolve the reference
twice and thus duplicate some of the work.
Fix the issue by instead calling `add_pending_object()` directly, which
takes the already-resolved object as input. In a repository with lots of
refs, this translates into a near 10% speedup:
Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
Time (mean ± σ): 5.015 s ± 0.038 s [User: 4.698 s, System: 0.316 s]
Range (min … max): 4.970 s … 5.089 s 10 runs
Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
Time (mean ± σ): 4.606 s ± 0.029 s [User: 4.260 s, System: 0.345 s]
Range (min … max): 4.565 s … 4.657 s 10 runs
Summary
'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran
1.09 ± 0.01 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to compute whether objects reachable from a set of tips are all
connected, we do a revision walk with these tips as positive references
and `--not --all`. `--not --all` will cause the revision walk to load
all preexisting references as uninteresting, which can be very expensive
in repositories with many references.
Benchmarking the git-rev-list(1) command highlights that by far the most
expensive single phase is initial sorting of the input revisions: after
all references have been loaded, we first sort commits by author date.
In a real-world repository with about 2.2 million references, it makes
up about 40% of the total runtime of git-rev-list(1).
Ultimately, the connectivity check shouldn't really bother about the
order of input revisions at all. We only care whether we can actually
walk all objects until we hit the cut-off point. So sorting the input is
a complete waste of time.
Introduce a new "--unsorted-input" flag to git-rev-list(1) which will
cause it to not sort the commits and adjust the connectivity check to
always pass the flag. This results in the following speedups, executed
in a clone of gitlab-org/gitlab [1]:
Benchmark #1: git rev-list --objects --quiet --not --all --not $(cat newrev)
Time (mean ± σ): 7.639 s ± 0.065 s [User: 7.304 s, System: 0.335 s]
Range (min … max): 7.543 s … 7.742 s 10 runs
Benchmark #2: git rev-list --unsorted-input --objects --quiet --not --all --not $newrev
Time (mean ± σ): 4.995 s ± 0.044 s [User: 4.657 s, System: 0.337 s]
Range (min … max): 4.909 s … 5.048 s 10 runs
Summary
'git rev-list --unsorted-input --objects --quiet --not --all --not $(cat newrev)' ran
1.53 ± 0.02 times faster than 'git rev-list --objects --quiet --not --all --not $newrev'
[1]: https://gitlab.com/gitlab-org/gitlab.git. Note that not all refs
are visible to clients.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
d540b70c85 (merge: cleanup messages like commit, 2019-04-17) adds
a way to change part of the helper text using a single call to
strbuf_add_commented_addf but with two formats with varying number
of parameters.
this trigger a warning in old versions of Xcode (ex 8.0), so use
instead two independent calls with a matching number of parameters
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The case statement in detect-compiler notices 'clang', 'FreeBSD
clang' and 'Apple clang', but there are other platforms that follow
the '$VENDOR clang' pattern (e.g. Debian).
Generalize the pattern to catch them.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_family and get_version helpers of detect-compiler assume
that the line to identify the version from the compilers have a
token "version", followed by the version number, followed by some
other string, e.g.
$ CC=gcc get_version_line
gcc version 10.2.1 20210110 (Debian 10.2.1-6)
But that is not necessarily true, e.g.
$ CC=clang get_version_line
Debian clang version 11.0.1-2
Tweak the script not to require extra string after the version.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
1da1580e4c (Makefile: detect compiler and enable more warnings in
DEVELOPER=1, 2018-04-14) uses the output of the compiler banner to
detect the compiler family.
Apple had since changed the wording used to refer to its compiler
as clang instead of LLVM as shown by:
$ cc --version
Apple clang version 12.0.5 (clang-1205.0.22.9)
Target: x86_64-apple-darwin20.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
so update the script to match, and allow DEVELOPER=1 to work as
expected again in macOS.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it clear that `ort` is the default merge strategy now rather than
`recursive`, including moving `ort` to the front of the list of merge
strategies.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few reasons to switch the default:
* Correctness
* Extensibility
* Performance
I'll provide some summaries about each.
=== Correctness ===
The original impetus for a new merge backend was to fix issues that were
difficult to fix within recursive's design. The success with this goal
is perhaps most easily demonstrated by running the following:
$ git grep -2 KNOWN_FAILURE t/ | grep -A 4 GIT_TEST_MERGE_ALGORITHM
$ git grep test_expect_merge_algorithm.failure.success t/
$ git grep test_expect_merge_algorithm.success.failure t/
In order, these greps show:
* Seven sets of submodule tests (10 total tests) that fail with
recursive but succeed with ort
* 22 other tests that fail with recursive, but succeed with ort
* 0 tests that pass with recursive, but fail with ort
=== Extensibility ===
Being able to perform merges without touching the working tree or index
makes it possible to create new features that were difficult with the
old backend:
* Merging, cherry-picking, rebasing, reverting in bare repositories...
or just on branches that aren't checked out.
* `git diff AUTO_MERGE` -- ability to see what changes the user has
made to resolve conflicts so far (see commit 5291828df8 ("merge-ort:
write $GIT_DIR/AUTO_MERGE whenever we hit a conflict", 2021-03-20)
* A --remerge-diff option for log/show, used to show diffs for merges
that display the difference between what an automatic merge would
have created and what was recorded in the merge. (This option will
often result in an empty diff because many merges are clean, but for
the non-clean ones it will show how conflicts were fixed including
the removal of conflict markers, and also show additional changes
made outside of conflict regions to e.g. fix semantic conflicts.)
* A --remerge-diff-only option for log/show, similar to --remerge-diff
but also showing how cherry-picks or reverts differed from what an
automatic cherry-pick or revert would provide.
The last three have been implemented already (though only one has been
submitted upstream so far; the others were waiting for performance work
to complete), and I still plan to implement the first one.
=== Performance ===
I'll quote from the summary of my final optimization for merge-ort
(while fixing the testcase name from 'no-renames' to 'few-renames'):
Timings
Infinite
merge- merge- Parallelism
recursive recursive of rename merge-ort
v2.30.0 current detection current
---------- --------- ----------- ---------
few-renames: 18.912 s 18.030 s 11.699 s 198.3 ms
mega-renames: 5964.031 s 361.281 s 203.886 s 661.8 ms
just-one-mega: 149.583 s 11.009 s 7.553 s 264.6 ms
Speedup factors
Infinite
merge- merge- Parallelism
recursive recursive of rename
v2.30.0 current detection merge-ort
---------- --------- ----------- ---------
few-renames: 1 1.05 1.6 95
mega-renames: 1 16.5 29 9012
just-one-mega: 1 13.6 20 565
And, for partial clone users:
Factor reduction in number of objects needed
Infinite
merge- merge- Parallelism
recursive recursive of rename
v2.30.0 current detection merge-ort
---------- --------- ----------- ---------
mega-renames: 1 1 1 181.3
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--no-walk` flag supports two modes: either it sorts the revisions
given as input input or it doesn't. This is reflected in a single
`no_walk` flag, which reflects one of the three states "walk", "don't
walk but without sorting" and "don't walk but with sorting".
Split up the flag into two separate bits, one indicating whether we
should walk or not and one indicating whether the input should be sorted
or not. This will allow us to more easily introduce a new flag
`--unsorted-input`, which only impacts the sorting bit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the "tags", "TAGS" and "cscope.out" targets rely on piping into
xargs with an "echo <list> | xargs" pattern, we need to make sure
we're in an append mode.
Unlike my recent change to make use of ".DELETE_ON_ERROR" in
7b76d6bf22 (Makefile: add and use the ".DELETE_ON_ERROR" flag,
2021-06-29), we really do need the "rm $@+" at the beginning (note,
not "rm $@").
This is because the xargs command may decide to invoke the program
multiple times. We need to make sure we've got a union of its results
at the end.
For "ctags" and "etags" we used the "-a" flag for this, for cscope
that behavior is the default. Its "-u" flag disables its equivalent of
an implicit "-a" flag.
Let's also consistently use the $@ and $@+ names instead of needlessly
hardcoding or referring to more verbose names in the "tags" and "TAGS"
rules.
These targets could perhaps be improved in the future by factoring
this "echo <list> | xargs" pattern so that we make intermediate tags
files for each source file, and then assemble them into one "tags"
file at the end.
The etags manual page suggests that doing that (or perhaps just
--update) might be counter-productive, in any case, the tag building
is fast enough for me, so I'm leaving that for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before we generate a "cscope.out" file, remove that file explicitly,
and not everything matching "cscope*". This doesn't change any
behavior of the Makefile in practice, but makes this rule less
confusing, and consistent with other similar rules.
The cscope target was added in a2a9150bf0 (makefile: Add a cscope
target, 2007-10-06). It has always referred to cscope* instead of to
cscope.out in .gitignore and the "clean" target, even though we only
ever generated a cscope.out file.
This was seemingly done to aid use-cases where someone invoked cscope
with the "-q" flag, which would make it create a "cscope.in.out" and
"cscope.po.out" files in addition to "cscope.out".
But us removing those files we never generated is confusing, so let's
only remove the file we need to, furthermore let's use the "-f" flag
to explicitly name the cscope.out file, even though it's the default
if not "-f" argument is supplied.
It is somewhat inconsistent to change from the glob here but not in
the "clean" rule and .gitignore, an earlier version of this change
updated those as well, but see [1][2] for why they were kept.
1. https://lore.kernel.org/git/87k0lit57x.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/87im0kn983.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --advertise-refs documentation in git-upload-pack added in
9812f2136b (upload-pack.c: use parse-options API, 2016-05-31) hasn't
been entirely true ever since v2 support was implemented in
e52449b672 (connect: request remote refs using v2, 2018-03-15). Under
v2 we don't advertise the refs at all, but rather dump the
capabilities header.
This option has always been an obscure internal implementation detail,
it wasn't even documented for git-receive-pack. Since it has exactly
one user let's rename it to --http-backend-info-refs, which is more
accurate and points the reader in the right direction. Let's also
cross-link this from the protocol v1 and v2 documentation.
I'm retaining a hidden --advertise-refs alias in case there's any
external users of this, and making both options hidden to the bash
completion (as with most other internal-only options).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "advertise capabilities" mode of serve.c added in
ed10cb952d (serve: introduce git-serve, 2018-03-15) is only used by
the http-backend.c to call {upload,receive}-pack with the
--advertise-refs parameter. See 42526b478e (Add stateless RPC options
to upload-pack, receive-pack, 2009-10-30).
Let's just make cmd_upload_pack() take the two (v2) or three (v2)
parameters the the v2/v1 servicing functions need directly, and pass
those in via the function signature. The logic of whether daemon mode
is implied by the timeout belongs in the v1 function (only used
there).
Once we split up the "advertise v2 refs" from "serve v2 request" it
becomes clear that v2 never cared about those in combination. The only
time it mattered was for v1 to emit its ref advertisement, in that
case we wanted to emit the smart-http-only "no-done" capability.
Since we only do that in the --advertise-refs codepath let's just have
it set "do_done" itself in v1's upload_pack() just before send_ref(),
at that point --advertise-refs and --stateless-rpc in combination are
redundant (the only user is get_info_refs() in http-backend.c), so we
can just pass in --advertise-refs only.
Since we need to touch all the serve() and advertise_capabilities()
codepaths let's rename them to less clever and obvious names, it's
been suggested numerous times, the latest of which is [1]'s suggestion
for protocol_v2_serve_loop(). Let's go with that.
1. https://lore.kernel.org/git/CAFQ2z_NyGb8rju5CKzmo6KhZXD0Dp21u-BbyCb2aNxLEoSPRJw@mail.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --advertise-refs option had no explicit tests of its own, only
other http tests that would fail at a distance if it it was
broken. Let's test its behavior explicitly.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The advertise_capabilities() is only called from serve() and we always
emit this version line before it. In a subsequent commit I'll make
builtin/upload-pack.c sometimes call advertise_capabilities()
directly, so it'll make sense to have this line emitted by
advertise_capabilities(), not serve() itself.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 6b5b6e422e (serve: advertise session ID in v2 capabilities,
2020-11-11) the check for transfer.advertiseSID was added to the
beginning of the main serve() loop. Thus on startup of the server we'd
populate it.
Let's instead use an explicit lazy initialization pattern in
session_id_advertise() itself, we'll still look the config up only
once per-process, but by moving it out of serve() itself the further
changing of that routine becomes easier.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The serve.c API added in ed10cb952d (serve: introduce git-serve,
2018-03-15) was passing in the raw capabilities "keys", but nothing
downstream of it ever used them.
Let's remove that code because it's not needed. If we do end up
needing to pass information about the advertisement in the future
it'll make more sense to have serve.c parse the capabilities keys and
pass the result of its parsing, rather than expecting expecting its
API users to parse the same keys again.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the declaration of the protocol_capability struct to use
designated initializers, this makes this more verbose now, but a
follow-up commit will add a new field. At that point these lines would
be too dense to be on one line comfortably.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the assignments to the various transport_vtables to use
designated initializers, this makes the code easier to read and
maintain.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the "fetch" member of the transport_vtable to "fetch_refs" for
consistency with the existing "push_refs". Neither of them just push
"refs" but refs and objects, but having the two match makes the code
more readable than having it be inconsistent, especially since
"fetch_refs" is a lot easier to grep for than "fetch".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The has_capability() function introduced in ed10cb952d (serve:
introduce git-serve, 2018-03-15) has never been used anywhere except
serve.c, so let's mark it as static.
It was later changed from "extern" in 554544276a (*.[ch]: remove
extern from function declarations using spatch, 2019-04-29), but we
could have simply marked it as "static" instead.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were two locations in the code that referred to 'merge-recursive'
but which were also applicable to 'merge-ort'. Update them to more
general wording.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 58634dbff8 ("rebase: Allow merge strategies to be used when
rebasing", 2006-06-21) added the --merge option to git-rebase so that
renames could be detected (at least when using the `recursive` merge
backend). However, git-am -3 gained that same ability in commit
579c9bb198 ("Use merge-recursive in git-am -3.", 2006-12-28). As such,
the comment about being able to detect renames is not particularly
noteworthy. Remove it.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have diff-algorithm that explains why there are special diff
algorithms, so we do not need to re-explain patience. patience exists
as its own toplevel option for historical reasons, but there's no reason
to give it special preference or document it again and suggest it's more
important than other diff algorithms, so just refer to it as a
deprecated shorthand for `diff-algorithm=patience`.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stating that the recursive strategy "currently cannot make use of
detected copies" implies that this is a technical shortcoming of the
current algorithm. I disagree with that. I don't see how copies could
possibly be used in a sane fashion in a merge algorithm -- would we
propagate changes in one file on one side of history to each copy of
that file when merging? That makes no sense to me. I cannot think of
anything else that would make sense either. Change the wording to
simply state that we ignore any copies.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is probably helpful to cover the default merge strategy first, so
move the text for the resolve strategy to later in the document.
Further, the wording for "resolve" claimed that it was "considered
generally safe and fast", which might imply in some readers minds that
the same is not true of other strategies. Rather than adding this text
to all the strategies, just remove it from this one.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few places in the documentation referred to the "`recursive` strategy"
using the phrase "`git merge-recursive`", suggesting that it was forking
subprocesses to call a toplevel builtin. Perhaps that was relevant to
when rebase was a shell script, but it seems like a rather indirect way
to refer to the `recursive` strategy. Simplify the references.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 0c4fd732f0 ("Move computation of dir_rename_count from
merge-ort to diffcore-rename", 2021-02-27), much of the logic for
computing directory renames moved into diffcore-rename.
directory-rename-detection.txt had claims that all of that logic was
found in merge-recursive. Update the documentation.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When --rebase-merges was first introduced, it only worked with the
`recursive` strategy. Some time later, it gained support for merges
using the `octopus` strategy. The limitation of only supporting these
two strategies was documented in 25cff9f109 ("rebase -i --rebase-merges:
add a section to the man page", 2018-04-25) and lifted in e145d99347
("rebase -r: support merge strategies other than `recursive`",
2019-07-31). However, when the limitation was lifted, the documentation
was not updated. Update it now.
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When doing reference negotiation, git-fetch-pack(1) is loading all refs
from disk in order to determine which commits it has in common with the
remote repository. This can be quite expensive in repositories with many
references though: in a real-world repository with around 2.2 million
refs, fetching a single commit by its ID takes around 44 seconds.
Dominating the loading time is decompression and parsing of the objects
which are referenced by commits. Given the fact that we only care about
commits (or tags which can be peeled to one) in this context, there is
thus an easy performance win by switching the parsing logic to make use
of the commit graph in case we have one available. Like this, we avoid
hitting the object database to parse these commits but instead only load
them from the commit-graph. This results in a significant performance
boost when executing git-fetch in said repository with 2.2 million refs:
Benchmark #1: HEAD~: git fetch $remote $commit
Time (mean ± σ): 44.168 s ± 0.341 s [User: 42.985 s, System: 1.106 s]
Range (min … max): 43.565 s … 44.577 s 10 runs
Benchmark #2: HEAD: git fetch $remote $commit
Time (mean ± σ): 19.498 s ± 0.724 s [User: 18.751 s, System: 0.690 s]
Range (min … max): 18.629 s … 20.454 s 10 runs
Summary
'HEAD: git fetch $remote $commit' ran
2.27 ± 0.09 times faster than 'HEAD~: git fetch $remote $commit'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify code maintenance by removing the ability to toggle between
usage of memory pools and direct allocations. This allows us to also
remove paths_to_free since it was solely about bookkeeping to make sure
we freed the necessary paths, and allows us to remove some auxiliary
functions.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commits we introduced new documentation that talks
about "[commit|object] prerequsite(s)", but also faithfully moved
around existing documentation that talks about the "basis".
Let's change both that moved-around documentation and other existing
documentation in the file to consistently use "[commit|object]"
prerequisite(s)" instead of talking about "basis". The mention of
"basis" isn't wrong, but readers will be helped by us using only one
term throughout the document for this concept.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite the "DESCRIPTION" section for "git bundle" to start by talking
about what bundles are in general terms, rather than diving directly
into one example of what they might be used for.
This changes documentation that's been substantially the same ever
since the command was added in 2e0afafebd (Add git-bundle: move
objects and references by archive, 2007-02-22).
I've split up the DESCRIPTION into that section and a "BUNDLE FORMAT"
section, it briefly discusses the format, but then links to the
technical/bundle-format.txt documentation.
The "the user must specify a basis" part of this is discussed below in
"SPECIFYING REFERENCES", and will be further elaborated on in a
subsequent commit. So I'm removing that part and letting the mention
of "revision exclusions" suffice.
There was a discussion about whether to say anything at all about
"thin packs" here[1]. I think it's good to mention it for the curious
reader willing to read the technical docs, but let's explicitly say
that there's no "thick pack", and that the difference shouldn't
matter.
1. http://lore.kernel.org/git/xmqqk0mbt5rj.fsf@gitster.g
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elaborate on the restriction that you cannot provide a revision that
doesn't resolve to a reference in the "SPECIFYING REFERENCES" section
with examples.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split out the discussion bout "object prerequisites" into its own
section, and add some more examples of the common cases.
See 2e0afafebd (Add git-bundle: move objects and references by
archive, 2007-02-22) for the introduction of the documentation being
changed here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By doing ls -1 .git/{reftable,refs/heads}, we can capture changes to both
reftable and packed/loose ref storage.
This relies on the fact that git-pack-refs (which we're looking for here)
changes the number (loose/packed storage) and/or names (reftable) files used for
ref storage.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous code tested this by writing $ZERO_OID explicitly in the packed-refs
file. This is a type of corruption that doesn't reflect realistic use-cases. In
addition, even the ref-store test-tool refuses to write invalid OIDs.
(update-ref interprets $ZERO_OID is deleting the ref).
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test takes a lock on the target of a symref, and then verifies that it is
possible to expire the symref's reflog. In reftable, one can only take a global
lock (which would prevent the symref reflog from being expired altogether.)
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests verifies that "pack-refs" causes loose refs to be packed. As both
loose and packed refs are concepts specific to the files backend, mark the test
as REFFILES.
Check the outcome of the pack-refs operation. This was apparently forgotten in
the commit introducing this test: 16feb99d (Mar 26 2017, "t1405: some basic
tests on main ref store").
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the conditional use of CURLAUTH_DIGEST_IE and
CURLOPT_USE_SSL. These two have been split from earlier simpler checks
against LIBCURL_VERSION_NUM for ease of review.
According to
https://github.com/curl/curl/blob/master/docs/libcurl/symbols-in-versions
the CURLAUTH_DIGEST_IE flag became available in 7.19.3, and
CURLOPT_USE_SSL in 7.17.0.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the last commit we dropped support for curl < 7.16.0, let's
continue that and drop support for versions older than 7.19.3. This
allows us to simplify the code by getting rid of some "#ifdef"'s.
Git was broken with vanilla curl < 7.19.4 from v2.12.0 until
v2.15.0. Compiling with it was broken by using CURLPROTO_* outside any
"#ifdef" in aeae4db174 (http: create function to get curl allowed
protocols, 2016-12-14), and fixed in v2.15.0 in f18777ba6e (http: fix
handling of missing CURLPROTO_*, 2017-08-11).
It's unclear how much anyone was impacted by that in practice, since
as noted in [1] RHEL versions using curl older than that still
compiled, because RedHat backported some features. Perhaps other
vendors did the same.
Still, it's one datapoint indicating that it wasn't in active use at
the time. That (the v2.12.0 release) was in Feb 24, 2017, with v2.15.0
on Oct 30, 2017, it's now mid-2021.
1. http://lore.kernel.org/git/c8a2716d-76ac-735c-57f9-175ca3acbcb0@jupiterrise.com;
followed-up by f18777ba6e (http: fix handling of missing CURLPROTO_*,
2017-08-11)
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the last commit we dropped support for curl < 7.11.1, let's
continue that and drop support for versions older than 7.16.0. This
allows us to get rid of some now-obsolete #ifdefs.
Choosing 7.16.0 is a somewhat arbitrary cutoff:
1. It came out in October of 2006, almost 15 years ago.
Besides being a nice round number, around 10 years is
a common end-of-life support period, even for conservative
distributions.
2. That version introduced the curl_multi interface, which
gives us a lot of bang for the buck in removing #ifdefs
RHEL 5 came with curl 7.15.5[1] (released in August 2006). RHEL 5's
extended life cycle program ended on 2020-11-30[1]. RHEL 6 comes with
curl 7.19.7 (released in November 2009), and RHEL 7 comes with
7.29.0 (released in February 2013).
1. http://lore.kernel.org/git/873e1f31-2a96-5b72-2f20-a5816cad1b51@jupiterrise.com
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Drop support for this ancient version of curl and simplify the code by
allowing us get rid of some "#ifdef"'s.
Git will not build with vanilla curl older than 7.11.1 due our use of
CURLOPT_POSTFIELDSIZE in 37ee680d9b
(http.postbuffer: allow full range of ssize_t values,
2017-04-11). This field was introduced in curl 7.11.1.
We could solve these compilation problems with more #ifdefs,
but it's not worth the trouble. Version 7.11.1 came out in
March of 2004, over 17 years ago. Let's declare that too old
and drop any existing ifdefs that go further back. One
obvious benefit is that we'll have fewer conditional bits
cluttering the code.
This patch drops all #ifdefs that reference older versions
(note that curl's preprocessor macros are in hex, so we're
looking for 070b01, not 071101).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pool_alloc_filespec() was written so that the code when pool != NULL
mimicked the code from alloc_filespec(), which including allocating
enough extra space for the path and then copying it. However, the path
passed to pool_alloc_filespec() is always going to already be in the
same memory pool, so we may as well reuse it instead of copying it.
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:
Before After
no-renames: 198.5 ms ± 3.4 ms 198.3 ms ± 2.9 ms
mega-renames: 679.1 ms ± 5.6 ms 661.8 ms ± 5.9 ms
just-one-mega: 271.9 ms ± 2.8 ms 264.6 ms ± 2.5 ms
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:
Before After
no-renames: 198.1 ms ± 2.6 ms 198.5 ms ± 3.4 ms
mega-renames: 715.8 ms ± 4.0 ms 679.1 ms ± 5.6 ms
just-one-mega: 276.8 ms ± 4.2 ms 271.9 ms ± 2.8 ms
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to be able to allocate filespecs and filepairs using a mem_pool.
However, filespec data will still remain outside the pool (perhaps in
the future we could plumb the pool through the various diff APIs to
allocate the filespec data too, but for now we are limiting the scope).
Add some extra functions to allocate these appropriately based on the
non-NULL-ness of opt->priv->pool, as well as some extra functions to
handle correctly deallocating the relevant parts of them. A future
commit will make use of these new functions.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For all the strmaps (including strintmaps and strsets) whose memory is
unconditionally freed as part of clear_or_reinit_internal_opts(), switch
them over to using our new memory pool.
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:
Before After
no-renames: 202.5 ms ± 3.2 ms 198.1 ms ± 2.6 ms
mega-renames: 1.072 s ± 0.012 s 715.8 ms ± 4.0 ms
just-one-mega: 357.3 ms ± 3.9 ms 276.8 ms ± 4.2 ms
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-ort has a lot of data structures, and they all tend to be freed
together in clear_or_reinit_internal_opts(). Set up a memory pool to
allow us to make these allocations and deallocations faster. Future
commits will adjust various callers to make use of this memory pool.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the code more flexible so that it can handle both being run with or
without a memory pool by adding utility functions which will either call
xmalloc, xcalloc, xstrndup
or
mem_pool_alloc, mem_pool_calloc, mem_pool_strndup
depending on whether we have a non-NULL memory pool. A subsequent
commit will make use of these.
(We will actually be dropping these functions soon and just assuming we
always have a memory pool, but the flexibility was very useful during
development of merge-ort so I want to be able to restore it if needed.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Exact rename detection, via insert_file_table(), uses a hashmap to store
files by oid. Use a mem_pool for the hashmap entries so these can all be
allocated and deallocated together.
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:
Before After
no-renames: 204.2 ms ± 3.0 ms 202.5 ms ± 3.2 ms
mega-renames: 1.076 s ± 0.015 s 1.072 s ± 0.012 s
just-one-mega: 364.1 ms ± 7.0 ms 357.3 ms ± 3.9 ms
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to make it clearer that these three variables holding a
function refer to functions that will clear the strmap/strintmap/strset,
rename them to str{map,intmap,set}_clear_func().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --renormalize option updates the EOL conversions for the tracked
files. However, the loop already ignores files marked with the
SKIP_WORKTREE bit, so it will continue to do so with a sparse index
because the sparse directory entries also have this bit set.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since b243012 (refresh_index(): add flag to ignore SKIP_WORKTREE
entries, 2021-04-08), 'git add --refresh <path>' will output a warning
message when the path is outside the sparse-checkout definition. The
implementation of this warning happened in parallel with the
sparse-index work to add ensure_full_index() calls throughout the
codebase.
Update this loop to have the proper logic that checks to see if the
pathspec is outside the sparse-checkout definition. This avoids the need
to expand the sparse directory entry and determine if the path is
tracked, untracked, or ignored. We simply avoid updating the stat()
information because there isn't even an entry that matches the path!
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The add_pathspec_matches_against_index() focuses on matching a pathspec
to file entries in the index. This already works correctly for its only
use: checking if untracked files exist in the index.
The compatibility checks in t1092 already test that 'git add <dir>'
works for a directory outside of the sparse cone. That provides coverage
for removing this guard.
This finalizes our ability to run 'git add .' without expanding a sparse
index to a full one. This is evidenced by an update to t1092 and by
these performance numbers for p2000-sparse-operations.sh:
Test HEAD~1 HEAD
--------------------------------------------------------------------------------
2000.10: git add . (full-index-v3) 0.37(0.28+0.07) 0.36(0.27+0.06) -2.7%
2000.11: git add . (full-index-v4) 0.33(0.26+0.06) 0.32(0.28+0.05) -3.0%
2000.12: git add . (sparse-index-v3) 0.57(0.53+0.07) 0.06(0.06+0.07) -89.5%
2000.13: git add . (sparse-index-v4) 0.57(0.53+0.07) 0.05(0.03+0.09) -91.2%
While the ~90% improvement is shown by the test results, it is worth
noting that expanding the sparse index was adding overhead in previous
commits. Comparing to the full index case, we see the performance go
from 0.33s to 0.05s, an 85% improvement.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Disable command_requires_full_index for 'git add'. This does not require
any additional removals of ensure_full_index(). The main reason is that
'git add' discovers changes based on the pathspec and the worktree
itself. These are then inserted into the index directly, and calls to
index_name_pos() or index_file_exists() already call expand_to_path() at
the appropriate time to support a sparse-index.
Add a test to check that 'git add -A' and 'git add <file>' does not
expand the index at all, as long as <file> is not within a sparse
directory. This does not help the global 'git add .' case.
We can measure the improvement using p2000-sparse-operations.sh with
these results:
Test HEAD~1 HEAD
------------------------------------------------------------------------------
2000.6: git add -A (full-index-v3) 0.35(0.30+0.05) 0.37(0.29+0.06) +5.7%
2000.7: git add -A (full-index-v4) 0.31(0.26+0.06) 0.33(0.27+0.06) +6.5%
2000.8: git add -A (sparse-index-v3) 0.57(0.53+0.07) 0.05(0.04+0.08) -91.2%
2000.9: git add -A (sparse-index-v4) 0.58(0.55+0.06) 0.05(0.05+0.06) -91.4%
While the 91% improvement seems impressive, it's important to recognize
that previously we had significant overhead for expanding the
sparse-index. Comparing to the full index case, 'git add -A' goes from
0.37s to 0.05s, which is "only" an 86% improvement.
This modification to 'git add' creates some behavior change depending on
the use of a sparse index. We modify a test in t1092 to demonstrate
these changes which will be remedied in future changes.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Conflicts can occur outside of the sparse-checkout definition, and in
that case users might try to resolve the conflicts in several ways.
Document a few of these ways in a test. Make it clear that this behavior
is not necessarily the optimal flow, since users can become confused
when Git deletes these files from the worktree in later commands.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function was designed to return only BISECT_OK (0) or
BISECT_FAILED (-1) and no other values, but there were two issues:
- The comment misspelled BISECT_FAILED as BISECT_FAILURE, even
though the logic it described (i.e. any non-zero return should be
reported as a single BISECT_FAILED) was correct.
- It took the return value from run_command_v_opt(), and assumed it
was either -1 or 1 upon error, which is not the case; it can relay
errors from wait_or_whine(), which can report exit status of the
child process.
Translate any error return from run_command_v_opt() to BISECT_FAILED,
and simplify the resulting code by losing the 'res' variable that is
no longer needed.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In scripted versions of "git bisect", we used "git show-branch" to
describe a single commit in the bisect log and also to the interactive
user after checking out the next version to be tested.
The former use of "git show-branch" was lost when the helper
function that wrote bisect log entries was rewritten at 0f30233a
(bisect--helper: `bisect_write` shell function in C, 2019-01-02) in
C
But we've kept the latter ever since 0871984d (bisect: make "git
bisect" use new "--next-all" bisect-helper function, 2009-05-09)
started using the faithful C-rewrite introduced at ef24c7ca
(bisect--helper: add "--next-exit" to output bisect results,
2009-04-19).
Showing "[<full hex>] <subject>" is simple enough with our helper
pretty.c::format_commit_message() and spawning show-branch is an
overkill. Let's lose one external process.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since Git learned to detect its install location at runtime, there
was the slightly awkward problem that it was impossible to specify paths
relative to said location.
For example, if a version of Git was shipped with custom SSL
certificates to use, there was no portable way to specify
`http.sslCAInfo`.
In Git for Windows, the problem was "solved" for years by interpreting
paths starting with a slash as relative to the runtime prefix.
However, this is not correct: such paths _are_ legal on Windows, and
they are interpreted as absolute paths in the same drive as the current
directory.
After a lengthy discussion, and an even lengthier time to mull over the
problem and its best solution, and then more discussions, we eventually
decided to introduce support for the magic sequence `%(prefix)/`. If a
path starts with this, the remainder is interpreted as relative to the
detected (runtime) prefix. If built without runtime prefix support, Git
will simply interpolate the compiled-in prefix.
If a user _wants_ to specify a path starting with the magic sequence,
they can prefix the magic sequence with `./` and voilà, the path won't
be expanded.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not immediately clear what `expand_user_path()` means, so let's
rename it to `interpolate_path()`. This also opens the path for
interpolating more than just a home directory.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `real_home` parameter only has an effect when expanding paths
starting with `~/`, not when expanding paths starting with `~<user>/`.
Let's make that clear.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 395de250d9 (Expand ~ and ~user in core.excludesfile,
commit.template, 2009-11-17), the `user_path()` function was refactored
into the `expand_user_path()`. During that refactoring, the `buf`
parameter was lost, but the code comment above said function still talks
about it. Let's remove that stale part of the comment.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, we refrained from adding a regression test in 7b6c6496374
(system_path(): Add prefix computation at runtime if RUNTIME_PREFIX set,
2008-08-10), and in 226c0ddd0d (exec_cmd: RUNTIME_PREFIX on some POSIX
systems, 2018-04-10).
The reason was that it was deemed too tricky to test.
Turns out that it is not tricky to test at all: we simply create a
pseudo-root, copy the `git` executable into the `git/` subdirectory of
that pseudo-root, then copy a script into the `libexec/git-core/`
directory and expect that to be picked up.
As long as the trash directory is in a location where binaries can be
executed, this works.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
%(rest) is a atom used for cat-file batch mode, which can split
the input lines at the first whitespace boundary, all characters
before that whitespace are considered to be the object name;
characters after that first run of whitespace (i.e., the "rest"
of the line) are output in place of the %(rest) atom.
In order to let "cat-file --batch=%(rest)" use the ref-filter
interface, add %(rest) atom for ref-filter.
Introduce the reject_atom() to reject the atom %(rest) for
"git for-each-ref", "git branch", "git tag" and "git verify-tag".
Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Suggected-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because the perl language can handle binary data correctly,
add the function perl_quote_buf_with_len(), which can specify
the length of the data and prevent the data from being truncated
at '\0' to help `--format="%(raw)"` support `--perl`.
Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new formatting option `%(raw)`, which will print the raw
object data without any changes. It will help further to migrate
all cat-file formatting logic from cat-file to ref-filter.
The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depends on the output of the atom being
text (specifically, no embedded NULs in it).
E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
`100644 one\0...`, only the `100644 one` will be added to the buffer,
which is incorrect.
Therefore, we need to find a way to record the length of the
atom_value's member `s`. Although strbuf can already record the
string and its length, if we want to replace the type of atom_value's
member `s` with strbuf, many places in ref-filter that are filled
with dynamically allocated mermory in `v->s` are not easy to replace.
At the same time, we need to check if `v->s == NULL` in
populate_value(), and strbuf cannot easily distinguish NULL and empty
strings, but c-style "const char *" can do it. So add a new member in
`struct atom_value`: `s_size`, which can record raw object size, it
can help us add raw object data to the buffer or compare two buffers
which contain raw object data.
Note that `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, and `--perl` because if the binary raw data is passed to a
variable in such languages, these may not support arbitrary binary data
in their string variable type.
Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Bagas Sanjaya <bagasdotme@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only tag and commit objects use `grab_sub_body_contents()` to grab
object contents in the current codebase. We want to teach the
function to also handle blobs and trees to get their raw data,
without parsing a blob (whose contents looks like a commit or a tag)
incorrectly as a commit or a tag. So it's needed to pass a
`struct expand_data *data` instread of only `void *buf` to both
`grab_sub_body_contents()` and `grab_values()` to be able to check
the object type.
Skip the block of code that is specific to handling commits and tags
early when the given object is of a wrong type to help later
addition to handle other types of objects in this function.
Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can be useful to tell who invoked Git - was it invoked manually by a
user via CLI or script? By an IDE? In some cases - like 'repo' tool -
we can influence the source code and set the GIT_TRACE2_PARENT_SID
environment variable from the caller process. In 'repo''s case, that
parent SID is manipulated to include the string "repo", which means we
can positively identify when Git was invoked by 'repo' tool. However,
identifying parents that way requires both that we know which tools
invoke Git and that we have the ability to modify the source code of
those tools. It cannot scale to keep up with the various IDEs and
wrappers which use Git, most of which we don't know about. Learning
which tools and wrappers invoke Git, and how, would give us insight to
decide where to improve Git's usability and performance.
Unfortunately, there's no cross-platform reliable way to gather the name
of the parent process. If procfs is present, we can use that; otherwise
we will need to discover the name another way. However, the process ID
should be sufficient to look up the process name on most platforms, so
that code may be shareable.
Git for Windows gathers similar information and logs it as a "data_json"
event. However, since "data_json" has a variable format, it is difficult
to parse effectively in some languages; instead, let's pursue a
dedicated "cmd_ancestry" event to record information about the ancestry
of the current process and a consistent, parseable way.
Git for Windows also gathers information about more than one generation
of parent. In Linux further ancestry info can be gathered with procfs,
but it's unwieldy to do so. In the interest of later moving Git for
Windows ancestry logging to the 'cmd_ancestry' event, and in the
interest of later adding more ancestry to the Linux implementation - or
of adding this functionality to other platforms which have an easier
time walking the process tree - let's make 'cmd_ancestry' accept an
array of parentage.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To pave the way for non-Windows platforms to define
trace2_collect_process_info(), reorganize the stub-or-definition schema
to something which doesn't directly reference Windows.
Platforms which want to collect parent process information in the
future should:
1. Add an implementation to compat/ (e.g. compat/somearch/procinfo.c)
2. Add that object to COMPAT_OBJS to config.mak.uname
(e.g. COMPAT_OBJS += compat/somearch/procinfo.o)
3. Define HAVE_PLATFORM_PROCINFO in config.mak.uname
In the Windows case, this definition lives in
compat/win32/trace2_win32_process_info.c, which is already conditionally
added to COMPAT_OBJS; so let's add HAVE_PLATFORM_PROCINFO to hint to the
build that compat/stub/procinfo.c should not be used.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With multiple heads, we should not allow rebasing or fast-forwarding.
Make sure any fast-forward request calls out specifically the fact that
multiple branches are in play. Also, since we cannot fast-forward to
multiple branches, fix our computation of can_ff.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-pull.txt includes merge-options.txt, which is written assuming
merges will happen. git-pull has allowed rebases for many years; update
the documentation to reflect that.
While at it, pass any `--signoff` flag through to the rebase backend too
so that we don't have to document it as merge-specific. Rebase has
supported the --signoff flag for years now as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have for some time shown a long warning when the user does not
specify how to reconcile divergent branches with git pull. Make it an
error now.
Initial-patch-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the last few precedence tests failing in t7601 by now implementing
the logic to have --[no-]rebase override a pull.ff=only config setting.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are both merge and rebase branches in the logic, and previously
both had to handle fast-forwarding. Merge handled that implicitly
(because git merge handles it directly), while in rebase it was
explicit. Given that the --ff-only flag is meant to override any
--rebase or --no-rebase, make the code reflect that by handling
--ff-only before the merge-vs-rebase logic.
It turns out that this also fixes a bug for submodules. Previously,
when --ff-only was given, the code would run `merge --ff-only` on the
main module, and then run `submodule update --recursive --rebase` on the
submodules. With this change, we still run `merge --ff-only` on the
main module, but now run `submodule update --recursive --checkout` on
the submodules. I believe this better reflects the intent of --ff-only
to have it apply to both the main module and the submodules.
(Sidenote: It is somewhat interesting that all merges pass `--checkout`
to submodule update, even when `--no-ff` is specified, meaning that it
will only do fast-forward merges for submodules. This was discussed in
commit a6d7eb2c7a ("pull: optionally rebase submodules (remote submodule
changes only)", 2017-06-23). The same limitations apply now as then, so
we are not trying to fix this at this time.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "FORCE" dependency from the "tags", "TAGS" and "cscope"
targets, instead make them depend on whether or not the relevant
source files have changed.
For the cscope target we need to change it to depend on the actual
generated file while we generate while we're at it, as the next commit
will discuss we always generate a cscope.out file.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the "check_sub_test_lib_test()" and its sister functions to a new
lib-subtest.sh.
In the future (not in this series) I'd like to test test-lib's output
in a more targeted and smaller test, and I'll need these functions to
do that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The warning about pulling without specifying how to reconcile divergent
branches says that after setting pull.rebase to true, --ff-only can
still be passed on the command line to require a fast-forward. Make that
actually work.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
[en: updated tests; note 3 fixes and 1 new failure]
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were already code checking that --rebase was incompatible with
a merge of multiple heads. However, we were sometimes throwing warnings
about lack of specification of rebase vs. merge when given multiple
heads. Since rebasing is disallowed with multiple merge heads, that
seems like a poor warning to print; we should instead just assume
merging is wanted.
Add a few tests checking multiple merge head behavior.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interaction of rebase and merge flags and options was not well
tested. Add several tests to check for correct behavior from the
following rules:
* --ff-only vs. --[no-]rebase
(and the related pull.ff=only vs. pull.rebase)
* --rebase[=!false] vs. --no-ff and --ff
(and the related pull.rebase=!false overrides pull.ff=!only)
* command line flags take precedence over config, except:
* --no-rebase heeds pull.ff=!only
* pull.rebase=!false vs --no-ff and --ff
For more details behind these rules and a larger table of individual
cases, refer to https://lore.kernel.org/git/xmqqwnpqot4m.fsf@gitster.g/
and the links found therein.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a full hexadecimal hash is given as a --negotiation-tip to "git
fetch", and that hash does not correspond to an object, "git fetch" will
segfault if --negotiate-only is given and will silently ignore that hash
otherwise. Make these cases fatal errors, just like the case when an
invalid ref name or abbreviated hash is given.
While at it, mark the error messages as translatable.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 477673d6f3 ("send-pack: support push negotiation", 2021-05-05)
did not test the case in which a remote advertises at least one ref. In
such a case, "remote_refs" in get_commons_through_negotiation() in
send-pack.c would also contain those refs with a zero ref->new_oid (in
addition to the refs being pushed with a nonzero ref->new_oid). Passing
them as negotiation tips to "git fetch" causes an error, so filter them
out.
(The exact error that would happen in "git fetch" in this case is a
segmentation fault, which is unwanted. This will be fixed in the
subsequent commit.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 477673d6f3 ("send-pack: support push negotiation", 2021-05-05)
introduced the push.negotiate config variable and included a test. The
test only covered pushing without a remote helper, so the fact that
pushing with a remote helper doesn't work went unnoticed.
This is ultimately caused by the "url" field not being set in the args
struct. This field being unset probably went unnoticed because besides
push negotiation, this field is only used to generate a "pushee" line in
a push cert (and if not given, no such line is generated). Therefore,
set this field.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a segfault in the --stdin-packs option added in
339bce27f4 (builtin/pack-objects.c: add '--stdin-packs' option,
2021-02-22).
The read_packs_list_from_stdin() function didn't check that the lines
it was reading were valid packs, and thus when doing the QSORT() with
pack_mtime_cmp() we'd have a NULL "util" field. The "util" field is
used to associate the names of included/excluded packs with the
packed_git structs they correspond to.
The logic error was in assuming that we could iterate all packs and
annotate the excluded and included packs we got, as opposed to
checking the lines we got on stdin. There was a check for excluded
packs, but included packs were simply assumed to be valid.
As noted in the test we'll not report the first bad line, but whatever
line sorted first according to the string-list.c API. In this case I
think that's fine. There was further discussion of alternate
approaches in [1].
Even though we're being lazy let's assert the line we do expect to get
in the test, since whoever changes this code in the future might miss
this case, and would want to update the test and comments.
1. http://lore.kernel.org/git/YND3h2l10PlnSNGJ@nand.local
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Don't show the very verbose $(FIND_SOURCE_FILES) command on every
"make cscope" invocation.
See my recent 3c80fcb591 (Makefile: add QUIET_GEN to "tags" and "TAGS"
targets, 2021-03-28) for the same fix for the other adjacent targets.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the ".PHONY: cscope" rule to live alongside the "cscope" target
itself, not to be all the way near the bottom where we define the
"FORCE" rule.
That line was last modified in 2f76919517 (MinGW: avoid collisions
between "tags" and "TAGS", 2010-09-28).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cover blindspots in the testing of stdin handling, including the
"!len" condition added in b5d97e6b0a (pack-objects: run rev-list
equivalent internally., 2006-09-04). The codepath taken with --revs
and read_object_list_from_stdin() acts differently in some of these
common cases, let's test for those.
The "--stdin --revs" test being added here stresses the combination of
--stdin-packs and the revision.c --stdin argument, some of this was
covered in a test added in 339bce27f4 (builtin/pack-objects.c: add
'--stdin-packs' option, 2021-02-22), but let's make sure that
GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=true keeps erroring out about
--stdin, and it isn't picked up by the revision.c API's handling of
that option.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "heads" section on the gitweb summary page shows heads in
`-committerdate` order (ie. the most recently-modified ones at the
top), tie-breaking equal-dated refs using the implicit `refname` sort
fallback. This recency-based ordering appears in multiple places in the
UI, such as the project listing, the tags list, and even the
shortlog and log views.
Given two equal-dated refs, however, sorting the `HEAD` ref before
the non-`HEAD` ref provides more useful signal than merely sorting by
refname. For example, say we had "master" and "trunk" both pointing at
the same commit but "trunk" was `HEAD`, sorting "trunk" first helps
communicate its special status as the default branch that you'll check
out if you clone the repo.
Add `-HEAD` as a secondary sort key to the `git for-each-ref` call
in `git_get_heads_list()` to provide the desired behavior. The most
recently committed refs will appear first, but `HEAD`-ness will be used
as a tie-breaker. Note that `refname` is the implicit fallback sort key,
which means that two same-dated non-`HEAD` refs will continue to be
sorted in lexicographical order, as they are today.
Signed-off-by: Greg Hurrell <greg@hurrell.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-10 09:49:20 +09:00
662 changed files with 82337 additions and 70566 deletions
@ -485,7 +507,9 @@ static int batch_unordered_packed(const struct object_id *oid,
uint32_tpos,
void*data)
{
returnbatch_unordered_object(oid,data);
returnbatch_unordered_object(oid,pack,
nth_packed_object_offset(pack,pos),
data);
}
staticintbatch_objects(structbatch_options*opt)
@ -529,6 +553,8 @@ static int batch_objects(struct batch_options *opt)
if(has_promisor_remote())
warning("This repository uses promisor remotes. Some objects may not be loaded.");
read_replace_refs=0;
cb.opt=opt;
cb.expand=&data;
cb.scratch=&output;
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.