"To dereference" and "to peel" were sometimes used in in-code
comments and documentation but without description in the glossary.
* vd/glossary-dereference-peel:
glossary: add definitions for dereference & peel
Add 'gitglossary' definitions for "dereference" (as it used for both symrefs
and objects) and "peel". These terms are used in options and documentation
throughout Git, but they are not clearly defined anywhere and the behavior
they refer to depends heavily on context. Provide explicit definitions to
clarify existing documentation to users and help contributors to use the
most appropriate terminology possible in their additions to Git.
Update other definitions in the glossary that use the term "dereference" to
link to 'def_dereference'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rev-list --unpacked --objects" failed to exclude packed
non-commit objects, which has been corrected.
* tb/rev-list-unpacked-fix:
pack-bitmap: drop --unpacked non-commit objects from results
list-objects: drop --unpacked non-commit objects from results
Leakfix.
* ps/leakfixes:
setup: fix leaking repository format
setup: refactor `upgrade_repository_format()` to have common exit
shallow: fix memory leak when registering shallow roots
test-bloom: stop setting up Git directory twice
Another step to deprecate test_i18ngrep.
* jc/test-i18ngrep:
tests: teach callers of test_i18ngrep to use test_grep
test framework: further deprecate test_i18ngrep
"git merge-file" learns a mode to read three contents to be merged
from blob objects.
* bc/merge-file-object-input:
merge-file: add an option to process object IDs
git-merge-file doc: drop "-file" from argument placeholders
"git rev-list --missing" did not work for missing commit objects,
which has been corrected.
* kn/rev-list-missing-fix:
rev-list: add commit object support in `--missing` option
rev-list: move `show_commit()` to the bottom
revision: rename bit to `do_not_die_on_missing_objects`
Teach "git show-ref" a mode to check the existence of a ref.
* ps/show-ref:
t: use git-show-ref(1) to check for ref existence
builtin/show-ref: add new mode to check for reference existence
builtin/show-ref: explicitly spell out different modes in synopsis
builtin/show-ref: ensure mutual exclusiveness of subcommands
builtin/show-ref: refactor options for patterns subcommand
builtin/show-ref: stop using global vars for `show_one()`
builtin/show-ref: stop using global variable to count matches
builtin/show-ref: refactor `--exclude-existing` options
builtin/show-ref: fix dead code when passing patterns
builtin/show-ref: fix leaking string buffer
builtin/show-ref: split up different subcommands
builtin/show-ref: convert pattern to a local variable
The codepath to traverse the commit-graph learned to notice that a
commit is missing (e.g., corrupt repository lost an object), even
though it knows something about the commit (like its parents) from
what is in commit-graph.
* ps/do-not-trust-commit-graph-blindly-for-existence:
commit: detect commits that exist in commit-graph but not in the ODB
commit-graph: introduce envvar to disable commit existence checks
Further limit tree depth max to avoid Windows build running out of
the stack space.
* jk/tree-name-and-depth-limit:
max_tree_depth: lower it for MSVC to avoid stack overflows
When performing revision queries with `--objects` and
`--use-bitmap-index`, the output may incorrectly contain objects which
are packed, even when the `--unpacked` option is given. This affects
traversals, but also other querying operations, like `--count`,
`--disk-usage`, etc.
Like in the previous commit, the fix is to exclude those objects from
the result set before they are shown to the user (or, in this case,
before the bitmap containing the result of the traversal is enumerated
and its objects listed).
This is performed by a new function in pack-bitmap.c, called
`filter_packed_objects_from_bitmap()`. Note that we do not have to
inspect individual bits in the result bitmap, since we know that the
first N (where N is the number of objects in the bitmap's pack/MIDX)
bits correspond to objects which packed by definition.
In other words, for an object to have a bitmap position (not in the
extended index), it must appear in either the bitmap's pack or one of
the packs in its MIDX.
This presents an appealing optimization to us, which is that we can
simply memset() the corresponding number of `eword_t`'s to zero,
provided that we handle any objects which spill into the next word (but
don't occupy all 64 bits of the word itself).
We only have to handle objects in the bitmap's extended index. These
objects may (or may not) appear in one or more pack(s). Since these
objects are known to not appear in either the bitmap's MIDX or pack,
they may be stored as loose, appear in other pack(s), or both.
Before returning a bitmap containing the result of the traversal back to
the caller, drop any bits from the extended index which appear in one or
more packs. This implements the correct behavior for rev-list operations
which use the bitmap index to compute their result.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-rev-list(1), we describe the `--unpacked` option as:
Only useful with `--objects`; print the object IDs that are not in
packs.
This is true of commits, which we discard via get_commit_action(), but
not of the objects they reach. So if we ask for an --objects traversal
with --unpacked, we may get arbitrarily many objects which are indeed
packed.
I am nearly certain this behavior dates back to the introduction of
`--unpacked` via 12d2a18780 ("git rev-list --unpacked" shows only
unpacked commits, 2005-07-03), but I couldn't get that revision of Git
to compile for me. At least as early as v2.0.0 this has been subtly
broken:
$ git.compile --version
git version 2.0.0
$ git.compile rev-list --objects --all --unpacked
72791fe96c93f9ec5c311b8bc966ab349b3b5bbe
05713d991c18bbeef7e154f99660005311b5004d v1.0
153ed8b7719c6f5a68ce7ffc43133e95a6ac0fdb
8e4020bb5a8d8c873b25de15933e75cc0fc275df one
9200b628cf9dc883a85a7abc8d6e6730baee589c two
3e6b46e1b7e3b91acce99f6a823104c28aae0b58 unpacked.t
There, only the first, third, and sixth entries are loose, with the
remaining set of objects belonging to at least one pack.
The implications for this are relatively benign: bare 'git repack'
invocations which invoke pack-objects with --unpacked are impacted, and
at worst we'll store a few extra objects that should have been excluded.
Arguably changing this behavior is a backwards-incompatible change,
since it alters the set of objects emitted from rev-list queries with
`--objects` and `--unpacked`. But I argue that this change is still
sensible, since the existing implementation deviates from
clearly-written documentation.
The fix here is straightforward: avoid showing any non-commit objects
which are contained in packs by discarding them within list-objects.c,
before they are shown to the user. Note that similar treatment for
`list-objects.c::show_commit()` is not needed, since that case is
already handled by `revision.c::get_commit_action()`.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Offer a slightly more verbose description of the issue fixed by
7144dee3ec (credential/libsecret: erase matching creds only, 2023-07-26)
and cb626f8e5c (credential/wincred: erase matching creds only,
2023-07-26).
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git bugreport" learned to complain when it received a command line
argument that it will not use.
* es/bugreport-no-extra-arg:
bugreport: reject positional arguments
t0091-bugreport: stop using i18ngrep
"git send-email" did not have certain pieces of data computed yet
when it tried to validate the outging messages and its recipient
addresses, which has been sorted out.
* ms/send-email-validate-fix:
send-email: move validation code below process_address_list
"git reflog expire --single-worktree" has been broken for the past
20 months or so, which has been corrected.
* rs/reflog-expire-single-worktree-fix:
reflog: fix expire --single-worktree
"cd sub && git grep -f patterns" tried to read "patterns" file at
the top level of the working tree; it has been corrected to read
"sub/patterns" instead.
* jc/grep-f-relative-to-cwd:
grep: -f <path> is relative to $cwd
While populating the `repository_format` structure may cause us to
allocate memory, we do not call `clear_repository_format()` in some
places and thus potentially leak memory. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `upgrade_repository_format()` function has multiple exit paths,
which means that there is no common cleanup of acquired resources.
While this isn't much of a problem right now, we're about to fix a
memory leak that would require us to free the resource in every one of
those exit paths.
Refactor the code to have a common exit path so that the subsequent
memory leak fix becomes easier to implement.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When registering shallow roots, we unset the list of parents of the
to-be-registered commit if it's already been parsed. This causes us to
leak memory though because we never free this list. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're setting up the Git directory twice in the `test-tool bloom`
helper, once at the beginning of `cmd_bloom()` and once in the local
subcommand implementation `get_bloom_filter_for_commit()`. This can lead
to memory leaks as we'll overwrite variables of `the_repository` with
newly allocated data structures. On top of that it's simply unnecessary.
Fix this by only setting up the Git directory once.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The perl script introduced by 86b008ee61 (t: add library for munging
chunk-format files, 2023-10-09) uses pack("Q") and unpack("Q") to read
and write 64-bit values ("quadwords" in perl parlance) from the on-disk
chunk files. However, some builds of perl may not support 64-bit
integers at all, and throw an exception here. While some 32-bit
platforms may still support 64-bit integers in perl (such as our linux32
CI environment), others reportedly don't (the NonStop 32-bit builds).
We can work around this by treating the 64-bit values as two 32-bit
values. We can't ever combine them into a single 64-bit value, but in
practice this is OK. These are representing file offsets, and our files
are much smaller than 4GB. So the upper half of the 64-bit value will
always be 0.
We can just introduce a few helper functions which perform the
translation and double-check our assumptions.
Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In April, GitHub announced that the `macos-13` pool is available:
https://github.blog/changelog/2023-04-24-github-actions-macos-13-is-now-available/.
It is only a matter of time until the `macos-12` pool is going away,
therefore we should switch now, without pressure of a looming deadline.
Since the `macos-13` runners no longer include Python2, we also drop
specifically testing with Python2 and switch uniformly to Python3, see
https://github.com/actions/runner-images/blob/HEAD/images/macos/macos-13-Readme.md
for details about the software available on the `macos-13` pool's
runners.
Also, on macOS 13, Homebrew seems to install a `gcc@9` package that no
longer comes with a regular `unistd.h` (there seems only to be a
`ssp/unistd.h`), and hence builds would fail with:
In file included from base85.c:1:
git-compat-util.h:223:10: fatal error: unistd.h: No such file or directory
223 | #include <unistd.h>
| ^~~~~~~~~~
compilation terminated.
The reason why we install GCC v9.x explicitly is historical, and back in
the days it was because it was the _newest_ version available via
Homebrew: 176441bfb5 (ci: build Git with GCC 9 in the 'osx-gcc' build
job, 2019-11-27).
To reinstate the spirit of that commit _and_ to fix that build failure,
let's switch to the now-newest GCC version: v13.x.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 41771fa435 (cache.h: remove dependence on hex.h; make other files
include it explicitly, 2023-02-24) we added this as part of a larger
mechanical refactor. But strvec doesn't actually depend on hex.h, so
remove it.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
They are equivalents and the former still exists, so as long as the
only change this commit makes are to rewrite test_i18ngrep to
test_grep, there won't be any new bug, even if there still are
callers of test_i18ngrep remaining in the tree, or when merged to
other topics that add new uses of test_i18ngrep.
This patch was produced more or less with
git grep -l -e 'test_i18ngrep ' 't/t[0-9][0-9][0-9][0-9]-*.sh' |
xargs perl -p -i -e 's/test_i18ngrep /test_grep /'
and a good way to sanity check the result yourself is to run the
above in a checkout of c4603c1c (test framework: further deprecate
test_i18ngrep, 2023-10-31) and compare the resulting working tree
contents with the result of applying this patch to the same commit.
You'll see that test_i18ngrep in a few t/lib-*.sh files corrected,
in addition to the manual reproduction.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update an error message (which would probably never been seen).
* ob/sequencer-reword-error-message:
sequencer: fix error message on failure to copy SQUASH_MSG
UBSAN options were not propagated through the test framework to git
run via the httpd, unlike ASAN options, which has been corrected.
* jk/test-pass-ubsan-options-to-http-test:
test-lib: set UBSAN_OPTIONS to match ASan
An error message given by "git send-email" when given a malformed
address did not give correct information, which has been corrected.
* tb/send-email-extract-valid-address-error-message-fix:
git-send-email.perl: avoid printing undef when validating addresses
HTTP Header redaction code has been adjusted for a newer version of
cURL library that shows its traces differently from earlier
versions.
* jk/redact-h2h3-headers-fix:
http: update curl http/2 info matching for curl 8.3.0
http: factor out matching of curl http/2 trace lines
Clarify how "alias.foo = : git cmd ; aliased-command-string" should
be spelled with necessary whitespaces around punctuation marks to
work.
* pb/completion-aliases-doc:
completion: improve doc for complex aliases
"git diff --cached" codepath did not fill the necessary stat
information for a file when fsmonitor knows it is clean and ended
up behaving as if it is not clean, which has been corrected.
* js/diff-cached-fsmonitor-fix:
diff-lib: fix check_removed when fsmonitor is on
Update "git maintainance" timers' implementation based on systemd
timers to work with WSL.
* js/systemd-timers-wsl-fix:
maintenance(systemd): support the Windows Subsystem for Linux
"git diff --no-index -R <(one) <(two)" did not work correctly,
which has been corrected.
* pw/diff-no-index-from-named-pipes:
diff --no-index: fix -R with stdin
The completion script (in contrib/) has been taught to treat the
"-t" option to "git checkout" and "git switch" just like the
"--track" option, to complete remote-tracking branches.
* js/complete-checkout-t:
completion(switch/checkout): treat --track and -t the same
"git grep -e A --no-or -e B" is accepted, even though the negation
of "or" did not mean anything, which has been tightened.
* rs/grep-no-no-or:
grep: reject --no-or
References from description of the `--patch` option in various
manual pages have been simplified and improved.
* so/diff-doc-for-patch-update:
doc/diff-options: fix link to generating patch section
Various fixes to the behaviour of "rebase -i" when the command got
interrupted by conflicting changes.
cf. <6b927687-cf6e-d73e-78fb-bd4f46736928@gmx.de>
* pw/rebase-i-after-failure:
rebase -i: fix adding failed command to the todo list
rebase --continue: refuse to commit after failed command
rebase: fix rewritten list for failed pick
sequencer: factor out part of pick_commits()
sequencer: use rebase_path_message()
rebase -i: remove patch file after conflict resolution
rebase -i: move unlink() calls
"git for-each-ref --sort='contents:size'" sorts the refs according
to size numerically, giving a ref that points at a blob twelve-byte
(12) long before showing a blob hundred-byte (100) long.
* ks/ref-filter-sort-numerically:
ref-filter: sort numerically when ":size" is used
"git diff --no-such-option" and other corner cases around the exit
status of the "diff" command has been corrected.
* jk/diff-result-code-cleanup:
diff: drop useless "status" parameter from diff_result_code()
diff: drop useless return values in git-diff helpers
diff: drop useless return from run_diff_{files,index} functions
diff: die when failing to read index in git-diff builtin
diff: show usage for unknown builtin_diff_files() options
diff-files: avoid negative exit value
diff: spell DIFF_INDEX_CACHED out when calling run_diff_index()
The use of API between two calls to require_clean_work_tree() from
the sequencer code has been cleaned up for consistency.
* ob/sequencer-empty-hint-fix:
sequencer: rectify empty hint in call of require_clean_work_tree()
transfer.unpackLimit ought to be used as a fallback, but overrode
fetch.unpackLimit and receive.unpackLimit instead.
* ts/unpacklimit-config-fix:
transfer.unpackLimit: fetch/receive.unpackLimit takes precedence
"git diff -w --exit-code" with various options did not work
correctly, which is being addressed.
* jc/diff-exit-code-with-w-fixes:
diff: the -w option breaks --exit-code for --raw and other output modes
t4040: remove test that succeeded for a wrong reason
diff: teach "--stat -w --exit-code" to notice differences
diff: mode-only change should be noticed by "--patch -w --exit-code"
diff: move the fallback "--exit-code" code down
The commit-graph verification code that detects mixture of zero and
non-zero generation numbers has been updated.
* tb/commit-graph-verify-fix:
commit-graph: avoid repeated mixed generation number warnings
t/t5318-commit-graph.sh: test generation zero transitions during fsck
commit-graph: verify swapped zero/non-zero generation cases
commit-graph: introduce `commit_graph_generation_from_graph()`
Tweak GitHub Actions CI so that pushing the same commit to multiple
branch tips at the same time will not waste building and testing
the same thing twice.
* jc/ci-skip-same-commit:
ci: avoid building from the same commit in parallel
Overly long label names used in the sequencer machinery are now
chopped to fit under filesystem limitation.
* mp/rebase-label-length-limit:
rebase: allow overriding the maximal length of the generated labels
sequencer: truncate labels to accommodate loose refs
GitHub CI workflow has learned to trigger Coverity check.
* js/ci-coverity:
coverity: detect and report when the token or project is incorrect
coverity: allow running on macOS
coverity: support building on Windows
coverity: allow overriding the Coverity project
coverity: cache the Coverity Build Tool
ci: add a GitHub workflow to submit Coverity scans
Tests with LSan from time to time seem to emit harmless message
that makes our tests unnecessarily flakey; we work it around by
filtering the uninteresting output.
* jk/test-lsan-denoise-output:
test-lib: ignore uninteresting LSan output
Flakey "git p4" tests, as well as "git svn" tests, are now skipped
in the (rather expensive) sanitizer CI job.
* js/ci-san-skip-p4-and-svn-tests:
ci(linux-asan-ubsan): let's save some time
Tests that are known to pass with LSan are now marked as such.
* tb/mark-more-tests-as-leak-free:
leak tests: mark t5583-push-branches.sh as leak-free
leak tests: mark t3321-notes-stripspace.sh as leak-free
leak tests: mark a handful of tests as leak-free
There seems to be some internal stack overflow detection in MSVC's
`malloc()` machinery that seems to be independent of the `stack reserve`
and `heap reserve` sizes specified in the executable (editable via
`EDITBIN /STACK:<n> <exe>` and `EDITBIN /HEAP:<n> <exe>`).
In the newly test cases added by `jk/tree-name-and-depth-limit`, this
stack overflow detection is unfortunately triggered before Git can print
out the error message about too-deep trees and exit gracefully. Instead,
it exits with `STATUS_STACK_OVERFLOW`. This corresponds to the numeric
value -1073741571, something the MSYS2 runtime we sadly need to use to
run Git's test suite cannot handle and which it internally maps to the
exit code 127. Git's test suite, in turn, mistakes this to mean that the
command was not found, and fails both test cases.
Here is an example stack trace from an example run:
[0x0] ntdll!RtlpAllocateHeap+0x31 0x4212603f50 0x7ff9d6d4cd49
[0x1] ntdll!RtlpAllocateHeapInternal+0x6c9 0x42126041b0 0x7ff9d6e14512
[0x2] ntdll!RtlDebugAllocateHeap+0x102 0x42126042b0 0x7ff9d6dcd8b0
[0x3] ntdll!RtlpAllocateHeap+0x7ec70 0x4212604350 0x7ff9d6d4cd49
[0x4] ntdll!RtlpAllocateHeapInternal+0x6c9 0x42126045b0 0x7ff9596ed480
[0x5] ucrtbased!heap_alloc_dbg_internal+0x210 0x42126046b0 0x7ff9596ed20d
[0x6] ucrtbased!heap_alloc_dbg+0x4d 0x4212604750 0x7ff9596f037f
[0x7] ucrtbased!_malloc_dbg+0x2f 0x42126047a0 0x7ff9596f0dee
[0x8] ucrtbased!malloc+0x1e 0x42126047d0 0x7ff730fcc1ef
[0x9] git!do_xmalloc+0x2f 0x4212604800 0x7ff730fcc2b9
[0xa] git!do_xmallocz+0x59 0x4212604840 0x7ff730fca779
[0xb] git!xmallocz_gently+0x19 0x4212604880 0x7ff7311b0883
[0xc] git!unpack_compressed_entry+0x43 0x42126048b0 0x7ff7311ac9a4
[0xd] git!unpack_entry+0x554 0x42126049a0 0x7ff7311b0628
[0xe] git!cache_or_unpack_entry+0x58 0x4212605250 0x7ff7311ad3a8
[0xf] git!packed_object_info+0x98 0x42126052a0 0x7ff7310a92da
[0x10] git!do_oid_object_info_extended+0x3fa 0x42126053b0 0x7ff7310a44e7
[0x11] git!oid_object_info_extended+0x37 0x4212605460 0x7ff7310a38ba
[0x12] git!repo_read_object_file+0x9a 0x42126054a0 0x7ff7310a6147
[0x13] git!read_object_with_reference+0x97 0x4212605560 0x7ff7310b4656
[0x14] git!fill_tree_descriptor+0x66 0x4212605620 0x7ff7310dc0a5
[0x15] git!traverse_trees_recursive+0x3f5 0x4212605680 0x7ff7310dd831
[0x16] git!unpack_callback+0x441 0x4212605790 0x7ff7310b4c95
[0x17] git!traverse_trees+0x5d5 0x42126058a0 0x7ff7310dc0f2
[0x18] git!traverse_trees_recursive+0x442 0x4212605980 0x7ff7310dd831
[0x19] git!unpack_callback+0x441 0x4212605a90 0x7ff7310b4c95
[0x1a] git!traverse_trees+0x5d5 0x4212605ba0 0x7ff7310dc0f2
[0x1b] git!traverse_trees_recursive+0x442 0x4212605c80 0x7ff7310dd831
[0x1c] git!unpack_callback+0x441 0x4212605d90 0x7ff7310b4c95
[0x1d] git!traverse_trees+0x5d5 0x4212605ea0 0x7ff7310dc0f2
[0x1e] git!traverse_trees_recursive+0x442 0x4212605f80 0x7ff7310dd831
[0x1f] git!unpack_callback+0x441 0x4212606090 0x7ff7310b4c95
[0x20] git!traverse_trees+0x5d5 0x42126061a0 0x7ff7310dc0f2
[0x21] git!traverse_trees_recursive+0x442 0x4212606280 0x7ff7310dd831
[...]
[0xfad] git!cmd_main+0x2a2 0x42126ff740 0x7ff730fb6345
[0xfae] git!main+0xe5 0x42126ff7c0 0x7ff730fbff93
[0xfaf] git!wmain+0x2a3 0x42126ff830 0x7ff731318859
[0xfb0] git!invoke_main+0x39 0x42126ff8a0 0x7ff7313186fe
[0xfb1] git!__scrt_common_main_seh+0x12e 0x42126ff8f0 0x7ff7313185be
[0xfb2] git!__scrt_common_main+0xe 0x42126ff960 0x7ff7313188ee
[0xfb3] git!wmainCRTStartup+0xe 0x42126ff990 0x7ff9d5ed257d
[0xfb4] KERNEL32!BaseThreadInitThunk+0x1d 0x42126ff9c0 0x7ff9d6d6aa78
[0xfb5] ntdll!RtlUserThreadStart+0x28 0x42126ff9f0 0x0
I verified manually that `traverse_trees_cur_depth` was 562 when that
happened, which is far below the 2048 that were already accepted into
Git as a hard limit.
Despite many attempts to figure out which of the internals trigger this
`STATUS_STACK_OVERFLOW` and how to maybe increase certain sizes to avoid
running into this issue and let Git behave the same way as under Linux,
I failed to find any build-time/runtime knob we could turn to that
effect.
Note: even switching to using a different allocator (I used mimalloc
because that's what Git for Windows uses for its GCC builds) does not
help, as the zlib code used to unpack compressed pack entries _still_
uses the regular `malloc()`. And runs into the same issue.
Note also: switching to using a different allocator _also_ for zlib code
seems _also_ not to help. I tried that, and it still exited with
`STATUS_STACK_OVERFLOW` that seems to have been triggered by a
`mi_assert_internal()`, i.e. an internal assertion of mimalloc...
So the best bet to work around this for now seems to just lower the
maximum allowed tree depth _even further_ for MSVC builds.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git merge-file knows how to merge files on the file system already. It
would be helpful, however, to allow it to also merge single blobs.
Teach it an `--object-id` option which means that its arguments are
object IDs and not files to allow it to do so.
We handle the empty blob specially since read_mmblob doesn't read it
directly and otherwise users cannot specify an empty ancestor.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git merge-file` takes three positional arguments. Each of them is
documented as `<foo-file>`. In preparation for teaching this command to
alternatively take three object IDs, make these placeholders a bit more
generic by dropping the "-file" parts. Instead, clarify early that the
three arguments are filenames. Even after the next commit, we can afford
to present this file-centric view up front and in the general
discussion, since it will remain the default one.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back in 32f3c541e3 (multi-pack-index: write pack names in chunk,
2018-07-12) the MIDX's "Packfile Names" (or "PNAM", for short) chunk was
described as containing an array of string entries. e0d1bcf825 notes
that this is the only chunk in the MIDX format's specification that is
not guaranteed to be 4-byte aligned, and so should be placed last.
This isn't quite accurate: the entries within the PNAM chunk are not
guaranteed to be 4-byte aligned since they are arbitrary strings, but
the chunk itself is 4-byte aligned since the ending is padded with NUL
bytes.
That padding has always been there since 32f3c541e3 via
midx.c::write_midx_pack_names(), which ended with:
i = MIDX_CHUNK_ALIGNMENT - (written % MIDX_CHUNK_ALIGNMENT)
if (i < MIDX_CHUNK_ALIGNMENT) {
unsigned char padding[MIDX_CHUNK_ALIGNMENT];
memset(padding, 0, sizeof(padding))
hashwrite(f, padding, i);
written += i;
}
In fact, 32f3c541e3's log message itself describes the chunk in its
first paragraph with:
Since filenames are not well structured, add padding to keep good
alignment in later chunks.
So these have always been externally aligned. Correct the corresponding
part of our documentation to reflect that.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
e0d1bcf825 (multi-pack-index: add format details, 2018-07-12) describes
the MIDX's "PNAM" chunk as having entries which are "null-terminated
strings".
This is a typo, as strings are terminated with a NUL character, which is
a distinct concept from "NULL" or "null", which we typically reserve for
the void pointer to address 0.
Correct the documentation accordingly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert tests that use `test_path_is_file` and `test_path_is_missing` to
instead use a set of helpers `test_ref_exists` and `test_ref_missing`.
These helpers are implemented via the newly introduced `git show-ref
--exists` command. Thus, we can avoid intimate knowledge of how the ref
backend stores references on disk.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we have multiple ways to show the value of a given reference, we
do not have any way to check whether a reference exists at all. While
commands like git-rev-parse(1) or git-show-ref(1) can be used to check
for reference existence in case the reference resolves to something
sane, neither of them can be used to check for existence in some other
scenarios where the reference does not resolve cleanly:
- References which have an invalid name cannot be resolved.
- References to nonexistent objects cannot be resolved.
- Dangling symrefs can be resolved via git-symbolic-ref(1), but this
requires the caller to special case existence checks depending on
whether or not a reference is symbolic or direct.
Furthermore, git-rev-list(1) and other commands do not let the caller
distinguish easily between an actually missing reference and a generic
error.
Taken together, this seems like sufficient motivation to introduce a
separate plumbing command to explicitly check for the existence of a
reference without trying to resolve its contents.
This new command comes in the form of `git show-ref --exists`. This
new mode will exit successfully when the reference exists, with a
specific exit code of 2 when it does not exist, or with 1 when there
has been a generic error.
Note that the only way to properly implement this command is by using
the internal `refs_read_raw_ref()` function. While the public function
`refs_resolve_ref_unsafe()` can be made to behave in the same way by
passing various flags, it does not provide any way to obtain the errno
with which the reference backend failed when reading the reference. As
such, it becomes impossible for us to distinguish generic errors from
the explicit case where the reference wasn't found.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The synopsis treats the `--verify` and the implicit mode the same. They
are slightly different though:
- They accept different sets of flags.
- The implicit mode accepts patterns while the `--verify` mode
accepts references.
Split up the synopsis for these two modes such that we can disambiguate
those differences.
While at it, drop "--quiet" from the pattern mode's synopsis. It does
not make a lot of sense to list patterns, but squelch the listing output
itself. The description for "--quiet" is adapted accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-show-ref(1) command has three different modes, of which one is
implicit and the other two can be chosen explicitly by passing a flag.
But while these modes are standalone and cause us to execute completely
separate code paths, we gladly accept the case where a user asks for
both `--exclude-existing` and `--verify` at the same time even though it
is not obvious what will happen. Spoiler: we ignore `--verify` and
execute the `--exclude-existing` mode.
Let's explicitly detect this invalid usage and die in case both modes
were requested.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The patterns subcommand is the last command that still uses global
variables to track its options. Convert it to use a structure instead
with the same motivation as preceding commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `show_one()` function implicitly receives a bunch of options which
are tracked via global variables. This makes it hard to see which
subcommands of git-show-ref(1) actually make use of these options.
Introduce a `show_one_options` structure that gets passed down to this
function. This allows us to get rid of more global state and makes it
more explicit which subcommands use those options.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When passing patterns to git-show-ref(1) we're checking whether any
reference matches -- if none do, we indicate this condition via an
unsuccessful exit code.
We're using a global variable to count these matches, which is required
because the counter is getting incremented in a callback function. But
now that we have the `struct show_ref_data` in place, we can get rid of
the global variable and put the counter in there instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's not immediately obvious options which options are applicable to
what subcommand in git-show-ref(1) because all options exist as global
state. This can easily cause confusion for the reader.
Refactor options for the `--exclude-existing` subcommand to be contained
in a separate structure. This structure is stored on the stack and
passed down as required. Consequently, it clearly delimits the scope of
those options and requires the reader to worry less about global state.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When passing patterns to `git show-ref` we have some code that will
cause us to die if `verify && !quiet` is true. But because `verify`
indicates a different subcommand of git-show-ref(1) that causes us to
execute `cmd_show_ref__verify()` and not `cmd_show_ref__patterns()`, the
condition cannot ever be true.
Let's remove this dead code.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a leaking string buffer in `git show-ref --exclude-existing`. While
the buffer is technically not leaking because its variable is declared
as static, there is no inherent reason why it should be.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While not immediately obvious, git-show-ref(1) actually implements three
different subcommands:
- `git show-ref <patterns>` can be used to list references that
match a specific pattern.
- `git show-ref --verify <refs>` can be used to list references.
These are _not_ patterns.
- `git show-ref --exclude-existing` can be used as a filter that
reads references from standard input, performing some conversions
on each of them.
Let's make this more explicit in the code by splitting up the three
subcommands into separate functions. This also allows us to address the
confusingly named `patterns` variable, which may hold either patterns or
reference names.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `pattern` variable is a global variable that tracks either the
reference names (not patterns!) for the `--verify` mode or the patterns
for the non-verify mode. This is a bit confusing due to the slightly
different meanings.
Convert the variable to be local. While this does not yet fix the double
meaning of the variable, this change allows us to address it in a
subsequent patch more easily by explicitly splitting up the different
subcommands of git-show-ref(1).
Note that we introduce a `struct show_ref_data` to pass the patterns to
`show_ref()`. While this is overengineered now, we will extend this
structure in a subsequent patch.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--missing` object option in rev-list currently works only with
missing blobs/trees. For missing commits the revision walker fails with
a fatal error.
Let's extend the functionality of `--missing` option to also support
commit objects. This is done by adding a `missing_objects` field to
`rev_info`. This field is an `oidset` to which we'll add the missing
commits as we encounter them. The revision walker will now continue the
traversal and call `show_commit()` even for missing commits. In rev-list
we can then check if the commit is a missing commit and call the
existing code for parsing `--missing` objects.
A scenario where this option would be used is to find the boundary
objects between different object directories. Consider a repository with
a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
repository, using the `--missing=print` option while disabling the
alternate object directory allows us to find the boundary objects
between the main and alternate object directory.
Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `show_commit()` function already depends on `finish_commit()`, and
in the upcoming commit, we'll also add a dependency on
`finish_object__ma()`. Since in C symbols must be declared before
they're used, let's move `show_commit()` below both `finish_commit()`
and `finish_object__ma()`, so the code is cleaner as a whole without the
need for declarations.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bit `do_not_die_on_missing_tree` is used in revision.h to ensure the
revision walker does not die when encountering a missing tree. This is
currently exclusively set within `builtin/rev-list.c` to ensure the
`--missing` option works with missing trees.
In the upcoming commits, we will extend `--missing` to also support
missing commits. So let's rename the bit to
`do_not_die_on_missing_objects`, which is object type agnostic and can
be used for both trees/commits.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/do-not-trust-commit-graph-blindly-for-existence:
commit: detect commits that exist in commit-graph but not in the ODB
commit-graph: introduce envvar to disable commit existence checks
Commit graphs can become stale and contain references to commits that do
not exist in the object database anymore. Theoretically, this can lead
to a scenario where we are able to successfully look up any such commit
via the commit graph even though such a lookup would fail if done via
the object database directly.
As the commit graph is mostly intended as a sort of cache to speed up
parsing of commits we do not want to have diverging behaviour in a
repository with and a repository without commit graphs, no matter
whether they are stale or not. As commits are otherwise immutable, the
only thing that we really need to care about is thus the presence or
absence of a commit.
To address potentially stale commit data that may exist in the graph,
our `lookup_commit_in_graph()` function will check for the commit's
existence in both the commit graph, but also in the object database. So
even if we were able to look up the commit's data in the graph, we would
still pretend as if the commit didn't exist if it is missing in the
object database.
We don't have the same safety net in `parse_commit_in_graph_one()`
though. This function is mostly used internally in "commit-graph.c"
itself to validate the commit graph, and this usage is fine. We do
expose its functionality via `parse_commit_in_graph()` though, which
gets called by `repo_parse_commit_internal()`, and that function is in
turn used in many places in our codebase.
For all I can see this function is never used to directly turn an object
ID into a commit object without additional safety checks before or after
this lookup. What it is being used for though is to walk history via the
parent chain of commits. So when commits in the parent chain of a graph
walk are missing it is possible that we wouldn't notice if that missing
commit was part of the commit graph. Thus, a query like `git rev-parse
HEAD~2` can succeed even if the intermittent commit is missing.
It's unclear whether there are additional ways in which such stale
commit graphs can lead to problems. In any case, it feels like this is a
bigger bug waiting to happen when we gain additional direct or indirect
callers of `repo_parse_commit_internal()`. So let's fix the inconsistent
behaviour by checking for object existence via the object database, as
well.
This check of course comes with a performance penalty. The following
benchmarks have been executed in a clone of linux.git with stable tags
added:
Benchmark 1: git -c core.commitGraph=true rev-list --topo-order --all (git = master)
Time (mean ± σ): 2.913 s ± 0.018 s [User: 2.363 s, System: 0.548 s]
Range (min … max): 2.894 s … 2.950 s 10 runs
Benchmark 2: git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
Time (mean ± σ): 3.834 s ± 0.052 s [User: 3.276 s, System: 0.556 s]
Range (min … max): 3.780 s … 3.961 s 10 runs
Benchmark 3: git -c core.commitGraph=false rev-list --topo-order --all (git = master)
Time (mean ± σ): 13.841 s ± 0.084 s [User: 13.152 s, System: 0.687 s]
Range (min … max): 13.714 s … 13.995 s 10 runs
Benchmark 4: git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
Time (mean ± σ): 13.762 s ± 0.116 s [User: 13.094 s, System: 0.667 s]
Range (min … max): 13.645 s … 14.038 s 10 runs
Summary
git -c core.commitGraph=true rev-list --topo-order --all (git = master) ran
1.32 ± 0.02 times faster than git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
4.72 ± 0.05 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
4.75 ± 0.04 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = master)
We look at a ~30% regression in general, but in general we're still a
whole lot faster than without the commit graph. To counteract this, the
new check can be turned off with the `GIT_COMMIT_GRAPH_PARANOIA` envvar.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our `lookup_commit_in_graph()` helper tries to look up commits from the
commit graph and, if it doesn't exist there, falls back to parsing it
from the object database instead. This is intended to speed up the
lookup of any such commit that exists in the database. There is an edge
case though where the commit exists in the graph, but not in the object
database. To avoid returning such stale commits the helper function thus
double checks that any such commit parsed from the graph also exists in
the object database. This makes the function safe to use even when
commit graphs aren't updated regularly.
We're about to introduce the same pattern into other parts of our code
base though, namely `repo_parse_commit_internal()`. Here the extra
sanity check is a bit of a tougher sell: `lookup_commit_in_graph()` was
a newly introduced helper, and as such there was no performance hit by
adding this sanity check. If we added `repo_parse_commit_internal()`
with that sanity check right from the beginning as well, this would
probably never have been an issue to begin with. But by retrofitting it
with this sanity check now we do add a performance regression to
preexisting code, and thus there is a desire to avoid this or at least
give an escape hatch.
In practice, there is no inherent reason why either of those functions
should have the sanity check whereas the other one does not: either both
of them are able to detect this issue or none of them should be. This
also means that the default of whether we do the check should likely be
the same for both. To err on the side of caution, we thus rather want to
make `repo_parse_commit_internal()` stricter than to loosen the checks
that we already have in `lookup_commit_in_graph()`.
The escape hatch is added in the form of a new GIT_COMMIT_GRAPH_PARANOIA
environment variable that mirrors GIT_REF_PARANOIA. If enabled, which is
the default, we will double check that commits looked up in the commit
graph via `lookup_commit_in_graph()` also exist in the object database.
This same check will also be added in `repo_parse_commit_internal()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As an attempt to come up with a useful mechanism to ensure that
certain messages are left untranslated [*], we earlier wrote
GIT_TEST_GETTEXT_POISON off as a failed experiment.
But the output from the test helper was easier to use while
debugging failed tests, compared to the same test writtein with the
plain-vanilla "grep". Especially when a test that expects a certain
string to appear in the output (e.g. "this test must fail with this
message") fails, "grep message output" would just silently fail and
in a &&-chained sequence of commands, it is hard to tell which step
failed. test_i18ngrep explicitly said "we wanted to see a line that
match this pattern but did not see a hit in this file".
What we have as test_i18ngrep in our tree still retains this verbose
output (even though we got rid of the "poison" support). Let's
rename it to test_grep (because it is no longer about i18n at all)
and then make test_i18ngrep a thin wrapper around it. Existing
callers of test_i18ngrep can be mechanically rewritten to instead
use test_grep over time, but it does not have to be done in this
commit.
[Footnote]
* The idea was that human-facing messages are often translated, but
there are messages that should never be translated. We use
"grep" only for the latter kind of messages, and then run tests
in "poison" mode that spew garbage for translatable messages. If
such a test run fails, it means these messages tested with "grep"
were marked for translation by mistake. test_i18ngrep was to be
used for other messages that are to be translated, and was to
always "succeed" when runing under the "poison" mode.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Information on how users are accessing hosted repositories can be
helpful to server operators. For example, being able to broadly
differentiate between fetches and initial clones; the use of shallow
repository features; or partial clone filters.
a29263c (fetch-pack: add tracing for negotiation rounds, 2022-08-02)
added some information on have counts to fetch-pack itself to help
diagnose negotiation; but from a git-upload-pack (server) perspective,
there's no means of accessing such information without using
GIT_TRACE_PACKET to examine the protocol packets.
Improve this by emitting a Trace2 JSON event from upload-pack with
summary information on the contents of a fetch request.
* haves, wants, and want-ref counts can help determine (broadly) between
fetches and clones, and the use of single-branch, etc.
* shallow clone depth, tip counts, and deepening options.
* any partial clone filter type.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codepath to handle recipient addresses `git send-email
--compose` learns from the user was completely broken, which has
been corrected.
* jk/send-email-fix-addresses-from-composed-messages:
send-email: handle to/cc/bcc from --compose message
Revert "send-email: extract email-parsing code into a subroutine"
doc/send-email: mention handling of "reply-to" with --compose
Code clean-up.
* ob/rebase-cleanup:
rebase: move parse_opt_keep_empty() down
rebase: handle --strategy via imply_merge() as well
rebase: simplify code related to imply_merge()
The pathspec code carelessly dereferenced NULL while emitting an
error message, which has been corrected.
* kh/pathspec-error-wo-repository-fix:
grep: die gracefully when outside repository
"git p4" tried to store symlinks to LFS when told, but has been
fixed not to do so, because it does not make sense.
* mm/p4-symlink-with-lfs:
git-p4 shouldn't attempt to store symlinks in LFS
The attribute subsystem learned to honor `attr.tree` configuration
that specifies which tree to read the .gitattributes files from.
* jc/attr-tree-config:
attr: add attr.tree for setting the treeish to read attributes from
attr: read attributes from HEAD when bare repo
Many typos, ungrammatical sentences and wrong phrasing have been
fixed.
* sn/typo-grammo-phraso-fixes:
t/README: fix multi-prerequisite example
doc/gitk: s/sticked/stuck/
git-jump: admit to passing merge mode args to ls-files
doc/diff-options: improve wording of the log.diffMerges mention
doc: fix some typos, grammar and wording issues
33d7bdd645 (builtin/reflog.c: use parse-options api for expire, delete
subcommands, 2022-01-06) broke the option --single-worktree of git
reflog expire and added a non-printable short flag for it, presumably by
accident. While before it set the variable "all_worktrees" to 0, now it
sets it to 1, its default value. --no-single-worktree is required now
to set it to 0.
Fix it by replacing the variable with one that has the opposite meaning,
to avoid the negation and its potential for confusion. The new variable
"single_worktree" directly captures whether --single-worktree was given.
Also remove the unprintable short flag SOH (start of heading) because it
is undocumented, hard to use and is likely to have been added by mistake
in connection with the negation bug above.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use parentheses and pipes to present alternatives in the argument help
for the --empty options of git am and git rebase, like in the rest of
the documentation.
While at it remove a stray use of the enum empty_action value
STOP_ON_EMPTY_COMMIT to indicate that no short option is present.
While it has a value of 0 and thus there is no user-visible change,
that enum is not meant to hold short option characters. Hard-code 0,
like we do for other options without a short option.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let the parse-options code detect and handle the use of options that are
incompatible with --show-current-patch. This requires exposing the
distinction between the "raw" and "diff" sub-modes. Do that by
splitting the mode RESUME_SHOW_PATCH into RESUME_SHOW_PATCH_RAW and
RESUME_SHOW_PATCH_DIFF and stop tracking sub-modes in a separate struct.
The result is a simpler callback function and more precise error
messages. The original reports a spurious argument or a NULL pointer:
$ git am --show-current-patch --show-current-patch=diff
error: options '--show-current-patch=diff' and '--show-current-patch=raw' cannot be used together
$ git am --show-current-patch=diff --show-current-patch
error: options '--show-current-patch=(null)' and '--show-current-patch=diff' cannot be used together
With this patch we get the more precise:
$ git am --show-current-patch --show-current-patch=diff
error: --show-current-patch=diff is incompatible with --show-current-patch
$ git am --show-current-patch=diff --show-current-patch
error: --show-current-patch is incompatible with --show-current-patch=diff
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only a single PARSE_OPT_CMDMODE option can be specified for the same
variable at the same time. This is enforced by get_value(), but the
error messages are imprecise in three ways:
1. If a non-PARSE_OPT_CMDMODE option changes the value variable of a
PARSE_OPT_CMDMODE option then an ominously vague message is shown:
$ t/helper/test-tool parse-options --set23 --mode1
error: option `mode1' : incompatible with something else
Worse: If the order of options is reversed then no error is reported at
all:
$ t/helper/test-tool parse-options --mode1 --set23
boolean: 0
integer: 23
magnitude: 0
timestamp: 0
string: (not set)
abbrev: 7
verbose: -1
quiet: 0
dry run: no
file: (not set)
Fortunately this can currently only happen in the test helper; actual
Git commands don't share the same variable for the value of options with
and without the flag PARSE_OPT_CMDMODE.
2. If there are multiple options with the same value (synonyms), then
the one that is defined first is shown rather than the one actually
given on the command line, which is confusing:
$ git am --resolved --quit
error: option `quit' is incompatible with --continue
3. Arguments of PARSE_OPT_CMDMODE options are not handled by the
parse-option machinery. This is left to the callback function. We
currently only have a single affected option, --show-current-patch of
git am. Errors for it can show an argument that was not actually given
on the command line:
$ git am --show-current-patch --show-current-patch=diff
error: options '--show-current-patch=diff' and '--show-current-patch=raw' cannot be used together
The options --show-current-patch and --show-current-patch=raw are
synonyms, but the error accuses the user of input they did not actually
made. Or it can awkwardly print a NULL pointer:
$ git am --show-current-patch=diff --show-current-patch
error: options '--show-current-patch=(null)' and '--show-current-patch=diff' cannot be used together
The reasons for these shortcomings is that the current code checks
incompatibility only when encountering a PARSE_OPT_CMDMODE option at the
command line, and that it searches the previous incompatible option by
value.
Fix the first two points by checking all PARSE_OPT_CMDMODE variables
after parsing each option and by storing all relevant details if their
value changed. Do that whether or not the changing options has the flag
PARSE_OPT_CMDMODE set. Report an incompatibility only if two options
change the variable to different values and at least one of them is a
PARSE_OPT_CMDMODE option. This changes the output of the first three
examples above to:
$ t/helper/test-tool parse-options --set23 --mode1
error: --mode1 is incompatible with --set23
$ t/helper/test-tool parse-options --mode1 --set23
error: --set23 is incompatible with --mode1
$ git am --resolved --quit
error: --quit is incompatible with --resolved
Store the argument of PARSE_OPT_CMDMODE options of type OPTION_CALLBACK
as well to allow taking over the responsibility for compatibility
checking from the callback function. The next patch will use this
capability to fix the messages for git am --show-current-patch.
Use a linked list for storing the PARSE_OPT_CMDMODE variables. This
somewhat outdated data structure is simple and suffices, as the number
of elements per command is currently only zero or one. We do support
multiple different command modes variables per command, but I don't
expect that we'd ever use a significant number of them. Once we do we
can switch to a hashmap.
Since we no longer need to search the conflicting option, the all_opts
parameter of get_value() is no longer used. Remove it.
Extend the tests to check for both conflicting option names, but don't
insist on a particular order.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-bugreport already rejected unrecognized flag arguments, like
`--diaggnose`, but this doesn't help if the user's mistake was to forget
the `--` in front of the argument. This can result in a user's intended
argument not being parsed with no indication to the user that something
went wrong. Since git-bugreport presently doesn't take any positionals
at all, let's reject all positionals and give the user a usage hint.
Signed-off-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since e6545201ad (Merge branch 'ab/detox-config-gettext', 2021-04-13),
test_i18ngrep is no longer required. Quit using it in the bugreport
tests, since it's setting a bad example for tests added later.
Signed-off-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tutorial in Documentation/MyFirstContribution.txt has steps to print
some text using the "_" function. However, this leads to compiler errors
when running "make" since "gettext.h" is not #included.
Update docs with a note to #include "gettext.h" in "builtin/psuh.c".
Signed-off-by: Jacob Stopak <jacob@initialcommit.io>
Reviewed-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move validation logic below processing of email address lists so that
email validation gets the proper email addresses. As a side effect,
some initialization needed to be moved down. In order for validation
and the actual email sending to have the same initial state, the
initialized variables that get modified by pre_process_file are
encapsulated in a new function.
This fixes email address validation errors when the optional
perl module Email::Valid is installed and multiple addresses are passed
in on a single to/cc argument like --to=foo@example.com,bar@example.com.
A new test was added to t9001 to expose failures with this case in the
future.
Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/SubmittingPatches informs the contributor that gitk's
context menu command "Copy commit summary" can be used to obtain the
conventional format of referencing existing commits. This command in
gitk was renamed to "Copy commit reference" in commit [1], following
implementation of Git's "reference" pretty format in [2].
Update mention of this gitk command in Documentation/SubmittingPatches
to its new name.
[1] b8b60957ce (gitk: rename "commit summary" to "commit reference",
2019-12-12)
[2] commit 1f0fc1d (pretty: implement 'reference' format, 2019-11-20)
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index file has room only for lower 32-bit of the file size in
the cached stat information, which means cached stat information
will have 0 in its sd_size member for a file whose size is multiple
of 4GiB. This is mistaken for a racily clean path. Avoid it by
storing a bogus sd_size value instead for such files.
* bc/racy-4gb-files:
Prevent git from rehashing 4GiB files
t: add a test helper to truncate files
Feeding "git stash store" with a random commit that was not created
by "git stash create" now errors out.
* jc/fail-stash-to-store-non-stash:
stash: be careful what we store
The codepaths that read "chunk" formatted files have been corrected
to pay attention to the chunk size and notice broken files.
* jk/chunk-bounds: (21 commits)
t5319: make corrupted large-offset test more robust
chunk-format: drop pair_chunk_unsafe()
commit-graph: detect out-of-order BIDX offsets
commit-graph: check bounds when accessing BIDX chunk
commit-graph: check bounds when accessing BDAT chunk
commit-graph: bounds-check generation overflow chunk
commit-graph: check size of generations chunk
commit-graph: bounds-check base graphs chunk
commit-graph: detect out-of-bounds extra-edges pointers
commit-graph: check size of commit data chunk
midx: check size of revindex chunk
midx: bounds-check large offset chunk
midx: check size of object offset chunk
midx: enforce chunk alignment on reading
midx: check size of pack names chunk
commit-graph: check consistency of fanout table
midx: check size of oid lookup chunk
commit-graph: check size of oid fanout chunk
midx: stop ignoring malformed oid fanout chunk
t: add library for munging chunk-format files
...
The description of the `git bisect run` command syntax at the beginning
of the manpage is `git bisect run <cmd>...`, which isn't quite clear
about what `<cmd>` is or what the `...` mean; one could think that it is
the whole (quoted) command line with all arguments in a single string,
or that it supports multiple commands, or that it doesn't accept
commands with arguments at all.
Change to `git bisect run <cmd> [<arg>...]` to clarify the syntax,
in both the manpage and the `git bisect -h` command output.
Additionally, change `--term-{new,bad}` et al to `--term-(new|bad)`
for consistency with the synopsis syntax conventions.
Signed-off-by: Javier Mora <cousteaulecommandant@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As per the CodingGuidelines document, it is recommended that error messages
such as die(), error() and warning(), should start with a lowercase letter
and should not end with a period.
This patch adjusts tests to match updated messages.
Signed-off-by: Isoken June Ibizugbe <isokenjune@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unlike "git log --pretty=%D", "git log --pretty="%(decorate)" did
not auto-initialize the decoration subsystem, which has been
corrected.
* ak/pretty-decorate-more-fix:
pretty: fix ref filtering for %(decorate) formats
The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.
* vd/loose-ref-iteration-optimization:
files-backend.c: avoid stat in 'loose_fill_ref_dir'
dir.[ch]: add 'follow_symlink' arg to 'get_dtype'
dir.[ch]: expose 'get_dtype'
ref-cache.c: fix prefix matching in ref iteration
"git merge-tree" learned to take strategy backend specific options
via the "-X" option, like "git merge" does.
* ty/merge-tree-strategy-options:
merge: introduce {copy|clear}_merge_options()
merge-tree: add -X strategy option
The "-v" option is shown in the SYNOPSIS section near the top, but
"-q" is not shown anywhere there.
List "-q" alongside "-v".
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This moves it right next to parse_opt_empty(), which is a much more
logical place. As a side effect, this removes the need for a forward
declaration of imply_merge().
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At least after the successive trimming of enum rebase_type mentioned in
the previous commit, this code did exactly what imply_merge() does, so
just call it instead.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code's evolution left in some bits surrounding enum rebase_type that
don't really make sense any more. In particular, it makes no sense to
invoke imply_merge() if the type is already known not to be
REBASE_APPLY, and it makes no sense to assign the type after calling
imply_merge().
enum rebase_type had more values until commit a74b35081c ("rebase: drop
support for `--preserve-merges`") and commit 10cdb9f38a ("rebase: rename
the two primary rebase backends"). The latter commit also renamed
imply_interactive() to imply_merge().
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user writes a message via --compose, send-email will pick up
various headers like "From", "Subject", etc and use them for other
patches as if they were specified on the command-line. But we don't
handle "To", "Cc", or "Bcc" this way; we just tell the user "those
aren't interpeted yet" and ignore them.
But it seems like an obvious thing to want, especially as the same
feature exists when the cover letter is generated separately by
format-patch. There it is gated behind the --to-cover option, but I
don't think we'd need the same control here; since we generate the
--compose template ourselves based on the existing input, if the user
leaves the lines unchanged then the behavior remains the same.
So let's fill in the implementation; like those other headers we already
handle, we just need to assign to the initial_* variables. The only
difference in this case is that they are arrays, so we'll feed them
through parse_address_line() to split them (just like we would when
reading a single string via prompting).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit b6049542b9.
Prior to that commit, we read the results of the user editing the
"--compose" message in a loop, picking out parts we cared about, and
streaming the result out to a ".final" file. That commit split the
reading/interpreting into two phases; we'd now read into a hash, and
then pick things out of the hash.
The goal was making the code more readable. And in some ways it did,
because the ugly regexes are confined to the reading phase. But it also
introduced several bugs, because now the two phases need to match each
other. In particular:
- we pick out headers like "Subject: foo" with a case-insensitive
regex, and then use the user-provided header name as the key in a
case-sensitive hash. So if the user wrote "subject: foo", we'd no
longer recognize it as a subject.
- the namespace for the hash keys conflates header names with meta
information like "body". If you put "body: foo" in your message, it
would be misinterpreted as the actual message body (nobody is likely
to do that in practice, but it seems like an unnecessary danger).
- the handling for to/cc/bcc is totally broken. The behavior before
that commit is to recognize and skip those headers, with a note to
the user that they are not yet handled. Not great, but OK. But
after the patch, the reading side now splits the addresses into a
perl array-ref. But the interpreting side doesn't handle this at
all, and blindly prints the stringified array-ref value. This leads
to garbage like:
(mbox) Adding to: ARRAY (0x555b4345c428) from line 'To: ARRAY(0x555b4345c428)'
error: unable to extract a valid address from: ARRAY (0x555b4345c428)
What to do with this address? ([q]uit|[d]rop|[e]dit):
Probably not a huge deal, since nobody should even try to use those
headers in the first place (since they were not implemented). But
the new behavior is worse, and indicative of the sorts of problems
that come from having the two layers.
The revert had a few conflicts, due to later work in this area from
15dc3b9161 (send-email: rename variable for clarity, 2018-03-04) and
d11c943c78 (send-email: support separate Reply-To address, 2018-03-04).
I've ported the changes from those commits over as part of the conflict
resolution.
The new tests show the bugs. Note the use of GIT_SEND_EMAIL_NOTTY in the
second one. Without it, the test is happy to reach outside the test
harness to the developer's actual terminal (when run with the buggy
state before this patch).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for git-send-email lists the headers handled specially
by --compose in a way that implies that this is the complete set of
headers that are special. But one more was added by d11c943c78
(send-email: support separate Reply-To address, 2018-03-04) and never
documented.
Let's add it, and reword the documentation slightly to avoid having to
specify the list of headers twice (as it is growing and will continue to
do so as we add new features).
If you read the code, you may notice that we also handle MIME-Version
specially, in that we'll avoid over-writing user-provided MIME headers.
I don't think this is worth mentioning, as it's what you'd expect to
happen (as opposed to the other headers, which are picked up to be used
in later emails). And certainly this feature existed when the
documentation was expanded in 01d3861217 (git-send-email.txt: describe
--compose better, 2009-03-16), and we chose not to mention it then.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Die gracefully when `git grep --no-index` is run outside of a Git
repository and the path is outside the directory tree.
If you are not in a Git repository and say:
git grep --no-index search ..
You trigger a `BUG`:
BUG: environment.c:213: git environment hasn't been setup
Aborted (core dumped)
Because `..` is a valid path which is treated as a pathspec. Then
`pathspec` figures out that it is not in the current directory tree. The
`BUG` is triggered when `pathspec` tries to advise the user about how the
path is not in the current (non-existing) repository.
Reported-by: ks1322 ks1322 <ks1322@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-p4.py would attempt to put a symlink in LFS if its file extension
matched git-p4.largeFileExtensions.
Git LFS doesn't store symlinks because smudge/clean filters don't handle
symlinks. They never get passed to the filter process nor the
smudge/clean filters, nor could that occur without a change to the
protocol or command-line interface. Unless Git learned how to send them
to the filters, Git LFS would have a hard time using them in any useful
way.
Git LFS's goal is to move large files out of the repository history, and
symlinks are functionally limited to 4 KiB or a similar size on most
systems.
Signed-off-by: Matthew McClain <mmcclain@noprivs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests in t7601 use "test -f" and "test ! -f" to see if a path
exists or is missing.
Use test_path_is_file and test_path_is_missing helper functions to
clarify these tests a bit better. This especially matters for the
"missing" case because "test ! -f F" will be happy if "F" exists as a
directory, but the intent of the test is that "F" should not exist, even
as a directory. The updated code expresses this better.
Signed-off-by: Dorcas AnonoLitunya <anonolitunya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git am` passes the value given to its `--whitespace` option through
to the underlying `git apply`, and the value is called <action> over
there. Fix the documentation for the command that calls the value
<option> to say <action> instead.
Note that the option help given by `git am -h` already calls the
value <action>, so there is no need to make a matching change there.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix "git merge-tree" to stop segfaulting when the --attr-source
option is used.
* jc/merge-ort-attr-index-fix:
merge-ort: initialize repo in index state
"git repack" learned "--max-cruft-size" to prevent cruft packs from
growing without bounds.
* tb/repack-max-cruft-size:
repack: free existing_cruft array after use
builtin/repack.c: avoid making cruft packs preferred
builtin/repack.c: implement support for `--max-cruft-size`
builtin/repack.c: parse `--max-pack-size` with OPT_MAGNITUDE
t7700: split cruft-related tests to t7704
These error messages say "new_index" as if that spelling has some
significance to the end users (e.g. the file "$GIT_DIR/new_index"
has some issues), but that is not the case at all. The i18n folks
were made to include the word literally in the translated messages,
which was not a good idea at all. Spell it "new index", as we are
just telling the users that we failed to create a new index file.
The term is expected to be translated to the end-users' languages,
not left as if it were a literal file name.
This dates all the way back to the first re-implemenation of "git
commit" command in C (the scripted version did not have such wording
in its error messages), in f5bbc322 (Port git commit to C.,
2007-11-08).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As described in the CodingGuidelines document, a single line message
given to die() and its friends should not capitalize its first word,
and should not add full-stop at the end.
Signed-off-by: Naomi Ibe <naomi.ibeh69@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for geometric repacking mentions a "--unpacked" option
that supposedly changes how loose objects are rolled up. This option has
never existed, and the implied behaviour, namely to include all unpacked
objects into the resulting packfile, is in fact the default behaviour.
Correct the documentation to not mention this option.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `-g` switch is a shorthand for `--geometric=` and allows the user to
specify the geometric. The documentation is wrong though and indicates
that the syntax for the shorthand is `-g=<factor>`. In fact though, the
option must be specified without the equals sign via `-g<factor>`.
Fix the syntax accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test t5319.88 ("reader bounds-checks large offset table") can fail
intermittently. The failure mode looks like this:
1. An earlier test sets up "objects64", a directory that can be used
to produce a midx with a corrupted large-offsets table. To get the
large offsets, it corrupts the normal ".idx" file to have a fake
large offset, and then builds a midx from that.
That midx now has a large offset table, which is what we want. But
we also have a .idx on disk that has a corrupted entry. We'll call
the object with the corrupted large-offset "X".
2. In t5319.88, we further corrupt the midx by reducing the size of
the large-offset chunk (because our goal is to make sure we do not
do an out-of-bounds read on it).
3. We then enumerate all of the objects with "cat-file --batch-check
--batch-all-objects", expecting to see a complaint when we try to
show object X. We use --batch-all-objects because our objects64
repo doesn't actually have any refs (but if we check them all, one
of them will be the failing one). The default batch-check format
includes %(objecttype) and %(objectsize), both of which require us
to access the actual pack data (and thus requires looking at the
offset).
4a. Usually, this succeeds. We try to output object X, do a lookup via
the midx for the type/size lookup, and run into the corrupt
large-offset table.
4b. But sometimes we hit a different error. If another object points
to X as a delta base, then trying to find the type of that object
requires walking the delta chain to the base entry (since only the
base has the concrete type; deltas themselves are either OFS_DELTA
or REF_DELTA).
Normally this would not require separate offset lookups at all, as
deltas are usually stored as OFS_DELTA, specifying the relative
offset to the base. But the corrupt idx created in step 1 is done
directly with "git pack-objects" and does not pass the
--delta-base-offset option, meaning we have REF_DELTA entries!
Those do have to consult an index to find the location of the base
object, and they use the pack .idx to do this. The same pack .idx
that we know is corrupted from step 1!
Git does notice the error, but it does so by seeing the corrupt
.idx file, not the corrupt midx file, and the error it reports is
different, causing the test to fail.
The set of objects created in the test is deterministic. But the delta
selection seems not to be (which is not too surprising, as it is
multi-threaded). I have seen the failure in Windows CI but haven't
reproduced it locally (not even with --stress). Re-running a failed
Windows CI job tends to work. But when I download and examine the trash
directory from a failed run, it shows a different set of deltas than I
get locally. But the exact source of non-determinism isn't that
important; our test should be robust against any order.
There are a few options to fix this:
a. It would be OK for the "objects64" setup to "unbreak" the .idx file
after generating the midx. But then it would be hard for subsequent
tests to reuse it, since it is the corrupted idx that forces the
midx to have a large offset table.
b. The "objects64" setup could use --delta-base-offset. This would fix
our problem, but earlier tests have many hard-coded offsets. Using
OFS_DELTA would change the locations of objects in the pack (this
might even be OK because I think most of the offsets are within the
.idx file, but it seems brittle and I'm afraid to touch it).
c. Our cat-file output is in oid order by default. Since we store
bases before deltas, if we went in pack order (using the
"--unordered" flag), we'd always see our corrupt X before any delta
which depends on it. But using "--unordered" means we skip the midx
entirely. That makes sense, since it is just enumerating all of
the packs, using the offsets found in their .idx files directly.
So it doesn't work for our test.
d. We could ask directly about object X, rather than enumerating all
of them. But that requires further hard-coding of the oid (both
sha1 and sha256) of object X. I'd prefer not to introduce more
brittleness.
e. We can use a --batch-check format that looks at the pack data, but
doesn't have to chase deltas. The problem in this case is
%(objecttype), which has to walk to the base. But %(objectsize)
does not; we can get the value directly from the delta itself.
Another option would be %(deltabase), where we report the REF_DELTA
name but don't look at its data.
I've gone with option (e) here. It's kind of subtle, but it's simple and
has no side effects.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --merge-base X other args..." insisted that X must be a
commit and errored out when given an annotated tag that peels to a
commit, but we only need it to be a committish. This has been
corrected.
* ar/diff-index-merge-base-fix:
diff: fix --merge-base with annotated tags
In .gitmodules files, submodules are keyed by their names, and the
path to the submodule whose name is $name is specified by the
submodule.$name.path variable. There were a few codepaths that
mixed the name and path up when consulting the submodule database,
which have been corrected. It took long for these bugs to be found
as the name of a submodule initially is the same as its path, and
the problem does not surface until it is moved to a different path,
which apparently happens very rarely.
* js/submodule-fix-misuse-of-path-and-name:
t7420: test that we correctly handle renamed submodules
t7419: test that we correctly handle renamed submodules
t7419, t7420: use test_cmp_config instead of grepping .gitmodules
t7419: actually test the branch switching
submodule--helper: return error from set-url when modifying failed
submodule--helper: use submodule_from_path in set-{url,branch}
Leakfix.
* jk/commit-graph-leak-fixes:
commit-graph: clear oidset after finishing write
commit-graph: free write-context base_graph_name during cleanup
commit-graph: free write-context entries before overwriting
commit-graph: free graph struct that was not added to chain
commit-graph: delay base_graph assignment in add_graph_to_chain()
commit-graph: free all elements of graph chain
commit-graph: move slab-clearing to close_commit_graph()
merge: free result of repo_get_merge_bases()
commit-reach: free temporary list in get_octopus_merge_bases()
t6700: mark test as leak-free
Test coverage for trailers has been improved.
* la/trailer-test-and-doc-updates:
trailer doc: <token> is a <key> or <keyAlias>, not both
trailer doc: separator within key suppresses default separator
trailer doc: emphasize the effect of configuration variables
trailer --unfold help: prefer "reformat" over "join"
trailer --parse docs: add explanation for its usefulness
trailer --only-input: prefer "configuration variables" over "rules"
trailer --parse help: expose aliased options
trailer --no-divider help: describe usual "---" meaning
trailer: trailer location is a place, not an action
trailer doc: narrow down scope of --where and related flags
trailer: add tests to check defaulting behavior with --no-* flags
trailer test description: this tests --where=after, not --where=before
trailer tests: make test cases self-contained
The index stores file sizes using a uint32_t. This causes any file
that is a multiple of 2^32 to have a cached file size of zero.
Zero is a special value used by racily clean. This causes git to
rehash every file that is a multiple of 2^32 every time git status
or git commit is run.
This patch mitigates the problem by making all files that are a
multiple of 2^32 appear to have a size of 1<<31 instead of zero.
The value of 1<<31 is chosen to keep it as far away from zero
as possible to help prevent things getting mixed up with unpatched
versions of git.
An example would be to have a 2^32 sized file in the index of
patched git. Patched git would save the file as 2^31 in the cache.
An unpatched git would very much see the file has changed in size
and force it to rehash the file, which is safe. The file would
have to grow or shrink by exactly 2^31 and retain all of its
ctime, mtime, and other attributes for old git to not notice
the change.
This patch does not change the behavior of any file that is not
an exact multiple of 2^32.
Signed-off-by: Jason D. Hatton <jhatton@globalfinishing.com>
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we're going to work with some large files which will
be at least 4 GiB in size. To take advantage of the sparseness
functionality on most Unix systems and avoid running the system out of
disk, it would be convenient to use truncate(2) to simply create a
sparse file of sufficient size.
However, the GNU truncate(1) utility isn't portable, so let's write a
tiny test helper that does the work for us.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
44451a2 (attr: teach "--attr-source=<tree>" global option to "git",
2023-05-06) provided the ability to pass in a treeish as the attr
source. In the context of serving Git repositories as bare repos like we
do at GitLab however, it would be easier to point --attr-source to HEAD
for all commands by setting it once.
Add a new config attr.tree that allows this.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The motivation for 44451a2e5e (attr: teach "--attr-source=<tree>" global
option to "git" , 2023-05-06), was to make it possible to use
gitattributes with bare repositories.
To make it easier to read gitattributes in bare repositories however,
let's just make HEAD:.gitattributes the default. This is in line with
how mailmap works, 8c473cecfd (mailmap: default mailmap.blob in bare
repositories, 2012-12-13).
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GitHub CI workflow has learned to trigger Coverity check.
* js/ci-coverity:
coverity: detect and report when the token or project is incorrect
coverity: allow running on macOS
coverity: support building on Windows
coverity: allow overriding the Coverity project
coverity: cache the Coverity Build Tool
ci: add a GitHub workflow to submit Coverity scans
Just like OPT_FILENAME() does, "git grep -f <path>" should treat
the <path> relative to the original $cwd by paying attention to the
prefix the command is given.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git stash store" is meant to store what "git stash create"
produces, as these two are implementation details of the end-user
facing "git stash save" command. Even though it is clearly
documented as such, users would try silly things like "git stash
store HEAD" to render their stash unusable.
Worse yet, because "git stash drop" does not allow such a stash
entry to be removed, "git stash clear" would be the only way to
recover from such a mishap. Reuse the logic that allows "drop" to
refrain from working on such a stash entry to teach "store" to avoid
storing an object that is not a stash entry in the first place.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When mostly the same set of options are to be used to perform
multiple merges, one instance of the merge_options structure may
want to be created and used by copying from the same template
instance. We saw such a use recently in "git merge-tree".
Let's make the pattern official by introducing copy_merge_options()
as a supported way to make a copy of the structure, and also give
clear_merge_options() to release any resources held by a copied
instance. Currently we only make a shallow copy, so the former is a
mere structure assignment while the latter is a no-op, but this may
change in the future as the members of merge_options structure
evolve.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shuffle some bits across headers and sources to prepare for
libification effort.
* cw/prelim-cleanup:
parse: separate out parsing functions from config.h
config: correct bad boolean env value error message
wrapper: reduce scope of remove_or_warn()
hex-ll: separate out non-hash-algo functions
The "streaming" interface used for bulk-checkin codepath has been
narrowed to take only blob objects for now, with no real loss of
functionality.
* eb/limit-bulk-checkin-to-blobs:
bulk-checkin: only support blobs in index_bulk_checkin
Some references are special in the context of worktrees as they are
considered to be per-worktree instead of shared across all of the
worktrees. Most importantly, this includes "refs/worktree/" that have
explicitly been designed such that users can create per-woorktree refs.
But there are also special references that have an associated meaning
like "refs/bisect/", which is used to track state of git-bisect(1).
These special per-worktree references are documented in git-worktree(1),
but one instance is missing. In a9be29c981 (sequencer: make refs
generated by the `label` command worktree-local, 2018-04-25), we have
converted "refs/rewritten/" to be a per-worktree reference as well.
These references are used by our sequencer infrastructure to generate
labels for rebased commits. So in order to allow for multiple concurrent
rebases to happen in different worktrees, these references need to be
tracked per worktree.
We forgot to update our documentation to mention these new per-worktree
references, which is fixed by this patch.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no callers left, and we don't want anybody to add new ones (they
should use the not-unsafe version instead). So let's drop the function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The BIDX chunk tells us the offsets at which each commit's Bloom filters
can be found in the BDAT chunk. We compute the length of each filter by
checking the offsets of neighbors and subtracting them.
If the offsets are out of order, then we'll get a negative length, which
we then store as a very large unsigned value. This can cause us to read
out-of-bounds memory, as we access the hash data modulo "filter->len *
BITS_PER_WORD".
We can easily detect this case when loading the individual filters.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We load the bloom_filter_indexes chunk using pair_chunk(), so we have no
idea how big it is. This can lead to out-of-bounds reads if it is
smaller than expected, since we index it based on the number of commits
found elsewhere in the graph file.
We can check the chunk size up front, like we do for CDAT and other
chunks with one fixed-size record per commit.
The test case demonstrates the problem. It actually won't segfault,
because we end up reading random data from the follow-on chunk (BDAT in
this case), and the bounds checks added in the previous patch complain.
But this is by no means assured, and you can craft a commit-graph file
with BIDX at the end (or a smaller BDAT) that does segfault.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When loading Bloom filters from a commit-graph file, we use the offset
values in the BIDX chunk to index into the memory mapped for the BDAT
chunk. But since we don't record how big the BDAT chunk is, we just
trust that the BIDX offsets won't cause us to read outside of the chunk
memory. A corrupted or malicious commit-graph file will cause us to
segfault (in practice this isn't a very interesting attack, since
commit-graph files are local-only, and the worst case is an
out-of-bounds read).
We can't fix this by checking the chunk size during parsing, since the
data in the BDAT chunk doesn't have a fixed size (that's why we need the
BIDX in the first place). So we'll fix it in two parts:
1. Record the BDAT chunk size during parsing, and then later check
that the BIDX offsets we look up are within bounds.
2. Because the offsets are relative to the end of the BDAT header, we
must also make sure that the BDAT chunk is at least as large as the
expected header size. Otherwise, we overflow when trying to move
past the header, even for an offset of "0". We can check this
early, during the parsing stage.
The error messages are rather verbose, but since this is not something
you'd expect to see outside of severe bugs or corruption, it makes sense
to err on the side of too many details. Sadly we can't mention the
filename during the chunk-parsing stage, as we haven't set g->filename
at this point, nor passed it down through the stack.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the generation entry in a commit-graph doesn't fit, we instead insert
an offset into a generation overflow chunk. But since we don't record
the size of the chunk, we may read outside the chunk if the offset we
find on disk is malicious or corrupted.
We can't check the size of the chunk up-front; it will vary based on how
many entries need overflow. So instead, we'll do a bounds-check before
accessing the chunk memory. Unfortunately there is no error-return from
this function, so we'll just have to die(), which is what it does for
other forms of corruption.
As with other cases, we can drop the st_mult() call, since we know our
bounds-checked value will fit within a size_t.
Before this patch, the test here actually "works" because we read
garbage data from the next chunk. And since that garbage data happens
not to provide a generation number which changes the output, it appears
to work. We could construct a case that actually segfaults or produces
wrong output, but it would be a bit tricky. For our purposes its
sufficient to check that we've detected the bounds error.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We neither check nor record the size of the generations chunk we parse
from a commit-graph file. This should have one uint32_t for each commit
in the file; if it is smaller (due to corruption, etc), we may read
outside the mapped memory.
The included test segfaults without this patch, as it shrinks the size
considerably (and the chunk is near the end of the file, so we read off
the end of the array rather than accidentally reading another chunk).
We can fix this by checking the size up front (like we do for other
fixed-size chunks, like CDAT).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are loading a commit-graph chain, we check that each slice of the
chain points to the appropriate set of base graphs via its BASE chunk.
But since we don't record the size of the chunk, we may access
out-of-bounds memory if the file is corrupted.
Since we know the number of entries we expect to find (based on the
position within the commit-graph-chain file), we can just check the size
up front.
In theory this would also let us drop the st_mult() call a few lines
later when we actually access the memory, since we know that the
computed offset will fit in a size_t. But because the operands
"g->hash_len" and "n" have types "unsigned char" and "int", we'd have to
cast to size_t first. Leaving the st_mult() does that cast, and makes it
more obvious that we don't have an overflow problem.
Note that the test does not actually segfault before this patch, since
it just reads garbage from the chunk after BASE (and indeed, it even
rejects the file because that garbage does not have the expected hash
value). You could construct a file with BASE at the end that did
segfault, but corrupting the existing one is easy, and we can check
stderr for the expected message.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If an entry in a commit-graph file has more than 2 parents, the
fixed-size parent fields instead point to an offset within an "extra
edges" chunk. We blindly follow these, assuming that the chunk is
present and sufficiently large; this can lead to an out-of-bounds read
for a corrupt or malicious file.
We can fix this by recording the size of the chunk and adding a
bounds-check in fill_commit_in_graph(). There are a few tricky bits:
1. We'll switch from working with a pointer to an offset. This makes
some corner cases just fall out naturally:
a. If we did not find an EDGE chunk at all, our size will
correctly be zero (so everything is "out of bounds").
b. Comparing "size / 4" lets us make sure we have at least 4 bytes
to read, and we never compute a pointer more than one element
past the end of the array (computing a larger pointer is
probably OK in practice, but is technically undefined
behavior).
c. The current code casts to "uint32_t *". Replacing it with an
offset avoids any comparison between different types of pointer
(since the chunk is stored as "unsigned char *").
2. This is the first case in which fill_commit_in_graph() may return
anything but success. We need to make sure to roll back the
"parsed" flag (and any parents we might have added before running
out of buffer) so that the caller can cleanly fall back to
loading the commit object itself.
It's a little non-trivial to do this, and we might benefit from
factoring it out. But we can wait on that until we actually see a
second case where we return an error.
As a bonus, this lets us drop the st_mult() call. Since we've already
done a bounds check, we know there won't be any integer overflow (it
would imply our buffer is larger than a size_t can hold).
The included test does not actually segfault before this patch (though
you could construct a case where it does). Instead, it reads garbage
from the next chunk which results in it complaining about a bogus parent
id. This is sufficient for our needs, though (we care that the fallback
succeeds, and that stderr mentions the out-of-bounds read).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We expect a commit-graph file to have a fixed-size data record for each
commit in the file (and we know the number of commits to expct from the
size of the lookup table). If we encounter a file where this is too
small, we'll look past the end of the chunk (and possibly even off the
mapped memory).
We can fix this by checking the size up front when we record the
pointer.
The included test doesn't segfault, since it ends up reading bytes
from another chunk. But it produces nonsense results, since the values
it reads are garbage. Our test notices this by comparing the output to a
non-corrupted run of the same command (and of course we also check that
the expected error is printed to stderr).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we load a revindex from disk, we check the size of the file
compared to the number of objects we expect it to have. But when we use
a RIDX chunk stored directly in the midx, we just access the memory
directly. This can lead to out-of-bounds memory access for a corrupted
or malicious multi-pack-index file.
We can catch this by recording the RIDX chunk size, and then checking it
against the expected size when we "load" the revindex. Note that this
check is much simpler than the one that load_revindex_from_disk() does,
because we just have the data array with no header (so we do not need
to account for the header size, and nor do we need to bother validating
the header values).
The test confirms both that we catch this case, and that we continue the
process (the revindex is required to use the midx bitmaps, but we
fallback to a non-bitmap traversal).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we see a large offset bit in the regular midx offset table, we
use the entry as an index into a separate large offset table (just like
a pack idx does). But we don't bounds-check the access to that large
offset table (nor even record its size when we parse the chunk!).
The equivalent code for a regular pack idx is in check_pack_index_ptr().
But things are a bit simpler here because of the chunked format: we can
just check our array index directly.
As a bonus, we can get rid of the st_mult() here. If our array
bounds-check is successful, then we know that the result will fit in a
size_t (and the bounds check uses a division to avoid overflow
entirely).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The object offset chunk has one fixed-size entry for each object in the
midx. But since we don't check its size, we may access out-of-bounds
memory if we see a corrupt or malicious midx file.
Sine the entries are fixed-size, the total length can be known up-front,
and we can just check it while parsing the chunk (this is similar to
what we do when opening pack idx files, which contain a similar offset
table).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The midx reader assumes chunks are aligned to a 4-byte boundary: we
treat the fanout chunk as an array of uint32_t, indexing it to feed the
results to ntohl(). Without aligning the chunks, we may violate the
CPU's alignment constraints. Though many platforms allow this, some do
not. And certanily UBSan will complain, since it is undefined behavior.
Even though most chunks are naturally 4-byte-aligned (because they are
storing uint32_t or larger types), PNAM is not. It stores NUL-terminated
pack names, so you can have a valid chunk with any length. The writing
side handles this by 4-byte-aligning the chunk, introducing a few extra
NULs as necessary. But since we don't check this on the reading side, we
may end up with a misaligned fanout and trigger the undefined behavior.
We have two options here:
1. Swap out ntohl(fanout[i]) for get_be32(fanout+i) everywhere. The
latter handles alignment itself. It's possible that it's slightly
slower (though in practice I'm not sure how true that is,
especially for these code paths which then go on to do a binary
search).
2. Enforce the alignment when reading the chunks. This is easy to do,
since the table-of-contents reader can check it in one spot.
I went with the second option here, just because it places less burden
on maintenance going forward (it is OK to continue using ntohl), and we
know it can't have any performance impact on the actual reads.
The commit-graph code uses the same chunk API. It's usually also 4-byte
aligned, but some chunks are not (like Bloom filter BDAT chunks). So
we'll pass "1" here to allow any alignment. It doesn't suffer from the
same problem as midx with its fanout because the fanout chunk is always
the first (and the rest of the format dictates that the first chunk will
start aligned).
The new test shows the effect on a midx with a misaligned PNAM chunk.
Note that the midx-reading code treats chunk-toc errors as soft, falling
back to the non-midx path rather than calling die(), as we do for other
parsing errors. Arguably we should make all of these behave the same,
but that's out of scope for this patch. For now the test just expects
the fallback behavior.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We parse the pack-name chunk as a series of NUL-terminated strings. But
since we don't look at the chunk size, there's nothing to guarantee that
we don't parse off the end of the chunk (or even off the end of the
mapped file).
We can record the length, and then as we parse make sure that we never
walk past it.
The new test exercises the case, though note that it does not actually
segfault before this patch. It hits a NUL byte somewhere in one of the
other chunks, and comes up with a garbage pack name. You could construct
one that reads out-of-bounds (e.g., a PNAM chunk at the end of file),
but this case is simple and sufficient to check that we detect the
problem.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use bsearch_hash() to look up items in the oid index of a
commit-graph. It also has a fanout table to reduce the initial range in
which we'll search. But since the fanout comes from the on-disk file, a
corrupted or malicious file can cause us to look outside of the
allocated index memory.
One solution here would be to pass the total table size to
bsearch_hash(), which could then bounds check the values it reads from
the fanout. But there's an inexpensive up-front check we can do, and
it's the same one used by the midx and pack idx code (both of which
likewise have fanout tables and use bsearch_hash(), but are not affected
by this bug):
1. We can check the value of the final fanout entry against the size
of the table we got from the index chunk. These must always match,
since the fanout is just slicing up the index.
As a side note, the midx and pack idx code compute it the other
way around: they use the final fanout value as the object count, and
check the index size against it. Either is valid; if they
disagree we cannot know which is wrong (a corrupted fanout value,
or a too-small table of oids).
2. We can quickly scan the fanout table to make sure it is
monotonically increasing. If it is, then we know that every value
is less than or equal to the final value, and therefore less than
or equal to the table size.
It would also be sufficient to just check that each fanout value is
smaller than the final one, but the midx and pack idx code both do
a full monotonicity check. It's the same cost, and it catches some
other corruptions (though not all; the checks done by "commit-graph
verify" are more complete but more expensive, and our goal here is
to be fast and memory-safe).
There are two new tests. One just checks the final fanout value (this is
the mirror image of the "too small oid lookup" case added for the midx
in the previous commit; it's flipped here because commit-graph considers
the oid lookup chunk to be the source of truth).
The other actually creates a fanout with many out-of-bounds entries, and
prior to this patch, it does cause the segfault you'd expect. But note
that the error is not "your fanout entry is out-of-bounds", but rather
"fanout value out of order". That's because we leave the final fanout
value in place (to get past the table size check), making the index
non-monotonic (the second-to-last entry is big, but the last one must
remain small to match the actual table).
We need adjustments to a few existing tests, as well:
- an earlier test in t5318 corrupts the fanout and runs "commit-graph
verify". Its message is now changed, since we catch the problem
earlier (during the load step, rather than the careful validation
step).
- in t5324, we test that "commit-graph verify --shallow" does not do
expensive verification on the base file of the chain. But the
corruption it uses (munging a byte at offset 1000) happens to be in
the middle of the fanout table. And now we detect that problem in
the cheaper checks that are performed for every part of the graph.
We'll push this back to offset 1500, which is only caught by the
more expensive checksum validation.
Likewise, there's a later test in t5324 which munges an offset 100
bytes into a file (also in the fanout table) that is referenced by
an alternates file. So we now find that corruption during the load
step, rather than the verification step. At the very least we need
to change the error message (like the case above in t5318). But it
is probably good to make sure we handle all parts of the
verification even for alternate graph files. So let's likewise
corrupt byte 1500 and make sure we found the invalid checksum.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading an on-disk multi-pack-index, we take the number of objects
in the midx from the final value of the fanout table. But we just
blindly assume that the chunk containing the actual oid entries is the
correct size. This can lead to us reading out-of-bounds memory if the
lookup chunk is too small (or if the fanout is corrupted; when they
don't agree we cannot tell which one is wrong).
Note that we bump the assignment of m->num_objects into the fanout
parser callback, so that it's set when we parse the lookup table
(otherwise we'd have to manually record the lookup table size and check
it later).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We load the oid fanout chunk with pair_chunk(), which means we never see
the size of the chunk. We just assume the on-disk file uses the
appropriate size, and if it's too small we'll access random memory.
It's easy to check this up-front; the fanout always consists of 256
uint32's, since it is a fanout of the first byte of the hash pointing
into the oid index. These parameters can't be changed without
introducing a new chunk type.
This matches the similar check in the midx OIDF chunk (but note that
rather than checking for the error immediately, the graph code just
leaves parts of the struct NULL and checks for required fields later).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we load the oid-fanout chunk, our callback checks that its size is
reasonable and returns an error if not. However, the caller only checks
our return value against CHUNK_NOT_FOUND, so we end up ignoring the
error completely! Using a too-small fanout table means we end up
accessing random memory for the fanout and segfault.
We can fix this by checking for any non-zero return value, rather than
just CHUNK_NOT_FOUND, and adjusting our error message to cover both
cases. We could handle each error code individually, but there's not
much point for such a rare case. The extra message produced in the
callback makes it clear what is going on.
The same pattern is used in the adjacent code. Those cases are actually
OK for now because they do not use a custom callback, so the only error
they can get is CHUNK_NOT_FOUND. But let's convert them, as this is an
accident waiting to happen (especially as we convert some of them away
from pair_chunk). The error messages are more verbose, but it should be
rare for a user to see these anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When testing corruption of files using the chunk format (like
commit-graphs and midx files), it's helpful to be able to modify bytes
in specific chunks. This requires being able both to read the
table-of-contents (to find the chunk to modify) but also to adjust it
(to account for size changes in the offsets of subsequent chunks).
We have some tests already which corrupt chunk files, but they have some
downsides:
1. They are very brittle, as they manually compute the expected size
of a particular instance of the file (e.g., see the definitions
starting with NUM_OBJECTS in t5319).
2. Because they rely on manual offsets and don't read the
table-of-contents, they're limited to overwriting bytes. But there
are many interesting corruptions that involve changing the sizes of
chunks (especially smaller-than-expected ones).
This patch adds a perl script which makes such corruptions easy. We'll
use it in subsequent patches.
Note that we could get by with just a big "perl -e" inside the helper
function. I chose to put it in a separate script for two reasons. One,
so we don't have to worry about the extra layer of shell quoting. And
two, the script is kind of big, and running the tests with "-x" would
repeatedly dump it into the log output.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pair_chunk() function is provided as an easy helper for parsing
chunks that just want a pointer to a set of bytes. But every caller has
a hidden bug: because we return only the pointer without the matching
chunk size, the callers have no clue how many bytes they are allowed to
look at. And as a result, they may read off the end of the mmap'd data
when the on-disk file does not match their expectations.
Since chunk files are typically used for local-repository data like
commit-graph files and midx's, the security implications here are pretty
mild. The worst that can happen is that you hand somebody a corrupted
repository tarball, and running Git on it does an out-of-bounds read and
crashes. So it's worth being more defensive, but we don't need to drop
everything and fix every caller immediately.
I noticed the problem because the pair_chunk_fn() callback does not look
at its chunk_size argument, and wanted to annotate it to silence
-Wunused-parameter. We could do that now, but we'd lose the hint that
this code should be audited and fixed.
So instead, let's set ourselves up for going down that path:
1. Provide a pair_chunk() function that does return the size, which
prepares us for fixing these cases.
2. Rename the existing function to pair_chunk_unsafe(). That gives us
an easy way to grep for cases which still need to be fixed, and the
name should cause anybody adding new calls to think twice before
using it.
There are no callers of the "safe" version yet, but we'll add some in
subsequent patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Modify the 'readdir' loop in 'loose_fill_ref_dir' to, rather than 'stat' a
file to determine whether it is a directory or not, use 'get_dtype'.
Currently, the loop uses 'stat' to determine whether each dirent is a
directory itself or not in order to construct the appropriate ref cache
entry. If 'stat' fails (returning a negative value), the dirent is silently
skipped; otherwise, 'S_ISDIR(st.st_mode)' is used to check whether the entry
is a directory.
On platforms that include an entry's d_type in in the 'dirent' struct, this
extra 'stat' check is redundant. We can use the 'get_dtype' method to
extract this information on platforms that support it (i.e. where
NO_D_TYPE_IN_DIRENT is unset), and derive it with 'stat' on platforms that
don't. Because 'stat' is an expensive call, this confers a
modest-but-noticeable performance improvement when iterating over large
numbers of refs (approximately 20% speedup in 'git for-each-ref' in a 30k
ref repo).
Unlike other existing usage of 'get_dtype', the 'follow_symlinks' arg is set
to 1 to replicate the existing handling of symlink dirents. This
unfortunately requires calling 'stat' on the associated entry regardless of
platform, but symlinks in the loose ref store are highly unlikely since
they'd need to be created manually by a user.
Note that this patch also changes the condition for skipping creation of a
ref entry from "when 'stat' fails" to "when the d_type is anything other
than DT_REG or DT_DIR". If a dirent's d_type is DT_UNKNOWN (either because
the platform doesn't support d_type in dirents or some other reason) or
DT_LNK, 'get_dtype' will try to derive the underlying type with 'stat'. If
the 'stat' fails, the d_type will remain 'DT_UNKNOWN' and dirent will be
skipped. However, it will also be skipped if it is any other valid d_type
(e.g. DT_FIFO for named pipes, DT_LNK for a nested symlink). Git does not
handle these properly anyway, so we can safely constrain accepted types to
directories and regular files.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a 'follow_symlink' boolean option to 'get_type()'. If 'follow_symlink'
is enabled, DT_LNK (in addition to DT_UNKNOWN) d_types triggers the
stat-based d_type resolution, using 'stat' instead of 'lstat' to get the
type of the followed symlink. Note that symlinks are not followed
recursively, so a symlink pointing to another symlink will still resolve to
DT_LNK.
Update callers in 'diagnose.c' to specify 'follow_symlink = 0' to preserve
current behavior.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move 'get_dtype()' from 'diagnose.c' to 'dir.c' and add its declaration to
'dir.h' so that it is accessible to callers in other files. The function and
its documentation are moved verbatim except for a small addition to the
description clarifying what the 'path' arg represents.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'cache_ref_iterator_advance' to skip over refs that are not matched
by the given prefix.
Currently, a ref entry is considered "matched" if the entry name is fully
contained within the prefix:
* prefix: "refs/heads/v1"
* entry: "refs/heads/v1.0"
OR if the prefix is fully contained in the entry name:
* prefix: "refs/heads/v1.0"
* entry: "refs/heads/v1"
The first case is always correct, but the second is only correct if the ref
cache entry is a directory, for example:
* prefix: "refs/heads/example"
* entry: "refs/heads/"
Modify the logic in 'cache_ref_iterator_advance' to reflect these
expectations:
1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
ref cache entry do not overlap at all. Skip this entry.
2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
inside this entry if it is a directory. Skip if the entry is not a
directory, otherwise iterate over it.
3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
that the cache entry (directory or not) is fully contained by or equal to
the prefix. Iterate over this entry.
Note that condition 2 relies on the names of directory entries having the
appropriate trailing slash. The existing function documentation of
'create_dir_entry' explicitly calls out the trailing slash requirement, so
this is a safe assumption to make.
This bug generally doesn't have any user-facing impact, since it requires:
1. using a non-empty prefix without a trailing slash in an iteration like
'for_each_fullref_in',
2. the callback to said iteration not reapplying the original filter (as
for-each-ref does) to ensure unmatched refs are skipped, and
3. the repository having one or more refs that match part of, but not all
of, the prefix.
However, there are some niche scenarios that meet those criteria
(specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add
tests covering those cases to demonstrate the fix in this patch.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
initialize_attr_index() does not initialize the repo member of
attr_index. Starting in 44451a2e5e (attr: teach "--attr-source=<tree>"
global option to "git", 2023-05-06), this became a problem because
istate->repo gets passed down the call chain starting in
git_check_attr(). This gets passed all the way down to
replace_refs_enabled(), which segfaults when accessing r->gitdir.
Fix this by initializing the repository in the index state.
Signed-off-by: John Cai <johncai86@gmail.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'--dd' only makes sense for 'git log' and 'git show', so add it to
__git_log_show_options which is referenced in the completion for these
two commands.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option provides a shortcut to request diff with respect to first
parent for any kind of commit, universally. It's implemented as pure
synonym for "--diff-merges=first-parent --patch".
Gives user quick and universal way to see what changes, exactly, were
brought to a branch by merges as well as by regular commits.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Put descriptions of convenience shortcuts first, so they are the
first things reader observes rather than lengthy detailed stuff.
* Get rid of very long line containing all the --diff-merges formats
by replacing them with <format>, and putting each supported format
on its own line.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The DESCRIPTION's "first form" is actually the 1st, 2nd, 3rd and 5th
form in SYNOPSIS, the "second form" is the 4th one.
Interestingly, this state of affairs was introduced in
97fe725075 (cat-file docs: fix SYNOPSIS and "-h" output, 2021-12-28)
with the claim of "Now the two will match again." ("the two" being
DESCRIPTION and SYNOPSIS)...
The description also suffers from other correctness and clarity issues,
e.g., the "first form" paragraph discusses -p, -s and -t, but leaves out
-e, which is included in the corresponding SYNOPSIS section; the second
paragraph mentions <format>, which doesn't occur in SYNOPSIS at all, and
of the three batch options, really only describes the behavior of
--batch-check. Also the mention of "drivers" seems an implementation
detail not adding much clarity in a short summary (and isn't expanded
upon in the rest of the man page, either).
Rather than trying to maintain one-to-one (or N-to-M) correspondence
between the DESCRIPTION and SYNOPSIS forms, creating duplication and
providing opportunities for error, shorten the former into a concise
summary describing the two general modes of operation: batch and
non-batch, leaving details to the subsequent manual sections.
While here, fix a grammar error in the description of -e and make the
following further minor improvements:
NAME:
shorten ("content or type and size" isn't the whole story; say
"details" and leave the actual details to later sections)
SYNOPSIS and --help:
move the (--textconv | --filters) form before --batch, closer
to the other non-batch forms
Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The soft limit of the first line of the commit message should be
"no more than 50 characters" or "50 characters or less", but not
"less than 50 character".
This is an addition to commit c2c349a15c (doc: correct the 50 characters
soft limit, 2023-09-28).
Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of extraneous whitespace, replace tab-after-fullstop with
space, etc.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark pretty formats containing "%(decorate" as requiring decoration in
userformat_find_requirements(), same as "%d" and "%D".
Without this, cmd_log_init_finish() didn't invoke load_ref_decorations()
with the decoration_filter it puts together, and hence filtering options
such as --decorate-refs were quietly ignored.
Amend one of the %(decorate) checks in t4205-log-pretty-formats.sh to
test this.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allocate an array of packed_git pointers so that we can sort the list
of cruft packs, but we never free the array, causing a small leak. Note
that we don't need to free the packed_git structs themselves; they're
owned by the repository object.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No disrespect to other mailing list archives, but the local part of
their URLs will become pretty much meaningless once the archives go
out of service, and we learned the lesson hard way when $gmane
stopped serving.
Let's point into https://lore.kernel.org/ for an article that can be
found there, because the local part of the URL has the Message-Id:
that can be used to find the same message in other archives, even if
lore goes down.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We build up a string list of listen addresses from the command-line
arguments, but never free it. This causes t5811 to complain of a leak
(though curiously it seems to do so only when compiled with gcc, not
with clang).
To handle this correctly, we have to do a little refactoring:
- there are two exit points from the main function, depending on
whether we are entering the main loop or serving a single client
(since rather than a traditional fork model, we re-exec ourselves
with the extra "--serve" argument to accommodate Windows).
We don't need --listen at all in the --serve case, of course, but it
is passed along by the parent daemon, which simply copies all of the
command-line options it got.
- we just "return serve()" to run the main loop, giving us no chance
to do any cleanup
So let's use a "ret" variable to store the return code, and give
ourselves a single exit point at the end. That gives us one place to do
cleanup.
Note that this code also uses the "use a no-dup string-list, but
allocate strings we add to it" trick, meaning string_list_clear() will
not realize it should free them. We can fix this by switching to a "dup"
string-list, but using the "append_nodup" function to add to it (this is
preferable to tweaking the strdup_strings flag before clearing, as it
puts all the subtle memory-ownership code together).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of release_revisions() is to free memory associated with the
rev_info struct, but we have several "struct decoration" members that
are left untouched. Since the previous commit introduced a function to
do that, we can just call it.
We do have to provide some specialized callbacks to map the void
pointers onto real ones (the alternative would be casting the existing
function pointers; this generally works because "void *" is usually
interchangeable with a struct pointer, but it is technically forbidden
by the standard).
Since the line-log code does not expose the type it stores in the
decoration (nor of course the function to free it), I put this behind a
generic line_log_free() entry point. It's possible we may need to add
more line-log specific bits anyway (running t4211 shows a number of
other leaks in the line-log code).
While this doubtless cleans up many leaks triggered by the test suite,
the only script which becomes leak-free is t4217, as it does very little
beyond a simple traversal (its existing leak was from the use of
--children, which is now fixed).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's not currently any way to free the resources associated with a
decoration struct. As a result, we have several memory leaks which
cannot easily be plugged.
Let's add a "clear" function and make use of it in the example code of
t9004. This removes the only leak from that script, so we can mark it as
passing the leak sanitizer.
Curiously this leak is found only when running SANITIZE=leak with clang,
but not with gcc. But it is a bog-standard leak: we allocate some
memory in a local variable struct, and then exit main() without
releasing it. I'm not sure why gcc doesn't find it. After this
patch, both compilers report it as leak-free.
Note that the clear function takes a callback to free the individual
entries. That's not needed for our example (which is just decorating
with ints), but will be for real callers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When doing a `--geometric` repack, we make sure that the preferred pack
(if writing a MIDX) is the largest pack that we *didn't* repack. That
has the effect of keeping the preferred pack in sync with the pack
containing a majority of the repository's reachable objects.
But if the repository happens to double in size, we'll repack
everything. Here we don't specify any `--preferred-pack`, and instead
let the MIDX code choose.
In the past, that worked fine, since there would only be one pack to
choose from: the one we just wrote. But it's no longer necessarily the
case that there is one pack to choose from. It's possible that the
repository also has a cruft pack, too.
If the cruft pack happens to come earlier in lexical order (and has an
earlier mtime than any non-cruft pack), we'll pick that pack as
preferred. This makes it impossible to reuse chunks of the reachable
pack verbatim from pack-objects, so is sub-optimal.
Luckily, this is a somewhat rare circumstance to be in, since we would
have to repack the entire repository during a `--geometric` repack, and
the cruft pack would have to sort ahead of the pack we just created.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cruft packs are an alternative mechanism for storing a collection of
unreachable objects whose mtimes are recent enough to avoid being
pruned out of the repository.
When cruft packs were first introduced back in b757353676
(builtin/pack-objects.c: --cruft without expiration, 2022-05-20) and
a7d493833f (builtin/pack-objects.c: --cruft with expiration,
2022-05-20), the recommended workflow consisted of:
- Repacking periodically, either by packing anything loose in the
repository (via `git repack -d`) or producing a geometric sequence
of packs (via `git repack --geometric=<d> -d`).
- Every so often, splitting the repository into two packs, one cruft
to store the unreachable objects, and another non-cruft pack to
store the reachable objects.
Repositories may (out of band with the above) choose periodically to
prune out some unreachable objects which have aged out of the grace
period by generating a pack with `--cruft-expiration=<approxidate>`.
This allowed repositories to maintain relatively few packs on average,
and quarantine unreachable objects together in a cruft pack, avoiding
the pitfalls of holding unreachable objects as loose while they age out
(for more, see some of the details in 3d89a8c118
(Documentation/technical: add cruft-packs.txt, 2022-05-20)).
This all works, but can be costly from an I/O-perspective when
frequently repacking a repository that has many unreachable objects.
This problem is exacerbated when those unreachable objects are rarely
(if every) pruned.
Since there is at most one cruft pack in the above scheme, each time we
update the cruft pack it must be rewritten from scratch. Because much of
the pack is reused, this is a relatively inexpensive operation from a
CPU-perspective, but is very costly in terms of I/O since we end up
rewriting basically the same pack (plus any new unreachable objects that
have entered the repository since the last time a cruft pack was
generated).
At the time, we decided against implementing more robust support for
multiple cruft packs. This patch implements that support which we were
lacking.
Introduce a new option `--max-cruft-size` which allows repositories to
accumulate cruft packs up to a given size, after which point a new
generation of cruft packs can accumulate until it reaches the maximum
size, and so on. To generate a new cruft pack, the process works like
so:
- Sort a list of any existing cruft packs in ascending order of pack
size.
- Starting from the beginning of the list, group cruft packs together
while the accumulated size is smaller than the maximum specified
pack size.
- Combine the objects in these cruft packs together into a new cruft
pack, along with any other unreachable objects which have since
entered the repository.
Once a cruft pack grows beyond the size specified via `--max-cruft-size`
the pack is effectively frozen. This limits the I/O churn up to a
quadratic function of the value specified by the `--max-cruft-size`
option, instead of behaving quadratically in the number of total
unreachable objects.
When pruning unreachable objects, we bypass the new code paths which
combine small cruft packs together, and instead start from scratch,
passing in the appropriate `--max-pack-size` down to `pack-objects`,
putting it in charge of keeping the resulting set of cruft packs sized
correctly.
This may seem like further I/O churn, but in practice it isn't so bad.
We could prune old cruft packs for whom all or most objects are removed,
and then generate a new cruft pack with just the remaining set of
objects. But this additional complexity buys us relatively little,
because most objects end up being pruned anyway, so the I/O churn is
well contained.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repack builtin takes a `--max-pack-size` command-line argument which
it uses to feed into any of the pack-objects children that it may spawn
when generating a new pack.
This option is parsed with OPT_STRING, meaning that we'll accept
anything as input, punting on more fine-grained validation until we get
down into pack-objects.
This is fine, but it's wasteful to spend an entire sub-process just to
figure out that one of its option is bogus. Instead, parse the value of
`--max-pack-size` with OPT_MAGNITUDE in 'git repack', and then pass the
known-good result down to pack-objects.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the broken quoting the test wouldn't even parse correctly, but
there's also the '==' instead of POSIX '=' (of the shells I tested,
busybox ash, bash and ksh (93 and OpenBSD) accept '==', dash and zsh do
not), and 'print 2' from Python 2 days.
(I assume the test failing due to 3 != 4 is intentional or immaterial.)
Fixes: 93a5724613 ("test-lib: Add support for multiple test prerequisites")
Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The terminology was changed in b0d12fc9b2 (Use the word 'stuck'
instead of 'sticked').
Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's even an example of such usage in the README.
Fixes: 67ba13e5a4 ("git-jump: pass "merge" arguments to ls-files")
Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the grammar ("which default value is") and reword to match other
similar descriptions (say "configuration variable" instead of
"parameter", link to git-config(1)).
Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When trying to obtain the MD5 of the Coverity Scan Tool (in order to
decide whether a cached version can be used or a new version has to be
downloaded), it is possible to get a 401 (Authorization required) due to
either an incorrect token, or even more likely due to an incorrect
Coverity project name.
Seeing an authorization failure that is caused by an incorrect project
name was somewhat surprising to me when developing the Coverity
workflow, as I found such a failure suggestive of an incorrect token
instead.
So let's provide a helpful error message about that specifically when
encountering authentication issues.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The display width table for unicode characters has been updated for
Unicode 15.1
* bb/unicode-width-table-15:
unicode: update the width tables to Unicode 15.1
"git for-each-ref" and friends learn to apply mailmap to authorname
and other fields.
* ks/ref-filter-mailmap:
ref-filter: add mailmap support
t/t6300: introduce test_bad_atom
t/t6300: cleanup test_atom
"git rev-list --stdin" learned to take non-revisions (like "--not")
recently from the standard input, but the way such a "--not" was
handled was quite confusing, which has been rethought. This is
potentially a change that breaks backward compatibility.
* ps/revision-cmdline-stdin-not:
revision: make pseudo-opt flags read via stdin behave consistently
The list of additional XY values for submodules in short format
isn't formatted consistently with the rest of the document.
Format as list for consistency.
Signed-off-by: Javier Mora <cousteaulecommandant@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a second submodule with a name that differs from its path. Test
that calling set-url modifies the correct .gitmodules entries. Make sure
we don't create a section named after the path instead of the name.
Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the submodule again with an explicitly different name and path. Test
that calling set-branch modifies the correct .gitmodules entries. Make
sure we don't create a section named after the path instead of the name.
Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a test function to verify config files. Use it as it's more
precise.
Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The submodule repo the test set up had the 'topic' branch checked out,
meaning the repo's default branch (HEAD) is the 'topic' branch.
The following tests then pretended to switch between the default branch
and the 'topic' branch. This was papered over by continually adding
commits to the 'topic' branch and checking if the submodule gets updated
to this new commit.
Return the submodule repo to the 'main' branch after setup so we can
actually test the switching behavior.
Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
set-branch will return an error when setting the config fails so I don't
see why set-url shouldn't. Also skip the sync in this case.
Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commands need a path to a submodule but treated it as the name when
modifying the .gitmodules file, leading to confusion when a submodule's
name does not match its path.
Because calling submodule_from_path initializes the submodule cache, we
need to manually trigger a reread before syncing, as the cache is
missing the config change we just made.
Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In graph_write() we store commits in an oidset, but never clean it up,
leaking the contents. We should clear it in the cleanup section.
The oidset comes from 6830c36077 (commit-graph.h: replace 'commit_hex'
with 'commits', 2020-04-13), but it was just replacing a string_list
that was also leaked. Curiously, we fixed the leak of some adjacent
variables in commit fa8953cb40 (builtin/commit-graph.c: extract
'read_one_commit()', 2020-05-18), but the oidset wasn't included for
some reason.
In combination with the preceding commits, this lets us mark t5324 as
leak-free.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6c622f9f0b (commit-graph: write commit-graph chains, 2019-06-18)
added a base_graph_name string to the write_commit_graph_context struct.
But the end-of-function cleanup forgot to free it, causing a leak.
This (presumably in combination with the preceding leak-fixes) lets us
mark t5328 as leak-free.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a split graph file, we replace the final element of the
commit_graph_hash_after and commit_graph_filenames_after arrays. But
since these are allocated strings, we need to free them before
overwriting to avoid leaking the old string.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading the graph chain file, we open (and allocate) each
individual slice it mentions and then add them to a linked-list chain.
But if adding to the chain fails (e.g., because the base-graph chunk it
contains didn't match what we expected), we leave the function without
freeing the graph struct that caused the failure, leaking it.
We can fix it by calling free_graph_commit().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adding a graph to a chain, we do some consistency checks and then
if everything looks good, set g->base_graph to add a link to the chain.
But when we added a new consistency check in 209250ef38 (commit-graph.c:
prevent overflow in add_graph_to_chain(), 2023-07-12), it comes _after_
we've already set g->base_graph. So we might return failure, even though
we actually added to the chain.
This hasn't caused a bug yet, because after failing to add to the chain,
we discard the failed graph struct completely, leaking it. But in order
to fix that, it's important that the struct be in a consistent and
predictable state after the failure.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running "commit-graph verify", we call free_commit_graph(). That's
sufficient for the case of a single graph file, but if we loaded a chain
of split graph files, they form a linked list via the base_graph
pointers. We need to free all of them, or we leak all but the first
struct.
We can make this work by teaching free_commit_graph() to walk the
base_graph pointers and free each element. This in turn lets us simplify
close_commit_graph(), which does the same thing by recursion (we cannot
just use close_commit_graph() in "commit-graph verify", as the function
takes a pointer to an object store, and the verify command creates a
single one-off graph struct).
While indenting the code in free_commit_graph() for the loop, I noticed
that setting g->data to NULL is rather pointless, as we free the struct
a few lines later. So I cleaned that up while we're here.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When closing and freeing a commit-graph, the main entry point is
close_commit_graph(), which then uses close_commit_graph_one() to
recurse through the base_graph links and free each one.
Commit 957ba814bf (commit-graph: when closing the graph, also release
the slab, 2021-09-08) put the call to clear the slab into the recursive
function, but this is pointless: there's only a single global slab
variable. It works OK in practice because clearing the slab is
idempotent, but it makes the code harder to reason about and refactor.
Move it into the parent function so it's only called once (and there are
no other direct callers of the recursive close_commit_graph_one(), so we
are not hurting them).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We call repo_get_merge_bases(), which allocates a commit_list, but never
free the result, causing a leak.
The obvious solution is to free it, but we need to look at the contents
of the first item to decide whether to leave the loop. One option is to
free it in both code paths. But since the commit that the list points to
is longer-lived than the list itself, we can just dereference it
immediately, free the list, and then continue with the existing logic.
This is about the same amount of code, but keeps the list management all
in one place.
This lets us mark a number of merge-related test scripts as leak-free.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We loop over the set of commits to merge, and for each one compute the
merge base against the existing set of merge base candidates we've
found. Then we replace the candidate set with a simple assignment of the
list head, leaking the old list. We should free it first before
assignment.
This makes t5521 leak-free, so mark it as such.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test has never leaked since it was added. Let's annotate it to make
sure it stays that way (and to reduce noise when looking for other
leak-free scripts after we fix some leaks).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
5c387428f1 (parse-options: don't emit "ambiguous option" for aliases,
2019-04-29) added "updated_options" to struct parse_opt_ctx_t, but it
has never been used. Remove it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A small handful of the tests in t7700 (the main script for testing
functionality of 'git repack') are specifically related to cruft pack
operations.
Prepare for adding new cruft pack-related tests by moving the existing
set into a new test script.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous commit implemented the `gc.repackFilter` config option
to specify a filter that should be used by `git gc` when
performing repacks.
Another previous commit has implemented
`git repack --filter-to=<dir>` to specify the location of the
packfile containing filtered out objects when using a filter.
Let's implement the `gc.repackFilterTo` config option to specify
that location in the config when `gc.repackFilter` is used.
Now when `git gc` will perform a repack with a <dir> configured
through this option and not empty, the repack process will be
passed a corresponding `--filter-to=<dir>` argument.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous commit has implemented `git repack --filter=<filter-spec>` to
allow users to filter out some objects from the main pack and move them
into a new different pack.
It would be nice if this new different pack could be created in a
different directory than the regular pack. This would make it possible
to move large blobs into a pack on a different kind of storage, for
example cheaper storage.
Even in a different directory, this pack can be accessible if, for
example, the Git alternates mechanism is used to point to it. In fact
not using the Git alternates mechanism can corrupt a repo as the
generated pack containing the filtered objects might not be accessible
from the repo any more. So setting up the Git alternates mechanism
should be done before using this feature if the user wants the repo to
be fully usable while this feature is used.
In some cases, like when a repo has just been cloned or when there is no
other activity in the repo, it's Ok to setup the Git alternates
mechanism afterwards though. It's also Ok to just inspect the generated
packfile containing the filtered objects and then just move it into the
'.git/objects/pack/' directory manually. That's why it's not necessary
for this command to check that the Git alternates mechanism has been
already setup.
While at it, as an example to show that `--filter` and `--filter-to`
work well with other options, let's also add a test to check that these
options work well with `--max-pack-size`.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous commit has implemented `git repack --filter=<filter-spec>` to
allow users to filter out some objects from the main pack and move them
into a new different pack.
Users might want to perform such a cleanup regularly at the same time as
they perform other repacks and cleanups, so as part of `git gc`.
Let's allow them to configure a <filter-spec> for that purpose using a
new gc.repackFilter config option.
Now when `git gc` will perform a repack with a <filter-spec> configured
through this option and not empty, the repack process will be passed a
corresponding `--filter=<filter-spec>` argument.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This new option puts the objects specified by `<filter-spec>` into a
separate packfile.
This could be useful if, for example, some blobs take up a lot of
precious space on fast storage while they are rarely accessed. It could
make sense to move them into a separate cheaper, though slower, storage.
It's possible to find which new packfile contains the filtered out
objects using one of the following:
- `git verify-pack -v ...`,
- `test-tool find-pack ...`, which a previous commit added,
- `--filter-to=<dir>`, which a following commit will add to specify
where the pack containing the filtered out objects will be.
This feature is implemented by running `git pack-objects` twice in a
row. The first command is run with `--filter=<filter-spec>`, using the
specified filter. It packs objects while omitting the objects specified
by the filter. Then another `git pack-objects` command is launched using
`--stdin-packs`. We pass it all the previously existing packs into its
stdin, so that it will pack all the objects in the previously existing
packs. But we also pass into its stdin, the pack created by the previous
`git pack-objects --filter=<filter-spec>` command as well as the kept
packs, all prefixed with '^', so that the objects in these packs will be
omitted from the resulting pack. The result is that only the objects
filtered out by the first `git pack-objects` command are in the pack
resulting from the second `git pack-objects` command.
As the interactions with kept packs are a bit tricky, a few related
tests are added.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git repack` is about to learn a new `--filter=<filter-spec>` option and
we will want to check that this option is incompatible with
`--write-bitmap-index`.
Unfortunately it appears that a test like:
test_expect_success '--filter fails with --write-bitmap-index' '
test_must_fail \
env GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \
git -C bare.git repack -a -d --write-bitmap-index --filter=blob:none
'
sometimes fail because when rebuilding bitmaps, it appears that we are
reusing existing bitmap information. So instead of detecting that some
objects are missing and erroring out as it should, the
`git repack --write-bitmap-index --filter=...` command succeeds.
Let's fix that by making sure we rebuild bitmaps using new bitmaps
instead of existing ones.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new find_pack_prefix() to refactor code that handles finding
the pack prefix from the packtmp and packdir global variables, as we are
going to need this feature again in following commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new finish_pack_objects_cmd() to refactor duplicated code
that handles reading the packfile names from the output of a
`git pack-objects` command and putting it into a string_list, as well as
calling finish_command().
While at it, beautify a code comment a bit in the new function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a following commit, we will make it possible to separate objects in
different packfiles depending on a filter.
To make sure that the right objects are in the right packs, let's add a
new test-tool that can display which packfile(s) a given object is in.
Let's also make it possible to check if a given object is in the
expected number of packfiles with a `--check-count <n>` option.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9535ce7337 (pack-objects: add list-objects filtering, 2017-11-21)
taught `git pack-objects` to use `--filter`, but required the use of
`--stdout` since a partial clone mechanism was not yet in place to
handle missing objects. Since then, changes like 9e27beaa23
(promisor-remote: implement promisor_remote_get_direct(), 2019-06-25)
and others added support to dynamically fetch objects that were missing.
Even without a promisor remote, filtering out objects can also be useful
if we can put the filtered out objects in a separate pack, and in this
case it also makes sense for pack-objects to write the packfile directly
to an actual file rather than on stdout.
Remove the `--stdout` requirement when using `--filter`, so that in a
follow-up commit, repack can pass `--filter` to pack-objects to omit
certain objects from the resulting packfile.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Checking early for OBJ_COMMIT excludes other objects that can be
resolved to commits, like annotated tags. If we remove it, annotated
tags will be resolved and handled just fine by
lookup_commit_reference(), and if we are given something that can't be
resolved to a commit, we'll still get a useful error message, e.g.:
> error: object 21ab162211ac3ef13c37603ca88b27e9c7e0d40b is a tree, not a commit
> fatal: no merge base found
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"checkout --merge -- path" and "update-index --unresolve path" did
not resurrect conflicted state that was resolved to remove path,
but now they do.
* jc/unresolve-removal:
checkout: allow "checkout -m path" to unmerge removed paths
checkout/restore: add basic tests for --merge
checkout/restore: refuse unmerging paths unless checking out of the index
update-index: remove stale fallback code for "--unresolve"
update-index: use unmerge_index_entry() to support removal
resolve-undo: allow resurrecting conflicted state that resolved to deletion
update-index: do not read HEAD and MERGE_HEAD unconditionally
Extract the commonly used initialization of the --stat-width=<width>,
--stat-name-width=<width> and --stat-graph-with=<width> parameters to their
internal default values into a helper function, to avoid repeating the same
initialization code in a few places.
Add a couple of tests to additionally cover existing configuration options
diff.statNameWidth=<width> and diff.statGraphWidth=<width> when used by
git-merge to generate --stat outputs. This closes the gap that existed
previously in the --stat tests, and reduces the chances for having any
regressions introduced by this commit.
While there, perform a small bunch of minor wording tweaks in the improved
unit test, to improve its test-level consistency a bit.
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The files config.{h,c} contain functions that have to do with parsing,
but not config.
In order to further reduce all-in-one headers, separate out functions in
config.c that do not operate on config into its own file, parse.h,
and update the include directives in the .c files that need only such
functions accordingly.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An incorrectly defined boolean environment value would result in the
following error message:
bad boolean config value '%s' for '%s'
This is a misnomer since environment value != config value. Instead of
calling git_config_bool() to parse the environment value, mimic the
functionality inside of git_config_bool() but with the correct error
message.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
remove_or_warn() is only used by entry.c and apply.c, but it is
currently declared and defined in wrapper.{h,c}, so it has a scope much
greater than it needs. This needlessly large scope also causes wrapper.c
to need to include object.h, when this file is largely unconcerned with
Git objects.
Move remove_or_warn() to entry.{h,c}. The file apply.c still has access
to it, since it already includes entry.h for another reason.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to further reduce all-in-one headers, separate out functions in
hex.h that do not operate on object hashes into its own file, hex-ll.h,
and update the include directives in the .c files that need only such
functions accordingly.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
UBSAN options were not propagated through the test framework to git
run via the httpd, unlike ASAN options, which has been corrected.
* jk/test-pass-ubsan-options-to-http-test:
test-lib: set UBSAN_OPTIONS to match ASan
The command line completion script (in contrib/) can be told to
complete aliases by including ": git <cmd> ;" in the alias to tell
it that the alias should be completed similar to how "git <cmd>" is
completed. The parsing code for the alias as been loosened to
allow ';' without an extra space before it.
* jc/alias-completion:
completion: loosen and document the requirement around completing alias
"git range-diff --notes=foo" compared "log --notes=foo --notes" of
the two ranges, instead of using just the specified notes tree.
* kh/range-diff-notes:
range-diff: treat notes like `log`
"git diff" learned diff.statNameWidth configuration variable, to
give the default width for the name part in the "--stat" output.
* ds/stat-name-width-configuration:
diff --stat: add config option to limit filename width
Unused parameters in fsmonitor related code paths have been marked
as such.
* jk/fsmonitor-unused-parameter:
run-command: mark unused parameters in start_bg_wait callbacks
fsmonitor: mark unused hashmap callback parameters
fsmonitor/darwin: mark unused parameters in system callback
fsmonitor: mark unused parameters in stub functions
fsmonitor/win32: mark unused parameter in fsm_os__incompatible()
fsmonitor: mark some maybe-unused parameters
fsmonitor/win32: drop unused parameters
fsmonitor: prefer repo_git_path() to git_pathdup()
Fix recent regression in Git-GUI that fails to run hook scripts at
all.
* ml/git-gui-exec-path-fix:
git-gui - use git-hook, honor core.hooksPath
git-gui - re-enable use of hook scripts
The soft limit of the first line of the commit message should be
"no more than 50 characters" or "50 characters or less", but not
"less than 50 character".
Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The load_commit_graph_chain_fd_st() function will stop loading chains
when it sees an error. But if it has loaded any graph slice at all, it
will return it. This is a good thing for normal use (we use what data we
can, and this is just an optimization). But it's a bad thing for
"commit-graph verify", which should be careful about finding any
irregularities. We do complain to stderr with a warning(), but the
verify command still exits with a successful return code.
The new tests here cover corruption of both the base and tip slices of
the chain. The corruption of the base file already works (it is the
first file we look at, so when we see the error we return NULL). The
"tip" case is what is fixed by this patch (it complains to stderr but
still returns the base slice).
Likewise the existing tests for corruption of the commit-graph-chain
file itself need to be updated. We already exited non-zero correctly for
the "base" case, but the "tip" case can now do so, too.
Note that this also causes us to adjust a test later in the file that
similarly corrupts a tip (though confusingly the test script calls this
"base"). It checks stderr but erroneously expects the whole "verify"
command to exit with a successful code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we open a commit-graph-chain file, if it's smaller than a single
entry, we just quietly treat that as ENOENT. That make some sense if the
file is truly zero bytes, but it means that "commit-graph verify" will
quietly ignore a file that contains garbage if that garbage happens to
be short.
Instead, let's only simulate ENOENT when the file is truly empty, and
otherwise return EINVAL. The normal graph-loading routines don't care,
but "commit-graph verify" will notice and complain about the difference.
It's not entirely clear to me that the 0-is-ENOENT case actually happens
in real life, so we could perhaps just eliminate this special-case
altogether. But this is how we've always behaved, so I'm preserving it
in the name of backwards compatibility (though again, it really only
matters for "verify", as the regular routines are happy to load what
they can).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because it's OK to not have a graph file at all, the graph_verify()
function needs to tell the difference between a missing file and a real
error. So when loading a traditional graph file, we call
open_commit_graph() separately from load_commit_graph_chain_fd_st(), and
don't complain if the first one fails with ENOENT.
When the function learned about chain files in 3da4b609bb (commit-graph:
verify chains with --shallow mode, 2019-06-18), we couldn't be as
careful, since the only way to load a chain was with
read_commit_graph_one(), which did both the open/load as a single unit.
So we'll miss errors in chain files we load, thinking instead that there
was just no chain file at all.
Note that we do still report some of these problems to stderr, as the
loading function calls error() and warning(). But we'd exit with a
successful exit code, which is wrong.
We can fix that by using the recently split open/load functions for
chains. That lets us treat the chain file just like a single file with
respect to error handling here.
An existing test (from 3da4b609bb) shows off the problem; we were
expecting "commit-graph verify" to report success, but that makes no
sense. We did not even verify the contents of the graph data, because we
couldn't load it! I don't think this was an intentional exception, but
rather just the test covering what happened to occur.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t5324.20, we corrupt a hex character 60 bytes into the graph chain
file. Since the file consists of two hash identifiers, one per line, the
corruption differs between sha1 and sha256. In a sha1 repository, the
corruption is on the second line, and in a sha256 repository, it is on
the first.
We should of course detect the problem with either line. But as the next
few patches will show (and fix), that is not the case (in fact, we
currently do not exit non-zero for either line!). And while at the end
of our series we'll catch all errors, our intermediate states will have
differing behavior between the two hashes.
Let's make sure we test corruption of both the first and second lines,
and do so consistently with either hash by choosing offsets which are
always in the first hash (30 bytes) or in the second (70).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In read_commit_graph_one(), we call validate_mixed_generation_chain()
after loading the graph. Even though we don't check the return value,
this has the side effect of clearing the read_generation_data flag,
which is important when working with mixed generation numbers.
But doing this in load_commit_graph_chain_fd_st() makes more sense:
1. We are calling it even when we did not load a chain at all, which
is pointless (you cannot have mixed generations in a single file).
2. For now, all callers load the graph via read_commit_graph_one().
But the point of factoring out the open/load in the previous commit
was to let "commit-graph verify" call them separately. So it needs
to trigger this function as part of the load.
Without this patch, the mixed-generation tests in t5324 would start
failing on "git commit-graph verify" calls, once we switch to using
a separate open/load call there.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The load_commit_graph_chain() function opens the chain file and all of
the slices of graph that it points to. If there is no chain file (which
is a totally normal condition), we return NULL. But if we run into
errors with the chain file or loading the actual graph data, we also
return NULL, and the caller cannot tell the difference.
The caller can check for ENOENT for the unremarkable "no such file"
case. But I'm hesitant to assume that the rest of the function would
never accidentally set errno to ENOENT itself, since it is opening the
slice files (and that would mean the caller fails to notice a real
error).
So let's break this into two functions: one to open the file, and one to
actually load it. This matches the interface we provide for the
non-chain graph file, which will also come in handy in a moment when we
fix some bugs in the "git commit-graph verify" code.
Some notes:
- I've kept the "1 is good, 0 is bad" return convention (and the weird
"fd" out-parameter) used by the matching open_commit_graph()
function and other parts of the commit-graph code. This is unlike
most of the rest of Git (which would just return the fd, with -1 for
error), but it makes sense to stay consistent with the adjacent bits
of the API here.
- The existing chain loading function will quietly return if the file
is too small to hold a single entry. I've retained that behavior
(and explicitly set ENOENT in the opener function) for now, under
the notion that it's probably valid (though I'd imagine unusual) to
have an empty chain file.
There are two small behavior changes here, but I think both are strictly
positive:
1. The original blindly did a stat() before checking if fopen()
succeeded, meaning we were making a pointless extra stat call.
2. We now use fstat() to check the file size. The previous code using
a regular stat() on the pathname meant we could technically race
with somebody updating the chain file, and end up with a size that
does not match what we just opened with fopen(). I doubt anybody
ever hit this in practice, but it may have caused an out-of-bounds
read.
We'll retain the load_commit_graph_chain() function which does both the
open and reading steps (most existing callers do not care about seeing
errors anyway, since loading commit-graphs is optimistic).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the code is written today index_bulk_checkin only accepts blobs.
Remove the enum object_type parameter and rename index_bulk_checkin to
index_blob_bulk_checkin, index_stream to index_blob_stream,
deflate_to_pack to deflate_blob_to_pack, stream_to_pack to
stream_blob_to_pack, to make this explicit.
Not supporting commits, tags, or trees has no downside as it is not
currently supported now, and commits, tags, and trees being smaller by
design do not have the problem that the problem that index_bulk_checkin
was built to solve.
Before we start adding code to support the hash function transition
supporting additional objects types in index_bulk_checkin has no real
additional cost, just an extra function parameter to know what the
object type is. Once we begin the hash function transition this is not
the case.
The hash function transition document specifies that a repository with
compatObjectFormat enabled will compute and store both the SHA-1 and
SHA-256 hash of every object in the repository.
What makes this a challenge is that it is not just an additional hash
over the same object. Instead the hash function transition document
specifies that the compatibility hash (specified with
compatObjectFormat) be computed over the equivalent object that another
git repository whose storage hash (specified with objectFormat) would
store. When comparing equivalent repositories built with different
storage hash functions, the oids embedded in objects used to refer to
other objects differ and the location of signatures within objects
differ.
As blob objects have neither oids referring to other objects nor stored
signatures their storage hash and their compatibility hash are computed
over the same object.
The other kinds of objects: trees, commits, and tags, all store oids
referring to other objects. Signatures are stored in commit and tag
objects. As oids and the tags to store signatures are not the same size
in repositories built with different storage hashes the size of the
equivalent objects are also different.
A version of index_bulk_checkin that supports more than just blobs when
computing both the SHA-1 and the SHA-256 of every object added would
need a different, and more expensive structure. The structure is more
expensive because it would be required to temporarily buffering the
equivalent object the compatibility hash needs to be computed over.
A temporary object is needed, because before a hash over an object can
computed it's object header needs to be computed. One of the members of
the object header is the entire size of the object. To know the size of
an equivalent object an entire pass over the original object needs to be
made, as trees, commits, and tags are composed of a variable number of
variable sized pieces. Unfortunately there is no formula to compute the
size of an equivalent object from just the size of the original object.
Avoid all of those future complications by limiting index_bulk_checkin
to only work on blobs.
Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add mailmap support to ref-filter formats which are similar in
pretty. This support is such that the following pretty placeholders are
equivalent to the new ref-filter atoms:
%aN = authorname:mailmap
%cN = committername:mailmap
%aE = authoremail:mailmap
%aL = authoremail:mailmap,localpart
%cE = committeremail:mailmap
%cL = committeremail:mailmap,localpart
Additionally, mailmap can also be used with ":trim" option for email by
doing something like "authoremail:mailmap,trim".
The above also applies for the "tagger" atom, that is,
"taggername:mailmap", "taggeremail:mailmap", "taggeremail:mailmap,trim"
and "taggername:mailmap,localpart".
The functionality of ":trim" and ":localpart" remains the same. That is,
":trim" gives the email, but without the angle brackets and ":localpart"
gives the part of the email before the '@' character (if such a
character is not found then we directly grab everything between the
angle brackets).
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new function "test_bad_atom", which is similar to
"test_atom()" but should be used to check whether the correct error
message is shown on stderr.
Like "test_atom", the new function takes three arguments. The three
arguments specify the ref, the format and the expected error message
respectively, with an optional fourth argument for tweaking
"test_expect_*" (which is by default "success").
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, when the executable part of "test_expect_{success,failure}"
(inside "test_atom") got "eval"ed, it would have been syntactically
incorrect if the second argument ($2, which is the format) to "test_atom"
were enclosed in single quotes because the $variables would get
interpolated even before the arguments to "test_expect_{success,failure}"
are formed.
So fix this and also some style issues along the way.
Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add merge strategy option to produce more customizable merge result such
as automatically resolving conflicts.
Signed-off-by: Tang Yuyi <winglovet@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For completeness' sake, let's add support for submitting macOS builds to
Coverity Scan.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By adding the repository variable `ENABLE_COVERITY_SCAN_ON_OS` with a
value, say, `["windows-latest"]`, this GitHub workflow now runs on
Windows, allowing to analyze Windows-specific issues.
This allows, say, the Git for Windows fork to submit Windows builds to
Coverity Scan instead of Linux builds.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, the builds are submitted to the `git` project at
https://scan.coverity.com/projects/git.
The Git for Windows project would like to use this workflow, too,
though, and needs the builds to be submitted to the `git-for-windows`
Coverity project.
To that end, allow configuring the Coverity project name via the
repository variable, you guessed it, `COVERITY_PROJECT`. The default if
that variable is not configured or has an empty value is still `git`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It would add a 1GB+ download for every run, better cache it.
This is inspired by the GitHub Action `vapier/coverity-scan-action`,
however, it uses the finer-grained `restore`/`save` method to be able to
cache the Coverity Build Tool even if an unrelated step in the GitHub
workflow fails later on.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Coverity is a static analysis tool that detects and generates reports on
various security and code quality issues.
It is particularly useful when diagnosing memory safety issues which may
be used as part of exploiting a security vulnerability.
Coverity's website provides a service that accepts "builds" (which
contains the object files generated during a standard build as well as a
database generated by Coverity's scan tool).
Let's add a GitHub workflow to automate all of this. To avoid running it
without appropriate Coverity configuration (e.g. the token required to
use Coverity's services), the job only runs when the repository variable
"ENABLE_COVERITY_SCAN_FOR_BRANCHES" has been configured accordingly (see
https://docs.github.com/en/actions/learn-github-actions/variables for
details how to configure repository variables): It is expected to be a
valid JSON array of branch strings, e.g. `["main", "next"]`.
In addition, this workflow requires two repository secrets:
- COVERITY_SCAN_EMAIL: the email to send the report to, and
- COVERITY_SCAN_TOKEN: the Coverity token (look in the Project Settings
tab of your Coverity project).
Note: The initial version of this patch used
`vapier/coverity-scan-action` to benefit from that Action's caching of
the Coverity tool, which is rather large. Sadly, that Action only
supports Linux, and we want to have the option of building on Windows,
too. Besides, in the meantime Coverity requires `cov-configure` to be
runantime, and that Action was not adjusted accordingly, i.e. it seems
not to be maintained actively. Therefore it would seem prudent to
implement the steps manually instead of using that Action.
Initial-patch-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading revisions from stdin via git-rev-list(1)'s `--stdin` option
then these revisions never honor flags like `--not` which have been
passed on the command line. Thus, an invocation like e.g. `git rev-list
--all --not --stdin` will not treat all revisions read from stdin as
uninteresting. While this behaviour may be surprising to a user, it's
been this way ever since it has been introduced via 42cabc341c (Teach
rev-list an option to read revs from the standard input., 2006-09-05).
With that said, in c40f0b7877 (revision: handle pseudo-opts in `--stdin`
mode, 2023-06-15) we have introduced a new mode to read pseudo opts from
standard input where this behaviour is a lot more confusing. If you pass
`--not` via stdin, it will:
- Influence subsequent revisions or pseudo-options passed on the
command line.
- Influence pseudo-options passed via standard input.
- _Not_ influence normal revisions passed via standard input.
This behaviour is extremely inconsistent and bound to cause confusion.
While it would be nice to retroactively change the behaviour for how
`--not` and `--stdin` behave together, chances are quite high that this
would break existing scripts that expect the current behaviour that has
been around for many years by now. This is thus not really a viable
option to explore to fix the inconsistency.
Instead, we change the behaviour of how pseudo-opts read via standard
input influence the flags such that the effect is fully localized. With
this change, when reading `--not` via standard input, it will:
- _Not_ influence subsequent revisions or pseudo-options passed on
the command line, which is a change in behaviour.
- Influence pseudo-options passed via standard input.
- Influence normal revisions passed via standard input, which is a
change in behaviour.
Thus, all flags read via standard input are fully self-contained to that
standard input, only.
While this is a breaking change as well, the behaviour has only been
recently introduced with Git v2.42.0. Furthermore, the current behaviour
can be regarded as a simple bug. With that in mind it feels like the
right thing to retroactively change it and make the behaviour sane.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reported-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An error message given by "git send-email" when given a malformed
address did not give correct information, which has been corrected.
* tb/send-email-extract-valid-address-error-message-fix:
git-send-email.perl: avoid printing undef when validating addresses
HTTP Header redaction code has been adjusted for a newer version of
cURL library that shows its traces differently from earlier
versions.
* jk/redact-h2h3-headers-fix:
http: update curl http/2 info matching for curl 8.3.0
http: factor out matching of curl http/2 trace lines
Code clean-up.
* jk/ort-unused-parameter-cleanups:
merge-ort: lowercase a few error messages
merge-ort: drop unused "opt" parameter from merge_check_renames_reusable()
merge-ort: drop unused parameters from detect_and_process_renames()
merge-ort: stop passing "opt" to read_oid_strbuf()
merge-ort: drop custom err() function
The code to keep track of existing packs in the repository while
repacking has been refactored.
* tb/repack-existing-packs-cleanup:
builtin/repack.c: extract common cruft pack loop
builtin/repack.c: avoid directly inspecting "util"
builtin/repack.c: store existing cruft packs separately
builtin/repack.c: extract `has_existing_non_kept_packs()`
builtin/repack.c: extract redundant pack cleanup for existing packs
builtin/repack.c: extract redundant pack cleanup for --geometric
builtin/repack.c: extract marking packs for deletion
builtin/repack.c: extract structure to store existing packs
Code clean-up.
Keep only the first three clean-ups, and discard the rest to be replaced later.
cf. <owly1qetjqo1.fsf@fine.c.googlers.com>
cf. <owlyzg1dsswr.fsf@fine.c.googlers.com>
* la/trailer-cleanups:
trailer: split process_command_line_args into separate functions
trailer: split process_input_file into separate pieces
trailer: separate public from internal portion of trailer_iterator
For a long time we have used ASAN_OPTIONS to set abort_on_error. This is
important because we want to notice detected problems even in programs
which are expected to fail. But we never did the same for UBSAN_OPTIONS.
This means that our UBSan test suite runs might silently miss some
cases.
It also causes a more visible effect, which is that t4058 complains
about unexpected "fixes" (and this is how I noticed the issue):
$ make SANITIZE=undefined CC=gcc && (cd t && ./t4058-*)
...
ok 8 - git read-tree does not segfault # TODO known breakage vanished
ok 9 - reset --hard does not segfault # TODO known breakage vanished
ok 10 - git diff HEAD does not segfault # TODO known breakage vanished
The tests themselves aren't that interesting. We have a known bug where
these programs segfault, and they do when compiled without sanitizers.
With UBSan, when the test runs:
test_might_fail git read-tree --reset base
it gets:
cache-tree.c:935:9: runtime error: member access within misaligned address 0x5a5a5a5a5a5a5a5a for type 'struct cache_entry', which requires 8 byte alignment
So that's garbage memory which would _usually_ cause us to segfault, but
UBSan catches it and complains first about the alignment. That makes
sense, but the weird thing is that UBSan then exits instead of aborting,
so our test_might_fail call considers that an acceptable outcome and the
test "passes".
Curiously, this historically seems to have aborted, because I've run
"make test" with UBSan many times (and so did our CI) and we never saw
the problem. Even more curiously, I see an abort if I use clang with
ASan and UBSan together, like:
# this aborts!
make SANITIZE=undefined,address CC=clang
But not with just UBSan, and not with both when used with gcc:
# none of these do
make SANITIZE=undefined CC=gcc
make SANITIZE=undefined CC=clang
make SANITIZE=undefined,address CC=gcc
Likewise moving to older versions of gcc (I tried gcc-11 and gcc-12 on
my Debian system) doesn't abort. Nor does moving around in Git's
history. Neither this test nor the relevant code have been touched in a
while, and going back to v2.41.0 produces the same outcome (even though
many UBSan CI runs have passed in the meantime).
So _something_ changed on my system (and likely will soon on other
people's, since this is stock Debian unstable), but I didn't track
it further. I don't know why it ever aborted in the past, but we
definitely should be explicit here and tell UBSan what we want to
happen.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The argument order was incorrect. This was introduced by 246cac8505
(i18n: turn even more messages into "cannot be used together" ones,
2022-01-05).
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recently we started to tell users to spell ": git foo ;" with
space(s) around 'foo' for an alias to be completed similarly
to the 'git foo' command. It however is easy to also allow users to
spell it in a more natural way with the semicolon attached to 'foo',
i.e. ": git foo;". Also, add a comment to note that 'git' is optional
and writing ": foo;" would complete the alias just fine.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git update-index" learns "--show-index-version" to inspect
the index format version used by the on-disk index file.
* jc/update-index-show-index-version:
test-tool: retire "index-version"
update-index: add --show-index-version
update-index doc: v4 is OK with JGit and libgit2
Clarify how "alias.foo = : git cmd ; aliased-command-string" should
be spelled with necessary whitespaces around punctuation marks to
work.
* pb/completion-aliases-doc:
completion: improve doc for complex aliases
The command-line complation support (in contrib/) learned to
complete "git commit --trailer=" for possible trailer keys.
* pb/complete-commit-trailers:
completion: commit: complete trailers tokens more robustly
completion: commit: complete configured trailer tokens
"git diff --cached" codepath did not fill the necessary stat
information for a file when fsmonitor knows it is clean and ended
up behaving as if it is not clean, which has been corrected.
* js/diff-cached-fsmonitor-fix:
diff-lib: fix check_removed when fsmonitor is on
Update "git maintainance" timers' implementation based on systemd
timers to work with WSL.
* js/systemd-timers-wsl-fix:
maintenance(systemd): support the Windows Subsystem for Linux
"git diff --no-index -R <(one) <(two)" did not work correctly,
which has been corrected.
* pw/diff-no-index-from-named-pipes:
diff --no-index: fix -R with stdin
While git show accepts options that apply to the git diff-tree command,
some options do not make sense in the context of git show.
The options of git show are handled using the machinery of git log.
The git log manual page is a better place to look into than git diff-tree
for options that are not in the git show manual page.
Signed-off-by: Han Young <hanyang.tony@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, `range-diff` shows the default notes if no notes-related
arguments are given. This is also how `log` behaves. But unlike
`range-diff`, `log` does *not* show the default notes if
`--notes=<custom>` are given. In other words, this:
git log --notes=custom
is equivalent to this:
git log --no-notes --notes=custom
While:
git range-diff --notes=custom
acts like this:
git log --notes --notes-custom
This can’t be how the user expects `range-diff` to behave given that the
man page for `range-diff` under `--[no-]notes[=<ref>]` says:
> This flag is passed to the `git log` program (see git-log(1)) that
> generates the patches.
This behavior also affects `format-patch` since it uses `range-diff` for
the cover letter. Unlike `log`, though, `format-patch` is not supposed
to show the default notes if no notes-related arguments are given.[1]
But this promise is broken when the range-diff happens to have something
to say about the changes to the default notes, since that will be shown
in the cover letter.
Remedy this by introducing `--show-notes-by-default` that `range-diff` can
use to tell the `log` subprocess what to do.
§ Authors
• Fix by Johannes
• Tests by Kristoffer
† 1: See e.g. 66b2ed09c2 (Fix "log" family not to be too agressive about
showing notes, 2010-01-20).
Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The start_bg_command() function takes a callback to tell when the
background-ed process is "ready". The callback receives the
child_process struct as well as an extra void pointer. But curiously,
neither of the two users of this interface look at either parameter!
This makes some sense. The only non-test user of the API is fsmonitor,
which uses fsmonitor_ipc__get_state() to connect to a single global
fsmonitor daemon (i.e., the one we just started!).
So we could just drop these parameters entirely. But it seems like a
pretty reasonable interface for the "wait" callback to have access to
the details of the spawned process, and to have room for passing extra
data through a void pointer. So let's leave these in place but mark the
unused ones so that -Wunused-parameter does not complain.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Like many hashmap comparison functions, our cookies_cmp() does not look
at its extra void data parameter. This should have been annotated in
02c3c59e62 (hashmap: mark unused callback parameters, 2022-08-19), but
this new case was added around the same time (plus fsmonitor is not
built at all on Linux, so it is easy to miss there).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We pass fsevent_callback() to the system FSEventStreamCreate() function
as a callback. So we must match the expected function signature, even
though we don't care about all of the parameters. Mark the unused ones
to satisfy -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fsmonitor code has some platform-specific functions for which one or
more platforms implement noop or stub functions. We can't get rid of
these functions nor change their interface, since the point is to match
their equivalents in other platforms. But let's annotate their
parameters to quiet the compiler's -Wunused-parameter warning.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We never look at the "ipc" argument we receive. It was added in
8f44976882 (fsmonitor: avoid socket location check if using hook,
2022-10-04) to support the darwin fsmonitor code. The win32 code has to
match the same interface, but we should use an annotation to silence
-Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's a bit of conditionally-compiled code in fsmonitor, so some
function parameters may be unused depending on the build options:
- in fsmonitor--daemon.c's try_to_run_foreground_daemon(), we take a
detach_console argument, but it's only used on Windows. This seems
intentional (and not mistakenly missing other platforms) based on
the discussion in c284e27ba7 (fsmonitor--daemon: implement 'start'
command, 2022-03-25), which introduced it.
- in fsmonitor-setting.c's check_for_incompatible(), we pass the "ipc"
flag down to the system-specific fsm_os__incompatible() helper. But
we can only do so if our platform has such a helper.
In both cases we can mark the argument as MAYBE_UNUSED. That annotates
it enough to suppress the compiler's -Wunused-parameter warning, but
without making it impossible to use the variable, as a regular UNUSED
annotation would.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few helper functions (centered around file-watch events) take extra
fsmonitor state parameters that they don't use. These are static helpers
local to the win32 implementation, and don't need to conform to any
particular interface. We can just drop the extra parameters, which
simplifies the code and silences -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fsmonitor_ipc__get_path() function ignores its repository argument.
It should use it when constructing repo paths (though in practice, it is
unlikely anything but the_repository is ever passed, so this is cleanup
and future proofing, not a bug fix).
Note that despite the lack of "dup" in the name, repo_git_path() behaves
like git_pathdup() and returns an allocated string.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The completion script (in contrib/) has been taught to treat the
"-t" option to "git checkout" and "git switch" just like the
"--track" option, to complete remote-tracking branches.
* js/complete-checkout-t:
completion(switch/checkout): treat --track and -t the same
"git grep -e A --no-or -e B" is accepted, even though the negation
of "or" did not mean anything, which has been tightened.
* rs/grep-no-no-or:
grep: reject --no-or
When validating email addresses with `extract_valid_address_or_die()`,
we print out a helpful error message when the given input does not
contain a valid email address.
However, the pre-image of this patch looks something like:
my $address = shift;
$address = extract_valid_address($address):
die sprintf(__("..."), $address) if !$address;
which fails when given a bogus email address by trying to use $address
(which is undef) in a sprintf() expansion, like so:
$ git.compile send-email --to="pi <pi@pi>" /tmp/x/*.patch --force
Use of uninitialized value $address in sprintf at /home/ttaylorr/src/git/git-send-email line 1175.
error: unable to extract a valid address from:
This regression dates back to e431225569 (git-send-email: remove invalid
addresses earlier, 2012-11-22), but became more noticeable in a8022c5f7b
(send-email: expose header information to git-send-email's
sendemail-validate hook, 2023-04-19), which validates SMTP headers in
the sendemail-validate hook.
Avoid trying to format an undef by storing the given and cleaned address
separately. After applying this fix, the error contains the invalid
email address, and the warning disappears:
$ git.compile send-email --to="pi <pi@pi>" /tmp/x/*.patch --force
error: unable to extract a valid address from: pi <pi@pi>
Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-gui currently runs some hooks directly using its own code written
before 2010, long predating git v2.9 that added the core.hooksPath
configuration to override the assumed location at $GIT_DIR/hooks. Thus,
git-gui looks for and runs hooks including prepare-commit-msg,
commit-msg, pre-commit, post-commit, and post-checkout from
$GIT_DIR/hooks, regardless of configuration. Commands (e.g., git-merge)
that git-gui invokes directly do honor core.hooksPath, meaning the
overall behaviour is inconsistent.
Furthermore, since v2.36 git exposes its hook execution machinery via
`git-hook run`, eliminating the need for others to maintain code
duplicating that functionality. Using git-hook will both fix git-gui's
current issues on hook configuration and (presumably) reduce the
maintenance burden going forward. So, teach git-gui to use git-hook.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new configuration option diff.statNameWidth=<width> that is equivalent
to the command-line option --stat-name-width=<width>, but it is ignored
by format-patch. This follows the logic established by the already
existing configuration option diff.statGraphWidth=<width>.
Limiting the widths of names and graphs in the --stat output makes sense
for interactive work on wide terminals with many columns, hence the support
for these configuration options. They don't affect format-patch because
it already adheres to the traditional 80-column standard.
Update the documentation and add more tests to cover new configuration
option diff.statNameWidth=<width>. While there, perform a few minor code
and whitespace cleanups here and there, as spotted.
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier, commit aae9560a introduced search in $PATH to find executables
before running them, avoiding an issue where on Windows a same named
file in the current directory can be executed in preference to anything
in a directory in $PATH. This search is intended to find an absolute
path for a bare executable ( e.g, a function "foo") by finding the first
instance of "foo" in a directory given in $PATH, and this search works
correctly. The search is explicitly avoided for an executable named
with an absolute path (e.g., /bin/sh), and that works as well.
Unfortunately, the search is also applied to commands named with a
relative path. A hook script (or executable) $HOOK is usually located
relative to the project directory as .git/hooks/$HOOK. The search for
this will generally fail as that relative path will (probably) not exist
on any directory in $PATH. This means that git hooks in general now fail
to run. Considerable mayhem could occur should a directory on $PATH be
git controlled. If such a directory includes .git/hooks/$HOOK, that
repository's $HOOK will be substituted for the one in the current
project, with unknown consequences.
This lookup failure also occurs in worktrees linked to a remote .git
directory using git-new-workdir. However, a worktree using a .git file
pointing to a separate git directory apparently avoids this: in that
case the hook command is resolved to an absolute path before being
passed down to the code introduced in aae9560a.
Fix this by replacing the test for an "absolute" pathname to a check for
a command name having more than one pathname component. This limits the
search and absolute pathname resolution to bare commands. The new test
uses tcl's "file split" command. Experiments on Linux and Windows, using
tclsh, show that command names with relative and absolute paths always
give at least two components, while a bare command gives only one.
Linux: puts [file split {foo}] ==> foo
Linux: puts [file split {/foo}] ==> / foo
Linux: puts [file split {.git/foo}] ==> .git foo
Windows: puts [file split {foo}] ==> foo
Windows: puts [file split {c:\foo}] ==> c:/ foo
Windows: puts [file split {.git\foo}] ==> .git foo
The above results show the new test limits search and replacement
to bare commands on both Linux and Windows.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As noted in CodingGuidelines, error messages should not be capitalized.
Fix up a few of these that were copied verbatim from merge-recursive to
match our modern style.
We'll likewise fix up the matching ones from merge-recursive. We care a
bit less there, since the hope is that it will eventually go away. But
besides being the right thing to do in the meantime, it is necessary for
t6406 to pass both with and without GIT_TEST_MERGE_ALGORITHM set (one of
our CI jobs sets it to "recursive", which will use the merge-recursive.c
code). An alternative would be to use "grep -i" in the test to check
the message, but it's nice for the test suite to be be more exact (we'd
notice if the capitalization fix regressed).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `--type=<type>` was added as a prefered alias for `--<type>` by
fb0dc3bac1 (builtin/config.c: support `--type=<type>` as preferred
alias for `--<type>`), the explanation for the path type was
reworded. Whereas the previous explanation said "expand a leading
`~`" this was changed to "adding a leading `~`". Change "adding" to
"expanding" to correctly explain the canonicalization.
Signed-off-by: Evan Gates <evan.gates@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To redact header lines in http/2 curl traces, we have to parse past some
prefix bytes that curl sticks in the info lines it passes to us. That
changed once already, and we adapted in db30130165 (http: handle both
"h2" and "h2h3" in curl info lines, 2023-06-17).
Now it has changed again, in curl's fbacb14c4 (http2: cleanup trace
messages, 2023-08-04), which was released in curl 8.3.0. Running a build
of git linked against that version will fail to redact the trace (and as
before, t5559 notices and complains).
The format here is a little more complicated than the other ones, as it
now includes a "stream id". This is not constant but is always numeric,
so we can easily parse past it.
We'll continue to match the old versions, of course, since we want to
work with many different versions of curl. We can't even select one
format at compile time, because the behavior depends on the runtime
version of curl we use, not the version we build against.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have to parse out curl's http/2 trace lines so we can redact their
headers. We already match two different types of lines from various
vintages of curl. In preparation for adding another (which will be
slightly more complex), let's pull the matching into its own function,
rather than doing it in the middle of a conditional.
While we're doing so, let's expand the comment a bit to describe the two
matches. That probably should have been part of db30130165 (http: handle
both "h2" and "h2h3" in curl info lines, 2023-06-17), but will become
even more important as we add new types.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The merge_options parameter has never been used since the function was
introduced in 64aceb6d73 (merge-ort: add code to check for whether
cached renames can be reused, 2021-05-20). In theory some merge options
might impact our decisions here, but that has never been the case so
far.
Let's drop it to appease -Wunused-parameter; it would be easy to add
back later if we need to (there is only one caller).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function takes three trees representing the merge base and both
sides of the merge, but never looks at any of them. This is due to
f78cf97617 (merge-ort: call diffcore_rename() directly, 2021-02-14).
Prior to that commit, we passed pairs of trees to diff_tree_oid(). But
after that commit, we collect a custom diff_queue for each pair in the
merge_options struct, and just run diffcore_rename() on the result. So
the function does not need to know about the original trees at all
anymore.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function doesn't look at its merge_options parameter. It used to
pass it down to err(), but that function no longer exists (and didn't
look at "opt" anyway). We can drop it here.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The merge-ort code has an err() function, but it's really just error()
in disguise. It differs in two ways:
1. It takes a "struct merge_options" argument. But the function
completely ignores it! We can simply remove it.
2. It formats the error string into a strbuf, prepending "error: ",
and then feeds the result into error(). But this is wrong! The
error() function already adds the prefix, so we end up with:
error: error: Failed to execute internal merge
So let's just drop this function entirely and call error() directly, as
the functions are otherwise identical (note that they both always return
-1).
Presumably nobody noticed the bogus messages because they are quite hard
to trigger (they are mostly internal errors reading and writing
objects). However, one easy trigger is a custom merge driver which dies
by signal; we have a test already here, but we were not checking the
contents of stderr.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
References from description of the `--patch` option in various
manual pages have been simplified and improved.
* so/diff-doc-for-patch-update:
doc/diff-options: fix link to generating patch section
Various fixes to the behaviour of "rebase -i" when the command got
interrupted by conflicting changes.
* pw/rebase-i-after-failure:
rebase -i: fix adding failed command to the todo list
rebase --continue: refuse to commit after failed command
rebase: fix rewritten list for failed pick
sequencer: factor out part of pick_commits()
sequencer: use rebase_path_message()
rebase -i: remove patch file after conflict resolution
rebase -i: move unlink() calls
The default log message created by "git revert", when reverting a
commit that records a revert, has been tweaked.
* ob/revert-of-revert-is-reapply:
git-revert.txt: add discussion
sequencer: beautify subject of reverts of reverts
"git log --format" has been taught the %(decorate) placeholder.
* ak/pretty-decorate-more:
decorate: use commit color for HEAD arrow
pretty: add pointer and tag options to %(decorate)
pretty: add %(decorate[:<options>]) format
decorate: color each token separately
decorate: avoid some unnecessary color overhead
decorate: refactor format_decorations()
pretty-formats: enclose options in angle brackets
pretty-formats: define "literal formatting code"
We now limit depth of the tree objects and maximum length of
pathnames recorded in tree objects.
* jk/tree-name-and-depth-limit:
lower core.maxTreeDepth default to 2048
tree-diff: respect max_allowed_tree_depth
list-objects: respect max_allowed_tree_depth
read_tree(): respect max_allowed_tree_depth
traverse_trees(): respect max_allowed_tree_depth
add core.maxTreeDepth config
fsck: detect very large tree pathnames
tree-walk: rename "error" variable
tree-walk: drop MAX_TRAVERSE_TREES macro
tree-walk: reduce stack size for recursive functions
"git for-each-ref --sort='contents:size'" sorts the refs according
to size numerically, giving a ref that points at a blob twelve-byte
(12) long before showing a blob hundred-byte (100) long.
* ks/ref-filter-sort-numerically:
ref-filter: sort numerically when ":size" is used
When generating the list of packs to store in a MIDX (when given the
`--write-midx` option), we include any cruft packs both during
--geometric and non-geometric repacks.
But the rules for when we do and don't have to check whether any of
those cruft packs were queued for deletion differ slightly between the
two cases.
But the two can be unified, provided there is a little bit of extra
detail added in the comment to clarify when it is safe to avoid checking
for any pending deletions (and why it is OK to do so even when not
required).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `->util` field corresponding to each string_list_item is used to
track the existence of some pack at the beginning of a repack operation
was originally intended to be used as a bitfield.
This bitfield tracked:
- (1 << 0): whether or not the pack should be deleted
- (1 << 1): whether or not the pack is cruft
The previous commit removed the use of the second bit, but a future
patch (from a different series than this one) will introduce a new use
of it.
So we could stop treating the util pointer as a bitfield and instead
start treating it as if it were a boolean. But this would require some
backtracking when that later patch is applied.
Instead, let's avoid touching the ->util field directly, and instead
introduce convenience functions like:
- pack_mark_for_deletion()
- pack_is_marked_for_deletion()
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When repacking with the `--write-midx` option, we invoke the function
`midx_included_packs()` in order to produce the list of packs we want to
include in the resulting MIDX.
This list is comprised of:
- existing .keep packs
- any pack(s) which were written earlier in the same process
- any unchanged packs when doing a `--geometric` repack
- any cruft packs
Prior to this patch, we stored pre-existing cruft and non-cruft packs
together (provided those packs are non-kept). This meant we needed an
additional bit to indicate which non-kept pack(s) were cruft versus
those that aren't.
But alternatively we can store cruft packs in a separate list, avoiding
the need for this extra bit, and simplifying the code below.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When there is:
- at least one pre-existing packfile (which is not marked as kept),
- repacking with the `-d` flag, and
- not doing a cruft repack
, then we pass a handful of additional options to the inner
`pack-objects` process, like `--unpack-unreachable`,
`--keep-unreachable`, and `--pack-loose-unreachable`, in addition to
marking any packs we just wrote for promisor remotes as kept in-core
(with `--keep-pack`, as opposed to the presence of a ".keep" file on
disk).
Because we store both cruft and non-cruft packs together in the same
`existing.non_kept_packs` list, it suffices to check its `nr` member to
see if it is zero or not.
But a following change will store cruft- and non-cruft packs separately,
meaning this check would break as a result. Prepare for this by
extracting this part of the check into a new helper function called
`has_existing_non_kept_packs()`.
This patch does not introduce any functional changes, but prepares us to
make a more isolated change in a subsequent patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To remove redundant packs at the end of a repacking operation, Git uses
its `remove_redundant_pack()` function in a loop over the set of
pre-existing, non-kept packs.
In a later commit, we will split this list into two, one for
pre-existing cruft pack(s), and another for non-cruft pack(s). Prepare
for this by factoring out the routine to loop over and delete redundant
packs into its own function.
Instead of calling `remove_redundant_pack()` directly, we now will call
`remove_redundant_existing_packs()`, which itself dispatches a call to
`remove_redundant_packs_1()`. Note that the geometric repacking code
will still call `remove_redundant_pack()` directly, but see the previous
commit for more details.
Having `remove_redundant_packs_1()` exist as a separate function may
seem like overkill in this patch. However, a later patch will call
`remove_redundant_packs_1()` once over two separate lists, so this
refactoring sets us up for that.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To reduce the complexity of the already quite-long `cmd_repack()`
implementation, extract out the parts responsible for deleting redundant
packs from a geometric repack out into its own sub-routine.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the end of a repack (when given `-d`), Git attempts to remove any
packs which have been made "redundant" as a result of the repacking
operation. For example, an all-into-one (`-A` or `-a`) repack makes
every pre-existing pack which is not marked as kept redundant. Geometric
repacks (with `--geometric=<n>`) make any packs which were rolled up
redundant, and so on.
But before deleting the set of packs we think are redundant, we first
check to see whether or not we just wrote a pack which is identical to
any one of the packs we were going to delete. When this is the case, Git
must avoid deleting that pack, since it matches a pack we just wrote
(so deleting it may cause the repository to become corrupt).
Right now we only process the list of non-kept packs in a single pass.
But a future change will split the existing non-kept packs further into
two lists: one for cruft packs, and another for non-cruft packs.
Factor out this routine to prepare for calling it twice on two separate
lists in a future patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repack machinery needs to keep track of which packfiles were present
in the repository at the beginning of a repack, segmented by whether or
not each pack is marked as kept.
The names of these packs are stored in two `string_list`s, corresponding
to kept- and non-kept packs, respectively. As a consequence, many
functions within the repack code need to take both `string_list`s as
arguments, leading to code like this:
ret = write_cruft_pack(&cruft_po_args, packtmp, pack_prefix,
cruft_expiration, &names,
&existing_nonkept_packs, /* <- */
&existing_kept_packs); /* <- */
Wrap up this pair of `string_list`s into a single structure that stores
both. This saves us from having to pass both string lists separately,
and prepares for adding additional fields to this structure.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update an error message (which would probably never been seen).
* ob/sequencer-reword-error-message:
sequencer: fix error message on failure to copy SQUASH_MSG
Unused parameters to functions are marked as such, and/or removed,
in order to bring us closer to -Wunused-parameter clean.
* jk/unused-post-2.42-part2:
parse-options: mark unused parameters in noop callback
interpret-trailers: mark unused "unset" parameters in option callbacks
parse-options: add more BUG_ON() annotations
merge: do not pass unused opt->value parameter
parse-options: mark unused "opt" parameter in callbacks
parse-options: prefer opt->value to globals in callbacks
checkout-index: delay automatic setting of to_tempfile
format-patch: use OPT_STRING_LIST for to/cc options
merge: simplify parsing of "-n" option
merge: make xopts a strvec
The completion code can be told to use a particular completion for
aliases that shell out by using ': git <cmd> ;' as the first command of
the alias. This only works if <cmd> and the semicolon are separated by a
space, since if the space is missing __git_aliased_command returns (for
example) 'checkout;' instead of just 'checkout', and then
__git_complete_command fails to find a completion for 'checkout;'.
The examples have that space but it's not clear if it's just for
style or if it's mandatory. Explicitly mention it.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we added support for completing configured
trailer tokens in 'git commit --trailer'.
Make the implementation more robust by:
- using '__git' instead of plain 'git', as the rest of the completion
script does
- using a stricter pattern for --get-regexp to avoid false hits
- using 'cut' and 'rev' instead of 'awk' to account for tokens including
dots.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was introduced by 56dc3ab04 ("sequencer (rebase -i): implement the
'edit' command", 2017-01-02), and was pointless from the get-go: all
early exits from the loop above are returns, so todo_list->current ==
todo_list->nr is an invariant after the loop.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test was introduced by commit 0c164ae7a ("rebase -i: add another
reword test", 2021-08-20). I didn't quite get what it was meant to do,
so here's an explanation from Phillip:
The purpose of the test is to ensure that
(i) There are no uncommitted changes when the editor runs. i.e., we
commit without running the editor and then reword by amending
that commit. This ensures that we have the same user experience
whether or not the commit was fast-forwarded [1].
(ii) That the todo list is re-read after the commit has been reworded.
This is to allow the user to update the todo list while the rebase
is paused for editing the commit message.
[1] https://lore.kernel.org/git/20190812175046.GM20404@szeder.dev/
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As "git update-index --show-index-version" can do the same thing,
the 'index-version' subcommand in the test-tool lost its reason to
exist. Remove it and replace its use with the end-user facing
'git update-index --show-index-version'.
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git update-index --index-version N" is used to set the index format
version to a specific version, but there was no way to query the
current version used in the on-disk index file.
Teach the command a new "--show-index-version" option, and also
teach the "--index-version N" option to report what the version was
when run with the "--verbose" option.
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Being invented in late 2012 no longer makes the index v4 format
"relatively young".
The support for the index version 4 was added to libgit2 with their
5625d86b (index: support index v4, 2016-05-17) and to JGit with
their e9cb0a8e (DirCache: support index V4, 2020-08-10).
Let's update the paragraph that discouraged its use for folks overly
cautious about cross-tool compatibility.
Helped-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git diff-index` may return incorrect deleted entries when fsmonitor
is used in a repository with git submodules. This can be observed on
Mac machines, but it can affect all other supported platforms too.
If fsmonitor is used, `stat *st` is not initialized if cache_entry has
CE_FSMONITOR_VALID set. But, there are three call sites that rely on stat
afterwards, which can result in incorrect results.
This change partially reverts commit 4f3d6d02 (fsmonitor: skip lstat
deletion check during git diff-index, 2021-03-17).
Signed-off-by: Josip Sokcevic <sokcevic@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running in the Windows Subsystem for Linux (WSL), it is usually
necessary to use the Git Credential Manager for authentication when
performing the background fetches.
This requires interoperability between the Windows Subsystem for Linux
and the Windows host to work, which uses so-called vsocks, i.e. sockets
intended for communcations between virtual machines and the host they
are running on.
However, when Git is configured to run background maintenance via
`systemd`, the address families available to those maintenance processes
are restricted, and did not include `AF_VSOCK`. This leads to problems
e.g. when a background fetch tries to access github.com:
systemd[437]: Starting Optimize Git repositories data...
git[747387]: WSL (747387) ERROR: UtilBindVsockAnyPort:285: socket failed 97
git[747381]: fatal: could not read Username for 'https://github.com': No such device or address
git[747381]: error: failed to prefetch remotes
git[747381]: error: task 'prefetch' failed
systemd[437]: git-maintenance@hourly.service: Main process exited, code=exited, status=1/FAILURE
systemd[437]: git-maintenance@hourly.service: Failed with result 'exit-code'.
systemd[437]: Failed to start Optimize Git repositories data.
Address this (pun intended) by adding the `AF_VSOCK` address family to
the allow list.
This fixes https://github.com/microsoft/git/issues/604.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When -R is given, queue_diff() swaps the mode and name variables of the
two files to produce a reverse diff. 1e3f26542a (diff --no-index:
support reading from named pipes, 2023-07-05) added variables that
indicate whether files are special, i.e named pipes or - for stdin.
These new variables were not swapped, though, which broke the handling
of stdin with with -R. Swap them like the other metadata variables.
Reported-by: Martin Storsjö <martin@martin.st>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, process_command_line_args did two things:
(1) parse trailers from the configuration, and
(2) parse trailers defined on the command line.
Separate (1) outside to a new function, parse_trailers_from_config.
Rename the remaining logic to parse_trailers_from_command_line_args.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, process_input_file does three things:
(1) parse the input string for trailers,
(2) print text before the trailers, and
(3) calculate the position of the input where the trailers end.
Rename this function to parse_trailers(), and make it only do
(1). The caller of this function, process_trailers, becomes responsible
for (2) and (3). These items belong inside process_trailers because they
are both concerned with printing the surrounding text around
trailers (which is already one of the immediate concerns of
process_trailers).
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fields here are not meant to be used by downstream callers, so put
them behind an anonymous struct named as "internal" to warn against
their use. This follows the pattern in 576de3d956 (unpack_trees: start
splitting internal fields from public API, 2023-02-27).
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git switch --track ` is to be completed, only remote refs are
eligible because that is what the `--track` option targets.
And when the short-hand `-t` is used instead, the same _should_ happen.
Let's make it so.
Note that the bug exists both in the completions of `switch` and
`completion`, even if it manifests in slightly different ways: While
the completion of `git switch -t ` will not even look at remote refs,
the completion of `git checkout -t ` will look at both remote _and_
local refs. Both should look only at remote refs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--trailer` option takes a "<token>=<value>" argument, for example
--trailer "Acked-by=Bob"
And in this exampple it is understood that "Acked-by" is the <token>.
However, the user can use a shorter "ack" string by defining
configuration like
git config trailer.ack.key "Acked-by"
However, in the docs we define the above configuration as
trailer.<token>.key
so the <token> can mean either the longer "Acked-by" or the shorter
"ack".
Separate the two meanings of <token> into <key> and <keyAlias>, and
update the configuration syntax to say "trailer.<keyAlias>.key".
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sentence does not mention the effect of configuration variables at
all, when they are actively used by default (unless --parse is
specified) to potentially add new trailers, without the user having to
always supply --trailer manually.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The phrase "join whitespace-continued values" requires some additional
context. For example, "whitespace" means newlines (not just space
characters), and "join" means to join only the multiple lines together
for a single trailer (and not that we are joining multiple trailers
together). That is, "join" means to convert
token: This is a very long value, with spaces and
newlines in it.
to
token: This is a very long value, with spaces and newlines in it.
and does not mean to convert
token: value1
token: value2
to
token: value1 value2.
Update the help text to resolve the above ambiguity. While we're add it,
update the docs to use similar language as the change in the help text.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For users who are skimming the docs to go straight to the individual
breakdown of each flag, it may not be clear why --parse is a convenience
alias (without them also looking at the other options that --parse turns
on). To save them the trouble of looking at the other options (and
computing what that would mean), describe a summary of the overall
effect.
Similarly update the area when we first mention --parse near the top of
the doc.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the phrase "configuration variables" instead of "rules" because
(1) we already say "configuration variables" in multiple
places in the docs (where the word "rules" is only used for describing
"--only-input" behavior and for an unrelated case of mentioning how
the trailers do not follow "rules for RFC 822 headers"), and
(2) this phrase is more specific than just "rules".
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing description "set parsing options" is vague, because
arguably _all_ of the options for interpret-trailers have to do with
parsing to some degree.
Explain what this flag does to match what is in the docs, namely how
it is an alias for "--only-trailers --only-input --unfold".
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the help text to say "placement" instead of "action" because the
values are placements, not actions.
While we're at it, tweak the documentation to say "placements" instead
of "values", similar to how the existing language for "--if-exists" uses
the word "action" to describe both the syntax (with the phrase
"--if-exists <action>") and the possible values (with the phrase
"possible actions").
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The wording "all configuration variables" is misleading (the same could
be said to the descriptions of the "--[no-]if-exists" and the
"--[no-]if-missing" options). Specifying --where=value overrides only
the trailer.where variable and applicable trailer.<token>.where
variables, and --no-where stops the overriding of these variables.
Ditto for the other two with their relevant configuration variables.
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the "--no-where" flag is tested, the "--no-if-exists" and
"--no-if-missing" flags are not, so add tests for them. But also add
tests for all "--no-*" flags to check their effects, both when (1) there
are relevant configuration variables set, and (2) they are not set.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By using "test_config" instead of "git config", we avoid leaking
configuration state across test cases. This in turn helps to make the
tests more self-contained, by explicitly capturing the configuration
setup. It then makes it easier to add tests anywhere in this 1500+ line
file, without worrying about what implicit state was set in some prior
test case defined earlier up in the script.
This commit was created mechanically as follows: we changed the first
occurrence of a particular "git config trailer.*" option, then ran the
tests repeatedly to see which ones broke, adding in the extra
"test_config" equivalents to make them pass again. In addition, in some
test cases we removed "git config --unset ..." lines because they were
no longer necessary (as the --unset was being used to clean up leaked
configuration state from earlier test cases).
The process described above was done repeatedly until there were no more
unbridled "git config" invocations. Some "git config" invocations still
do exist in the script, but they were already cleaned up properly with
test_when_finished "git config --remove-section ..."
so they were left alone.
Note that these cleanups result in generally longer test case setups
because the previously hidden state is now being exposed. Although we
could then clean up the test cases' "expected" values to be less
verbose (the verbosity arising from the use of implicit state), we
choose not to do so here, to make sure that this cleanup does not change
any meanings behind the test cases.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git format-patch --rfc --subject-prefix=<foo>" used to ignore the
"--subject-prefix" option and used "[RFC PATCH]"; now we will add
"RFC" prefix to whatever subject prefix is specified.
This is a backward compatible change that may deserve a note.
* dd/format-patch-rfc-updates:
format-patch: --rfc honors what --subject-prefix sets
Unused parameters to functions are marked as such, and/or removed,
in order to bring us closer to -Wunused-parameter clean.
* jk/unused-post-2.42: (22 commits)
update-ref: mark unused parameter in parser callbacks
gc: mark unused descriptors in scheduler callbacks
bundle-uri: mark unused parameters in callbacks
fetch: mark unused parameter in ref_transaction callback
credential: mark unused parameter in urlmatch callback
grep: mark unused parmaeters in pcre fallbacks
imap-send: mark unused parameters with NO_OPENSSL
worktree: mark unused parameters in noop repair callback
negotiator/noop: mark unused callback parameters
add-interactive: mark unused callback parameters
grep: mark unused parameter in output function
test-trace2: mark unused argv/argc parameters
trace2: mark unused config callback parameter
trace2: mark unused us_elapsed_absolute parameters
stash: mark unused parameter in diff callback
ls-tree: mark unused parameter in callback
commit-graph: mark unused data parameters in generation callbacks
worktree: mark unused parameters in each_ref_fn callback
pack-bitmap: mark unused parameters in show_object callback
ref-filter: mark unused parameters in parser callbacks
...
Use of --max-pack-size to allow multiple packfiles to be created is
now supported even when we are sending unreachable objects to cruft
packs.
* tb/multi-cruft-pack:
Documentation/gitformat-pack.txt: drop mixed version section
Documentation/gitformat-pack.txt: remove multi-cruft packs alternative
builtin/pack-objects.c: support `--max-pack-size` with `--cruft`
builtin/pack-objects.c: remove unnecessary strbuf_reset()
Since 3e230fa1b2 (grep: use parseopt, 2009-05-07) git grep has been
accepting the option --no-or. It does the same as --or: nothing.
That's confusing and unintended. Forbid negating --or.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 2daae3d1d1 (commit: add --trailer option, 2021-03-23), 'git
commit' can add trailers to commit messages. To make that feature more
pleasant to use at the command line, update the Bash completion code to
offer configured trailer tokens.
Add a __git_trailer_tokens function to list the configured trailers
tokens, and use it in _git_commit to suggest the configured tokens,
suffixing the completion words with ':' so that the user only has to add
the trailer value.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rebasing commands are moved from the todo list in "git-rebase-todo"
to the "done" file (which is used by "git status" to show the recently
executed commands) just before they are executed. This means that if a
command fails because it would overwrite an untracked file it has to be
added back into the todo list before the rebase stops for the user to
fix the problem.
Unfortunately when a failed command is added back into the todo list the
command preceding it is erroneously appended to the "done" file. This
means that when rebase stops after "pick B" fails the "done" file
contains
pick A
pick B
pick A
instead of
pick A
pick B
This happens because save_todo() updates the "done" file with the
previous command whenever "git-rebase-todo" is updated. When we add the
failed pick back into "git-rebase-todo" we do not want to update
"done". Fix this by adding a "reschedule" parameter to save_todo() which
prevents the "done" file from being updated when adding a failed command
back into the "git-rebase-todo" file. A couple of the existing tests are
modified to improve their coverage as none of them trigger this bug or
check the "done" file.
Reported-by: Stefan Haller <lists@haller-berlin.de>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a commit cannot be picked because it would overwrite an untracked
file then "git rebase --continue" should refuse to commit any staged
changes as the commit was not picked. This is implemented by refusing to
commit if the message file is missing. The message file is chosen for
this check because it is only written when "git rebase" stops for the
user to resolve merge conflicts.
Existing commands that refuse to commit staged changes when continuing
such as a failed "exec" rely on checking for the absence of the author
script in run_git_commit(). This prevents the staged changes from being
committed but prints
error: could not open '.git/rebase-merge/author-script' for
reading
before the message about not being able to commit. This is confusing to
users and so checking for the message file instead improves the user
experience. The existing test for refusing to commit after a failed exec
is updated to check that we do not print the error message about a
missing author script anymore.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git rebase keeps a list that maps the OID of each commit before it was
rebased to the OID of the equivalent commit after the rebase. This list
is used to drive the "post-rewrite" hook that is called at the end of a
successful rebase. When a rebase stops for the user to resolve merge
conflicts the OID of the commit being picked is written to
".git/rebase-merge/stopped-sha". Then when the rebase is continued that
OID is added to the list of rewritten commits. Unfortunately if a commit
cannot be picked because it would overwrite an untracked file we still
write the "stopped-sha1" file. This means that when the rebase is
continued the commit is added into the list of rewritten commits even
though it has not been picked yet.
Fix this by not calling error_with_patch() for failed commands. The pick
has failed so there is nothing to commit and therefore we do not want to
set up the state files for committing staged changes when the rebase
continues. This change means we no-longer write a patch for the failed
command or display the error message printed by error_with_patch(). As
the command has failed the patch isn't really useful and in any case the
user can inspect the commit associated with the failed command by
inspecting REBASE_HEAD. Unless the user has disabled it we already print
an advice message that is more helpful than the message from
error_with_patch() which the user will still see. Even if the advice is
disabled the user will see the messages from the merge machinery
detailing the problem.
The code to add a failed command back into the todo list is duplicated
between pick_one_commit() and the loop in pick_commits(). Both sites
print advice about the command being rescheduled, decrement the current
item and save the todo list. To avoid duplicating this code
pick_one_commit() is modified to set a flag to indicate that the command
should be rescheduled in the main loop. This simplifies things as only
the remaining copy of the code needs to be modified to set REBASE_HEAD
rather than calling error_with_patch().
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This simplifies the next commit. If a pick fails we now return the error
at the end of the loop body rather than returning early, a successful
"edit" command continues to return early. There are three things to
check to ensure that removing the early return for an error does not
change the behavior of the code:
(1) We could enter the block guarded by "if (reschedule)". This block
is not entered because "reschedlue" is always zero when picking a
commit.
(2) We could enter the block guarded by
"else if (is_rebase_i(opts) && check_todo && !res)". This block is
not entered when returning an error because "res" is non-zero in
that case.
(3) todo_list->current could be incremented before returning. That is
avoided by moving the increment which is of course a potential
change in behavior itself. The move is safe because none of the
callers look at todo_list after this function returns. Moving the
increment makes it clear we only want to advance the current item
if the command was successful.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than constructing the path in a struct strbuf use the ready
made function to get the path name instead. This was the last
remaining use of the strbuf so remove it as well.
As with the previous patch we now use a hard coded string rather than
git_dir() when constructing the path. This is safe for the same
reason (make_patch() is only called when rebasing) and is protected by
the assertion added in the previous patch.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a rebase stops for the user to resolve conflicts it writes a patch
for the conflicting commit to .git/rebase-merge/patch. This file has
been written since the introduction of "git-rebase-interactive.sh" in
1b1dce4bae (Teach rebase an interactive mode, 2007-06-25). I assume the
idea was to enable the user inspect the conflicting commit in the same
way as they could for the patch based rebase. This file should be
deleted when the rebase continues as if the rebase stops for a failed
"exec" command or a "break" command it is confusing to the user if there
is a stale patch lying around from an unrelated command. As the path is
now used in two different places rebase_path_patch() is added and used
to obtain the path for the patch.
To construct the path write_patch() previously used get_dir() which
returns different paths depending on whether we're rebasing or
cherry-picking/reverting. As this function is only called when
rebasing it is safe to use a hard coded string for the directory
instead. An assertion is added to make sure we don't starting calling
this function when cherry-picking in the future.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the start of each iteration the loop that picks commits removes the
state files from the previous pick. However some of these files are only
written if there are conflicts in which case we exit the loop before the
end of the loop body. Therefore they only need to be removed when the
rebase continues, not at the start of each iteration.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When formatted as man-page, the section title is rendered
"GENERATING PATCH TEXT WITH -P" whereas reference still reads
"Generating patch text with -p", that is inconsistent and makes
searching harder than it needs to be.
Fix this by getting rid of custom reference text.
Also, documentation for every command that describes `-p` option by
including the "diff-options.txt" file does include the
"diff-generate-patch.txt" file as well (as it should), so the internal
link is in fact useful for any of them.
Fix this by getting rid of conditionals around the reference.
Fixes: ebdc46c242 (docs: link generating patch sections)
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code introduced in 576a37fccb (var: add attributes files locations,
2023-06-27) paid careful attention to use `xstrdup()` for pointers known
never to be `NULL`, and `xstrdup_or_null()` otherwise.
One spot was missed, though: `git_attr_global_file()` can return `NULL`,
when the `HOME` variable is not set (and neither `XDG_CONFIG_HOME`), a
scenario not too uncommon in certain server scenarios.
Fix this, and add a test case to avoid future regressions.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The message talked about renaming, while the actual action is copying.
This was introduced by 6e98de72c ("sequencer (rebase -i): add support
for the 'fixup' and 'squash' commands", 2017-01-02).
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Acked-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
a91f453f64 (grep: Add --max-depth option., 2009-07-22) added the option
--max-depth, defining it using a positional struct option initializer of
type OPTION_INTEGER. It also sets defval to 1 for some reason, but that
value would only be used if the flag PARSE_OPT_OPTARG was given.
Use the macro OPT_INTEGER_F instead to standardize the definition and
specify only the necessary values. This also normalizes argh to N_("n")
as a side-effect, which is OK.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
adfc1857bd (describe: fix --contains when a tag is given as input,
2013-07-18) added the option --peel-tag, defining it using a positional
struct option initializer and a comment indicating that it's intended to
be a hidden OPT_BOOL. 4741edd549 (Remove deprecated OPTION_BOOLEAN for
parsing arguments, 2013-08-03) added the macro OPT_HIDDEN_BOOL, which
allows to express this more succinctly. Use it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Atoms like "raw" and "contents" have a ":size" option which can be used
to know the size of the data. Since these atoms have the cmp_type
FIELD_STR, they are sorted alphabetically from 'a' to 'z' and '0' to
'9'. Meaning, even when the ":size" option is used and what we
ultimatlely have is numbers, we still sort alphabetically.
For example, consider the the following case in a repo
refname contents:size raw:size
======= ============= ========
refs/heads/branch1 1130 1210
refs/heads/master 300 410
refs/tags/v1.0 140 260
Sorting with "--format="%(refname) %(contents:size) --sort=contents:size"
would give
refs/heads/branch1 1130
refs/tags/v1.0.0 140
refs/heads/master 300
which is an alphabetic sort, while what one might really expect is
refs/tags/v1.0.0 140
refs/heads/master 300
refs/heads/branch1 1130
which is a numeric sort (that is, a "$ sort -n file" as opposed to a
"$ sort file", where "file" contains only the "contents:size" or
"raw:size" info, each of which is on a newline).
Same is the case with "--sort=raw:size".
So, sort numerically whenever the sort is done with "contents:size" or
"raw:size" and do it the normal alphabetic way when "contents" or "raw"
are used with some other option (they are FIELD_STR anyways).
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unsurprisingly, the noop options callback doesn't bother to look at any
of its parameters. Let's mark them so that -Wunused-parameter does not
complain.
Another option would be to drop the callback and have parse-options
itself recognize OPT_NOOP_NOARG. But that seems like extra work for no
real benefit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few parse-option callbacks that do not look at their "unset"
parameters, but also do not set PARSE_OPT_NONEG. At first glance this
seems like a bug, as we'd ignore "--no-if-exists", etc.
But they do work fine, because when "unset" is true, then "arg" is NULL.
And all three functions pass "arg" on to helper functions which do the
right thing with the NULL.
Note that this shortcut would not be correct if any callback used
PARSE_OPT_NOARG (in which case "arg" would be NULL but "unset" would be
false). But none of these do.
So the code is fine as-is. But we'll want to mark the unused "unset"
parameters to quiet -Wunused-parameter. I've also added a comment to
make this rather subtle situation more explicit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These callbacks are similar to the ones touched by 517fe807d6 (assert
NOARG/NONEG behavior of parse-options callbacks, 2018-11-05), but were
either missed in that commit (the one in add.c) or were added later (the
one in log.c).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option_parse_strategy() callback does not look at opt->value;
instead it calls append_strategy(), which manipulates the global
use_strategies array directly. But the OPT_CALLBACK declaration assigns
"&use_strategies" to opt->value.
One could argue this is good, as it tells the reader what we generally
expect the callback to do. But it is also bad, because it can mislead
you into thinking that swapping out "&use_strategies" there might have
any effect. Let's switch it to pass NULL (which is what every other
"does not bother to look at opt->value" callback does). If you want to
know what the callback does, it's easy to read the function itself.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit argued that parse-options callbacks should try to
use opt->value rather than touching globals directly. In some cases,
however, that's awkward to do. Some callbacks touch multiple variables,
or may even just call into an abstracted function that does so.
In some of these cases we _could_ convert them by stuffing the multiple
variables into a single struct and passing the struct pointer through
opt->value. But that may make other parts of the code less readable,
as the struct relationship has to be mentioned everywhere.
Let's just accept that these cases are special and leave them as-is. But
we do need to mark their "opt" parameters to satisfy -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have several parse-options callbacks that ignore their "opt"
parameters entirely. This is a little unusual, as we'd normally put the
result of the parsing into opt->value. In the case of these callbacks,
though, they directly manipulate global variables instead (and in
most cases the caller sets opt->value to NULL in the OPT_CALLBACK
declaration).
The immediate symptom we'd like to deal with is that the unused "opt"
variables trigger -Wunused-parameter. But how to fix that is debatable.
One option is to annotate them with UNUSED. But another is to have the
caller pass in the appropriate variable via opt->value, and use it. That
has the benefit of making the callbacks reusable (in theory at least),
and makes it clear from the OPT_CALLBACK declaration which variables
will be affected (doubly so for the cases in builtin/fast-export.c,
where we do set opt->value, but it is completely ignored!).
The slight downside is that we lose type safety, since they're now
passing through void pointers.
I went with the "just use them" approach here. The loss of type safety
is unfortunate, but that is already an issue with most of the other
callbacks. If we want to try to address that, we should do so more
consistently (and this patch would prepare these callbacks for whatever
we choose to do there).
Note that in the cases in builtin/fast-export.c, we are passing
anonymous enums. We'll have to give them names so that we can declare
the appropriate pointer type within the callbacks.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using --stage=all requires writing to tempfiles, since we cannot put
multiple stages into a single file. So --stage=all implies --temp.
But we do so by setting to_tempfile in the options callback for --stage,
rather than after all options have been parsed. This leads to two bugs:
1. If you run "checkout-index --stage=all --stage=2", this should not
imply --temp, but it currently does. The callback cannot just unset
to_tempfile when it sees the "2" value, because it no longer knows
if its value was from the earlier --stage call, or if the user
specified --temp explicitly.
2. If you run "checkout-index --stage=all --no-temp", the --no-temp
will overwrite the earlier implied --temp. But this mode of
operation cannot work, and the command will fail with "<path>
already exists" when trying to write the higher stages.
We can fix both by lazily setting to_tempfile. We'll make it a tristate,
with -1 as "not yet given", and have --stage=all enable it only after
all options are parsed. Likewise, after all options are parsed we can
detect and reject the bogus "--no-temp" case.
Note that this does technically change the behavior for "--stage=all
--no-temp" for paths which have only one stage present (which
accidentally worked before, but is now forbidden). But this behavior was
never intended, and you'd have to go out of your way to try to trigger
it.
The new tests cover both cases, as well the general "--stage=all implies
--temp", as most of the other tests explicitly say "--temp". Ironically,
the test "checkout --temp within subdir" is the only one that _doesn't_
use "--temp", and so was implicitly covering this case. But it seems
reasonable to have a more explicit test alongside the other related
ones.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests with LSan from time to time seem to emit harmless message
that makes our tests unnecessarily flakey; we work it around by
filtering the uninteresting output.
* jk/test-lsan-denoise-output:
test-lib: ignore uninteresting LSan output
Flakey "git p4" tests, as well as "git svn" tests, are now skipped
in the (rather expensive) sanitizer CI job.
* js/ci-san-skip-p4-and-svn-tests:
ci(linux-asan-ubsan): let's save some time
Tests that are known to pass with LSan are now marked as such.
* tb/mark-more-tests-as-leak-free:
leak tests: mark t5583-push-branches.sh as leak-free
leak tests: mark t3321-notes-stripspace.sh as leak-free
leak tests: mark a handful of tests as leak-free
It may be tempting to leave the help text NULL for a command line
option that is either hidden or too obvious, but "git subcmd -h"
and "git subcmd --help-all" would have segfaulted if done so. Now
the help text is optional.
* rs/parse-options-help-text-is-optional:
parse-options: allow omitting option help text
Instead of generating a silly-looking `Revert "Revert "foo""`, make it
a more humane `Reapply "foo"`.
This is done for two reasons:
- To cover the actually common case of just a double revert.
- To encourage people to rewrite summaries of recursive reverts by
setting an example (a subsequent commit will also do this explicitly
in the documentation).
To achieve these goals, the mechanism does not need to be particularly
sophisticated. Therefore, more complicated alternatives which would
"compress more efficiently" have not been implemented.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git format-patch" learns a way to feed cover letter description,
that (1) can be used on detached HEAD where there is no branch
description available, and (2) also can override the branch
description if there is one.
* ob/format-patch-description-file:
format-patch: add --description-file option
"git diff --no-such-option" and other corner cases around the exit
status of the "diff" command has been corrected.
* jk/diff-result-code-cleanup:
diff: drop useless "status" parameter from diff_result_code()
diff: drop useless return values in git-diff helpers
diff: drop useless return from run_diff_{files,index} functions
diff: die when failing to read index in git-diff builtin
diff: show usage for unknown builtin_diff_files() options
diff-files: avoid negative exit value
diff: spell DIFF_INDEX_CACHED out when calling run_diff_index()
The OpenSSL 3+ EVP API for SHA-* cannot support our prior use cases
supported by other SHA-* implementations. It has the following
differences:
1. ->init_fn is required before all use
2. struct assignments don't work and requires ->clone_fn
3. can't support ->update_fn after ->final_*fn
While fixing cases 1 and 2 is merely the matter of calling ->init_fn and
->clone_fn as appropriate, fixing case 3 requires calling ->final_*fn on
a temporary context that's cloned from the primary context.
Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/ZPCL11k38PXTkFga@debian.me/
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Fixes: 3e440ea0ab ("sha256: avoid functions deprecated in OpenSSL 3+")
Fixes: bda9c12073 ("avoid SHA-1 functions deprecated in OpenSSL 3+")
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On my Linux system, all of our recursive tree walking algorithms can run
up to the 4096 default limit without segfaulting. But not all platforms
will have stack sizes as generous (nor might even Linux if we kick off a
recursive walk within a thread).
In particular, several of the tests added in the previous few commits
fail in our Windows CI environment. Through some guess-and-check
pushing, I found that 3072 is still too much, but 2048 is OK.
These are obviously vague heuristics, and there is nothing to promise
that another system might not have trouble at even lower values. But it
seems unlikely anybody will be too angry about a 2048-depth limit (this
is close to the default max-pathname limit on Linux even for a
pathological path like "a/a/a/..."). So let's just lower it.
Some alternatives are:
- configure separate defaults for Windows versus other platforms.
- just skip the tests on Windows. This leaves Windows users with the
annoying case that they can be crashed by running out of stack
space, but there shouldn't be any security implications (they can't
go deep enough to hit integer overflow problems).
Since the original default was arbitrary, it seems less confusing to
just lower it, keeping behavior consistent across platforms.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When diffing trees, we recurse to handle subtrees. That means we may run
out of stack space and segfault. Let's teach this code path about
core.maxTreeDepth in order to fail more gracefully.
As with the previous patch, we have no way to return an error (and other
tree-loading problems would just cause us to die()). So we'll likewise
call die() if we exceed the maximum depth.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tree traversal in list-objects.c, which is used by "rev-list
--objects", etc, uses recursion and may run out of stack space. Let's
teach it about the new core.maxTreeDepth config option.
We unfortunately can't return an error here, as this code doesn't
produce an error return at all. We'll die() instead, which matches the
behavior when we see an otherwise broken tree.
Note that this will also generally reject such deep trees from entering
the repository from a fetch or push, due to the use of rev-list in the
connectivity check. But it's not foolproof! We stop traversing when we
see an UNINTERESTING object, and the connectivity check marks existing
ref tips as UNINTERESTING. So imagine commit X has a tree
with maximum depth N. If you then create a new commit Y with a tree
entry "Y:subdir" that points to "X^{tree}", then the depth of Y will be
N+1. But a connectivity check running "git rev-list --objects Y --not X"
won't realize that; it will stop traversing at X^{tree}, since that was
already reachable.
So this will stop naive pushes of too-deep trees, but not carefully
crafted malicious ones. Doing it robustly and efficiently would require
caching the maximum depth of each tree (i.e., the longest path to any
leaf entry). That's much more complex and not strictly needed. If each
recursive algorithm limits itself already, then that's sufficient.
Blocking the objects from entering the repo would be a nice
belt-and-suspenders addition, but it's not worth the extra cost.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The read_tree() function reads trees recursively (via its read_tree_at()
helper). This can cause it to run out of stack space on very deep trees.
Let's teach it about the new core.maxTreeDepth option.
The easiest way to demonstrate this is via "ls-tree -r", which the test
covers. Note that I needed a tree depth of ~30k to trigger a segfault on
my Linux system, not the 4100 used by our "big" test in t6700. However,
that test still tells us what we want: that the default 4096 limit is
enough to prevent segfaults on all platforms. We could bump it, but that
increases the cost of the test setup for little gain.
As an interesting side-note: when I originally wrote this patch about 4
years ago, I needed a depth of ~50k to segfault. But porting it forward,
the number is much lower. Seemingly little things like cf0983213c (hash:
add an algo member to struct object_id, 2021-04-26) take it from 32,722
to 29,080.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tree-walk.c code walks trees recursively, and may run out of stack
space. The easiest way to see this is with git-archive; on my 64-bit
Linux system it runs out of stack trying to generate a tarfile with a
tree depth of 13,772.
I've picked 4100 as the depth for our "big" test. I ran it with a much
higher value to confirm that we do get a segfault without this patch.
But really anything over 4096 is sufficient for its stated purpose,
which is to find out if our default limit of 4096 is low enough to
prevent segfaults on all platforms. Keeping it small saves us time on
the test setup.
The tree-walk code that's touched here underlies unpack_trees(), so this
protects any programs which use it, not just git-archive (but archive is
easy to test, and was what alerted me to this issue in a real-world
case).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of our tree traversal algorithms use recursion to visit sub-trees.
For pathologically large trees, this can cause us to run out of stack
space and abort in an uncontrolled way. Let's put our own limit here so
that we can fail gracefully rather than segfaulting.
In similar cases where we recursed along the commit graph, we rewrote
the algorithms to avoid recursion and keep any stack data on the heap.
But the commit graph is meant to grow without bound, whereas it's not an
imposition to put a limit on the maximum size of tree we'll handle.
And this has a bonus side effect: coupled with a limit on individual
tree entry names, this limits the total size of a path we may encounter.
This gives us an extra protection against code handling long path names
which may suffer from integer overflows in the size (which could then be
exploited by malicious trees).
The default of 4096 is set to be much longer than anybody would care
about in the real world. Even with single-letter interior tree names
(like "a/b/c"), such a path is at least 8191 bytes. While most operating
systems will let you create such a path incrementally, trying to
reference the whole thing in a system call (as Git would do when
actually trying to access it) will result in ENAMETOOLONG. Coupled with
the recent fsck.largePathname warning, the maximum total pathname Git
will handle is (by default) 16MB.
This config option doesn't do anything yet; future patches will convert
various algorithms to respect the limit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In general, Git tries not to arbitrarily limit what it will store, and
there are currently no limits at all on the size of the path we find in
a tree. In theory you could have one that is gigabytes long.
But in practice this freedom is not really helping anybody, and is
potentially harmful:
1. Most operating systems have much lower limits for the size of a
single pathname component (e.g., on Linux you'll generally get
ENAMETOOLONG for anything over 255 bytes). And while you _can_ use
Git in a way that never touches the filesystem (manipulating the
index and trees directly), it's still probably not a good idea to
have gigantic tree names. Many operations load and traverse them,
so any clever Git-as-a-database scheme is likely to perform poorly
in that case.
2. We still have a lot of code which assumes strings are reasonably
sized, and I won't be at all surprised if you can trigger some
interesting integer overflows with gigantic pathnames. Stopping
malicious trees from entering the repository provides an extra line
of defense, protecting downstream code.
This patch implements an fsck check so that such trees can be rejected
by transfer.fsckObjects. I've picked a reasonably high maximum depth
here (4096) that hopefully should not bother anybody in practice. I've
also made it configurable, as an escape hatch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "error" variable in traverse_trees() shadows the global error()
function (meaning we can't call error() from here). Let's call the local
variable "ret" instead, which matches the idiom in other functions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the previous commit dropped the hard-coded limit in
traverse_trees(), we don't need this macro there anymore (the code can
handle any number of trees in parallel).
We do define MAX_UNPACK_TREES using MAX_TRAVERSE_TREES, due to
5290d45134 (tree-walk.c: break circular dependency with unpack-trees,
2020-02-01). So we can just directly define that as "8" now; we know
traverse_trees() can handle whatever we throw at it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The traverse_trees() and traverse_trees_recursive() functions call each
other recursively. In a deep tree, this can result in running out of
stack space and crashing.
There's obviously going to be some limit here based on available stack,
but the problem is exacerbated by a few large structs, many of which we
over-allocate. For example, in traverse_trees() we store a name_entry
and tree_desc_x per tree, both of which contain an object_id (which is
now 32 bytes). And we allocate 8 of them (from MAX_TRAVERSE_TREES), even
though many traversals will only look at 1 or 2.
Interestingly, we used to allocate these on the heap, prior to
8dd40c0472 (traverse_trees(): use stack array for name entries,
2020-01-30). That commit was trying to simplify away allocation size
computations, and naively assumed that the sizes were small enough not
to matter. And they don't in normal cases, but on my stock Debian system
I see a crash running "git archive" on a tree with ~3600 entries.
That's deep enough we wouldn't see it in practice, but probably shallow
enough that we'd prefer not to make it a hard limit. Especially because
other systems may have even smaller stacks.
We can replace these stack variables with a few malloc invocations. This
reduces the stack sizes for the two functions from 1128 and 752 bytes,
respectively, down to 40 and 92 bytes. That allows a depth of ~13000 on
my machine (the improvement isn't in linear proportion because my
numbers don't count the size of parameters and other function overhead).
The possible downsides are:
1. We now have to remember to free(). But both functions have an easy
single exit (and already had to clean up other bits anyway).
2. The extra malloc()/free() overhead might be measurable. I tested
this by setting up a 3000-depth tree with a single blob and running
"git archive" on it. After switching to the heap, it consistently
runs 2-3% faster! Presumably this is because the 1K+ of wasted
stack space penalized memory caches.
On a more real-world case like linux.git, the speed difference isn't
measurable at all, simply because most trees aren't that deep and
there's so much other work going on (like accessing the objects
themselves). So the improvement I saw should be taken as evidence that
we're not making anything worse, but isn't really that interesting on
its own. The main motivation here is that we're now less likely to run
out of stack space and crash.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The to_callback() and cc_callback() functions are identical to the
generic parse_opt_string_list() function (except that they don't handle
optional arguments, but that's OK because their callers do not use the
OPTARG flag).
Let's simplify the code by using OPT_STRING_LIST.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "-n" option is implemented by an option callback, as it is really a
"reverse bool". If given, it sets show_diffstat to 0. In theory, when
negated, it would set the same flag to 1. But it's not possible to
trigger that, since short options cannot be negated.
So in practice this is really just a SET_INT to 0. Let's use that
instead, which shortens the code.
Note that negation here would do the wrong thing (as with any SET_INT
with a value of "0"). We could specify PARSE_OPT_NONEG to future-proof
ourselves against somebody adding a long option name (which would make
it possible to negate). But there's not much point:
1. Nobody is going to do that, because the negated form already
exists, and is called "--stat" (which is defined separately so that
"--no-stat" works).
2. If they did, the BUG() check added by 3284b93862 (parse-options:
disallow negating OPTION_SET_INT 0, 2023-08-08) will catch it (and
that check is smart enough to realize that our short-only option is
OK).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "xopts" variable uses a custom array with ALLOC_GROW(). Using a
strvec simplifies things a bit. We need fewer variables, and we can also
ditch our custom parseopt callback in favor of OPT_STRVEC().
As a bonus, this means that "--no-strategy-option", which was previously
a silent noop, now does something useful: like other list-like options,
it will clear the list of -X options seen so far. This matches the
behavior of revert/cherry-pick, which made the same change in fb60b9f37f
(sequencer: use struct strvec to store merge strategy options,
2023-04-10).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than replacing the configured subject prefix (either through the
git config or command line) entirely with "RFC PATCH", this change
prepends RFC to whatever subject prefix was already in use.
This is useful, for example, when a user is working on a repository that
has a subject prefix considered to disambiguate patches:
git config format.subjectPrefix 'PATCH my-project'
Prior to this change, formatting patches with --rfc would lose the
'my-project' information.
The data flow for the subject-prefix was that rev.subject_prefix
were to be kept the authoritative version of the subject prefix even
while parsing command line options, and sprefix variable was used as
a temporary area to futz with it. Now, the parsing code has been
refactored to build the subject prefix into the sprefix variable and
assigns its value at the end to rev.subject_prefix, which makes the
flow easier to grasp.
Signed-off-by: Drew DeVault <sir@cmpwn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of API for consistency between two calls to
require_clean_work_tree() from the sequencer code has been cleaned
up.
* ob/sequencer-empty-hint-fix:
sequencer: rectify empty hint in call of require_clean_work_tree()
Drop the FakeTerm hack, just like dfd46bae (send-email: drop
FakeTerm hack, 2023-08-08) did, for exactly the same reason.
It has been obsolete in git-svn since 30d45f798d (git-svn: delay term
initialization, 2014-09-14). Note that unlike send-email, we already
make sure to load Term::ReadLine only once. So this is just a cleanup,
and not fixing any bug.
Signed-off-by: Wesley Schwengle <wesleys@opperschaap.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have the CI_BRANCHES mechanism, there is no need for anybody
to use the ci/config/allow-ref mechanism. In the long run, we can
hopefully remove it and the whole "config" job, as it consumes CPU and
adds to the end-to-end latency of the whole workflow. But we don't want
to do that immediately, as people need time to migrate until the
CI_BRANCHES change has made it into the workflow file of every branch.
So let's issue a warning, which will appear in the "annotations" section
below the workflow result in GitHub's web interface. And let's remove
the sample allow-refs script, as we don't want to encourage anybody to
use it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we added config to skip CI for certain branches in e76eec3554 (ci:
allow per-branch config for GitHub Actions, 2020-05-07), there wasn't
any way to avoid spinning up a VM just to check the config. From the
developer's perspective this isn't too bad, as the "skipped" branches
complete successfully after running the config job (the workflow result
is "success" instead of "skipped", but that is a minor lie).
But we are still wasting time and GitHub's CPU to spin up a VM just to
check the result of a short shell script. At the time there wasn't any
way to avoid this. But they've since introduced repo-level variables
that should let us do the same thing:
https://github.blog/2023-01-10-introducing-required-workflows-and-configuration-variables-to-github-actions/#configuration-variables
This is more efficient, and as a bonus is probably less confusing to
configure (the existing system requires sticking your config on a magic
ref).
See the included docs for how to configure it.
The code itself is pretty simple: it checks the variable and skips the
config job if appropriate (and everything else depends on the config job
already). There are two slight inaccuracies here:
- we don't insist on branches, so this likewise applies to tag names
or other refs. I think in practice this is OK, and keeping the code
(and docs) short is more important than trying to be more exact. We
are targeting developers of git.git and their limited workflows.
- the match is done as a substring (so if you want to run CI for
"foobar", then branch "foo" will accidentally match). Again, this
should be OK in practice, as anybody who uses this is likely to only
specify a handful of well-known names. If we want to be more exact,
we can have the code check for adjoining spaces. Or even move to a
more general CI_CONFIG variable formatted as JSON. I went with this
scheme for the sake of simplicity.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
transfer.unpackLimit ought to be used as a fallback, but overrode
fetch.unpackLimit and receive.unpackLimit instead.
* ts/unpacklimit-config-fix:
transfer.unpackLimit: fetch/receive.unpackLimit takes precedence
"git diff -w --exit-code" with various options did not work
correctly, which is being addressed.
* jc/diff-exit-code-with-w-fixes:
diff: the -w option breaks --exit-code for --raw and other output modes
t4040: remove test that succeeded for a wrong reason
diff: teach "--stat -w --exit-code" to notice differences
diff: mode-only change should be noticed by "--patch -w --exit-code"
diff: move the fallback "--exit-code" code down
The commit-graph verification code that detects mixture of zero and
non-zero generation numbers has been updated.
* tb/commit-graph-verify-fix:
commit-graph: avoid repeated mixed generation number warnings
t/t5318-commit-graph.sh: test generation zero transitions during fsck
commit-graph: verify swapped zero/non-zero generation cases
commit-graph: introduce `commit_graph_generation_from_graph()`
The parsing of stdin is driven by a table of function pointers; mark
unused parameters in concrete functions to avoid -Wunused-parameter
warnings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of the scheduler update callbacks gets the descriptor of the lock
file, but only the crontab updater needs it. We have to retain the
unused descriptors because these are dispatched from a table of function
pointers, but we should mark them to silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first hunk is similar to 02c3c59e62 (hashmap: mark unused callback
parameters, 2022-08-19), but was added after that commit.
The other two are used with for_all_bundles_in_list(), but don't use
their void data pointer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this callback is just trying to collect the set of queued tag
updates, there is no need for it to look at old_oid at all. Mark it as
unused to appease -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our select_all() callback does not need to actually look at its
parameters, since the point is to match everything. But we need to mark
its parameters to satisfy -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When USE_LIBPCRE2 is not defined, we compile several noop fallbacks.
These need to have their parameters annotated to avoid
-Wunused-parameter warnings (and obviously we cannot remove the
parameters, since the functions must match the non-fallback versions).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier patches annotating unused parameters in imap-send missed a few
cases in code that is compiled only with NO_OPENSSL. These need to
retain the extra parameters to match the interfaces used when we compile
with openssl support.
Note in the case of socket_perror() that the function declaration and
parts of its code are shared between the two cases, and only the openssl
code looks at "sock". So we can't simply mark the parameter as always
unused. Instead, we can add a noop statement that references it. This is
ugly, but should be portable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The noop repair callback unsurprisingly does not look at any of its
parameters. Mark them as unused to silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The noop negotiator unsurprisingly does not bother looking at any of its
parameters. Mark them unused to silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interactive commands are dispatched from a table of abstract
pointers, but not every command uses every parameter it receives. Mark
the unused ones to silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a callback used with grep_options.output, but we don't look at
the grep_opt parameter, as we're just writing the output to stdout.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The trace2 test helper uses function pointers to dispatch to individual
tests. Not all tests bother looking at their argv/argc parameters. We
could tighten this up (e.g., complaining when seeing unexpected
parameters), but for internal test code it's not worth worrying about.
This is similar in spirit to 126e3b3d2a (t/helper: mark unused argv/argc
arguments, 2023-03-28).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should have been part of 783a86c142 (config: mark unused callback
parameters, 2022-08-19), but was missed in that commit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many trace2 targets ignore the absolute elapsed time parameters.
However, the virtual interface needs to retain the parameter since it is
used by others (e.g., the perf target).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is similar to the cases in 61bdc7c5d8 (diff: mark unused parameters
in callbacks, 2022-12-13), but I missed it when making that commit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The formatting functions are dispatched from a table of function
pointers. The "path name only" function unsurprisingly does not need to
look at its "oid" parameter, but we must mark it as unused to make
-Wunused-parameter happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The compute_generation_info code uses function pointers to abstract the
get/set generation operations. Some callers don't need the extra void
data pointer, which should be annotated to appease -Wunused-parameter.
Note that we can drop the assignment of the "data" parameter in
compute_generation_numbers(), as we've just shown that neither of the
callbacks it uses will access it. This matches the caller in
ensure_generations_valid(), which already does not bother to set "data".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is similar to the cases in 63e14ee2d6 (refs: mark unused
each_ref_fn parameters, 2022-08-19), but it was added after that commit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is similar to the cases in c50dca2a18 (list-objects: mark unused
callback parameters, 2023-02-24), but was added after that commit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are similar to the cases annotated in 5fe9e1ce2f (ref-filter: mark
unused callback parameters, 2023-02-24), but were added after that
commit.
Note that the ahead/behind callback ignores its "atom" parameter, which
is a little unusual, since that struct usually stores the result. But in
this case, the data is stored centrally in ref_array->counts, since we
want to compute all ahead/behinds at once, not per ref.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In sequencer_get_last_command(), we don't ever look at the repository
parameter. This is due to ed5b1ca10b (status: do not report errors in
sequencer/todo, 2019-06-27), which dropped the call to parse_insn_line().
However, it _should_ be used when calling into git_path_* functions,
but the one we use here is declared with the non-REPO variant of
GIT_PATH_FUNC(), and so just uses the_repository internally.
We could change the path helper to use REPO_GIT_PATH_FUNC(), but doing
so piecemeal is not great. There are 41 uses of GIT_PATH_FUNC() in
sequencer.c, and inconsistently switching one makes the code more
confusing. Likewise, this one function is used in half a dozen other
spots, all of which would need to start passing in a repository argument
(with rippling effects up the call stack).
So let's punt on that for now and just silence any -Wunused-parameter
warning.
Note that we could also drop this parameter entirely, as the function is
always called directly, and not as a callback that has to conform to
some external interface. But since we'd eventually want to use the
repository parameter, let's leave it in place to avoid disrupting the
callers twice.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of just using the_repository, we can take a repository parameter
from the caller. Most of them already have one, and doing so clears up a
few -Wunused-parameter warnings. There are still a few callers which use
the_repository, but this pushes us one small step forward to eventually
getting rid of those.
Note that a few of these functions have a "rev_info" whose "repo"
parameter could probably be used instead of the_repository. I'm leaving
that for further cleanups, as it's not immediately obvious that
revs->repo is always valid, and there's quite a bit of other possible
refactoring here (even getting rid of some "struct repository" arguments
in favor of revs->repo).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Every once in a while, the `git-p4` tests flake for reasons outside of
our control. It typically fails with "Connection refused" e.g. here:
https://github.com/git/git/actions/runs/5969707156/job/16196057724
[...]
+ git p4 clone --dest=/home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git //depot
Initialized empty Git repository in /home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git/.git/
Perforce client error:
Connect to server failed; check $P4PORT.
TCP connect to localhost:9807 failed.
connect: 127.0.0.1:9807: Connection refused
failure accessing depot: could not run p4
Importing from //depot into /home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git
[...]
This happens in other jobs, too, but in the `linux-asan-ubsan` job it
hurts the most because that job often takes over a full hour to run,
therefore re-running a failed `linux-asan-ubsan` job is _very_ costly.
The purpose of the `linux-asan-ubsan` job is to exercise the C code of
Git, anyway, and any part of Git's source code that the `git-p4` tests
run and that would benefit from the attention of ASAN/UBSAN are run
better in other tests anyway, as debugging C code run via Python scripts
can get a bit hairy.
In fact, it is not even just `git-p4` that is the problem (even if it
flakes often enough to be problematic in the CI builds), but really the
part about Python scripts. So let's just skip any Python parts of the
tests from being run in that job.
For good measure, also skip the Subversion tests because debugging C
code run via Perl scripts is as much fun as debugging C code run via
Python scripts. And it will reduce the time this very expensive job
takes, which is a big benefit.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git GUI updates.
* py/git-gui-updates:
git-gui - use mkshortcut on Cygwin
git-gui - use cygstart to browse on Cygwin
git-gui - remove obsolete Cygwin specific code
git gui Makefile - remove Cygwin modifications
Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
Work around Tcl's default `PATH` lookup
Move the `_which` function (almost) to the top
Move is_<platform> functions to the beginning
is_Cygwin: avoid `exec`ing anything
windows: ignore empty `PATH` elements
git-gui: Fix a typo in README
Tweak GitHub Actions CI so that pushing the same commit to multiple
branch tips at the same time will not waste building and testing
the same thing twice.
* jc/ci-skip-same-commit:
ci: avoid building from the same commit in parallel
Teach "git check-attr" work better with sparse-index.
* sl/sparse-check-attr:
check-attr: integrate with sparse-index
attr.c: read attributes in a sparse directory
t1092: add tests for 'git check-attr'
This section was added in 3d89a8c118 (Documentation/technical: add
cruft-packs.txt, 2022-05-20) to highlight a potential pitfall when
deploying cruft packs in an environment where multiple versions of Git
are GC-ing the same repository.
Now that it has been more than a year since 3d89a8c118 was written,
let's drop this section as it is no longer relevant.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This text, originally from 3d89a8c118 (Documentation/technical: add
cruft-packs.txt, 2022-05-20) lists multiple cruft packs as a potential
alternative to the design of cruft packs.
We have always supported multiple cruft packs (i.e. we use the most
recent mtime for a given object among all cruft packs which contain it,
etc.), but haven't encouraged its use.
We still aren't encouraging users to go out and generate multiple cruft
packs, but let's take a step in that direction by dropping language that
suggests we aren't capable of working with multiple cruft packs.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pack-objects learned the `--cruft` option back in b757353676
(builtin/pack-objects.c: --cruft without expiration, 2022-05-20), we
explicitly forbade `--cruft` with `--max-pack-size`.
At the time, there was no specific rationale given in the patch for not
supporting the `--max-pack-size` option with `--cruft`. (As best I can
remember, it's because we were trying to push users towards only ever
having a single cruft pack, but I cannot be sure).
However, `--max-pack-size` is flexible enough that it already works with
`--cruft` and can shard unreachable objects across multiple cruft packs,
creating separate ".mtimes" files as appropriate. In fact, the
`--max-pack-size` option worked with `--cruft` as far back as
b757353676!
This is because we overwrite the `written_list`, and pass down the
appropriate length, i.e. the number of objects written in each pack
shard.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading input with the `--cruft` option, `git pack-objects` reads
each line into a strbuf, and then moves it to either the list of
discarded or fresh packs, depending on whether or not the input line
starts with a '-' character.
At the beginning of each loop iteration, the next line of input is read
with `strbuf_getline()`, which calls `strbuf_reset()` (as a part of
`strbuf_getwholeline()`) before reading the next line of input.
Thus, the call to `strbuf_reset()` (added back in b757353676
(builtin/pack-objects.c: --cruft without expiration, 2022-05-20)) at the
end of the loop is unnecessary, so let's remove it accordingly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When t5583-push-branches.sh was originally introduced via 425b4d7f47
(push: introduce '--branches' option, 2023-05-06), it was not leak-free.
In fact, the test did not even run correctly until 022fbb655d (t5583:
fix shebang line, 2023-05-12), but after applying that patch, we see a
failure at t5583.8:
==2529087==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 384 byte(s) in 1 object(s) allocated from:
#0 0x7fb536330986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
#1 0x55e07606cbf9 in xrealloc wrapper.c:140
#2 0x55e075fb6cb3 in prio_queue_put prio-queue.c:42
#3 0x55e075ec81cb in get_reachable_subset commit-reach.c:917
#4 0x55e075fe9cce in add_missing_tags remote.c:1518
#5 0x55e075fea1e4 in match_push_refs remote.c:1665
#6 0x55e076050a8e in transport_push transport.c:1378
#7 0x55e075e2eb74 in push_with_options builtin/push.c:401
#8 0x55e075e2edb0 in do_push builtin/push.c:458
#9 0x55e075e2ff7a in cmd_push builtin/push.c:702
#10 0x55e075d8aaf0 in run_builtin git.c:452
#11 0x55e075d8af08 in handle_builtin git.c:706
#12 0x55e075d8b12c in run_argv git.c:770
#13 0x55e075d8b6a0 in cmd_main git.c:905
#14 0x55e075e81f07 in main common-main.c:60
#15 0x7fb5360ab6c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#16 0x7fb5360ab784 in __libc_start_main_impl ../csu/libc-start.c:360
#17 0x55e075d88f40 in _start (git+0x1ff40) (BuildId: 38ad998b85a535e786129979443630d025ec2453)
SUMMARY: LeakSanitizer: 384 byte(s) leaked in 1 allocation(s).
This leak was addressed independently via 68b51172e3 (commit-reach: fix
memory leak in get_reachable_subset(), 2023-06-03), which makes t5583
leak-free.
But t5583 was not in the tree when 68b51172e3 was written, and the two
only met after the latter was merged back in via 693bde461c (Merge
branch 'mh/commit-reach-get-reachable-plug-leak', 2023-06-20).
At that point, t5583 was leak-free. Let's mark it as such accordingly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test was leak-free when t3321 was originally introduced, but never
marked as such:
$ rev="$(git log --format='%H' --reverse -1 HEAD^ -- t/t3321-notes-stripspace.sh)"
$ git checkout $rev
$ make SANITIZE=leak
[...]
$ make -C t GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_OPTS=--immediate t3321-notes-stripspace.sh
[...]
# passed all 27 test(s)
1..27
# faking up non-zero exit with --invert-exit-code
make: *** [Makefile:66: t3321-notes-stripspace.sh] Error 1
make: Leaving directory '/home/ttaylorr/src/git/t'
Mark this test as leak-free accordingly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the topic merged via 5a4f8381b6 (Merge branch
'ab/mark-leak-free-tests', 2021-10-25), a handful of tests in the suite
were marked as leak-free.
Since then, a handful of tests have become leak-free due to changes like
- 861c56f6f9 (branch: fix a leak in setup_tracking, 2023-06-11), and
- 866b43e644 (do_read_index(): always mark index as initialized unless
erroring out, 2023-06-29)
, but weren't updated at the time to mark themselves as such. This leads
to test "failures" when running:
$ make SANITIZE=leak
$ make -C t \
GIT_TEST_PASSING_SANITIZE_LEAK=check \
GIT_TEST_SANITIZE_LEAK_LOG=true \
GIT_TEST_OPTS=-vi test
This patch closes those gaps by exporting TEST_PASSES_SANITIZE_LEAK=true
before sourcing t/test-lib.sh on most remaining leak-free tests.
There are a couple of other tests which are similarly leak-free, but not
included in the list of tests touched by this patch. The remaining tests
will be addressed in the subsequent two patches.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When I run the tests in leak-checking mode the same way our CI job does,
like:
make SANITIZE=leak \
GIT_TEST_PASSING_SANITIZE_LEAK=true \
GIT_TEST_SANITIZE_LEAK_LOG=true \
test
then LSan can racily produce useless entries in the log files that look
like this:
==git==3034393==Unable to get registers from thread 3034307.
I think they're mostly harmless based on the source here:
7e0a52e8e9/compiler-rt/lib/lsan/lsan_common.cpp (L414)
which reads:
PtraceRegistersStatus have_registers =
suspended_threads.GetRegistersAndSP(i, ®isters, &sp);
if (have_registers != REGISTERS_AVAILABLE) {
Report("Unable to get registers from thread %llu.\n", os_id);
// If unable to get SP, consider the entire stack to be reachable unless
// GetRegistersAndSP failed with ESRCH.
if (have_registers == REGISTERS_UNAVAILABLE_FATAL)
continue;
sp = stack_begin;
}
The program itself still runs fine and LSan doesn't cause us to abort.
But test-lib.sh looks for any non-empty LSan logs and marks the test as
a failure anyway, under the assumption that we simply missed the failing
exit code somehow.
I don't think I've ever seen this happen in the CI job, but running
locally using clang-14 on an 8-core machine, I can't seem to make it
through a full run of the test suite without having at least one
failure. And it's a different one every time (though they do seem to
often be related to packing tests, which makes sense, since that is one
of our biggest users of threaded code).
We can hack around this by only counting LSan log files that contain a
line that doesn't match our known-uninteresting pattern.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two topics did not see much interest and reviews while they
were on 'next'; let's "inflict" them to the general public and see
if anybody screams, which is much less nicer way than to merge
only topics that are well reviewed down in an orderly manner, but
that is the only thing we can do to these topics without any
development community help.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update two credential helpers to correctly match which credential
to erase; they dropped not the ones with stale password.
* mh/credential-erase-improvements-more:
credential/wincred: erase matching creds only
credential/libsecret: erase matching creds only
The way authentication related data other than passwords (e.g.
oath token and password expiration data) are stored in libsecret
keyrings has been rethought.
* mh/credential-libsecret-attrs:
credential/libsecret: store new attributes
When running 'scalar reconfigure -a', Scalar has warning messages about
the repository missing (or not containing a .git directory). Failures
can also happen while trying to modify the repository-local config for
that repository.
These warnings may seem confusing to users who don't understand what
they mean or how to stop them.
Add a warning that instructs the user how to remove the warning in
future installations.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many reasons why discovering a Git directory may fail. In
particular, 8959555cee (setup_git_directory(): add an owner check for
the top-level directory, 2022-03-02) added ownership checks as a
security precaution.
Callers attempting to set up a Git directory may want to inform the user
about the reason for the failure. For that, expose the enum
discovery_result from within setup.c and move it into cache.h where
discover_git_directory() is defined.
I initially wanted to change the return type of discover_git_directory()
to be this enum, but several callers rely upon the "zero means success".
The two problems with this are:
1. The zero value of the enum is actually GIT_DIR_NONE, so nonpositive
results are errors.
2. There are multiple successful states; positive results are
successful.
It is worth noting that GIT_DIR_NONE is not returned, so we remove this
option from the enum. We must be careful to keep the successful reasons
as positive values, so they are given explicit positive values.
Instead of updating all callers immediately, add a new method,
discover_git_directory_reason(), and convert discover_git_directory() to
be a thin shim on top of it.
One thing that is important to note is that discover_git_directory()
previously returned -1 on error, so let's continue that into the future.
There is only one caller (in scalar.c) that depends on that signedness
instead of a non-zero check, so clean that up, too.
Because there are extra checks that discover_git_directory_reason() does
after setup_git_directory_gently_1(), there are other modes that can be
returned for failure states. Add these modes to the enum, but be sure to
explicitly add them as BUG() states in the switch of
setup_git_directory_gently().
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some users have strong aversions to Scalar's opinion that the repository
should be in a 'src' directory, even though this creates a clean slate
for placing build artifacts in adjacent directories.
The new --no-src option allows users to opt out of the default behavior.
While adding options, make sure the usage output by 'scalar clone -h'
reports the same as the SYNOPSIS line in Documentation/scalar.txt.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
1b68387e02 (builtin/receive-pack.c: use parse_options API, 2016-03-02)
added the options --stateless-rpc, --advertise-refs and
--reject-thin-pack-for-testing with a NULL `help` string; 03831ef7b5
(difftool: implement the functionality in the builtin, 2017-01-19)
similarly added the "helpless" option --prompt. Presumably this was
done because all four options are hidden and self-explanatory.
They cause a NULL pointer dereference when using the option --help-all
with their respective tool, though. Handle such options gracefully
instead by turning the NULL pointer into an empty string at the top of
the loop, always printing a newline at the end and passing through the
separating newlines from the help text.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git cmd -h" learned to signal which options can be negated by
listing such options like "--[no-]opt".
* rs/parse-options-negation-help:
parse-options: simplify usage_padding()
parse-options: no --[no-]no-...
parse-options: factor out usage_indent() and usage_padding()
parse-options: show negatability of options in short help
t1502: test option negation
t1502: move optionspec help output to a file
t1502, docs: disallow --no-help
subtree: disallow --no-{help,quiet,debug,branch,message}
At times, we may need to push the same commit to multiple branches
in the same push. Rewinding 'next' to rebuild on top of 'master'
soon after a release is such an occasion. Making sure 'main' stays
in sync with 'master' to help those who expect that primary branch
of the project is named either of these is another.
We already use the branch name as a "concurrency group" key, but
that does not address the situation illustrated above.
Let's introduce another `concurrency` attribute, using the commit
hash as the concurrency group key, on the workflow run level, to
address this. This will hold any workflow run in the queued state
when there is already a workflow run targeting the same commit,
until that latter run completed. The `skip-if-redundant` check of
the second run will then have a chance to see whether the first
run succeeded.
The only caveat with this strategy is that only one workflow run
will be kept in the queued state by the `concurrency` feature: if
another run targeting the same commit is triggered, the
previously-queued run will be canceled. Considering the benefit,
this seems the smaller price to pay than to overload Git's build
agent pool with undesired workflow runs.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* https://github.com/prati0100/git-gui:
git-gui - use mkshortcut on Cygwin
git-gui - use cygstart to browse on Cygwin
git-gui - remove obsolete Cygwin specific code
git gui Makefile - remove Cygwin modifications
Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
Work around Tcl's default `PATH` lookup
Move the `_which` function (almost) to the top
Move is_<platform> functions to the beginning
is_Cygwin: avoid `exec`ing anything
windows: ignore empty `PATH` elements
git-gui: Fix a typo in README
Hourly and other schedule of "git maintenance" jobs are randomly
distributed now.
* ds/maintenance-schedule-fuzz:
maintenance: update schedule before config
maintenance: fix systemd schedule overlaps
maintenance: use random minute in systemd scheduler
maintenance: swap method locations
maintenance: use random minute in cron scheduler
maintenance: use random minute in Windows scheduler
maintenance: use random minute in launchctl scheduler
maintenance: add get_random_minute()
Test updates.
* ob/test-lib-rebase-fake-editor-updates:
t/lib-rebase: improve documentation of set_fake_editor()
t/lib-rebase: set_fake_editor(): handle FAKE_LINES more consistently
t/lib-rebase: set_fake_editor(): fix recognition of reset's short command
Overly long label names used in the sequencer machinery are now
chopped to fit under filesystem limitation.
* mp/rebase-label-length-limit:
rebase: allow overriding the maximal length of the generated labels
sequencer: truncate labels to accommodate loose refs
A message written in olden time prevented a branch from getting
checked out saying it is already checked out elsewhere, but these
days, we treat a branch that is being bisected or rebased just like
a branch that is checked out and protect it. Rephrase the message
to say that the branch is in use.
* rj/branch-in-use-error-message:
branch: error message checking out a branch in use
branch: error message deleting a branch in use
The canonical way to represent "no error hint" is making it NULL, which
shortcuts the error() call altogether. This fixes the output by removing
the line which said just "error:", which would appear when the worktree
is dirtied while editing the initial rebase todo file. This was
introduced by 97e1873 (rebase -i: rewrite complete_action() in C,
2018-08-28), which did a somewhat inaccurate conversion from shell.
To avoid that such bugs re-appear, test for the condition in
require_clean_work_tree().
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove some code supporting ancient Cygwin Tcl/Tk versions. Also fix
exploring working directory and making desktop shortcuts on Cygwin.
* ml/cygwin-fixes:
git-gui - use mkshortcut on Cygwin
git-gui - use cygstart to browse on Cygwin
git-gui - remove obsolete Cygwin specific code
git gui Makefile - remove Cygwin modifications
git-gui enables the "Repository->Create Desktop Icon" item on Cygwin,
offering to create a shortcut that starts git-gui on the current
repository. The code in do_cygwin_shortcut invokes function
win32_create_lnk to create the shortcut. This latter function is shared
between Cygwin and Git For Windows and expects Windows rather than unix
pathnames, though do_cygwin_shortcut provides unix pathnames. Also, this
function tries to invoke the Windows Script Host to run a javascript
snippet, but this fails under Cygwin's Tcl. So, win32_create_lnk just
does not support Cygwin.
However, Cygwin's default installation provides /bin/mkshortcut for
creating desktop shortcuts. This is compatible with exec under Cygwin's
Tcl, understands Cygwin's unix pathnames, and avoids the need for shell
escapes to encode troublesome paths. So, teach git-gui to use mkshortcut
on Cygwin, leaving win32_create_lnk unchanged and for exclusive use by
Git For Windows.
Notes: "CHERE_INVOKING=1" is recognized by Cygwin's /etc/profile and
prevents a "chdir $HOME", leaving the shell in the working directory
specified by the shortcut. That directory is written directly by
mkshortcut eliminating any problems with shell escapes and quoting.
The code being replaced includes the full pathname of the git-gui
creating the shortcut, but that git-gui might not be compatible with the
git found after /etc/profile sets the path, and might have a pathname
that defies encoding using shell escapes that can survive the multiple
incompatible interpreters involved in the chain of creating and using
this shortcut. The new code uses bare "git gui" as the command to
execute, thus using the system git to launch the system git-gui, and
avoiding both issues.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
git-gui enables the "Repository->Explore Working Copy" menu on Cygwin,
offering to open a Windows graphical file browser at the root of the
working directory. This code, shared with Git For Windows support,
depends upon use of Windows pathnames. However, git gui on Cygwin uses
unix pathnames, so this shared code will not work on Cygwin.
A base install of Cygwin provides the /bin/cygstart utility that runs
a registered Windows application based upon the file type, after
translating unix pathnames to Windows. Adding the --explore option
guarantees that the Windows file explorer is opened, regardless of the
supplied pathname's file type and avoiding possibility of some other
action being taken.
So, teach git-gui to use cygstart --explore on Cygwin, restoring the
pre-2012 behavior of opening a Windows file explorer for browsing. This
separates the Git For Windows and Cygwin code paths. Note that
is_Windows is never true on Cygwin, and is_Cygwin is never true on Git
for Windows, though this is not obvious by examining the code for those
independent functions.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
In the current git release, git-gui runs on Cygwin without enabling any
of git-gui's Cygwin specific code. This happens as the Cygwin specific
code in git-gui was (mostly) written in 2007-2008 to work with Cygwin's
then supplied Tcl/Tk which was an incompletely ported variant of the
8.4.1 Windows Tcl/Tk code. In March, 2012, that 8.4.1 package was
replaced with a full port based upon the upstream unix/X11 code,
since maintained up to date. The two Tcl/Tk packages are completely
incompatible, and have different signatures.
When Cygwin's Tcl/Tk signature changed in 2012, git-gui no longer
detected Cygwin, so did not enable Cygwin specific code, and the POSIX
environment provided by Cygwin since 2012 supported git-gui as a generic
unix. Thus, no-one apparently noticed the existence of incompatible
Cygwin specific code.
However, since commit c5766eae6f in the git-gui source tree
(https://github.com/prati0100/git-gui, master at a5005ded), and not yet
pulled into the git repository, the is_Cygwin function does detect
Cygwin using the unix/X11 Tcl/Tk. The Cygwin specific code is enabled,
causing use of Windows rather than unix pathnames, and enabling
incorrect warnings about environment variables that were relevant only
to the old Tcl/Tk. The end result is that (upstream) git-gui is now
incompatible with Cygwin.
So, delete Cygwin specific code (code protected by "if is_Cygwin") that
is not needed in any form to work with the unix/X11 Tcl/Tk.
Cygwin specific code required to enable file browsing and shortcut
creation is not addressed in this patch, does not currently work, and
invocation of those items may leave git-gui in a confused state.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
git-gui's Makefile hardcodes the absolute Windows path of git-gui's libraries
into git-gui, destroying the ability to package git-gui on one machine and
distribute to others. The intent is to do this only if a non-Cygwin Tcl/Tk is
installed, but the test for this is wrong with the unix/X11 Tcl/Tk shipped
since 2012. Also, Cygwin does not support a non-Cygwin Tcl/Tk.
The Cygwin git maintainer disables this code, so this code is definitely
not in use in the Cygwin distribution.
The simplest fix is to just delete the Cygwin specific code,
allowing the Makefile to work out of the box on Cygwin. Do so.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Now that we ran a trustdb check forcibly, it no longer pollutes the
output, and filtering is no longer required.
Signed-off-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The transfer.unpackLimit configuration variable is documented to be
used only as a fallback value when the more operation-specific
fetch.unpackLimit and receive.unpackLimit variables are not set, but
the implementation had the precedence reversed. Apparently this was
broken since the transfer.unpackLimit was introduced in e28714c5
(Consolidate {receive,fetch}.unpackLimit, 2007-01-24).
Often when documentation and code have diverged for so long, we
prefer to change the documentation instead, to avoid disrupting
users. But doing so would make these weirdly unlike most other
"specific overrides general" config options. And the fact that the
bug has existed for so long without anyone noticing implies to me
that nobody really tries to mix and match them much.
Signed-off-by: Taylor Santiago <taylorsantiago@google.com>
[jc: rewrote the log message, added tests, covered receive-pack as well]
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to compare output later, so randomly popping up 'gpg: checking
the trustdb' breaks the tests. Run the trustdb update forcibly.
Signed-off-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The output from "--raw", "--name-status", and "--name-only" modes in
"git diff" does depend on and does not reflect how certain different
contents are considered equal, unlike "--patch" and "--stat" output
modes do, when used with options like "-w" (another way of thinking
about it is that it is not like we recompute the hash of the blob
after removing all whitespaces to show "git diff --raw -w" output).
But the fact that "--raw" and friends ignore "-w" is not a good
excuse for "diff --raw -w --exit-code" to also ignore the request to
report the differences with its exit status. When run without "-w",
"git diff --exit-code --raw" does report with its exit status the
differences as requested, and we should do the same when run with
"-w", too.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When validating that a commit-graph has either all zero, or all non-zero
generation numbers, we emit a warning on both the rising and falling
edge of transitioning between the two.
So if we are unfortunate enough to see a commit-graph which has a
repeating sequence of zero, then non-zero generation numbers, we'll
generate many warnings that contain more or less the same information.
Avoid this by keeping track of a single example for a commit with zero-
and non-zero generation, and emit a single warning at the end of
verification if both are non-NULL.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The second test called "detect incorrect generation number" asserts that
we correctly warn during an fsck when we see a non-zero generation
number after seeing a zero beforehand.
The other transition (going from non-zero to zero) was previously
untested. Test both directions, and rename the existing test to make
clear which direction it is exercising.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In verify_one_commit_graph(), we have code that complains when a commit
is found with a generation number of zero, and then later with a
non-zero number. It works like this:
1. When we see an entry with generation zero, we set the
generation_zero flag to GENERATION_ZERO_EXISTS.
2. When we later see an entry with a non-zero generation, we complain
if the flag is GENERATION_ZERO_EXISTS.
There's a matching GENERATION_NUMBER_EXISTS value, which in theory would
be used to find the case that we see the entries in the opposite order:
1. When we see an entry with a non-zero generation, we set the
generation_zero flag to GENERATION_NUMBER_EXISTS.
2. When we later see an entry with a zero generation, we complain if
the flag is GENERATION_NUMBER_EXISTS.
But that doesn't work; step 2 is implemented, but there is no step 1. We
never use NUMBER_EXISTS at all, and Coverity rightly complains that step
2 is dead code.
We can fix that by implementing that step 1.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2ee11f7261 (commit-graph: return generation from memory, 2023-03-20),
the `commit_graph_generation()` function stopped returning zeros when
asked to locate the generation number of a given commit.
This was done at the time to prepare for a later change which set
generation values in memory, meaning that we could no longer rely on
`graph_pos` alone to tell us whether or not to trust the generation
number returned by this function.
In 2ee11f7261, it was noted that this change only impacted very old
commit-graphs, which were written with all commits having generation
number 0. Indeed, zero is not a valid generation number, so we should
never expect to see that value outside of the aforementioned case.
The test fallout in 2ee11f7261 indicated that we were no longer able to
fsck a specific old case of commit-graph corruption, where we see a
non-zero generation number after having seen a generation number of 0
earlier.
Introduce a variant of `commit_graph_generation()` which behaves like
that function did prior to 2ee11f7261, known as
`commit_graph_generation_from_graph()`. Then use this function in the
context of `verify_one_commit_graph()`, where we only want to trust the
values from the graph.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many programs use diff_result_code() to get a user-visible program exit
code from a diff result (e.g., checking opts.found_changes if
--exit-code was requested).
This function also takes a "status" parameter, which seems at first
glance that it could be used to propagate an error encountered when
computing the diff. But it doesn't work that way:
- negative values are passed through as-is, but are not appropriate as
program exit codes
- when --exit-code or --check is in effect, we _ignore_ the passed-in
status completely. So a failed diff which did not have a chance to
set opts.found_changes would erroneously report "success, no
changes" instead of propagating the error.
After recent cleanups, neither of these bugs is possible to trigger, as
every caller just passes in "0". So rather than fixing them, we can
simply drop the useless parameter instead.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since git-diff has many diff modes, it dispatches to many helpers to
perform each one. But every helper simply returns "0", as it exits
directly if there are serious errors (and options like --exit-code are
handled afterwards). So let's get rid of these useless return values,
which makes the code flow more clear.
There's very little chance that we'd later want to propagate errors
instead of dying immediately. These are all static-local helpers for the
git-diff program implementing its various modes. More "lib-ified" code
would directly call the underlying functions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Neither of these functions ever returns a value other than zero.
Instead, they expect unrecoverable errors to exit immediately, and
things like "--exit-code" are stored inside the diff_options struct to
be handled later via diff_result_code().
Some callers do check the return values, but many don't bother. Let's
drop the useless return values, which are misleading callers about how
the functions work. This could be seen as a step in the wrong direction,
as we might want to eventually "lib-ify" these to more cleanly return
errors up the stack, in which case we'd have to add the return values
back in. But there are some benefits to doing this now:
1. In the current code, somebody could accidentally add a "return -1"
to one of the functions, which would be erroneously ignored by many
callers. By removing the return code, the compiler can notice the
mismatch and force the developer to decide what to do.
Obviously the other option here is that we could start consistently
checking the error code in every caller. But it would be dead code,
and we wouldn't get any compile-time help in catching new cases.
2. It communicates the situation to callers, who may want to choose a
different function. These functions are really thin wrappers for
doing git-diff-files and git-diff-index within the process. But
callers who care about recovering from an error here are probably
better off using the underlying library functions, many of
which do return errors.
If somebody eventually wants to teach these functions to propagate
errors, they'll have to switch back to returning a value, effectively
reverting this patch. But at least then they will be starting with a
level playing field: they know that they will need to inspect each
caller to see how it should handle the error.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the git-diff program fails to read the index in its diff-files or
diff-index helper functions, it propagates the error up the stack. This
eventually lands in diff_result_code(), which does not handle it well
(as discussed in the previous patch).
Since the only sensible thing here is to exit with an error code (and
what we were expecting the propagated error code to cause), let's just
do that directly.
There's no test here, as I'm not even sure this case can be triggered.
The index-reading functions tend to die() themselves when encountering
any errors, and the return value is just the number of entries in the
file (and so always 0 or positive). But let's err on the conservative
side and keep checking the return value. It may be worth digging into as
a separate topic (though index-reading is low-level enough that we
probably want to eventually teach it to propagate errors anyway for
lib-ification purposes, at which point this code would already be doing
the right thing).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-diff command has many modes (comparing worktree to index, index
to HEAD, individual blobs, etc). As a result, it dispatches to many
helper functions and cannot completely parse its options until we're in
those helper functions.
Most of them, when seeing an unknown option, exit immediately by calling
usage(). But builtin_diff_files(), which is the default if no revision
or blob arguments are given, instead prints an error() and returns -1.
One obvious shortcoming here is that the user doesn't get to see the
usual usage message. But there's a much more important bug: the -1
return is fed to diff_result_code(), which is not ready to handle it.
By default, it passes the code along as an exit code. We try to avoid
negative exit codes because they get converted to unsigned values, but
it should at least consistently show up as non-zero (i.e., a failure).
But much worse is that when --exit-code is in effect, diff_result_code()
will _ignore_ the status passed in by the caller, and instead only
report on whether the diff found changes. It didn't, of course, because
we never ran the diff, and the program unexpectedly exits with success!
We can fix this bug by just calling usage(), like the other helpers do.
Another option would of course be to teach diff_result_code() to handle
this value. But as we'll see in the next few patches, it can be cleaned
up even further. Let's just fix this bug directly to start with.
Reported-by: Romain Chossart <romainchossart@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If loading the index fails, we print an error and then return "-1" from
the function. But since this is a builtin, we end up with exit(-1),
which produces odd results since program exit codes are unsigned.
Because of integer conversion, it usually becomes 255, which is at least
still an error, but values above 128 are usually interpreted as signal
death.
Since we know the program is exiting immediately, we can just replace
the error return with a die().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many callers of run_diff_index() passed literal "1" for the option
flag word, which should better be spelled out as DIFF_INDEX_CACHED
for readablity. Everybody else passes "0" that can stay as-is.
The other bit in the option flag word is DIFF_INDEX_MERGE_BASE, but
curiously there is only one caller that can pass it, which is "git
diff-index --merge-base" itself---no internal callers uses the
feature.
A bit tricky call to the function is in builtin/submodule--helper.c
where the .cached member in a private struct is set/reset as a plain
Boolean flag, which happens to be "1" and happens to match the value
of DIFF_INDEX_CACHED.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch makes it possible to directly feed a branch description to
derive the cover letter from. The use case is formatting dynamically
created temporary commits which are not referenced anywhere.
The most obvious alternative would be creating a temporary branch and
setting a description on it, but that doesn't seem particularly elegant.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the commit color instead of the HEAD color for the arrow or custom
symbol in "HEAD -> branch" decorations, for visual consistency with the
prefix, separator and suffix symbols, which are also colored with the
commit color.
This change was triggered by the possibility that one could choose to
use the same symbol for the pointer and the separator options in
%(decorate), in which case they ought to be the same color.
A related precedent is 'ls -l', where the arrow for symlinks gets the
default color rather than that of the symlink name.
Amend test t4207-log-decoration-colors.sh accordingly.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add pointer and tag options to %(decorate) format, to allow to override
the " -> " string used to show where HEAD points and the "tag: " string
used to mark tags.
Document in pretty-formats.txt and test in t4205-log-pretty-formats.sh.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add %(decorate[:<options>]) format that lists ref names similarly to the
%d format, but which allows the otherwise fixed prefix, suffix and
separator strings to be customized. Omitted options default to the
strings used in %d.
Rename expand_separator() function used to expand %x literal formatting
codes to expand_string_arg(), as it is now used on strings other than
separators.
Examples:
- %(decorate) is equivalent to %d.
- %(decorate:prefix=,suffix=) is equivalent to %D.
- %(decorate:prefix=[,suffix=],separator=%x3B) produces a list enclosed
in square brackets and separated by semicolons.
Test the format in t4205-log-pretty-formats.sh and document it in
pretty-formats.txt.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wrap "tag:" prefixes and the arrows in "HEAD -> branch" decorations in
their own color sequences. Otherwise, if --graph is used, tag names or
arrows can end up uncolored when %w width formatting breaks a line just
before them. This is because --graph resets the color after doing its
drawing at the start of a line.
Amend test t4207-log-decoration-colors.sh accordingly.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In format_decorations(), don't obtain color sequences if there are no
decorations, and don't emit color sequences around empty strings.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the format_decorations_extended function to format_decorations
and drop the format_decorations wrapper macro. Pass the prefix, suffix
and separator strings as a single 'struct format_decorations' pointer
argument instead of separate arguments. Use default values defined in
the function when either the struct pointer or any of the struct fields
are NULL. This is to ease extension with additional options.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enclose the 'options' placeholders in the documentation of the
%(describe) and %(trailers) format specifiers in angle brackets to
clarify that they are placeholders rather than keywords.
Also remove the indentation from their descriptions, instead of
increasing it to account for the extra two angle brackets in the
headings. The indentation isn't required by asciidoc, it doesn't reflect
how the output text is formatted, and it's inconsistent with the
following bullet points that are at the same level in the output.
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description for a %(trailer) option already uses this term without
having a definition anywhere in the document, and we are about to add
another one in %(decorate) that uses it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We prefer for callback functions to match the signature with which
they'll be called, rather than casting them to the correct type when
assigning function pointers. Even though casting often works in the real
world, it is a violation of the standard.
We did a mass conversion in 939af16eac (hashmap_cmp_fn takes
hashmap_entry params, 2019-10-06), but have grown a few new cases since
then. Because of the cast, the compiler does not complain. However, as
of clang-18, UBSan will catch these at run-time, and the case in
range-diff.c triggers when running t3206.
After seeing that one, I scanned the results of:
git grep '_fn)[^(]' '*.c' | grep -v typedef
and found a similar case in compat/terminal.c (which presumably isn't
called in the test suite, since it doesn't trigger UBSan). There might
be other cases lurking if the cast is done using a typedef that doesn't
end in "_fn", but loosening it finds too many false positives. I also
looked for:
git grep ' = ([a-z_]*) *[a-z]' '*.c'
to find assignments that cast, but nothing looked like a function.
The resulting code is unfortunately a little longer, but the bonus of
using container_of() is that we are no longer restricted to the
hashmap_entry being at the start of the struct.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"diff-tree -b --exit-code" without "--patch" exits with 0 status,
not because it finds that the two input files are equivalent while
ignoring whitespaces, but because the implied "--raw" mode always
exits with 0 when whitespace tweaking options like "-b" and "-w"
are given, which is a long-standing bug.
We are about to fix it so that "--raw" and friends report the
differences with the exit status (even though they ignore the
whitespace tweaking options when producing their output), which
will make this test, which succeeded for a wrong reason, start
failing. Remove it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When options like "-w" is used while "--exit-code" option is in
effect, instead of the usual "do we have any filepair whose preimage
and postimage have different <mode,object>?" check, we need to compare
the contents of the blobs, taking into account that certain changes
are considered no-op.
With the previous step, we taught "--patch" codepath to set the
.found_changes bit correctly, even for a change that only affects
the mode and not object. The "--stat" codepath, however, did not
set the .found_changes bit at all. This lead to
$ git diff --stat -w --exit-code
for a change that does have an output to exit with status 0.
Set the bit by inspecting the list of paths the diffstat output is
given for (a mode-only change will still appear as a "0-line added
0-line deleted" change) to fix it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codepath to notice the content-level changes, taking certain
no-op changes like "ignore whitespace" into account, forgot that
a mode-only change is still a change. This resulted in
$ git diff --patch --exit-code -w
to exit with status 0 even when there is such a mode-only change,
breaking both "--patch" and "--quiet" output formats.
Teach the builtin_diff() codepath that creation and deletion as well
as mode changes are all interesting changes.
Note that the test specifically checks removal of an empty file,
because if there is anything in the preimage (i.e. the removed file
is not empty), the removal would still trigger textual patch output
and the codepath for that does update .found_changes bit to report
that it found an interesting change. We need to make sure that the
.found_changes bit is set even without triggering textual patch
output.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "--exit-code" is asked and the code cannot just answer by
comparing the object names on both sides but needs to inspect and
compare the contents, there are two ways that the result is found
out.
Some output modes, like "--stat" and "--patch", inherently have to
inspect the contents in order to show the differences in the way
they do. The codepaths for these modes set the .found_changes bit
as they compute what to show.
However, other output modes do not need to inspect the contents to
show the differences in the way they do. The most notable example
is "--quiet", which does not need to compute any output to show.
When they are asked to report "--exit-code", they run the codepaths
for the "--patch" output with their output redirected to "/dev/null",
only to set the .found_changes bit.
Currently, this fallback invocation of "--patch" output is done
after the "--stat" output format and its friends and before the
"--patch" and internal callback logic. Move it to the end of
the sequence to clarify the fallback status of this code block.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7ba7c52d76 (upload-pack: fix race condition in error messages,
2023-08-10), we have fixed a race in t5516-fetch-push.sh where sometimes
error messages got intermingled. This was done by splitting up the call
to `die()` such that we print the error message before writing to the
remote side, followed by a call to `exit(1)` afterwards.
This causes a subtle regression though as `die()` causes us to exit with
exit code 128, whereas we now call `exit(1)`. It's not really clear
whether we want to guarantee any specific error code in this case, and
neither do we document anything like that. But on the other hand, it
seems rather clear that this is an unintended side effect of the change
given that this change in behaviour was not mentioned at all.
Restore the status-quo by exiting with 128. The test in t5703 to
ensure that "git fetch" fails by using test_must_fail, which does
not care between exiting 1 and 128, so this changes will not affect
any test.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The continuations of the compound command were indented as if they were
continuations of the embedded pipe, which was misleading.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If both directories D1 and D2 already exists, and further there is a
filesystem entity D2/D1, "git mv D1 D2" would fail, and we get an
error message that says:
"cannot move directory over file, source=D1, destination=D2/D1"
regardless of the type of existing "D2/D1". If it is a file, the
message is correct, but if it is a directory, it is not (we could
make the D2/D1 directory a union of its original contents and what
was in D1/, but that is not what we do).
The code that decies to issue the error message only checks for
existence of "D2/D1" and does not care what kind of thing sits at
the path.
Rephrase the message to say
"destination already exists, source=D1, destination=D2/D1"
that would be suitable for any kind of thing being in the way.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set the requires-full-index to false for "check-attr".
Add a test to ensure that the index is not expanded whether the files
are outside or inside the sparse-checkout cone when the sparse index is
enabled.
The `p2000` tests demonstrate a ~63% execution time reduction for
'git check-attr' using a sparse index.
Test before after
-----------------------------------------------------------------------
2000.106: git check-attr -a f2/f4/a (full-v3) 0.05 0.05 +0.0%
2000.107: git check-attr -a f2/f4/a (full-v4) 0.05 0.05 +0.0%
2000.108: git check-attr -a f2/f4/a (sparse-v3) 0.04 0.02 -50.0%
2000.109: git check-attr -a f2/f4/a (sparse-v4) 0.04 0.01 -75.0%
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before this patch, git check-attr was unable to read the attributes from
a .gitattributes file within a sparse directory. The original comment
was operating under the assumption that users are only interested in
files or directories inside the cones. Therefore, in the original code,
in the case of a cone-mode sparse-checkout, we didn't load the
.gitattributes file.
However, this behavior can lead to missing attributes for files inside
sparse directories, causing inconsistencies in file handling.
To resolve this, revise 'git check-attr' to allow attribute reading for
files in sparse directories from the corresponding .gitattributes files:
1.Utilize path_in_cone_mode_sparse_checkout() and index_name_pos_sparse
to check if a path falls within a sparse directory.
2.If path is inside a sparse directory, employ the value of
index_name_pos_sparse() to find the sparse directory containing path and
path relative to sparse directory. Proceed to read attributes from the
tree OID of the sparse directory using read_attr_from_blob().
3.If path is not inside a sparse directory,ensure that attributes are
fetched from the index blob with read_blob_data_from_index().
Change the test 'check-attr with pathspec outside sparse definition' to
'test_expect_success' to reflect that the attributes inside a sparse
directory can now be read. Ensure that the sparse index case works
correctly for git check-attr to illustrate the successful handling of
attributes within sparse directories.
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests for `git check-attr`, make sure attribute file does get read
from index when path is either inside or outside of sparse-checkout
definition.
Add a test named 'diff --check with pathspec outside sparse definition'.
It starts by disabling the trailing whitespace and space-before-tab
checks using the core. whitespace configuration option. Then, it
specifically re-enables the trailing whitespace check for a file located
in a sparse directory by adding a whitespace=trailing-space rule to the
.gitattributes file within that directory. Next, create and populate the
folder1 directory, and then add the .gitattributes file to the index.
Edit the contents of folder1/a, add it to the index, and proceed to
"re-sparsify" 'folder1/' with 'git sparse-checkout reapply'. Finally,
use 'git diff --check --cached' to compare the 'index vs. HEAD',
ensuring the correct application of the attribute rules even when the
file's path is outside the sparse-checkout definition.
Mark the two tests 'check-attr with pathspec outside sparse definition'
and 'diff --check with pathspec outside sparse definition' as
'test_expect_failure' to reflect an existing issue where the attributes
inside a sparse directory are ignored. Ensure that the 'check-attr'
command fails to read the required attributes to demonstrate this
expected failure.
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git maintenance start', the current pattern is to
configure global config settings to enable maintenance on the current
repository and set 'maintenance.auto' to false and _then_ to set up the
schedule with the system scheduler.
This has a problematic error condition: if the scheduler fails to
initialize, the repository still will not use automatic maintenance due
to the 'maintenance.auto' setting.
Fix this gap by swapping the order of operations. If Git fails to
initialize maintenance, then the config changes should never happen.
Reported-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git maintenance run' command prevents concurrent runs in the same
repository using a 'maintenance.lock' file. However, when using systemd
the hourly maintenance runs the same time as the daily and weekly runs.
(Similarly, daily maintenance runs at the same time as weekly
maintenance.) These competing commands result in some maintenance not
actually being run.
This overlap was something we could not fix until we made the recent
change to not use the builting 'hourly', 'daily', and 'weekly' schedules
in systemd. We can adjust the schedules such that:
1. Hourly runs avoid the 0th hour.
2. Daily runs avoid Monday.
This will keep maintenance runs from colliding when using systemd.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.
Add this random minute to the systemd integration.
This integration is more complicated than similar changes for other
schedulers because of a neat trick that systemd allows: templating.
The previous implementation generated two template files with names
of the form 'git-maintenance@.(timer|service)'. The '.timer' or
'.service' indicates that this is a template that is picked up when we
later specify '...@<schedule>.timer' or '...@<schedule>.service'. The
'<schedule>' string is then used to insert into the template both the
'OnCalendar' schedule setting and the '--schedule' parameter of the
'git maintenance run' command.
In order to set these schedules to a given minute, we can no longer use
the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead
need to abandon the template model for the .timer files. We can still
use templates for the .service files. For this reason, we split these
writes into two methods.
Modify the template with a custom schedule in the 'OnCalendar' setting.
This schedule has some interesting differences from cron-like patterns,
but is relatively easy to figure out from context. The one that might be
confusing is that '*-*-*' is a date-based pattern, but this must be
omitted when using 'Mon' to signal that we care about the day of the
week. Monday is used since that matches the day used for the 'weekly'
schedule used previously.
Now that the timer files are not templates, we might want to abandon the
'@' symbol in the file names. However, this would cause users with
existing schedules to get two competing schedules due to different
names. The work to remove the old schedule name is one thing that we can
avoid by keeping the '@' symbol in our unit names. Since we are locked
into this name, it makes sense that we keep the template model for the
.service files.
The rest of the change involves making sure we are writing these .timer
and .service files before initializing the schedule with 'systemctl' and
deleting the files when we are done. Some changes are also made to share
the random minute along with a single computation of the execution path
of the current Git executable.
In addition, older Git versions may have written a
'git-maintenance@.timer' template file. Be sure to remove this when
successfully enabling maintenance (or disabling maintenance).
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The systemd_timer_write_unit_templates() method writes a single template
that is then used to start the hourly, daily, and weekly schedules with
systemd.
However, in order to schedule systemd maintenance on a given minute,
these templates need to be replaced with specific schedules for each of
these jobs.
Before modifying the schedules, move the writing method above the
systemd_timer_enable_unit() method, so we can write a specific schedule
for each unit.
The diff is computed smaller by showing systemd_timer_enable_unit() and
systemd_timer_delete_units() move instead of
systemd_timer_write_unit_templates() and
systemd_timer_delete_unit_templates().
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.
Add this random minute to the cron integration.
The cron schedule specification starts with a minute indicator, which
was previously inserted as the "0" string but now takes the given minute
as an integer parameter.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.
Add this random minute to the Windows scheduler integration.
We need only to modify the minute value for the 'StartBoundary' tag
across the three schedules.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.
Use get_random_minute() when constructing the schedules for launchctl.
The format already includes a 'Minute' key which is modified from 0 to
the random minute.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we initially created background maintenance -- with its hourly,
daily, and weekly schedules -- we considered the effects of all clients
launching fetches to the server every hour on the hour. The worry of
DDoSing server hosts was noted, but left as something we would consider
for a future update.
As background maintenance has gained more adoption over the past three
years, our worries about DDoSing the big Git hosts has been unfounded.
Those systems, especially those serving public repositories, are already
resilient to thundering herds of much smaller scale.
However, sometimes organizations spin up specific custom server
infrastructure either in addition to or on top of their Git host. Some
of these technologies are built for a different range of scale, and can
hit concurrency limits sooner. Organizations with such custom
infrastructures are more likely to recommend tools like `scalar` which
furthers their adoption of background maintenance.
To help solve for this, create get_random_minute() as a method to help
Git select a random minute when creating schedules in the future. The
integrations with this method do not yet exist, but will follow in
future changes.
To avoid multiple sources of randomness in the Git codebase, create a
new helper function, git_rand(), that returns a random uint32_t. This is
similar to how rand() returns a random nonnegative value, except it is
based on csprng_bytes() which is cryptographic and will return values
larger than RAND_MAX.
One thing that is important for testability is that we notice when we
are under a test scenario and return a predictable result. The schedules
themselves are not checked for this value, but at least one launchctl
test checks that we do not unnecessarily reboot the schedule if it has
not changed from a previous version.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With this change, users can override the compiled-in default for the
maximal length of the label names generated by `git rebase
--rebase-merges`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Mark Ruvald Pedersen <mped@demant.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some commits may have unusually long subject lines. When those subject
lines are used as labels in the `--rebase-merges` mode of `git rebase`,
they can cause errors when writing the corresponding loose refs because
most file systems have a maximal file name length of 255 (`NAME_MAX`).
The symptom looks like this:
$ git rebase --continue
error: cannot lock ref 'refs/rewritten/SANITIZED-SUBJECT': Unable to create '.git/refs/rewritten/SANITIZED-SUBJECT.lock': File name too long - where SANITIZED-SUBJECT is very long
Let's accommodate this situation by truncating the labels.
Care must be taken in case the subject line contains multi-byte
characters so as not to truncate in the middle of a character.
Signed-off-by: Mark Ruvald Pedersen <mped@demant.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Firstly, make it reflect better what actually happens. Not omitting some
possibilities makes it easier to fully exploit them, and not
contradicting the implementation makes it easier to grok and thus modify
the code.
Secondly, improve the overall structure, putting more general info
further up.
Thirdly, document `merge`, `fakesha`, and `break`, which were previously
omitted entirely.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test t5516-fetch-push.sh has a test 'deny fetch unreachable SHA1,
allowtipsha1inwant=true' that checks stderr for a specific error
string from the remote. In some build environments the error sent
over the remote connection gets mingled with the error from the
die() statement. Since both signals are being output to the same
file descriptor (but from parent and child processes), the output
we are matching with grep gets split.
To reduce the risk of this failure, follow this process instead:
1. Write an error message to stderr.
2. Write an error message across the connection.
3. exit(1).
This reorders the events so the error is written entirely before
the client receives a message from the remote, removing the race
condition.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While working on a blog post and using grammarly it suggested this
change.
Signed-off-by: Wesley Schwengle <wesleys@opperschaap.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `pack_geometry` struct is used to maintain and partition a list of
packfiles into a "frozen" set (to be left alone), and a non-frozen set
(to be combined into a single new pack). In the previous commit, we
removed a leak caused by neglecting to free() the heap allocated space
used to store the structure itself.
But there is no need for this structure to live on the heap anyway.
Instead, let's move it to be stack allocated, eliminating the
possibility of a direct leak like the one addressed in the previous
patch.
The one minor hitch is that we use the NULL-ness of the pack_geometry's
struct pointer to determine whether or not we are performing a geometric
repack with `--geometric=<d>`. But since we only initialize the
pack_geometry structure when the `geometric_factor` is non-zero, we can
use that variable (based on whether or not it is equal to zero) to
determine whether or not we are performing a geometric repack.
There are a couple of spots that have access to a pointer to the
pack_geometry struct, but not the geometric_factor itself. Instead of
passing in an additional variable, let's make the geometric_factor a
field of the pack_geometry struct.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The operation doesn't change the number of elements in the array, so we do
not need to allocate the result piecewise.
This moves the re-assignment of todo_list->alloc at the end slighly up,
so it's right after the newly added assert which also refers to `nr`
(and which indeed should come first). Also, the value is more likely to
be still in a register at that point.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's update the error message we show when the user tries to check out
a branch which is being used in another worktree, following the
guideline reasoned in 4970bedef2 (branch: update the message to refuse
touching a branch in-use, 2023-07-21).
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's update the error message we show when the user tries to delete a
branch which is being used in another worktree, following the guideline
reasoned in 4970bedef2 (branch: update the message to refuse touching a
branch in-use, 2023-07-21).
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Default next action after 'fakesha' to preserving the command instead
of forcing 'pick', consistently with other "instant-effect" keywords.
There is no reason why one would want that inconsistency, so this was
clearly just an oversight in commit 5dcdd740 ("t/lib-rebase: prepare
for testing `git rebase --rebase-merges`"). Rectifying it makes the
behavior easier to reason about and document.
This would affect hypothetical "fakesha <n>" sequences where line <n>
already isn't a pick, which currently don't appear.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
... in FAKE_LINES.
This has been broken ever since it was introduced in 5dcdd7409a
(t/lib-rebase: prepare for testing `git rebase --rebase-merges`,
2019-07-31), but it's not actually used, so it's a cosmetic defect
only.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
c512643e67 (short help: allow a gap smaller than USAGE_GAP, 2023-07-18)
effectively did away with the two-space gap between options and their
description; one space is enough now. Incorporate USAGE_GAP into
USAGE_OPTS_WIDTH, merge the two cases with enough space on the line and
incorporate the newline into the format for the remaining case. The
output remains the same.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid showing an optional "no-" for options that already start with a
"no-" in the short help, as that double negation is confusing. Document
the opposite variant on its own line with a generated help text instead,
unless it's defined and documented explicitly already.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract functions for printing spaces before and after options. We'll
need them in the next commit.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a "[no-]" prefix to options without the flag PARSE_OPT_NONEG to
document the fact that you can negate them.
This looks a bit strange for options that already start with "no-", e.g.
for the option --no-name of git show-branch:
--[no-]no-name suppress naming strings
You can actually use --no-no-name as an alias of --name, so the short
help is not wrong. If we strip off any of the "no-"s, we lose either
the ability to see if the remaining one belongs to the documented
variant or to see if it can be negated.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests for checking the "git rev-parse --parseopt" flag "!" and
whether options can be negated with a "no-" prefix.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rev-parse --parseopt" shows the short help with its description of
all recognized options twice: When called with -h or --help, and after
reporting an unknown option. Move the one for optionspec into a file
and use it in two tests to deduplicate that part.
"git rev-parse --parseopt -- --h" wraps the help text in "cat <<\EOF"
and "EOF". Keep that part in the file to use it as is in the test that
needs it and simply remove it in the other one using sed.
Disable whitespace checking for the file using an attribute, as we need
to keep its spaces intact and wouldn't want a stray --whitespace=fix
turn them into tabs.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rev-parse --parseopt" handles the built-in options -h and --help,
but not --no-help. Make test definitions and documentation examples
more realistic by disabling negation.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git subtree" only handles the negated variant of the options annotate,
prefix, onto, rejoin, ignore-joins and squash explicitly. help is
handled by "git rev-parse --parseopt" implicitly, but not its negated
form. Disable negation for it and the for the rest of the options to
get a helpful error message when trying them.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git checkout -m -- path" uses the unmerge_marked_index() API, whose
implementation is incapable of unresolving a path that was resolved
as removed. Extend the unmerge_index() API function so that we can
mark the ce_flags member of the cache entries we add to the index as
unmerged, and replace use of unmerge_marked_index() with it.
Now, together with its unmerge_index_entry_at() helper function,
unmerge_marked_index() function is no longer called by anybody, and
can safely be removed.
This makes two known test failures in t2070 and t7201 to succeed.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though "checkout --merge -- paths" had some tests, we never
made sure it worked to recreate the conflicted state _after_ the
resolution was recorded in the index. Also "restore --merge" did
not even have any tests.
Currently these commands use the unmerge_marked_index() interface
that cannot handle paths that have been resolved as removal, and
tests for that case are marked with test_expect_failure; these
should eventually be fixed, but not in this patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recreating unmerged index entries using resolve-undo data,
recreating conflicted working tree files using unmerged index
entries, and writing data out of unmerged index entries, make
sense only when we are checking paths out of the index and not when
we are checking paths out of a tree-ish.
Add an extra check to make sure "--merge" and "--ours/--theirs"
options are rejected when checking out from a tree-ish, update the
document (especially the SYNOPSIS section) to highlight that they
are incompatible, and add a few tests to make sure the combination
fails.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "update-index --unresolve" is a relatively old feature that was
introduced in Git v1.4.1 (June 2006), which predates the
resolve-undo extension introduced in Git v1.7.0 (February 2010).
The original code that was limited only to work during a merge (and
not during a rebase or a cherry-pick) has been kept as the fallback
codepath to be used as a transition measure.
By now, for more than 10 years we have stored resolve-undo extension
in the index file, and the fallback code way outlived its usefulness.
Remove it, together with two file-scope static global variables.
One of these variables is still used by surviving function, but it
does not have to be a global at all, so move it to local to that
function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"update-index --unresolve" uses the unmerge_index_entry_at() that
assumes that the path to be unresolved must be in the index, which
makes it impossible to unresolve a path that was resolved as removal.
Rewrite unresolve_one() to use the unmerge_index_entry() to support
unresolving such a path.
Existing tests for "update-index --unresolve" forgot to check one
thing that tests for "checkout --merge -- paths" tested, which is to
make sure that resolve-undo record that has already been used to
recreate higher-stage index entries is removed. Add new invocations
of "ls-files --resolve-undo" after running "update-index --unresolve"
to make sure that unresolving with update-index does remove the used
resolve-undo records.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The resolve-undo index extension records up to three (mode, object
name) tuples for non-zero stages for each path that was resolved,
to be used to recreate the original conflicted state later when the
user requests.
The unmerge_index_entry_at() function uses the resolve-undo data to
do so, but it assumes that the path for which the conflicted state
needs to be recreated can be specified by the position in the
active_cache[] array. This obviously cannot salvage the state of
conflicted paths that were resolved by removing them. For example,
a delete-modify conflict, in which the change whose "modify" side
made is a trivial typofix, may legitimately be resolved to remove
the path, and resolve-undo extension does record the two (mode,
object name) tuples for the common ancestor version and their
version, lacking our version. But after recording such a removal of
the path, you should be able to use resolve-undo data to recreate
the conflicted state.
Introduce a new unmerge_index_entry() helper function that takes the
path (which does not necessarily have to exist in the active_cache[]
array) and resolve-undo data, and use it to reimplement unmerge_index()
public function that is used by "git rerere".
The limited interface is still kept for now, as it is used by "git
checkout -m" and "git update-index --unmerge", but these two codepaths
will be updated to lift the assumption to allow conflicts that resolved
to deletion can be recreated.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "update-index --unresolve $path" cannot find the resolve-undo
record for the path the user requested to unresolve, it stuffs the
blobs from HEAD and MERGE_HEAD to stage #2 and stage #3 as a
fallback. For this reason, the operation does not even start unless
both "HEAD" and "MERGE_HEAD" exist.
This is suboptimal in a few ways:
* It does not recreate stage #1. Even though it is a correct
design decision not to do so (because it is impossible to
recreate in general cases, without knowing how we got there,
including what merge strategy was used), it is much less useful
not to have that information in the index.
* It limits the "unresolve" operation only during a conflicted "git
merge" and nothing else. Other operations like "rebase",
"cherry-pick", and "switch -m" may result in conflicts, and the
user may want to unresolve the conflict that they incorrectly
resolved in order to redo the resolution, but the fallback would
not kick in.
* Most importantly, the entire "unresolve" operation is disabled
after a conflicted merge is committed and MERGE_HEAD is removed,
even though the index has perfectly usable resolve-undo records.
By lazily reading the HEAD and MERGE_HEAD only when we need to go to
the fallback codepath, we will allow cases where resolve-undo
records are available (which is 100% of the time, unless the user is
reading from an index file created by Git more than 10 years ago) to
proceed even after a conflicted merge was committed, during other
mergy operations that do not use MERGE_HEAD, or after the result of
such mergy operations has been committed.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The credential erase request typically includes protocol, host, username
and password.
credential-wincred erases stored credentials that match protocol,
host and username, regardless of password.
This is confusing in the case the stored password differs from that
in the request. This case can occur when multiple credential helpers are
configured.
Only erase credential if stored password matches request (or request
omits password).
This fixes test "helper (wincred) does not erase a password distinct
from input" when t0303 is run with GIT_TEST_CREDENTIAL_HELPER set to
"wincred". This test was added in aeb21ce22e (credential: avoid
erasing distinct password, 2023-06-13).
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The credential erase request typically includes protocol, host, username
and password.
credential-libsecret erases a stored credential if it matches protocol,
host and username, regardless of password.
This is confusing in the case the stored password differs from that
in the request. This case can occur when multiple credential helpers are
configured.
Only erase credential if stored password matches request (or request
omits password).
This fixes test "helper (libsecret) does not erase a password distinct
from input" when t0303 is run with GIT_TEST_CREDENTIAL_HELPER set to
"libsecret". This test was added in aeb21ce22e (credential: avoid
erasing distinct password, 2023-06-13).
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
d208bfd (credential: new attribute password_expiry_utc, 2023-02-18)
and a5c76569e7 (credential: new attribute oauth_refresh_token)
introduced new credential attributes.
libsecret assumes attribute values are non-confidential and
unchanging, so we encode the new attributes in the secret, separated by
newline:
hunter2
password_expiry_utc=1684189401
oauth_refresh_token=xyzzy
This is extensible and backwards compatible. The credential protocol
already assumes that attribute values do not contain newlines.
Alternatives considered: store password_expiry_utc in a libsecret
attribute. This has the problem that libsecret creates new items
rather than overwrites when attribute values change.
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since GNU make 4.4 the semantics of the $(MAKEFLAGS) variable has
changed in a backward-incompatible way, as its "NEWS" file notes:
Previously only simple (one-letter) options were added to the MAKEFLAGS
variable that was visible while parsing makefiles. Now, all options are
available in MAKEFLAGS. If you want to check MAKEFLAGS for a one-letter
option, expanding "$(firstword -$(MAKEFLAGS))" is a reliable way to return
the set of one-letter options which can be examined via findstring, etc.
This means that $(MAKEFLAGS) now contains long options like
"--jobserver-auth=fifo:<path>" and we have to adapt to that.
Note that the "-" in "-$(MAKEFLAGS)" is critical here, as the variable
will always contain leading whitespace if there are no short options,
but long options are present.
This is a partial backport of 67b36879fc (Makefiles: change search
through $(MAKEFLAGS) for GNU make 4.4, 2022-11-30), which had been
applied directly to git/git.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Fix a Remote Code Execution vulnerability on Windows. This is caused by
the fact that Tcl on Windows always includes the current directory when
looking for an executable. Therefore malicious repositories can ship
with an aspell.exe in their top-level directory which is executed by Git
GUI without giving the user a chance to inspect it first, i.e. running
untrusted code.
This merge fixes CVE-2022-41953.
* js/windows-rce:
Work around Tcl's default `PATH` lookup
Move the `_which` function (almost) to the top
Move is_<platform> functions to the beginning
is_Cygwin: avoid `exec`ing anything
windows: ignore empty `PATH` elements
As per https://www.tcl.tk/man/tcl8.6/TclCmd/exec.html#M23, Tcl's `exec`
function goes out of its way to imitate the highly dangerous path lookup
of `cmd.exe`, but _of course_ only on Windows:
If a directory name was not specified as part of the application
name, the following directories are automatically searched in
order when attempting to locate the application:
The directory from which the Tcl executable was loaded.
The current directory.
The Windows 32-bit system directory.
The Windows home directory.
The directories listed in the path.
The dangerous part is the second item, of course: `exec` _prefers_
executables in the current directory to those that are actually in the
`PATH`.
It is almost as if people wanted to Windows users vulnerable,
specifically.
To avoid that, Git GUI already has the `_which` function that does not
imitate that dangerous practice when looking up executables in the
search path.
However, Git GUI currently fails to use that function e.g. when trying to
execute `aspell` for spell checking.
That is not only dangerous but combined with Tcl's unfortunate default
behavior and with the fact that Git GUI tries to spell-check a
repository just after cloning, leads to a critical Remote Code Execution
vulnerability.
Let's override both `exec` and `open` to always use `_which` instead of
letting Tcl perform the path lookup, to prevent this attack vector.
This addresses CVE-2022-41953.
For more details, see
https://github.com/git-for-windows/git/security/advisories/GHSA-v4px-mx59-w99c
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
We are about to make use of the `_which` function to address
CVE-2022-41953 by overriding Tcl/Tk's unsafe PATH lookup on Windows.
In preparation for that, let's move it close to the top of the file to
make sure that even early `exec` calls that happen during the start-up
of Git GUI benefit from the fix.
This commit is best viewed with `--color-moved`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
We need these in `_which` and they should be defined before that
function's definition.
This commit is best viewed with `--color-moved`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
The `is_Cygwin` function is used, among other things, to determine
how executables are discovered in the `PATH` list by the `_which` function.
We are about to change the behavior of the `_which` function on Windows
(but not Cygwin): On Windows, we want it to ignore empty elements of the
`PATH` instead of treating them as referring to the current directory
(which is a "legacy feature" according to
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03,
but apparently not explicitly deprecated, the POSIX documentation is
quite unclear on that even if the Cygwin project itself considers it to
be deprecated: https://github.com/cygwin/cygwin/commit/fc74dbf22f5c).
This is important because on Windows, `exec` does something very unsafe
by default (unless we're running a Cygwin version of Tcl, which follows
Unix semantics).
However, we try to `exec` something _inside_ `is_Cygwin` to determine
whether we're running within Cygwin or not, i.e. before we determined
whether we need to handle `PATH` specially or not. That's a Catch-22.
Therefore, and because it is much cleaner anyway, use the
`$::tcl_platform(os)` value which is guaranteed to start with `CYGWIN_`
when running a Cygwin variant of Tcl/Tk, instead of executing `cygpath
--windir`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
When looking up an executable via the `_which` function, Git GUI
imitates the `execlp()` strategy where the environment variable `PATH`
is interpreted as a list of paths in which to search.
For historical reasons, stemming from the olden times when it was
uncommon to download a lot of files from the internet into the current
directory, empty elements in this list are treated as if the current
directory had been specified.
Nowadays, of course, this treatment is highly dangerous as the current
directory often contains files that have just been downloaded and not
yet been inspected by the user. Unix/Linux users are essentially
expected to be very, very careful to simply not add empty `PATH`
elements, i.e. not to make use of that feature.
On Windows, however, it is quite common for `PATH` to contain empty
elements by mistake, e.g. as an unintended left-over entry when an
application was installed from the Windows Store and then uninstalled
manually.
While it would probably make most sense to safe-guard not only Windows
users, it seems to be common practice to ignore these empty `PATH`
elements _only_ on Windows, but not on other platforms.
Sadly, this practice is followed inconsistently between different
software projects, where projects with few, if any, Windows-based
contributors tend to be less consistent or even "blissful" about it.
Here is a non-exhaustive list:
Cygwin:
It specifically "eats" empty paths when converting path lists to
POSIX: https://github.com/cygwin/cygwin/commit/753702223c7d
I.e. it follows the common practice.
PowerShell:
It specifically ignores empty paths when searching the `PATH`.
The reason for this is apparently so self-evident that it is not
even mentioned here:
https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_environment_variables#path-information
I.e. it follows the common practice.
CMD:
Oh my, CMD. Let's just forget about it, nobody in their right
(security) mind takes CMD as inspiration. It is so unsafe by
default that we even planned on dropping `Git CMD` from Git for
Windows altogether, and only walked back on that plan when we
found a super ugly hack, just to keep Git's users secure by
default:
https://github.com/git-for-windows/MINGW-packages/commit/82172388bb51
So CMD chooses to hide behind the battle cry "Works as
Designed!" that all too often leaves users vulnerable. CMD is
probably the most prominent project whose lead you want to avoid
following in matters of security.
Win32 API (`CreateProcess()`)
Just like CMD, `CreateProcess()` adheres to the original design
of the path lookup in the name of backward compatibility (see
https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessw
for details):
If the file name does not contain a directory path, the
system searches for the executable file in the following
sequence:
1. The directory from which the application loaded.
2. The current directory for the parent process.
[...]
I.e. the Win32 API itself chooses backwards compatibility over
users' safety.
Git LFS:
There have been not one, not two, but three security advisories
about Git LFS executing executables from the current directory by
mistake. As part of one of them, a change was introduced to stop
treating empty `PATH` elements as equivalent to `.`:
https://github.com/git-lfs/git-lfs/commit/7cd7bb0a1f0d
I.e. it follows the common practice.
Go:
Go does not follow the common practice, and you can think about
that what you want:
https://github.com/golang/go/blob/go1.19.3/src/os/exec/lp_windows.go#L114-L135https://github.com/golang/go/blob/go1.19.3/src/path/filepath/path_windows.go#L108-L137
Git Credential Manager:
It tries to imitate Git LFS, but unfortunately misses the empty
`PATH` element handling. As of time of writing, this is in the
process of being fixed:
https://github.com/GitCredentialManager/git-credential-manager/pull/968
So now that we have established that it is a common practice to ignore
empty `PATH` elements on Windows, let's assess this commit's change
using Schneier's Five-Step Process
(https://www.schneier.com/crypto-gram/archives/2002/0415.html#1):
Step 1: What problem does it solve?
It prevents an entire class of Remote Code Execution exploits via
Git GUI's `Clone` functionality.
Step 2: How well does it solve that problem?
Very well. It prevents the attack vector of luring an unsuspecting
victim into cloning an executable into the worktree root directory
that Git GUI immediately executes.
Step 3: What other security problems does it cause?
Maybe non-security problems: If a project (ab-)uses the unsafe
`PATH` lookup. That would not only be unsafe, though, but
fragile in the first place because it would break when running
in a subdirectory. Therefore I would consider this a scenario
not worth keeping working.
Step 4: What are the costs of this measure?
Almost nil, except for the time writing up this commit message
;-)
Step 5: Given the answers to steps two through four, is the security
measure worth the costs?
Yes. Keeping Git's users Secure By Default is worth it. It's a
tiny price to pay compared to the damages even a single
successful exploit can cost.
So let's follow that common practice in Git GUI, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Hash object as it were located at the given path. The location of
file does not directly influence on the hash value, but path is
used to determine what Git filters should be applied to the object
before it can be placed to the object database, and, as result of
Hash object as if it were located at the given path. The location of
the file does not directly influence the hash value, but the path is
used to determine which Git filters should be applied to the object
before it can be placed in the object database. As a result of
applying filters, the actual blob put into the object database may
differ from the given file. This option is mainly useful for hashing
temporary files located outside of the working directory or files
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.