Fixes to code that parses the todo file used in "rebase -i".
* pw/rebase-i-parse-fix:
rebase -i: fix parsing of "fixup -C<commit>"
rebase -i: match whole word in is_command()
Various fix-ups on HTTP tests.
* jk/http-test-fixes:
t5559: make SSL/TLS the default
t5559: fix test failures with LIB_HTTPD_SSL
t/lib-httpd: enable HTTP/2 "h2" protocol, not just h2c
t/lib-httpd: respect $HTTPD_PROTO in expect_askpass()
t5551: drop curl trace lines without headers
t5551: handle v2 protocol in cookie test
t5551: simplify expected cookie file
t5551: handle v2 protocol in upload-pack service test
t5551: handle v2 protocol when checking curl trace
t5551: stop forcing clone to run with v0 protocol
t5551: handle HTTP/2 when checking curl trace
t5551: lower-case headers in expected curl trace
t5551: drop redundant grep for Accept-Language
t5541: simplify and move "no empty path components" test
t5541: stop marking "used receive-pack service" test as v0 only
t5541: run "used receive-pack service" test earlier
The format.attach configuration variable lacked a way to override a
value defined in a lower-priority configuration file (e.g. the
system one) by redefining it in a higher-priority configuration
file. Now, setting format.attach to an empty string means show the
patch inline in the e-mail message, without using MIME attachment.
This is a backward incompatible change.
* jc/countermand-format-attach:
format.attach: allow empty value to disable multi-part messages
sscanf(3) used in "git symbolic-ref --short" implementation found
to be not working reliably on macOS in UTF-8 locales. Rewrite the
code to avoid sscanf() altogether to work it around.
* jk/shorten-unambiguous-ref-wo-sscanf:
shorten_unambiguous_ref(): avoid sscanf()
shorten_unambiguous_ref(): use NUM_REV_PARSE_RULES constant
shorten_unambiguous_ref(): avoid integer truncation
The credential subsystem learned that a password may have an
explicit expiration.
* mh/credential-password-expiry:
credential: new attribute password_expiry_utc
"git archive HEAD^{tree}" records the paths with the current
timestamp in the archive, making it harder to obtain a stable
output. The command learned the --mtime option to specify an
arbitrary timestamp (e.g. --mtime="@0 +0000" for the epoch).
* rs/archive-mtime:
archive: add --mtime
Remove leftover and unused code.
* tb/drop-dir-iterator-follow-symlink-bit:
t0066: drop setup of "dir5"
dir-iterator: drop unused `DIR_ITERATOR_FOLLOW_SYMLINKS`
The "diff" drivers specified by the "diff" attribute attached to
paths can now specify which algorithm (e.g. histogram) to use.
* jc/diff-algo-attribute:
diff: teach diff to read algorithm from diff driver
diff: consolidate diff algorithm option parsing
An invalid label or ref in the "rebase -i" todo file used to
trigger an runtime error. SUch an error is now diagnosed while the
todo file is parsed.
* pw/rebase-i-validate-labels-early:
rebase -i: check labels and refs when parsing todo list
When a comment describing how each test file should start was added in
commit [1], it was the second comment of t/test-lib.sh. The comment
describes how variable "test_description" is supposed to be assigned at
the top of each test file and how "test-lib.sh" should be used by
sourcing it. However, even in [1], the comment was ten lines away from
the usage of the variable by test-lib.sh. Since then, the comment has
drifted away both from the top of the file and from the usage of the
variable. The comment just sits in the middle of the initialization of
the test library, surrounded by unrelated code, almost one hundred lines
away from the usage of "test_description".
Nobody has noticed this drift during evolution of test-lib.sh, which
suggests that this comment has outlived its usefulness. The assignment
of "test_description", sourcing of "test-lib.sh" by tests, and the
process of writing tests in general are described in detail in
"t/README". So drop the obsolete comment.
An alternative solution could be to move the comment either to the top
of the file, or down to the usage of variable "test_description".
[1] e1970ce43a ("[PATCH 1/2] Test framework take two.", 2005-05-13)
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The style of t9700-perl-git.sh is old. There are 3 problems:
* A title is not on the same line with test_expect_success command.
* A test body is indented by whitespaces.
* There are whitespaces after redirect operators.
Modernize test scripts by:
* Combine the title with test_expect_success command.
* Replace whitespace indents with TAB.
* Delete whitespaces after redirect operators.
Signed-off-by: Zhang Yi <18994118902@163.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch --jobs=0" used to hit a BUG(), which has been corrected
to use the available CPUs.
* ma/fetch-parallel-use-online-cpus:
fetch: choose a sensible default with --jobs=0 again
A test helper had a single write(2) of 256kB, which was too big for
some platforms (e.g. NonStop), which has been corrected by using
xwrite() wrapper appropriately.
* jc/genzeros-avoid-raw-write:
test-genzeros: avoid raw write(2)
Error messages given upon a signature verification failure used to
discard the errors from underlying gpg program, which has been
corrected.
* js/gpg-errors:
gpg: do show gpg's error message upon failure
t7510: add a test case that does not need gpg
If the user omits the space between "-C" and the commit in a fixup
command then it is parsed as an ordinary fixup and the commit message is
not updated as it should be. Fix this by making the space between "-C"
and "<commit>" optional as it is for the "merge" command.
Note that set_replace_editor() is changed to set $GIT_SEQUENCE_EDITOR
instead of $EDITOR in order to be able to replace the todo list and
reword commits with $FAKE_COMMIT_MESSAGE. This is safe as all the
existing users are using set_replace_editor() to replace the todo list.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When matching an unabbreviated command is_command() only does a prefix
match which means it parses "pickled" as TODO_PICK. parse_insn_line()
does error out because is_command() only advances as far as the end of
"pick" so it looks like the command name is not followed by a space but
the error message is "missing arguments for pick" rather than telling
the user that the "pickled" is not a valid command.
Fix this by ensuring the match is follow by whitespace or the end of the
string as we already do for abbreviated commands. The (*bol = p) at the
end of the condition is a bit cute for my taste but I decided to leave
it be for now. Rather than add new tests the existing tests for bad
commands are adapted to use a bad command name that triggers the prefix
matching bug.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of t5559 is run the regular t5551 tests with HTTP/2. But it
does so with the "h2c" protocol, which uses cleartext upgrades from
HTTP/1.1 to HTTP/2 (rather than learning about HTTP/2 support during the
TLS negotiation).
This has a few problems:
- it's not very indicative of the real world. In practice, most servers
that support HTTP/2 will also support TLS.
- support for upgrading does not seem as robust. In particular, we've
run into bugs in some versions of Apache's mod_http2 that trigger
only with the upgrade mode. See:
https://lore.kernel.org/git/Y8ztIqYgVCPILJlO@coredump.intra.peff.net/
So the upside is that this change makes our HTTP/2 tests more robust and
more realistic. The downside is that if we can't set up SSL for any
reason, we'll skip the tests (even though you _might_ have been able to
run the HTTP/2 tests the old way). We could probably have a conditional
fallback, but it would be complicated for little gain, and it's not even
clear it would help (i.e., would any test environment even have HTTP/2
but not SSL support?).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One test needs to be tweaked in order for t5559 to pass with SSL/TLS set
up. When we make our initial clone, we check that the curl trace of
requests is what we expected. But we need to fix two things:
- along with ignoring "data" lines from the trace, we need to ignore
"SSL data" lines
- when TLS is used, the server is able to tell the client (via ALPN)
that it supports HTTP/2 before the first HTTP request is made. So
rather than request an upgrade using an HTTP header, it can just
speak HTTP/2 immediately
With this patch, running:
LIB_HTTPD_SSL=1 ./t5559-http-fetch-smart-http2.sh
works, whereas it did not before.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 73c49a4474 (t: run t5551 tests with both HTTP and HTTP/2,
2022-11-11) added Apache config to enable HTTP/2. However, it only
enabled the "h2c" protocol, which allows cleartext HTTP/2 (generally
based on an upgrade header during an HTTP/1.1 request). This is what
t5559 is generally testing, since by default we don't set up SSL/TLS.
However, it should be possible to run t5559 with LIB_HTTPD_SSL set. In
that case, Apache will advertise support for HTTP/2 via ALPN during the
TLS handshake. But we need to tell it support "h2" (the non-cleartext
version) to do so. Without that, then curl does not even try to do the
HTTP/1.1 upgrade (presumably because after seeing that we did TLS but
didn't get the ALPN indicator, it assumes it would be fruitless).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the HTTP tests are run with LIB_HTTPD_SSL in the environment, then
we access the test server as https://. This causes expect_askpass to
complain, because it tries to blindly match "http://" in the prompt
shown to the user. We can adjust this to use $HTTPD_PROTO, which is set
during the setup phase.
Note that this is enough for t5551 and t5559 to pass when run with
https, but there are similar problems in other scripts that will need to
be fixed before the whole suite can run with LIB_HTTPD_SSL.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We pick apart a curl trace, looking for "=> Send header:" and so on, and
matching against an expected set of requests and responses. We remove
"== Info" lines entirely. However, our parser is fooled when running the
test with LIB_HTTPD_SSL on Ubuntu 20.04 (as found in our linux-gcc CI
job), as curl hands us an "Info" buffer with a newline, and we get:
== Info: successfully set certificate verify locations:
== Info: CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
=> Send SSL data[...]
which results in the "CApath" line ending up in the cleaned-up output,
causing the test to fail.
Arguably the tracing code should detect this and put it on two separate
"== Info" lines. But this is actually a curl bug, fixed by their
80d73bcca (tls: provide the CApath verbose log on its own line,
2020-08-18). It's simpler to just work around it here.
Since we are using GIT_TRACE_CURL, every line should just start with one
of "<=", "==", or "=>", and we can throw away anything else. In fact, we
can just replace the pattern for deleting "*" lines. Those were from the
old GIT_CURL_VERBOSE output, but we switched over in 14e24114d9
(t5551-http-fetch-smart.sh: use the GIT_TRACE_CURL environment var,
2016-09-05).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After making a request, we check that it stored the expected cookies.
This depends on the protocol version, because the cookies we store
depend on the exact requests we made (and for ls-remote, v2 will always
hit /git-upload-pack to get the refs, whereas v0 is happy with the
initial ref advertisement).
As a result, hardly anybody runs this test, as you'd have to manually
set GIT_TEST_PROTOCOL_VERSION=0 to do so.
Let's teach it to handle both protocol versions. One way to do this
would be to make the expectation conditional on the protocol used. But
there's a simpler solution. The reason that v0 doesn't hit
/git-upload-pack is that ls-remote doesn't fetch any objects. If we
instead do a fetch (making sure there's an actual object to grab), then
both v0 and v2 will hit the same endpoints and set the same cookies.
Note that we do have to clean up our new tag here; otherwise it confuses
the later "clone 2,000 tags" test.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After making an HTTP request that should store cookies, we check that
the expected values are in the cookie file. We don't want to look at the
whole file, because it has noisy comments at the top that we shouldn't
depend on. But we strip out the interesting bits using "tail -3", which
is brittle. It requires us to put an extra blank line in our expected
output, and it would fail to notice any reordering or extra content in
the cookie file.
Instead, let's just grep for non-blank lines that are not comments,
which more directly describes what we're interested in.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We perform a clone and a fetch, and then check that we saw the expected
requests in Apache's access log. In the v2 protocol, there will be one
extra request to /git-upload-pack for each operation (since the initial
/info/refs probe is just used to upgrade the protocol).
As a result, this test is a noop unless the use of the v0 protocol is
forced. Which means that hardly anybody runs it, since you have to do so
manually.
Let's update it to handle v2 and run it always. We could do this by just
conditionally adding in the extra POST lines. But if we look at the
origin of the test in 7da4e2280c (test smart http fetch and push,
2009-10-30), the point is really just to make sure that the smart
git-upload-pack service was used at all. So rather than counting up the
individual requests, let's just make sure we saw each of the expected
types. This is a bit looser, but makes maintenance easier.
Since we're now matching with grep, we can also loosen the HTTP/1.1
match, which allows this test to pass when run with HTTP/2 via t5559.
That lets:
GIT_TEST_PROTOCOL_VERSION=0 ./t5559-http-fetch-smart-http2.sh
run to completion, which previously failed (and of course it works if
you use v2, as well).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After cloning an http repository, we check the curl trace to make sure
the expected requests were made. But since the expected trace was never
updated to handle v2, it is only run when you ask the test suite to run
in v0 mode (which hardly anybody does).
Let's update it to handle both protocols. This isn't too hard since v2
just sends an extra header and an extra request. So we can just annotate
those extra lines and strip them out for v0 (and drop the annotations
for v2). I didn't bother handling v1 here, as it's not really of
practical interest (it would drop the extra v2 request, but still have
the "git-protocol" lines).
There's a similar tweak needed at the end. Since we check the
"accept-encoding" value loosely, we grep for it rather than finding it
in the verbatim trace. This grep insists that there are exactly 2
matches, but of course in v2 with the extra request there are 3. We
could tweak the number, but it's simpler still to just check that we saw
at least one match. The verbatim check already confirmed how many
instances of the header we have; we're really just checking here that
"gzip" is in the value (it's possible, of course, that the headers could
have different values, but that seems like an unlikely bug).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the "clone http repository" test, we check the curl trace to make
sure the expected requests were made. This whole script was marked to
handle only the v0 protocol in d790ee1707 (tests: fix protocol version
for overspecifications, 2019-02-25). That makes sense, since v2 requires
an extra request, so tests as specific as this would fail unless
modified.
Later, in preparation for v2 becoming the default, this was tweaked by
8a1b0978ab (test: request GIT_TEST_PROTOCOL_VERSION=0 when appropriate,
2019-12-23). There we run the trace check only if the user has
explicitly asked to test protocol version 0. But it also forced the
clone itself to run with the v0 protocol.
This makes the check for "can we expect a v0 trace" silly; it will
always be v0. But much worse, it means that the clone we are testing is
not like the one that normal users would run. They would use the
defaults, which are now v2. And since this is supposed to be a basic
check of clone-over-http, we should do the same.
Let's fix this by dropping the extra v0 override. The test still passes
because the trace checking only kicks in if we asked to use v0
explicitly (this is the same as before; even though we were running a v0
clone, unless you specifically set GIT_TEST_PROTOCOL_VERSION=0, the
trace check was always skipped).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check that the curl trace of a clone has the lines we expect, but
this won't work when we run the test under t5559, because a few details
are different under HTTP/2 (but nobody noticed because it only happens
when you manually set GIT_TEST_PROTOCOL_VERSION to "0").
We can handle both HTTP protocols with a few tweaks:
- we'll drop the HTTP "101 Switching Protocols" response, as well as
various protocol upgrade headers. These details aren't interesting
to us. We just want to make sure the correct protocol was used (and
we do in the main request/response lines).
- successful HTTP/2 responses just say "200" and not "200 OK"; we can
normalize these
- replace HTTP/1.1 with a variable in the request/response lines. We
can use the existing $HTTP_PROTO for this, as it's already set to
"HTTP/2" when appropriate. We do need to tweak the fallback value to
"HTTP/1.1" to match what curl will write (prior to this patch, the
fallback value didn't matter at all; we only checked if it was the
literal string "HTTP/2").
Note that several lines still expect HTTP/1.1 unconditionally. The first
request does so because the client requests an upgrade during the
request. The POST request and response do so because you can't do an
upgrade if there is a request body. (This will all be different if we
trigger HTTP/2 via ALPN, but the tests aren't yet capable of that).
This is enough to let:
GIT_TEST_PROTOCOL_VERSION=0 ./t5559-http-fetch-smart-http2.sh
pass the "clone http repository" test (but there are some other failures
later on).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's a test in t5551 which checks the curl trace (after simplifying
it a bit). It doesn't work with HTTP/2, because in that case curl
outputs all of the headers in lower-case. Even though this test is run
with HTTP/2 by t5559, nobody has noticed because checking the trace only
happens if GIT_TEST_PROTOCOL_VERSION is manually set to "0".
Let's fix this by lower-casing all of the header names in the trace, and
then checking for those in our expected code (this is easier than making
HTTP/2 traces look like HTTP/1.1, since HTTP/1.1 uses title-casing).
Sadly, we can't quite do this in our existing sed script. This works if
you have GNU sed:
s/^\\([><]\\) \\([A-Za-z0-9-]*:\\)/\1 \L\2\E/
but \L is a GNU-ism, and I don't think there's a portable solution. We
could just "tr A-Z a-z" on the way in, of course, but that makes the
non-header parts harder to read (e.g., lowercase "post" requests). But
to paraphrase Baron Munchausen, I have learned from experience that a
modicum of Perl can be most efficacious.
Note that this doesn't quite get the test passing with t5559; there are
more fixes needed on top.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b0c4adcdd7 (remote-curl: send Accept-Language header to server,
2022-07-11) added tests to make sure the header is sent via HTTP.
However, it checks in two places:
1. In the expected trace output, we check verbatim for the header and
its value.
2. Afterwards, we grep for the header again in the trace file.
This (2) is probably cargo-culted from the earlier grep for
Accept-Encoding. It is needed for the encoding because we smudge the
value of that header when doing the verbatim check; see 1a53e692af
(remote-curl: accept all encodings supported by curl, 2018-05-22).
But we don't do so for the language header, so any problem that the
"grep" would catch in (2) would already have been caught by (1).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 9ee6bcd398 (t5541-http-push: add test for URLs with trailing
slash, 2010-04-08) added a test that clones a URL with a trailing slash,
and confirms that we don't send a doubled slash (like "$url//info/refs")
to the server.
But this test makes no sense in t5541, which is about pushing. It should
have been added in t5551. Let's move it there.
But putting it at the end is tricky, since it checks the entire contents
of the Apache access log. We could get around this by clearing the log
before our test. But there's an even simpler solution: just make sure no
doubled slashes appear in the log (fortunately, "http://" does not
appear in the log itself).
As a bonus, this also lets us drop the check for the v0 protocol (which
is otherwise necessary since v2 makes multiple requests, and
check_access_log insists on exactly matching the number of requests,
even though we don't care about that here).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a test which checks to see if a request to git-receive-pack was
made. Originally, it was checking the entire set of requests made in the
script so far, including clones, and thus it would break when run with
the v2 protocol (since that implies an extra request for fetches).
Since the previous commit, though, we are only checking the requests
made by a single push. And since there is no v2 push protocol, the test
now passes no matter what's in GIT_TEST_PROTOCOL_VERSION. We can just
run it all the time.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's a test in t5541 that confirms that "git push" makes two requests
(a GET to /info/refs, and a POST to /git-receive-pack). However, it's a
noop unless GIT_TEST_PROTOCOL_VERSION is set to "0", due to 8a1b0978ab
(test: request GIT_TEST_PROTOCOL_VERSION=0 when appropriate,
2019-12-23).
This means that almost nobody runs it. And indeed, it has been broken
since b0c4adcdd7 (remote-curl: send Accept-Language header to server,
2022-07-11). But the fault is not in that commit, but in how brittle the
test is. It runs after several operations have been performed, which
means that it expects to see the complete set of requests made so far in
the script. Commit b0c4adcdd7 added a new test, which means that the
"used receive-pack service" test must be updated, too.
Let's fix this by making the test less brittle. We'll move it higher in
the script, right after the first push has completed. And we'll clear
the access log right before doing the push, so we'll see only the
requests from that command.
This is technically testing less, in that we won't check that all of
those other requests also correctly used smart http. But there's no
particular reason think that if the first one did, the others wouldn't.
After this patch, running:
GIT_TEST_PROTOCOL_VERSION=0 ./t5541-http-push-smart.sh
passes, whereas it did not before.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some passwords have an expiry date known at generation. This may be
years away for a personal access token or hours for an OAuth access
token.
When multiple credential helpers are configured, `credential fill` tries
each helper in turn until it has a username and password, returning
early. If Git authentication succeeds, `credential approve`
stores the successful credential in all helpers. If authentication
fails, `credential reject` erases matching credentials in all helpers.
Helpers implement corresponding operations: get, store, erase.
The credential protocol has no expiry attribute, so helpers cannot
store expiry information. Even if a helper returned an improvised
expiry attribute, git credential discards unrecognised attributes
between operations and between helpers.
This is a particular issue when a storage helper and a
credential-generating helper are configured together:
[credential]
helper = storage # eg. cache or osxkeychain
helper = generate # eg. oauth
`credential approve` stores the generated credential in both helpers
without expiry information. Later `credential fill` may return an
expired credential from storage. There is no workaround, no matter how
clever the second helper. The user sees authentication fail (a retry
will succeed).
Introduce a password expiry attribute. In `credential fill`, ignore
expired passwords and continue to query subsequent helpers.
In the example above, `credential fill` ignores the expired password
and a fresh credential is generated. If authentication succeeds,
`credential approve` replaces the expired password in storage.
If authentication fails, the expired credential is erased by
`credential reject`. It is unnecessary but harmless for storage
helpers to self prune expired credentials.
Add support for the new attribute to credential-cache.
Eventually, I hope to see support in other popular storage helpers.
Example usage in a credential-generating helper
https://github.com/hickford/git-credential-oauth/pull/16
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Reviewed-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the run-hooks API to allow feeding data from the standard
input when running the hook script(s).
* ab/hook-api-with-stdin:
hook: support a --to-stdin=<path> option
sequencer: use the new hook API for the simpler "post-rewrite" call
hook API: support passing stdin to hooks, convert am's 'post-rewrite'
run-command: allow stdin for run_processes_parallel
run-command.c: remove dead assignment in while-loop
Leak fixes.
* ab/various-leak-fixes:
push: free_refs() the "local_refs" in set_refspecs()
push: refactor refspec_append_mapped() for subsequent leak-fix
receive-pack: release the linked "struct command *" list
grep API: plug memory leaks by freeing "header_list"
grep.c: refactor free_grep_patterns()
builtin/merge.c: free "&buf" on "Your local changes..." error
builtin/merge.c: use fixed strings, not "strbuf", fix leak
show-branch: free() allocated "head" before return
commit-graph: fix a parse_options_concat() leak
http-backend.c: fix cmd_main() memory leak, refactor reg{exec,free}()
http-backend.c: fix "dir" and "cmd_arg" leaks in cmd_main()
worktree: fix a trivial leak in prune_worktrees()
repack: fix leaks on error with "goto cleanup"
name-rev: don't xstrdup() an already dup'd string
various: add missing clear_pathspec(), fix leaks
clone: use free() instead of UNLEAK()
commit-graph: use free_commit_graph() instead of UNLEAK()
bundle.c: don't leak the "args" in the "struct child_process"
tests: mark tests as passing with SANITIZE=leak
As mentioned in 'git-range-diff.txt': "`git range-diff` also accepts the
regular diff options (see linkgit:git-diff[1])...", but '--abbrev' is not
in the "regular" scope.
In Git, the "abbrev" of an object may not be a fixed value in different
repositories, depending on the needs of the them(Linus mentioned in
e6c587c7 in 2016: "the Linux kernel project needs 11 to 12 hexdigits"
at that time ), that's why a user may want to display abbrev according
to a specified length.
Although a similar effect can be achieved through configuration (like:
git -c core.abbrev=<abbrev>), but based on ease of use (many users may not
know that the -c option can be specified) and the description in existing
document, supporting users to directly use '--abbrev', could be a good way.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
prior to 51243f9 (run-command API: don't fall back on online_cpus(),
2022-10-12) `git fetch --multiple --jobs=0` would choose some default amount
of jobs, similar to `git -c fetch.parallel=0 fetch --multiple`. While our
documentation only ever promised that `fetch.parallel` would fall back to a
"sensible default", it makes sense to do the same for `--jobs`. So fall back
to online_cpus() and not BUG() out.
This fixes https://github.com/git-for-windows/git/issues/4302
Reported-by: Drew Noakes <drnoakes@microsoft.com>
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
trace_repo_setup() of trace.c is called with the argument 'prefix' from
only one location, run_builtin of git.c, which sets 'prefix' to the return
value of setup_git_directory() or setup_git_directory_gently() (a wrapper
of the former).
Now that "prefix" is in startup_info there is no need for the parameter
of trace_repo_setup() because setup_git_directory() sets "startup_info->prefix"
to the same value it returns. It would be less confusing to use "prefix"
from startup_info instead of passing it as an argument.
Signed-off-by: Idriss Fekir <mcsm224@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
the `cd` followed the old style which wasn't consistent with the rest of
the test suite, so this commit makes it consistent with the current
style of the test suite for `cd` in subshell.
Signed-off-by: Ashutosh Pandey <ashutosh.pandeyhlr007@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can be useful to specify diff algorithms per file type. For example,
one may want to use the minimal diff algorithm for .json files, another
for .c files, etc.
The diff machinery already checks attributes for a diff driver. Teach
the diff driver parser a new type "algorithm" to look for in the
config, which will be used if a driver has been specified through the
attributes.
Enforce precedence of the diff algorithm by favoring the command line
option, then looking at the driver attributes & config combination, then
finally the diff.algorithm config.
To enforce precedence order, use a new `ignore_driver_algorithm` member
during options parsing to indicate the diff algorithm was set via command
line args.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subsequent commit will need the ability to tell if the diff algorithm
was set through the command line through setting a new member of
diff_options. While this logic can be added to the
diff_opt_diff_algorithm() callback, the `--minimal` and `--histogram`
options are handled via OPT_BIT without a callback.
Remedy this by consolidating the options parsing logic for --minimal and
--histogram into one callback. This way we can modify `diff_options` in
that function.
As an additional refactor, the logic that sets the diff algorithm in
diff_opt_diff_algorithm() can be refactored into a helper that will
allow multiple callsites to set the diff algorithm.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check that the argument to the "label" and "update-ref" commands is a
valid refname when the todo list is parsed rather than waiting until the
command is executed. This means that the user can deal with any errors
at the beginning of the rebase rather than having it stop halfway
through due to a typo in a label name. The "update-ref" command is
changed to reject single level refs as it is all to easy to type
"update-ref branch" which is incorrect rather than "update-ref
refs/heads/branch"
Note that it is not straight forward to check the arguments to "reset"
and "merge" commands as they may be any revision, not just a refname and
we do not have an equivalent of check_refname_format() for revisions.
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 647982bb71 (delta-islands: free island_marks and bitmaps, 2023-02-03)
we have introduced logic to free `island_marks` in order to reduce heap
memory usage in git-pack-objects(1). This commit is causing segfaults in
the case where this Git command does not load delta islands at all, e.g.
when reading object IDs from standard input. One such crash can be hit
when using repacking multi-pack-indices with delta islands enabled.
The root cause of this bug is that we unconditionally dereference the
`island_marks` variable even in the case where it is a `NULL` pointer,
which is fixed by making it conditional. Note that we still leave the
logic in place to set the pointer to `-1` to detect use-after-free bugs
even when there are no loaded island marks at all.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow users to specify the modification time of archive entries. The
new option --mtime uses approxidate() to parse a time specification and
overrides the default of using the current time for trees and the commit
time for tags and commits. It can be used to create a reproducible
archive for a tree, or to use a specific mtime without creating a commit
with GIT_COMMITTER_DATE set.
This implementation doesn't support the negated form of the new option,
i.e. --no-mtime is not accepted. It is not possible to have no mtime at
all. We could use the Unix epoch or revert to the default behavior, but
since negation is not necessary for the intended use it's left undecided
for now.
Requested-by: Raul E Rangel <rrangel@chromium.org>
Suggested-by: demerphq <demerphq@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a lower precedence configuration file (e.g. /etc/gitconfig)
defines format.attach in any way, there was no way to disable it in
a more specific configuration file (e.g. $HOME/.gitconfig).
Change the behaviour of setting it to an empty string. It used to
mean that the result is still a multipart message with only dashes
used as a multi-part separator, but now it resets the setting to
the default (which would be to give an inline patch, unless other
command line options are in effect).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The symlink setup in t0066 makes several directories with links, dir4
through dir6. But ever since dir5 was introduced in fa1da7d2ee
(dir-iterator: add flags parameter to dir_iterator_begin, 2019-07-10),
it has never actually been used. It was left over from an earlier
iteration of the patch which tried to handle recursive symlinks
specially, as seen in:
https://lore.kernel.org/git/20190502144829.4394-7-matheus.bernardino@usp.br/
It's not hurting any of the existing tests to be there, but the extra
setup is confusing to anybody trying to read and understand the tests.
Let's drop the extra directory, and we'll rename "dir6" to "dir5" so
nobody wonders whether the gap in naming is important.
Helped-by: Matheus Tavares Bernardino <matheus.tavb@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not test our http proxy functionality at all in the test suite, so
this is a pretty big blind spot. Let's at least add a basic check that
we can go through an authenticating proxy to perform a clone.
A few notes on the implementation:
- I'm using a single apache instance to proxy to itself. This seems to
work fine in practice, and we can check with a test that this rather
unusual setup is doing what we expect.
- I've put the proxy tests into their own script, and it's the only
one which loads the apache proxy config. If any platform can't
handle this (e.g., doesn't have the right modules), the start_httpd
step should fail and gracefully skip the rest of the script (but all
the other http tests in existing scripts will continue to run).
- I used a separate passwd file to make sure we don't ever get
confused between proxy and regular auth credentials. It's using the
antiquated crypt() format. This is a terrible choice security-wise
in the modern age, but it's what our existing passwd file uses, and
should be portable. It would probably be reasonable to switch both
of these to bcrypt, but we can do that in a separate patch.
- On the client side, we test two situations with credentials: when
they are present in the url, and when the username is present but we
prompt for the password. I think we should be able to handle the
case that _neither_ is present, but an HTTP 407 causes us to prompt
for them. However, this doesn't seem to work. That's either a bug,
or at the very least an opportunity for a feature, but I punted on
it for now. The point of this patch is just getting basic coverage,
and we can explore possible deficiencies later.
- this doesn't work with LIB_HTTPD_SSL. This probably would be
valuable to have, as https over an http proxy is totally different
(it uses CONNECT to tunnel the session). But adding in
mod_proxy_connect and some basic config didn't seem to work for me,
so I punted for now. Much of the rest of the test suite does not
currently work with LIB_HTTPD_SSL either, so we shouldn't be making
anything much worse here.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `FOLLOW_SYMLINKS` flag was added to the dir-iterator API in
fa1da7d2ee (dir-iterator: add flags parameter to dir_iterator_begin,
2019-07-10) in order to follow symbolic links while traversing through a
directory.
`FOLLOW_SYMLINKS` gained its first caller in ff7ccc8c9a (clone: use
dir-iterator to avoid explicit dir traversal, 2019-07-10), but it was
subsequently removed in 6f054f9fb3 (builtin/clone.c: disallow `--local`
clones with symlinks, 2022-07-28).
Since then, we've held on to the code for `DIR_ITERATOR_FOLLOW_SYMLINKS`
in the name of making minimally invasive changes during a security
embargo.
In fact, we even changed the dir-iterator API in bffc762f87
(dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS,
2023-01-24) without having any non-test callers of that flag.
Now that we're past those security embargo(s), let's finalize our
cleanup of the `DIR_ITERATOR_FOLLOW_SYMLINKS` code and remove its
implementation since there are no remaining callers.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test helper feeds 256kB of data at once to a single invocation
of the write(2) system call, which may be too much for some
platforms.
Call our xwrite() wrapper that knows to honor MAX_IO_SIZE limit and
cope with short writes due to EINTR instead, and die a bit more
loudly by calling die_errno() when xwrite() indicates an error.
Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation mistakenly said that the default format was
similar to RFC 2822 format and tried to specify it by enumerating
differences, which had two problems:
* There are some more differences from the 2822 format that are not
mentioned; worse yet
* The default format is not modeled after RFC 2822 format at all.
As can be seen in f80cd783 (date.c: add "show_date()" function.,
2005-05-06), it is a derivative of ctime(3) format.
Stop saying that it is similar to RFC 2822, and rewrite the
description to explain the format without requiring the reader to
know any other format.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Finally retire the scripted "git add -p/-i" implementation and have
everybody use the one reimplemented in C.
* ab/retire-scripted-add-p:
docs & comments: replace mentions of "git-add--interactive.perl"
add API: remove run_add_interactive() wrapper function
add: remove "add.interactive.useBuiltin" & Perl "git add--interactive"
Userdiff regexp update for Java language.
* ar/userdiff-java-update:
userdiff: support Java sealed classes
userdiff: support Java record types
userdiff: support Java type parameters
In-tree .gitattributes update to match the way we recommend our
users to mark a file as text.
* po/attributes-text:
.gitattributes: include `text` attribute for eol attributes
Plug leaks in sequencer subsystem and its users.
* ab/sequencer-unleak:
commit.c: free() revs.commit in get_fork_point()
builtin/rebase.c: free() "options.strategy_opts"
sequencer.c: always free() the "msgbuf" in do_pick_commit()
builtin/rebase.c: fix "options.onto_name" leak
builtin/revert.c: move free-ing of "revs" to replay_opts_release()
sequencer API users: fix get_replay_opts() leaks
sequencer.c: split up sequencer_remove_state()
rebase: use "cleanup" pattern in do_interactive_rebase()
The bundle-URI subsystem adds support for creation-token heuristics
to help incremental fetches.
* ds/bundle-uri-5:
bundle-uri: test missing bundles with heuristic
bundle-uri: store fetch.bundleCreationToken
fetch: fetch from an external bundle URI
bundle-uri: drop bundle.flag from design doc
clone: set fetch.bundleURI if appropriate
bundle-uri: download in creationToken order
bundle-uri: parse bundle.<id>.creationToken values
bundle-uri: parse bundle.heuristic=creationToken
t5558: add tests for creationToken heuristic
bundle: verify using check_connected()
bundle: test unbundling with incomplete history
In an environment where dynamically generated code is prohibited to
run (e.g. SELinux), failure to JIT pcre patterns is expected. Fall
back to interpreted execution in such a case.
* cb/grep-fallback-failing-jit:
grep: fall back to interpreter if JIT memory allocation fails
There are few things more frustrating when signing a commit fails than
reading a terse "error: gpg failed to sign the data" message followed by
the unsurprising "fatal: failed to write commit object" message.
In many cases where signing a commit or tag fails, `gpg` actually said
something helpful, on its stderr, and Git even consumed that, but then
keeps mum about it.
Teach Git to stop withholding that rather important information.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test case not only increases test coverage in setups without
working gpg, but also prepares for verifying that the error message of
`gpg.program` is shown upon failure.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To shorten a fully qualified ref (e.g., taking "refs/heads/foo" to just
"foo"), we munge the usual lookup rules ("refs/heads/%.*s", etc) to drop
the ".*" modifier (so "refs/heads/%s"), and then use sscanf() to match
that against the refname, pulling the "%s" content into a separate
buffer.
This has a few downsides:
- sscanf("%s") reportedly misbehaves on macOS with some input and
locale combinations, returning a partial or garbled string. See
this thread:
https://lore.kernel.org/git/CAGF3oAcCi+fG12j-1U0hcrWwkF5K_9WhOi6ZPHBzUUzfkrZDxA@mail.gmail.com/
- scanf's matching of "%s" is greedy. So the "refs/remotes/%s/HEAD"
rule would never pull "origin" out of "refs/remotes/origin/HEAD".
Instead it always produced "origin/HEAD", which is redundant with
the "refs/remotes/%s" rule.
- scanf in general is an error-prone interface. For example, scanning
for "%s" will copy bytes into a destination string, which must have
been correctly sized ahead of time to avoid a buffer overflow. In
this case, the code is OK (the buffer is pessimistically sized to
match the original string, which should give us a maximum). But in
general, we do not want to encourage people to use scanf at all.
So instead, let's note that our lookup rules are not arbitrary format
strings, but all contain exactly one "%.*s" placeholder. We already rely
on this, both for lookup (we feed the lookup format along with exactly
one int/ptr combo to snprintf, etc) and for shortening (we munge "%.*s"
to "%s", and then insist that sscanf() finds exactly one result).
We can parse this manually by just matching the bytes that occur before
and after the "%.*s" placeholder. While we have a few extra lines of
parsing code, the result is arguably simpler, as can skip the
preprocessing step and its tricky memory management entirely.
The in-code comments should explain the parsing strategy, but there's
one subtle change here. The original code allocated a single buffer, and
then overwrote it in each loop iteration, since that's the only option
sscanf() gives us. But our parser can actually return a ptr/len combo
for the matched string, which is all we need (since we just feed it back
to the lookup rules with "%.*s"), and then copy it only when returning
to the caller.
There are a few new tests here, all using symbolic-ref (the code can be
triggered in many ways, but symrefs are convenient in that we don't need
to create a real ref, which avoids any complications from the filesystem
munging the name):
- the first covers the real-world case which misbehaved on macOS.
Setting LC_ALL is required to trigger the problem there (since
otherwise our tests use LC_ALL=C), and hopefully is at worst simply
ignored on other systems (and doesn't cause libc to complain, etc,
on systems without that locale).
- the second covers the "origin/HEAD" case as discussed above, which
is now fixed
- the remainder are for "weird" cases that work both before and after
this patch, but would be easy to get wrong with off-by-one problems
in the parsing (and came out of discussions and earlier iterations
of the patch that did get them wrong).
- absent here are tests of boring, expected-to-work cases like
"refs/heads/foo", etc. Those are covered all over the test suite
both explicitly (for-each-ref's refname:short) and implicitly (in
the output of git-status, etc).
Reported-by: 孟子易 <mengziyi540841@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref_rev_parse_rules[] array is terminated with a NULL entry, and we
count it and store the result in the local nr_rules variable. But we
don't need to do so; since the array is a constant, we can compute its
size directly. The original code probably didn't do that because it was
written as part of for-each-ref, and saw the array only as a pointer. It
was migrated in 7c2b3029df (make get_short_ref a public function,
2009-04-07) and could have been updated then, but that subtlety was not
noticed.
We even have a constant that represents this value already, courtesy of
60650a48c0 (remote: make refspec follow the same disambiguation rule as
local refs, 2018-08-01), though again, nobody noticed at the time that
it could be used here, too.
The current count-up isn't a big deal, as we need to preprocess that
array anyway. But it will become more cumbersome as we refactor the
shortening code. So let's get rid of it and just use the constant
everywhere.
Note that there are two things here that aren't just simple text
replacements:
1. We also use nr_rules to see if a previous call has initialized the
static pre-processing variables. We can just use the scanf_fmts
pointer to do the same thing, as it is non-NULL only after we've
done that initialization.
2. If nr_rules is zero after we've counted it up, we bail from the
function. This code is unreachable, though, as the set of rules is
hard-coded and non-empty. And that becomes even more apparent now
that we are using the constant. So we can drop this conditional
completely (and ironically, the code would have the same output if
it _did_ trigger, as we'd simply skip the loop entirely and return
the whole refname).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We parse the shortened name "foo" out of the full refname
"refs/heads/foo", and then assign the result of strlen(short_name) to an
int, which may truncate or wrap to negative.
In practice, this should never happen, as it requires a 2GB refname. And
even somebody trying to do something malicious should at worst end up
with a confused answer (we use the size only to feed back as a
placeholder length to strbuf_addf() to see if there are any collisions
in the lookup rules).
And it may even be impossible to trigger this, as we parse the string
with sscanf(), and stdio formatting functions are not known for handling
large strings well. I didn't test, but I wouldn't be surprised if
sscanf() on many platforms simply reports no match here.
But even if it is not a problem in practice so far, it is worth fixing
for two reasons:
1. We'll shortly be replacing the sscanf() call with a real parser
which will handle arbitrary-sized strings.
2. Assigning strlen() to an int is an anti-pattern that requires
people to look twice when auditing for real overflow problems.
So we'll make this a size_t. Unfortunately we still have to cast to int
eventually for the strbuf_addf() call, but at least we can localize the
cast there, and check that it will be valid. I used our new cast helper
here, which will just bail completely. That should be OK, as anybody
with a 2GB refname is up to no good, but if we really wanted to, we
could detect it manually and just refuse to shorten the refname.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up around unused function parameters.
* jk/unused-post-2.39:
userdiff: mark unused parameter in internal callback
list-objects-filter: mark unused parameters in virtual functions
diff: mark unused parameters in callbacks
xdiff: mark unused parameter in xdl_call_hunk_func()
xdiff: drop unused parameter in def_ff()
ws: drop unused parameter from ws_blank_line()
list-objects: drop process_gitlink() function
blob: drop unused parts of parse_blob_buffer()
ls-refs: use repository parameter to iterate refs
Clarify how "checkout -b/-B" and "git branch [-f]" are similar but
different in the documentation.
* jc/doc-checkout-b:
checkout: document -b/-B to highlight the differences from "git branch"
Document that "branch -f <branch>" disables only the safety to
avoid recreating an existing branch.
* jc/doc-branch-update-checked-out-branch:
branch: document `-f` and linked worktree behaviour
"git ls-tree --format='%(path) %(path)' $tree $path" showed the
path three times, which has been corrected.
* rs/ls-tree-path-expansion-fix:
ls-tree: remove dead store and strbuf for quote_c_style()
ls-tree: fix expansion of repeated %(path)
Document ORIG_HEAD a bit more.
* pb/doc-orig-head:
git-rebase.txt: add a note about 'ORIG_HEAD' being overwritten
revisions.txt: be explicit about commands writing 'ORIG_HEAD'
git-merge.txt: mention 'ORIG_HEAD' in the Description
git-reset.txt: mention 'ORIG_HEAD' in the Description
git-cherry-pick.txt: do not use 'ORIG_HEAD' in example
The logic to see if we are using the "cone" mode by checking the
sparsity patterns has been tightened to avoid mistaking a pattern
that names a single file as specifying a cone.
* ws/single-file-cone:
dir: check for single file cone patterns
"git diff --relative" did not mix well with "git diff --ext-diff",
which has been corrected.
* jk/ext-diff-with-relative:
diff: drop "name" parameter from prepare_temp_file()
diff: clean up external-diff argv setup
diff: use filespec path to set up tempfiles for ext-diff
Fix to a small regression in 2.38 days.
* ab/bundle-wo-args:
bundle <cmd>: have usage_msg_opt() note the missing "<file>"
builtin/bundle.c: remove superfluous "newargc" variable
bundle: don't segfault on "git bundle <subcmd>"
Fix the sequence to fsync $GIT_DIR/packed-refs file that forgot to
flush its output to the disk..
* ps/fsync-refs-fix:
refs: fix corruption by not correctly syncing packed-refs to disk
When given a pattern that matches an empty string at the end of a
line, the code to parse the "git diff" line-ranges fell into an
infinite loop, which has been corrected.
* lk/line-range-parsing-fix:
line-range: fix infinite loop bug with '$' regex
Newer regex library macOS stopped enabling GNU-like enhanced BRE,
where '\(A\|B\)' works as alternation, unless explicitly asked with
the REG_ENHANCED flag. "git grep" now can be compiled to do so, to
retain the old behaviour.
* rs/use-enhanced-bre-on-macos:
use enhanced basic regular expressions on macOS
Deal with a few deprecation warning from cURL library.
* jk/curl-avoid-deprecated-api:
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
Redefining system functions for a few functions did not follow our
usual "implement git_foo() and #define foo(args) git_foo(args)"
pattern, which has broken build for some folks.
* jk/avoid-redef-system-functions-2.30:
git-compat-util: undefine system names before redeclaring them
git-compat-util: avoid redefining system function names
CI updates. We probably want a clean-up to move the long shell
script embedded in yaml file into a separate file, but that can
come later.
* cw/ci-whitespace:
ci (check-whitespace): move to actions/checkout@v3
ci (check-whitespace): add links to job output
ci (check-whitespace): suggest fixes for errors
Test the character classifiers added by 1c149ab2dd (ctype: support
iscntrl, ispunct, isxdigit and isprint, 2012-10-15) and 0fcec2ce54
(format-patch: make rfc2047 encoding more strict, 2012-10-18).
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test the character classifiers added by 43ccdf56ec (ctype: implement
islower/isupper macro, 2012-02-10).
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test the character classifier added by c2e9364a06 (cleanup: add
isascii(), 2009-03-07). It returns 1 for NUL as well, which requires
special treatment, as our string-based tester can't find it with
strcmp(3). Allow NUL to be given as the first character in a class
specification string. This has the downside of no longer supporting
the empty string, but that's OK since we are not interested in testing
character classes with no members.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We document that you can specify "refs" to ls-remote, but we don't
explain any further than that they are "matched" as patterns. Since this
can be interpreted in a lot of ways, let's clarify that they are
tail-matched globs.
Likewise, let's use the word "patterns" to refer to them consistently,
rather than "refs" (both here and in the quick "-h" help), and mention
more explicitly that only one pattern needs to be matched (though there
is also an example already that shows this in action).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are effectively three example commands and their output, but
they're smushed together with no extra whitespace. Let's add some blank
lines to make them more readable.
Likewise, the first example uses "./." to refer to the path of the
current repository, which is somewhat distracting. That may have been
necessary back in 2005 when it was added, but we can just say "." these
days.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use size_t to store the original length of the strbuf tree_len, as
that's the correct type.
Don't double the allocated size of the strbuf when adding a subdirectory
name. And the chance of the trailing slash fitting in the slack left by
strbuf_add() is very high, so stop pre-growing the strbuf at all.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Have the last users of "USE_THE_INDEX_COMPATIBILITY_MACROS" use the
underlying *_index() variants instead. Now all previous users of
"USE_THE_INDEX_COMPATIBILITY_MACROS" have been migrated away from the
wrapper macros, and if applicable to use the "USE_THE_INDEX_VARIABLE"
added in [1].
Let's leave the "index-compatibility.cocci" in place, even though it
won't be doing anything on "master". It will benefit any out-of-tree
code that need to use these compatibility macros. We can eventually
remove it.
1. bdafeae0b9 (cache.h & test-tool.h: add & use
"USE_THE_INDEX_VARIABLE", 2022-11-19)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the redundant update_main_cache_tree() function, and make its
users use cache_tree_update() instead.
The behavior of populating the "the_index.cache_tree" if it wasn't
present already was needed when this function was introduced in [1],
but it hasn't been needed since [2]; The "cache_tree_update()" will
now lazy-allocate, so there's no need for the wrapper.
1. 996277c520 (Refactor cache_tree_update idiom from commit,
2011-12-06)
2. fb0882648e (cache-tree: clean up cache_tree_update(), 2021-01-23)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a trivial rule for "write_cache_as_tree" to
"index-compatibility.cocci", and apply it. This was left out of the
rules added in 0e6550a2c6 (cocci: add a
index-compatibility.pending.cocci, 2022-11-19) because this
compatibility wrapper lived in "cache-tree.h", not "cache.h"
But it's like the other "USE_THE_INDEX_COMPATIBILITY_MACROS", so let's
migrate it too.
The replacement of "USE_THE_INDEX_COMPATIBILITY_MACROS" here with
"USE_THE_INDEX_VARIABLE" is a manual change on top, now that these
files only use "&the_index", and don't need any compatibility
macros (or functions).
The wrapping of some argument lists is likewise manual, as coccinelle
would otherwise give us overly long argument lists.
The reason for putting the "O" in the cocci rule on the "-" and "+"
lines is because I couldn't get correct whitespacing otherwise,
i.e. I'd end up with "oid,&the_index", not "oid, &the_index".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the rule added in [1] to change "cache_name_pos" to
"index_name_pos", which allows us to get rid of another
"USE_THE_INDEX_COMPATIBILITY_MACROS" macro.
The replacement of "USE_THE_INDEX_COMPATIBILITY_MACROS" here with
"USE_THE_INDEX_VARIABLE" is a manual change on top, now that these
files only use "&the_index", and don't need any compatibility
macros (or functions).
1. 0e6550a2c6 (cocci: add a index-compatibility.pending.cocci,
2022-11-19)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the "active_nr" part of "index-compatibility.pending.cocci",
which was left out in [1] due to an in-flight conflict. As of [2] the
topic we conflicted with has been merged to "master", so we can fully
apply this rule.
1. dc594180d9 (cocci & cache.h: apply variable section of "pending"
index-compatibility, 2022-11-19)
2. 9ea1378d04 (Merge branch 'ab/various-leak-fixes', 2022-12-14)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the "USE_THE_INDEX_COMPATIBILITY_MACROS" define with the
narrower "USE_THE_INDEX_VARIABLE". This could have been done in
07047d6829 (cocci: apply "pending" index-compatibility to some
"builtin/*.c", 2022-11-19), but I missed it at the time.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git pack-objects" learned to release delta-island bitmap data when
it is done using it, saving peak heap memory usage.
* ew/free-island-marks:
delta-islands: free island_marks and bitmaps
Fix use of CreateThread() API call made early in the windows
start-up code.
* sk/winansi-createthread-fix:
compat/winansi: check for errors of CreateThread() correctly
Remove support for MSys, which now lags way behind MSys2.
* hj/remove-msys-support:
mingw: remove msysGit/MSYS1 support
mingw: remove duplicate `USE_NED_ALLOCATOR` directive
Test update.
* jk/httpd-test-updates:
t/lib-httpd: increase ssl key size to 2048 bits
t/lib-httpd: drop SSLMutex config
t/lib-httpd: bump required apache version to 2.4
t/lib-httpd: bump required apache version to 2.2
Commit 7550424804 ("name-rev: include taggerdate in considering the best
name", 2016-04-22) introduced the idea of using taggerdate in the
criteria for selecting the best name. At the time, a certain commit in
linux.git -- namely, aed06b9cfcab -- was being named by name-rev as
v4.6-rc1~9^2~792
which, while correct, was very suboptimal. Some investigation found
that tweaking the MERGE_TRAVERSAL_WEIGHT to lower it could give
alternate answers such as
v3.13-rc7~9^2~14^2~42
or
v3.13~5^2~4^2~2^2~1^2~42
A manual solution involving looking at tagger dates came up with
v3.13-rc1~65^2^2~42
which is much nicer. That workaround was then implemented in name-rev.
Unfortunately, the taggerdate heuristic is causing bugs. I was pointed
to a case in a private repository where name-rev reports a name of the
form
v2022.10.02~86
when users expected to see one of the form
v2022.10.01~2
(I've modified the names and numbers a bit from the real testcase.) As
you can probably guess, v2022.10.01 was created after v2022.10.02 (by a
few hours), even though it pointed to an older commit. While the
condition is unusual even in the repository in question, it is not the
only problematic set of tags in that repository. The taggerdate logic
is causing problems.
Further, it turns out that this taggerdate heuristic isn't even helping
anymore. Due to the fix to naming logic in 3656f84278 ("name-rev:
prefer shorter names over following merges", 2021-12-04), we get
improved names without the taggerdate heuristic. For the original
commit of interest in linux.git, a modern git without the taggerdate
heuristic still provides the same optimal answer of interest, namely:
v3.13-rc1~65^2^2~42
So, the taggerdate is no longer providing benefit, and it is causing
problems. Simply get rid of it.
However, note that "taggerdate" as a variable is used to store things
besides a taggerdate these days. Ever since commit ef1e74065c
("name-rev: favor describing with tags and use committer date to
tiebreak", 2017-03-29), this has been used to store committer dates and
there it is used as a fallback tiebreaker (as opposed to a primary
criteria overriding effective distance calculations). We do not want to
remove that fallback tiebreaker, so not all instances of "taggerdate"
are removed in this change.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new kind of class was added in Java 17 -- sealed classes.[1] This
feature includes several new keywords that may appear in a declaration
of a class. New modifiers before name of the class: "sealed" and
"non-sealed", and a clause after name of the class marked by keyword
"permits".
The current set of regular expressions in userdiff.c already allows the
modifier "sealed" and the "permits" clause, but not the modifier
"non-sealed", which is the first hyphenated keyword in Java.[2] Allow
hyphen in the words that precede the name of type to match the
"non-sealed" modifier.
In new input file "java-sealed" for the test t4018-diff-funcname.sh, use
a Java code comment for the marker "RIGHT". This workaround is needed,
because the name of the sealed class appears on the line of code that
has the "ChangeMe" marker.
[1] Detailed description in "JEP 409: Sealed Classes"
https://openjdk.org/jeps/409
[2] "JEP draft: Keyword Management for the Java Language"
https://openjdk.org/jeps/8223002
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new kind of class was added in Java 16 -- records.[1] The syntax of
records is similar to regular classes with one important distinction:
the name of the record class is followed by a mandatory list of
components. The list is enclosed in parentheses, it may be empty, and
it may immediately follow the name of the class or type parameters, if
any, with or without separating whitespace. For example:
public record Example(int i, String s) {
}
public record WithTypeParameters<A, B>(A a, B b, String s) {
}
record SpaceBeforeComponents (String comp1, int comp2) {
}
Support records in the builtin userdiff pattern for Java. Add "record"
to the alternatives of keywords for kinds of class.
Allowing matching various possibilities for the type parameters and/or
list of the components of a record has already been covered by the
preceding patch.
[1] detailed description is available in "JEP 395: Records"
https://openjdk.org/jeps/395
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A class or interface in Java can have type parameters following the name
in the declared type, surrounded by angle brackets (paired less than and
greater than signs).[2] The type parameters -- `A` and `B` in the
examples -- may follow the class name immediately:
public class ParameterizedClass<A, B> {
}
or may be separated by whitespace:
public class SpaceBeforeTypeParameters <A, B> {
}
A part of the builtin userdiff pattern for Java matches declarations of
classes, enums, and interfaces. The regular expression requires at
least one whitespace character after the name of the declared type.
This disallows matching for opening angle bracket of type parameters
immediately after the name of the type. Mandatory whitespace after the
name of the type also disallows using the pattern in repositories with a
fairly common code style that puts braces for the body of a class on
separate lines:
class WithLineBreakBeforeOpeningBrace
{
}
Support matching Java code in more diverse code styles and declarations
of classes and interfaces with type parameters immediately following the
name of the type in the builtin userdiff pattern for Java. Do so by
just matching anything until the end of the line after the keywords for
the kind of type being declared.
[1] Since Java 5 released in 2004.
[2] Detailed description is available in the Java Language
Specification, sections "Type Variables" and "Parameterized Types":
https://docs.oracle.com/javase/specs/jls/se17/html/jls-4.html#jls-4.4
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the invocation of the "post-rewrite" hook added in
795160457d (sequencer (rebase -i): run the post-rewrite hook, if
needed, 2017-01-02) to use the new hook API.
This leaves the more complex "post-rewrite" invocation added in
a87a6f3c98 (commit: move post-rewrite code to libgit, 2017-11-17)
here in sequencer.c unconverted.
Here we can pass in a file's via the "in" file descriptor, in that
case we don't have a file, but will need to write_in_full() to an "in"
provide by the API. Support for that will be added to the hook API in
the future, but we're not there yet.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the invocation of the 'post-rewrite' hook run by 'git am' to
use the hook.h library. To do this we need to add a "path_to_stdin"
member to "struct run_hooks_opt".
In our API this is supported by asking for a file path, rather
than by reading stdin. Reading directly from stdin would involve caching
the entire stdin (to memory or to disk) once the hook API is made to
support "jobs" larger than 1, along with support for executing N hooks
at a time (i.e. the upcoming config-based hooks).
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While it makes sense not to inherit stdin from the parent process to
avoid deadlocking, it's not necessary to completely ban stdin to
children. An informed user should be able to configure stdin safely. By
setting `some_child.process.no_stdin=1` before calling `get_next_task()`
we provide a reasonable default behavior but enable users to set up
stdin streaming for themselves during the callback.
`some_child.process.stdout_to_stderr`, however, remains unmodifiable by
`get_next_task()` - the rest of the run_processes_parallel() API depends
on child output in stderr.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove code that's been unused since it was added in
c553c72eed (run-command: add an asynchronous parallel child
processor, 2015-12-15).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow "scalar" to warn but continue when its periodic maintenance
feature cannot be enabled.
* ds/scalar-ignore-cron-error:
scalar: only warn when background maintenance fails
t921*: test scalar behavior starting maintenance
t: allow 'scalar' in test_must_fail
Adjust "git request-pull" to strip embedded signature from signed
tags to notice non-PGP signatures.
* gm/request-pull-with-non-pgp-signed-tags:
request-pull: filter out SSH/X.509 tag signatures
In a remote with multiple configured URLs, `git remote -v` shows the
correct url that fetch uses. However, `git config remote.<remote>.url`
returns the last defined url instead. This discrepancy can cause
confusion for users with a remote defined as such, since any url
defined after the first essentially acts as a pushurl.
Add documentation to clarify how fetch interacts with multiple urls
and how push interacts with multiple pushurls and urls.
Add test affirming interaction between fetch and multiple urls.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function was removed in ecec57b3c9 (config: respect includes in
protected config, 2022-10-13), but its prototype was left here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak that's been with us since d96855ff51 (merge-base:
teach "--fork-point" mode, 2013-10-23).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the "strategy_opts" member was added in ba1905a5fe (builtin
rebase: add support for custom merge strategies, 2018-09-04) the
corresponding free() for it at the end of cmd_rebase() wasn't added,
let's do so.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In [1] the strbuf_release(&msgbuf) was moved into this
do_pick_commit(), but didn't take into account the case of [2], where
we'd return before the strbuf_release(&msgbuf).
Then when the "fixup" support was added in [3] this leak got worse, as
in this error case we added another place where we'd "return" before
reaching the strbuf_release().
This changes the behavior so that we'll call
update_abort_safety_file() in these cases where we'd previously
"return", but as noted in [4] "update_abort_safety_file() is a no-op
when rebasing and you're changing code that is only run when
rebasing.". Here "no-op" refers to the early return in
update_abort_safety_file() if git_path_seq_dir() doesn't exist.
1. 452202c74b (sequencer: stop releasing the strbuf in
write_message(), 2016-10-21)
2. f241ff0d0a (prepare the builtins for a libified merge_recursive(),
2016-07-26)
3. 6e98de72c0 (sequencer (rebase -i): add support for the 'fixup' and
'squash' commands, 2017-01-02)
4. https://lore.kernel.org/git/bcace50b-a4c3-c468-94a3-4fe0c62b3671@dunelm.org.uk/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to the existing "squash_onto_name" added in [1] we need to
free() the xstrdup()'d "options.onto.name" added for "--keep-base" in
[2]..
1. 9dba809a69 (builtin rebase: support --root, 2018-09-04)
2. 414d924beb (rebase: teach rebase --keep-base, 2019-08-27)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In [1] and [2] I added the code being moved here to cmd_revert() and
cmd_cherry_pick(), now that we've got a "replay_opts_release()" for
the "struct replay_opts" it should know how to free these "revs",
rather than having these users reach into the struct to free its
individual members.
1. d1ec656d68 (cherry-pick: free "struct replay_opts" members,
2022-11-08)
2. fd74ac95ac (revert: free "struct replay_opts" members, 2022-07-01)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the replay_opts_release() function added in the preceding commit
non-static, and use it for freeing the "struct replay_opts"
constructed for "rebase" and "revert".
To safely call our new replay_opts_release() we'll need to stop
calling it in sequencer_remove_state(), and instead call it where we
allocate the "struct replay_opts" itself.
This is because in e.g. do_interactive_rebase() we construct a "struct
replay_opts" with "get_replay_opts()", and then call
"complete_action()". If we get far enough in that function without
encountering errors we'll call "pick_commits()" which (indirectly)
calls sequencer_remove_state() at the end.
But if we encounter errors anywhere along the way we'd punt out early,
and not free() the memory we allocated. Remembering whether we
previously called sequencer_remove_state() would be a hassle.
Using a FREE_AND_NULL() pattern would also work, as it would be safe
to call replay_opts_release() repeatedly. But let's fix this properly
instead, by having the owner of the data free() it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split off the free()-ing in sequencer_remove_state() into a utility
function, which will be adjusted and called independent of the other
code in sequencer_remove_state() in a subsequent commit.
The only functional change here is changing the "int" to a "size_t",
which is the correct type, as "xopts_nr" is a "size_t".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use a "goto cleanup" pattern in do_interactive_rebase(). This
eliminates some duplicated free() code added in 53bbcfbde7 (rebase
-i: implement the main part of interactive rebase as a builtin,
2018-09-27), and sets us up for a subsequent commit which'll make
further use of the "cleanup" label.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak that's been with us since this code was added in
ca02465b41 (push: use remote.$name.push as a refmap, 2013-12-03).
The "remote = remote_get(...)" added in the same commit would seem to
leak based only on the context here, but that function is a wrapper
for sticking the remotes we fetch into "the_repository->remote_state".
See fd3cb0501e (remote: move static variables into per-repository
struct, 2021-11-17) for the addition of code in repository.c that
free's the "remote" allocated here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The set_refspecs() caller of refspec_append_mapped() (added in [1])
left open the question[2] of whether the "remote" we lazily fetch
might be NULL in the "[...]uniquely name our ref?" case, as
remote_get() can return NULL.
If we got past the "[...]uniquely name our ref?" case we'd have
already segfaulted if we tried to dereference it as
"remote->push.nr". In these cases the config mechanism & previous
remote validation will have bailed out earlier.
Let's refactor this code to clarify that, we'll now BUG() out if we
can't get a "remote", and will no longer retrieve it for these common
cases where we don't need it.
1. ca02465b41 (push: use remote.$name.push as a refmap, 2013-12-03)
2. https://lore.kernel.org/git/c0c07b89-7eaf-21cd-748e-e14ea57f09fd@web.de/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak that's been with us since this code was introduced
in [1]. Later in [2] we started using FLEX_ALLOC_MEM() to allocate the
"struct command *".
1. 575f497456 (Add first cut at "git-receive-pack", 2005-06-29)
2. eb1af2df0b (git-receive-pack: start parsing ref update commands,
2005-06-29)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the "header_list" struct member was added in [1], freeing this
field was neglected. Fix that now, so that commands like
./git -P log -1 --color=always --author=A origin/master
will run leak-free.
1. 80235ba79e ("log --author=me --grep=it" should find intersection,
not union, 2010-01-17)
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the free_grep_patterns() function to split out the freeing of
the "struct grep_pat" it contains. Right now we're only freeing the
"pattern_list", but we should be freeing another member of the same
type, which we'll do in the subsequent commit.
Let's also replace the "return" if we don't have an
"opt->pattern_expression" with a conditional call of
free_pattern_expr().
Before db84376f98 (grep.c: remove "extended" in favor of
"pattern_expression", fix segfault, 2022-10-11) the pattern here was:
if (!x)
return;
free_pattern_expr(y);
While at it, instead of:
if (!x)
return;
free_pattern_expr(x);
Let's instead do:
if (x)
free_pattern_expr(x);
This will make it easier to free additional members from
free_grep_patterns() in the future.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug a memory leak introduced in [1], since that change didn't follow
the "goto done" pattern introduced in [2] we'd leak the "&buf" memory.
1. e4cdfe84a0 (merge: abort if index does not match HEAD for trivial
merges, 2022-07-23)
2. d5a35c114a (Copy resolve_ref() return value for longer use,
2011-11-13)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow-up 465028e0e2 (merge: add missing strbuf_release(),
2021-10-07) and address the "msg" memory leak in this block. We could
free "&msg" before the "goto done" here, but even better is to avoid
allocating it in the first place.
By repeating the "Fast-forward" string here we can avoid using a
"struct strbuf" altogether.
Suggested-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop leaking the "head" variable, which we've been leaking since it
was originally added in [1], and in its current form since [2]
1. ed378ec7e8 (Make ref resolution saner, 2006-09-11)
2. d9e557a320 (show-branch: store resolved head in heap buffer,
2017-02-14).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the parse_options_concat() was added to this file in
84e4484f12 (commit-graph: use parse_options_concat(), 2021-08-23) we
wouldn't free() it if we returned early in these cases.
Since "result" is 0 by default we can "goto cleanup" in both cases,
and only need to set "result" if write_commit_graph_reachable() fails.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak that's been with us ever since
2f4038ab33 (Git-aware CGI to provide dumb HTTP transport,
2009-10-30). In this case we're not calling regerror() after a failed
regexec(), and don't otherwise use "re" afterwards.
We can therefore simplify this code by calling regfree() right after
the regexec(). An alternative fix would be to add a regfree() to both
the "return" and "break" path in this for-loop.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Free the "dir" variable after we're done with it. Before
917adc0360 (http-backend: add GIT_PROJECT_ROOT environment var,
2009-10-30) there was no leak here, as we'd get it via getenv(), but
since 917adc0360 we've xstrdup()'d it (or the equivalent), so we need
to free() it.
We also need to free the "cmd_arg" variable, which has been leaked
ever since it was added in 2f4038ab33 (Git-aware CGI to provide dumb
HTTP transport, 2009-10-30).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We were leaking both the "struct strbuf" in prune_worktrees(), as well
as the "path" we got from should_prune_worktree(). Since these were
the only two uses of the "struct string_list" let's change it to a
"DUP" and push these to it with "string_list_append_nodup()".
For the string_list_append_nodup() we could also string_list_append()
the main_path.buf, and then strbuf_release(&main_path) right away. But
doing it this way avoids an allocation, as we already have the "struct
strbuf" prepared for appending to "kept".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cmd_repack() when we hit an error, replace "return ret" with "goto
cleanup" to ensure we free the necessary data structures.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "add_to_tip_table()" is called with a non-zero
"shorten_unambiguous" we always return an xstrdup()'d string, which
we'd then xstrdup() again, leaking memory. See [1] and [2] for how
this leak came about.
We could xstrdup() only if "shorten_unambiguous" wasn't true, but
let's instead inline this code, so that information on whether we need
to xstrdup() is contained within add_to_tip_table().
1. 98c5c4ad01 (name-rev: allow to specify a subpath for --refs
option, 2013-06-18)
2. b23e0b9353 (name-rev: allow converting the exact object name at
the tip of a ref, 2013-07-07)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix memory leaks resulting from a missing clear_pathspec().
- archive.c: Plug a leak in the "struct archiver_args", and
clear_pathspec() the "pathspec" member that the "parse_pathspec_arg()"
call in this function populates.
- builtin/clean.c: Fix a memory leak that's been with us since
893d839970 (clean: convert to use parse_pathspec, 2013-07-14).
- builtin/reset.c: Add clear_pathspec() calls to cmd_reset(),
including to the codepaths where we'd return early.
- builtin/stash.c: Call clear_pathspec() on the pathspec initialized
in push_stash().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change an UNLEAK() added in 0c4542738e (clone: free or UNLEAK further
pointers when finished, 2021-03-14) to use a "to_free" pattern
instead. In this case the "repo" can be either this absolute_pathdup()
value, or in the "else if" branch seen in the context the the
"argv[0]" argument to "main()".
We can only free() the value in the former case, hence the "to_free"
pattern.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 0bfb48e672 (builtin/commit-graph.c: UNLEAK variables, 2018-10-03)
this was made to UNLEAK(), but we can just as easily invoke the
free_commit_graph() function added in c3756d5b7f (commit-graph: add
free_commit_graph, 2018-07-11) instead.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a leak that's been here since 7366096de9 (bundle API: change
"flags" to be "extra_index_pack_args", 2021-09-05). If we can't verify
the bundle, we didn't call child_process_clear() to clear the "args".
But rather than adding an additional child_process_clear() call, let's
verify the bundle before we start preparing the process we're going to
spawn. If we fail to verify, we don't need to push anything to the
child_process "args".
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the "ab/various-leak-fixes" topic was merged in [1] only t6021
would fail if the tests were run in the
"GIT_TEST_PASSING_SANITIZE_LEAK=check" mode, i.e. to check whether we
marked all leak-free tests with "TEST_PASSES_SANITIZE_LEAK=true".
Since then we've had various tests starting to pass under
SANITIZE=leak. Let's mark those as passing, this is when they started
to pass, narrowed down with "git bisect":
- t5317-pack-objects-filter-objects.sh: In
faebba436e (list-objects-filter: plug pattern_list leak, 2022-12-01).
- t3210-pack-refs.sh, t5613-info-alternate.sh,
t7403-submodule-sync.sh: In 189e97bc4b (diff: remove parseopts member
from struct diff_options, 2022-12-01).
- t1408-packed-refs.sh: In ab91f6b7c4 (Merge branch
'rs/diff-parseopts', 2022-12-19).
- t0023-crlf-am.sh, t4152-am-subjects.sh, t4254-am-corrupt.sh,
t4256-am-format-flowed.sh, t4257-am-interactive.sh,
t5403-post-checkout-hook.sh: In a658e881c1 (am: don't pass strvec to
apply_parse_options(), 2022-12-13)
- t1301-shared-repo.sh, t1302-repo-version.sh: In b07a819c05 (reflog:
clear leftovers in reflog_expiry_cleanup(), 2022-12-13).
- t1304-default-acl.sh, t1410-reflog.sh,
t5330-no-lazy-fetch-with-commit-graph.sh, t5502-quickfetch.sh,
t5604-clone-reference.sh, t6014-rev-list-all.sh,
t7701-repack-unpack-unreachable.sh: In b0c61be320 (Merge branch
'rs/reflog-expiry-cleanup', 2022-12-26)
- t3800-mktag.sh, t5302-pack-index.sh, t5306-pack-nobase.sh,
t5573-pull-verify-signatures.sh, t7612-merge-verify-signatures.sh: In
69bbbe484b (hash-object: use fsck for object checks, 2023-01-18).
- t1451-fsck-buffer.sh: In 8e4309038f (fsck: do not assume
NUL-termination of buffers, 2023-01-19).
- t6501-freshen-objects.sh: In abf2bb895b (Merge branch
'jk/hash-object-fsck', 2023-01-30)
1. 9ea1378d04 (Merge branch 'ab/various-leak-fixes', 2022-12-14)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we've removed "git-add--interactive.perl" let's replace
mentions of it with "add-interactive.c". In the case of the "git add"
documentation we were using it as an example filename, so the mention
wasn't wrong, but using a dead file is slightly confusing.
The "borrowed" comment here likewise isn't wrong, but let's mention
the successor file instead. In the case of pathspec.c the implied TODO
item should refer to the current code (and the comment may not even be
current, I didn't check).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the Perl "git-add--interactive" has gone away in the
preceding commit we don't need to pass along our desire for a mode as
a string, and can instead directly use the "enum add_p_mode", see
d2a233cb8b (built-in add -p: prepare for patch modes other than
"stage", 2019-12-21) for its introduction.
As a result of that the run_add_interactive() function would become a
trivial wrapper which would only run run_add_i() if a 0 (or now,
"NULL") "patch_mode" was provided. Let's instead remove it, and have
the one callsite that wanted the "NULL" case (interactive_add())
handle it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since [1] first released with Git v2.37.0 the built-in version of "add
-i" has been the default. That built-in implementation was added in
[2], first released with Git v2.25.0.
At this point enough time has passed to allow for finding any
remaining bugs in this new implementation, so let's remove the
fallback code.
As with similar migrations for "stash"[3] and "rebase"[4] we're
keeping a mention of "add.interactive.useBuiltin" in the
documentation, but adding a warning() to notify any outstanding users
that the built-in is now the default. As with [5] and [6] we should
follow-up in the future and eventually remove that warning.
1. 0527ccb1b5 (add -i: default to the built-in implementation,
2021-11-30)
2. f83dff60a7 (Start to implement a built-in version of `git add
--interactive`, 2019-11-13)
3. 8a2cd3f512 (stash: remove the stash.useBuiltin setting,
2020-03-03)
4. d03ebd411c (rebase: remove the rebase.useBuiltin setting,
2019-03-18)
5. deeaf5ee07 (stash: remove documentation for `stash.useBuiltin`,
2022-01-27)
6. 9bcde4d531 (rebase: remove transitory rebase.useBuiltin setting &
env, 2021-03-23)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call strcspn(3) to find the length of a string terminated by NUL, NL or
slash instead of open-coding it. Adopt its return type, size_t, to
support strings of arbitrary length. Use that type in callers as well
for variables and function parameters that receive the return value.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Support names of any length in base_name_compare() and df_name_compare()
by using size_t for their length parameters. They pass the length on to
memcmp(3), which also takes it as a size_t.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To match present day coding guiding codelines let's:
- use <<-EOF, so we can indent all lines to the
the same level for this test
- use <<\EOF to notify the reader that no interpolation
is expected in the body
Signed-off-by: Kostya Farber <kostya.farber@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit ec14d4ecb5 (builtin.h: take over documentation from
api-builtin.txt, 2017-08-02) deleted api-builtin.txt and moved the
contents into builtin.h, but new-command.txt still references the old
file.
Signed-off-by: Wes Lord <weslord@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The standard advice for text file eol endings in the .gitattributes file
was updated in e28eae3184 (gitattributes: Document the unified "auto"
handling, 2016-08-26) with a recent clarification in 8c591dbfce (docs:
correct documentation about eol attribute, 2022-01-11), with a follow
up comment by the original author in [1] confirming the use of the eol
attribute in conjunction with the text attribute.
Update Git's .gitattributes file to reflect our own advice.
[1] https://lore.kernel.org/git/?q=%3C20220216115239.uo2ie3flaqo3nf2d%40tb-raspi4%3E.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.35:
Git 2.35.7
Git 2.34.7
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
Git 2.33.7
Git 2.32.6
Git 2.31.7
Git 2.30.8
apply: fix writing behind newly created symbolic links
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
* maint-2.34:
Git 2.34.7
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
Git 2.33.7
Git 2.32.6
Git 2.31.7
Git 2.30.8
apply: fix writing behind newly created symbolic links
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
* maint-2.33:
Git 2.33.7
Git 2.32.6
Git 2.31.7
Git 2.30.8
apply: fix writing behind newly created symbolic links
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
Deal with a few deprecation warning from cURL library.
* jk/curl-avoid-deprecated-api:
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
deprecated in curl 7.85.0, and using it generate compiler warnings as of
curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
can't just do so unilaterally, as it was only introduced less than a
year ago in 7.85.0.
Until that version becomes ubiquitous, we have to either disable the
deprecation warning or conditionally use the "STR" variant on newer
versions of libcurl. This patch switches to the new variant, which is
nice for two reasons:
- we don't have to worry that silencing curl's deprecation warnings
might cause us to miss other more useful ones
- we'd eventually want to move to the new variant anyway, so this gets
us set up (albeit with some extra ugly boilerplate for the
conditional)
There are a lot of ways to split up the two cases. One way would be to
abstract the storage type (strbuf versus a long), how to append
(strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
and so on. But the resulting code looks pretty magical:
GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
if (...http is allowed...)
GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);
and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
actual code.
On the other end of the spectrum, we could just implement two separate
functions, one that handles a string list and one that handles bits. But
then we end up repeating our list of protocols (http, https, ftp, ftp).
This patch takes the middle ground. The run-time code is always there to
handle both types, and we just choose which one to feed to curl.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.
But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros). But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.
Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).
Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.
Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The two options do exactly the same thing, but the latter has been
deprecated and in recent versions of curl may produce a compiler
warning. Since the UPLOAD form is available everywhere (it was
introduced in the year 2000 by curl 7.1), we can just switch to it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
* maint-2.32:
Git 2.32.6
Git 2.31.7
Git 2.30.8
apply: fix writing behind newly created symbolic links
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
* maint-2.31:
Git 2.31.7
Git 2.30.8
apply: fix writing behind newly created symbolic links
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
* maint-2.30:
Git 2.30.8
apply: fix writing behind newly created symbolic links
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
Fix a vulnerability (CVE-2023-23946) that allows crafted input to trick
`git apply` into writing files outside of the working tree.
* ps/apply-beyond-symlink:
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Resolve a security vulnerability (CVE-2023-22490) where `clone_local()`
is used in conjunction with non-local transports, leading to arbitrary
path exfiltration.
* tb/clone-local-symlinks:
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
On my mirror of linux.git forkgroup with 780 islands, this saves
nearly 4G of heap memory in pack-objects. This savings only
benefits delta island users of pack bitmaps, as the process
would otherwise be exiting anyways.
However, there's probably not many delta island users, but the
majority of delta island users would also be pack bitmaps users.
Signed-off-by: Eric Wong <e@80x24.org>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Doc update to ls-files.
* en/ls-files-doc-update:
ls-files: guide folks to --exclude-standard over other --exclude* options
ls-files: clarify descriptions of status tags for -t
ls-files: clarify descriptions of file selection options
ls-files: add missing documentation for --resolve-undo option
"git rebase" often ignored incompatible options instead of
complaining, which has been corrected.
* en/rebase-incompatible-opts:
rebase: provide better error message for apply options vs. merge config
rebase: put rebase_options initialization in single place
rebase: fix formatting of rebase --reapply-cherry-picks option in docs
rebase: clarify the OPT_CMDMODE incompatibilities
rebase: add coverage of other incompatible options
rebase: fix incompatiblity checks for --[no-]reapply-cherry-picks
rebase: fix docs about incompatibilities with --root
rebase: remove --allow-empty-message from incompatible opts
rebase: flag --apply and --merge as incompatible
rebase: mark --update-refs as requiring the merge backend
Improve the error message given when private key is not loaded in
the ssh agent in the codepath to sign with an ssh key.
* as/ssh-signing-improve-key-missing-error:
ssh signing: better error message when key not in agent
When writing files git-apply(1) initially makes sure that none of the
files it is about to create are behind a symlink:
```
$ git init repo
Initialized empty Git repository in /tmp/repo/.git/
$ cd repo/
$ ln -s dir symlink
$ git apply - <<EOF
diff --git a/symlink/file b/symlink/file
new file mode 100644
index 0000000..e69de29
EOF
error: affected file 'symlink/file' is beyond a symbolic link
```
This safety mechanism is crucial to ensure that we don't write outside
of the repository's working directory. It can be fooled though when the
patch that is being applied creates the symbolic link in the first
place, which can lead to writing files in arbitrary locations.
Fix this by checking whether the path we're about to create is
beyond a symlink or not. Tightening these checks like this should be
fine as we already have these precautions in Git as explained
above. Ideally, we should update the check we do up-front before
starting to reflect the computed changes to the working tree so that
we catch this case as well, but as part of embargoed security work,
adding an equivalent check just before we try to write out a file
should serve us well as a reasonable first step.
Digging back into history shows that this vulnerability has existed
since at least Git v2.9.0. As Git v2.8.0 and older don't build on my
system anymore I cannot tell whether older versions are affected, as
well.
Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
MSys has long fallen behind MSYS2 in features like Unicode or
x86_64 support or even security bug fixes, and is therefore no
longer used by anyone in the Git developer community. The Git for
Windows project itself started switching from MSys to MSYS2 early
in 2015, i.e. about eight years ago. Let's drop supporting MSys as
a development platform.
Signed-off-by: Harshil-Jani <harshiljani2002@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
nedalloc was added to fix the slowness of memory allocator. Here
specifically for the MSys2 build there seems to be a duplication of
USE_NED_ALLOCATOR directive. So this patch intends to remove the
duplicate USE_NED_ALLOCATOR and keeping it only into the MSys2 config
section so it still uses the nedalloc.
Signed-off-by: Harshil-Jani <harshiljani2002@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return value for failed thread creation is NULL,
not INVALID_HANDLE_VALUE, unlike other Windows API functions.
Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent versions of openssl will refuse to work with 1024-bit RSA keys,
as they are considered insecure. I didn't track down the exact version
in which the defaults were tightened, but the Debian-package openssl 3.0
on my system yields:
$ LIB_HTTPD_SSL=1 ./t5551-http-fetch-smart.sh -v -i
[...]
SSL Library Error: error:0A00018F:SSL routines::ee key too small
1..0 # SKIP web server setup failed
This could probably be overcome with configuration, but that's likely
to be a headache (especially if it requires touching /etc/openssl).
Let's just pick a key size that's less outrageously out of date.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The SSL config enabled by setting LIB_HTTPD_SSL does not work with
Apache versions greater than 2.2, as more recent versions complain about
the SSLMutex directive. According to
https://httpd.apache.org/docs/current/upgrading.html:
Directives AcceptMutex, LockFile, RewriteLock, SSLMutex,
SSLStaplingMutex, and WatchdogMutexPath have been replaced with a
single Mutex directive. You will need to evaluate any use of these
removed directives in your 2.2 configuration to determine if they can
just be deleted or will need to be replaced using Mutex.
Deleting this line will just use the system default, which seems
sensible. The original came as part of faa4bc35a0 (http-push: add
regression tests, 2008-02-27), but no specific reason is given there (or
on the mailing list) for its presence.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apache 2.4 has been out since early 2012, almost 11 years. And its
predecessor, 2.2, has been out of support since its last release in
2017, over 5 years ago. The last mention on the mailing list was from
around the same time, in this thread:
https://lore.kernel.org/git/20171231023234.21215-1-tmz@pobox.com/
We can probably assume that 2.4 is available everywhere. And the stakes
are fairly low, as the worst case is that such a platform would skip the
http tests.
This lets us clean up a few minor version checks in the config file, but
also revert f1f2b45be0 (tests: adjust the configuration for Apache 2.2,
2016-05-09). Its technique isn't _too_ bad, but certainly required a bit
more explanation than the 2.4 version it replaced. I manually confirmed
that the test in t5551 still behaves as expected (if you replace
"cadabra" with "foo", the server correctly rejects the request).
It will also help future patches which will no longer have to deal with
conditional config for this old version.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apache 2.2 was released in 2005, almost 18 years ago. We can probably
assume that people are running a version at least that old (and the
stakes for removing it are fairly low, as the worst case is that they
would not run the http tests against their ancient version).
Dropping support for the older versions cleans up the config file a
little, and will also enable us to bump the required version further
(with more cleanups) in a future patch.
Note that the file actually checks for version 2.1. In apache's
versioning scheme, odd numbered versions are for development and even
numbers are for stable releases. So 2.1 and 2.2 are effectively the same
from our perspective.
Older versions would just fail to start, which would generally cause us
to skip the tests. However, we do have version detection code in
lib-httpd.sh which produces a nicer error message, so let's update that,
too. I didn't bother handling the case of "3.0", etc. Apache has been on
2.x for 21 years, with no signs of bumping the major version. And if
they eventually do, I suspect there will be enough breaking changes that
we'd need to update more than just the numeric version check. We can
worry about that hypothetical when it happens.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/gitformat-index.txt describes the "mode" as 32 bits, but
only documents 16 bits. Document the missing 16 bits and specify that
'unused' bits must be zero.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Under Linux systems with SELinux's 'deny_execmem' or PaX's MPROTECT
enabled, the allocation of PCRE2's JIT rwx memory may be prohibited,
making pcre2_jit_compile() fail with PCRE2_ERROR_NOMEMORY (-48):
[user@fedora git]$ git grep -c PCRE2_JIT
grep.c:1
[user@fedora git]$ # Enable SELinux's W^X policy
[user@fedora git]$ sudo semanage boolean -m -1 deny_execmem
[user@fedora git]$ # JIT memory allocation fails, breaking 'git grep'
[user@fedora git]$ git grep -c PCRE2_JIT
fatal: Couldn't JIT the PCRE2 pattern 'PCRE2_JIT', got '-48'
Instead of failing hard in this case and making 'git grep' unusable on
such systems, simply fall back to interpreter mode, leading to a much
better user experience.
As having a functional PCRE2 JIT compiler is a legitimate use case for
performance reasons, we'll only do the fallback if the supposedly
available JIT is found to be non-functional by attempting to JIT compile
a very simple pattern. If this fails, JIT is deemed to be non-functional
and we do the interpreter fallback. For all other cases, i.e. the simple
pattern can be compiled but the user provided cannot, we fail hard as we
do now as the reason for the failure must be the pattern itself. To aid
users in helping themselves change the error message to include a hint
about the '(*NO_JIT)' prefix. Also clip the pattern at 64 characters to
ensure the hint will be seen by the user and not internally truncated by
the die() function.
Cc: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The creationToken heuristic uses a different mechanism for downloading
bundles from the "standard" approach. Specifically: it uses a concrete
order based on the creationToken values and attempts to download as few
bundles as possible. It also modifies local config to store a value for
future fetches to avoid downloading bundles, if possible.
However, if any of the individual bundles has a failed download, then
the logic for the ordering comes into question. It is important to avoid
infinite loops, assigning invalid creation token values in config, but
also to be opportunistic as possible when downloading as many bundles as
seem appropriate.
These tests were used to inform the implementation of
fetch_bundles_by_token() in bundle-uri.c, but are being added
independently here to allow focusing on faulty downloads. There may be
more cases that could be added that result in modifications to
fetch_bundles_by_token() as interesting data shapes reveal themselves in
real scenarios.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a bundle list specifies the "creationToken" heuristic, the Git
client downloads the list and then starts downloading bundles in
descending creationToken order. This process stops as soon as all
downloaded bundles can be applied to the repository (because all
required commits are present in the repository or in the downloaded
bundles).
When checking the same bundle list twice, this strategy requires
downloading the bundle with the maximum creationToken again, which is
wasteful. The creationToken heuristic promises that the client will not
have a use for that bundle if its creationToken value is at most the
previous creationToken value.
To prevent these wasteful downloads, create a fetch.bundleCreationToken
config setting that the Git client sets after downloading bundles. This
value allows skipping that maximum bundle download when this config
value is the same value (or larger).
To test that this works correctly, we can insert some "duplicate"
fetches into existing tests and demonstrate that only the bundle list is
downloaded.
The previous logic for downloading bundles by creationToken worked even
if the bundle list was empty, but now we have logic that depends on the
first entry of the list. Terminate early in the (non-sensical) case of
an empty bundle list.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a user specifies a URI via 'git clone --bundle-uri', that URI may
be a bundle list that advertises a 'bundle.heuristic' value. In that
case, the Git client stores a 'fetch.bundleURI' config value storing
that URI.
Teach 'git fetch' to check for this config value and download bundles
from that URI before fetching from the Git remote(s). Likely, the bundle
provider has configured a heuristic (such as "creationToken") that will
allow the Git client to download only a portion of the bundles before
continuing the fetch.
Since this URI is completely independent of the remote server, we want
to be sure that we connect to the bundle URI before creating a
connection to the Git remote. We do not want to hold a stateful
connection for too long if we can avoid it.
To test that this works correctly, extend the previous tests that set
'fetch.bundleURI' to do follow-up fetches. The bundle list is updated
incrementally at each phase to demonstrate that the heuristic avoids
downloading older bundles. This includes the middle fetch downloading
the objects in bundle-3.bundle from the Git remote, and therefore not
needing that bundle in the third fetch.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Implementation Plan section lists a 'bundle.flag' option that is not
documented anywhere else. What is documented elsewhere in the document
and implemented by previous changes is the 'bundle.heuristic' config
key. For now, a heuristic is required to indicate that a bundle list is
organized for use during 'git fetch', and it is also sufficient for all
existing designs.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bundle providers may organize their bundle lists in a way that is
intended to improve incremental fetches, not just initial clones.
However, they do need to state that they have organized with that in
mind, or else the client will not expect to save time by downloading
bundles after the initial clone. This is done by specifying a
bundle.heuristic value.
There are two types of bundle lists: those at a static URI and those
that are advertised from a Git remote over protocol v2.
The new fetch.bundleURI config value applies for static bundle URIs that
are not advertised over protocol v2. If the user specifies a static URI
via 'git clone --bundle-uri', then Git can set this config as a reminder
for future 'git fetch' operations to check the bundle list before
connecting to the remote(s).
For lists provided over protocol v2, we will want to take a different
approach and create a property of the remote itself by creating a
remote.<id>.* type config key. That is not implemented in this change.
Later changes will update 'git fetch' to consume this option.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The creationToken heuristic provides an ordering on the bundles
advertised by a bundle list. Teach the Git client to download bundles
differently when this heuristic is advertised.
The bundles in the list are sorted by their advertised creationToken
values, then downloaded in decreasing order. This avoids the previous
strategy of downloading bundles in an arbitrary order and attempting
to apply them (likely failing in the case of required commits) until
discovering the order through attempted unbundling.
During a fresh 'git clone', it may make sense to download the bundles in
increasing order, since that would prevent the need to attempt
unbundling a bundle with required commits that do not exist in our empty
object store. The cost of testing an unbundle is quite low, and instead
the chosen order is optimizing for a future bundle download during a
'git fetch' operation with a non-empty object store.
Since the Git client continues fetching from the Git remote after
downloading and unbundling bundles, the client's object store can be
ahead of the bundle provider's object store. The next time it attempts
to download from the bundle list, it makes most sense to download only
the most-recent bundles until all tips successfully unbundle. The
strategy implemented here provides that short-circuit where the client
downloads a minimal set of bundles.
However, we are not satisfied by the naive approach of downloading
bundles until one successfully unbundles, expecting the earlier bundles
to successfully unbundle now. The example repository in t5558
demonstrates this well:
---------------- bundle-4
4
/ \
----|---|------- bundle-3
| |
| 3
| |
----|---|------- bundle-2
| |
2 |
| |
----|---|------- bundle-1
\ /
1
|
(previous commits)
In this repository, if we already have the objects for bundle-1 and then
try to fetch from this list, the naive approach will fail. bundle-4
requires both bundle-3 and bundle-2, though bundle-3 will successfully
unbundle without bundle-2. Thus, the algorithm needs to keep this in
mind.
A later implementation detail will store the maximum creationToken seen
during such a bundle download, and the client will avoid downloading a
bundle unless its creationToken is strictly greater than that stored
value. For now, if the client seeks to download from an identical
bundle list since its previous download, it will download the
most-recent bundle then stop since its required commits are already in
the object store.
Add tests that exercise this behavior, but we will expand upon these
tests when incremental downloads during 'git fetch' make use of
creationToken values.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous change taught Git to parse the bundle.heuristic value,
especially when its value is "creationToken". Now, teach Git to parse
the bundle.<id>.creationToken values on each bundle in a bundle list.
Before implementing any logic based on creationToken values for the
creationToken heuristic, parse and print these values for testing
purposes.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bundle.heuristic value communicates that the bundle list is
organized to make use of the bundle.<id>.creationToken values that may
be provided in the bundle list. Those values will create a total order
on the bundles, allowing the Git client to download them in a specific
order and even remember previously-downloaded bundles by storing the
maximum creation token value.
Before implementing any logic that parses or uses the
bundle.<id>.creationToken values, teach Git to parse the
bundle.heuristic value from a bundle list. We can use 'test-tool
bundle-uri' to print the heuristic value and verify that the parsing
works correctly.
As an extra precaution, create the internal 'heuristics' array to be a
list of (enum, string) pairs so we can iterate through the array entries
carefully, regardless of the enum values.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As documented in the bundle URI design doc in 2da14fad8f (docs:
document bundle URI standard, 2022-08-09), the 'creationToken' member of
a bundle URI allows a bundle provider to specify a total order on the
bundles.
Future changes will allow the Git client to understand these members and
modify its behavior around downloading the bundles in that order. In the
meantime, create tests that add creation tokens to the bundle list. For
now, the Git client correctly ignores these unknown keys.
Create a new test helper function, test_remote_https_urls, which filters
GIT_TRACE2_EVENT output to extract a list of URLs passed to
git-remote-https child processes. This can be used to verify the order
of these requests as we implement the creationToken heuristic. For now,
we need to sort the actual output since the current client does not have
a well-defined order that it applies to the bundles.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git verifies a bundle to see if it is safe for unbundling, it first
looks to see if the prerequisite commits are in the object store. This
is an easy way to "fail fast" but it is not a sufficient check for
updating refs that guarantee closure under reachability. There could
still be issues if those commits are not reachable from the repository's
references. The repository only has guarantees that its object store is
closed under reachability for the objects that are reachable from
references.
Thus, the code in verify_bundle() has previously had the additional
check that all prerequisite commits are reachable from repository
references. This is done via a revision walk from all references,
stopping only if all prerequisite commits are discovered or all commits
are walked. This uses a custom walk to verify_bundle().
This check is more strict than what Git applies to fetched pack-files.
In the fetch case, Git guarantees that the new references are closed
under reachability by walking from the new references until walking
commits that are reachable from repository refs. This is done through
the well-used check_connected() method.
To better align with the restrictions required by 'git fetch',
reimplement this check in verify_bundle() to use check_connected(). This
also simplifies the code significantly.
The previous change added a test that verified the behavior of 'git
bundle verify' and 'git bundle unbundle' in this case, and the error
messages looked like this:
error: Could not read <missing-commit>
fatal: Failed to traverse parents of commit <extant-commit>
However, by changing the revision walk slightly within check_connected()
and using its quiet mode, we can omit those messages. Instead, we get
only this message, tailored to describing the current state of the
repository:
error: some prerequisite commits exist in the object store,
but are not connected to the repository's history
(Line break added here for the commit message formatting, only.)
While this message does not include any object IDs, there is no
guarantee that those object IDs would help the user diagnose what is
going on, as they could be separated from the prerequisite commits by
some distance. At minimum, this situation describes the situation in a
more informative way than the previous error messages.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When verifying a bundle, Git checks first that all prerequisite commits
exist in the object store, then adds an additional check: those
prerequisite commits must be reachable from references in the
repository.
This check is stronger than what is checked for refs being added during
'git fetch', which simply guarantees that the new refs have a complete
history up to the point where it intersects with the current reachable
history.
However, we also do not have any tests that check the behavior under
this condition. Create a test that demonstrates its behavior.
In order to construct a broken history, perform a shallow clone of a
repository with a linear history, but whose default branch ('base') has
a single commit, so dropping the shallow markers leaves a complete
history from that reference. However, the 'tip' reference adds a
shallow commit whose parent is missing in the cloned repository. Trying
to unbundle a bundle with the 'tip' as a prerequisite will succeed past
the object store check and move into the reachability check.
The two errors that are reported are of this form:
error: Could not read <missing-commit>
fatal: Failed to traverse parents of commit <present-commit>
These messages are not particularly helpful for the person running the
unbundle command, but they do prevent the command from succeeding.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git hash-object" now checks that the resulting object is well
formed with the same code as "git fsck".
* jk/hash-object-fsck:
fsck: do not assume NUL-termination of buffers
hash-object: use fsck for object checks
fsck: provide a function to fsck buffer without object struct
t: use hash-object --literally when created malformed objects
t7030: stop using invalid tag name
t1006: stop using 0-padded timestamps
t1007: modernize malformed object tests
Clarify column-padding operators in the pretty format string.
* po/pretty-format-columns-doc:
doc: pretty-formats note wide char limitations, and add tests
doc: pretty-formats describe use of ellipsis in truncation
doc: pretty-formats document negative column alignments
doc: pretty-formats: delineate `%<|(` parameter values
doc: pretty-formats: separate parameters from placeholders
Clarify how "checkout -b/-B" and "git branch [-f]" are similar but
different in the documentation.
* jc/doc-checkout-b:
checkout: document -b/-B to highlight the differences from "git branch"
A user reported issues with 'scalar clone' and 'scalar register' when
working in an environment that had locked down the ability to run
'crontab' or 'systemctl' in that those commands registered as _failures_
instead of opportunistically reporting a success with just a warning
about background maintenance.
As a workaround, they can use GIT_TEST_MAINT_SCHEDULER to fake a
successful background maintenance, but this is not a viable strategy for
long-term.
Update 'scalar register' and 'scalar clone' to no longer fail by
modifying register_dir() to only warn when toggle_maintenance(1) fails.
Since background maintenance is a "nice to have" and not a requirement
for a working repository, it is best to move this from hard error to
gentle warning.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A user recently reported issues with 'scalar register' and 'scalar
clone' in that they failed when the system had permissions locked down
so both 'crontab' and 'systemctl' commands failed when trying to enable
background maintenance.
This hard error is undesirable, but let's create tests that demonstrate
this behavior before modiying the behavior. We can use
GIT_TEST_MAINT_SCHEDULER to guarantee failure and check the exit code
and error message.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will enable scalar tests to use the test_must_fail helper, when
necessary.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch <group>", when "<group>" of remotes lists the same
remote twice, unnecessarily failed when parallel fetching was
enabled, which has been corrected.
* cw/fetch-remote-group-with-duplication:
fetch: fix duplicate remote parallel fetch bug
Document that "branch -f <branch>" disables only the safety to
avoid recreating an existing branch.
* jc/doc-branch-update-checked-out-branch:
branch: document `-f` and linked worktree behaviour
"git send-email -v 3" used to be expanded to "git send-email
--validate 3" when the user meant to pass them down to
"format-patch", which has been corrected.
* km/send-email-with-v-reroll-count:
send-email: relay '-v N' to format-patch
"grep -P" learned to use Unicode Character Property to grok
character classes when processing \b and \w etc.
* cb/grep-pcre-ucp:
grep: correctly identify utf-8 characters with \{b,w} in -P
The instructions in attr.h describing what functions to call to check
attributes is missing the index as the first argument to
git_check_attr(), as well as tree_oid as the second argument.
When 7a400a2c (attr: remove an implicit dependency on the_index,
2018-08-13) started passing an index_state instance to git_check_attr(),
it forgot to update the API documentation in
Documentation/technical/api-gitattributes.txt. Later, 3a1b3415
(attr: move doc to attr.h, 2019-11-17) moved the API documentation to
attr.h as a comment, but still left out the index_state as an argument.
In 47cfc9b (attr: add flag `--source` to work with tree-ish 2023-01-14)
added tree_oid as an optional parameter but was not added to the docs in
attr.h
Fix this to make the documentation in the comment consistent with the
actual function signature.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git request-pull filters PGP signatures out of the tag message, but not
SSH or X.509 signatures.
Signed-off-by: Gwyneth Morgan <gwymor@tilde.club>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When config which selects the merge backend (currently,
rebase.autosquash=true or rebase.updateRefs=true) conflicts with other
options on the command line (such as --whitespace=fix), make the error
message specifically call out the config option and specify how to
override that config option on the command line.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit ce5238a690 ("rebase --keep-base: imply --reapply-cherry-picks",
2022-10-17) accidentally added some blank lines that cause extra
paragraphs about --reapply-cherry-picks to be considered not part of
the documentation of that option. Remove the blank lines to make it
clear we are still discussing --reapply-cherry-picks.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--edit-todo was documented as being incompatible with any of the options
for the apply backend. However, it is also incompatible with any of the
options for the merge backend, and is incompatible with any options that
are not backend specific as well. The same can be said for --continue,
--skip, --abort, --quit, etc.
This is already somewhat implicitly covered by the synopsis, but since
"[<options>]" in the first two variants are vague it might be easy to
miss this. That might not be a big deal, but since the rebase manpage
has to spend so much verbiage about incompatibility of options, making
a separate section for these options that are incompatible with
everything else seems clearer. Do that, and remove the needless
inclusion of --edit-todo in the explicit incompatibility list.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-rebase manual noted several sets of incompatible options, but
we were missing tests for a few of these. Further, we were missing
code checks for one of these, which could result in command line
options being silently ignored.
Also, note that adding a check for autosquash means that using
--whitespace=fix together with the config setting rebase.autosquash=true
will trigger an error. A subsequent commit will improve the error
message.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--[no-]reapply-cherry-picks was traditionally only supported by the
sequencer. Support was added for the apply backend, when --keep-base is
also specified, in commit ce5238a690 ("rebase --keep-base: imply
--reapply-cherry-picks", 2022-10-17). Make the code error out when
--[no-]reapply-cherry-picks is specified AND the apply backend is used
AND --keep-base is not specified. Also, clarify a number of comments
surrounding the interaction of these flags.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 5dacd4abdd ("git-rebase.txt: document incompatible options",
2018-06-25), I added notes about incompatibilities between options for
the apply and merge backends. Unfortunately, I inverted the condition
when --root was incompatible with the apply backend. Fix the
documentation, and add a testcase that verifies the documentation
matches the code.
While at it, the documentation for --root also tried to cover some of
the backend differences between the apply and merge backends in relation
to reapplying cherry picks. The information:
* assumed that the apply backend was the default (it isn't anymore)
* was written before --reapply-cherry-picks became an option
* was written before the detailed information on backend differences
All of these factors make the sentence under --root about reapplying
cherry picks contradict information that is now available elsewhere in
the manual, and the other references are correct. So just strike this
sentence.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--allow-empty-message was turned into a no-op and even documented
as such; the flag is simply ignored. Since the flag is ignored, it
shouldn't be documented as being incompatible with other flags.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, we flagged options which implied --apply as being
incompatible with options which implied --merge. But if both options
were given explicitly, then we didn't flag the incompatibility. The
same is true with --apply and --interactive. Add the check, and add
some testcases to verify these are also caught.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--update-refs is built in terms of the sequencer, which requires the
merge backend. It was already marked as incompatible with the apply
backend in the git-rebase manual, but the code didn't check for this
incompatibility and warn the user. Check and error now.
While at it, fix a typo in t3422...and fix some misleading wording
(most options which used to be am-specific have since been implemented
in the merge backend as well).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When signing a commit with a SSH key, with the private key missing from
ssh-agent, a confusing error message is produced:
error: Load key
"/var/folders/t5/cscwwl_n3n1_8_5j_00x_3t40000gn/T//.git_signing_key_tmpkArSj7":
invalid format? fatal: failed to write commit object
The temporary file .git_signing_key_tmpkArSj7 created by git contains a
valid *public* key. The error message comes from `ssh-keygen -Y sign' and
is caused by a fallback mechanism in ssh-keygen whereby it tries to
interpret .git_signing_key_tmpkArSj7 as a *private* key if it can't find in
the agent [1]. A fix is scheduled to be released in OpenSSH 9.1. All that
needs to be done is to pass an additional backward-compatible option -U to
'ssh-keygen -Y sign' call. With '-U', ssh-keygen always interprets the file
as public key and expects to find the private key in the agent.
As a result, when the private key is missing from the agent, a more accurate
error message gets produced:
error: Couldn't find key in agent
[1] https://bugzilla.mindrot.org/show_bug.cgi?id=3429
Signed-off-by: Adam Szkoda <adaszko@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the dir_iterator API, we first stat(2) the base path, and
then use that as a starting point to enumerate the directory's contents.
If the directory contains symbolic links, we will immediately die() upon
encountering them without the `FOLLOW_SYMLINKS` flag. The same is not
true when resolving the top-level directory, though.
As explained in a previous commit, this oversight in 6f054f9fb3
(builtin/clone.c: disallow `--local` clones with symlinks, 2022-07-28)
can be used as an attack vector to include arbitrary files on a victim's
filesystem from outside of the repository.
Prevent resolving top-level symlinks unless the FOLLOW_SYMLINKS flag is
given, which will cause clones of a repository with a symlink'd
"$GIT_DIR/objects" directory to fail.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, t5619 demonstrates an issue where two calls to
`get_repo_path()` could trick Git into using its local clone mechanism
in conjunction with a non-local transport.
That sequence is:
- the starting state is that the local path https:/example.com/foo is a
symlink that points to ../../../.git/modules/foo. So it's dangling.
- get_repo_path() sees that no such path exists (because it's
dangling), and thus we do not canonicalize it into an absolute path
- because we're using --separate-git-dir, we create .git/modules/foo.
Now our symlink is no longer dangling!
- we pass the url to transport_get(), which sees it as an https URL.
- we call get_repo_path() again, on the url. This second call was
introduced by f38aa83f9a (use local cloning if insteadOf makes a
local URL, 2014-07-17). The idea is that we want to pull the url
fresh from the remote.c API, because it will apply any aliases.
And of course now it sees that there is a local file, which is a
mismatch with the transport we already selected.
The issue in the above sequence is calling `transport_get()` before
deciding whether or not the repository is indeed local, and not passing
in an absolute path if it is local.
This is reminiscent of a similar bug report in [1], where it was
suggested to perform the `insteadOf` lookup earlier. Taking that
approach may not be as straightforward, since the intent is to store the
original URL in the config, but to actually fetch from the insteadOf
one, so conflating the two early on is a non-starter.
Note: we pass the path returned by `get_repo_path(remote->url[0])`,
which should be the same as `repo_name` (aside from any `insteadOf`
rewrites).
We *could* pass `absolute_pathdup()` of the same argument, which
86521acaca (Bring local clone's origin URL in line with that of a remote
clone, 2008-09-01) indicates may differ depending on the presence of
".git/" for a non-bare repo. That matters for forming relative submodule
paths, but doesn't matter for the second call, since we're just feeding
it to the transport code, which is fine either way.
[1]: https://lore.kernel.org/git/CAMoD=Bi41mB3QRn3JdZL-FGHs4w3C2jGpnJB-CqSndO7FMtfzA@mail.gmail.com/
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning a repository, Git must determine (a) what transport
mechanism to use, and (b) whether or not the clone is local.
Since f38aa83f9a (use local cloning if insteadOf makes a local URL,
2014-07-17), the latter check happens after the remote has been
initialized, and references the remote's URL instead of the local path.
This is done to make it possible for a `url.<base>.insteadOf` rule to
convert a remote URL into a local one, in which case the `clone_local()`
mechanism should be used.
However, with a specially crafted repository, Git can be tricked into
using a non-local transport while still setting `is_local` to "1" and
using the `clone_local()` optimization. The below test case
demonstrates such an instance, and shows that it can be used to include
arbitrary (known) paths in the working copy of a cloned repository on a
victim's machine[^1], even if local file clones are forbidden by
`protocol.file.allow`.
This happens in a few parts:
1. We first call `get_repo_path()` to see if the remote is a local
path. If it is, we replace the repo name with its absolute path.
2. We then call `transport_get()` on the repo name and decide how to
access it. If it was turned into an absolute path in the previous
step, then we should always treat it like a file.
3. We use `get_repo_path()` again, and set `is_local` as appropriate.
But it's already too late to rewrite the repo name as an absolute
path, since we've already fed it to the transport code.
The attack works by including a submodule whose URL corresponds to a
path on disk. In the below example, the repository "sub" is reachable
via the dumb HTTP protocol at (something like):
http://127.0.0.1:NNNN/dumb/sub.git
However, the path "http:/127.0.0.1:NNNN/dumb" (that is, a top-level
directory called "http:", then nested directories "127.0.0.1:NNNN", and
"dumb") exists within the repository, too.
To determine this, it first picks the appropriate transport, which is
dumb HTTP. It then uses the remote's URL in order to determine whether
the repository exists locally on disk. However, the malicious repository
also contains an embedded stub repository which is the target of a
symbolic link at the local path corresponding to the "sub" repository on
disk (i.e., there is a symbolic link at "http:/127.0.0.1/dumb/sub.git",
pointing to the stub repository via ".git/modules/sub/../../../repo").
This stub repository fools Git into thinking that a local repository
exists at that URL and thus can be cloned locally. The affected call is
in `get_repo_path()`, which in turn calls `get_repo_path_1()`, which
locates a valid repository at that target.
This then causes Git to set the `is_local` variable to "1", and in turn
instructs Git to clone the repository using its local clone optimization
via the `clone_local()` function.
The exploit comes into play because the stub repository's top-level
"$GIT_DIR/objects" directory is a symbolic link which can point to an
arbitrary path on the victim's machine. `clone_local()` resolves the
top-level "objects" directory through a `stat(2)` call, meaning that we
read through the symbolic link and copy or hardlink the directory
contents at the destination of the link.
In other words, we can get steps (1) and (3) to disagree by leveraging
the dangling symlink to pick a non-local transport in the first step,
and then set is_local to "1" in the third step when cloning with
`--separate-git-dir`, which makes the symlink non-dangling.
This can result in data-exfiltration on the victim's machine when
sensitive data is at a known path (e.g., "/home/$USER/.ssh").
The appropriate fix is two-fold:
- Resolve the transport later on (to avoid using the local
clone optimization with a non-local transport).
- Avoid reading through the top-level "objects" directory when
(correctly) using the clone_local() optimization.
This patch merely demonstrates the issue. The following two patches will
implement each part of the above fix, respectively.
[^1]: Provided that any target directory does not contain symbolic
links, in which case the changes from 6f054f9fb3 (builtin/clone.c:
disallow `--local` clones with symlinks, 2022-07-28) will abort the
clone.
Reported-by: yvvdwf <yvvdwf@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pthread emulation on Win32 leaked thread handle when a thread is
joined.
* sk/win32-close-handle-upon-pthread-join:
win32: close handles of threads that have been joined
win32: prepare pthread.c for change by formatting
Newer regex library macOS stopped enabling GNU-like enhanced BRE,
where '\(A\|B\)' works as alternation, unless explicitly asked with
the REG_ENHANCED flag. "git grep" now can be compiled to do so, to
retain the old behaviour.
* rs/use-enhanced-bre-on-macos:
use enhanced basic regular expressions on macOS
"git check-attr" learned to take an optional tree-ish to read the
.gitattributes file from.
* kn/attr-from-tree:
attr: add flag `--source` to work with tree-ish
t0003: move setup for `--all` into new block
"git ls-tree --format='%(path) %(path)' $tree $path" showed the
path three times, which has been corrected.
* rs/ls-tree-path-expansion-fix:
ls-tree: remove dead store and strbuf for quote_c_style()
ls-tree: fix expansion of repeated %(path)
Code clean-up to tighten the use of in-core index in the API.
* ab/cache-api-cleanup:
cache API: add a "INDEX_STATE_INIT" macro/function, add release_index()
read-cache.c: refactor set_new_index_sparsity() for subsequent commit
sparse-index API: BUG() out on NULL ensure_full_index()
sparse-index.c: expand_to_path() can assume non-NULL "istate"
builtin/difftool.c: { 0 }-initialize rather than using memset()
Three hyphens are rendered verbatim in documentation, so "--" has to be
used to produce a dash. Fix asciidoc output for dashes. This is
similar to previous commits f0b922473e (Documentation: render special
characters correctly, 2021-07-29) and de82095a95 (doc
hash-function-transition: fix asciidoc output, 2021-02-05).
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command `dd bs=101M count=1` is not portable,
e.g. dd shipped with MacOs does not understand the 'M'.
Use `dd bs=1048576 count=101`, which achives the same, instead.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* ab/bisect-cleanup:
bisect: no longer try to clean up left-over `.git/head-name` files
bisect: remove Cogito-related code
bisect run: fix the error message
bisect: verify that a bogus option won't try to start a bisection
bisect--helper: make the order consistently `argc, argv`
bisect--helper: simplify exit code computation
Code clean-up.
* tl/ls-tree-code-clean-up:
t3104: remove shift code in 'test_ls_tree_format'
ls-tree: cleanup the redundant SPACE
ls-tree: make "line_termination" less generic
ls-tree: fold "show_tree_data" into "cb" struct
ls-tree: use a "struct options"
ls-tree: don't use "show_tree_data" for "fast" callbacks
Document ORIG_HEAD a bit more.
* pb/doc-orig-head:
git-rebase.txt: add a note about 'ORIG_HEAD' being overwritten
revisions.txt: be explicit about commands writing 'ORIG_HEAD'
git-merge.txt: mention 'ORIG_HEAD' in the Description
git-reset.txt: mention 'ORIG_HEAD' in the Description
git-cherry-pick.txt: do not use 'ORIG_HEAD' in example
Test clean-up.
* ar/test-cleanup:
t7527: use test_when_finished in 'case insensitive+preserving'
t6422: drop commented out code
t6003: uncomment test '--max-age=c3, --topo-order'
Code cleaning.
* rs/dup-array:
use DUP_ARRAY
add DUP_ARRAY
do full type check in BARF_UNLESS_COPYABLE
factor out BARF_UNLESS_COPYABLE
mingw: make argv2 in try_shell_exec() non-const
Test updates.
* jx/t1301-updates:
t1301: do not change $CWD in "shared=all" test case
t1301: use test_when_finished for cleanup
t1301: fix wrong template dir for git-init
The cURL one hasn't cooked for a week in 'next', but let's fast
track it so that linux-musl CI job would be happy.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Deal with a few deprecation warning from cURL library.
* jk/curl-avoid-deprecated-api:
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
The fsck code operates on an object buffer represented as a pointer/len
combination. However, the parsing of commits and tags is a little bit
loose; we mostly scan left-to-right through the buffer, without checking
whether we've gone past the length we were given.
This has traditionally been OK because the buffers we feed to fsck
always have an extra NUL after the end of the object content, which ends
any left-to-right scan. That has always been true for objects we read
from the odb, and we made it true for incoming index-pack/unpack-objects
checks in a1e920a0a7 (index-pack: terminate object buffers with NUL,
2014-12-08).
However, we recently added an exception: hash-object asks index_fd() to
do fsck checks. That _may_ have an extra NUL (if we read from a pipe
into a strbuf), but it might not (if we read the contents from the
file). Nor can we just teach it to always add a NUL. We may mmap the
on-disk file, which will not have any extra bytes (if it's a multiple of
the page size). Not to mention that this is a rather subtle assumption
for the fsck code to make.
Instead, let's make sure that the fsck parsers don't ever look past the
size of the buffer they've been given. This _almost_ works already,
thanks to earlier work in 4d0d89755e (Make sure fsck_commit_buffer()
does not run out of the buffer, 2014-09-11). The theory there is that we
check up front whether we have the end of header double-newline
separator. And then any left-to-right scanning we do is OK as long as it
stops when it hits that boundary.
However, we later softened that in 84d18c0bcf (fsck: it is OK for a tag
and a commit to lack the body, 2015-06-28), which allows the
double-newline header to be missing, but does require that the header
ends in a newline. That was OK back then, because of the NUL-termination
guarantees (including the one from a1e920a0a7 mentioned above).
Because 84d18c0bcf guarantees that any header line does end in a
newline, we are still OK with most of the left-to-right scanning. We
only need to take care after completing a line, to check that there is
another line (and we didn't run out of buffer).
Most of these checks are just need to check "buffer < buffer_end" (where
buffer is advanced as we parse) before scanning for the next header
line. But here are a few notes:
- we don't technically need to check for remaining buffer before
parsing the very first line ("tree" for a commit, or "object" for a
tag), because verify_headers() rejects a totally empty buffer. But
we'll do so in the name of consistency and defensiveness.
- there are some calls to strchr('\n'). These are actually OK by the
"the final header line must end in a newline" guarantee from
verify_headers(). They will always find that rather than run off the
end of the buffer. Curiously, they do check for a NULL return and
complain, but I believe that condition can never be reached.
However, I converted them to use memchr() with a proper size and
retained the NULL checks. Using memchr() is not much longer and
makes it more obvious what is going on. Likewise, retaining the NULL
checks serves as a defensive measure in case my analysis is wrong.
- commit 9a1a3a4d4c (mktag: allow omitting the header/body \n
separator, 2021-01-05), does check for the end-of-buffer condition,
but does so with "!*buffer", relying explicitly on the NUL
termination. We can accomplish the same thing with a pointer
comparison. I also folded it into the follow-on conditional that
checks the contents of the buffer, for consistency with the other
checks.
- fsck_ident() uses parse_timestamp(), which is based on strtoumax().
That function will happily skip past leading whitespace, including
newlines, which makes it a risk. We can fix this by scanning to the
first digit ourselves, and then using parse_timestamp() to do the
actual numeric conversion.
Note that as a side effect this fixes the fact that we missed
zero-padded timestamps like "<email> 0123" (whereas we would
complain about "<email> 0123"). I doubt anybody cares, but I
mention it here for completeness.
- fsck_tree() does not need any modifications. It relies on
decode_tree_entry() to do the actual parsing, and that function
checks both that there are enough bytes in the buffer to represent
an entry, and that there is a NUL at the appropriate spot (one
hash-length from the end; this may not be the NUL for the entry we
are parsing, but we know that in the worst case, everything from our
current position to that NUL is a filename, so we won't run out of
bytes).
In addition to fixing the code itself, we'd like to make sure our rather
subtle assumptions are not violated in the future. So this patch does
two more things:
- add comments around verify_headers() documenting the link between
what it checks and the memory safety of the callers. I don't expect
this code to be modified frequently, but this may help somebody from
accidentally breaking things.
- add a thorough set of tests covering truncations at various key
spots (e.g., for a "tree $oid" line, in the middle of the word
"tree", right after it, after the space, in the middle of the $oid,
and right at the end of the line. Most of these are fine already (it
is only truncating right at the end of the line that is currently
broken). And some of them are not even possible with the current
code (we parse "tree " as a unit, so truncating before the space is
equivalent). But I aimed here to consider the code a black box and
look for any truncations that would be a problem for a left-to-right
parser.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fetching in parallel from a remote group with a duplicated remote results
in the following:
error: cannot lock ref '<ref>': is at <oid> but expected <oid>
This doesn't happen in serial since fetching from the same remote that
has already been fetched from is a noop. Therefore, remove any duplicated
remotes after remote groups are parsed.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commits added clarifications to the column alignment
placeholders, note that the spaces are optional around the parameters.
Also, a proposed extension [1] to allow hard truncation (without
ellipsis '..') highlighted that the existing code does not play well
with wide characters, such as Asian fonts and emojis.
For example, N wide characters take 2N columns so won't fit an odd number
column width, causing misalignment somewhere.
Further analysis also showed that decomposed characters, e.g. separate
`a` + `umlaut` Unicode code-points may also be mis-counted, in some cases
leaving multiple loose `umlauts` all combined together.
Add some notes about these limitations, and add basic tests to demonstrate
them.
The chosen solution for the tests is to substitute any wide character
that overlaps a splitting boundary for the unicode vertical ellipsis
code point as a rare but 'obvious' substitution.
An alternative could be the substitution with a single dot '.' which
matches regular expression usage, and our two dot ellipsis, and further
in scenarios where the bulk of the text is wide characters, would be
obvious. In mainly 'ascii' scenarios a singleton emoji being substituted
by a dot could be confusing.
It is enough that the tests fail cleanly. The final choice for the
substitute character can be deferred.
[1]
https://lore.kernel.org/git/20221030185614.3842-1-philipoakley@iee.email/
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit a7f01c6b4d (pretty: support truncating in %>, %< and %><,
2013-04-19) added the use of ellipsis when truncating placeholder
values.
Show our 'two dot' ellipsis, and examples for the left, middle and
right truncation to avoid any confusion as to which end of the string
is adjusted. (cf justification and sub-string).
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 066790d7cb (pretty.c: support <direction>|(<negative number>) forms,
2016-06-16) added the option for right justified column alignment without
updating the documentation.
Add an explanation of its use of negative column values.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit a57523428b (pretty: support padding placeholders, %< %> and %><,
2013-04-19) introduced column width place holders. It also added
separate column position `%<|(` placeholders for display screen based
placement.
Change the display screen parameter reference from 'N' to 'M' and
corresponding descriptives to make the distinction clearer.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit a57523428b (pretty: support padding placeholders, %< %> and %><,
2013-04-19) introduced columnated place holders. These placeholders
can be confusing as they contain `<` and `>` characters as part
of their placeholders adjacent to the `<N>` parameters.
Add spaces either side of the `<N>` parameters in the title line.
The code (strtol) will consume any spaces around the number values
(assuming they are passed as a quoted string with spaces).
Note that the spaces are optional.
Subsequent commits will clarify other confusions.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On platforms where `size_t` does not have the same width as `unsigned
long`, passing a pointer to the former when a pointer to the latter is
expected can lead to problems.
Windows and 32-bit Linux are among the affected platforms.
In this instance, we want to store the size of the blob that was read in
that variable. However, `read_blob_data_from_index()` passes that
pointer to `read_object_file()` which expects an `unsigned long *`.
Which means that on affected platforms, the variable is not fully
populated and part of its value is left uninitialized. (On Big-Endian
platforms, this problem would be even worse.)
The consequence is that depending on the uninitialized memory's
contents, we may erroneously reject perfectly fine attributes.
Let's address this by passing a pointer to a variable of the expected
data type.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing text read as if "git checkout -b/-B name" were
equivalent to "git branch [-f] name", which clearly was not
what we wanted to say.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In hash_object(), we open a descriptor for each file to hash (whether we
got the filename from the command line or --stdin-paths), but never
close it. For the traditional code path, which feeds the result to
index_fd(), this is OK; it closes the descriptor for us.
But 5ba9a93b39 (hash-object: add --literally option, 2014-09-11) added a
second code path, which does not close the descriptor. There we need to
do so ourselves.
You can see the problem in a clone of git.git like this:
$ git ls-files -s | grep ^100644 | cut -f2 |
git hash-object --stdin-paths --literally >/dev/null
fatal: could not open 'builtin/var.c' for reading: Too many open files
After this patch, it completes successfully. I didn't bother with a
test, as it's a pain to deal with descriptor limits portably, and the
fix is so trivial.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch -f name start" forces to recreate the named branch, but
the forcing does not defeat the "do not touch a branch that is
checked out elsewhere" safety valve.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When UTF is enabled for a PCRE match, the corresponding flags are
added to the pcre2_compile() call, but PCRE2_UCP wasn't included.
This prevents extending the meaning of the character classes to
include those new valid characters and therefore result in failed
matches for expressions that rely on that extention, for ex:
$ git grep -P '\bÆvar'
Add PCRE2_UCP so that \w will include Æ and therefore \b could
correctly match the beginning of that word.
This has an impact on performance that has been estimated to be
between 20% to 40% and that is shown through the added performance
test.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git branch --recurse-submodules start from-here' fails if any submodule
present in 'from-here' is not yet cloned (under
submodule.propagateBranches=true). We then give this advice:
"You may try updating the submodules using 'git checkout from-here && git submodule update --init'"
If 'submodule.recurse' is set, 'git checkout from-here' will also fail since
it will try to recursively checkout the submodules.
Improve the advice by adding '--no-recurse-submodules' to the checkout
command.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since c879daa237 (Make hash-object more robust against malformed
objects, 2011-02-05), we've done some rudimentary checks against objects
we're about to write by running them through our usual parsers for
trees, commits, and tags.
These parsers catch some problems, but they are not nearly as careful as
the fsck functions (which make sense; the parsers are designed to be
fast and forgiving, bailing only when the input is unintelligible). We
are better off doing the more thorough fsck checks when writing objects.
Doing so at write time is much better than writing garbage only to find
out later (after building more history atop it!) that fsck complains
about it, or hosts with transfer.fsckObjects reject it.
This is obviously going to be a user-visible behavior change, and the
test changes earlier in this series show the scope of the impact. But
I'd argue that this is OK:
- the documentation for hash-object is already vague about which
checks we might do, saying that --literally will allow "any
garbage[...] which might not otherwise pass standard object parsing
or git-fsck checks". So we are already covered under the documented
behavior.
- users don't generally run hash-object anyway. There are a lot of
spots in the tests that needed to be updated because creating
garbage objects is something that Git's tests disproportionately do.
- it's hard to imagine anyone thinking the new behavior is worse. Any
object we reject would be a potential problem down the road for the
user. And if they really want to create garbage, --literally is
already the escape hatch they need.
Note that the change here is actually in index_mem(), which handles the
HASH_FORMAT_CHECK flag passed by hash-object. That flag is also used by
"git-replace --edit" to sanity-check the result. Covering that with more
thorough checks likewise seems like a good thing.
Besides being more thorough, there are a few other bonuses:
- we get rid of some questionable stack allocations of object structs.
These don't seem to currently cause any problems in practice, but
they subtly violate some of the assumptions made by the rest of the
code (e.g., the "struct commit" we put on the stack and
zero-initialize will not have a proper index from
alloc_comit_index().
- likewise, those parsed object structs are the source of some small
memory leaks
- the resulting messages are much better. For example:
[before]
$ echo 'tree 123' | git hash-object -t commit --stdin
error: bogus commit object 0000000000000000000000000000000000000000
fatal: corrupt commit
[after]
$ echo 'tree 123' | git.compile hash-object -t commit --stdin
error: object fails fsck: badTreeSha1: invalid 'tree' line format - bad sha1
fatal: refusing to create malformed object
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fsck code has been slowly moving away from requiring an object
struct in commits like 103fb6d43b (fsck: accept an oid instead of a
"struct tag" for fsck_tag(), 2019-10-18), c5b4269b57 (fsck: accept an
oid instead of a "struct commit" for fsck_commit(), 2019-10-18), etc.
However, the only external interface that fsck.c provides is
fsck_object(), which requires an object struct, then promptly discards
everything except its oid and type. Let's factor out the post-discard
part of that function as fsck_buffer(), leaving fsck_object() as a thin
wrapper around it. That will provide more flexibility for callers which
may not have a struct.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many test scripts use hash-object to create malformed objects to see how
we handle the results in various commands. In some cases we already have
to use "hash-object --literally", because it does some rudimentary
quality checks. But let's use "--literally" more consistently to
future-proof these tests against hash-object learning to be more
careful.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We intentionally invalidate the signature of a tag by switching its tag
name from "seventh" to "7th forged". However, the latter is not a valid
tag name because it contains a space. This doesn't currently affect the
test, but we're better off using something syntactically valid. That
reduces the number of possible failure modes in the test, and
future-proofs us if git hash-object gets more picky about its input.
The t7031 script, which was mostly copied from t7030, has the same
problem, so we'll fix it, too.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fake objects in t1006 use dummy timestamps like "0000000000 +0000".
While this does make them look more like normal timestamps (which,
unless it is 1970, have many digits), it actually violates our fsck
checks, which complain about zero-padded timestamps.
This doesn't currently break anything, but let's future-proof our tests
against a version of hash-object which is a little more careful about
its input. We don't actually care about the exact values here (and in
fact, the helper functions in this script end up removing the timestamps
anyway, so we don't even have to adjust other parts of the tests).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests in t1007 for detecting malformed objects have two
anachronisms:
- they use "sha1" instead of "oid" in variable names, even though the
script as a whole has been adapted to handle sha256
- they use test_i18ngrep, which is no longer necessary
Since we'll be adding a new similar test, let's clean these up so they
are all consistently using the modern style.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With Asciidoctor, all of the '+' introduced in a797c0ea04 ("cat-file:
add mailmap support to --batch-check option", 2022-12-20) render
literally rather than functioning as list continuations. With asciidoc,
this renders just fine. It's not too surprising that there is room for
ambiguity and surprises here, since we have lists within lists.
Simply replacing all of these '+' with empty lines makes this render
fine using both tools. Except, in the third hunk, where after this inner
'*' list ends, we want to continue with more contents of the outer list
item (`--batch-command=<format>`). We can solve any ambiguity here and
make this clear to both tools by wrapping the inner list in an open
block (using "--").
For consistency, let's wrap all three of these inner lists from
a797c0ea04 in open blocks. This also future-proofs us a little -- if we
ever gain more contents after any of those first two lists, as we did
already in a797c0ea04 for the third list, we're prepared and should
render fine with both asciidoc and Asciidoctor from the start.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the "repo" member was added to "the_index" in [1] the
repo_read_index() was made to populate it, but the unpopulated
"the_index" variable didn't get the same treatment.
Let's do that in initialize_the_repository() when we set it up, and
likewise for all of the current callers initialized an empty "struct
index_state".
This simplifies code that needs to deal with "the_index" or a custom
"struct index_state", we no longer need to second-guess this part of
the "index_state" deep in the stack. A recent example of such
second-guessing is the "istate->repo ? istate->repo : the_repository"
code in [2]. We can now simply use "istate->repo".
We're doing this by making use of the INDEX_STATE_INIT() macro (and
corresponding function) added in [3], which now have mandatory "repo"
arguments.
Because we now call index_state_init() in repository.c's
initialize_the_repository() we don't need to handle the case where we
have a "repo->index" whose "repo" member doesn't match the "repo"
we're setting up, i.e. the "Complete the double-reference" code in
repo_read_index() being altered here. That logic was originally added
in [1], and was working around the lack of what we now have in
initialize_the_repository().
For "fsmonitor-settings.c" we can remove the initialization of a NULL
"r" argument to "the_repository". This was added back in [4], and was
needed at the time for callers that would pass us the "r" from an
"istate->repo". Before this change such a change to
"fsmonitor-settings.c" would segfault all over the test suite (e.g. in
t0002-gitfile.sh).
This change has wider eventual implications for
"fsmonitor-settings.c". The reason the other lazy loading behavior in
it is required (starting with "if (!r->settings.fsmonitor) ..." is
because of the previously passed "r" being "NULL".
I have other local changes on top of this which move its configuration
reading to "prepare_repo_settings()" in "repo-settings.c", as we could
now start to rely on it being called for our "r". But let's leave all
of that for now, and narrowly remove this particular part of the
lazy-loading.
1. 1fd9ae517c (repository: add repo reference to index_state,
2021-01-23)
2. ee1f0c242e (read-cache: add index.skipHash config option,
2023-01-06)
3. 2f6b1eb794 (cache API: add a "INDEX_STATE_INIT" macro/function,
add release_index(), 2023-01-12)
4. 1e0ea5c431 (fsmonitor: config settings are repository-specific,
2022-03-25)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ab/cache-api-cleanup:
cache API: add a "INDEX_STATE_INIT" macro/function, add release_index()
read-cache.c: refactor set_new_index_sparsity() for subsequent commit
sparse-index API: BUG() out on NULL ensure_full_index()
sparse-index.c: expand_to_path() can assume non-NULL "istate"
builtin/difftool.c: { 0 }-initialize rather than using memset()
The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
deprecated in curl 7.85.0, and using it generate compiler warnings as of
curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
can't just do so unilaterally, as it was only introduced less than a
year ago in 7.85.0.
Until that version becomes ubiquitous, we have to either disable the
deprecation warning or conditionally use the "STR" variant on newer
versions of libcurl. This patch switches to the new variant, which is
nice for two reasons:
- we don't have to worry that silencing curl's deprecation warnings
might cause us to miss other more useful ones
- we'd eventually want to move to the new variant anyway, so this gets
us set up (albeit with some extra ugly boilerplate for the
conditional)
There are a lot of ways to split up the two cases. One way would be to
abstract the storage type (strbuf versus a long), how to append
(strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
and so on. But the resulting code looks pretty magical:
GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
if (...http is allowed...)
GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);
and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
actual code.
On the other end of the spectrum, we could just implement two separate
functions, one that handles a string list and one that handles bits. But
then we end up repeating our list of protocols (http, https, ftp, ftp).
This patch takes the middle ground. The run-time code is always there to
handle both types, and we just choose which one to feed to curl.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.
But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros). But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.
Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).
Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.
Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The two options do exactly the same thing, but the latter has been
deprecated and in recent versions of curl may produce a compiler
warning. Since the UPLOAD form is available everywhere (it was
introduced in the year 2000 by curl 7.1), we can just switch to it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test 1600.6 can fail under --stress due to mtime collisions. Most of
the tests include a removal of the index file to guarantee that the
index is updated. However, the submodule test addded in ee1f0c242e
(read-cache: add index.skipHash config option, 2023-01-06) did not
include this removal. Thus, on rare occasions, the test can fail because
the index still has a non-null trailing hash, as detected by the helper
added in da9acde14e (test-lib-functions: add helper for trailing hash,
2023-01-06).
By removing the submodule's index before the 'git -C sub add a' command,
we guarantee that the index is rewritten with the new index.skipHash
config option.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On platforms where `size_t` does not have the same width as `unsigned
long`, passing a pointer to the former when a pointer to the latter is
expected can lead to problems.
Windows and 32-bit Linux are among the affected platforms.
In this instance, we want to store the size of the blob that was read in
that variable. However, `read_blob_data_from_index()` passes that
pointer to `read_object_file()` which expects an `unsigned long *`.
Which means that on affected platforms, the variable is not fully
populated and part of its value is left uninitialized. (On Big-Endian
platforms, this problem would be even worse.)
The consequence is that depending on the uninitialized memory's
contents, we may erroneously reject perfectly fine attributes.
Let's address this by passing a pointer to a variable of the expected
data type.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce an optional configuration to allow the trailing hash that
protects the index file from bit flipping.
* ds/omit-trailing-hash-in-index:
features: feature.manyFiles implies fast index writes
test-lib-functions: add helper for trailing hash
read-cache: add index.skipHash config option
hashfile: allow skipping the hash function
The logic to see if we are using the "cone" mode by checking the
sparsity patterns has been tightened to avoid mistaking a pattern
that names a single file as specifying a cone.
* ws/single-file-cone:
dir: check for single file cone patterns
"git diff --relative" did not mix well with "git diff --ext-diff",
which has been corrected.
* jk/ext-diff-with-relative:
diff: drop "name" parameter from prepare_temp_file()
diff: clean up external-diff argv setup
diff: use filespec path to set up tempfiles for ext-diff
Conditionally skip the pre-applypatch and applypatch-msg hooks when
applying patches with 'git am'.
* tr/am--no-verify:
am: allow passing --no-verify flag
Test fixes.
* es/t1509-root-fixes:
t1509: facilitate repeated script invocations
t1509: make "setup" test more robust
t1509: fix failing "root work tree" test due to owner-check
Hopefully in some not so distant future, we'll get advantages from always
initializing the "repo" member of the "struct index_state". To make
that easier let's introduce an initialization macro & function.
The various ad-hoc initialization of the structure can then be changed
over to it, and we can remove the various "0" assignments in
discard_index() in favor of calling index_state_init() at the end.
While not strictly necessary, let's also change the CALLOC_ARRAY() of
various "struct index_state *" to use an ALLOC_ARRAY() followed by
index_state_init() instead.
We're then adding the release_index() function and converting some
callers (including some of these allocations) over to it if they
either won't need to use their "struct index_state" again, or are just
about to call index_state_init().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes when users use scalar to download a monorepo with a long
commit history, they want to check the progress bar to know how long
they still need to wait during the fetch process, but scalar
suppresses this output by default.
So let's check whether scalar stderr refer to a terminal, if so,
show progress, otherwise disable it.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "subject_prefix" member of "struct revision" usually is set to a
borrowed string (either a string literal like "PATCH" that appear in
the program text as a hardcoded default, or the value of
"format.subjectprefix") and is never freed when the containing
revision structure is released. The "-v <num>" codepath however
violates this rule and stores a pointer to an allocated string to
this member, relinquishing the responsibility to free it when it is
done using the revision structure, leading to a small one-time leak.
Instead, keep track of the string it allocates to let the revision
structure borrow, and clean it up when it is done.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop initializing "name" because it is set again before use.
Let quote_c_style() write directly to "sb" instead of taking a detour
through "quoted". This avoids an allocation and a string copy. The
result is the same because the function only appends.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
expand_show_tree() borrows the base strbuf given to us by read_tree() to
build the full path of the current entry when handling %(path). Only
its indirect caller, show_tree_fmt(), removes the added entry name.
That works fine as long as %(path) is only included once in the format
string, but accumulates duplicates if it's repeated:
$ git ls-tree --format='%(path) %(path) %(path)' HEAD M*
Makefile MakefileMakefile MakefileMakefileMakefile
Reset the length after each use to get the same expansion every time;
here's the behavior with this patch:
$ ./git ls-tree --format='%(path) %(path) %(path)' HEAD M*
Makefile Makefile Makefile
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t6426.7 (a rename/add testcase) long had a TODO/FIXME comment about
how the test could be improved (with some commented out sample code
that had a few small errors), but those improvements were blocked on
other changes still in progress. The necessary changes were put in
place years ago but the comment was forgotten. Remove and fix the
commented out code section and finally remove the big TODO/FIXME
comment.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since [1] there has been no reason for keeping "git env--helper" a
built-in. The reason it was a built-in to begin with was to support
the GIT_TEST_GETTEXT_POISON mode removed in that commit. I.e. unlike
the rest of "test-tool" it would potentially be called by the
installed git via "git-sh-i18n.sh".
As none of that applies since [1] we should stop carrying this
technical debt, and move it to t/helper/*. As this mostly move-only
change shows this has the nice bonus that we'll stop wasting time
translating the internal-only strings it emits.
Even though this was a built-in, it was intentionally never
documented, see its introduction in [2]. It never saw use outside of
the test suite, except for the "GIT_TEST_GETTEXT_POISON" use-case
noted above.
1. d162b25f95 (tests: remove support for GIT_TEST_GETTEXT_POISON,
2021-01-20)
2. b4f207f339 (env--helper: new undocumented builtin wrapping
git_env_*(), 2019-06-21)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The contents of the .gitattributes files may evolve over time, but "git
check-attr" always checks attributes against them in the working tree
and/or in the index. It may be beneficial to optionally allow the users
to check attributes taken from a commit other than HEAD against paths.
Add a new flag `--source` which will allow users to check the
attributes against a commit (actually any tree-ish would do). When the
user uses this flag, we go through the stack of .gitattributes files but
instead of checking the current working tree and/or in the index, we
check the blobs from the provided tree-ish object. This allows the
command to also be used in bare repositories.
Since we use a tree-ish object, the user can pass "--source
HEAD:subdirectory" and all the attributes will be looked up as if
subdirectory was the root directory of the repository.
We cannot simply use the `<rev>:<path>` syntax without the `--source`
flag, similar to how it is used in `git show` because any non-flag
parameter before `--` is treated as an attribute and any parameter after
`--` is treated as a pathname.
The change involves creating a new function `read_attr_from_blob`, which
given the path reads the blob for the path against the provided source and
parses the attributes line by line. This function is plugged into
`read_attr()` function wherein we go through the stack of attributes
files.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
Co-authored-by: toon@iotcl.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is some setup code which is used by multiple tests being setup in
`attribute test: --all option`. This means when we run "sh
./t0003-attributes.sh --run=setup,<num>" there is a chance of failing
since we missed this setup block.
So to ensure that setups are independent of test logic, move this to a
new setup block.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Co-authored-by: toon@iotcl.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace non-existent `branch.<name>.fetch` to `remote.<repository>.fetch`, in
the first example in `git-fetch` doc, which was introduced in
d504f6975d (modernize fetch/merge/pull examples, 2009-10-21).
Rename placeholder `<name>` to `<repository>`, to be consistent with all other
uses in git docs, except that `git-config.txt` uses `remote.<name>.fetch` in
its "Variables" section.
Also add missing monospace markups.
Signed-off-by: Yukai Chou <muzimuzhi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t3104-ls-tree-format.sh, There is a legacy 'shift 2' code
and the relevant code block no longer depends on it anymore,
so let's remove it for a small cleanup.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An redundant space was found in ls-tree.c, which is no doubt
a small change, but it might be OK to make a commit on its own.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "ls-tree" command isn't capable of ending "lines" with anything
except '\n' or '\0', and in the latter case we can avoid calling
write_name_quoted_relative() entirely. Let's do that, less for
optimization and more for clarity, the write_name_quoted_relative()
API itself does much the same thing.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the the preceding two commits the only user of the
"show_tree_data" struct needed it along with the "options" member,
let's instead fold all of that into a "show_tree_data" struct that
we'll use only for that callback.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a first step towards being able to turn this code into an API some
day let's change the "static" options in builtin/ls-tree.c into a
"struct ls_tree_options" that can be constructed dynamically without
the help of parse_options().
Because we're now using non-static variables for this we'll need to
clear_pathspec() at the end of cmd_ls_tree(), least various tests
start failing under SANITIZE=leak. The memory leak was already there
before, now it's just being brought to the surface.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As noted in [1] the code that made it in as part of
9c4d58ff2c (ls-tree: split up "fast path" callbacks, 2022-03-23) was
a "maybe a good idea, maybe not" RFC-quality patch. I hadn't looked
very carefully at the resulting patterns.
The implementation shared the "struct show_tree_data data", which was
introduced in e81517155e (ls-tree: introduce struct "show_tree_data",
2022-03-23) both for use in 455923e0a1 (ls-tree: introduce "--format"
option, 2022-03-23), and because the "fat" callback hadn't been split
up as 9c4d58ff2c did.
Now that that's been done we can see that most of what
show_tree_common() was doing could be done lazily by the callbacks
themselves, who in the pre-image were often using an odd mis-match of
their own arguments and those same arguments stuck into the "data"
structure. Let's also have the callers initialize the "type", rather
than grabbing it from the "data" structure afterwards.
1. https://lore.kernel.org/git/cover-0.7-00000000000-20220310T134811Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyronteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As per the code comment, the `.git/head-name` files were cleaned up for
backwards-compatibility: an old version of `git bisect` could have left
them behind.
Now, just how old would such a version be? As of 0f497e75f0 (Eliminate
confusing "won't bisect on seeked tree" failure, 2008-02-23), `git
bisect` does not write that file anymore. Which corresponds to Git
v1.5.4.4.
Even if the likelihood is non-nil that there might still be users out
there who use such an old version to start a bisection, but then decide
to continue bisecting with a current Git version, it is highly
improbable.
So let's remove that code, at long last.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time, there was this idea that Git would not actually be a
single coherent program, but rather a set of low-level programs that
users cobble together via shell scripts, or develop high-level user
interfaces for Git, or both.
Cogito was such a high-level user interface, incidentally implemented
via shell scripts that cobble together Git calls.
It did turn out relatively quickly that Git would much rather provide a
useful high-level user interface itself.
As of April 19th, 2007, Cogito was therefore discontinued (see
https://lore.kernel.org/git/20070419124648.GL4489@pasky.or.cz/).
Nevertheless, for almost 15 years after that announcement, Git carried
special code in `git bisect` to accommodate Cogito.
Since it is beyond doubt that there are no more Cogito users, let's
remove the last remnant of Cogito-accommodating code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d1bbbe45df (bisect--helper: reimplement `bisect_run` shell function
in C, 2021-09-13), we ported the `bisect run` subcommand to C, including
the part that prints out an error message when the implicit `git bisect
bad` or `git bisect good` failed.
However, the error message was supposed to print out whether the state
was "good" or "bad", but used a bogus (because non-populated) `args`
variable for it. This was fixed in [1], but as of [2] (when
`bisect--helper` was changed to the present `bisect-state') the error
message still talks about implementation details that should not
concern end users.
Fix that, and add a regression test to ensure that the intended form of
the error message.
1. 80c2e9657f (bisect--helper: report actual bisect_state() argument
on error, 2022-01-18
2. f37d0bdd42 (bisect: fix output regressions in v2.30.0, 2022-11-10)
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not want `git bisect --bogus-option` to start a bisection. To
verify that, we look for the tell-tale error message `You need to start
by "git bisect start"` and fail if it was found.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In C, the natural order is for `argc` to come before `argv` by virtue of
the `main()` function declaring the parameters in precisely that order.
It is confusing & distracting, then, when readers familiar with the C
language read code where that order is switched around.
Let's just change the order and avoid that type of developer friction.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We _already_ have a function to determine whether a given `enum
bisect_error` value is non-zero but still _actually_ indicates success.
Let's use it instead of duplicating the logic.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, in the git-log documentation, the reference to generating
patches does not match the section title. This can make the section
"Generating patch text with -p" hard to find, since typically readers of
the documentation will copy and paste to search the page.
Let's make this more convenient for readers by linking it directly to
the section.
Since git-log pulls in diff-generate-patch.txt, we can provide a direct
link to the section. Otherwise, change the verbiage to match exactly
what the section title is, to at least make searching for it an easier
task.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When handling "--exec" rebase collects the commands into a struct
string_list, then prepends "exec " to each command creating a multi line
string and finally splits that string back into a list of commands. This
is an artifact of the scripted rebase and the need to support "rebase
--preserve-merges". Now that "--preserve-merges" no-longer exists we can
cleanup the way the argument is handled. There is no need to add the
"exec " prefix to the commands as that is added by todo_list_to_strbuf().
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most tests in t7527-builtin-fsmonitor.sh that start a daemon, use the
helper function test_when_finished with stop_daemon_delete_repo.
Function stop_daemon_delete_repo explicitly stops the daemon. Calling
it via test_when_finished is needed for tests that don't check daemon's
automatic shutdown logic [1] and it is needed to avoid daemons being
left running in case of breakage of the logic of automatic shutdown of
the daemon.
Unlike these tests, test 'case insensitive+preserving' added in [2] has
a call to function test_when_finished commented out. It was commented
out in all versions of the patch [2] during development [3]. This seems
to not be intentional, because neither commit message in [2], nor the
comment above the test mention this line being commented out. Compare
it, for example, to "# unicode_debug=true" which is explicitly described
by a documentation comment above it.
Uncomment test_when_finished for stop_daemon_delete_repo in test 'case
insensitive+preserving' to ensure that daemons are not left running in
cases when automatic shutdown logic of daemon itself is broken.
[1] See documentation in "fsmonitor--daemon.h" for details.
[2] caa9c37ec0 (t7527: test FSMonitor on case insensitive+preserving
file system, 2022-05-26)
[3] See mailing list thread
https://lore.kernel.org/git/41f8cbc2ae45cb86e299eb230ad3cb0319256c37.1653601644.git.gitgitgadget@gmail.com/T/#t
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit [1] tests in t6422-merge-rename-corner-cases.sh were
refactored to not run setup steps separately. This included replacing
all tests like
test_expect_success "setup ..." '
<code of setup>
'
with corresponding Shell functions
test_setup_... () {
<code of setup>
}
During this replacement first and last lines of one of such tests got
left commented out in code. Drop these lines to avoid confusion.
[1] da1e295e00 (t604[236]: do not run setup in separate tests, 2019-10-22)
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test '--max-age=c3, --topo-order' in t6003-rev-list-topo-order.sh has
been commented out as failing since its introduction in [1]. However,
the test is successful at least since commit [2] -- bisecting further is
harder because of incompatibility of such old Git code with modern
header file <openssl/bn.h> [3].
Uncomment this test to gain test coverage.
[1] f573571a21 ([PATCH] Add t/t6003 with some --topo-order tests,
2005-07-07)
[2] 765ac8ec46 (Rip out merge-order and make "git log <paths>..." work
again., 2006-02-28)
[3] BIGNUM used in git's `epoch.c` which was removed in [2] changed
significantly between OpenSSL 1.0.2 and OpenSSL 1.1.0
See also https://stackoverflow.com/a/42295243/1083697 and
https://lore.kernel.org/git/Y71qiCs+oAS2OegH@coredump.intra.peff.net/
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d656218a83 (docs/bisect-lk2009: update nist report link,
2017-04-20) replaced a dead link to news release on nist.gov. However,
this might be confusing to the reader (like myself) because the article
git-bisect-lk2009.txt quotes from the news release but the exact quote
cannot be found in the full report. In addition to that, the link added
in 2017 is also dead in 2023.
Replace the reference to nist.gov with an version of the original NIST
news release archived to the Wayback Machine. Include also an updated
link to a live version of the full report.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much like the file selection options we tweaked in the last commit, the
status tags printed with -t had descriptions that were easy to
misunderstand, and for many of the same reasons. Clarify them.
Also, while at it, remove the "semi-deprecated" comment for "git
ls-files -t". The -t option was marked as semi-deprecated in 5bc0e247c4
("Document ls-files -t as semi-obsolete.", 2010-07-28) because:
"git ls-files -t" is [...] badly documented, hence we point the
users to superior alternatives.
The feature is marked as "semi-obsolete" but not "scheduled for removal"
since it's a plumbing command, scripts might use it, and Git testsuite
already uses it to test the state of the index.
Marking it as obsolete because it was easily misunderstood, which I
think was primarily due to documentation problems, is one strategy, but
I think fixing the documentation is a better option. Especially since
in the intervening time, "git ls-files -t" has become heavily used by
sparse-checkout users where the same confusion just doesn't apply.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous descriptions of the file selection options were very easy
to misunderstand. For example:
* "Show cached files in the output"
This could be interpreted as meaning "show files which have been
modified and git-add'ed, i.e. files which have cached changes
relative to HEAD".
* "Show deleted files"
This could be interpreted as meaning "for each `git rm $FILE` we
ran, show me $FILE"
* "Show modified files"
This could be interpreted as meaning "show files which have been
modified and git-add'ed" or as "show me files that differ from HEAD"
or as "show me undeleted files different from HEAD" (given that
--deleted is a separate option), none of which are correct.
Further, it's not very clear when some options only modify and/or
override other options, as was the case with --ignored, --directory, and
--unmerged (I've seen folks confused by each of them on the mailing
list, sometimes even fellow git developers.)
Tweak these definitions, and the one for --killed, to try to make them
all a bit more clear. Finally, also clarify early on that duplicate
reports for paths are often expected (both when (a) there are multiple
entries for the file in the index -- i.e. when there are conflicts, and
also (b) when the user specifies options that might pick the same file
multiple times, such as `git ls-files --cached --deleted --modified`
when there is a file with an unstaged deletion).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-files' --resolve-undo option has existed ever since 9d9a2f4aba
("resolve-undo: basic tests", 2009-12-25), but was never documented.
However, the option has been referred to in the ls-files manual itself
ever since ce74de931d ("ls-files: introduce "--format" option",
2022-07-23), making its omission a bit jarring. Document this option.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ISO 8601 permits "reduced precision" time representations to omit the
seconds value or both the minutes and the seconds values. The
abbreviate times could look like 17:45 or 1745 to omit the seconds,
or simply as 17 to omit both the minutes and the seconds.
parse_date_basic accepts the 17:45 format but it rejects the other two.
Change it to accept 4-digit and 2-digit time values when they follow a
recognized date and a 'T'.
Before this change:
$ TZ=UTC test-tool date approxidate 2022-12-13T23:00 2022-12-13T2300 2022-12-13T23
2022-12-13T23:00 -> 2022-12-13 23:00:00 +0000
2022-12-13T2300 -> 2022-12-13 23:54:13 +0000
2022-12-13T23 -> 2022-12-13 23:54:13 +0000
After this change:
$ TZ=UTC helper/test-tool date approxidate 2022-12-13T23:00 2022-12-13T2300 2022-12-13T23
2022-12-13T23:00 -> 2022-12-13 23:00:00 +0000
2022-12-13T2300 -> 2022-12-13 23:00:00 +0000
2022-12-13T23 -> 2022-12-13 23:00:00 +0000
Note: ISO 8601 also allows reduced precision date strings such as
"2022-12" and "2022". This patch does not attempt to address these.
Reported-by: Pat LaVarre <plavarre@purestorage.com>
Signed-off-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interop test library sets up wrappers "git.a" and "git.b" to
represent the two versions to be tested. It also wraps vanilla "git" to
report an error, with the goal of catching tests which accidentally fail
to use one of the version-specific wrappers (which could invalidate the
tests in a very subtle way).
But when it catches an invocation of vanilla git, it doesn't give any
details, which makes it very hard to debug exactly which invocation is
responsible (especially if it's buried in a function invocation, etc).
Let's report the arguments passed to git, which helps narrow it down.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor code added to set_new_index_sparsity() in [1] to eliminate
indentation resulting from putting the body of his function within the
"if" block. Let's instead return early if we have no
istate->repo. This trivial change makes the subsequent commit's diff
smaller.
1. 491df5f679 (read-cache: set sparsity when index is new, 2022-05-10)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the ensure_full_index() function stricter, and have it only
accept a non-NULL "struct index_state". This function (and this
behavior) was added in [1].
The only reason it needed to be this lax was due to interaction with
repo_index_has_changes(). See the addition of that code in [2].
The other reason for why this was needed dates back to interaction
with code added in [3]. In [4] we started calling ensure_full_index()
in unpack_trees(), but the caller added in 34110cd4e3 wants to pass
us a NULL "dst_index". Let's instead do the NULL check in
unpack_trees() itself.
1. 4300f8442a (sparse-index: implement ensure_full_index(), 2021-03-30)
2. 0c18c059a1 (read-cache: ensure full index, 2021-04-01)
3. 34110cd4e3 (Make 'unpack_trees()' have a separate source and
destination index, 2008-03-06)
4. 6863df3550 (unpack-trees: ensure full index, 2021-03-30)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function added in [1] was subsequently used in [2]. All of the
calls to it are in name-hash.c, and come after calls to
lazy_init_name_hash(istate). The first thing that function does is:
if (istate->name_hash_initialized)
return;
So we can already assume that we have a non-NULL "istate" here, or
we'd be segfaulting. Let's not confuse matters by making it appear
that's not the case.
1. 71f82d032f (sparse-index: expand_to_path(), 2021-04-12)
2. 4589bca829 (name-hash: use expand_to_path(), 2021-04-12)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor an initialization of a variable added in
03831ef7b5 (difftool: implement the functionality in the builtin,
2017-01-19). This refactoring makes a subsequent change smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once we find a match, there is no point to try finding the second
match in the inner loop. Break out of the loop once we find the
first match.
Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the "DETACHED HEAD" section in the git-checkout doc, it suggests
using "git checkout -b <branch-name>" to create a new branch on the
detached head.
On the other hand, when you checkout a commit that is not at the tip of
any named branch (e.g., when you checkout a tag), git suggests using
"git switch -c <branch-name>".
Add "git switch -c" as another option and mitigate this inconsistency.
Signed-off-by: Yutaro Ohno <yutaro.ono.418@gmail.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When mentioning 'ORIG_HEAD', be explicit about which command write that
pseudo-ref, namely 'git am', 'git merge', 'git rebase' and 'git reset'.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fact that 'git merge' writes 'ORIG_HEAD' before performing the merge
is missing from the documentation of the command.
Mention it in the 'Description' section.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fact that 'git reset' writes 'ORIG_HEAD' before changing HEAD is
mentioned in an example, but is missing from the 'Description' section.
Mention it in the discussion of the "'git reset' [<mode>] [<commit>]"
form of the command.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 67ac1e1d57 (cherry-pick/revert: add support for
-X/--strategy-option, 2010-12-10) added an example to the documentation
of 'git cherry-pick'. This example mentions how to abort a failed
cherry-pick and retry with an additional merge strategy option.
The command used in the example to abort the cherry-pick is 'git reset
--merge ORIG_HEAD', but cherry-pick does not write 'ORIG_HEAD' before
starting its operation. So this command would checkout a commit
unrelated to what was at HEAD when the user invoked cherry-pick.
Use 'git cherry-pick --abort' instead.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b25562e63f (object-file: inline calls to read_object(),
2023-01-07) accidentally indented a conditional block with spaces
instead of a tab.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a semantic patch for replace ALLOC_ARRAY+COPY_ARRAY with DUP_ARRAY
to reduce code duplication and apply its results.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a macro for allocating and populating a shallow copy of an array.
It is intended to replace a sequence like this:
ALLOC_ARRAY(dst, n);
COPY_ARRAY(dst, src, n);
With the less repetitve:
DUP_ARRAY(dst, src, n);
It checks whether the types of source and destination are compatible to
ensure the copy can be used safely.
An easier alternative would be to only consider the source and return
a void pointer, that could be used like this:
dst = ARRAY_DUP(src, n);
That would be more versatile, as it could be used in declarations as
well. Making it type-safe would require the use of typeof_unqual from
C23, though.
So use the safe and compatible variant for now.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use __builtin_types_compatible_p to perform a full type check if
possible. Otherwise fall back to the old size comparison, but add a
non-evaluated assignment to catch more type mismatches. It doesn't flag
copies between arrays with different signedness, but that's as close to
a full type check as it gets without the builtin, as far as I can see.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the common basic element type check of COPY_ARRAY and MOVE_ARRAY to
a new macro. This reduces code duplication and simplifies adding more
elaborate checks.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare for a stricter type check in COPY_ARRAY by removing the const
qualifier of argv2, like we already do to placate Visual Studio. We
have to add it back using explicit casts when actually using the
variable, unfortunately, because GCC (rightly) refuses to add it
implicitly. Similar casts are already used in mingw_execv().
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CI updates. We probably want a clean-up to move the long shell
script embedded in yaml file into a separate file, but that can
come later.
* cw/ci-whitespace:
ci (check-whitespace): move to actions/checkout@v3
ci (check-whitespace): add links to job output
ci (check-whitespace): suggest fixes for errors
Use `git diff --no-index` as a test_cmp on Windows.
We'd probably need to revisit "do we really want to, and have to,
lose CRLF vs LF?" later, at which time we may be able to further
clean this up by replacing "git diff --no-index" with "diff -u".
* js/drop-mingw-test-cmp:
tests(mingw): avoid very slow `mingw_test_cmp`
When the pack code was split into its own file[1], it got a copy of the
static read_object() function. But there's only one caller here, so we
could just inline it. And it's worth doing so, as the name read_object()
invites comparisons to the public read_object_file(), but the two don't
behave quite the same.
[1] The move happened over several commits, but the relevant one here is
f1d8130be0 (pack: move clear_delta_base_cache(), packed_object_info(),
unpack_entry(), 2017-08-18).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only caller of read_object_file_extended() is the thin wrapper of
repo_read_object_file(). Instead of wrapping, let's just rename the
inner function and let people call it directly. This cleans up the
namespace and reduces confusion.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our sole caller always passes in "1", so we can just drop the parameter
entirely. Anybody who doesn't want this behavior could easily call
oid_object_info_extended() themselves, as we're just a thin wrapper
around it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The open_istream_incore() function is the only direct user of
read_object_file_extended(), and the only caller which unsets the
lookup_replace flag. Since read_object_file_extended() is now just a
thin wrapper around oid_object_info_extended(), let's inline the call.
That will let us simplify read_object_file_extended() in the next patch.
The inlined version here is a few more lines because of the query setup,
but it's much more flexible, since we can pass (or omit) any flags we
want.
Note the updated comment in the istream struct definition. It was
already slightly wrong (we never called read_object(); it has been
read_object_file_extended() since day one), but should now be accurate.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since read_object() is these days just a thin wrapper around
oid_object_info_extended(), and since it only has two callers, let's
just inline those calls. This has a few positive outcomes:
- it's a net reduction in source code lines
- even though the callers end up with a few extra lines, they're now
more flexible and can use object_info flags directly. So no more
need to convert die_if_corrupt between parameter/flag, and we can
ask for lookup replacement with a flag rather than doing it
ourselves.
- there's one fewer function in an already crowded namespace (e.g.,
the difference between read_object() and read_object_file() was not
immediately obvious; now we only have one of them).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the previous patch, using skip_prefix() is more readable and
less error-prone than a raw strncmp(), because it avoids a
manually-computed length. These cases differ from the previous patch
that uses starts_with() because they care about the value after the
matched prefix.
We can convert these to use skip_prefix() by introducing an extra
variable to hold the out-pointer.
Note in the case in ws.c that to get rid of the magic number "9"
completely, we also switch out "len" for recomputing the pointer
difference. These are equivalent because "len" is always "ep - string".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's more readable to use starts_with() instead of strncmp() to match a
prefix, as the latter requires a manually-computed length, and has the
funny "matching is zero" return value common to cmp functions. This
patch converts several cases which were found with:
git grep 'strncmp(.*, [0-9]*)'
But note that it doesn't convert all such cases. There are several where
the magic length number is repeated elsewhere in the code, like:
/* handle "buf" which isn't NUL-terminated and might be too small */
if (len >= 3 && !strncmp(buf, "foo", 3))
or:
/* exact match for "foo", but within a larger string */
if (end - buf == 3 && !strncmp(buf, "foo", 3))
While it would not produce the wrong outcome to use starts_with() in
these cases, we'd still be left with one instance of "3". We're better
to leave them for now, as the repeated "3" makes it clear that the two
are linked (there may be other refactorings that handle both, but
they're out of scope for this patch).
A few things to note while reading the patch:
- all cases but one are trying to match, and so lose the extra "!".
The case in the first hunk of urlmatch.c is not-matching, and hence
gains a "!".
- the case in remote-fd.c is matching the beginning of "connect foo",
but we never look at str+8 to parse the "foo" part (which would make
this a candidate for skip_prefix(), not starts_with()). This seems
at first glance like a bug, but is a limitation of how remote-fd
works.
- the second hunk in urlmatch.c shows some cases adjacent to other
strncmp() calls that are left. These are of the "exact match within
a larger string" type, as described above.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix typos in code comments which repeat various words. Most of the
cases are simple in that they repeat a word that usually cannot be
repeated in a grammatically correct sentence. Just remove the
incorrectly duplicated word in these cases and rewrap text, if needed.
A tricky case is usage of "that that", which is sometimes grammatically
correct. However, an instance of this in "t7527-builtin-fsmonitor.sh"
doesn't need two words "that", because there is only one daemon being
discussed, so replace the second "that" with "the".
Reword code comment "entries exist on on-disk index" in function
update_one in file cache-tree.c, by replacing incorrect preposition "on"
with "in".
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 1819ad327b (grep: fix multibyte regex handling under macOS,
2022-08-26) started to use the native regex library instead of Git's
own (compat/regex/), it lost support for alternation in basic
regular expressions.
Bring it back by enabling the flag REG_ENHANCED on macOS when
compiling basic regular expressions.
Reported-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The recent addition of the index.skipHash config option allows index
writes to speed up by skipping the hash computation for the trailing
checksum. This is particularly critical for repositories with many files
at HEAD, so add this config option to two cases where users in that
scenario may opt-in to such behavior:
1. The feature.manyFiles config option enables some options that are
helpful for repositories with many files at HEAD.
2. 'scalar register' and 'scalar reconfigure' set config options that
optimize for large repositories.
In both of these cases, set index.skipHash=true to gain this
speedup. Add tests that demonstrate the proper way that
index.skipHash=true can override feature.manyFiles=true.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can be helpful to check that a file format with a trailing hash has a
specific hash in the final bytes of a written file. This is made more
apparent by recent changes that allow skipping the hash algorithm and
writing a null hash at the end of the file instead.
Add a new test_trailing_hash helper and use it in t1600 to verify that
index.skipHash=true really does skip the hash computation, since
'git fsck' does not actually verify the hash. This confirms that when
the config is disabled explicitly in a super project but enabled in a
submodule, then the use of repo_config_get_bool() loads config from the
correct repository in the case of 'git add'. There are other cases where
istate->repo is NULL and thus this config is loaded instead from
the_repository, but that's due to many different code paths initializing
index_state structs in their own way.
Keep the 'git fsck' call to ensure that any potential future change to
check the index hash does not cause an error in this case.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous change allowed skipping the hashing portion of the
hashwrite API, using it instead as a buffered write API. Disabling the
hashwrite can be particularly helpful when the write operation is in a
critical path.
One such critical path is the writing of the index. This operation is so
critical that the sparse index was created specifically to reduce the
size of the index to make these writes (and reads) faster.
This trade-off between file stability at rest and write-time performance
is not easy to balance. The index is an interesting case for a couple
reasons:
1. Writes block users. Writing the index takes place in many user-
blocking foreground operations. The speed improvement directly
impacts their use. Other file formats are typically written in the
background (commit-graph, multi-pack-index) or are super-critical to
correctness (pack-files).
2. Index files are short lived. It is rare that a user leaves an index
for a long time with many staged changes. Outside of staged changes,
the index can be completely destroyed and rewritten with minimal
impact to the user.
Following a similar approach to one used in the microsoft/git fork [1],
add a new config option (index.skipHash) that allows disabling this
hashing during the index write. The cost is that we can no longer
validate the contents for corruption-at-rest using the trailing hash.
[1] 21fed2d914
We load this config from the repository config given by istate->repo,
with a fallback to the_repository if it is not set.
While older Git versions will not recognize the null hash as a special
case, the file format itself is still being met in terms of its
structure. Using this null hash will still allow Git operations to
function across older versions.
The one exception is 'git fsck' which checks the hash of the index file.
This used to be a check on every index read, but was split out to just
the index in a33fc72fe9 (read-cache: force_verify_index_checksum,
2017-04-14) and released first in Git 2.13.0. Document the versions that
relaxed these restrictions, with the optimistic expectation that this
change will be included in Git 2.40.0.
Here, we disable this check if the trailing hash is all zeroes. We add a
warning to the config option that this may cause undesirable behavior
with older Git versions.
As a quick comparison, I tested 'git update-index --force-write' with
and without index.skipHash=true on a copy of the Linux kernel
repository.
Benchmark 1: with hash
Time (mean ± σ): 46.3 ms ± 13.8 ms [User: 34.3 ms, System: 11.9 ms]
Range (min … max): 34.3 ms … 79.1 ms 82 runs
Benchmark 2: without hash
Time (mean ± σ): 26.0 ms ± 7.9 ms [User: 11.8 ms, System: 14.2 ms]
Range (min … max): 16.3 ms … 42.0 ms 69 runs
Summary
'without hash' ran
1.78 ± 0.76 times faster than 'with hash'
These performance benefits are substantial enough to allow users the
ability to opt-in to this feature, even with the potential confusion
with older 'git fsck' versions.
Test this new config option, both at a command-line level and within a
submodule. The confirmation is currently limited to confirm that 'git
fsck' does not complain about the index. Future updates will make this
test more robust.
It is critical that this test is placed before the test_index_version
tests, since those tests obliterate the .git/config file and hence lose
the setting from GIT_TEST_DEFAULT_HASH, if set.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hashfile API is useful for generating files that include a trailing
hash of the file's contents up to that point. Using such a hash is
helpful for verifying the file for corruption-at-rest, such as a faulty
drive causing flipped bits.
Git's index file includes this trailing hash, so it uses a 'struct
hashfile' to handle the I/O to the file. This was very convenient to
allow using the hashfile methods during these operations.
However, hashing the file contents during write comes at a performance
penalty. It's slower to hash the bytes on their way to the disk than
without that step. This problem is made worse by the replacement of
hardware-accelerated SHA1 computations with the software-based sha1dc
computation.
This write cost is significant, and the checksum capability is likely
not worth that cost for such a short-lived file. The index is rewritten
frequently and the only time the checksum is checked is during 'git
fsck'. Thus, it would be helpful to allow a user to opt-out of the hash
computation.
We first need to allow Git to opt-out of the hash computation in the
hashfile API. The buffered writes of the API are still helpful, so it
makes sense to make the change here.
Introduce a new 'skip_hash' option to 'struct hashfile'. When set, the
update_fn and final_fn members of the_hash_algo are skipped. When
finalizing the hashfile, the trailing hash is replaced with the null
hash.
This use of a trailing null hash would be desireable in either case,
since we do not want to special case a file format to have a different
length depending on whether it was hashed or not. When the final bytes
of a file are all zero, we can infer that it was written without
hashing, and thus that verification is not available as a check for file
consistency. This also means that we could easily toggle hashing for any
file format we desire.
A version of this patch has existed in the microsoft/git fork since
2017 [1] (the linked commit was rebased in 2018, but the original dates
back to January 2017). Here, the change to make the index use this fast
path is delayed until a later change.
[1] 21fed2d914
Co-authored-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The prepare_temp_file() function takes a diff_filespec as well as a
filename. But it is almost certainly an error to pass in a name that
isn't the filespec's "path" parameter, since that is the only thing that
reliably tells us how to find the content (and indeed, this was the
source of a recently-fixed bug).
So let's drop the redundant "name" parameter and just use one->path
throughout the function. This simplifies the interface a little bit, and
makes it impossible for calling code to get it wrong.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the previous commit, setting up the tempfile for an external diff
uses df->path from the diff_filespec, rather than the logical name. This
means add_external_diff_name() does not need to take a "name" parameter
at all, and we can drop it. And that in turn lets us simplify the
conditional for handling renames (when the "other" name is non-NULL).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're going to run an external diff, we have to make the contents
of the pre- and post-images available either by dumping them to a
tempfile, or by pointing at a valid file in the worktree. The logic of
this is all handled by prepare_temp_file(), and we just pass in the
filename and the diff_filespec.
But there's a gotcha here. The "filename" we have is a logical filename
and not necessarily a path on disk or in the repository. This matters in
at least one case: when using "--relative", we may have a name like
"foo", even though the file content is found at "subdir/foo". As a
result, we look for the wrong path, fail to find "foo", and claim that
the file has been deleted (passing "/dev/null" to the external diff,
rather than the correct worktree path).
We can fix this by passing the pathname from the diff_filespec, which
should always be a full repository path (and that's what we want even if
reusing a worktree file, since we're always operating from the top-level
of the working tree).
The breakage seems to go all the way back to cd676a5136 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12). As far as I can tell, before then "name" would always have
been the same as the filespec's "path".
There are two related cases I looked at that aren't buggy:
1. the only other caller of prepare_temp_file() is run_textconv(). But
it always passes the filespec's path field, so it's OK.
2. I wondered if file renames/copies might cause similar confusion.
But they don't, because run_external_diff() receives two names in
that case: "name" and "other", which correspond to the two sides of
the diff. And we did correctly pass "other" when handling the
post-image side. Barring the use of "--relative", that would always
match "two->path", the path of the second filespec (and the rename
destination).
So the only bug is just the interaction with external diff drivers and
--relative.
Reported-by: Carl Baldwin <carl@ecbaldwin.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 70b9c10373 (bundle-uri client: add helper for testing server,
2022-12-22) added a cmd_ls_remote() function which contains "uploadpack"
and "server_options" variables. Neither of these variables is ever
modified after being initialized, so the code to handle non-NULL and
non-empty values is impossible to reach.
While in theory we might add command-line parsing to set these, let's
drop the dead code for now in the name of cleanliness. It's easy enough
to add it back later if need be.
Noticed by Coverity.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop using "git --super-prefix" and narrow the scope of its use to
the submodule--helper.
* ab/no-more-git-global-super-prefix:
read-tree: add "--super-prefix" option, eliminate global
submodule--helper: convert "{update,clone}" to their own "--super-prefix"
submodule--helper: convert "status" to its own "--super-prefix"
submodule--helper: convert "sync" to its own "--super-prefix"
submodule--helper: convert "foreach" to its own "--super-prefix"
submodule--helper: don't use global --super-prefix in "absorbgitdirs"
submodule.c & submodule--helper: pass along "super_prefix" param
read-tree + fetch tests: test failing "--super-prefix" interaction
submodule absorbgitdirs tests: add missing "Migrating git..." tests
Fix to a small regression in 2.38 days.
* ab/bundle-wo-args:
bundle <cmd>: have usage_msg_opt() note the missing "<file>"
builtin/bundle.c: remove superfluous "newargc" variable
bundle: don't segfault on "git bundle <subcmd>"
Even in a repository with promisor remote, it is useless to
attempt to lazily attempt fetching an object that is expected to be
commit, because no "filter" mode omits commit objects. Take
advantage of this assumption to fail fast on errors.
* jt/avoid-lazy-fetch-commits:
commit: don't lazy-fetch commits
object-file: emit corruption errors when detected
object-file: refactor map_loose_object_1()
object-file: remove OBJECT_INFO_IGNORE_LOOSE
'cat-file' gains mailmap support for its '--batch-check' and '-s'
options.
* sa/cat-file-mailmap--batch-check:
cat-file: add mailmap support to --batch-check option
cat-file: add mailmap support to -s option
The git-am --no-verify flag is analogous to the same flag passed to
git-commit. It bypasses the pre-applypatch and applypatch-msg hooks
if they are enabled.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse checkout documentation states that the cone mode pattern set
is limited to patterns that either recursively include directories or
patterns that match all files in a directory. In the sparse checkout
file, the former manifest in the form:
/A/B/C/
while the latter become a pair of patterns either in the form:
/A/B/
!/A/B/*/
or in the special case of matching the toplevel files:
/*
!/*/
The 'add_pattern_to_hashsets()' function contains checks which serve to
disable cone-mode when non-cone patterns are encountered. However, these
do not catch when the pattern list attempts to match a single file or
directory, e.g. a pattern in the form:
/A/B/C
This causes sparse-checkout to exhibit unexpected behaviour when such a
pattern is in the sparse-checkout file and cone mode is enabled.
Concretely, with the pattern like the above, sparse-checkout, in
non-cone mode, will only include the directory or file located at
'/A/B/C'. However, with cone mode enabled, sparse-checkout will instead
just manifest the toplevel files but not any file located at '/A/B/C'.
Relatedly, issues occur when supplying the same kind of filter when
partial cloning with '--filter=sparse:oid=<oid>'. 'upload-pack' will
correctly just include the objects that match the non-cone pattern
matching. Which means that checking out the newly cloned repo with the
same filter, but with cone mode enabled, fails due to missing objects.
To fix these issues, add a cone mode pattern check that asserts that
every pattern is either a directory match or the pattern '/*'. Add a
test to verify the new pattern check and modify another to reflect that
non-directory patterns are caught earlier.
Signed-off-by: William Sprent <williams@unity3d.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the thread terminates, the handle to the
original thread should be closed.
This change makes win32_pthread_join POSIX compliant.
Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As well as removing the explicit shell setting d8b21a0fe2 (CI: don't
explicitly pick "bash" shell outside of Windows, fix regression,
2022-12-07) also reverted the name of the print test failures step
introduced by 5aeb145780 (ci(github): bring back the 'print test
failures' step, 2022-06-08). This is unfortunate as 5aeb145780 added a
message to direct contributors to the "print test failures" step when a
test fails and that step is no-longer known by that name on the
non-windows ci jobs.
In principle we could update the message to print the correct name for
the step but then we'd have to deal with having two different names for
the same step on different jobs. It is simpler for the implementation
and contributors to use the same name for this step on all jobs.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the sequence to fsync $GIT_DIR/packed-refs file that forgot to
flush its output to the disk..
* ps/fsync-refs-fix:
refs: fix corruption by not correctly syncing packed-refs to disk
"git format-patch" learned to honor format.mboxrd even when sending
patches to the standard output stream,
* ew/format-patch-mboxrd:
format-patch: support format.mboxrd with --stdout
Bundle URIs part 4.
* ds/bundle-uri-4:
clone: unbundle the advertised bundles
bundle-uri: download bundles from an advertised list
bundle-uri: allow relative URLs in bundle lists
strbuf: introduce strbuf_strip_file_from_path()
bundle-uri: serve bundle.* keys from config
bundle-uri client: add helper for testing server
transport: rename got_remote_heads
bundle-uri client: add boolean transfer.bundleURI setting
clone: request the 'bundle-uri' command when available
t: create test harness for 'bundle-uri' command
protocol v2: add server-side "bundle-uri" skeleton
When given a pattern that matches an empty string at the end of a
line, the code to parse the "git diff" line-ranges fell into an
infinite loop, which has been corrected.
* lk/line-range-parsing-fix:
line-range: fix infinite loop bug with '$' regex
Just like "git var GIT_EDITOR" abstracts the complex logic to
choose which editor gets used behind it, "git var" now give support
to GIT_SEQUENCE_EDITOR.
* sa/git-var-sequence-editor:
var: add GIT_SEQUENCE_EDITOR variable
"git pull -v --recurse-submodules" attempted to pass "-v" down to
underlying "git submodule update", which did not understand the
request and barfed, which has been corrected.
* ss/pull-v-recurse-fix:
submodule: accept -v for the update command
Improve the usage we emit on e.g. "git bundle create" to note why
we're showing the usage, it's because the "<file>" argument is
missing.
We know that'll be the case for all parse_options_cmd_bundle() users,
as they're passing the "char **bundle_file" parameter, which as the
context shows we're expected to populate.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As noted in 891cb09db6 (bundle: don't segfault on "git bundle
<subcmd>", 2022-12-20) the "newargc" in this function is redundant to
using our own "argc". Let's refactor the function to avoid needlessly
introducing another variable.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the description of --force to use '<start-point>' rather than
'<startpoint>' to match the spelling used everywhere else in the
git-branch documentation.
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the SHA1DC implementation on macOS, just like other platforms,
by default.
* ab/darwin-default-to-sha1dc:
Makefile: use sha1collisiondetection by default on OSX and Darwin
The output from "git diff --stat" on an unmerged path lost the
terminating LF in Git 2.39, which has been corrected.
* pg/diff-stat-unmerged-regression-fix:
diff: fix regression with --stat and unmerged file
Clean-ups in error messages produced by "git for-each-ref" and friends.
* jk/ref-filter-error-reporting-fix:
ref-filter: convert email atom parser to use err_bad_arg()
ref-filter: truncate atom names in error messages
ref-filter: factor out "unrecognized %(foo) arg" errors
ref-filter: factor out "%(foo) does not take arguments" errors
ref-filter: reject arguments to %(HEAD)
Code clean-up around unused function parameters.
* jk/unused-post-2.39:
userdiff: mark unused parameter in internal callback
list-objects-filter: mark unused parameters in virtual functions
diff: mark unused parameters in callbacks
xdiff: mark unused parameter in xdl_call_hunk_func()
xdiff: drop unused parameter in def_ff()
ws: drop unused parameter from ws_blank_line()
list-objects: drop process_gitlink() function
blob: drop unused parts of parse_blob_buffer()
ls-refs: use repository parameter to iterate refs
"git http-fetch" (which is rarely used) forgot to identify itself
in the trace2 output.
* jt/http-fetch-trace2-report-name:
http-fetch: invoke trace2_cmd_name()
The code to auto-correct a misspelt subcommand unnecessarily called
into git_default_config() from the early config codepath, which was
a no-no. This has bee corrected.
* sg/help-autocorrect-config-fix:
help.c: fix autocorrect in work tree for bare repository
The "--super-prefix" option to "git" was initially added in [1] for
use with "ls-files"[2], and shortly thereafter "submodule--helper"[3]
and "grep"[4]. It wasn't until [5] that "read-tree" made use of it.
At the time [5] made sense, but since then we've made "ls-files"
recurse in-process in [6], "grep" in [7], and finally
"submodule--helper" in the preceding commits.
Let's also remove it from "read-tree", which allows us to remove the
option to "git" itself.
We can do this because the only remaining user of it is the submodule
API, which will now invoke "read-tree" with its new "--super-prefix"
option. It will only do so when the "submodule_move_head()" function
is called.
That "submodule_move_head()" function was then only invoked by
"read-tree" itself, but now rather than setting an environment
variable to pass "--super-prefix" between cmd_read_tree() we:
- Set a new "super_prefix" in "struct unpack_trees_options". The
"super_prefixed()" function in "unpack-trees.c" added in [5] will now
use this, rather than get_super_prefix() looking up the environment
variable we set earlier in the same process.
- Add the same field to the "struct checkout", which is only needed to
ferry the "super_prefix" in the "struct unpack_trees_options" all the
way down to the "entry.c" callers of "submodule_move_head()".
Those calls which used the super prefix all originated in
"cmd_read_tree()". The only other caller is the "unlink_entry()"
caller in "builtin/checkout.c", which now passes a "NULL".
1. 74866d7579 (git: make super-prefix option, 2016-10-07)
2. e77aa336f1 (ls-files: optionally recurse into submodules, 2016-10-07)
3. 89c8626557 (submodule helper: support super prefix, 2016-12-08)
4. 0281e487fd (grep: optionally recurse into submodules, 2016-12-16)
5. 3d415425c7 (unpack-trees: support super-prefix option, 2017-01-17)
6. 188dce131f (ls-files: use repository object, 2017-06-22)
7. f9ee2fcdfa (grep: recurse in-process using 'struct repository', 2017-08-02)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with a preceding commit to convert "absorbgitdirs", we can convert
"submodule--helper status" to use its own "--super-prefix", instead of
relying on the global "--super-prefix" argument to "git".
We need to convert both of these away from the global "--super-prefix"
at the same time, because "update" will call "clone", but "clone"
itself didn't make use of the global "--super-prefix" for displaying
paths. It was only on the list of sub-commands that accepted it
because "update"'s use of it would set it in its environment.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with a preceding commit to convert "absorbgitdirs", we can convert
"submodule--helper status" to use its own "--super-prefix", instead of
relying on the global "--super-prefix" argument to "git" itself. See
that earlier commit for the rationale and background.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with a preceding commit to convert "absorbgitdirs", we can convert
"submodule--helper sync" to use its own "--super-prefix", instead of
relying on the global "--super-prefix" argument to "git" itself. See
that earlier commit for the rationale and background.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with a preceding commit to convert "absorbgitdirs", we can convert
"submodule--helper foreach" to use its own "--super-prefix", instead
of relying on the global "--super-prefix" argument to "git"
itself. See that earlier commit for the rationale and background.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--super-prefix" facility was introduced in [1] has always been a
transitory hack, which is why we've made it an error to supply it as
an option to "git" to commands that don't know about it.
That's been a good goal, as it has a global effect we haven't wanted
calls to get_super_prefix() from built-ins we didn't expect.
But it has meant that when we've had chains of different built-ins
using it all of the processes in that "chain" have needed to support
it, and worse processes that don't need it have needed to ask for
"SUPPORT_SUPER_PREFIX" because their parent process needs it.
That's how "fsmonitor--daemon" ended up with it, per [2] it's called
from (among other things) "submodule--helper absorbgitdirs", but as we
declared "submodule--helper" as "SUPPORT_SUPER_PREFIX" we needed to
declare "fsmonitor--daemon" as accepting it too, even though it
doesn't care about it.
But in the case of "absorbgitdirs" it only needed "--super-prefix" to
invoke itself recursively, and we'd never have another "in-between"
process in the chain. So we didn't need the bigger hammer of "git
--super-prefix", and the "setenv(GIT_SUPER_PREFIX_ENVIRONMENT, ...)"
that it entails.
Let's instead accept a hidden "--super-prefix" option to
"submodule--helper absorbgitdirs" itself.
Eventually (as with all other "--super-prefix" users) we'll want to
clean this code up so that this all happens in-process. I.e. needing
any variant of "--super-prefix" is itself a hack around our various
global state, and implicit reliance on "the_repository". This stepping
stone makes such an eventual change easier, as we'll need to deal with
less global state at that point.
The "fsmonitor--daemon" test adjusted here was added in [3]. To assert
that it didn't run into the "--super-prefix" message it was asserting
the output it didn't have. Let's instead assert the full output that
we *do* have, using the same pattern as a preceding change to
"t/t7412-submodule-absorbgitdirs.sh" used.
We could also remove the test entirely (as [4] did), but even though
the initial reason for having it is gone we're still getting some
marginal benefit from testing the "fsmonitor" and "submodule
absorbgitdirs" interaction, so let's keep it.
The change here to have either a NULL or non-"" string as a
"super_prefix" instead of the previous arrangement of "" or non-"" is
somewhat arbitrary. We could also decide to never have to check for
NULL.
As we'll be changing the rest of the "git --super-prefix" users to the
same pattern, leaving them all consistent makes sense. Why not pick ""
over NULL? Because that's how the "prefix" works[5], and having
"prefix" and "super_prefix" work the same way will be less
confusing. That "prefix" picked NULL instead of "" is itself
arbitrary, but as it's easy to make this small bit of our overall API
consistent, let's go with that.
1. 74866d7579 (git: make super-prefix option, 2016-10-07)
2. 53fcfbc84f (fsmonitor--daemon: allow --super-prefix argument,
2022-05-26)
3. 53fcfbc84f (fsmonitor--daemon: allow --super-prefix argument,
2022-05-26)
4. https://lore.kernel.org/git/20221109004708.97668-5-chooglen@google.com/
5. 9725c8dda2 (built-ins: trust the "prefix" from run_builtin(),
2022-02-16)
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start passing the "super_prefix" along as a parameter to
get_submodule_displaypath() and absorb_git_dir_into_superproject(),
rather than get the value directly as a global.
This is in preparation for subsequent commits, where we'll gradually
phase out get_super_prefix() for an alternative way of getting the
"super_prefix".
Most of the users of this get a get_super_prefix() value, either
directly or by indirection. The exceptions are:
- builtin/rm.c: Doesn't declare SUPPORT_SUPER_PREFIX, so we'd have
died if this was provided, so it's safe to pass "NULL".
- deinit_submodule(): The "deinit_submodule()" function has never been
able to use the "git -super-prefix". It will call
"absorb_git_dir_into_superproject()", but it will only do so from the
top-level project.
If "absorbgitdirs" recurses will use the "path" passed to
"absorb_git_dir_into_superproject()" in "deinit_submodule()" as its
starting "--super-prefix". So we can safely remove the
get_super_prefix() call here, and pass NULL instead.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since "git fetch --refetch" was introduced in 0f5e885173 (Merge
branch 'rc/fetch-refetch', 2022-04-04) the test being added here would
fail. This is because "restore" will "read-tree .. --reset <hash>",
which will in turn invoke "fetch". The "fetch" will then die with:
fatal: fetch doesn't support --super-prefix
This edge case and other "--super-prefix" bugs will be fixed in
subsequent commits, but let's first add a "test_expect_failure" test
for it. It passes until the very last command in the test.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a blind spots in the tests surrounding "submodule absorbgitdirs"
and test what output we emit, and how emitted the message and behavior
interacts with a "git worktree" where the repository isn't at the base
of the working directory.
The "$(pwd)" instead of "$PWD" here is needed due to Windows, where
the latter will be a path like "/d/a/git/[...]", whereas we need
"D:/a/git/[...]".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because we use the C runtime and
use _beginthreadex to create pthreads,
pthread_exit MUST use _endthreadex.
Otherwise, according to Microsoft:
"Failure to do so results in small
memory leaks when the thread
calls ExitThread."
Simply put, this is not the same as ExitThread.
Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
mboxrd is a more robust output format when used with --stdout
and needs more exposure. Introducing this config knob lets
users choose the more robust format for all their --stdout
uses.
Relying on --pretty=mboxrd and including all of pretty-formats.txt
in the `git format-patch' documentation would likely be
confusing to users. Furthermore, this setting is useful across
multiple invocations. So introduce `format.mboxrd' as a boolean
configuration knob that changes the default --pretty=email format
to --pretty=mboxrd when (and only when) --stdout is in use.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous change introduced the transport methods to acquire a bundle
list from the 'bundle-uri' protocol v2 command, when advertised _and_
when the client has chosen to enable the feature.
Teach Git to download and unbundle the data advertised by those bundles
during 'git clone'. This takes place between the ref advertisement and
the object data download, and stateful connections will linger while
the client downloads bundles. In the future, we should consider closing
the remote connection during this process.
Also, since the --bundle-uri option exists, we do not want to mix the
advertised bundles with the user-specified bundles.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The logic in fetch_bundle_uri() is useful for the --bundle-uri option of
'git clone', but is not helpful when the clone operation discovers a
list of URIs from the bundle-uri protocol v2 command. To actually
download and unbundle the advertised bundles, we need a different
mechanism.
Create the new fetch_bundle_list() method which is very similar to
fetch_bundle_uri() except that it relies on download_bundle_list()
instead of fetch_bundle_uri_internal(). The download_bundle_list()
method will recursively call fetch_bundle_uri_internal() if any of the
advertised URIs serve a bundle list instead of a bundle. This will also
follow the bundle.list.mode setting from the input list: "any" will
download only one such URI while "all" will download data from all of
the URIs.
In an identical way to fetch_bundle_uri(), the bundles are unbundled
after all of the bundle lists have been expanded and all necessary URIs.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bundle providers may want to distribute that data across multiple CDNs.
This might require a change in the base URI, all the way to the domain
name. If all bundles require an absolute URI in their 'uri' value, then
every push to a CDN would require altering the table of contents to
match the expected domain and exact location within it.
Allow a bundle list to specify a relative URI for the bundles. This URI
is based on where the client received the bundle list. For a list
provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
the base URI. Otherwise, the bundle list was provided from an HTTP URI
not using the Git protocol, and that URI is the base URI. This allows
easier distribution of bundle data.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strbuf_parent_directory() method was added as a static method in
contrib/scalar by d0feac4e8c (scalar: 'register' sets recommended
config and starts maintenance, 2021-12-03) and then removed in
65f6a9eb0b (scalar: constrain enlistment search, 2022-08-18), but now
there is a need for a similar method in the bundle URI feature.
Re-add the method, this time in strbuf.c, but with a new name:
strbuf_strip_file_from_path(). The method requirements are slightly
modified to allow a trailing slash, in which case nothing is done, which
makes the name change valuable.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the "bundle-uri" protocol v2 capability by populating the
key=value packet lines from the local Git config. The list of bundles is
provided from the keys beginning with "bundle.".
In the future, we may want to filter this list to be more specific to
the exact known keys that the server intends to share, but for
flexibility at the moment we will assume that the config values are
well-formed.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
for issuing protocol v2 "bundle-uri" commands to a server, and to the
parsing routines in bundle-uri.c.
In the "git clone" case we'll have already done the handshake(),
but not here. Add an extra case to check for this handshake in
get_bundle_uri() for ease of use for future callers.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'got_remote_heads' member of 'struct git_transport_data' was used
historically to indicate that the initial server connection was made and
the ref advertisement was returned. With protocol v2, that initial
handshake does not necessarily include the ref advertisement, so this
member is not an accurate name. Thankfully, all uses of the member are
only checking to see if the handshake should take place, not whether or
not some local data has the ref advertisement.
Rename the member to 'finished_handshake' to represent the proper state.
Note that the variable is only set to 1 during the handshake() method.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The yet-to-be introduced client support for bundle-uri will always
fall back on a full clone, but we'd still like to be able to ignore a
server's bundle-uri advertisement entirely.
The new transfer.bundleURI config option defaults to 'false', but a user
can set it to 'true' to enable checking for bundle URIs from the origin
Git server using protocol v2.
Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set up all the needed client parts of the 'bundle-uri' protocol v2
command, without actually doing anything with the bundle URIs.
If the server says it supports 'bundle-uri' teach Git to issue the
'bundle-uri' command after the 'ls-refs' during 'git clone'. The
returned key=value pairs are passed to the bundle list code which is
tested using a different ingest mechanism in t5750-bundle-uri-parse.sh.
At this point, Git does nothing with that bundle list. It will not
download any of the bundles. That will come in a later change after
these protocol bits are finalized.
The no-op client is initially used only by 'git clone' to test the basic
functionality, and eventually will bootstrap the initial download of Git
objects during a fresh clone. The bundle URI client will not be
integrated into other fetches until a mechanism is created to select a
subset of bundles for download.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous change allowed for a Git server to advertise the
'bundle-uri' command as a capability based on the
uploadPack.advertiseBundleURIs config option. Create a set of tests that
check that this capability is advertised using 'git ls-remote'.
In order to test this functionality across three protocols (file, git,
and http), create lib-bundle-uri-protocol.sh to generalize the tests,
allowing the other test scripts to set an environment variable and
otherwise inherit the setup and tests from this script.
The tests currently only test that the 'bundle-uri' command is
advertised or not. Other actions will be tested as the Git client learns
to request the 'bundle-uri' command and parse its response.
To help with URI escaping, specifically for file paths with a space in
them, extract a 'sed' invocation from t9199-git-svn-info.sh into a
helper function for use here, too.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a skeleton server-side implementation of a new "bundle-uri" command
to protocol v2. This will allow conforming clients to optionally seed
their initial clones or incremental fetches from URLs containing
"*.bundle" files created with "git bundle create".
This change only performs the basic boilerplate of advertising a new
protocol v2 capability. The new 'bundle-uri' capability allows a client
to request a list of bundles. Right now, the server only returns a flush
packet, which corresponds to an empty advertisement. The bundle.* config
namespace describes which key-value pairs will be communicated across
this interface in future updates.
The critical bit right now is that the new boolean
uploadPack.adverstiseBundleURIs config value signals whether or not this
capability should be advertised at all.
An earlier version of this patch [1] used a different transfer format
than the "key=value" pairs in the current implementation. The change was
made to unify the protocol v2 command with the bundle lists provided by
independent bundle servers. Further, the standard allows for the server
to advertise a URI that contains a bundle list. This allows users
automatically discovering bundle providers that are loosely associated
with the origin server, but without the origin server knowing exactly
which bundles are currently available.
[1] https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/
The very-deep headings needed to be modified to stop at level 4 due to
documentation build issues. These were not recognized in earlier builds
since the file was previously in the Documentation/technical/ directory
and was built in a different way. With its current location, the
heavily-nested details were causing build issues and they are now
replaced with a bulletted list of details.
Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At GitLab we have recently received a report where a repository was left
with a corrupted `packed-refs` file after the node hard-crashed even
though `core.fsync=reference` was set. This is something that in theory
should not happen if we correctly did the atomic-rename dance to:
1. Write the data into a temporary file.
2. Synchronize the temporary file to disk.
3. Rename the temporary file into place.
So if we crash in the middle of writing the `packed-refs` file we should
only ever see either the old or the new state of the file.
And while we do the dance when writing the `packed-refs` file, there is
indeed one gotcha: we use a `FILE *` stream to write the temporary file,
but don't flush it before synchronizing it to disk. As a consequence any
data that is still buffered will not get synchronized and a crash of the
machine may cause corruption.
Fix this bug by flushing the file stream before we fsync.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since aef7d75e58 (builtin/bundle.c: let parse-options parse
subcommands, 2022-08-19) we've been segfaulting if no argument was
provided.
The fix is easy, as all of the "git bundle" subcommands require a
non-option argument we can check that we have arguments left after
calling parse-options().
This makes use of code added in 73c3253d75 (bundle: framework for
options before bundle file, 2019-11-10), before this change that code
has always been unreachable. In 73c3253d75 we'd never reach it as we
already checked "argc < 2" in cmd_bundle() itself.
Then when aef7d75e58 (whose segfault we're fixing here) migrated this
code to the subcommand API it removed that "argc < 2" check, but we
were still checking the wrong "argc" in parse_options_cmd_bundle(), we
need to check the "newargc". The "argc" will always be >= 1, as it
will necessarily contain at least the subcommand name
itself (e.g. "create").
As an aside, this could be safely squashed into this, but let's not do
that for this minimal segfault fix, as it's an unrelated refactoring:
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -55,13 +55,12 @@ static int parse_options_cmd_bundle(int argc,
const char * const usagestr[],
const struct option options[],
char **bundle_file) {
- int newargc;
- newargc = parse_options(argc, argv, NULL, options, usagestr,
+ argc = parse_options(argc, argv, NULL, options, usagestr,
PARSE_OPT_STOP_AT_NON_OPTION);
- if (!newargc)
+ if (!argc)
usage_with_options(usagestr, options);
*bundle_file = prefix_filename(prefix, argv[0]);
- return newargc;
+ return argc;
}
static int cmd_bundle_create(int argc, const char **argv, const char *prefix) {
Reported-by: Hubert Jasudowicz <hubertj@stmcyber.pl>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Tested-by: Hubert Jasudowicz <hubertj@stmcyber.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though the cat-file command with `--batch-check` option does not
complain when `--use-mailmap` option is given, the latter option is
ignored. Compute the size of the object after replacing the idents and
report it instead.
In order to make `--batch-check` option honour the mailmap mechanism we
have to read the contents of the commit/tag object.
There were two ways to do it:
1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
option is given, the first call will get us the type of the object
and second call will only be made if the object type is either a
commit or tag to get the contents of the object.
2. Make one call to `oid_object_info_extended()` to get the type of the
object. Then, if the object type is either of commit or tag, make a
call to `repo_read_object_file()` to read the contents of the object.
I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:
`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`
The results can be summarized as follows:
Time (mean ± σ)
default 827.7 ms ± 104.8 ms
first approach 6.197 s ± 0.093 s
second approach 1.975 s ± 0.217 s
Since, the second approach is faster than the first one, I implemented
it in this patch.
The command git cat-file can now use the mailmap mechanism to replace
idents with canonical versions for commit and tag objects. There are
several options like `--batch`, `--batch-check` and `--batch-command`
that can be combined with `--use-mailmap`. But the documentation for
`--batch`, `--batch-check` and `--batch-command` doesn't say so. This
patch fixes that documentation.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though the cat-file command with `-s` option does not complain when
`--use-mailmap` option is given, the latter option is ignored. Compute
the size of the object after replacing the idents and report it instead.
In order to make `-s` option honour the mailmap mechanism we have to
read the contents of the commit/tag object. Make use of the call to
`oid_object_info_extended()` to get the contents of the object and store
in `buf`. `buf` is later freed in the function.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of deprecation warnings in the CI runs. Also gets the latest
security patches.
Signed-off-by: Chris. Webster <chris@webstech.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A message in the step log will refer to the Summary output.
The job summary output is using markdown to improve readability. The
git commands and commits with errors are now in ordered lists.
Commits and files in error are links to the user's repository.
Signed-off-by: Chris. Webster <chris@webstech.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the errors more visible by adding them to the job summary and
display the git commands that will usually fix the problem.
Signed-off-by: Chris. Webster <chris@webstech.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It has been a frequent matter of contention that the win+VS jobs not
only take a long time to run, but are also more easily broken than the
other jobs (because they do not use the same `Makefile`-based builds as
all other jobs), and to make matters worse, these breakages are also
much harder to diagnose and fix than other jobs', especially for
contributors who are happy to stay away from Windows.
The purpose of these win+VS jobs is to maintain the CMake-based build
of Git, with the target audience being Visual Studio users on Windows
who are typically quite unfamiliar with `make` and POSIX shell
scripting, but the benefit of whose expertise we want for the Git
project nevertheless.
The CMake support was introduced for that specific purpose, and already
early on concerns were raised that it would put an undue burden on
contributors to ensure that these jobs pass in CI, when they do not have
access to Windows machines (nor want to have that).
This developer's initial hope was that it would be enough to fix win+VS
failures and provide the changes to be squashed into contributors'
patches, and that it would be worth the benefit of attracting
Windows-based developers' contributions.
Neither of these hopes have panned out.
To lower the frustration, and incidentally benefit from using way less
build minutes, let's just not run the win+VS jobs by default, which
appears to be the consensus of the mail thread leading up to
https://lore.kernel.org/git/xmqqk0311blt.fsf@gitster.g/
Since the Git for Windows project still needs to at least try to attract
more of said Windows-based developers, let's keep the jobs, but disable
them everywhere except in Git for Windows' fork. This will help because
Git for Windows' branch thicket is "continuously rebased" via automation
to the `shears/maint`, `shears/main`, `shears/next` and `shears/seen`
branches at https://github.com/git-for-windows/git. That way, the Git
for Windows project will still be notified early on about potential
breakages, but the Git project won't be burdened with fixing them
anymore, which seems to be the best compromise we can get on this issue.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the -L argument to "git log" is passed the zero-width regular
expression "$" (as in "-L :$:line-range.c"), this results in an
infinite loop in find_funcname_matching_regexp().
Modify find_funcname_matching_regexp to correctly match the entire line
instead of the zero-width match at eol and update the loop condition to
prevent an infinite loop in the event of other undiscovered corner cases.
The primary change is that we pre-decrement the beginning-of-line marker
('bol') before comparing it to '\n'. In the case of '$', where we match the
'\n' at the end of the line and start the loop with bol == eol, this
ensures that bol will find the beginning of the line on which the match
occurred.
Originally reported in <https://stackoverflow.com/q/74690545/147356>.
Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a repository is on a FAT32 file system, the user sees a message
that the path ownership cannot be determined. Fix a typo in the
message.
Signed-off-by: Daniël Haazen <danielhaazen@hotmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The advice message given by "git status" when it takes long time to
enumerate untracked paths has been updated.
* rr/status-untracked-advice:
status: modernize git-status "slow untracked files" advice
Introduce a case insensitive mode to the Bash completion helpers.
* aw/complete-case-insensitive:
completion: add case-insensitive match of pseudorefs
completion: add optional ignore-case when matching refs
The way the diff machinery prepares the options array for the
parse_options API has been refactored to avoid resource leaks.
* rs/diff-parseopts:
diff: remove parseopts member from struct diff_options
diff: use add_diff_options() in diff_opt_parse()
diff: factor out add_diff_options()
Redefining system functions for a few functions did not follow our
usual "implement git_foo() and #define foo(args) git_foo(args)"
pattern, which has broken build for some folks.
* jk/avoid-redef-system-functions-2.30:
git-compat-util: undefine system names before redeclaring them
git-compat-util: avoid redefining system function names
Unlike other test helper functions, 'test_oid' doesn't terminate its
output with a LF, but, alas, the reason for this, if any, is not
mentioned in 2c02b110da (t: add test functions to translate
hash-related values, 2018-09-13)).
Now, in the vast majority of cases 'test_oid' is invoked in a command
substitution that is part of a heredoc or supplies an argument to a
command or the value to a variable, and the command substitution would
chop off any trailing LFs, so in these cases the lack or presence of a
trailing LF in its output doesn't matter. However:
- There appear to be only three cases where 'test_oid' is not
invoked in a command substitution:
$ git grep '\stest_oid ' -- ':/t/*.sh'
t0000-basic.sh: test_oid zero >actual &&
t0000-basic.sh: test_oid zero >actual &&
t0000-basic.sh: test_oid zero >actual &&
These are all in test cases checking that 'test_oid' actually
works, and that the size of its output matches the size of the
corresponding hash function with conditions like
test $(wc -c <actual) -eq 40
In these cases the lack of trailing LF does actually matter,
though they could be trivially updated to account for the presence
of a trailing LF.
- There are also a few cases where the lack of trailing LF in
'test_oid's output actually hurts, because tests need to compare
its output with LF terminated file contents, forcing developers to
invoke it as 'echo $(test_oid ...)' to append the missing LF:
$ git grep 'echo "\?$(test_oid ' -- ':/t/*.sh'
t1302-repo-version.sh: echo $(test_oid version) >expect &&
t1500-rev-parse.sh: echo "$(test_oid algo)" >expect &&
t4044-diff-index-unique-abbrev.sh: echo "$(test_oid val1)" > foo &&
t4044-diff-index-unique-abbrev.sh: echo "$(test_oid val2)" > foo &&
t5313-pack-bounds-checks.sh: echo $(test_oid oidfff) >file &&
And there is yet another similar case in an in-flight topic at:
https://public-inbox.org/git/813e81a058227bd373cec802e443fcd677042fb4.1670862677.git.gitgitgadget@gmail.com/
Arguably we would be better off if 'test_oid' terminated its output
with a LF. So let's update 'test_oid' accordingly, update its tests
in t0000 to account for the extra character in those size tests, and
remove the now unnecessary 'echo $(...)' command substitutions around
'test_oid' invocations as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The editor program used by Git when editing the sequencer "todo" file
is determined by examining a few environment variables and also
affected by configuration variables. Introduce "git var
GIT_SEQUENCE_EDITOR" to give users access to the final result of the
logic without having to know the exact details.
This is very similar in spirit to 44fcb497 (Teach git var about
GIT_EDITOR, 2009-11-11) that introduced "git var GIT_EDITOR".
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since a56771a6 (builtin/pull: respect verbosity settings in
submodules, 2018-01-25), "git pull -v --recurse-submodules"
propagates the "-v" to the submodule command, but because the
latter command does not understand the option, it barfs.
Teach "git submodule update" to accept the option to fix it.
Signed-off-by: Sven Strickroth <email@cs-ware.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the sha1collisiondetection library was added and made the default
in [1] the interaction with APPLE_COMMON_CRYPTO added in [2] and [3]
seems to have been missed. On modern OSX and Darwin we are able to use
Apple's CommonCrypto both for SHA-1, and as a generic (but partial)
OpenSSL replacement.
This left OSX and Darwin without protection against the SHAttered
attack when building Git in its default configuration.
Let's also use sha1collisiondetection on OSX, to do so we'll need to
split up the "APPLE_COMMON_CRYPTO" flag into that flag and a new
"APPLE_COMMON_CRYPTO_SHA1".
Because of this we can stop conflating whether we want to use Apple's
CommonCrypto at all, and whether we want to use it for SHA-1. This
makes the CI recipe added in [4] simpler.
1. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
2. 4dcd7732db (Makefile: add support for Apple CommonCrypto facility, 2013-05-19)
3. 61067954ce (cache.h: eliminate SHA-1 deprecation warnings on Mac OS X, 2013-05-19)
4. 1ad5c3df35 (ci: use DC_SHA1=YesPlease on osx-clang job for CI,
2022-10-20)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error message for a bogus argument to %(authoremail), etc, is:
$ git for-each-ref --format='%(authoremail:foo)'
fatal: unrecognized email option: foo
Saying just "email" is a little vague; most of the other atom parsers
would use the full name "%(authoremail)", but we can't do that here
because the same function also handles %(taggeremail), etc. Until
recently, passing atom->name was a bad idea, because it erroneously
included the arguments in the atom name. But since the previous commit
taught err_bad_arg() to handle this, we can now do so and get:
fatal: unrecognized %(authoremail) argument: foo
which is consistent with other atoms.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you pass a bogus argument to %(refname), you may end up with a
message like this:
$ git for-each-ref --format='%(refname:foo)'
fatal: unrecognized %(refname:foo) argument: foo
which is confusing. It should just say:
fatal: unrecognized %(refname) argument: foo
which is clearer, and is consistent with most other atom parsers. Those
other parsers do not have the same problem because they pass the atom
name from a string literal in the parser function. But because the
parser for %(refname) also handles %(upstream) and %(push), it instead
uses atom->name, which includes the arguments. The oid atom parser which
handles %(tree), %(parent), etc suffers from the same problem.
It seems like the cleanest fix would be for atom->name to be _just_ the
name, since there's already a separate "args" field. But since that
field is also used for other things, we can't change it easily (e.g.,
it's how we find things in the used_atoms array, and clearly %(refname)
and %(refname:short) are not the same thing).
Instead, we'll teach our error_bad_arg() function to stop at the first
":". This is a little hacky, as we're effectively re-parsing the name,
but the format is simple enough to do this as a one-liner, and this
localizes the change to the error-reporting code.
We'll give the same treatment to err_no_arg(). None of its callers use
this atom->name trick, but it's worth future-proofing it while we're
here.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Atom parsers that take arguments generally have a catch-all for "this
arg is not recognized". Most of them use the same printf template, which
is good, because it makes life easier for translators. Let's pull this
template into a helper function, which makes the code in the parsers
shorter and avoids any possibility of differences.
As with the previous commit, we'll pick an arbitrary atom to make sure
the test suite covers this code.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many atom parsers give the same error message, differing only in the
name of the atom. If we use "%s does not take arguments", that should
make life easier for translators, as they only need to translate one
string. And in doing so, we can easily pull it into a helper function to
make sure they are all using the exact same string.
I've added a basic test here for %(HEAD), just to make sure this code is
exercised at all in the test suite. We could cover each such atom, but
the effort-to-reward ratio of trying to maintain an exhaustive list
doesn't seem worth it.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The %(HEAD) atom doesn't take any arguments, but unlike other atoms in
the same boat (objecttype, deltabase, etc), it does not detect this
situation and complain. Let's make it consistent with the others.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A regression was introduced in
12fc4ad89e (diff.c: use utf8_strwidth() to count display width, 2022-09-14)
that causes missing newlines after "Unmerged" entries in `git diff
--cached --stat` output.
This problem affects v2.39.0-rc0 through v2.39.0.
Add the missing newline along with a new test to cover this
behavior.
Signed-off-by: Peter Grayson <pete@jpgrayson.net>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the call to `FSEventStreamScheduleWithRunLoop()` function with
the suggested `FSEventStreamSetDispatchQueue()` function.
The MacOS version of the builtin FSMonitor feature uses the
`FSEventStreamScheduleWithRunLoop()` function to drive the event loop
and process FSEvents from the system. This routine has now been
deprecated by Apple. The MacOS 13 (Ventura) compiler tool chain now
generates a warning when compiling calls to this function. In
DEVELOPER=1 mode, this now causes a compile error.
The `FSEventStreamSetDispatchQueue()` function is conceptually similar
and is the suggested replacement. However, there are some subtle
thread-related differences.
Previously, the event stream would be processed by the
`fsm_listen__loop()` thread while it was in the `CFRunLoopRun()`
method. (Conceptually, this was a blocking call on the lifetime of
the event stream where our thread drove the event loop and individual
events were handled by the `fsevent_callback()`.)
With the change, a "dispatch queue" is created and FSEvents will be
processed by a hidden queue-related thread (that calls the
`fsevent_callback()` on our behalf). Our `fsm_listen__loop()` thread
maintains the original blocking model by waiting on a mutex/condition
variable pair while the hidden thread does all of the work.
While the deprecated API used by the original were introduced in
macOS 10.5 (Oct 2007), the API used by the updated code were
introduced back in macOS 10.6 (Aug 2009) and has been available
since then. So this change _could_ break those who have happily
been using 10.5 (if there were such people), but these two dates
both predate the oldest versions of macOS Apple seems to support
anyway, so we should be safe.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing commits, fail fast when the commit is missing or
corrupt, instead of attempting to fetch them. This is done by inlining
repo_read_object_file() and setting the flag that prevents fetching.
This is motivated by a situation in which through a bug (not necessarily
through Git), there was corruption in the object store of a partial
clone. In this particular case, the problem was exposed when "git gc"
tried to expire reflogs, which calls repo_parse_commit(), which triggers
fetches of the missing commits.
(There are other possible solutions to this problem including passing an
argument from "git gc" to "git reflog" to inhibit all lazy fetches, but
I think that this fix is at the wrong level - fixing "git reflog" means
that this particular command works fine, or so we think (it will fail if
it somehow needs to read a legitimately missing blob, say, a .gitmodules
file), but fixing repo_parse_commit() will fix a whole class of bugs.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of relying on errno being preserved across function calls, teach
do_oid_object_info_extended() to itself report object corruption when
it first detects it. There are 3 types of corruption being detected:
- when a replacement object is missing
- when a loose object is corrupt
- when a packed object is corrupt and the object cannot be read
in another way
Note that in the RHS of this patch's diff, a check for ENOENT that was
introduced in 3ba7a06552 (A loose object is not corrupt if it cannot
be read due to EMFILE, 2010-10-28) is also removed. The purpose of this
check is to avoid a false report of corruption if the errno contains
something like EMFILE (or anything that is not ENOENT), in which case
a more generic report is presented. Because, as of this patch, we no
longer rely on such a heuristic to determine corruption, but surface
the error message at the point when we read something that we did not
expect, this check is no longer necessary.
Besides being more resilient, this also prepares for a future patch in
which an indirect caller of do_oid_object_info_extended() will need
such functionality.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function can do 3 things:
1. Gets an fd given a path
2. Simultaneously gets a path and fd given an OID
3. Memory maps an fd
Keep 3 (renaming the function accordingly) and inline 1 and 2 into their
respective callers.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Its last user was removed in 97b2fa08b6 (fetch-pack: drop
custom loose object cache, 2018-11-12), so we can remove it.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git var UNKNOWN_VARIABLE" and "git var VARIABLE" with the variable
given an empty value used to behave identically. Now the latter
just gives an empty output, while the former still gives an error
message.
* sa/git-var-empty:
var: allow GIT_EDITOR to return null
var: do not print usage() with a correct invocation
The pack-bitmap machinery is taught to log the paths of redundant
bitmap(s) to trace2 instead of stderr.
* tl/pack-bitmap-absolute-paths:
pack-bitmap.c: trace bitmap ignore logs when midx-bitmap is found
pack-bitmap.c: break out of the bitmap loop early if not tracing
pack-bitmap.c: avoid exposing absolute paths
pack-bitmap.c: remove unnecessary "open_pack_index()" calls
"git jump" (in contrib/) learned to present the "quickfix list" to
its standard output (instead of letting it consumed by the editor
it invokes), and learned to also drive emacs/emacsclient.
* yn/git-jump-emacs:
git-jump: invoke emacs/emacsclient
git-jump: move valid-mode check earlier
git-jump: add an optional argument '--stdout'
Various leak fixes.
* ab/various-leak-fixes:
built-ins: use free() not UNLEAK() if trivial, rm dead code
revert: fix parse_options_concat() leak
cherry-pick: free "struct replay_opts" members
rebase: don't leak on "--abort"
connected.c: free the "struct packed_git"
sequencer.c: fix "opts->strategy" leak in read_strategy_opts()
ls-files: fix a --with-tree memory leak
revision API: call graph_clear() in release_revisions()
unpack-file: fix ancient leak in create_temp_file()
built-ins & libs & helpers: add/move destructors, fix leaks
dir.c: free "ident" and "exclude_per_dir" in "struct untracked_cache"
read-cache.c: clear and free "sparse_checkout_patterns"
commit: discard partial cache before (re-)reading it
{reset,merge}: call discard_index() before returning
tests: mark tests as passing with SANITIZE=leak
"merge-tree" learns a new `--merge-base` option.
* kz/merge-tree-merge-base:
docs: fix description of the `--merge-base` option
merge-tree.c: allow specifying the merge-base when --stdin is passed
merge-tree.c: add --merge-base=<commit> option
`git bisect` becomes a builtin.
* dd/git-bisect-builtin:
bisect; remove unused "git-bisect.sh" and ".gitignore" entry
Turn `git bisect` into a full built-in
bisect--helper: log: allow arbitrary number of arguments
bisect--helper: handle states directly
bisect--helper: emit usage for "git bisect"
bisect test: test exit codes on bad usage
bisect--helper: identify as bisect when report error
bisect-run: verify_good: account for non-negative exit status
bisect run: keep some of the post-v2.30.0 output
bisect: fix output regressions in v2.30.0
bisect: refactor bisect_run() to match CodingGuidelines
bisect tests: test for v2.30.0 "bisect run" regressions
write_buffer() reports the OS error if it is unable to write. Its only
caller dies in that case, giving some more context in its last message.
Inline this function and show only a single error message that includes
both the context (writing a loose object file) and the OS error. This
shortens the code and simplifies the output.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since f12fa9ee6c (userdiff: add and use for_each_userdiff_driver(),
2021-04-08), lookup of userdiffs is done with a generic
for_each_userdiff_driver(). But the name lookup doesn't use the "type"
field, of course.
We can't get rid of that field from the generic interface because it is
used by t/helper/test-userdiff.c. So mark it as unused in this instance
to silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct filter" abstract type defines several virtual function
pointers. Not all of the concrete functions need every parameter, but
they have to conform to the generic interface. Mark unused ones to
silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diff code provides a format_callback interface, but not every
callback needs each parameter (e.g., the "opt" and "data" parameters are
frequently left unused). Likewise for the output_prefix callback, the
low-level change/add_remove interfaces, the callbacks used by
xdi_diff(), etc.
Mark unused arguments in the callback implementations to quiet
-Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is used interchangeably with xdl_emit via a function
pointer, so we can't just drop the unused parameter. Mark it to silence
-Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The def_ff() function is the default "find_func" for finding hunk
headers. It has never used its "priv" argument since it was introduced
in f258475a6e (Per-path attribute based hunk header selection.,
2007-07-06). But back then we used a function pointer to switch between
a caller-provided function and the default, so the two had to conform to
the same interface.
In ff2981f724 (xdiff: factor out match_func_rec(), 2016-05-28), that
pointer indirection went away in favor of code which directly calls
either of the two functions. So there's no need for def_ff() to retain
this unused parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We take a ws_rule parameter, but have never looked at it since the
function was added in 877f23ccb8 (Teach "diff --check" about new blank
lines at end, 2008-06-26). A comment in the function does mention how we
_could_ use it, but nobody has felt the need to do so for over a decade.
We could keep it around as reminder of what could be done, but the
comment serves that purpose. And in the meantime, it triggers
-Wunused-parameter.
So let's drop it, which in turn allows us to drop similar arguments
further up the callstack. I've left the comment intact. It does still
say "ws_rule", but that name is used consistently in the whitespace
code, so the meaning is clear.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our object graph traversal code has a process_gitlink() function which
we call when we see a gitlink entry. The function does nothing; it was
added in the early days of gitlinks by 6e2f441bd4 (Teach git
list-objects logic to not follow gitlinks, 2007-04-13).
The comment above the function talks about some things we _could_ do.
But in the intervening 15 years, nobody has touched the function, and
the submodule code usually makes its own decisions about when and how to
examine the links. At the generic traversal layer, we can't assume that
the pointed-to commit is available.
Let's drop this placeholder that isn't really helping anything. This
silences some -Wunused-parameter warnings, and also gets rid of a crufty
use of "const unsigned char *" to pass a raw hash value.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our parse_blob_buffer() takes a ptr/len combo, just like
parse_tree_buffer(), etc, and returns success or failure. But it doesn't
actually do anything with them; we just set the "parsed" flag in the
object and return success, without even looking at the contents.
There could be some value to keeping these unused parameters:
- it's consistent with the parse functions for other object types. But
we already lost that consistency in 837d395a5c (Replace parse_blob()
with an explanatory comment, 2010-01-18).
- As the comment from 837d395a5c explains, callers are supposed to
make sure they have the object content available. So in theory
asking for these parameters could serve as a signal. But there are
only two callers, and one of them always passes NULL (after doing a
streaming check of the object hash).
This shows that there aren't likely to be a lot of callers (since
everyone either uses the type-generic parse functions, or handles
blobs individually), and that they need to take special care anyway
(because we usually want to avoid loading whole blobs in memory if
we can avoid it).
So let's just drop these unused parameters, and likewise the useless
return value. While we're touching the header file, let's move the
declaration of parse_blob_buffer() right below that explanatory comment,
where it's more likely to be seen by people looking for the function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ls_refs() function (for the v2 protocol command of the same name)
takes a repository parameter (like all v2 commands), but ignores it. It
should use it to access the refs.
This isn't a bug in practice, since we only call this function when
serving upload-pack from the main repository. But it's an awkward
gotcha, and it causes -Wunused-parameter to complain.
The main reason we don't use the repository parameter is that the ref
iteration interface we call doesn't have a "refs_" variant that takes a
ref_store. However we can easily add one. In fact, since there is only
one other caller (in ref-filter.c), there is no need to maintain the
non-repository wrapper; that caller can just use the_repository. It's
still a long way from consistently using a repository object, but it's
one small step in the right direction.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The server_supports_v2() helper lets a caller find out if the server
supports a feature, and will optionally die if it's not supported. This
makes the return value confusing, as it's only meaningful when the
function is not asked to die.
Coverity flagged a new call like:
/* check that we support "foo" */
server_supports_v2("foo", 1);
complaining that we usually checked the return value, but this time we
didn't. But this call is correct, and other ones that did:
if (server_supports_v2("foo", 1))
do_something_with_foo();
are "wrong", in the sense that we know the conditional will always be
true (but there's no bug; the code is simply misleading).
Let's split the "die" behavior into its own function which returns void,
and modify each caller to use the correct one.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
apply_parse_options() passes the array of argument strings to
parse_options(), which removes recognized options. The removed strings
are not freed, though.
Make a copy of the strvec to pass to the function to retain the pointers
of its strings, so we release them all at the end.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Don't put clean parents on the pending list, as they and their ancestors
don't need any treatment and would be skipped later anyway. This saves
the allocation and release of a commit list item in ca. 20% of the cases
during a run of the test suite.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
reflog_expiry_prepare() calls mark_reachable(), which recurively flags
commits as REACHABLE. The traversal stops beyond a certain age
threshold; the boundary commits also marked as REACHABLE and put back
into mark_list at the end. unreachable() finishes the traversal down to
the roots if necessary -- but if all interesting commits are younger
than the age threshold then only recent commits need to be visited.
When this optimization works then the boundary commits still sit there
in mark_list at the end. Clear their REACHABLE flag and release the
commit list allocations.
While at it remove a duplicate code line from mark_reachable(); the same
flag is already set five lines up.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ee4512ed48 ("trace2: create new combined trace facility", 2019-02-
22) introduced trace2_cmd_name() and taught both the Git built-ins and
some non-built-ins to use it. However, http-fetch was not one of them
(perhaps due to its low usage at the time).
Teach http-fetch to invoke this function. After this patch, this
function will be invoked right after argument parsing, just like in
remote-curl.c.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, auto correction doesn't work reliably for commands which must
run in a work tree (e.g. `git status`) in Git work trees which are
created from a bare repository.
As far as I'm able to determine, this has been broken since commit
659fef199f (help: use early config when autocorrecting aliases,
2017-06-14), where the call to `git_config()` in `help_unknown_cmd()`
was replaced with a call to `read_early_config()`. From what I can tell,
the actual cause for the unexpected error is that we call
`git_default_config()` in the `git_unknown_cmd_config` callback instead
of simply returning `0` for config entries which we aren't interested
in.
Calling `git_default_config()` in this callback to `read_early_config()`
seems like a bad idea since those calls will initialize a bunch of state
in `environment.c` (among other things `is_bare_repository_cfg`) before
we've properly detected that we're running in a work tree.
All other callbacks provided to `read_early_config()` appear to only
extract their configurations while simply returning `0` for all other
config keys.
This commit changes the `git_unknown_cmd_config` callback to not call
`git_default_config()`. Instead we also simply return `0` for config
keys which we're not interested in.
Additionally the commit adds a new test case covering `help.autocorrect`
in a work tree created from a bare clone.
Signed-off-by: Simon Gerber <gesimu@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git's test suite uses `test_cmp`, it is not actually trying to
compare binary files as the name `cmp` would suggest to users familiar
with Unix' tools, but the tests instead verify that actual output
matches the expected text.
On Unix, `cmp` works well enough for Git's purposes because only Line
Feed characters are used as line endings. However, on Windows, while
most tools accept Line Feeds as line endings, many tools produce
Carriage Return + Line Feed line endings, including some of the tools
used by the test suite (which are therefore provided via Git for Windows
SDK). Therefore, `cmp` would frequently fail merely due to different
line endings.
To accommodate for that, the `mingw_test_cmp` function was introduced
into Git's test suite to perform a line-by-line comparison that ignores
line endings. This function is a Bash function that is only used on
Windows, everywhere else `cmp` is used.
This is a double whammy because `cmp` is fast, and `mingw_test_cmp` is
slow, even more so on Windows because it is a Bash script function, and
Bash scripts are known to run particularly slowly on Windows due to
Bash's need for the POSIX emulation layer provided by the MSYS2 runtime.
The commit message of 32ed3314c1 (t5351: avoid using `test_cmp` for
binary data, 2022-07-29) provides an illuminating account of the
consequences: On Windows, the platform on which Git could really use all
the help it can get to improve its performance, the time spent on one
entire test script was reduced from half an hour to less than half a
minute merely by avoiding a single call to `mingw_test_cmp` in but a
single test case.
Learning the lesson to avoid shell scripting wherever possible, the Git
for Windows project implemented a minimal replacement for
`mingw_test_cmp` in the form of a `test-tool` subcommand that parses the
input files line by line, ignoring line endings, and compares them.
Essentially the same thing as `mingw_test_cmp`, but implemented in
C instead of Bash. This solution served the Git for Windows project
well, over years.
However, when this solution was finally upstreamed, the conclusion was
reached that a change to use `git diff --no-index` instead of
`mingw_test_cmp` was more easily reviewed and hence should be used
instead.
The reason why this approach was not even considered in Git for Windows
is that in 2007, there was already a motion on the table to use Git's
own diff machinery to perform comparisons in Git's test suite, but it
was dismissed in https://lore.kernel.org/git/xmqqbkrpo9or.fsf@gitster.g/
as undesirable because tests might potentially succeed due to bugs in
the diff machinery when they should not succeed, and those bugs could
therefore hide regressions that the tests try to prevent.
By the time Git for Windows' `mingw-test-cmp` in C was finally
contributed to the Git mailing list, reviewers agreed that the diff
machinery had matured enough and should be used instead.
When the concern was raised that the diff machinery, due to its
complexity, would perform substantially worse than the test helper
originally implemented in the Git for Windows project, a test
demonstrated that these performance differences are well lost within the
100+ minutes it takes to run Git's test suite on Windows.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recently, a vulnerability was reported that can lead to an out-of-bounds
write when reading an unreasonably large gitattributes file. The root
cause of this error are multiple integer overflows in different parts of
the code when there are either too many lines, when paths are too long,
when attribute names are too long, or when there are too many attributes
declared for a pattern.
As all of these are related to size, it seems reasonable to restrict the
size of the gitattributes file via git-fsck(1). This allows us to both
stop distributing known-vulnerable objects via common hosting platforms
that have fsck enabled, and users to protect themselves by enabling the
`fetch.fsckObjects` config.
There are basically two checks:
1. We verify that size of the gitattributes file is smaller than
100MB.
2. We verify that the maximum line length does not exceed 2048
bytes.
With the preceding commits, both of these conditions would cause us to
either ignore the complete gitattributes file or blob in the first case,
or the specific line in the second case. Now with these consistency
checks added, we also grow the ability to stop distributing such files
in the first place when `receive.fsckObjects` is enabled.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the checks for gitattributes so that they can be extended more
readily.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `fsck_finish()` we check all blobs for consistency that we have found
during the tree walk, but that haven't yet been checked. This is only
required for gitmodules right now, but will also be required for a new
check for gitattributes.
Pull out a function `fsck_blobs()` that allows the caller to check a set
of blobs for consistency.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In general, we don't need to validate blob contents as they are opaque
blobs about whose content Git doesn't need to care about. There are some
exceptions though when blobs are linked into trees so that they would be
interpreted by Git. We only have a single such check right now though,
which is the one for gitmodules that has been added in the context of
CVE-2018-11235.
Now we have found another vulnerability with gitattributes that can lead
to out-of-bounds writes and reads. So let's refactor `fsck_blob()` so
that it is more extensible and can check different types of blobs.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both the padding and wrapping formatting directives allow the caller to
specify an integer that ultimately leads to us adding this many chars to
the result buffer. As a consequence, it is trivial to e.g. allocate 2GB
of RAM via a single formatting directive and cause resource exhaustion
on the machine executing this logic. Furthermore, it is debatable
whether there are any sane usecases that require the user to pad data to
2GB boundaries or to indent wrapped data by 2GB.
Restrict the input sizes to 16 kilobytes at a maximum to limit the
amount of bytes that can be requested by the user. This is not meant
as a fix because there are ways to trivially amplify the amount of
data we generate via formatting directives; the real protection is
achieved by the changes in previous steps to catch and avoid integer
wraparound that causes us to under-allocate and access beyond the
end of allocated memory reagions. But having such a limit
significantly helps fuzzing the pretty format, because the fuzzer is
otherwise quite fast to run out-of-memory as it discovers these
formatters.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `strbuf_utf8_replace`, we preallocate the destination buffer and then
use `memcpy` to copy bytes into it at computed offsets. This feels
rather fragile and is hard to understand at times. Refactor the code to
instead use `strbuf_add` and `strbuf_addstr` so that we can be sure that
there is no possibility to perform an out-of-bounds write.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `strbuf_utf8_replace()`, we call `utf8_width()` to compute the width
of the current glyph. If the glyph is a control character though it can
be that `utf8_width()` returns `-1`, but because we assign this value to
a `size_t` the conversion will cause us to underflow. This bug can
easily be triggered with the following command:
$ git log --pretty='format:xxx%<|(1,trunc)%x10'
>From all I can see though this seems to be a benign underflow that has
no security-related consequences.
Fix the bug by using an `int` instead. When we see a control character,
we now copy it into the target buffer but don't advance the current
width of the string.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return type of both `utf8_strwidth()` and `utf8_strnwidth()` is
`int`, but we operate on string lengths which are typically of type
`size_t`. This means that when the string is longer than `INT_MAX`, we
will overflow and thus return a negative result.
This can lead to an out-of-bounds write with `--pretty=format:%<1)%B`
and a commit message that is 2^31+1 bytes long:
=================================================================
==26009==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000001168 at pc 0x7f95c4e5f427 bp 0x7ffd8541c900 sp 0x7ffd8541c0a8
WRITE of size 2147483649 at 0x603000001168 thread T0
#0 0x7f95c4e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
#1 0x5612bbb1068c in format_and_pad_commit pretty.c:1763
#2 0x5612bbb1087a in format_commit_item pretty.c:1801
#3 0x5612bbc33bab in strbuf_expand strbuf.c:429
#4 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
#5 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
#6 0x5612bba0a4d5 in show_log log-tree.c:781
#7 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
#8 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
#9 0x5612bb69235b in cmd_log_walk builtin/log.c:549
#10 0x5612bb6951a2 in cmd_log builtin/log.c:883
#11 0x5612bb56c993 in run_builtin git.c:466
#12 0x5612bb56d397 in handle_builtin git.c:721
#13 0x5612bb56db07 in run_argv git.c:788
#14 0x5612bb56e8a7 in cmd_main git.c:923
#15 0x5612bb803682 in main common-main.c:57
#16 0x7f95c4c3c28f (/usr/lib/libc.so.6+0x2328f)
#17 0x7f95c4c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#18 0x5612bb5680e4 in _start ../sysdeps/x86_64/start.S:115
0x603000001168 is located 0 bytes to the right of 24-byte region [0x603000001150,0x603000001168)
allocated by thread T0 here:
#0 0x7f95c4ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
#1 0x5612bbcdd556 in xrealloc wrapper.c:136
#2 0x5612bbc310a3 in strbuf_grow strbuf.c:99
#3 0x5612bbc32acd in strbuf_add strbuf.c:298
#4 0x5612bbc33aec in strbuf_expand strbuf.c:418
#5 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
#6 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
#7 0x5612bba0a4d5 in show_log log-tree.c:781
#8 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
#9 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
#10 0x5612bb69235b in cmd_log_walk builtin/log.c:549
#11 0x5612bb6951a2 in cmd_log builtin/log.c:883
#12 0x5612bb56c993 in run_builtin git.c:466
#13 0x5612bb56d397 in handle_builtin git.c:721
#14 0x5612bb56db07 in run_argv git.c:788
#15 0x5612bb56e8a7 in cmd_main git.c:923
#16 0x5612bb803682 in main common-main.c:57
#17 0x7f95c4c3c28f (/usr/lib/libc.so.6+0x2328f)
SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
Shadow bytes around the buggy address:
0x0c067fff81d0: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
0x0c067fff81e0: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
0x0c067fff81f0: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
0x0c067fff8200: fd fd fd fa fa fa fd fd fd fd fa fa 00 00 00 fa
0x0c067fff8210: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
=>0x0c067fff8220: fd fa fa fa fd fd fd fa fa fa 00 00 00[fa]fa fa
0x0c067fff8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff8240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff8250: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff8260: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff8270: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==26009==ABORTING
Now the proper fix for this would be to convert both functions to return
an `size_t` instead of an `int`. But given that this commit may be part
of a security release, let's instead do the minimal viable fix and die
in case we see an overflow.
Add a test that would have previously caused us to crash.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `utf8_strnwidth()` function calls `utf8_width()` in a loop and adds
its returned width to the end result. `utf8_width()` can return `-1`
though in case it reads a control character, which means that the
computed string width is going to be wrong. In the worst case where
there are more control characters than non-control characters, we may
even return a negative string width.
Fix this bug by treating control characters as having zero width.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `utf8_strnwidth()` function accepts an optional string length as
input parameter. This parameter can either be set to `-1`, in which case
we call `strlen()` on the input. Or it can be set to a positive integer
that indicates a precomputed length, which callers typically compute by
calling `strlen()` at some point themselves.
The input parameter is an `int` though, whereas `strlen()` returns a
`size_t`. This can lead to implementation-defined behaviour though when
the `size_t` cannot be represented by the `int`. In the general case
though this leads to wrap-around and thus to negative string sizes,
which is sure enough to not lead to well-defined behaviour.
Fix this by accepting a `size_t` instead of an `int` as string length.
While this takes away the ability of callers to simply pass in `-1` as
string length, it really is trivial enough to convert them to instead
pass in `strlen()` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `%w(width,indent1,indent2)` formatting directive can be used to
rewrap text to a specific width and is designed after git-shortlog(1)'s
`-w` parameter. While the three parameters are all stored as `size_t`
internally, `strbuf_add_wrapped_text()` accepts integers as input. As a
result, the casted integers may overflow. As these now-negative integers
are later on passed to `strbuf_addchars()`, we will ultimately run into
implementation-defined behaviour due to casting a negative number back
to `size_t` again. On my platform, this results in trying to allocate
9000 petabyte of memory.
Fix this overflow by using `cast_size_t_to_int()` so that we reject
inputs that cannot be represented as an integer.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a formatting directive has a `+` or ` ` after the `%`, then we add
either a line feed or space if the placeholder expands to a non-empty
string. In specific cases though this logic doesn't work as expected,
and we try to add the character even in the case where the formatting
directive is empty.
One such pattern is `%w(1)%+d%+w(2)`. `%+d` expands to reference names
pointing to a certain commit, like in `git log --decorate`. For a tagged
commit this would for example expand to `\n (tag: v1.0.0)`, which has a
leading newline due to the `+` modifier and a space added by `%d`. Now
the second wrapping directive will cause us to rewrap the text to
`\n(tag:\nv1.0.0)`, which is one byte shorter due to the missing leading
space. The code that handles the `+` magic now notices that the length
has changed and will thus try to insert a leading line feed at the
original posititon. But as the string was shortened, the original
position is past the buffer's boundary and thus we die with an error.
Now there are two issues here:
1. We check whether the buffer length has changed, not whether it
has been extended. This causes us to try and add the character
past the string boundary.
2. The current logic does not make any sense whatsoever. When the
string got expanded due to the rewrap, putting the separator into
the original position is likely to put it somewhere into the
middle of the rewrapped contents.
It is debatable whether `%+w()` makes any sense in the first place.
Strictly speaking, the placeholder never expands to a non-empty string,
and consequentially we shouldn't ever accept this combination. We thus
fix the bug by simply refusing `%+w()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An out-of-bounds read can be triggered when parsing an incomplete
padding format string passed via `--pretty=format` or in Git archives
when files are marked with the `export-subst` gitattribute.
This bug exists since we have introduced support for truncating output
via the `trunc` keyword a7f01c6b4d (pretty: support truncating in %>, %<
and %><, 2013-04-19). Before this commit, we used to find the end of the
formatting string by using strchr(3P). This function returns a `NULL`
pointer in case the character in question wasn't found. The subsequent
check whether any character was found thus simply checked the returned
pointer. After the commit we switched to strcspn(3P) though, which only
returns the offset to the first found character or to the trailing NUL
byte. As the end pointer is now computed by adding the offset to the
start pointer it won't be `NULL` anymore, and as a consequence the check
doesn't do anything anymore.
The out-of-bounds data that is being read can in fact end up in the
formatted string. As a consequence, it is possible to leak memory
contents either by calling git-log(1) or via git-archive(1) when any of
the archived files is marked with the `export-subst` gitattribute.
==10888==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000398 at pc 0x7f0356047cb2 bp 0x7fff3ffb95d0 sp 0x7fff3ffb8d78
READ of size 1 at 0x602000000398 thread T0
#0 0x7f0356047cb1 in __interceptor_strchrnul /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:725
#1 0x563b7cec9a43 in strbuf_expand strbuf.c:417
#2 0x563b7cda7060 in repo_format_commit_message pretty.c:1869
#3 0x563b7cda8d0f in pretty_print_commit pretty.c:2161
#4 0x563b7cca04c8 in show_log log-tree.c:781
#5 0x563b7cca36ba in log_tree_commit log-tree.c:1117
#6 0x563b7c927ed5 in cmd_log_walk_no_free builtin/log.c:508
#7 0x563b7c92835b in cmd_log_walk builtin/log.c:549
#8 0x563b7c92b1a2 in cmd_log builtin/log.c:883
#9 0x563b7c802993 in run_builtin git.c:466
#10 0x563b7c803397 in handle_builtin git.c:721
#11 0x563b7c803b07 in run_argv git.c:788
#12 0x563b7c8048a7 in cmd_main git.c:923
#13 0x563b7ca99682 in main common-main.c:57
#14 0x7f0355e3c28f (/usr/lib/libc.so.6+0x2328f)
#15 0x7f0355e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#16 0x563b7c7fe0e4 in _start ../sysdeps/x86_64/start.S:115
0x602000000398 is located 0 bytes to the right of 8-byte region [0x602000000390,0x602000000398)
allocated by thread T0 here:
#0 0x7f0356072faa in __interceptor_strdup /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:439
#1 0x563b7cf7317c in xstrdup wrapper.c:39
#2 0x563b7cd9a06a in save_user_format pretty.c:40
#3 0x563b7cd9b3e5 in get_commit_format pretty.c:173
#4 0x563b7ce54ea0 in handle_revision_opt revision.c:2456
#5 0x563b7ce597c9 in setup_revisions revision.c:2850
#6 0x563b7c9269e0 in cmd_log_init_finish builtin/log.c:269
#7 0x563b7c927362 in cmd_log_init builtin/log.c:348
#8 0x563b7c92b193 in cmd_log builtin/log.c:882
#9 0x563b7c802993 in run_builtin git.c:466
#10 0x563b7c803397 in handle_builtin git.c:721
#11 0x563b7c803b07 in run_argv git.c:788
#12 0x563b7c8048a7 in cmd_main git.c:923
#13 0x563b7ca99682 in main common-main.c:57
#14 0x7f0355e3c28f (/usr/lib/libc.so.6+0x2328f)
#15 0x7f0355e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#16 0x563b7c7fe0e4 in _start ../sysdeps/x86_64/start.S:115
SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:725 in __interceptor_strchrnul
Shadow bytes around the buggy address:
0x0c047fff8020: fa fa fd fd fa fa 00 06 fa fa 05 fa fa fa fd fd
0x0c047fff8030: fa fa 00 02 fa fa 06 fa fa fa 05 fa fa fa fd fd
0x0c047fff8040: fa fa 00 07 fa fa 03 fa fa fa fd fd fa fa 00 00
0x0c047fff8050: fa fa 00 01 fa fa fd fd fa fa 00 00 fa fa 00 01
0x0c047fff8060: fa fa 00 06 fa fa 00 06 fa fa 05 fa fa fa 05 fa
=>0x0c047fff8070: fa fa 00[fa]fa fa fd fa fa fa fd fd fa fa fd fd
0x0c047fff8080: fa fa fd fd fa fa 00 00 fa fa 00 fa fa fa fd fa
0x0c047fff8090: fa fa fd fd fa fa 00 00 fa fa fa fa fa fa fa fa
0x0c047fff80a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff80b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff80c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==10888==ABORTING
Fix this bug by checking whether `end` points at the trailing NUL byte.
Add a test which catches this out-of-bounds read and which demonstrates
that we used to write out-of-bounds data into the formatted message.
Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Original-patch-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the `%>>(<N>)` pretty formatter, you can ask git-log(1) et al to
steal spaces. To do so we need to look ahead of the next token to see
whether there are spaces there. This loop takes into account ANSI
sequences that end with an `m`, and if it finds any it will skip them
until it finds the first space. While doing so it does not take into
account the buffer's limits though and easily does an out-of-bounds
read.
Add a test that hits this behaviour. While we don't have an easy way to
verify this, the test causes the following failure when run with
`SANITIZE=address`:
==37941==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000000baf at pc 0x55ba6f88e0d0 bp 0x7ffc84c50d20 sp 0x7ffc84c50d10
READ of size 1 at 0x603000000baf thread T0
#0 0x55ba6f88e0cf in format_and_pad_commit pretty.c:1712
#1 0x55ba6f88e7b4 in format_commit_item pretty.c:1801
#2 0x55ba6f9b1ae4 in strbuf_expand strbuf.c:429
#3 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
#4 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
#5 0x55ba6f7884c8 in show_log log-tree.c:781
#6 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
#7 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
#8 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
#9 0x55ba6f4131a2 in cmd_log builtin/log.c:883
#10 0x55ba6f2ea993 in run_builtin git.c:466
#11 0x55ba6f2eb397 in handle_builtin git.c:721
#12 0x55ba6f2ebb07 in run_argv git.c:788
#13 0x55ba6f2ec8a7 in cmd_main git.c:923
#14 0x55ba6f581682 in main common-main.c:57
#15 0x7f2d08c3c28f (/usr/lib/libc.so.6+0x2328f)
#16 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#17 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115
0x603000000baf is located 1 bytes to the left of 24-byte region [0x603000000bb0,0x603000000bc8)
allocated by thread T0 here:
#0 0x7f2d08ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
#1 0x55ba6fa5b494 in xrealloc wrapper.c:136
#2 0x55ba6f9aefdc in strbuf_grow strbuf.c:99
#3 0x55ba6f9b0a06 in strbuf_add strbuf.c:298
#4 0x55ba6f9b1a25 in strbuf_expand strbuf.c:418
#5 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
#6 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
#7 0x55ba6f7884c8 in show_log log-tree.c:781
#8 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
#9 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
#10 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
#11 0x55ba6f4131a2 in cmd_log builtin/log.c:883
#12 0x55ba6f2ea993 in run_builtin git.c:466
#13 0x55ba6f2eb397 in handle_builtin git.c:721
#14 0x55ba6f2ebb07 in run_argv git.c:788
#15 0x55ba6f2ec8a7 in cmd_main git.c:923
#16 0x55ba6f581682 in main common-main.c:57
#17 0x7f2d08c3c28f (/usr/lib/libc.so.6+0x2328f)
#18 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#19 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115
SUMMARY: AddressSanitizer: heap-buffer-overflow pretty.c:1712 in format_and_pad_commit
Shadow bytes around the buggy address:
0x0c067fff8120: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
0x0c067fff8130: fd fd fa fa fd fd fd fd fa fa fd fd fd fa fa fa
0x0c067fff8140: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
0x0c067fff8150: fa fa fd fd fd fd fa fa 00 00 00 fa fa fa fd fd
0x0c067fff8160: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
=>0x0c067fff8170: fd fd fd fa fa[fa]00 00 00 fa fa fa 00 00 00 fa
0x0c067fff8180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff8190: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff81a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff81b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff81c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Luckily enough, this would only cause us to copy the out-of-bounds data
into the formatted commit in case we really had an ANSI sequence
preceding our buffer. So this bug likely has no security consequences.
Fix it regardless by not traversing past the buffer's start.
Reported-by: Patrick Steinhardt <ps@pks.im>
Reported-by: Eric Sesterhenn <eric.sesterhenn@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using a padding specifier in the pretty format passed to git-log(1)
we need to calculate the string length in several places. These string
lengths are stored in `int`s though, which means that these can easily
overflow when the input lengths exceeds 2GB. This can ultimately lead to
an out-of-bounds write when these are used in a call to memcpy(3P):
==8340==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1ec62f97fe at pc 0x7f2127e5f427 bp 0x7ffd3bd63de0 sp 0x7ffd3bd63588
WRITE of size 1 at 0x7f1ec62f97fe thread T0
#0 0x7f2127e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
#1 0x5628e96aa605 in format_and_pad_commit pretty.c:1762
#2 0x5628e96aa7f4 in format_commit_item pretty.c:1801
#3 0x5628e97cdb24 in strbuf_expand strbuf.c:429
#4 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
#5 0x5628e96acd0f in pretty_print_commit pretty.c:2161
#6 0x5628e95a44c8 in show_log log-tree.c:781
#7 0x5628e95a76ba in log_tree_commit log-tree.c:1117
#8 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
#9 0x5628e922c35b in cmd_log_walk builtin/log.c:549
#10 0x5628e922f1a2 in cmd_log builtin/log.c:883
#11 0x5628e9106993 in run_builtin git.c:466
#12 0x5628e9107397 in handle_builtin git.c:721
#13 0x5628e9107b07 in run_argv git.c:788
#14 0x5628e91088a7 in cmd_main git.c:923
#15 0x5628e939d682 in main common-main.c:57
#16 0x7f2127c3c28f (/usr/lib/libc.so.6+0x2328f)
#17 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#18 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115
0x7f1ec62f97fe is located 2 bytes to the left of 4831838265-byte region [0x7f1ec62f9800,0x7f1fe62f9839)
allocated by thread T0 here:
#0 0x7f2127ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
#1 0x5628e98774d4 in xrealloc wrapper.c:136
#2 0x5628e97cb01c in strbuf_grow strbuf.c:99
#3 0x5628e97ccd42 in strbuf_addchars strbuf.c:327
#4 0x5628e96aa55c in format_and_pad_commit pretty.c:1761
#5 0x5628e96aa7f4 in format_commit_item pretty.c:1801
#6 0x5628e97cdb24 in strbuf_expand strbuf.c:429
#7 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
#8 0x5628e96acd0f in pretty_print_commit pretty.c:2161
#9 0x5628e95a44c8 in show_log log-tree.c:781
#10 0x5628e95a76ba in log_tree_commit log-tree.c:1117
#11 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
#12 0x5628e922c35b in cmd_log_walk builtin/log.c:549
#13 0x5628e922f1a2 in cmd_log builtin/log.c:883
#14 0x5628e9106993 in run_builtin git.c:466
#15 0x5628e9107397 in handle_builtin git.c:721
#16 0x5628e9107b07 in run_argv git.c:788
#17 0x5628e91088a7 in cmd_main git.c:923
#18 0x5628e939d682 in main common-main.c:57
#19 0x7f2127c3c28f (/usr/lib/libc.so.6+0x2328f)
#20 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#21 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115
SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
Shadow bytes around the buggy address:
0x0fe458c572a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0fe458c572b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0fe458c572c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0fe458c572d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0fe458c572e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0fe458c572f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]
0x0fe458c57300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0fe458c57310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0fe458c57320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0fe458c57330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0fe458c57340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==8340==ABORTING
The pretty format can also be used in `git archive` operations via the
`export-subst` attribute. So this is what in our opinion makes this a
critical issue in the context of Git forges which allow to download an
archive of user supplied Git repositories.
Fix this vulnerability by using `size_t` instead of `int` to track the
string lengths. Add tests which detect this vulnerability when Git is
compiled with the address sanitizer.
Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Original-patch-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Modified-by: Taylor Blau <me@ttalorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow tests that assume a 64-bit `size_t` to be skipped in 32-bit
platforms and regardless of the size of `long`.
This imitates the `LONG_IS_64BIT` prerequisite.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t1509-root-work-tree.sh, which tests behavior of a Git repository
located at the root `/` directory, refuses to run if it detects the
presence of an existing repository at `/`. This safeguard ensures that
it won't clobber a legitimate repository at that location. However,
because t1509 does a poor job of cleaning up after itself, it runs afoul
of its own safety check on subsequent runs, which makes it painful to
run the script repeatedly since each run requires manual cleanup of
detritus from the previous run.
Address this shortcoming by making t1509 clean up after itself as its
last action. This is safe since the script can only make it to this
cleanup action if it did not find a legitimate repository at `/` in the
first place, so the resources cleaned up here can only have been created
by the script itself.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
One of the t1509 setup tests is very particular about the output it
expects from `git init`, and fails if the output differs even slightly
which can happen easily if the script is run multiple times since it
doesn't do a good job of cleaning up after itself (i.e. it leaves
detritus in the root directory `/`). One bit of cruft in particular
(`/HEAD`) makes the test fail since its presence causes `git init` to
alter its output; rather than reporting "Initialized empty Git
repository", it instead reports "Reinitialized existing Git repository"
when `/HEAD` is present. Address this problem by making the test do a
more careful job of crafting its intended initial state.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
When 8959555cee (setup_git_directory(): add an owner check for the
top-level directory, 2022-03-02) tightened security surrounding
directory ownership, it neglected to adjust t1509-root-work-tree.sh to
take the new restriction into account. As a result, since the root
directory `/` is typically not owned by the user running the test
(indeed, t1509 refuses to run as `root`), the ownership check added
by 8959555cee kicks in and causes the test to fail:
fatal: detected dubious ownership in repository at '/'
To add an exception for this directory, call:
git config --global --add safe.directory /
This problem went unnoticed for so long because t1509 is rarely run
since it requires setting up a `chroot` environment or a sacrificial
virtual machine in which `/` can be made writable and polluted by any
user.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
grep(1) converts CRLF line endings to LF on current MinGW:
$ uname -sr
MINGW64_NT-10.0-22621 3.3.6-341.x86_64
$ printf 'a\r\n' | hexdump.exe -C
00000000 61 0d 0a |a..|
00000003
$ printf 'a\r\n' | grep . | hexdump.exe -C
00000000 61 0a |a.|
00000002
Create the intended test file by grepping the original file with LF
line endings and adding CRs explicitly.
The missing CRs went unnoticed because test_cmp on MinGW ignores line
endings since 4d715ac05c (Windows: a test_cmp that is agnostic to random
LF <> CRLF conversions, 2013-10-26). Fix this test anyway to avoid
depending on that special test_cmp behavior, especially since this is
the only test that needs it.
Piping the output of grep(1) through append_cr has the side-effect of
ignoring its return value. That means we no longer need the explicit
"|| true" to support commit messages without a body.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In Git for Windows, when passing paths from shell scripts to regular
Win32 executables, thanks to the MSYS2 runtime a somewhat magic path
conversion happens that lets the shell script think that there is a file
at `/git/Makefile` and the Win32 process it spawned thinks that the
shell script said `C:/git-sdk-64/git/Makefile` instead.
This conversion is documented in detail over here:
https://www.msys2.org/docs/filesystem-paths/#automatic-unix-windows-path-conversion
As all automatic conversions, there are gaps. For example, to avoid
mistaking command-line options like `/LOG=log.txt` (which are quite
common in the Windows world) from being mistaken for a Unix-style
absolute path, the MSYS2 runtime specifically exempts arguments
containing a `=` character from that conversion.
We are about to change `test_cmp` to use `git diff --no-index`, which
involves spawning precisely such a Win32 process.
In combination, this would cause a failure in `t0021-conversion.sh`
where we pass an absolute path containing an equal character to the
`test_cmp` function.
Seeing as the Unix tools like `cp` and `diff` that are used by Git's
test suite in the Git for Windows SDK (thanks to the MSYS2 project)
understand both Unix-style as well as Windows-style paths, we can stave
off this problem by simply switching to Windows-style paths and
side-stepping the need for any automatic path conversion.
Note: The `PATH` variable is obviously special, as it is colon-separated
in the MSYS2 Bash used by Git for Windows, and therefore _cannot_
contain absolute Windows-style paths, lest the colon after the drive
letter is mistaken for a path separator. Therefore, we need to be
careful to keep the Unix-style when modifying the `PATH` variable.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar as with the preceding commit, start ignoring gitattributes files
that are overly large to protect us against out-of-bounds reads and
writes caused by integer overflows. Unfortunately, we cannot just define
"overly large" in terms of any preexisting limits in the codebase.
Instead, we choose a very conservative limit of 100MB. This is plenty of
room for specifying gitattributes, and incidentally it is also the limit
for blob sizes for GitHub. While we don't want GitHub to dictate limits
here, it is still sensible to use this fact for an informed decision
given that it is hosting a huge set of repositories. Furthermore, over
at GitLab we scanned a subset of repositories for their root-level
attribute files. We found that 80% of them have a gitattributes file
smaller than 100kB, 99.99% have one smaller than 1MB, and only a single
repository had one that was almost 3MB in size. So enforcing a limit of
100MB seems to give us ample of headroom.
With this limit in place we can be reasonably sure that there is no easy
way to exploit the gitattributes file via integer overflows anymore.
Furthermore, it protects us against resource exhaustion caused by
allocating the in-memory data structures required to represent the
parsed attributes.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two different code paths to read gitattributes: once via a
file, and once via the index. These two paths used to behave differently
because when reading attributes from a file, we used fgets(3P) with a
buffer size of 2kB. Consequentially, we silently truncate line lengths
when lines are longer than that and will then parse the remainder of the
line as a new pattern. It goes without saying that this is entirely
unexpected, but it's even worse that the behaviour depends on how the
gitattributes are parsed.
While this is simply wrong, the silent truncation saves us with the
recently discovered vulnerabilities that can cause out-of-bound writes
or reads with unreasonably long lines due to integer overflows. As the
common path is to read gitattributes via the worktree file instead of
via the index, we can assume that any gitattributes file that had lines
longer than that is already broken anyway. So instead of lifting the
limit here, we can double down on it to fix the vulnerabilities.
Introduce an explicit line length limit of 2kB that is shared across all
paths that read attributes and ignore any line that hits this limit
while printing a warning.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading attributes from a file we use fgets(3P) with a buffer size
of 2048 bytes. This means that as soon as a line exceeds the buffer size
we split it up into multiple parts and parse each of them as a separate
pattern line. This is of course not what the user intended, and even
worse the behaviour is inconsistent with how we read attributes from the
index.
Fix this bug by converting the code to use `strbuf_getline()` instead.
This will indeed read in the whole line, which may theoretically lead to
an out-of-memory situation when the gitattributes file is huge. We're
about to reject any gitattributes files larger than 100MB in the next
commit though, which makes this less of a concern.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing an attributes line, we need to allocate an array that holds
all attributes specified for the given file pattern. The calculation to
determine the number of bytes that need to be allocated was prone to an
overflow though when there was an unreasonable amount of attributes.
Harden the allocation by instead using the `st_` helper functions that
cause us to die when we hit an integer overflow.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Attributes have a field that tracks the position in the `all_attrs`
array they're stored inside. This field gets set via `hashmap_get_size`
when adding the attribute to the global map of attributes. But while the
field is of type `int`, the value returned by `hashmap_get_size` is an
`unsigned int`. It can thus happen that the value overflows, where we
would now dereference teh `all_attrs` array at an out-of-bounds value.
We do have a sanity check for this overflow via an assert that verifies
the index matches the new hashmap's size. But asserts are not a proper
mechanism to detect against any such overflows as they may not in fact
be compiled into production code.
Fix this by using an `unsigned int` to track the index and convert the
assert to a call `die()`.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `struct attr_stack` tracks the stack of all patterns together with
their attributes. When parsing a gitattributes file that has more than
2^31 such patterns though we may trigger multiple out-of-bounds reads on
64 bit platforms. This is because while the `num_matches` variable is an
unsigned integer, we always use a signed integer to iterate over them.
I have not been able to reproduce this issue due to memory constraints
on my systems. But despite the out-of-bounds reads, the worst thing that
can seemingly happen is to call free(3P) with a garbage pointer when
calling `attr_stack_free()`.
Fix this bug by using unsigned integers to iterate over the array. While
this makes the iteration somewhat awkward when iterating in reverse, it
is at least better than knowingly running into an out-of-bounds read.
While at it, convert the call to `ALLOC_GROW` to use `ALLOC_GROW_BY`
instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is possible to trigger an integer overflow when parsing attribute
names when there are more than 2^31 of them for a single pattern. This
can either lead to us dying due to trying to request too many bytes:
blob=$(perl -e 'print "f" . " a=" x 2147483649' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,$blob,.gitattributes
git attr-check --all file
=================================================================
==1022==ERROR: AddressSanitizer: requested allocation size 0xfffffff800000032 (0xfffffff800001038 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x10000000000 (thread T0)
#0 0x7fd3efabf411 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
#1 0x5563a0a1e3d3 in xcalloc wrapper.c:150
#2 0x5563a058d005 in parse_attr_line attr.c:384
#3 0x5563a058e661 in handle_attr_line attr.c:660
#4 0x5563a058eddb in read_attr_from_index attr.c:769
#5 0x5563a058ef12 in read_attr attr.c:797
#6 0x5563a058f24c in bootstrap_attr_stack attr.c:867
#7 0x5563a058f4a3 in prepare_attr_stack attr.c:902
#8 0x5563a05905da in collect_some_attrs attr.c:1097
#9 0x5563a059093d in git_all_attrs attr.c:1128
#10 0x5563a02f636e in check_attr builtin/check-attr.c:67
#11 0x5563a02f6c12 in cmd_check_attr builtin/check-attr.c:183
#12 0x5563a02aa993 in run_builtin git.c:466
#13 0x5563a02ab397 in handle_builtin git.c:721
#14 0x5563a02abb2b in run_argv git.c:788
#15 0x5563a02ac991 in cmd_main git.c:926
#16 0x5563a05432bd in main common-main.c:57
#17 0x7fd3ef82228f (/usr/lib/libc.so.6+0x2328f)
==1022==HINT: if you don't care about these errors you may set allocator_may_return_null=1
SUMMARY: AddressSanitizer: allocation-size-too-big /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77 in __interceptor_calloc
==1022==ABORTING
Or, much worse, it can lead to an out-of-bounds write because we
underallocate and then memcpy(3P) into an array:
perl -e '
print "A " . "\rh="x2000000000;
print "\rh="x2000000000;
print "\rh="x294967294 . "\n"
' >.gitattributes
git add .gitattributes
git commit -am "evil attributes"
$ git clone --quiet /path/to/repo
=================================================================
==15062==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000002550 at pc 0x5555559884d5 bp 0x7fffffffbc60 sp 0x7fffffffbc58
WRITE of size 8 at 0x602000002550 thread T0
#0 0x5555559884d4 in parse_attr_line attr.c:393
#1 0x5555559884d4 in handle_attr_line attr.c:660
#2 0x555555988902 in read_attr_from_index attr.c:784
#3 0x555555988902 in read_attr_from_index attr.c:747
#4 0x555555988a1d in read_attr attr.c:800
#5 0x555555989b0c in bootstrap_attr_stack attr.c:882
#6 0x555555989b0c in prepare_attr_stack attr.c:917
#7 0x555555989b0c in collect_some_attrs attr.c:1112
#8 0x55555598b141 in git_check_attr attr.c:1126
#9 0x555555a13004 in convert_attrs convert.c:1311
#10 0x555555a95e04 in checkout_entry_ca entry.c:553
#11 0x555555d58bf6 in checkout_entry entry.h:42
#12 0x555555d58bf6 in check_updates unpack-trees.c:480
#13 0x555555d5eb55 in unpack_trees unpack-trees.c:2040
#14 0x555555785ab7 in checkout builtin/clone.c:724
#15 0x555555785ab7 in cmd_clone builtin/clone.c:1384
#16 0x55555572443c in run_builtin git.c:466
#17 0x55555572443c in handle_builtin git.c:721
#18 0x555555727872 in run_argv git.c:788
#19 0x555555727872 in cmd_main git.c:926
#20 0x555555721fa0 in main common-main.c:57
#21 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308
#22 0x555555723f39 in _start (git+0x1cff39)
0x602000002552 is located 0 bytes to the right of 2-byte region [0x602000002550,0x602000002552) allocated by thread T0 here:
#0 0x7ffff768c037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x555555d7fff7 in xcalloc wrapper.c:150
#2 0x55555598815f in parse_attr_line attr.c:384
#3 0x55555598815f in handle_attr_line attr.c:660
#4 0x555555988902 in read_attr_from_index attr.c:784
#5 0x555555988902 in read_attr_from_index attr.c:747
#6 0x555555988a1d in read_attr attr.c:800
#7 0x555555989b0c in bootstrap_attr_stack attr.c:882
#8 0x555555989b0c in prepare_attr_stack attr.c:917
#9 0x555555989b0c in collect_some_attrs attr.c:1112
#10 0x55555598b141 in git_check_attr attr.c:1126
#11 0x555555a13004 in convert_attrs convert.c:1311
#12 0x555555a95e04 in checkout_entry_ca entry.c:553
#13 0x555555d58bf6 in checkout_entry entry.h:42
#14 0x555555d58bf6 in check_updates unpack-trees.c:480
#15 0x555555d5eb55 in unpack_trees unpack-trees.c:2040
#16 0x555555785ab7 in checkout builtin/clone.c:724
#17 0x555555785ab7 in cmd_clone builtin/clone.c:1384
#18 0x55555572443c in run_builtin git.c:466
#19 0x55555572443c in handle_builtin git.c:721
#20 0x555555727872 in run_argv git.c:788
#21 0x555555727872 in cmd_main git.c:926
#22 0x555555721fa0 in main common-main.c:57
#23 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: heap-buffer-overflow attr.c:393 in parse_attr_line
Shadow bytes around the buggy address:
0x0c047fff8450: fa fa 00 02 fa fa 00 07 fa fa fd fd fa fa 00 00
0x0c047fff8460: fa fa 02 fa fa fa fd fd fa fa 00 06 fa fa 05 fa
0x0c047fff8470: fa fa fd fd fa fa 00 02 fa fa 06 fa fa fa 05 fa
0x0c047fff8480: fa fa 07 fa fa fa fd fd fa fa 00 01 fa fa 00 02
0x0c047fff8490: fa fa 00 03 fa fa 00 fa fa fa 00 01 fa fa 00 03
=>0x0c047fff84a0: fa fa 00 01 fa fa 00 02 fa fa[02]fa fa fa fa fa
0x0c047fff84b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff84c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff84d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff84e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff84f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==15062==ABORTING
Fix this bug by using `size_t` instead to count the number of attributes
so that this value cannot reasonably overflow without running out of
memory before already.
Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is possible to trigger an integer overflow when parsing attribute
names that are longer than 2^31 bytes because we assign the result of
strlen(3P) to an `int` instead of to a `size_t`. This can lead to an
abort in vsnprintf(3P) with the following reproducer:
blob=$(perl -e 'print "A " . "B"x2147483648 . "\n"' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,$blob,.gitattributes
git check-attr --all path
BUG: strbuf.c:400: your vsnprintf is broken (returned -1)
But furthermore, assuming that the attribute name is even longer than
that, it can cause us to silently truncate the attribute and thus lead
to wrong results.
Fix this integer overflow by using a `size_t` instead. This fixes the
silent truncation of attribute names, but it only partially fixes the
BUG we hit: even though the initial BUG is fixed, we can still hit a BUG
when parsing invalid attribute lines via `report_invalid_attr()`.
This is due to an underlying design issue in vsnprintf(3P) which only
knows to return an `int`, and thus it may always overflow with large
inputs. This issue is benign though: the worst that can happen is that
the error message is misreported to be either truncated or too long, but
due to the buffer being NUL terminated we wouldn't ever do an
out-of-bounds read here.
Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is an out-of-bounds read possible when parsing gitattributes that
have an attribute that is 2^31+1 bytes long. This is caused due to an
integer overflow when we assign the result of strlen(3P) to an `int`,
where we use the wrapped-around value in a subsequent call to
memcpy(3P). The following code reproduces the issue:
blob=$(perl -e 'print "a" x 2147483649 . " attr"' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,$blob,.gitattributes
git check-attr --all file
AddressSanitizer:DEADLYSIGNAL
=================================================================
==8451==ERROR: AddressSanitizer: SEGV on unknown address 0x7f93efa00800 (pc 0x7f94f1f8f082 bp 0x7ffddb59b3a0 sp 0x7ffddb59ab28 T0)
==8451==The signal is caused by a READ memory access.
#0 0x7f94f1f8f082 (/usr/lib/libc.so.6+0x176082)
#1 0x7f94f2047d9c in __interceptor_strspn /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:752
#2 0x560e190f7f26 in parse_attr_line attr.c:375
#3 0x560e190f9663 in handle_attr_line attr.c:660
#4 0x560e190f9ddd in read_attr_from_index attr.c:769
#5 0x560e190f9f14 in read_attr attr.c:797
#6 0x560e190fa24e in bootstrap_attr_stack attr.c:867
#7 0x560e190fa4a5 in prepare_attr_stack attr.c:902
#8 0x560e190fb5dc in collect_some_attrs attr.c:1097
#9 0x560e190fb93f in git_all_attrs attr.c:1128
#10 0x560e18e6136e in check_attr builtin/check-attr.c:67
#11 0x560e18e61c12 in cmd_check_attr builtin/check-attr.c:183
#12 0x560e18e15993 in run_builtin git.c:466
#13 0x560e18e16397 in handle_builtin git.c:721
#14 0x560e18e16b2b in run_argv git.c:788
#15 0x560e18e17991 in cmd_main git.c:926
#16 0x560e190ae2bd in main common-main.c:57
#17 0x7f94f1e3c28f (/usr/lib/libc.so.6+0x2328f)
#18 0x7f94f1e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#19 0x560e18e110e4 in _start ../sysdeps/x86_64/start.S:115
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/usr/lib/libc.so.6+0x176082)
==8451==ABORTING
Fix this bug by converting the variable to a `size_t` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `git_attr_internal()` is called to upsert attributes into
the global map. And while all callers pass a `size_t`, the function
itself accepts an `int` as the attribute name's length. This can lead to
an integer overflow in case the attribute name is longer than `INT_MAX`.
Now this overflow seems harmless as the first thing we do is to call
`attr_name_valid()`, and that function only succeeds in case all chars
in the range of `namelen` match a certain small set of chars. We thus
can't do an out-of-bounds read as NUL is not part of that set and all
strings passed to this function are NUL-terminated. And furthermore, we
wouldn't ever read past the current attribute name anyway due to the
same reason. And if validation fails we will return early.
On the other hand it feels fragile to rely on this behaviour, even more
so given that we pass `namelen` to `FLEX_ALLOC_MEM()`. So let's instead
just do the correct thing here and accept a `size_t` as line length.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we define a macro to point a system function (e.g., flockfile) to
our custom wrapper, we should make sure that the system did not already
define it as a macro. This is rarely a problem, but can cause
compilation failures if both of these are true:
- we decide to define our own wrapper even though the system provides
the function; we know this happens at least with uclibc, which may
declare flockfile, etc, without _POSIX_THREAD_SAFE_FUNCTIONS
- the system version is declared as a macro; we know this happens at
least with uclibc's version of getc_unlocked()
So just handling getc_unlocked() would be sufficient to deal with the
real-world case we've seen. But since it's easy to do, we may as well be
defensive about the other macro wrappers added in the previous patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for pthread_create and pthread_sigmask state that:
"On success, pthread_create() returns 0;
on error, it returns an error number"
As such, we ought to check for an error
by seeing if the output is not 0.
Checking for "less than" is a mistake
as the error code numbers can be greater than 0.
Signed-off-by: Seija <doremylover123@gmail.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is customary to write `A || true` to ignore a potential error exit of
command A. But when we have a sequence `A && B && C || true && D`, then
a failure of any of A, B, or C skips to D right away. This is not
intended here. Turn the command whose failure is to be ignored into a
compound command to ensure it is the only one that is allowed to fail.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a "git diff-tree" command to be &&-chained so that we won't
ignore its exit code, see the ea05fd5fbf (Merge branch
'ab/keep-git-exit-codes-in-tests', 2022-03-16) topic for prior art.
This fixes code added in b45563a229 (rename: Break filepairs with
different types., 2007-11-30). Due to hiding the exit code we hid a
memory leak under SANITIZE=leak.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the verify_mergeheads() helper the check the exit code of "git
rev-parse".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the test added in [1] to check the exit code of the "git"
invocations. An in-flight change[2] introduced a memory leak in these
invocations, which went undetected unless we were running under
"GIT_TEST_SANITIZE_LEAK_LOG=true".
Note that the in-flight change made 8 test files fail, but as far as I
can tell only this one would have had its exit code hidden unless
under "GIT_TEST_SANITIZE_LEAK_LOG=true". The rest would be caught
without it.
We could pick other variable names here than "ln%d", e.g. "commit",
"dummy_blob" and "file_blob", but having the "rev-parse" invocations
aligned makes the difference between them more readable, so let's pick
"ln%d".
1. 4cf2143e02 (pack-objects: break delta cycles before delta-search
phase, 2016-08-11)
2. https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/
3. faececa53f (test-lib: have the "check" mode for SANITIZE=leak
consider leak logs, 2022-07-28)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix racy tests in t7527 by forcing the use of cookie files during all
types of queries. There were originaly observed on M1 macs with file
system encryption enabled.
There were a series of simple tests, such as "edit some files" and
"create some files", that started the daemon with GIT_TRACE_FSMONITOR
enabled so that the daemon would emit "event: <path>" messages to the
trace log. The test would make worktree modifications and then grep
the log file to confirm it contained the expected trace messages.
The greps would occasionally racily-fail. The expected messages
were always present in the log file, just not yet always present
when the greps ran.
NEEDSWORK: One could argue that the tests should use the `test-tool
fsmonitor-client query` and search for the expected pathnames in the
output rather than grepping the trace log, but I'll leave that for a
later exercise.
The racy tests called `test-tool fsmonitor-client query --token 0`
before grepping the log file. (Presumably to introduce a small delay
and/or to let the daemon sync with the file system following the last
modification, but that was not always sufficient and hence the race.)
When the query arg is just "0", the daemon treated it as a V1
(aka timestamp-relative request) and responded with a "trivial
response" and a new token, but without trying to catch up to the
the file system event stream. So the "event: <path>" messages
may or may not yet be in the log file when the grep commands
started.
FWIW, if the tests had sent `--token builtin:0:0` instead, it would
have forced a slightly different code path in the daemon that would
cause the daemon to use a cookie file and let it catch up with the
file system event stream. I did not see any test failures with this
change.
Instead of modifying the test, I updated the fsmonitor--daemon to
always use a cookie file and catch up to the file system on any
query operation, regardless of the format of the request token.
This is safer.
FWIW, I think the effect of the race was limited to the test.
Commands like `git status` would always do a full scan when getting a
trivial response. The fact that the daemon was slighly behind the
file system when it generated the response token would cause a second
`git status` to get a few extra paths that the client would have to
examine, but it would not be missing paths.
FWIW, I also think that an earlier version of the code always did
the cookie file for all types of queries, but it was optimized out
during a round of reviews or rework and we didn't notice the race.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
filter_sparse_oid__init() uses add_patterns_from_blob_to_list() to
populate the struct pattern_list member of struct filter_sparse_data.
Release it in the complementing filter_sparse_free().
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
repo_diff_setup() builds the struct option array with git diff's command
line options and stores a pointer to it in the parseopts member of
struct diff_options. The array is freed by diff_setup_done(), but not
by release_revisions(). Thus calling only repo_diff_setup() and
release_revisions() leaks that array.
We could free it in release_revisions() as well to plug that leak, but
there is a better way: Only build it when needed. Absorb
prep_parse_options() into the last place that uses the parseopts member
of struct diff_options, add_diff_parseopts(), and get rid of said
member.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare the removal of the parseopts member of struct diff_options by
using the API function add_diff_options() instead of accessing it
directly to get the command line option definitions. Building the copy
by concatenating with an empty option array is slightly awkward, but
simpler than a non-concat version of add_diff_options() would be to use
in places that need concatenation.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function for appending the parseopts member of struct diff_options
to a struct option array. Use it in two sites instead of accessing the
parseopts member directly. Decoupling callers from diff internals like
that allows us to change the latter.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only abort the individual check instead of exiting the whole test script
if git show fails. Noticed with GIT_TEST_PASSING_SANITIZE_LEAK=check.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git status` can be slow when there are a large number of
untracked files and directories since Git must search the entire
worktree to enumerate them. When it is too slow, Git prints
advice with the elapsed search time and a suggestion to disable
the search using the `-uno` option. This suggestion also carries
a warning that might scare off some users.
However, these days, `-uno` isn't the only option. Git can reduce
the time taken to enumerate untracked files by caching results from
previous `git status` invocations, when the `core.untrackedCache`
and `core.fsmonitor` features are enabled.
Update the `git status` man page to explain these configuration
options, and update the advice to provide more detail about the
current configuration and to refer to the updated documentation.
Signed-off-by: Rudy Rigot <rudy.rigot@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our git-compat-util header defines a few noop wrappers for system
functions if they are not available. This was originally done with a
macro, but in 15b52a44e0 (compat-util: type-check parameters of no-op
replacement functions, 2020-08-06) we switched to inline functions,
because it gives us basic type-checking.
This can cause compilation failures when the system _does_ declare those
functions but we choose not to use them, since the compiler will
complain about the redeclaration. This was seen in the real world when
compiling against certain builds of uclibc, which may leave
_POSIX_THREAD_SAFE_FUNCTIONS unset, but still declare flockfile() and
funlockfile().
It can also be seen on any platform that has setitimer() if you choose
to compile without it (which plausibly could happen if the system
implementation is buggy). E.g., on Linux:
$ make NO_SETITIMER=IWouldPreferNotTo git.o
CC git.o
In file included from builtin.h:4,
from git.c:1:
git-compat-util.h:344:19: error: conflicting types for ‘setitimer’; have ‘int(int, const struct itimerval *, struct itimerval *)’
344 | static inline int setitimer(int which UNUSED,
| ^~~~~~~~~
In file included from git-compat-util.h:234:
/usr/include/x86_64-linux-gnu/sys/time.h:155:12: note: previous declaration of ‘setitimer’ with type ‘int(__itimer_which_t, const struct itimerval * restrict, struct itimerval * restrict)’
155 | extern int setitimer (__itimer_which_t __which,
| ^~~~~~~~~
make: *** [Makefile:2714: git.o] Error 1
Here I think the compiler is complaining about the lack of "restrict"
annotations in our version, but even if we matched it completely (and
there is no way to match all platforms anyway), it would still complain
about a static declaration following a non-static one. Using macros
doesn't have this problem, because the C preprocessor rewrites the name
in our code before we hit this level of compilation.
One way to fix this would just be to revert most of 15b52a44e0. What we
really cared about there was catching build problems with
precompose_argv(), which most platforms _don't_ build, and which is our
custom function. So we could just switch the system wrappers back to
macros; most people build the real versions anyway, and they don't
change. So the extra type-checking isn't likely to catch bugs.
But with a little work, we can have our cake and eat it, too. If we
define the type-checking wrappers with a unique name, and then redirect
the system names to them with macros, we still get our type checking,
but without redeclaring the system function names.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In test case "shared=all", the working directory is permanently changed
to the "sub" directory. This leads to a strange behavior that the
temporary repositories created by subsequent test cases are all in this
"sub" directory, such as "sub/new", "sub/child.git". If we bypass this
test case, all subsequent test cases will have different working
directory.
Besides, all subsequent test cases assuming they are in the "sub"
directory do not run any destructive operations in their parent
directory (".."), and will not make damage out side of $TRASH_DIRECTORY.
So it is a safe change for us to run the test case "shared=all" in
current repository instead of creating and changing to "sub".
For the next test case, the path ".git/info" is assumed to be missing,
but we no longer run the test case in the "sub" repository which is
initialized from an empty template. In order for the test case to run
properly, we can set "TEST_CREATE_REPO_NO_TEMPLATE=1" to initialize the
default repository without a template.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor several test cases to use "test_when_finished" for cleanup.
1. For first of these, we used to clean-up outside the test, but instead
let's use test_when_finished for that.
2. For the second, we used to leave "new" after we are done, but not use
it at all later. Now we do clean up.
3. For the rest, these child.git test repositories used to follow
"initialize what we are going to use to a known state before we use"
pattern, which is not wrong per-se, but now we use "clean up the
cruft we made after we are done" pattern, which may arguably be
better simply because the test that makes cruft should know what
cruft it created better than whatever comes later that may not know.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The template dir prepared in test case "forced modes" is not used as
expected because a wrong template dir is provided to "git init". This is
because the $CWD for "git-init" command is a sibling directory alongside
the template directory. Change it to the right template directory and
add a protection test using "test_path_is_file".
The wrong template directory was introduced by mistake in commit
e1df7fe43f (init: make --template path relative to $CWD, 2019-05-10).
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
OPT_PARSE_LIST_OBJECTS_FILTER_INIT() with a non-NULL second argument
passes a function pointer via an object pointer, which is undefined. It
may work fine on platforms that implement C99 extension J.5.7 (Function
pointer casts). Remove the unused macro and avoid the dependency on
that extension.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-objects uses OPT_PARSE_LIST_OBJECTS_FILTER_INIT() to initialize the
a rev_info struct lazily before populating its filter member using the
--filter option values. It tracks whether the initialization is needed
using the .have_revs member of the callback data.
There is a better way: Use a stand-alone list_objects_filter_options
struct and build a rev_info struct with its .filter member after option
parsing. This allows using the simpler OPT_PARSE_LIST_OBJECTS_FILTER()
and getting rid of the extra callback mechanism.
Even simpler would be using a struct rev_info as before 5cb28270a1
(pack-objects: lazily set up "struct rev_info", don't leak, 2022-03-28),
but that would expose a memory leak caused by repo_init_revisions()
followed by release_revisions() without a setup_revisions() call in
between.
Using list_objects_filter_options also allows pushing the rev_info
struct into get_object_list(), where it arguably belongs. Either way,
this is all left for later.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't
leak, 2022-03-28) --filter options given to git pack-objects overrule
earlier ones, letting only the leftmost win and leaking the memory
allocated for earlier ones. Fix that by only initializing the rev_info
struct once.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git pack-objects should accept multiple --filter options as documented
in Documentation/rev-list-options.txt, but currently the last one wins.
Show that using tests with multiple blob size limits
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fb2d0db502 (test-lib-functions: add parsing helpers for ls-files and
ls-tree, 2022-04-04) not only started to use helper functions, it also
started to pipe the output of git ls-files into them directly, without
using a temporary file. No explanation was given. This causes the
return code of that git command to be ignored.
Revert that part of the change, use temporary files and check the return
code of git ls-files again.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When GIT_COMPLETION_IGNORE_CASE is set, also allow lowercase completion
text like "head" to match uppercase HEAD and other pseudorefs.
Signed-off-by: Alison Winters <alisonatwork@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If GIT_COMPLETION_IGNORE_CASE is set, --ignore-case will be added to
git for-each-ref calls so that refs can be matched case insensitively,
even when running on case sensitive filesystems.
Signed-off-by: Alison Winters <alisonatwork@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we find a midx bitmap, we do not bother checking for pack
bitmaps, since we can use only one. But since we will warn of unused
bitmaps via trace2, let's continue looking for pack bitmaps when
tracing is enabled.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After opening a bitmap successfully, we try opening others only
because we want to report that other bitmap files are ignored in
the trace2 log. When trace2 is not enabled, we do not have to
do any of that.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It works with GIT_EDITOR="emacs", "emacsclient" or "emacsclient -t"
Signed-off-by: Yoichi Nakayama <yoichi.nakayama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check if the "mode" argument supplied by the user is valid by seeing
if we have a mode_$mode function defined. But we don't do that until
after creating the tempfile. This is wasteful (we create a tempfile but
never use it), and makes it harder to add new options (the recent stdout
option exits before creating the tempfile, so it misses the check and
"git jump --stdout foo" will produce "git-jump: 92: mode_foo: not found"
rather than the regular usage message).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
send-email relays unrecognized arguments to its format-patch call.
Passing '-v N' leads to an error because -v is consumed as
send-email's --validate. For example,
git send-email -v 3 @{u}
fails with
fatal: ambiguous argument '3': unknown revision or path not in the
working tree. [...]
To prevent this, add the short --reroll-count option to send-email's
main option list and explicitly provide it to the format-patch call.
There other format-patch options that send-email doesn't relay
properly, including at least -n, -N, and the diff option -D. Punt on
these because dealing with them is more complicated:
* they would require configuring send-email to not ignore option case
* send-email makes three GetOptions() calls with different sets of
options, the last being the main set of options. Unlike -v, which
is consumed by the last GetOptions call, the -n, -N, and -D options
are consumed as abbreviations by the earlier calls.
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The handling to die early when there is no EDITOR is valuable when
used in normal code (i.e., editor.c). In git-var, where
null/empty-string is a perfectly valid value to return, it doesn't
make as much sense.
Remove this handling from `git var GIT_EDITOR` so that it does not
fail so noisily when there is no defined editor.
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, git-var could print usage() even if the command was invoked
correctly with a variable defined in git_vars -- provided that its
read() function returned NULL.
Now, we only print usage() only if it was called with a logical
variable that wasn't defined -- regardless of read().
Since we now know the variable is valid when we call read_var(), we
can avoid printing usage() here (and exiting with code 129) and
instead exit quietly with code 1. While exiting with a different code
can be a breaking change, it's far better than changing the exit
status more generally from 'failure' to 'success'.
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a lot of uses of UNLEAK() it would be quite tricky to release the
memory involved, or we're missing the relevant *_(release|clear)()
functions. But in these cases we have them already, and can just
invoke them on the variable(s) involved, instead of UNLEAK().
For "builtin/worktree.c" the UNLEAK() was also added in [1], but the
struct member it's unleaking was removed in [2]. The only non-"int"
member of that structure is "const char *keep_locked", which comes to
us via "argv" or a string literal[3].
We have good visibility via the compiler and
tooling (e.g. SANITIZE=address) on bad free()-ing, but none on
UNLEAK() we don't need anymore. So let's prefer releasing the memory
when it's easy.
For "bugreport", "worktree" and "config" we need to start using a "ret
= ..." return pattern. For "builtin/bugreport.c" these UNLEAK() were
added in [4], and for "builtin/config.c" in [1].
For "config" the code seen here was the only user of the "value"
variable. For "ACTION_{RENAME,REMOVE}_SECTION" we need to be sure to
return the right exit code in the cases where we were relying on
falling through to the top-level.
I think there's still a use-case for UNLEAK(), but hat it's changed
since then. Using it so that "we can see the real leaks" is
counter-productive in these cases.
It's more useful to have UNLEAK() be a marker of the remaining odd
cases where it's hard to free() the memory for whatever reason. With
this change less than 20 of them remain in-tree.
1. 0e5bba53af (add UNLEAK annotation for reducing leak false
positives, 2017-09-08)
2. d861d34a6e (worktree: remove extra members from struct add_opts,
2018-04-24)
3. 0db4961c49 (worktree: teach `add` to accept --reason <string> with
--lock, 2021-07-15)
4. 0e5bba53af and 00d8c31105 (commit: fix "author_ident" leak,
2022-05-12).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Free memory from parse_options_concat(), which comes from code
originally added (then extended) in [1].
At this point we could get several more tests leak-free by free()-ing
the xstrdup() just above the line being changed, but that one's
trickier than it seems. The sequencer_remove_state() function
supposedly owns it, but sometimes we don't call it. I have a fix for
it, but it's non-trivial, so let's fix the easy one first.
1. c62f6ec341 (revert: add --ff option to allow fast forward when
cherry-picking, 2010-03-06)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Call the release_revisions() function added in
1878b5edc0 (revision.[ch]: provide and start using a
release_revisions(), 2022-04-13) in cmd_cherry_pick(), as well as
freeing the xmalloc()'d "revs" member itself.
This is the same change as the one made for cmd_revert() a few lines
above it in fd74ac95ac (revert: free "struct replay_opts" members,
2022-07-01).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix a leak in the recent 6159e7add4 (rebase --abort: improve reflog
message, 2022-10-12). Before that commit we'd strbuf_release() the
reflog message we were formatting, but when that code was refactored
to use "ropts.head_msg" the strbuf_release() was omitted.
Ideally the three users of "ropts" in cmd_rebase() should use
different "ropts" variables, in practice they're completely separate,
as this and the other user in the "switch" statement will "goto
cleanup", which won't touch "ropts".
The third caller after the "switch" is then unreachable if we take
these two branches, so all of them are getting a "{ 0 }" init'd
"ropts".
So it's OK that we're leaving a stale pointer in "ropts.head_msg",
cleaning it up was our responsibility, and it won't be used again.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The "new_pack" we allocate in check_connected() wasn't being
free'd. Let's do that before we return from the function. This has
leaked ever since "new_pack" was added to this function in
c6807a40dc (clone: open a shortcut for connectivity check,
2013-05-26).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When "read_strategy_opts()" is called we may have populated the
"opts->strategy" before, so we'll need to free() it to avoid leaking
memory.
We populate it before because we cal get_replay_opts() from within
"rebase.c" with an already populated "opts", which we then copy. Then
if we're doing a "rebase -i" the sequencer API itself will promptly
clobber our alloc'd version of it with its own.
If this code is changed to do, instead of the added free() here a:
if (opts->strategy)
opts->strategy = xstrdup("another leak");
We get a couple of stacktraces from -fsanitize=leak showing how we
ended up clobbering the already allocated value, i.e.:
Direct leak of 6 byte(s) in 1 object(s) allocated from:
#0 0x7f2e8cd45545 in __interceptor_malloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:75
#1 0x7f2e8cb0fcaa in __GI___strdup string/strdup.c:42
#2 0x6c4778 in xstrdup wrapper.c:39
#3 0x66bcb8 in read_strategy_opts sequencer.c:2902
#4 0x66bf7b in read_populate_opts sequencer.c:2969
#5 0x6723f9 in sequencer_continue sequencer.c:5063
#6 0x4a4f74 in run_sequencer_rebase builtin/rebase.c:348
#7 0x4a64c8 in run_specific_rebase builtin/rebase.c:753
#8 0x4a9b8b in cmd_rebase builtin/rebase.c:1824
#9 0x407a32 in run_builtin git.c:466
#10 0x407e0a in handle_builtin git.c:721
#11 0x40803d in run_argv git.c:788
#12 0x40850f in cmd_main git.c:923
#13 0x4eee79 in main common-main.c:57
#14 0x7f2e8ca9f209 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#15 0x7f2e8ca9f2bb in __libc_start_main_impl ../csu/libc-start.c:389
#16 0x405fd0 in _start (git+0x405fd0)
Direct leak of 4 byte(s) in 1 object(s) allocated from:
#0 0x7f2e8cd45545 in __interceptor_malloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:75
#1 0x7f2e8cb0fcaa in __GI___strdup string/strdup.c:42
#2 0x6c4778 in xstrdup wrapper.c:39
#3 0x4a3c31 in xstrdup_or_null git-compat-util.h:1169
#4 0x4a447a in get_replay_opts builtin/rebase.c:163
#5 0x4a4f5b in run_sequencer_rebase builtin/rebase.c:346
#6 0x4a64c8 in run_specific_rebase builtin/rebase.c:753
#7 0x4a9b8b in cmd_rebase builtin/rebase.c:1824
#8 0x407a32 in run_builtin git.c:466
#9 0x407e0a in handle_builtin git.c:721
#10 0x40803d in run_argv git.c:788
#11 0x40850f in cmd_main git.c:923
#12 0x4eee79 in main common-main.c:57
#13 0x7f2e8ca9f209 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#14 0x7f2e8ca9f2bb in __libc_start_main_impl ../csu/libc-start.c:389
#15 0x405fd0 in _start (git+0x405fd0)
This can be seen in e.g. the 4th test of
"t3404-rebase-interactive.sh".
In the larger picture the ownership of the "struct replay_opts" is
quite a mess, e.g. in this case rebase.c's static "get_replay_opts()"
function partially creates it, but nothing in rebase.c will free()
it. The structure is "mostly owned" by the sequencer API, but it also
expects to get these partially populated versions of it.
It would be better to have rebase keep track of what it allocated, and
free() that, and to pass that as a "const" to the sequencer API, which
would copy what it needs to its own version, and to free() that.
But doing so is a much larger change, and however messy the ownership
boundary is here is consistent with what we're doing already, so let's
just free() this to fix the leak.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix a memory leak in overlay_tree_on_index(), we need to
clear_pathspec() at some point, which might as well be after the last
time we use it in the function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Call graph_clear() in release_revisions(), this will free memory
allocated by e.g. this command, which will now run without memory
leaks:
git -P log -1 --graph --no-graph --graph
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix a leak that's been with us since 3407bb4940 (Add "unpack-file"
helper that unpacks a sha1 blob into a tmpfile., 2005-04-18). See
00c8fd493a (cat-file: use streaming API to print blobs, 2012-03-07)
for prior art which shows the same API pattern, i.e. free()-ing the
result of read_object_file() after it's used.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix various leaks in built-ins, libraries and a test helper here we
were missing a call to strbuf_release(), string_list_clear() etc, or
were calling them after a potential "return".
Comments on individual changes:
- builtin/checkout.c: Fix a memory leak that was introduced in [1]. A
sibling leak introduced in [2] was recently fixed in [3]. As with [3]
we should be using the wt_status_state_free_buffers() API introduced
in [4].
- builtin/repack.c: Fix a leak that's been here since this use of
"strbuf_release()" was added in a1bbc6c017 (repack: rewrite the shell
script in C, 2013-09-15). We don't use the variable for anything
except this loop, so we can instead free it right afterwards.
- builtin/rev-parse: Fix a leak that's been here since this code was
added in 21d4783538 (Add a parseopt mode to git-rev-parse to bring
parse-options to shell scripts., 2007-11-04).
- builtin/stash.c: Fix a couple of leaks that have been here since
this code was added in d4788af875 (stash: convert create to builtin,
2019-02-25), we strbuf_release()'d only some of the "struct strbuf" we
allocated earlier in the function, let's release all of them.
- ref-filter.c: Fix a leak in 482c119186 (gpg-interface: improve
interface for parsing tags, 2021-02-11), we don't use the "payload"
variable that we ask parse_signature() to populate for us, so let's
free it.
- t/helper/test-fake-ssh.c: Fix a leak that's been here since this
code was added in 3064d5a38c (mingw: fix t5601-clone.sh,
2016-01-27). Let's free the "struct strbuf" as soon as we don't need
it anymore.
1. c45f0f525d (switch: reject if some operation is in progress,
2019-03-29)
2. 2708ce62d2 (branch: sort detached HEAD based on a flag,
2021-01-07)
3. abcac2e19f (ref-filter.c: fix a leak in get_head_description,
2022-09-25)
4. 962dd7ebc3 (wt-status: introduce wt_status_state_free_buffers(),
2020-09-27).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When the "ident" member of the structure was added in
1e8fef609e (untracked cache: guard and disable on system changes,
2015-03-08) this function wasn't updated to free it. Let's do so.
Let's also free the "exclude_per_dir" memory we've been leaking
since[1], while making sure not to free() the constant ".gitignore"
string we add by default[2].
As we now have three struct members we're freeing let's change
free_untracked_cache() to return early if "uc" isn't defined. We won't
hand it to free() now, but that was just for convenience, once we're
dealing with >=2 struct members this pattern is more convenient.
1. f9e6c64958 (untracked cache: load from UNTR index extension,
2015-03-08)
2. 039bc64e88 (core.excludesfile clean-up, 2007-11-14)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The "sparse_checkout_patterns" member was added to the "struct
index_state" in 836e25c51b (sparse-checkout: hold pattern list in
index, 2021-03-30), but wasn't added to discard_index(). Let's do
that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The read_cache() in prepare_to_commit() would end up clobbering the
pointer we had for a previously populated "the_index.cache_tree" in
the very common case of "git commit" stressed by e.g. the tests being
changed here.
We'd populate "the_index.cache_tree" by calling
"update_main_cache_tree" in prepare_index(), but would not end up with
a "fully prepared" index. What constitutes an existing index is
clearly overly fuzzy, here we'll check "active_nr" (aka
"the_index.cache_nr"), but our "the_index.cache_tree" might have been
malloc()'d already.
Thus the code added in 11c8a74a64 (commit: write cache-tree data when
writing index anyway, 2011-12-06) would end up allocating the
"cache_tree", and would interact here with code added in
7168624c35 (Do not generate full commit log message if it is not
going to be used, 2007-11-28). The result was a very common memory
leak.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
These two built-ins both deal with the index, but weren't discarding
it. In subsequent commits we'll add more free()-ing to discard_index()
that we've missed, but let's first call the existing function.
We can doubtless add discard_index() (or its alias discard_cache()) to
a lot more places, but let's just add it here for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
This marks tests that have been leak-free since various recent
commits, but which were not marked us such when the memory leak was
fixed. These were mostly discovered with the "check" mode added in
faececa53f (test-lib: have the "check" mode for SANITIZE=leak
consider leak logs, 2022-07-28).
Commits that fixed the last memory leak in these tests. Per narrowing
down when they started to pass under SANITIZE=leak with "bisect":
- t1022-read-tree-partial-clone.sh:
7e2619d8ff (list_objects_filter_options: plug leak of filter_spec
strings, 2022-09-08)
- t4053-diff-no-index.sh: 07a6f94a6d (diff-no-index: release prefixed
filenames, 2022-09-07)
- t6415-merge-dir-to-symlink.sh: bac92b1f39 (Merge branch
'js/ort-clean-up-after-failed-merge', 2022-08-08).
- t5554-noop-fetch-negotiator.sh:
66eede4a37 (prepare_repo_settings(): plug leak of config values,
2022-09-08)
- t2012-checkout-last.sh, t7504-commit-msg-hook.sh,
t91{15,46,60}-git-svn-*.sh: The in-flight "pw/rebase-no-reflog-action"
series, upon which this is based:
https://lore.kernel.org/git/pull.1405.git.1667575142.gitgitgadget@gmail.com/
Let's mark all of these as passing with
"TEST_PASSES_SANITIZE_LEAK=true", to have it regression tested,
including as part of the "linux-leaks" CI job.
Additionally, let's remove the "!SANITIZE_LEAK" prerequisite from
tests that now pass, these were marked as failing in:
- 77e56d55ba (diff.c: fix a double-free regression in a18d66cefb,
2022-03-17)
- c4d1d52631 (tests: change some 'test $(git) = "x"' to test_cmp,
2022-03-07)
These were not spotted with the new "check" mode, but manually, it
doesn't cover these sort of prerequisites. There's few enough that we
shouldn't bother to automate it. They'll be going away sooner than
later.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since 52d59cc645 (branch: add a --copy (-c) option to go with --move
(-m), 2017-06-18) we can copy a branch to make a new branch with the
'-c' (copy) option or to overwrite an existing branch using the '-C'
(force copy) option. A no-op possibility is considered when we are
asked to copy a branch to itself, to follow the same no-op introduced
for the rename (-M) operation in 3f59481e33 (branch: allow a no-op
"branch -M <current-branch> HEAD", 2011-11-25). To check for this, in
52d59cc645 we compared the branch names provided by the user, source
(HEAD if omitted) and destination, and a match is considered as this
no-op.
Since ae5a6c3684 (checkout: implement "@{-N}" shortcut name for N-th
last branch, 2009-01-17) a branch can be specified using shortcuts like
@{-1}. This allows this usage:
$ git checkout -b test
$ git checkout -
$ git branch -C test test # no-op
$ git branch -C test @{-1} # oops
$ git branch -C @{-1} test # oops
As we are using the branch name provided by the user to do the
comparison, if one of the branches is provided using a shortcut we are
not going to have a match and a call to git_config_copy_section() will
happen. This will make a duplicate of the configuration for that
branch, and with this progression the second call will produce four
copies of the configuration, and so on.
Let's use the interpreted branch name instead for this comparison.
The rename operation is not affected.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since 73fce29427 (Turn `git bisect` into a full built-in, 2022-11-10)
we've used builtin/bisect.c instead of git-bisect.sh to implement the
"bisect" command.
Let's remove the unused leftover script, and the ".gitignore" entry for
the "git-bisect--helper", which also hasn't been built since
73fce29427.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In "open_midx_bitmap_1()" and "open_pack_bitmap_1()", when we find that
there are multiple bitmaps, we will only open the first one and then
leave warnings about the remaining pack information, the information
will contain the absolute path of the repository, for example in a
alternates usage scenario. So let's hide this kind of potentially
sensitive information in this commit.
Found-by: XingXin <moweng.xx@antgroup.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When trying to open a pack bitmap, we call open_pack_bitmap_1() in a
loop, during which it tries to open up the pack index corresponding
with each available pack.
It's likely that we'll end up relying on objects in that pack later
in the process (in which case we're doing the work of opening the
pack index optimistically), but not guaranteed.
For instance, consider a repository with a large number of small
packs, and one large pack with a bitmap. If we see that bitmap pack
last in our loop which calls open_pack_bitmap_1(), the current code
will have opened *all* pack index files in the repository. If the
request can be served out of the bitmapped pack alone, then the time
spent opening these idx files was wasted.S
Since open_pack_bitmap_1() calls is_pack_valid() later on (which in
turns calls open_pack_index() itself), we can just drop the earlier
call altogether.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The previous commit added a `--merge-base` option in order to allow
using a specified merge-base for the merge. Extend the input accepted
by `--stdin` to also allow a specified merge-base with each merge
requested. For example:
printf "<b3> -- <b1> <b2>" | git merge-tree --stdin
does a merge of b1 and b2, and uses b3 as the merge-base.
Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
This patch will give our callers more flexibility to use `git merge-tree`,
such as:
git merge-tree --write-tree --merge-base=branch^ HEAD branch
This does a merge of HEAD and branch, but uses branch^ as the merge-base.
And the reason why using an option flag instead of a positional argument
is to allow additional commits passed to merge-tree to be handled via an
octopus merge in the future.
Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Now that the shell script hands off to the `bisect--helper` to do
_anything_ (except to show the help), it is but a tiny step to let the
helper implement the actual `git bisect` command instead.
This retires `git-bisect.sh`, concluding a multi-year journey that many
hands helped with, in particular Pranit Bauna, Tanushree Tumane and
Miriam Rubio.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In a later change, we would like to turn bisect into a builtin by
renaming bisect--helper.
However, there's an oddity that "git bisect log" accepts any number of
arguments and it will just ignore them all.
Let's prepare for the next step by ignoring any arguments passed to
"git bisect--helper log"
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In preparation for making `git bisect` a real built-in, let's prepare
the `bisect--helper` built-in to handle `git bisect--helper good` and
`git bisect--helper bad`, i.e. eliminate the need of `state` subcommand.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In subsequent commits we'll be removing "git-bisect.sh" in favor of
promoting "bisect--helper" to a "bisect" built-in.
In doing that we'll first need to have it support "git bisect--helper
<cmd>" rather than "git bisect--helper --<cmd>", and then finally have
its "-h" output claim to be "bisect" rather than "bisect--helper".
Instead of suffering that churn let's start claiming to be "git
bisect" now. In just a few commits this will be true, and in the
meantime emitting the "wrong" usage information from the helper is a
small price to pay to avoid the churn.
Let's also declare "BUILTIN_*" macros, when we eventually migrate the
sub-commands themselves to parse_options() we'll be able to re-use the
strings. See 0afd556b2e (worktree: define subcommand -h in terms of
command -h, 2022-10-13) for a recent example.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Address a test blindspot, the "log" command is the odd one out because
"git-bisect.sh" ignores any arguments it receives. Let's test both the
exit codes we expect, and the stderr and stdout we're emitting.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In a later change, we will convert the bisect--helper to be builtin
bisect. Let's start by self-identifying it's the real bisect when reporting
error.
This change is safe since 'git bisect--helper' is an implementation
detail, users aren't expected to call 'git bisect--helper'.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Some system never reports negative exit code at all, they reports them
as bigger-than-128 instead. We take extra care for those systems in the
later check for normal 'do_bisect_run' loop.
Let's check it here, too.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Preceding commits fixed output and behavior regressions in
d1bbbe45df (bisect--helper: reimplement `bisect_run` shell function
in C, 2021-09-13), which did not claim to be changing the output of
"git bisect run".
But some of the output it emitted was subjectively better, so once
we've asserted that we're back on v2.29.0 behavior, let's change some
of it back:
- We now quote the arguments again, but omit the first " " when
printing the "running" line.
- Ditto for other cases where we emitted the argument
- We say "found first bad commit" again, not just "run success"
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Based-on-patch-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13) reimplemented parts of "git bisect run" in
C it changed the output we emitted so that:
- The "running ..." line was now quoted
- We lost the \n after our output
- We started saying "bisect found ..." instead of "bisect run success"
Arguably some of this is better now, but as d1bbbe45df did not
advocate for changing the output, let's revert this for now. It'll be
easy to change it back if that's what we'd prefer.
This does not change the one remaining use of "command.buf" to emit
the quoted argument, as that's new in d1bbbe45df.
Some of these cases were not tested for in the tests added in the
preceding commit, I didn't have time to fleshen those out, but a look
at f1de981e8b will show that the other output being adjusted here is
now equivalent to what it was before d1bbbe45df.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We didn't add "{}" to all "if/else" branches, and one "error" was
mis-indented. Let's fix that first, which makes subsequent commits
smaller. In the case of the "if" we can simply early return instead.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add three failing tests which succeed on v2.29.0, but due to the topic
merged at [1] (specifically [2]) have been failing since then. We'll
address those regressions in subsequent commits.
There was also a "regression" where:
git bisect run ./missing-script.sh
Would count a non-existing script as "good", as the shell would exit
with 127. That edge case is a bit too insane to preserve, so let's not
add it to these regression tests.
There was another regression that 'git bisect' consumed some options
that was meant to passed down to program run with 'git bisect run'.
Since that regression is breaking user's expectation, it has been fixed
earlier without this patch queued.
1. 0a4cb1f1f2 (Merge branch 'mr/bisect-in-c-4', 2021-09-23)
2. d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
* dd/bisect-helper-subcommand:
bisect--helper: parse subcommand with OPT_SUBCOMMAND
bisect--helper: move all subcommands into their own functions
bisect--helper: remove unused options
Whenever a branch is pushed to a repository which has GitHub Actions
enabled, a bunch of new workflow runs are started.
We sometimes see contributors push multiple branch updates in rapid
succession, which in conjunction with the impressive time swallowed by
even just a single CI build frequently leads to many queued-up runs.
This is particularly problematic in the case of Pull Requests where a
single contributor can easily (inadvertently) prevent timely builds for
other contributors when using a shared repository.
To help with this situation, let's use the `concurrency` feature of
GitHub workflows, essentially canceling GitHub workflow runs that are
obsoleted by more recent runs:
https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency
For workflows that *do* want the behavior in the pre-image of this
patch, they can use the ci-config feature to disable the new behavior by
adding an executable script on the ci-config branch called
'skip-concurrent' which terminates with a non-zero exit code.
Original-patch-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 13:26:20 -05:00
533 changed files with 8672 additions and 4621 deletions
@ -1347,8 +1347,8 @@ author to given a talk and for publishing this paper.
References
----------
- [[[1]]] https://www.nist.gov/sites/default/files/documents/director/planning/report02-3.pdf['The Economic Impacts of Inadequate Infratructure for Software Testing'. Nist Planning Report 02-3], see Executive Summary and Chapter 8.
- [[[2]]] http://www.oracle.com/technetwork/java/codeconvtoc-136057.html['Code Conventions for the Java Programming Language'. Sun Microsystems.]
- [[[1]]] https://web.archive.org/web/20091206032101/http://www.nist.gov/public_affairs/releases/n02-10.htm['Software Errors Cost U.S. Economy $59.5 Billion Annually'. Nist News Release.] See also https://www.nist.gov/system/files/documents/director/planning/report02-3.pdf['The Economic Impacts of Inadequate Infratructure for Software Testing'. Nist Planning Report 02-3], Executive Summary and Chapter 8.
- [[[2]]] https://www.oracle.com/java/technologies/javase/codeconventions-introduction.html['Code Conventions for the Java Programming Language: 1. Introduction'. Sun Microsystems.]
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.